Predict incomes from census data

This is a Classroom Challenge forked from INCPR.

🕵️ Introduction

We have found a creative and very useful application of US Census Bureau data. In this problem, you have to predict whether a person earns more or less than $50,000 per year based on their census data.

💾 Dataset

This data was extracted from the US Census Bureau Database. It contains various data points for each person, such as age, education, working hours (per week), and more!

The last column contains 1 if the income of the citizen is more than or equal to $50,000 and 0 if it is less. More information about the dataset fields can be found in dataset_info.txt.

You need to predict 1 if the person earns more than 50k/year, otherwise 0.

📁 Files

The following files can be found in the resources section:

  • train.csv - (32559 samples) This CSV file contains the information about each person along with the label (1/0), i.e. whether they earn more or less than 50k/year.

  • test.csv - (16280 samples) This CSV file contains the information about each person but not the label (1/0), i.e. whether they earn more or less than 50k/year. The labels of these samples will be used for evaluation.
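
As a rough illustration of how these files might be read, the sketch below splits each row into features and label using only the Python standard library. It assumes the files have a header row and that the label is the last column, as the dataset description states. The function name load_rows and the column names in the in-memory sample are hypothetical; the real field names are listed in dataset_info.txt.

```python
import csv
import io


def load_rows(fileobj, has_label=True):
    """Read a CSV with a header row; return (feature_names, features, labels).

    Assumes the label, when present, is the last column.
    labels is None for unlabeled data such as test.csv.
    """
    reader = csv.reader(fileobj)
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    if not has_label:
        return header, rows, None
    features = [row[:-1] for row in rows]
    labels = [int(row[-1]) for row in rows]
    return header[:-1], features, labels


# Hypothetical in-memory sample standing in for train.csv.
sample = io.StringIO(
    "age,education,hours-per-week,income\n"
    "39,Bachelors,40,0\n"
    "52,Masters,45,1\n"
)
names, X, y = load_rows(sample)
print(names)  # ['age', 'education', 'hours-per-week']
print(y)      # [0, 1]
```

With the real files, the same function could be called as load_rows(open("train.csv", newline="")) for the training data and with has_label=False for test.csv. Feature values are kept as strings here; converting them to numbers or categories is left to the modelling step.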

🚀 Submission

  • Prepare a CSV file with the header income and the predicted values as 1/0.
  • A sample submission format is available in sample_submission.csv.
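
A minimal sketch of writing such a file, assuming the submission is a single column named income with one 0/1 prediction per row. The helper write_submission is hypothetical, and sample_submission.csv remains the authoritative reference for the exact layout.

```python
import csv
import io


def write_submission(predictions, fileobj):
    """Write one 0/1 prediction per row under a single `income` header."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["income"])
    for p in predictions:
        if p not in (0, 1):
            raise ValueError(f"prediction must be 0 or 1, got {p!r}")
        writer.writerow([int(p)])


# Demonstration with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_submission([1, 0, 1], buffer)
print(buffer.getvalue())
```

To produce an actual file, the same function can be given a handle opened with open("submission.csv", "w", newline=""). The predictions should be in the same order as the rows of test.csv.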

🖊 Evaluation Criteria

During evaluation, the F1 score will be used to measure the performance of the model, where F1 = 2 * (precision * recall) / (precision + recall).
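
For checking a model locally, the sketch below computes the F1 score as the harmonic mean of precision and recall. It assumes the binary form of the metric with 1 (the higher-income class) as the positive class; the challenge text does not say which averaging the evaluator uses, so treat this as an approximation of the official score.

```python
def f1_score(y_true, y_pred):
    """Binary F1 with 1 as the positive class (an assumption, see above)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision and recall are both zero or undefined; report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
score = f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(round(score, 4))  # 0.6667
```

Because F1 ignores true negatives, a model that always predicts 0 scores 0.0 under this definition even if most people in the data fall below the threshold.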

