Predicting Cervical Cancer


🛠 Contribute: Found a typo, or any other change to the description that you would like to see? Please consider sending us a pull request in the public repo of the challenge.

🕵️ Introduction

We have said it before and we will say it again: 'With great power comes great responsibility.' And yes, we do have the power to do good for the world. Let us be responsible and put that power to use.

This time, we pick up our weapons against cancer.

Given information on different risk factors for a woman, predict as accurately as possible the presence or absence of cervical cancer.

Understand with code! Here is some getting-started code for you. 😄

💾 Dataset

This dataset contains indicators and risk factors for predicting whether a woman will get cervical cancer. There are a total of 15 attributes. The first 14 are features covering demographic data such as age, lifestyle, and medical history. The last attribute, Biopsy, is the target variable; its value is 0 for Healthy and 1 for Cancer. The attributes are:

  • Age [ in years ]
  • Number of sexual partners
  • First sexual intercourse [ age in years ]
  • Number of pregnancies
  • Smoking [ yes or no ]
  • Smoking [ in years ]
  • Hormonal contraceptives [ yes or no ]
  • Hormonal contraceptives [ in years ]
  • Intrauterine device (IUD) [ yes or no ]
  • Number of years with an intrauterine device (IUD)
  • Has patient ever had a sexually transmitted disease (STD) [ yes or no ]
  • Number of STD diagnoses
  • Time since first STD diagnosis
  • Time since last STD diagnosis
  • Biopsy result, the target outcome [ 0 for Healthy or 1 for Cancer ]
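
As a rough illustration of how the 14 features and the Biopsy target could be separated, here is a minimal sketch using only the Python standard library. The helper name `load_dataset` and the column names in the toy rows are hypothetical; only the `Biopsy` column name comes from the description above. It also assumes every cell is numeric with no missing values, which may not hold for the real files.

```python
import csv
import io

def load_dataset(file_obj, target="Biopsy"):
    """Read a CSV and split each row into a feature dict and an integer target label."""
    features, labels = [], []
    for row in csv.DictReader(file_obj):
        # Remove the target column from the row and keep it as a 0/1 label.
        labels.append(int(float(row.pop(target))))
        # Everything left is treated as a numeric feature (assumption).
        features.append({name: float(value) for name, value in row.items()})
    return features, labels

# Toy rows with made-up values and hypothetical column names,
# only to show the shape of the data.
sample = io.StringIO(
    "Age,Number of sexual partners,Biopsy\n"
    "34,2,0\n"
    "51,3,1\n"
)
X, y = load_dataset(sample)
print(len(X), y)
```

With the real data, the same function could be called on an open file, for example `with open("train.csv", newline="") as f: X, y = load_dataset(f)`.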

📁 Files

The following files are available in the resources section:

  • train.csv - (686 samples) This CSV file contains the attributes describing the risk factors along with the biopsy results.
  • test.csv - (172 samples) This file is used for the actual evaluation for the leaderboard score. It does not include the biopsy results.

🚀 Submission

  • Prepare a CSV file named submission.csv with the header Biopsy and one predicted value, 0 or 1, per row.
  • A sample submission format is available in sample_submission.csv.
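
The submission format described above could be produced with a sketch like the following. The function name `write_submission` is hypothetical, and the single-column layout is an assumption based on the description; sample_submission.csv remains the authoritative reference for the exact format.

```python
import csv

def write_submission(predictions, path="submission.csv"):
    """Write 0/1 predictions under a single 'Biopsy' header, one prediction per row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Biopsy"])
        for p in predictions:
            # Reject anything that is not a hard 0/1 label.
            if p not in (0, 1):
                raise ValueError(f"prediction must be 0 or 1, got {p!r}")
            writer.writerow([int(p)])

# Example usage (writes submission.csv in the current directory):
# write_submission([0, 1, 0, 0])
```

The predictions are expected in the same order as the rows of test.csv, so the file should contain one row per test sample after the header.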

Make your first submission here 🚀 !!

🖊 Evaluation Criteria

During evaluation, the F1 score and log loss are used to measure the performance of the model.
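
For a local estimate of these two metrics, the sketch below implements the standard textbook definitions of binary F1 and binary log loss in plain Python. The challenge does not state the averaging mode for F1 or how log loss is derived from 0/1 submissions, so the positive class of 1 and the clipping constant `eps` are assumptions, and leaderboard values may differ.

```python
import math

def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss; probabilities are clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)

# Toy labels, made up for illustration.
print(f1_score([1, 0, 1, 1], [1, 0, 0, 1]))
print(log_loss([1, 0], [0.5, 0.5]))
```

Because submissions contain hard 0 or 1 labels, every wrong prediction contributes a large log-loss penalty after clipping, while F1 depends only on the counts of true positives, false positives, and false negatives.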

📚 References

  • Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

  • Source: Kelwin Fernandes (kafc at inesctec dot pt) - INESC TEC & FEUP, Porto, Portugal. Jaime S. Cardoso - INESC TEC & FEUP, Porto, Portugal. Jessica Fernandes - Universidad Central de Venezuela, Caracas, Venezuela.

