Challenge: Completed

The challenge has come to an end! You can browse the Discussions around the challenge here and all Community-submitted notebooks here.

🎉 Check out the Final winners of the challenge here!

📢 Due to the nature of the competition, the dataset used for the challenge is no longer available.

❗ Please Note: This competition uses a non-standard submission process because of the sensitivity of the data. Submissions are made through the ADDI Workbench. You can follow this step-by-step guide to make your submission to the challenge.

## 🕵️ Introduction

The Clock Drawing Test (CDT) is a simple test to detect signs of dementia. In this test the patient is asked to draw an analog clock with hands on the clock indicating ‘ten minutes past 11 o’clock.’ The test can be done on a blank paper or on a paper with a pre-drawn circle. This single test may be sensitive to dementia because it involves many cognitive areas that can be affected by dementia, including executive function, visuospatial abilities, motor programming, attention, and concentration. A qualified doctor then examines the drawing for the signs of dementia.

In the current dataset, clocks were scored by experts on a scale from 0 (not recognizable as a clock) to 5 (accurate depiction of a clock). Criteria for each numerical score included:

• 5 (accurate depiction): numbers in correct quadrants; hands pointing to the numbers 11 and 2; minute hand longer than the hour hand.
• 4 (reasonably accurate depiction): numbers in roughly correct quadrants; hands reasonably close to the numbers 11 and 2; hands could be of equal length or the minute hand could be shorter than the hour hand; numbers may be outside the perimeter of the clock face.
• 3 (mildly distorted depiction): some numbers may be missing or disoriented; there may be a few extra numbers; hands may be incorrectly drawn or pointing to wrong number combinations; a hand may be missing.
• 2 (moderately distorted depiction): several numbers are missing, repeated, or drawn in reverse order; there are more than two hands or no hands.
• 1 (severely distorted depiction): viewer may be able to tell that the drawing is a clock but cannot tell the time shown.
• 0 (not recognizable as a clock): viewer is not able to tell that the drawing is a clock.

Other widely accepted scoring methodologies are also used for clocks drawn during cognitive assessment. The results from cognitive assessments by CDT are used to diagnose underlying cognitive disabilities, including Alzheimer’s disease.

Figure 1: A hand drawn clock image from CDT (Score 4)

The challenge is to use the features extracted from the Clock Drawing Test to build an automated algorithm to predict whether each participant is in one of three phases:

1)    Pre-Alzheimer’s (Early Warning)
2)    Post-Alzheimer’s (Detection)
3)    Normal (Not an Alzheimer’s patient)

In machine learning terms: this is a 3-class classification task.

## 🌍 Background

Dementia refers to a symptom where an adult demonstrates memory disorder and cognitive impairment. Early diagnosis of dementia is very important for medication management and prognosis. The clock drawing test is one of the most common cognitive screening tools for dementia. For decades, this neuropsychological assessment has included paper and pen tests, asking the participant to draw an analog clock. Analyses of these drawings rely on clinical judgment of specified features, which makes interpretation highly subjective. Due to this subjectivity and manual bias of interpretations, the test results are not widely comparable.

Disproportionate increases in dementia morbidity challenge established screening methodologies because of language and cultural barriers, varying access to health services, and varying manual interpretations. The current need is to establish a simple, automated, and objective screening technique that can adapt to a range of health and social service settings and enable early detection.

## 🔍 Research Objective

The current research aims at leveraging the collected data (derived features from decomposition of clock-drawing images) to build an algorithm that will help in predicting if the patient is in the Pre-Alzheimer’s phase (Early Warning), Post-Alzheimer’s phase (Detection) or Normal (Not an Alzheimer’s patient). Neuropsychologists are more interested in specific elements in a clock image such as positioning of the center dot, location of hands and digits, or size of digits relative to the clock-face than the actual image itself. These elements are typically captured by a trained psychologist from the clock drawing and kept in the patient record. After our discussion with these psychologists, we realized that an automated solution to capture these clock elements will be of significant value for the community. Therefore, we've run de-identified clock images from cognitive assessments through a pre-trained clock decomposition pipeline.

This pipeline breaks down various elements of the clock drawings and creates numerical features out of them, which can be grouped into the following categories:

1.    Center dot: Variables indicating the presence of a center dot in the clock drawing, its location, and difference from geometric clock center.
2.    Clock face: Lengths of vertical and horizontal axes, areas of upper-half, lower-half, left-half, and right-half of the clock face.
3.    Clock hands: Presence of hour and minute hand, their location, length, ratio, and proximity to digits 11 and 2, respectively.
4.    Digits: Presence of 12 digits, their location, orientation, clockwise or anti-clockwise sequence, height and width, and angle of separation between digits.
5.    Perseverance: Detection of perseverance while drawing clock-face, hands, center-dots or digits which is measured as an increase in black pixel percentage around a clock element (such as hand/digit) relative to normal.

The next step of this research is to build classification models using these derived features that can identify individuals in Pre-Alzheimer’s, Post-Alzheimer’s or Normal phase based on the pre-existing labels we have for these tests. These labels have been generated from the Alzheimer's diagnosis of individuals from different testing methods such as neuroimaging, cognitive, and health assessments. Once an individual was diagnosed with Alzheimer's we have labelled all images drawn by that individual after diagnosis into post-Alzheimer's and all the images drawn before diagnosis into pre-Alzheimer's. For individuals who were never diagnosed with Alzheimer's, we labelled the images drawn by them as normal.
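The labelling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `label_drawing`, the label strings, and the use of calendar dates are hypothetical, and the handling of a drawing made on the day of diagnosis (treated here as post-Alzheimer's) is an assumption that the description does not specify.

```python
from datetime import date
from typing import Optional


def label_drawing(drawn_on: date, diagnosed_on: Optional[date]) -> str:
    """Assign a class label to one clock drawing.

    diagnosed_on is None for individuals never diagnosed with Alzheimer's.
    """
    if diagnosed_on is None:
        return "normal"
    # Assumption: a drawing made on the diagnosis date counts as post-diagnosis.
    if drawn_on >= diagnosed_on:
        return "post_alzheimer"
    return "pre_alzheimer"


# Usage: one individual diagnosed in mid-2015, one never diagnosed.
diagnosis = date(2015, 6, 1)
print(label_drawing(date(2013, 3, 10), diagnosis))  # drawn before diagnosis
print(label_drawing(date(2017, 9, 2), diagnosis))   # drawn after diagnosis
print(label_drawing(date(2017, 9, 2), None))        # never diagnosed
```

Note that under this rule a single individual can contribute drawings to both the pre- and post-Alzheimer's classes.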

## 💾 Dataset

Each row in the data set represents the results from one clock drawing test of a single participant. The data set contains ~121 features (exact feature descriptions can be found here). The feature descriptions, as well as all the dataset files, are shared in the Aridhia Workbench (check out this guide to get started with the workbench).

Training data
Training data consists of 32,778 observations, which is a stratified random sample based on class labels of the original dataset. Each observation carries one of three labels: Pre-Alzheimer’s, Post-Alzheimer’s, or Normal.
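For readers unfamiliar with the term, a stratified random sample draws the same fraction from every class so that class proportions are preserved. The sketch below illustrates the idea only; it is not the organizers' sampling procedure, and the function name `stratified_sample`, the sampling fraction, and the toy labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Return sorted row indices forming a stratified sample of `labels`."""
    rng = random.Random(seed)
    # Group row indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for indices in by_class.values():
        # Draw the same fraction from each class, without replacement.
        k = round(len(indices) * fraction)
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy, made-up label distribution (not the real class balance).
labels = ["normal"] * 80 + ["pre"] * 10 + ["post"] * 10
sample = stratified_sample(labels, fraction=0.5)
print(Counter(labels[i] for i in sample))
```

With a fraction of 0.5, each class contributes half of its rows, so the 80/10/10 ratio of the toy data carries over to the sample.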

Testing data

The test data set consists of roughly 1,473 observations without label information. For each row, predict the probability of each label (Pre-Alzheimer’s, Post-Alzheimer’s, and Normal).

Output

| row_id | normal_diagnosis_probability | pre_alzheimer_diagnosis_probability | post_alzheimer_diagnosis_probability |
|---|---|---|---|
| 23123_R2 | 0.8 | 0.1 | 0.1 |
| 46453_R4 | 0.1 | 0.33 | 0.54 |
| 98349_R1 | 0.02 | 0.9 | 0.03 |
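As a rough sketch, rows in the output format above could be produced as follows. The column names come from the example; everything else is an assumption made for illustration: the helper name `build_submission`, the comma-separated layout, and the choice to rescale each row so its three probabilities sum to 1. The actual submission is made through the workbench, so its guide takes precedence.

```python
import csv
import io

COLUMNS = [
    "row_id",
    "normal_diagnosis_probability",
    "pre_alzheimer_diagnosis_probability",
    "post_alzheimer_diagnosis_probability",
]


def build_submission(predictions):
    """Render {row_id: (normal, pre, post)} scores as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for row_id, scores in predictions.items():
        total = sum(scores)
        if len(scores) != 3 or total <= 0 or any(s < 0 for s in scores):
            raise ValueError(f"invalid scores for {row_id}: {scores}")
        # Assumption: rescale so that the three probabilities sum to 1.
        writer.writerow([row_id] + [round(s / total, 6) for s in scores])
    return buffer.getvalue()


# Usage with made-up scores for a hypothetical row id.
print(build_submission({"example_R1": (2.0, 1.0, 1.0)}))
```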

## 📁 Files

The dataset files will be available only in the Aridhia workspace. The following files will be present there:

• train.csv (32778 samples)
• validation.csv (363 samples)
• validation_ground_truth.csv (363 samples)

## 🖊 Evaluation Criteria

The final evaluation will be based on the multi-class log loss:

$$L_{\log}(Y, P) = -\frac{1}{N} \sum_{i=0}^{N-1} \sum_{k=0}^{K-1} y_{i,k} \log p_{i,k}$$

where the true labels for a set of $N$ samples are encoded as a 1-of-K binary indicator matrix $Y$, i.e., $y_{i,k} = 1$ if sample $i$ has label $k$ taken from a set of $K$ labels, and $P$ is a matrix of probability estimates with $p_{i,k} = \Pr(y_{i,k} = 1)$.
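Because each row of the indicator matrix has a single 1, the multi-class log loss reduces to the average negative log of the probability assigned to the true class. The pure-Python sketch below shows this; the function name `multiclass_log_loss` and the clipping constant `eps` are assumptions (clipping avoids `log(0)` and is a common convention, but the leaderboard's exact handling is not stated here).

```python
import math


def multiclass_log_loss(true_classes, probabilities, eps=1e-15):
    """Average negative log-probability of the true class.

    true_classes: list of class indices (0 .. K-1), one per sample.
    probabilities: list of per-sample lists with K probability estimates.
    """
    if not true_classes or len(true_classes) != len(probabilities):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for k, row in zip(true_classes, probabilities):
        # Assumption: clip to [eps, 1 - eps] so that log() stays finite.
        p = min(max(row[k], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_classes)


# A uniform guess over three classes scores log(3), about 1.0986.
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 3
print(multiclass_log_loss([0, 1, 2], uniform))
```

The uniform-guess value of log(3) is a useful reference point: a model scoring above it on the leaderboard is doing worse than predicting equal probabilities for every row.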

F1 scores and the number of variables used will also be shown on the leaderboard, but they won’t be used for sorting.

## 🚀 Submission

❗ Please Note: This competition uses a non-standard submission process because of the sensitivity of the data. Submissions are made through the ADDI Workbench.

Detailed instructions for making a submission to the challenge are available here.

If you face any issues with the Aridhia Workbench, please reach out here.

## 📅 Rounds

🏃‍♂️ The Competition will run from 26th April 2021, 12 PM UTC to 8th June 2021, 8 AM UTC.

📝 The Community Contribution Prize deadline is 8th June 2021, 8 AM UTC (new deadline; originally 26th May 2021, 8 AM UTC).

🥶 The team freeze deadline is on 26th May 2021, 8 AM UTC.

## 🏆 Prizes

Prizes will be awarded for best scores and Contest community contributions. There will be four (4) cash prizes and 14 non-cash prizes:

Score-based Prizes:

• Rank #1    $20,000 USD
• Rank #2    $15,000 USD
• Rank #3    $10,000 USD
• Rank #4    $5,000 USD
• Rank #5    1 x Sony PlayStation 5
• Rank #6    1 x Sony PlayStation 5
• Rank #7    DJI Mavic Mini 2
• Rank #8    DJI Mavic Mini 2
• Rank #9    Oculus Quest 2
• Rank #10   Oculus Quest 2

Contest Community Contribution Prizes (8 total):

• 1 x Sony PlayStation 5
• 1 x Xbox Series X
• 3 x DJI Mavic Mini 2
• 3 x Oculus Quest 2

⏰ Deadline for Community Contribution Prize: 26th May 2021, 8 AM UTC

## 📚 Acknowledgement

Alzheimer’s Disease Data Initiative (ADDI)  would like to acknowledge NHATS for their publicly accessible datasets which were a source for this study.
