ImageCLEF 2020 Tuberculosis - CT report
Note: ImageCLEF 2020 Tuberculosis is part of the official ImageCLEF 2020 medical task. Here is a list of other ImageCLEF 2020 medical task challenges:
Note: Do not forget to read the Rules section on this page. Pressing the red Participate button leads you to a page where you have to agree with those rules. You will not be able to submit any results before agreeing with the rules.
Note: Before trying to submit results, read the Submission instructions section on this page.
Welcome to the 4th edition of the Tuberculosis Task!
Tuberculosis (TB) is a bacterial infection caused by a germ called Mycobacterium tuberculosis. About 130 years after its discovery, the disease remains a persistent threat and a leading cause of death worldwide according to the WHO. This bacterium usually attacks the lungs, but it can also damage other parts of the body. Generally, TB can be cured with antibiotics. However, the different types of TB require different treatments, and therefore the detection of the TB type and the evaluation of lesion characteristics are important real-world tasks.
In this year's edition, we decided to concentrate on the automated CT report generation task, since its outcome can have a major impact on real-world clinical routines. To make the task both more attractive for participants and practically valuable, this year's report generation is lung-based rather than CT-based, which means the labels for the left and right lungs are provided independently. The set of target labels in the CT report was updated in accordance with the opinion of medical experts. This year we provide 3 labels for each lung: presence of TB lesions in general, and presence of pleurisy and caverns in particular. The dataset size was also increased compared to the previous year.
As soon as the data is released it will be available under the “Resources” tab.
In this edition, a dataset containing chest CT scans of 403 TB patients (283 for training and 120 for testing) is used. Since the labels are provided per lung rather than per CT scan, the total number of cases is effectively doubled.
The provided data includes sets of train and test CT images, lung masks, and the CT report for the train data.
We provide 3D CT images, which are stored in the NIFTI file format with the .nii.gz file extension (g-zipped .nii files). This file format stores raw voxel intensities in Hounsfield units (HU) as well as the corresponding image metadata such as image dimensions, voxel size in physical units, slice thickness, etc. A freely available tool called “VV” can be used for viewing image files. Various tools are currently available for reading and writing NIFTI files, among them the load_nii and save_nii functions for Matlab; the Niftilib library for C, Java, Matlab and Python; and the NiBabel package for Python.
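As a rough illustration of the metadata described above, the following sketch reads the image dimensions and voxel size straight from the header of a g-zipped NIfTI-1 file using only the Python standard library. It assumes a little-endian NIfTI-1 file; the function name read_nifti1_header is hypothetical and not part of any of the tools listed above. In practice a dedicated package such as NiBabel would normally be used, since it also decodes the voxel data.

```python
import gzip
import struct

NIFTI1_HEADER_SIZE = 348  # fixed header size defined by the NIfTI-1 format


def read_nifti1_header(path):
    """Return shape, voxel size and intensity scaling from a .nii.gz file.

    Assumption: the file is a little-endian NIfTI-1 image.
    """
    with gzip.open(path, "rb") as f:
        header = f.read(NIFTI1_HEADER_SIZE)
    if len(header) < NIFTI1_HEADER_SIZE:
        raise ValueError("file is too short to contain a NIfTI-1 header")

    # The first field (sizeof_hdr) must equal 348 when read little-endian.
    (sizeof_hdr,) = struct.unpack_from("<i", header, 0)
    if sizeof_hdr != NIFTI1_HEADER_SIZE:
        raise ValueError("not a little-endian NIfTI-1 header")

    dim = struct.unpack_from("<8h", header, 40)      # dim[0] = number of axes
    pixdim = struct.unpack_from("<8f", header, 76)   # pixdim[1..3] = voxel size
    scl_slope, scl_inter = struct.unpack_from("<2f", header, 112)

    ndim = dim[0]
    return {
        "shape": dim[1:1 + ndim],
        "voxel_size": pixdim[1:1 + ndim],
        "scl_slope": scl_slope,   # stored intensity scaling, if any
        "scl_inter": scl_inter,
    }
```

Calling read_nifti1_header on one of the provided .nii.gz files should return the volume shape in voxels and the voxel spacing in physical units. Whether the stored values need the slope and intercept applied to obtain HU depends on the individual file, so that step is left out here.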
We provide two versions of automatically extracted masks of the lungs which are stored in the same file format as CTs.
The first version of the segmentation was obtained using the same technique as in the previous year. The details of this segmentation can be found here.
The second version of the segmentation was obtained using a non-rigid image registration scheme. The details of this segmentation and an open-source implementation can be found here.
The first version of the segmentation provides more accurate masks, but it tends to miss large abnormal regions of the lungs in the most severe TB cases. The second version, by contrast, provides rougher boundaries but is more stable in terms of including lesion areas.
Please note that only the first version of the segmentation allows extracting the masks for the left and right lungs individually (voxel values differ per lung), while the second version needs custom post-processing.
If you use the provided masks in your experiments, please refer to the section “Citations” at the end of this page to find the appropriate citation for this lung segmentation technique.
CT report for train CT images
We provide labels for the training set as a simple .csv file containing the following columns (header included):
Filename - train file name
LeftLungAffected - binary label for presence of any TB lesions in the left lung
RightLungAffected - binary label for presence of any TB lesions in the right lung
CavernsLeft - binary label for presence of caverns in the left lung
CavernsRight - binary label for presence of caverns in the right lung
PleurisyLeft - binary label for presence of pleurisy in the left lung
PleurisyRight - binary label for presence of pleurisy in the right lung
Please note that “presence of any TB lesions” means any TB lesions, not limited to caverns or pleurisy. So rows like “1,1,0,0,0,0” are correct.
The test data release is planned for mid-March (tentative).
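The column layout above can be turned into one record per lung, which matches the lung-wise framing of the task. The sketch below is a minimal example using Python's csv module; the sample rows and the CTR_TRN_* file names are invented for illustration, and the helper name lungwise_records is hypothetical.

```python
import csv
import io

# Column names per lung, taken from the label file description.
LABEL_COLUMNS = {
    "left": ("LeftLungAffected", "CavernsLeft", "PleurisyLeft"),
    "right": ("RightLungAffected", "CavernsRight", "PleurisyRight"),
}


def lungwise_records(csv_file):
    """Expand each CT-level row into two lung-level records."""
    records = []
    for row in csv.DictReader(csv_file):
        for side, (affected, caverns, pleurisy) in LABEL_COLUMNS.items():
            records.append({
                "filename": row["Filename"],
                "side": side,
                "affected": int(row[affected]),
                "caverns": int(row[caverns]),
                "pleurisy": int(row[pleurisy]),
            })
    return records


# Illustrative rows only; these file names are not from the real dataset.
SAMPLE = """Filename,LeftLungAffected,RightLungAffected,CavernsLeft,CavernsRight,PleurisyLeft,PleurisyRight
CTR_TRN_001.nii.gz,1,1,0,0,0,0
CTR_TRN_002.nii.gz,0,1,0,1,0,0
"""

records = lungwise_records(io.StringIO(SAMPLE))
for record in records:
    print(record)
```

Each CT contributes two records, so the 283 training scans would yield 566 lung-level cases. The first sample row shows the situation mentioned in the note: both lungs are affected although neither caverns nor pleurisy is present.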
As soon as the submission is open, you will find a “Create Submission” button on this page (next to the tabs).
Before being allowed to submit your results, you first have to press the red Participate button, which leads you to a page where you have to accept the challenge's rules.
Submit a plain text file named with the prefix CTR (e.g. CTRfree-text.txt) with the following format:
<Filename>,<Probability of “left lung affected”>,<Probability of “right lung affected”>,<Probability of “presence of caverns in the left lung”>,<Probability of “presence of caverns in the right lung”>,<Probability of “pleurisy in the left lung”>,<Probability of “pleurisy in the right lung”>
CTR_TST_001.nii.gz,0.89,0.1,0.84,0.05,0.9,0.2
CTR_TST_002.nii.gz,0.1,0.6,0.222,0.333,0.444,0.55
CTR_TST_003.nii.gz,0.1,0.7,0.0,0.2,0.1,0.46
CTR_TST_004.nii.gz,0.88,0.78,0.59,0.65,0.8,0.4
You need to respect the following constraints:
Filenames must be the same as the original test file names
All test filenames must be present in the run file
Only use numbers between 0 and 1 for the probabilities. Use the dot (.) as a decimal point (no commas accepted)
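The constraints above can be checked before uploading. The following sketch formats predictions into the run file layout and rejects missing filenames or out-of-range probabilities. The function name format_run, the fixed four-decimal formatting, and the output file name in the usage note are assumptions made for this example, not requirements stated by the organizers.

```python
def format_run(predictions, expected_filenames):
    """Build the run file text from {filename: six probabilities}.

    The probability order follows the submission format: left affected,
    right affected, caverns left, caverns right, pleurisy left, pleurisy right.
    """
    expected = set(expected_filenames)
    missing = expected - set(predictions)
    unknown = set(predictions) - expected
    if missing:
        raise ValueError(f"missing test files: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown file names: {sorted(unknown)}")

    lines = []
    for name in sorted(predictions):
        probs = predictions[name]
        if len(probs) != 6:
            raise ValueError(f"{name}: expected 6 probabilities, got {len(probs)}")
        for p in probs:
            # This comparison also rejects NaN values.
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"{name}: probability {p} is outside [0, 1]")
        # The 'f' format always uses a dot as the decimal separator.
        lines.append(",".join([name] + [f"{p:.4f}" for p in probs]))
    return "\n".join(lines) + "\n"


example = format_run(
    {"CTR_TST_001.nii.gz": [0.89, 0.1, 0.84, 0.05, 0.9, 0.2]},
    ["CTR_TST_001.nii.gz"],
)
print(example, end="")
```

The returned text could then be written to a plain text file whose name starts with the CTR prefix, for example CTRmy-run.txt (a hypothetical name).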
This task is considered as a multi-binary classification problem.
The ranking of this task will be done first by average AUC and then by min AUC over the 3 target labels.
The AUC values will be evaluated in a lung-wise manner.
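To get a feel for the ranking criteria, the sketch below computes a pairwise (Mann-Whitney) AUC per target label and then the mean and minimum over the three labels. It assumes that left and right lung entries are pooled into one list per label; the page only states that evaluation is lung-wise, so the organizers' exact procedure may differ. The function names and the toy labels and scores are invented for illustration.

```python
from statistics import mean


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def ranking_scores(per_label):
    """Return (mean AUC, min AUC, per-label AUCs) for {label: (y, scores)}."""
    aucs = {name: auc(y, s) for name, (y, s) in per_label.items()}
    return mean(aucs.values()), min(aucs.values()), aucs


# Toy data: four lung-level cases per label (assumed pooling of both lungs).
per_label = {
    "affected": ([1, 1, 0, 1], [0.9, 0.8, 0.3, 0.6]),
    "caverns": ([0, 1, 0, 1], [0.2, 0.7, 0.8, 0.9]),
    "pleurisy": ([0, 0, 1, 0], [0.1, 0.4, 0.4, 0.2]),
}

mean_auc, min_auc, aucs = ranking_scores(per_label)
print(aucs, mean_auc, min_auc)
```

Under the stated ranking, runs would be ordered by mean_auc first, with min_auc used as the secondary criterion.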
Note: In order to participate in this challenge you have to sign an End User Agreement (EUA). You will find more information on the ‘Resources’ tab.
The ImageCLEF lab is part of the Conference and Labs of the Evaluation Forum: CLEF 2020. CLEF 2020 consists of independent peer-reviewed workshops on a broad range of challenges in the fields of multilingual and multimodal information access evaluation, and a set of benchmarking activities carried out in various labs designed to test different aspects of mono- and cross-language information retrieval systems. More details about the conference can be found here.
Submitting a working note with the full description of the methods used in each run is mandatory. Any run that cannot be reproduced from its description in the working notes might be removed from the official publication of the results. Working notes are published within CEUR-WS proceedings, resulting in the assignment of an individual DOI (URN) and indexing by many bibliography systems including DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the ImageCLEF organizing committee to ensure quality. As an illustration, ImageCLEF 2019 working notes (task overviews and participant working notes) can be found within CLEF 2019 CEUR-WS proceedings.
Participants of this challenge will automatically be registered at CLEF 2020. In order to be compliant with the CLEF registration requirements, please edit your profile by providing the following additional information:
Regarding the username, please choose a name that represents your team.
This information will not be publicly visible and will be used exclusively to contact you and to send the registration data to CLEF, which is the main organizer of all CLEF labs.
Participating as an individual (non-affiliated) researcher
We welcome individual researchers, i.e. those not affiliated with any institution, to participate. We kindly ask you to provide us with a motivation letter containing the following information:
the presentation of your most relevant research activities related to the task/tasks
your motivation for participating in the task/tasks and how you want to exploit the results
a list of your 5 most relevant publications (if applicable)
the link to your personal webpage
The motivation letter should be directly concatenated to the End User Agreement document or sent as a PDF file to bionescu at imag dot pub dot ro. The request will be analyzed by the ImageCLEF organizing committee. We reserve the right to refuse any applicants whose experience in the field is too narrow, and would therefore most likely prevent them from being able to finish the task/tasks.
Information will be posted after the challenge ends.
ImageCLEF 2020 is an evaluation campaign that is being organized as part of the CLEF initiative labs. The campaign offers several research tasks that welcome participation from teams around the world. The results of the campaign appear in the working notes proceedings, published by CEUR Workshop Proceedings (CEUR-WS.org). Selected contributions from the participants will be invited for publication in the following year in the Springer Lecture Notes in Computer Science (LNCS) together with the annual lab overviews.
- You can ask questions related to this challenge on the Discussion Forum. Before asking a new question, please make sure that the question has not been asked before.
- Click on the Discussion tab above or use the direct link: https://discourse.aicrowd.com/c/imageclef-2020-tuberculosis-ct-report
We strongly encourage you to use the public channels mentioned above for communications between the participants and the organizers. In extreme cases, if there are any queries or comments that you would like to make using a private communication channel, then you can send us an email at :
You can find additional information on the challenge here: https://www.imageclef.org/2020/medical/tuberculosis