ImageCLEF 2022 Tuberculosis - Caverns Detection
Note: ImageCLEF Tuberculosis includes 2 subtasks. This page is about the Caverns Detection subtask. For information about the Caverns Report subtask click here. Both challenges' datasets are shared together, so registering for one of these challenges will automatically give you access to the other one.
Note: Do not forget to read the Rules section on this page. Pressing the red Participate button leads you to a page where you have to agree with those rules. You will not be able to submit any results before agreeing with the rules.
Note: Before trying to submit results, read the Submission instructions section on this page.
Welcome to the 6th edition of the Tuberculosis Task!
Tuberculosis (TB) is a bacterial infection caused by a germ called Mycobacterium tuberculosis. About 130 years after its discovery, the disease remains a persistent threat and a leading cause of death worldwide according to WHO. These bacteria usually attack the lungs, but they can also damage other parts of the body. Generally, TB can be cured with antibiotics. However, the different types of TB require different treatments, and therefore detecting the TB type and its characteristics is an important real-world task.
Task history and lessons learned:
- The Tuberculosis task has been part of ImageCLEF since 2017 and has been modified from year to year.
- In the first and second editions of this task, held at ImageCLEF 2017 and ImageCLEF 2018, participants had to detect multi-drug resistant patients (MDR subtask) and classify the TB type (TBT subtask), both based only on the CT image. After 2 editions we concluded that the MDR subtask could not be solved based only on the image. In the TBT subtask, the classification results improved slightly in 2018 with respect to 2017, but not enough considering the amount of extra data provided in the 2018 edition, both in terms of new images and meta-data.
- In the 3rd edition, the Tuberculosis task was restructured to allow the use of a uniform dataset and included two subtasks: a continued Severity Score (SVR) prediction subtask and a new subtask on providing an automatic report (CT Report) on the TB case. In the 4th edition, the SVR subtask was dropped and the automated CT report generation task was modified to be lung-based rather than CT-based. Because of the fairly high results achieved by the participants in the CTR task in the 4th edition, the task organizers decided to discontinue the CTR task and brought back the Tuberculosis Type classification task from the 1st and 2nd ImageCLEFmed Tuberculosis editions to check whether recent Machine Learning and Deep Learning methods can improve on the previous, rather low results.
In this year's edition, the task is upgraded from a classification problem to a detection problem. This challenge (subtask) is about the detection itself: participants must detect lung cavern regions in lung CT images associated with lung tuberculosis. The problem is important because, even after successful treatment that fulfills the existing criteria of recovery, the caverns may still contain colonies of Mycobacterium tuberculosis that could lead to unpredictable disease relapse.
The task dataset contains 559 train and 140 test cases. In addition, participants may also use 60 training cases from the Caverns Report task. The use of any other public dataset is also welcome.
Each case includes the CT image, two versions of automatically extracted lung masks, and information on cavern area location.
For all patients, we provide a single 3D CT image with an image size per slice of 512×512 pixels and around 100 slices. All the CT images are stored in NIFTI file format with the .nii.gz file extension (g-zipped .nii files). This file format stores raw voxel intensities in Hounsfield units (HU) as well as the corresponding image metadata such as image dimensions, voxel size in physical units, slice thickness, etc. A freely available tool called "VV" can be used for viewing the image files. Various tools are currently available for reading and writing NIFTI files. Among them are the load_nii and save_nii functions for Matlab, the Niftilib library for C, Java, Matlab and Python, and the NiBabel package for Python.
For all the CT images we provide two versions of automatically extracted masks of the lungs. These data can be downloaded together with the patients' CT images. The description of the first version of segmentation can be found here. The description of the second version of segmentation can be found here. The first version of segmentation provides more accurate masks, but it tends to miss large abnormal regions of lungs in the most severe TB cases. The second segmentation, on the contrary, provides rougher bounds but behaves more stably in terms of including lesion areas. If participants use the provided masks in their experiments, please refer to the section "Citations" at the end of this page to find the appropriate citation for the corresponding lung segmentation technique.
Cavern area location
Cavern area location information includes a cavern area bounding box and a cavern area centroid. This information is automatically extracted from manual segmentation masks. Please note that the source manual segmentation may have been done by different radiologists, so bounding boxes may differ in accuracy (for example, some margins may appear).
Note: A single CT can have zero, one or multiple cavern regions. In some rare cases cavern regions may overlap.
Cavern area bounding boxes are the prediction target in the task.
The file contains the following columns (header included):
id - train case id
bbox_X1, bbox_Y1, bbox_Z1 - coordinates of the first bounding box corner
bbox_X2, bbox_Y2, bbox_Z2 - coordinates of the second (diagonal) bounding box corner
centroid_X, centroid_Y, centroid_Z - coordinates of the cavern area centroid
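As a rough illustration of how the columns listed above could be read, here is a minimal sketch using only the Python standard library. It assumes the file is comma-separated, which the page does not state explicitly. The case ids and coordinate values in SAMPLE are made up, and load_caverns is a hypothetical helper, not part of any provided tooling.

```python
import csv
import io

# Hypothetical sample: the column names follow the description above,
# but the ids and values are invented for illustration only.
SAMPLE = """id,bbox_X1,bbox_Y1,bbox_Z1,bbox_X2,bbox_Y2,bbox_Z2,centroid_X,centroid_Y,centroid_Z
TRN_001,100,120,30,140,170,45,121.5,143.2,37.0
TRN_001,300,200,50,330,240,62,314.0,221.7,55.9
TRN_002,210,310,12,260,350,20,236.1,329.4,16.3
"""


def load_caverns(handle):
    """Group cavern records by case id (a case may have several caverns)."""
    cases = {}
    for row in csv.DictReader(handle):
        # Order: X1, Y1, Z1, X2, Y2, Z2. Values are parsed as float because
        # the numeric format of the real file is not specified here.
        bbox = tuple(float(row[f"bbox_{axis}{i}"]) for i in (1, 2) for axis in "XYZ")
        centroid = tuple(float(row[f"centroid_{axis}"]) for axis in "XYZ")
        cases.setdefault(row["id"], []).append({"bbox": bbox, "centroid": centroid})
    return cases


cases = load_caverns(io.StringIO(SAMPLE))
for case_id, caverns in cases.items():
    print(case_id, len(caverns), "cavern(s)")
```

For the real file, the same function could be called on an opened file handle instead of the in-memory sample. Cases without caverns would presumably have no rows, so they would simply be absent from the returned dictionary.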
As soon as the data are released they will be available under the "Resources" tab.
As soon as the submission is open, you will find a “Create Submission” button on this page (next to the tabs).
Before being allowed to submit your results, you have to first press the red participate button, which leads you to a page where you have to accept the challenge's rules.
Submit a plain text file with the following format:
<Filename>,<Comma-separated coordinates (X1,Y1,Z1) of the first bounding box corner>,<Comma-separated coordinates (X2,Y2,Z2) of the second bounding box corner>
You need to respect the following constraints:
The file should not have a header.
Filenames must be the same as the original test file names.
A single case may have zero, one or multiple bounding boxes. If no caverns were detected, the file name should be absent from the submission file. If multiple caverns were detected, there should be one line per cavern (see TST_002 in the example above).
All coordinates should be integer values inside bounds of the corresponding CT image.
Coordinates of corners should be ordered: X1 < X2, Y1 < Y2, Z1 < Z2
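The constraints above can be checked before submitting. The following is a minimal sketch, not an official validator: submission_lines is a hypothetical helper, the file names and image sizes are placeholders, and zero-based voxel indexing is an assumption (the page only says coordinates must lie inside the image bounds).

```python
def submission_lines(predictions, shapes):
    """Build header-less submission lines from per-case bounding boxes.

    predictions: {filename: [(x1, y1, z1, x2, y2, z2), ...]}
    shapes:      {filename: (size_x, size_y, size_z)} of the CT volume
    """
    lines = []
    for name, boxes in predictions.items():
        for box in boxes:  # a case with no boxes contributes no line at all
            if len(box) != 6 or not all(isinstance(v, int) for v in box):
                raise ValueError(f"{name}: need six integer coordinates, got {box}")
            for lo, hi, size in zip(box[:3], box[3:], shapes[name]):
                # Assumption: zero-based indices, so the valid range is 0..size-1.
                if not (0 <= lo < hi < size):
                    raise ValueError(f"{name}: corners unordered or out of bounds: {box}")
            lines.append(",".join([name, *map(str, box)]))
    return lines


# Placeholder predictions: one case without caverns, one case with two.
predictions = {
    "TST_001": [],
    "TST_002": [(10, 20, 5, 40, 60, 15), (100, 110, 30, 130, 150, 42)],
}
shapes = {"TST_001": (512, 512, 100), "TST_002": (512, 512, 100)}
print("\n".join(submission_lines(predictions, shapes)))
```

In this sketch TST_001 produces no output line, while TST_002 produces one line per cavern. The exact form of the file names (for example, with or without an extension) should follow the original test files.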
This task is a detection problem which is evaluated on the mean average precision at different intersection over union (IoU) thresholds.
The IoU of a set of predicted bounding boxes (PredBB) and ground truth bounding boxes (GTBB) is calculated as:
IoU = (PredBB ∩ GTBB) / (PredBB ∪ GTBB)
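For a single pair of axis-aligned 3D boxes, the formula above can be sketched as follows. This is an illustration under assumptions, not the official evaluator: boxes are given as (X1, Y1, Z1, X2, Y2, Z2), and the volume is taken as the product of coordinate differences (an inclusive-voxel convention with +1 per axis is equally plausible and would give slightly different values). The names box_volume and iou_3d are hypothetical.

```python
def box_volume(box):
    """Volume of an axis-aligned box (x1, y1, z1, x2, y2, z2); 0 if degenerate."""
    x1, y1, z1, x2, y2, z2 = box
    return max(0, x2 - x1) * max(0, y2 - y1) * max(0, z2 - z1)


def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    # The intersection box takes the larger of the lower corners
    # and the smaller of the upper corners on each axis.
    inter = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter_vol = box_volume(inter)
    union_vol = box_volume(a) + box_volume(b) - inter_vol
    return inter_vol / union_vol if union_vol > 0 else 0.0


# Two 2x2x2 boxes shifted by 1 along X: intersection 4, union 12.
print(iou_3d((0, 0, 0, 2, 2, 2), (1, 0, 0, 3, 2, 2)))
```

Disjoint boxes yield an IoU of 0 because the clamped intersection volume is zero.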
The metric sweeps over a range of IoU thresholds t, for each t calculating an average precision (AP) value. At a threshold of t, a predicted object is considered a "true positive" if its intersection over union with a ground truth object is greater than t.
At each threshold value t, a precision value is calculated based on the number of true positives (TP), false negatives (FN), and false positives (FP) resulting from comparing the predicted bounding box to all ground truth bounding boxes:
AP(t) = TP(t) / (TP(t) + FP(t) + FN(t))
A true positive is counted when a single predicted bounding box matches a ground truth bounding box with an IoU above the threshold. A false positive is counted when a predicted bounding box had no associated ground truth bounding box with an IoU above the threshold. A false negative indicates a ground truth bounding box had no associated predicted bounding box with an IoU above the threshold.
If there are no ground truth bounding boxes at all for a given CT image, ANY number of predictions (false positives) will result in the image receiving a score of zero, and being included in the mean average precision.
The average precision of a single case is calculated as the mean of the above AP(t) values at each IoU threshold t = (0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75).
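To make the per-case score described above more concrete, here is a minimal sketch. It is not the official evaluator and may differ from it in details. Three things are assumptions: predictions are matched greedily to the best still-unmatched ground truth box, a case with neither ground truth nor predictions scores 1.0, and box volume is the product of coordinate differences. The names iou_3d and case_score are hypothetical.

```python
THRESHOLDS = (0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75)


def iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, z1, x2, y2, z2)."""
    def vol(box):
        return (max(0, box[3] - box[0]) * max(0, box[4] - box[1])
                * max(0, box[5] - box[2]))
    inter = vol(tuple(max(a[i], b[i]) for i in range(3))
                + tuple(min(a[i], b[i]) for i in range(3, 6)))
    union = vol(a) + vol(b) - inter
    return inter / union if union > 0 else 0.0


def case_score(pred, gt, thresholds=THRESHOLDS):
    """Mean over thresholds of TP / (TP + FP + FN) for one CT case."""
    if not gt:
        # Stated rule: any prediction on a case without caverns scores 0.
        # Assumption: no predictions on such a case scores 1.
        return 0.0 if pred else 1.0
    scores = []
    for t in thresholds:
        matched = set()  # indices of ground truth boxes already used
        tp = 0
        for p in pred:
            best, best_iou = None, t
            for j, g in enumerate(gt):
                value = iou_3d(p, g)
                # Strictly greater than t, as in the description above.
                if j not in matched and value > best_iou:
                    best, best_iou = j, value
            if best is not None:
                matched.add(best)
                tp += 1
        fp = len(pred) - tp
        fn = len(gt) - tp
        scores.append(tp / (tp + fp + fn))
    return sum(scores) / len(scores)


gt = [(0, 0, 0, 10, 10, 10)]
print(case_score([(0, 0, 0, 10, 10, 5)], gt))  # IoU is exactly 0.5
```

In the usage line, the predicted box has an IoU of exactly 0.5 with the ground truth, so it counts as a true positive only at thresholds 0.4 and 0.45, and the case score is 2/8 = 0.25. The challenge score would then be the mean of such per-case scores over all test cases.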
Since the metric may be non-obvious, a Python implementation of the evaluator is provided here.
Note: In order to participate in this challenge you have to sign an End User Agreement (EUA). You will find more information on the 'Resources' tab.
The ImageCLEF lab is part of the Conference and Labs of the Evaluation Forum: CLEF 2022. CLEF 2022 consists of independent peer-reviewed workshops on a broad range of challenges in the fields of multilingual and multimodal information access evaluation, and a set of benchmarking activities carried out in various labs designed to test different aspects of mono and cross-language information retrieval systems. More details about the conference can be found here.
Submitting a working note with the full description of the methods used in each run is mandatory. Any run that cannot be reproduced from its description in the working notes might be removed from the official publication of the results. Working notes are published within CEUR-WS proceedings, resulting in an assignment of an individual DOI (URN) and an indexing by many bibliography systems including DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the ImageCLEF organizing committee to ensure quality. As an illustration, ImageCLEF 2021 working notes (task overviews and participant working notes) can be found within CLEF 2021 CEUR-WS proceedings.
Participants of this challenge will automatically be registered at CLEF 2022. In order to be compliant with the CLEF registration requirements, please edit your profile by providing the following additional information:
Regarding the username, please choose a name that represents your team.
This information will not be publicly visible and will be exclusively used to contact you and to send the registration data to CLEF, which is the main organizer of all CLEF labs.
Participating as an individual (non-affiliated) researcher
We welcome individual researchers, i.e. not affiliated to any institution, to participate. We kindly ask you to provide us with a motivation letter containing the following information:
the presentation of your most relevant research activities related to the task/tasks
your motivation for participating in the task/tasks and how you want to exploit the results
a list of your 5 most relevant publications (if applicable)
the link to your personal webpage
The motivation letter should be directly concatenated to the End User Agreement document or sent as a PDF file to bionescu at imag dot pub dot ro. The request will be analyzed by the ImageCLEF organizing committee. We reserve the right to refuse any applicants whose experience in the field is too narrow, and would therefore most likely prevent them from being able to finish the task/tasks.
Information will be posted after the challenge ends.
ImageCLEF 2022 is an evaluation campaign that is being organized as part of the CLEF initiative labs. The campaign offers several research tasks that welcome participation from teams around the world. The results of the campaign appear in the working notes proceedings, published by CEUR Workshop Proceedings (CEUR-WS.org). Selected contributions among the participants will be invited for publication in the following year in the Springer Lecture Notes in Computer Science (LNCS) together with the annual lab overviews.
- You can ask questions related to this challenge on the Discussion Forum. Before asking a new question, please make sure that the question has not been asked before.
- Click on Discussion tab above or direct link: https://www.aicrowd.com/challenges/imageclef-2022-tuberculosis-caverns-detection/discussion
We strongly encourage you to use the public channels mentioned above for communications between the participants and the organizers. In extreme cases, if you have queries or comments that you would like to raise through a private communication channel, you can send us an email at:
- kozlovski [dot] serge [at] gmail [dot] com
- vitali [dot] liauchuk [at] gmail [dot] com
- yashin [dot] dicente [at] warwick [dot] ac [dot] uk
You can find additional information on the challenge here: https://www.imageclef.org/2022/medical/tuberculosis