[Task 1] CXR-LT: Long-tailed, multi-label, and zero-shot classification on chest X-rays

Organized by CXR-LT-2024



MICCAI 2024 Challenge Task 1

**Notice: This competition requires credentialed access to a medical imaging dataset. Please carefully follow all instructions in the Terms and Conditions before registering!**

To Participate

This competition uses data from MIMIC-CXR-JPG v2.0.0, which requires credentialing through PhysioNet and a signed data use agreement (DUA) for MIMIC-CXR-JPG. To participate in this competition, you must follow these steps:

  1. Become a credentialed user through PhysioNet and sign the DUA for MIMIC-CXR-JPG v2.0.0 access. (If you are already credentialed with MIMIC-CXR-JPG access, you can skip this step.)
  2. Fill out this Google Form providing (i) your CodaLab email address, (ii) proof that you are a credentialed PhysioNet user, and (iii) proof that you signed the DUA for MIMIC-CXR-JPG v2.0.0. **IMPORTANT: Please ensure your PhysioNet email is the same as your CodaLab email to get approved.**
  3. Register for the competition on CodaLab via the "Participate" tab and await our review.

If you have completed these steps correctly, you will be admitted to the competition and we will email you links to download the necessary data! You are not permitted to share these labels under any circumstances.

Background

Chest radiography, like many diagnostic medical exams, produces a long-tailed distribution of clinical findings: while a small subset of diseases is routinely observed, the vast majority are relatively rare [1]. This poses a challenge for standard deep learning methods, which exhibit bias toward the most common classes at the expense of the important but rare "tail" classes [2]. Many methods have been proposed to tackle this type of imbalance [3], though only recently has attention been given to long-tailed medical image recognition problems [4-6]. Diagnosis on chest X-rays (CXRs) is also a multi-label problem, as patients often present with multiple disease findings simultaneously; however, only a select few studies incorporate knowledge of label co-occurrence into the learning process [7-10]. Since most large-scale image classification benchmarks contain single-label images with a mostly balanced distribution of labels, many standard deep learning methods fail to accommodate the class imbalance and co-occurrence problems posed by the long-tailed, multi-label nature of tasks like disease diagnosis on CXRs [2]. This task will evaluate a model's ability to perform "in-distribution" long-tailed, multi-label disease classification on CXRs, using a large test set with noisy, automatically text-mined labels whose classes have all been encountered during training.

Dataset

This challenge will use an expanded version of MIMIC-CXR-JPG [11], a large benchmark dataset for automated thorax disease classification. Each CXR study in the dataset was labeled with 26 newly added disease findings extracted from the associated radiology reports. The resulting long-tailed (LT) dataset contains 377,110 CXRs, each labeled with at least one of 40 clinical findings (including a "Normal" class).

Task

Given a CXR, detect all clinical findings. If no findings are present, predict "Normal", which simply means that no cardiopulmonary disease or abnormality was found (excluding "Support Devices"). To do this, you will train multi-label thorax disease classifiers on the provided labeled training data.
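For illustration, here is a minimal multi-label training sketch in PyTorch, assuming an ImageNet-pretrained torchvision backbone; the backbone choice, the pos_weight values, and the training_step helper are placeholders, not an official baseline:

    # Minimal multi-label classifier sketch (illustrative only, not an
    # official baseline). Assumes an ImageNet-pretrained torchvision backbone.
    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 40  # the 40 clinical findings, including "Normal"

    model = models.resnet50(weights="IMAGENET1K_V2")
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    # One sigmoid output per finding; pos_weight can up-weight rare "tail"
    # classes (the all-ones value here is a placeholder).
    criterion = nn.BCEWithLogitsLoss(pos_weight=torch.ones(NUM_CLASSES))

    def training_step(images: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        """images: (B, 3, H, W); targets: (B, 40) multi-hot float tensor."""
        logits = model(images)  # (B, 40) raw scores, one per finding
        return criterion(logits, targets)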

Challenge

This task is hosted in conjunction with MICCAI 2024. After the challenge concludes, participants will be invited to submit their solutions for potential presentation at the CXR-LT 2024 challenge event at MICCAI 2024. Additionally, we plan to coordinate a publication summarizing the challenge results, with invitations extended to the top-performing teams to serve as coauthors. We intend to select the top 3 teams for oral presentations at the CXR-LT 2024 challenge event at MICCAI 2024 in Morocco.

 

Prizes

  • 🥇 1st place: $500
  • 🥈 2nd place: $300
  • 🥉 3rd place: $200

Tentative Timeline

  • 05/01/2024: Development Phase begins. Participants can begin making submissions and tracking results on the public leaderboard.
  • 08/26/2024: Test Phase begins. Unlabeled test data will be released to registered participants. The leaderboard will be kept private for this phase.
  • 08/30/2024: Test Phase ends and the competition is closed.
  • Following the Test Phase: Top-performing teams invited to present at MICCAI 2024.
  • 10/10/2024: MICCAI 2024 CXR-LT Challenge event.


References

  1. Zhou SK, Greenspan H, Davatzikos C, Duncan JS, Van Ginneken B, Madabhushi A, Prince JL, Rueckert D, Summers RM. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE. 2021 Feb 26;109(5):820-38.
  2. Holste G, Wang S, Jiang Z, Shen TC, Shih G, Summers RM, Peng Y, Wang Z. Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study. In Data Augmentation, Labelling, and Imperfections: Second MICCAI Workshop, DALI 2022, Held in Conjunction with MICCAI 2022, Singapore, September 22, 2022, Proceedings 2022 Sep 16 (pp. 22-32). Cham: Springer Nature Switzerland.
  3. Zhang Y, Kang B, Hooi B, Yan S, Feng J. Deep long-tailed learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 Apr 19.
  4. Zhang R, Haihong E, Yuan L, He J, Zhang H, Zhang S, Wang Y, Song M, Wang L. MBNM: multi-branch network based on memory features for long-tailed medical image recognition. Computer Methods and Programs in Biomedicine. 2021 Nov 1;212:106448.
  5. Ju L, Wang X, Wang L, Liu T, Zhao X, Drummond T, Mahapatra D, Ge Z. Relational subsets knowledge distillation for long-tailed retinal diseases recognition. In Medical Image Computing and Computer Assisted Intervention--MICCAI 2021: 24th International Conference, Strasbourg, France, September 27-October 1, 2021, Proceedings, Part VIII 24 2021 (pp. 3-12). Springer International Publishing.
  6. Yang Z, Pan J, Yang Y, Shi X, Zhou HY, Zhang Z, Bian C. ProCo: Prototype-Aware Contrastive Learning for Long-Tailed Medical Image Classification. In Medical Image Computing and Computer Assisted Intervention--MICCAI 2022: 25th International Conference, Singapore, September 18-22, 2022, Proceedings, Part VIII 2022 Sep 16 (pp. 173-182). Cham: Springer Nature Switzerland.
  7. Chen H, Miao S, Xu D, Hager GD, Harrison AP. Deep hierarchical multi-label classification of chest X-ray images. In International Conference on Medical Imaging with Deep Learning 2019 May 24 (pp. 109-120). PMLR.
  8. Wang G, Wang P, Cong J, Liu K, Wei B. BB-GCN: A Bi-modal Bridged Graph Convolutional Network for Multi-label Chest X-Ray Recognition. arXiv preprint arXiv:2302.11082. 2023 Feb 22.
  9. Chen B, Li J, Lu G, Yu H, Zhang D. Label co-occurrence learning with graph convolutional networks for multi-label chest x-ray image classification. IEEE Journal of Biomedical and Health Informatics. 2020 Jan 16;24(8):2292-302.
  10. Moukheiber D, Mahindre S, Moukheiber L, Moukheiber M, Wang S, Ma C, Shih G, Peng Y, Gao M. Few-Shot Learning Geometric Ensemble for Multi-label Classification of Chest X-Rays. In Data Augmentation, Labelling, and Imperfections: Second MICCAI Workshop, DALI 2022, Held in Conjunction with MICCAI 2022, Singapore, September 22, 2022, Proceedings 2022 Sep 16 (pp. 112-122). Cham: Springer Nature Switzerland.
  11. PhysioNet. MIMIC-CXR-JPG - chest radiographs with structured labels [Internet]. Available from: https://physionet.org/content/mimic-cxr-jpg/2.0.0/.

CXR-LT: Evaluation

Participants will upload image-level predictions on the provided test sets for evaluation. Since this is a multi-label classification problem with severe imbalance, the primary evaluation metric will be mean Average Precision (mAP), i.e., "macro-averaged" AP across the 40 classes. While Area Under the Receiver Operating Characteristic Curve (AUC) is a standard metric for related datasets, AUC can be heavily inflated in the presence of strong imbalance. mAP is more appropriate for the long-tailed, multi-label setting since it both (i) measures performance across decision thresholds and (ii) ignores the large pool of true negatives that inflates AUC under class imbalance. For thoroughness, mean AUC (mAUC) and mean F1 score (mF1) -- using a decision threshold of 0.5 for each class -- will also be calculated and appear on the leaderboard, but will not contribute to team rankings. Mean expected calibration error (mECE) will also be computed to assess model calibration.
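For concreteness, a minimal sketch of how these leaderboard metrics can be computed with scikit-learn; the official scoring code may differ, and the 10-bin ECE scheme below is an assumption:

    # Sketch of the leaderboard metrics using scikit-learn (unofficial).
    import numpy as np
    from sklearn.metrics import average_precision_score, f1_score, roc_auc_score

    def ece(y_true, y_prob, n_bins=10):
        """Equal-width-binned expected calibration error for one class."""
        idx = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
        err = 0.0
        for b in range(n_bins):
            mask = idx == b
            if mask.any():
                # bin weight * |observed frequency - mean confidence|
                err += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
        return err

    def leaderboard_metrics(y_true, y_score):
        """y_true: (N, 40) binary matrix; y_score: (N, 40) probabilities."""
        return {
            "mAP": average_precision_score(y_true, y_score, average="macro"),
            "mAUC": roc_auc_score(y_true, y_score, average="macro"),
            "mF1": f1_score(y_true, (y_score >= 0.5).astype(int), average="macro"),
            "mECE": np.mean([ece(y_true[:, c], y_score[:, c])
                             for c in range(y_true.shape[1])]),
        }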

There will be two phases of the competition:

  • Phase 1: Development Phase (starts 05/01/2024; ends 08/01/2024). We will provide you with labeled training data (70% of the data) and unlabeled validation data for model development (10% of the data). Train your models on the training data and upload predictions on this development set. Model performance on the development set data will be displayed on the public leaderboard for your feedback.
  • Phase 2: Test Phase (starts 08/26/2024; ends 08/30/2024). We will provide you with unlabeled testing data (20% of the data) for final evaluation. Using your models trained on the provided training data, upload predictions on this newly released test set. Final competition results will only be based on model performance during the Test Phase on the provided test set. For this phase, the leaderboard will be kept hidden until the CXR-LT event at MICCAI 2024, and you will be given up to five successful submissions.

CXR-LT: Terms and Conditions

This competition uses data from MIMIC-CXR-JPG v2.0.0, which requires credentialing through PhysioNet and a signed data use agreement (DUA) for MIMIC-CXR-JPG. To participate in this competition, you must:

  1. Become a credentialed user through PhysioNet and sign the DUA for MIMIC-CXR-JPG v2.0.0 access. (If you are already credentialed with MIMIC-CXR-JPG access, you can skip this step.)
  2. Fill out this Google Form providing (i) your CodaLab email address, (ii) proof that you are a credentialed PhysioNet user, and (iii) proof that you signed the DUA for MIMIC-CXR-JPG v2.0.0.
  3. Register for the competition on CodaLab via the "Participate" tab and await our review. If you have completed these steps correctly, you will be admitted to the competition and we will provide links to download the necessary data by email! You are not permitted to share these labels whatsoever.

By registering for this competition, you also agree to the following terms and conditions:

  • You cannot train on images in this competition's development and test sets. If we find any evidence that you have trained on MIMIC-CXR images in the competition's unlabeled development and test sets, you will be disqualified and not permitted to participate in the MICCAI 2024 CXR-LT challenge.
    • You are permitted to train on external data and use pretrained models, so long as they were not trained on images in this competition's development and test sets. For example, this means that you can use ImageNet-pretrained models, but you likely cannot use models pretrained on MIMIC-CXR since those models were almost certainly trained on images in this competition's development and/or test sets (we use different splits from the official MIMIC-CXR-JPG train/validation/test split).
  • All submissions must contain (i) a predictions .csv file and (ii) a "code/" directory with all code used for model development and inference.
    • Please see detailed submission instructions under "Learn the Details" -> "Submission Format". You can find a properly formatted sample submission for the development set under "Participate" -> "Files" once registered for the competition. In short:
      1. Your .csv must contain a "dicom_id" column identifying each unique image in the given evaluation set, plus a column for each of the 40 classes. All values in these 40 class columns must lie in the range [0, 1], representing the probability that the given image (row) contains the given finding (column).
      2. All of your training and inference code must be placed in a "code/" directory. You will only be allowed to submit to and participate in the MICCAI 2024 CXR-LT challenge if your code is transparent and reproducible.
      3. Your "code/" directory and prediction .csv file should be compressed together into a single .zip file, which you will upload for submission.
  • You may not, under any circumstances, share any of the provided labeled training data, in accordance with the MIMIC-CXR-JPG v2.0.0 DUA.
  • If you wish to participate in a team, you must use a single shared CodaLab account.
  • After completion of the challenge (08/30/2024), you must delete all provided labels from your computer(s).

 

Submission Format

Submission File Structure

All CodaLab submissions are required to be in .zip format. For this competition, this compressed .zip file must contain (i) a predictions .csv file and (ii) a "code/" directory with all of your training and inference code. The required file structure is as follows:

        xxx.csv  # predictions .csv file
        code/  # code directory
        ├── yyy.py
        ├── zzz.py
        ├── ...

To create the final submission .zip file, you might then run zip -r submission.zip xxx.csv code. Please note that the names of your individual submission files do not matter, though the code directory must be named "code".

Predictions .csv File Requirements

Your predictions .csv file must contain image-level predictions of the probability that each of the 40 classes is present in a given image. Specifically,

  • You must have a "dicom_id" column with the provided unique image IDs for the given evaluation set.
    • Entries must be strings.
  • You must have a column for each of the 40 class labels ("Adenopathy", "Atelectasis", etc.).
    • Entries must be floats or integers in the interval [0, 1].

Please see the "Starting Kit" under "Participate" -> "Files" for a full, valid sample submission once registered for the competition.
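For illustration, a minimal pandas sketch that assembles and sanity-checks a predictions file against these requirements; the class list is abbreviated, and write_submission and its arguments are placeholders:

    # Sketch: build and sanity-check a predictions .csv with pandas.
    # CLASSES is abbreviated; a real submission needs all 40 class columns.
    import pandas as pd

    CLASSES = ["Adenopathy", "Atelectasis", "Normal"]  # ...all 40 in practice

    def write_submission(dicom_ids, probs, path="predictions.csv"):
        """dicom_ids: N image-ID strings; probs: (N, len(CLASSES)) array in [0, 1]."""
        df = pd.DataFrame(probs, columns=CLASSES)
        df.insert(0, "dicom_id", [str(d) for d in dicom_ids])
        assert df["dicom_id"].is_unique, "dicom_id values must be unique"
        assert ((df[CLASSES] >= 0) & (df[CLASSES] <= 1)).all().all(), \
            "probabilities must lie in [0, 1]"
        df.to_csv(path, index=False)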

Development

Start: May 1, 2024, midnight

Description: Development Phase: Train models with the given labeled training data and upload your predictions on the unlabeled development set. See the sample submission under "Participate" -> "Files" for an example of a properly formatted submission.

Test

Start: Aug. 26, 2024, midnight

Description: Test Phase: Train models with the given labeled training data and upload your predictions on the unlabeled test set (to be released after the Development Phase ends). For this phase, the leaderboard will be kept private, though you will receive feedback on your submissions by clicking "Download output from scoring step" on a successful submission. You are only allowed 5 successful submissions during this phase, so be very careful! See the sample submission under "Participate" -> "Files" for an example of a properly formatted submission. Make absolutely sure that your submission contains predictions and dicom_ids for the *test set* (not the development set)!

Competition Ends

Sept. 6, 2024, midnight
