MEDDOPLACE Shared Task

Organized by s-lilo

First phase: Task 4 End-to-End [Practice]
May 1, 2023, midnight UTC

Competition Ends
June 17, 2023, noon UTC

Task Summary

MEDDOPLACE stands for MEDical DOcument PLAce-related Content Extraction. It is a shared task and set of resources focused on the detection of different kinds of places, and related types of information such as nationalities or patient movements, in medical documents in Spanish.

MEDDOPLACE Summary Figure

For more information about the task, data examples, schedule, etc. please visit https://temu.bsc.es/meddoplace.

To contact the organizers directly, you can write to <salvador.limalopez [at] gmail.com> or <krallinger.martin [at] gmail.com>.

 

Sub-tasks

MEDDOPLACE offers four different sub-tasks, each focused on a specific problem:

- SUB-TASK 1: Named Entity Recognition

This is a classic NER task where participants have to detect entities in text, using the full-text documents as input.

- SUB-TASK 2: Entity Linking / Toponym Resolution

In this sub-task, participants must assign a unique identifier to every text mention in order to disambiguate it. The sub-task is divided into three:

  • Sub-task 2.1: Normalization to GeoNames
  • Sub-task 2.2: Normalization to PlusCodes
  • Sub-task 2.3: Normalization to SNOMED CT

- SUB-TASK 3: Entity Classification

This sub-task involves further classifying the location entities found in text into pre-defined classes of clinical relevance, such as place of birth, residence or healthcare attention.

- SUB-TASK 4: End-to-end

This sub-task challenges participants to perform all three previous sub-tasks at once. That is, they must detect entities in text, normalize them to the corresponding ontology, and classify the appropriate entities. The main difference from the previous sub-tasks is that the normalization and classification systems will not be evaluated in a vacuum, but will instead depend on the mentions detected by the NER system.

 

For more information about each of them, please refer to the MEDDOPLACE website and the Task Guide.

Evaluation Setting

To allow the individual evaluation of normalization and classification systems, the evaluation will be divided into two phases:

  • Phase 1: Sub-task 1 (NER) + Sub-task 4 (End-to-End)
  • Phase 2: Sub-task 2 (Normalization) + Sub-task 3 (Classification)

Because of this, the test data will be released in two parts. For Phase 1, only the test text files will be released. Once Phase 1 is finished, the list of entities found in the test files will be published so that participants can create normalization and classification predictions for Phase 2. Participants are allowed to re-use their Sub-task 4 systems in the rest of the sub-tasks.

The schedule for these phases is available on https://temu.bsc.es/meddoplace/schedule/.

You will find that this CodaLab has a Practice and a Final submission phase for each sub-task. Use the Practice submission to check that everything works correctly and to evaluate your system on a reduced version of the test set. Once you feel ready, use the Final submission to upload the predictions that you want to count towards the task; these will be evaluated on the complete test set. If you need to run evaluations on your own, you can also use the MEDDOPLACE scorer.

For more information on the evaluation and sub-tasks, including data formats and tips, please read the Task Guide.

 

Evaluation Metrics

Sub-tasks 1 and 4 will use strict, micro-averaged precision, recall and F-1 score as their main metrics, while Sub-tasks 2 and 3 will use accuracy (the percentage of correct mentions out of the total). In addition, some sub-tasks include additional metrics that may help to further understand and interpret the results:

Sub-task                                 Additional Metrics
Sub-task 1 (NER)                         Overlapping, micro-averaged precision, recall and F-1 score
Sub-task 2.1 (GeoNames normalization)    Accuracy@161km, Area Under the Curve (AUC), Mean and Median Error
Sub-task 2.2 (PlusCodes normalization)   Accuracy@161km, Area Under the Curve (AUC), Mean and Median Error
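For intuition, "strict" matching means a prediction only counts as correct when the file, both span boundaries and the label coincide exactly with a gold annotation. The following is a minimal illustrative sketch, not the official MEDDOPLACE scorer; the tuple layout and the "LUGAR" label are assumptions for the example:

```python
def strict_micro_prf(gold, pred):
    """Strict micro-averaged P/R/F1 over (filename, start, end, label) tuples.

    A prediction is a true positive only if filename, both span boundaries
    and the label match a gold annotation exactly.
    """
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented toy example: one exact match, one miss on each side.
gold = [("doc1.txt", 0, 6, "LUGAR"), ("doc1.txt", 20, 28, "LUGAR")]
pred = [("doc1.txt", 0, 6, "LUGAR"), ("doc1.txt", 30, 35, "LUGAR")]
p, r, f = strict_micro_prf(gold, pred)  # p = r = f = 0.5
```

Micro-averaging pools true positives across all documents and labels before computing the ratios, so frequent entity types weigh more than rare ones.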
The additional metrics given for the GeoNames and PlusCodes normalizations are distance-based metrics, which are more appropriate for Toponym Resolution. They are described in:

Gritta, M., Pilehvar, M.T. & Collier, N. A pragmatic guide to geoparsing evaluation. Lang Resources & Evaluation 54, 683–712 (2020). https://doi.org/10.1007/s10579-019-09475-3 [pages 694-697]
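As a rough illustration of the distance-based metrics, Accuracy@161km counts a prediction as correct when its resolved coordinates fall within 161 km (about 100 miles) of the gold coordinates. A minimal sketch using the haversine great-circle distance follows; this is not the official scorer, and the example coordinates are only illustrative:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def accuracy_at_161km(gold_coords, pred_coords):
    """Fraction of predictions within 161 km (~100 miles) of the gold point."""
    hits = sum(haversine_km(*g, *p) <= 161.0
               for g, p in zip(gold_coords, pred_coords))
    return hits / len(gold_coords)

# Madrid vs. Barcelona are roughly 500 km apart, so only the exact
# match counts as a hit here.
madrid, barcelona = (40.4168, -3.7038), (41.3874, 2.1686)
acc = accuracy_at_161km([madrid, madrid], [madrid, barcelona])  # 0.5
```

Mean and median error are simply the mean and median of the same per-mention distances, and the AUC variant integrates over the distance threshold rather than fixing it at 161 km.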

 

Submission Format

Predictions must be submitted as .TSV files with one annotation per row (the same format as the training data). You must include the headers as the first row. These are the columns for each sub-task (fields given as input are set in roman type, while the fields that participants must predict are in italics):

  • Sub-task 1 (NER): filename, label, start_span, end_span and text
  • Sub-task 2 (Normalization): filename, label, start_span, end_span, text, normalization and source
  • Sub-task 3 (Classification): filename, label, start_span, end_span, text and class
  • Sub-task 4 (End-to-end): filename, label, start_span, end_span, text, normalization, source and class

More information on format specifics and the meaning of each column can be found in the Task Guide.
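As a sketch of the expected file layout, the Sub-task 1 header row and one prediction row can be written with Python's csv module. The filename, label and spans below are invented for illustration; the real label set is defined in the Task Guide:

```python
import csv

# Column headers for Sub-task 1 (NER), as listed above.
COLUMNS = ["filename", "label", "start_span", "end_span", "text"]

# Invented example prediction; "LUGAR" is a placeholder label.
predictions = [
    {"filename": "caso_001.txt", "label": "LUGAR",
     "start_span": 12, "end_span": 21, "text": "Barcelona"},
]

# newline="" lets the csv module control line endings itself.
with open("predictions.tsv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=COLUMNS, delimiter="\t")
    writer.writeheader()
    writer.writerows(predictions)
```

For the other sub-tasks, extend COLUMNS with the extra fields listed above (normalization, source and/or class).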

With your Final submissions, you must also include a .TXT file that contains:

  • Your team's name (e.g. "teambsc") and your contact information.
  • A short system description (e.g. "Spanish RoBERTa model fine-tuned on...").
  • The resources used (e.g. MEDDOPLACE corpus, gazetteer obtained from...).

Package each submission as a single .ZIP file containing one .TSV file (and one .TXT for Final runs), and give it a recognizable name (e.g. your team's name + a short description + a number: teambsc_roberta_01.zip). Make sure there is only one .TSV file inside your .ZIP file, otherwise the scorer will fail.
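The packaging step can be sketched with Python's zipfile module. The file names are illustrative, and for Practice runs the .TXT is not required:

```python
import zipfile
from pathlib import Path

def package_submission(tsv_path, txt_path, zip_name):
    """Bundle exactly one .TSV and one .TXT into a .ZIP for upload."""
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(tsv_path)
        zf.write(txt_path)

# Create placeholder files so the example runs end to end.
Path("predictions.tsv").write_text(
    "filename\tlabel\tstart_span\tend_span\ttext\n")
Path("description.txt").write_text("Team: teambsc\nSystem: ...\n")
package_submission("predictions.tsv", "description.txt",
                   "teambsc_roberta_01.zip")
```

Keeping one .TSV per .ZIP matches the scorer's expectation noted above.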

MEDDOPLACE Terms and Conditions

- MEDDOPLACE is organized by the Barcelona Supercomputing Center's NLP for Biomedical Information Analysis and funded by the Spanish government's Plan de Tecnologias del Lenguaje (Plan-TL).
- MEDDOPLACE is part of the IberLEF 2023 workshop, held within SEPLN 2023.
- The task's website hosts more information about the task in general, including schedule, corpus examples and download.
- Participating in the shared task is free of charge.
- Participants are invited to write a paper explaining their methodology and results (system description) in the IberLEF 2023 Proceedings (also free of charge).
- Participants who decide to write a system description must follow the formatting instructions described in the task website's Publications page.
 
- Only submissions in the sub-tasks' Final run will be counted
- Submissions in the Final run must include a .TXT explaining the submission (see Evaluation tab)
- Submissions after the final deadline won't count towards the task's official leaderboard (included in the task's overview paper)
- If you use the MEDDOPLACE data or task, please cite the task's overview paper. Citation coming soon!

Submission phases

Opening May 1, 2023, midnight UTC:

  • Task 1 NER [Practice] and [Final]
  • Task 4 End-to-End [Practice] and [Final]

Opening June 8, 2023, midnight UTC:

  • Task 2.1 GeoNames [Practice] and [Final]
  • Task 2.2 PlusCodes [Practice] and [Final]
  • Task 2.3 SNOMED CT [Practice] and [Final]
  • Task 3 Classification [Practice] and [Final]

Use each [Practice] phase for non-final results and submit your final results to the corresponding [Final] phase.

Competition Ends: June 17, 2023, noon UTC
