DISPLACE-2024

Organized by displace2024


Overview

The second DIarization of SPeaker and LAnguage in Conversational Environments (DISPLACE) challenge entails a first-of-its-kind task: performing speaker and language diarization on the same data, containing multi-speaker social conversations in multi-lingual, code-mixed speech. DISPLACE2024 reflects the theme of Interspeech 2024, “Speech and Beyond”, as it aims to:

  1. Establish new benchmarks for speaker diarization (SD) in multilingual settings, language diarization (LD) in multi-speaker settings, and automatic speech recognition (ASR) in multi-accent settings
  2. Evaluate the performance of submitted systems on this dataset

In multi-lingual communities, social conversations frequently involve code-mixed and code-switched speech. Code-mixing occurs when words or morphemes from one language (secondary) are used within a sentence of another language (primary), whereas code-switching refers to switching languages at the sentence or phrase level, leading to a shift in the conversational language. In such cases, various speech processing systems need to perform speaker and language segmentation before any downstream task. Current speaker diarization systems are not equipped to handle multi-lingual conversations, while language recognition systems may not be able to handle the same talker speaking multiple languages within the same recording. With this motivation, the DISPLACE challenge attempts to benchmark and improve speaker diarization (SD) in multi-lingual settings, language diarization (LD) in multi-speaker scenarios, and ASR in multi-accent settings, using the same underlying dataset (audio modality only).

Participation in this challenge is open to all who are interested in contributing towards a new milestone in the speaker diarization, language diarization and ASR areas. The challenge provides datasets (development and evaluation), a leaderboard (for online evaluation) and baseline systems to all participating teams. The results of the challenge will be presented at the Interspeech 2024 conference, to be held on Kos Island, Greece, from 1st to 5th September 2024. More information about the challenge can be found on the DISPLACE2024 website.

Evaluation

The DISPLACE challenge is organized as an open evaluation in which blind evaluation sets will be sent to each team. The evaluation will be done in two phases, namely Phase-1 and Phase-2. The overall timeline for the challenge can be found on the DISPLACE2024 website; the relevant evaluation deadlines for these phases are as follows:

Phase                         Date
Phase-1 Evaluation Opens      1 Feb 2024
Phase-1 Evaluation Closes     28 Feb 2024
Phase-2 Evaluation Opens      1 Apr 2024
Phase-2 Evaluation Closes     20 Apr 2024

 

Each team needs to evaluate their systems on the Phase-1 and Phase-2 blind sets locally and upload their output RTTMs and transcription files (with the specified file naming and folder structure) to the DISPLACE online leaderboard for evaluation.

  • Each team must submit at least one valid system to the registered track before the end of the evaluation duration of Phase-1.
  • No information obtained from other test segments should be utilized for evaluation.
  • Probing the evaluation segments through any kind of manual means is strongly discouraged.
  • Any kind of automatically derived information, such as domain knowledge, can be used for the development and evaluation sets.
  • For Phase-1, each team can make up to 50 submissions in total; however, a team can make at most 7 submissions in a single day. Only the most recently processed valid submission of a team will be displayed on the leaderboard.
  • Teams failing to abide by the above rules will NOT be considered for future evaluations.

The submitted compressed file (in .zip) must have the following directory structure:

For Track-1 (Speaker diarization in multilingual scenarios)

SPEAKER.zip/<session1_name>_SPEAKER_sys.rttm

SPEAKER.zip/<session2_name>_SPEAKER_sys.rttm

For Track-2 (Language diarization in multi-speaker settings)

LANGUAGE.zip/<session1_name>_LANGUAGE_sys.rttm

LANGUAGE.zip/<session2_name>_LANGUAGE_sys.rttm
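
For both diarization tracks, the system output is expected as RTTM files. The short Python sketch below writes segments in the widely used NIST RTTM layout, i.e. one space-delimited line per segment with type, file ID, channel, onset, duration, a label, and <NA> placeholders. The session name, segment values, and the label conventions (speaker IDs for Track-1, language IDs for Track-2) are illustrative assumptions here; the Evaluation Plan and the baseline recipes remain the authoritative reference for the exact format.

    # Minimal sketch (not the official tooling): write diarization output as
    # standard NIST RTTM lines:
    #   TYPE FILE CHANNEL ONSET DURATION <NA> <NA> LABEL <NA> <NA>
    def write_rttm(path, session, segments, rttm_type="SPEAKER", channel=1):
        """segments: iterable of (onset_seconds, duration_seconds, label) tuples."""
        with open(path, "w") as f:
            for onset, dur, label in segments:
                f.write(f"{rttm_type} {session} {channel} {onset:.3f} {dur:.3f} "
                        f"<NA> <NA> {label} <NA> <NA>\n")

    # Hypothetical session name, timestamps and speaker labels, for illustration only.
    write_rttm("B001_SPEAKER_sys.rttm", "B001",
               [(0.00, 4.25, "spk1"), (4.25, 2.10, "spk2")])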

For Track-3 (Automatic Speech Recognition in multi-accent settings)

For Closefield Data:

ASR_closefield.zip/<session1_name>_<LANGUAGEID>.wrd.trn

ASR_closefield.zip/<session2_name>_<LANGUAGEID>.wrd.trn

NOTE: Please merge all speaker transcription files of a specific language within a session into a single transcription file when submitting for evaluation. For example, concatenate adjq_S1_hi.wrd.trn, adjq_S2_hi.wrd.trn and adjq_S3_hi.wrd.trn into adjq_hi.wrd.trn
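
A minimal sketch of that merge is shown below, assuming the per-speaker .wrd.trn files are plain text that can simply be concatenated; the helper name is ours, not part of the official tooling, and the file names follow the example above.

    # Merge the per-speaker transcription files of one language within a session
    # into a single file, e.g. adjq_S1_hi.wrd.trn + adjq_S2_hi.wrd.trn +
    # adjq_S3_hi.wrd.trn -> adjq_hi.wrd.trn. Assumes plain concatenation is
    # sufficient because each transcription line carries its own utterance ID.
    from glob import glob

    def merge_trn(session, lang_id):
        parts = sorted(glob(f"{session}_S*_{lang_id}.wrd.trn"))
        out_path = f"{session}_{lang_id}.wrd.trn"
        with open(out_path, "w") as out:
            for part in parts:
                with open(part) as f:
                    text = f.read()
                out.write(text if text.endswith("\n") else text + "\n")
        return out_path

    merge_trn("adjq", "hi")   # -> adjq_hi.wrd.trn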

For Farfield Data:

ASR_farfield.zip/<session1_name>_<LANGUAGEID>.wrd.trn

ASR_farfield.zip/<session2_name>_<LANGUAGEID>.wrd.trn

The zipped submission folders for closefield Dev data are to be submitted in the 'ASR evaluation on Dev - Closefield' phase tab in the Submit/View Results section. Similarly, the zipped submission folders for farfield Dev data are to be submitted in the 'ASR evaluation on Dev - Farfield' phase tab in the Submit/View Results section. The same applies to the Eval closefield and farfield data. It is mandatory to make submissions for both closefield and farfield data.

NOTE: The compressed file must not contain an intermediate directory, i.e., it must not have the following structure:

SPEAKER.zip/<Directory>/<session1_name>_SPEAKER_sys.rttm

LANGUAGE.zip/<Directory>/<session1_name>_LANGUAGE_sys.rttm
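
One way to produce an archive with the required flat layout is sketched below: each file is added with its bare file name as the archive member name, so no intermediate directory appears inside the zip. The glob patterns assume the RTTMs sit in the current working directory and are illustrative only.

    # Package the output files at the top level of the archive (arcname is the
    # bare file name), so the resulting zip contains no intermediate directory.
    import zipfile
    from glob import glob
    from pathlib import Path

    def make_flat_zip(zip_name, pattern):
        with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
            for path in sorted(glob(pattern)):
                zf.write(path, arcname=Path(path).name)

    make_flat_zip("SPEAKER.zip", "*_SPEAKER_sys.rttm")
    make_flat_zip("LANGUAGE.zip", "*_LANGUAGE_sys.rttm")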

Also make sure to check out the complete evaluation protocol and guidelines in the Evaluation Plan.

Terms and Conditions

The data for the challenge is provided, as described in the DISPLACE dataset description document, under the terms of the MIT license. All participants are requested to read the terms and conditions document thoroughly to ensure valid submissions.

 

Registered participants will receive the datasets via email.

Phases

  • Speaker Diarization evaluation on Dev (starts Feb. 2, 2024, midnight UTC)
  • Speaker Diarization evaluation on Phase 1 (starts Feb. 2, 2024, midnight UTC)
  • Speaker Diarization evaluation on Phase 2 (starts March 6, 2024, midnight UTC)
  • Language Diarization evaluation on Dev (starts Feb. 2, 2024, midnight UTC)
  • Language Diarization evaluation on Phase 1 (starts Feb. 2, 2024, midnight UTC)
  • Language Diarization evaluation on Phase 2 (starts March 6, 2024, midnight UTC)
  • ASR evaluation on Dev - Farfield (starts Feb. 2, 2024, midnight UTC)
  • ASR evaluation on Dev - Closefield (starts Feb. 2, 2024, midnight UTC)
  • ASR evaluation on Phase 1 - Farfield (starts Feb. 2, 2024, midnight UTC)
  • ASR evaluation on Phase 1 - Closefield (starts Feb. 2, 2024, midnight UTC)
  • ASR evaluation on Phase 2 - Farfield (starts March 6, 2024, midnight UTC)
  • ASR evaluation on Phase 2 - Closefield (starts March 6, 2024, midnight UTC)
  • Competition Ends (May 20, 2024, midnight UTC)
