ECAC (SemEval-2024 Task 3)

Organized by Fan1834

SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations (ECAC)

Visit our task website: SemEval-2024_ECAC; Join the mailing group: ECF_ECA@googlegroups.com.
Our task paper and the papers submitted by the participating teams are available on the official website!
Please fill in your registered user information on the online form!

The ability to understand emotions is an essential component of human-like artificial intelligence, as emotions greatly influence human cognition, decision-making, and social interactions. Emotion Cause Analysis, the task of identifying the potential causes behind an individual’s emotional state, is of great importance.

Based on the multimodal conversational emotion cause dataset we built, we define the following two subtasks:

Subtask 1: Textual Emotion-Cause Pair Extraction in Conversations

  • Task definition: Extracting all emotion-cause pairs from the given conversation solely based on text, where the emotion cause is defined and annotated as a textual span.
    • Input: a conversation containing the speaker and the text of each utterance
    • Output: all emotion-cause pairs, where each pair contains an emotion utterance along with its emotion category and the textual cause span in a specific cause utterance, e.g., (3_joy, 2_You made up!). The emotion category should be one of Ekman’s six basic emotions: Anger, Disgust, Fear, Joy, Sadness, and Surprise.
    * Note: There may be multiple cause spans corresponding to the same emotion, thus forming multiple pairs.
  • Task example:
{
"conversation_ID": 5,
"conversation": [
	{
		"utterance_ID": 1,
		"text": "Oh , look , wish me luck !",
		"speaker": "Rachel",
		"emotion": "joy"
	},
	{
		"utterance_ID": 2,
		"text": "What for ?",
		"speaker": "Monica",
		"emotion": "neutral"
	},
	{
		"utterance_ID": 3,
		"text": "I am gonna go get one of those job things .",
		"speaker": "Rachel",
		"emotion": "joy"
	}
	],
"emotion-cause_pairs": [
	[
		"1_joy",
		"3_I am gonna go get one of those job things ."
	],
	[
		"3_joy",
		"3_I am gonna go get one of those job things ."
	]
	]
}
  • Evaluation metrics: In addition to the micro F1 score, we also calculate a weighted average of F1 scores across the six emotion categories. For the textual cause span, we adopt two strategies to determine whether a span is extracted correctly: Strict Match (the predicted span must be exactly the same as the annotated span) and Proportional Match (considering the overlap proportion between the predicted span and the annotated one). A small sketch of the two span-matching strategies is given after the metric list below.
    • w-avg. Strict F1
    • w-avg. Proportional F1 (main)
    • Strict F1
    • Proportional F1
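
As a rough illustration of the two span-matching strategies (not the official scorer, which is linked in the Evaluation section below), the following sketch compares spans given as token-index intervals; normalizing the overlap by the length of the annotated span is an assumption of this sketch:

def proportional_overlap(pred_span, gold_span):
    # Both spans are (start, end) token indexes, left-closed and right-open,
    # e.g., the span "nice day" in "What a nice day today ." is (2, 4).
    pred_start, pred_end = pred_span
    gold_start, gold_end = gold_span
    overlap = max(0, min(pred_end, gold_end) - max(pred_start, gold_start))
    # Assumption of this sketch: the overlap is normalized by the gold span length.
    return overlap / (gold_end - gold_start)

def strict_match(pred_span, gold_span):
    # Strict Match: the predicted span must equal the annotated span exactly.
    return 1.0 if pred_span == gold_span else 0.0

# Gold span "nice day" = (2, 4); a prediction covering only "nice" = (2, 3)
# scores 0.5 under Proportional Match and 0.0 under Strict Match.
print(proportional_overlap((2, 3), (2, 4)))  # 0.5
print(strict_match((2, 3), (2, 4)))          # 0.0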

Subtask 2: Multimodal Emotion-Cause Pair Extraction in Conversations

  • Task definition: Considering all three modalities, extracting all emotion-cause pairs from the given conversation, where the emotion cause is defined and annotated at the utterance level and each utterance is represented by text, audio, and video.
    • Input: a conversation including the speaker, text, and audio-visual clip for each utterance
    • Output: all emotion-cause pairs, where each pair contains an emotion utterance along with its emotion category and a cause utterance, e.g., (5_disgust, 5).
  • Task example:
{
"conversation_ID": 5,
"conversation": [
	{
		"utterance_ID": 1,
		"text": "Oh , look , wish me luck !",
		"speaker": "Rachel",
		"emotion": "joy",
		"video_name": "dia5utt1.mp4"
	},
	{
		"utterance_ID": 2,
		"text": "What for ?",
		"speaker": "Monica",
		"emotion": "neutral",
		"video_name": "dia5utt2.mp4"
	},
	{
		"utterance_ID": 3,
		"text": "I am gonna go get one of those job things .",
		"speaker": "Rachel",
		"emotion": "joy",
		"video_name": "dia5utt3.mp4"
	}
	],
"emotion-cause_pairs": [
	[
		"1_joy",
		"3"
	],
	[
		"3_joy",
		"3"
	]
	]
}
  • Evaluation metrics: In addition to the micro F1 score, we also calculate a weighted average of F1 scores across the six emotion categories. A small sketch of this weighting is given after the metric list below.
    • w-avg. F1 (main)
    • F1
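
For orientation only (the official scorer is linked in the Evaluation section below), here is a minimal sketch of a weighted average of per-emotion F1 scores over utterance-level pairs; weighting each category by its number of annotated pairs is an assumption of this sketch:

def weighted_avg_f1(gold_pairs, pred_pairs):
    # gold_pairs / pred_pairs: sets of (emotion, emotion_utt_ID, cause_utt_ID) tuples.
    emotions = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
    total_gold = len(gold_pairs)
    weighted = 0.0
    for emo in emotions:
        gold = {p for p in gold_pairs if p[0] == emo}
        pred = {p for p in pred_pairs if p[0] == emo}
        tp = len(gold & pred)
        precision = tp / len(pred) if pred else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        # Assumption: each category is weighted by its share of the gold pairs.
        weighted += (len(gold) / total_gold) * f1
    return weighted

# Using the Subtask 2 example above: gold pairs (1_joy, 3) and (3_joy, 3).
gold = {("joy", 1, 3), ("joy", 3, 3)}
pred = {("joy", 1, 3), ("surprise", 2, 1)}
print(weighted_avg_f1(gold, pred))  # 0.666... (joy F1 = 2/3, weight 1.0)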

Evaluation

The calculation instructions for each evaluation metric and the official evaluation script are available on GitHub.

Terms and Conditions

We provide the following terms and conditions to clearly outline the guidelines that participants must adhere to. The organizers reserve the right to modify the following terms in any manner, and in such cases, the modifications will be announced through our Google Group and CodaLab forum. Participants may contact the organizers if any of the following terms raises concerns.

Participation

Participation in this competition signifies your full agreement to these terms and conditions. You should understand and agree that your submissions and scores may be made public.

Anyone interested is free to participate in the competition. Teams are allowed, but a participant can only join one team. Teams and individual participants must create exactly one account on CodaLab for the competition. Once the trial phase begins, the composition of teams cannot be altered.

Submission

Submission files must adhere to the specified format requirements. Each team is allowed to submit up to three times per day. By default, the leaderboard shows each team's best submission according to the metric in the first column. Participants should select their best submission for the leaderboard based on the main metric we set (w-avg. Proportional F1 or w-avg. F1).

Scoring

The organizers are not obligated to release scores during the evaluation phase. Official scores may be removed if the organizers judge a submitted work to be incomplete, erroneous, deceptive, or violating the competition rules.

Data Usage

The dataset provided should be used responsibly and ethically. Do not attempt to abuse it in any way, including but not limited to reconstructing the test set or using the data for any non-scientific or unethical purpose.

Submission Format

A valid submission is a zip-compressed file containing your prediction file(s). You can generate the compressed file for submission by following these steps:

  1. Rename your prediction file as Subtask_1_pred.json or Subtask_2_pred.json. The file name clearly indicates the subtask you will participate in. You can participate in either or both of the two subtasks.
  2. Select your prediction file(s) and compress them directly into a zip file. (*Please note that you should not put your JSON file into a folder and then compress the folder. This will cause an error.) A minimal zipping sketch is given after these steps.
  3. Rename the zip file in English, e.g., "ffwang_submission.zip".
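
For reference, a minimal sketch of producing such an archive with Python's standard library is given below; the file names simply follow the steps above, and the prediction file is written at the root of the zip rather than inside a folder:

import zipfile

# Write the prediction file(s) at the archive root (no folder prefix),
# so that e.g. Subtask_1_pred.json unpacks directly, as required in step 2.
with zipfile.ZipFile("ffwang_submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("Subtask_1_pred.json", arcname="Subtask_1_pred.json")
    # zf.write("Subtask_2_pred.json", arcname="Subtask_2_pred.json")  # if also participating in Subtask 2

The equivalent command line is "zip ffwang_submission.zip Subtask_1_pred.json", run from the directory that contains the prediction file.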

Each prediction file should be formatted similarly to the provided files of the training/evaluation data:

  • Each file stores a list composed of dictionaries, where each dictionary contains all the information for one instance. The keys conversation_ID, utterance_ID, and text and their corresponding values in the dictionary must be retained. The value of the key emotion-cause_pairs contains the prediction results.
  • In particular, for Subtask 1, you need to provide the position indexes of your predicted cause span within the utterance to avoid confusion. The position index starts from 0, and the ending index is the index of the last token plus 1, i.e., the index interval of the cause span is left-closed and right-open, e.g., the indexes of the span "nice day" in the utterance "What a nice day today ." should be "2_4". Please note that, for accurate evaluation, your cause span should not include punctuation tokens at the beginning or end. A small indexing sketch is given after the example below. Here's a valid example in the prediction file Subtask_1_pred.json:
{
"conversation_ID": 48,
"conversation": [
	{
		"utterance_ID": 1,
		"text": "... Dammit , hire the girl ! Okay , everybody ready ?",
		"speaker": "Director"
	},
	{
		"utterance_ID": 2,
		"text": "Uh , listen , I just wanna thank you for this great opportunity .",
		"speaker": "Joey"
	},
	{
		"utterance_ID": 3,
		"text": "Lose the robe .",
		"speaker": "Director"
	},
	{
		"utterance_ID": 4,
		"text": "Me ?",
		"speaker": "Joey"
	}
	],
"emotion-cause_pairs": [
	[
	"2_joy",
	"2_10_13"
	],
	[
	"4_surprise",
	"3_0_3"
	]
	]
}
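
To illustrate the indexing convention above: assuming utterance text is tokenized by splitting on whitespace exactly as provided in the dataset, a predicted cause span can be turned into the "utterance_ID_start_end" string as sketched below (the helper name is illustrative, not part of the official tooling):

def format_cause_span(utterance_text, cause_utt_id, start, end):
    # Tokens are obtained by splitting on whitespace; indexes start from 0,
    # and the interval is left-closed, right-open (end = last token index + 1).
    tokens = utterance_text.split()
    span_text = " ".join(tokens[start:end])
    return f"{cause_utt_id}_{start}_{end}", span_text

# "nice day" in "What a nice day today ." covers token indexes 2 to 4
# (shown here with an illustrative utterance ID of 1).
print(format_cause_span("What a nice day today .", 1, 2, 4))
# ('1_2_4', 'nice day')

# The pair ("2_joy", "2_10_13") in the example above refers to tokens 10-12
# of utterance 2, i.e., "this great opportunity".
print(format_cause_span("Uh , listen , I just wanna thank you for this great opportunity .", 2, 10, 13))
# ('2_10_13', 'this great opportunity')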

Important Dates

Tasks announced July 17, 2023
Training data ready September 4, 2023
Practice start December 1, 2023
Evaluation start January 15, 2024
Evaluation end January 31, 2024
Paper submission due February 19, 2024
Notification to authors March 18, 2024
Camera ready due April 01, 2024
SemEval workshop June 16–21, 2024 (co-located with NAACL 2024)

 

* Note: All deadlines are 23:59 UTC-12 (AOE). The time displayed on the Phases interface of CodaLab is UTC. 

Practice

Start: Dec. 1, 2023, noon

Description: Uses the official evaluation script, but on the trial data.

Evaluation

Start: Jan. 16, 2024, noon

Description: Uses the official evaluation script, on the official evaluation data. Each team is allowed to submit up to three times per day. By default, the leaderboard shows each team's best submission according to the metric in the first column. Participants should select their best submission for the leaderboard based on the main metric we set (w-avg. Proportional F1 or w-avg. F1).

Post-Evaluation

Start: Feb. 1, 2024, noon

Description: Uses the official evaluation script, on the official evaluation data. By default, the leaderboard shows each team's best submission according to the metric in the first column.

Competition Ends

Never

Leaderboard (excerpt)

#   Username       Score
1   Mercurialzs    0.2300
2   Choloe_guo     0.1518
3   aranjan25      0.1431