DIMEMEX@IberLEF 2024

Organized by hugojair


About the task

Social networks play an increasingly crucial role in people's lives, transforming the dynamics of communication and information sharing. Analyzing the content originating on these platforms has become a hot research topic for the computational linguistics community. However, despite the notable advances made in recent years, there are still open challenges that merit additional research for better treatment or a deeper understanding. One such challenge is the detection of abusive content, which includes aspects like hate speech, aggression, offensive language, and other related phenomena.

Given the multimodal nature of social media platforms, we aim to promote the research and development of multimodal computational models for the detection of abusive content in Mexican Spanish, particularly hateful, offensive, and vulgar memes. Memes are defined as the conjunction of a text and an image which, together, often convey a joint meaning. This meaning is predominantly humorous or ironic, and the absence of either the text or the image may alter its interpretation. Accordingly, combining information from both modalities to identify a meme as abusive represents an exciting and challenging problem.

 

Subtasks

DIMEMEX comprises two subtasks:

  • Subtask 1. A three-way classification: hate speech, inappropriate content, and neither.
  • Subtask 2. A finer-grained classification distinguishing instances containing hate speech into different categories such as classism, sexism, racism, and others.

 

Both subtasks will rely on the DIMEMEX corpus, and participants can approach either or both subtasks.

During the development phase, submissions will be evaluated on the validation partition, and participants will receive immediate feedback on the performance of their submissions. During the final phase, submissions will be evaluated on the test partition, and the results will be used to determine the final, official ranking.

The following evaluation measures will be used for both subtasks:

  • Macro-averaged precision, recall, and F1-score; the macro-averaged F1-score will be the leading evaluation measure for both subtasks (see the sketch below).
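
For reference, these macro-averaged scores can be computed with scikit-learn. The sketch below is only an illustration: the label arrays are made up, and the official evaluation script may differ in its exact implementation.

    # Minimal sketch: macro-averaged precision, recall, and F1 with scikit-learn.
    # The label arrays are hypothetical; 0 = hate speech,
    # 1 = inappropriate content, 2 = none (Subtask 1 categories).
    from sklearn.metrics import precision_recall_fscore_support

    y_true = [0, 1, 2, 1, 0, 2]  # gold labels
    y_pred = [0, 1, 1, 1, 0, 2]  # system predictions

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    print(f"macro P={precision:.4f}  R={recall:.4f}  F1={f1:.4f}")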

 

For each submission, participants are free to use text, images, or a combination of both modalities; just remember to specify this information in your filenames (see the "Format of submissions" page). A single leaderboard will be maintained for each subtask, but at the end of the competition we will announce the modalities used by the different participants.

 

By registering for this competition, you agree to use the data exclusively for the purpose of participating in this competition. The data cannot be stored after the competition, nor shared or distributed under any condition.

By submitting results to this competition, you consent to the public release of your scores at the IberLEF workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatically and manually calculated quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.

You further agree that if your team has several members, each of them will register for the competition and together build a competition team, and that if you are a single participant you will build a team with a single member.

You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.

 

IMPORTANT: Add either "TXT", "IMG", or "MM" to your submission filename to indicate whether you used textual information only (TXT), visual information only (IMG), or both textual and visual information (MM), respectively.

Each team can participate with up to 10 submissions in the final phase. During the validation phase, a maximum of 100 submissions is allowed (5 per day). Files to be uploaded must be compressed into a .zip file. This is the expected format for submissions:

Subtask 1

Predictions must be provided in a CSV file with three predictions per line, each associated with a single category (1 for the presence of the category and 0 otherwise), using a comma "," as the separator between predictions on the same line. The order of the categories (columns in the label files) is as follows: hate speech, inappropriate content, and none, for columns 1 to 3, respectively. You should use the same order as the corresponding data files (i.e., line 1 in the prediction file must correspond to meme 1 in the data file). The file must be compressed into a .zip file for submission; please do not use folders.

Example:

0,0,1

1,0,0

0,1,0

1,0,0

0,1,0

1,0,0

...
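
As a convenience, the sketch below shows one way to produce a file in this format and package it for submission. The filename and the predictions list are hypothetical (note the "MM" modality tag described above); adapt them to your own pipeline. The list reproduces the six example rows shown here.

    # Sketch: write one-hot Subtask 1 predictions to a CSV file and zip it.
    # The filename and predictions below are made-up examples.
    import csv
    import zipfile

    # One predicted class index per meme, in the same order as the data file:
    # 0 = hate speech, 1 = inappropriate content, 2 = none.
    predictions = [2, 0, 1, 0, 1, 0]

    csv_name = "predictions_subtask1_MM.csv"
    with open(csv_name, "w", newline="") as f:
        writer = csv.writer(f)
        for label in predictions:
            row = [0, 0, 0]
            row[label] = 1  # one-hot encode the predicted category
            writer.writerow(row)

    # Place the CSV at the top level of the archive (no folders).
    with zipfile.ZipFile("submission.zip", "w") as zf:
        zf.write(csv_name, arcname=csv_name)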

 

Subtask 2

Predictions must be provided in a CSV file with six predictions per line, each associated with a single category (1 for the presence of the category and 0 otherwise), using a comma "," as the separator between predictions on the same line. The order of the categories (columns in the label files) is as follows: classism, racism, sexism, other (hate speech), inappropriate content, and none, for columns 1 to 6, respectively. This order corresponds to the order in the training data. You should use the same order as the corresponding data files (i.e., line 1 in the prediction file must correspond to meme 1 in the data file). The file must be compressed into a .zip file for submission; please do not use folders.

Example:

0,0,1,0,0,0

0,0,1,0,0,0

1,0,0,0,0,0

1,0,0,0,0,0

0,0,1,0,0,0

0,0,0,0,0,1

...

You can check the development data to see the expected format for both the data and the submissions by looking at the reference (ground-truth) files.
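
Before uploading, it may also be worth sanity-checking your prediction file against these rules. The small script below is our own suggestion, not an official validator; it verifies the column count (3 for Subtask 1, 6 for Subtask 2) and that every value is 0 or 1.

    # Sketch of a simple format check (not an official validator).
    import csv
    import sys

    def check_predictions(path: str, n_columns: int) -> None:
        with open(path, newline="") as f:
            for i, row in enumerate(csv.reader(f), start=1):
                if len(row) != n_columns:
                    sys.exit(f"line {i}: expected {n_columns} columns, got {len(row)}")
                if any(value not in ("0", "1") for value in row):
                    sys.exit(f"line {i}: values must be 0 or 1")
        print(f"{path}: format looks OK")

    # Example with the hypothetical Subtask 1 file from above:
    check_predictions("predictions_subtask1_MM.csv", n_columns=3)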

 

Organization team

  • Horacio Jesús Jarquín (INAOE, Mexico)
  • Itzel Tlelo Coyotecatl (INAOE, Mexico)
  • Delia Irazú Hernández (INAOE, Mexico)
  • Marco Casavantes (INAOE, Mexico)
  • Hugo Jair Escalante (INAOE, Mexico)
  • Luis Villaseñor (INAOE, Mexico)
  • Manuel Montes (INAOE, Mexico)

 

 

Contact: please use the forum (the preferred channel) to reach the organizers and other participants.

Paper submission

Format details will be communicated shortly, according to the specifications of the IberLEF organizers.

 

Schedule

  • March 15, 2024: Release of training corpora.
  • May 10, 2024: Release of test corpora and start of the evaluation campaign.
  • May 21, 2024: End of the evaluation campaign.
  • June 7, 2024: Deadline for paper submission.
  • June 21, 2024: Acceptance notification.
  • June 28, 2024: Camera-ready submission deadline.
  • September 2024: Workshop co-located with SEPLN 2024.

Download       Size (MB)   Phase
Public Data    0.132       #1 Development (Subtask 1)
Public Data    0.132       #1 Development (Subtask 2)
Public Data    0.035       #2 Final (Subtask 1)
Public Data    0.035       #2 Final (Subtask 2)

Phases

  • Development (Subtask 1): starts March 15, 2024, 1 a.m. UTC
  • Development (Subtask 2): starts March 15, 2024, 1 a.m. UTC
  • Final (Subtask 1): starts May 10, 2024, 1 a.m. UTC
  • Final (Subtask 2): starts May 10, 2024, 1 a.m. UTC
  • Competition ends: May 21, 2024, noon UTC
