Hope is one of the exceptional human capacities that allow for flexible anticipation of future events and their expected outcomes. These visions significantly influence emotions, behaviour and mood (Bruininks and Malle, 2005). Individuals with high hope do not react to barriers in the same way as individuals with low hope; instead, they view barriers as challenges to overcome and use their pathway thoughts to plan an alternative route to their goals (Snyder, 1994, 2000). In addition, high hope has been found to correlate with a number of beneficial outcomes, such as academic performance (Snyder, 2002) and lower levels of depression (Snyder et al., 1997). In contrast, low hope is associated with negative outcomes, such as reduced well-being (Diener, 2009).
Despite the importance and prevalence of hope, it has received little attention in the field of Natural Language Processing (NLP). Machine learning and NLP techniques can be used to analyze social media data and provide insights into the nature of hope in human behavior and decision-making. Therefore, over the last two years, we have promoted research on this topic by organizing shared tasks on hope speech detection: at the second workshop on Language Technology for Equality, Diversity and Inclusion (LT-EDI-2022), as part of ACL 2022 (Chakravarthi et al., 2022); at LT-EDI-2023, within RANLP 2023 (Kumaresan et al., 2023); and the shared task HOPE at IberLEF 2023 (Jiménez-Zafra et al., 2023). Registration figures for these previous tasks, which ranged from a minimum of 50 to a maximum of 126 participants, show that a target community can be expected.
The main novelty of this new edition is the study of hope from two perspectives: i) hope for equality, diversity and inclusion, and ii) hope as expectations. The first perspective was explored in the previous edition, at IberLEF 2023, for English and Spanish. In this new edition, we focus on expanding and improving our Spanish dataset to answer one of the questions most frequently asked by previous participants and researchers in hope speech detection: is it possible to detect hope speech in multiple domains when models are trained only on texts from one specific area? Therefore, this time we again provide the teams with a training corpus focused on the LGTBI community, but we ask them to test their systems on texts belonging to both the LGTBI domain and new, unknown domains. The second perspective has not been studied previously in any shared task and, for the IberLEF 2024 edition, we propose its study from a binary and a multi-class perspective for English and Spanish.
The proposed shared task will have two tasks to explore hope in social media texts:
The organizers of the competition are:
The team members have working experience in NLP and a background in organizing different shared tasks and workshops. Fazlourrahman Balouchzahi, Sabur Butt, Alexander Gelbukh and Grigori Sidorov have experience in organizing shared tasks (Urduthreat2020, Urdufake2021, Emothreat 2022, Kanglish 2022, and Abusive Language@FIRE 2021). Moreover, Daniel García Baena, Salud María Jiménez Zafra, Miguel Ángel García Cumbreras, José Antonio García Díaz, Rafael Valencia García and L. Alfonso Ureña López have organized different events: i) shared tasks on SemEval, ACL, RANLP, EVALITA and IberLEF; ii) conferences and workshops, such as SEPLN and IberLEF; and iii) Doctoral Symposiums, such as the Doctoral Symposium on Natural Language Processing.
Daniel García Baena is a secondary school computer science teacher and a doctoral student at the Universidad de Jaén. His research is focused on Natural Language Processing and Text Categorization, especially hope speech. He participated in the committees of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at ACL LT-EDI 2022 and of HOPE at IberLEF 2023.
Fazlourrahman Balouchzahi is a Ph.D. student at CIC IPN, Mexico, with a strong background in text processing and machine learning, including deep learning and transformers. He is passionate about exploring and contributing to the field of NLP. His research has focused on social media analysis and psychological emotion analysis, specifically detecting hope and regret in text. Additionally, he has organized around 5 shared tasks and participated in more than 20 shared tasks on low-resource languages, achieving top positions in more than 80% of them.
Salud María Jiménez-Zafra, SINAI, Computer Science Department, Universidad de Jaén, Spain (sjzafra@ujaen.es). Her research is focused on Natural Language Processing and Text Classification, especially on Negation Processing, Sentiment Analysis, Offensive Language, Hope Speech, Social Media Analysis and Language Resource Generation. She has been part of the organizing committee of the three editions of the NEGES workshop, of SemEval-2016 Task 5: Aspect Based Sentiment Analysis, of the 32nd and 39th Annual SEPLN conferences, of TASS at IberLEF 2020, of EmoEvalEs at IberLEF 2021, of the 2020, 2021 and 2022 editions of the Doctoral Symposium on NLP from the PLN.net thematic network, of the 2023 edition of the Doctoral Symposium on NLP from the Proyecto ILENIA, of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2022 - ACL 2022 and at RANLP - LT-EDI 2023, of PoliticES at IberLEF 2022 and at IberLEF 2023, of HOPE at IberLEF 2023, of EVALITA 2023 Task - PoliticIT and of the 2023 edition of the IberLEF workshop. Currently, she is part of the organizing committee of IberLEF 2024, of the Doctoral Symposium on NLP and of the Homophobia/Transphobia detection task at LT-EDI EACL 2024.
Sabur Butt is a Postdoctoral Researcher at the Institute for the Future of Education (IFE) at Tecnológico de Monterrey, Mexico. He specializes in the field of Natural Language Processing (NLP) and has a strong background in computer science.
Selen Bozkurt is an Assistant Professor at Emory University, specializing in natural language processing (NLP), machine learning (ML), and clinical decision support systems. With a background including roles at Stanford University and Flatiron Health, she brings expertise in biomedical data analytics and predictive modeling. Her research focuses on developing clinical decision support tools using advanced data science methods, including NLP and ML algorithms.
Bharathi Raja Chakravarthi is a permanent Lecturer-Above-the-Bar/Assistant Professor at the School of Computer Science at the University of Galway, Ireland. Before this, he was a Postdoctoral Fellow at the Insight SFI Research Centre for Data Analytics, Data Science Institute, University of Galway, Ireland. He completed his PhD at the Data Science Institute, University of Galway, Ireland. His recent research focuses on text classification, multimodal machine learning, sentiment analysis, abusive/offensive language detection, bias in natural language processing tasks, inclusive language detection, positivity in social media platforms, machine translation, and multilingualism. He has published multiple international conference papers (COLING, LREC, MTSUMMIT, DSAA, LDK, GWC, AICS, FIRE, etc.) and papers in highly reputed journals (Computer Speech & Language, Language Resources and Evaluation, Social Network Analysis and Mining, Multimedia Tools and Applications, International Journal of Data Science and Analytics, etc.). He received the Best Application Paper Award at DSAA 2020, an IEEE and ACM-funded conference. Dr. Chakravarthi served as chair and lead organizer for the 1st and 2nd Workshop on Language Technology for Equality, Diversity and Inclusion (https://sites.google.com/view/lt-edi-2022) and the Workshop on Speech and Language Technologies for Dravidian Languages (https://dravidianlangtech.github.io/2022/). He has served on program committees for a number of ACL conferences and workshops. He has also served as guest editor for special issues of the journals Computer Speech & Language, Language Resources and Evaluation, and ACM Transactions on Asian and Low Resource Language Information Processing.
Hector G. Ceballos serves as a full-time faculty member in the Computer Science Graduate Program (DCC) and is affiliated with the Research Group on Advanced Artificial Intelligence. His primary research focuses on Machine Learning, Data Science, Process Mining, and Causality, applied to Research and Learning Analytics. With a publication record exceeding 60 papers in journals and conferences, Hector has also lent his expertise as a consultant to banks and IT companies. Over the years, he has mentored more than 20 master’s and PhD students. Currently, Hector G. Ceballos directs the Living Lab & Data Hub at the Institute for the Future of Education (IFE) at Tecnologico de Monterrey. Recognized for his contributions, Hector is a Level 1 member of the Mexican National System of Researchers (SNI) and an adherent member of the Mexican Academy of Computing (AMEXCOMP). Furthermore, Hector takes on the role of Organizing Committee Chair for the Special Interest Group in Learning Analytics in Latin America (LALA-SIG) within the Society for Learning Analytics Research (SOLAR). He is also a member of the QS EduData Summit Advisory Committee.
Rafael Valencia-García is a Full Professor at the Department of Informatics and Systems at the Universidad de Murcia. His research interests focus on Natural Language Processing, Sentiment Analysis, Semantic Web and Recommender Systems. He was the General Chair of the SEPLN 2017 conference held in Murcia. He has participated in more than 35 research projects and published more than 150 articles in journals, conferences, and book chapters. He has been guest editor of several NLP-related special issues in different JCR-indexed journals such as PMC, CSI, IJSEKE, JRPIT, JUCS and SCP. He was part of the organizing committee of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2022 - ACL 2022, of FinancES 2023 and PoliticES 2022 and 2023 at IberLEF, and of the PoliticIT 2023 task at EVALITA.
Grigori Sidorov has co-authored more than 250 scientific publications and has an h-index of 30. He has extensive experience in the field of Emotion Detection in text. He is also a regular member of the Mexican Academy of Sciences and a National Researcher of Mexico (SNI) at level 3 (the highest). He is the Editor-in-Chief of the research journal "Computación y Sistemas" (Clarivate Web of Science (Scielo, CORE collection (emerging sources)), Scopus, DBLP, index of excellence of Conacyt, etc.). He was also one of the organizers of FIRE 2020 and 2021.
L. Alfonso Ureña-López, Professor of Computer Science, director of the SINAI research group of Universidad de Jaén, Spain (laurena@ujaen.es). He has been the president of SEPLN (Spanish Society for Natural Language Processing). His research is focused on Natural Language Processing, Word Sense Disambiguation, Text Categorization, Sentiment Analysis, Offensive Language and Hope Speech, among other topics. He has organized and chaired numerous congresses in the area of NLP. Likewise, he has set up and served on numerous scientific committees of conferences and workshops in NLP.
Alexander Gelbukh is the founder and chair of the CICLing International Conference series. He has been Honorary Chair of ENC-2008, Program co-chair of some recent MICAI, CIC, CORE, and some other conferences. He is also the founding Editor-in-Chief of the International Journal of Computational Linguistics and Applications (IJCLA) and Editor-in-Chief of the journal POLIBITS.
Hope speech is speech that can relax a hostile environment (Palakodety et al., 2019) and that helps, offers suggestions and inspires people for good in times of illness, stress, loneliness or depression (Chakravarthi, 2020). On social media, offensive messages are posted against people because of their race, color, ethnicity, gender, sexual orientation, nationality, or religion. As Chakravarthi (2020) stated, studies of how vulnerable groups interact with social media have found that it plays an essential role in shaping the individual’s personality and view of society (Burnap et al., 2017; Kitzie, 2018; Milne et al., 2016). Examples of these vulnerable groups are the Lesbian, Gay, Bisexual, and Transgender (LGBT) community, racial minorities and people with disabilities. This task is related to the inclusion of vulnerable groups and focuses on the detection of hope speech, in pursuit of Equality, Diversity and Inclusion. Given a tweet written in Spanish, the goal is to identify whether or not it contains hope speech. The task is divided into two subtasks, and participants take part in both of them at the same time.
There will be a real-time leaderboard, and participants will be allowed a maximum of 10 submissions through CodaLab, from which each team will have to select the best one for ranking. Precision, recall and F1-score will be measured per category and averaged using the macro-average method. Systems will be ranked by macro-F1 score. Because subtasks 1.a and 1.b are strongly correlated, it is not possible to participate in only one of them. Therefore, submissions for task 1 will automatically qualify for both subtask 1.a and 1.b.
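As an illustration of the macro-average method described above, the following minimal sketch computes per-category precision, recall and F1 and then averages each of them with equal weight per category. The function name `macro_scores` and the toy labels are assumptions for illustration only; the official CodaLab scorer may differ in details such as how categories with no predictions are handled.

```python
from collections import Counter


def macro_scores(gold, pred):
    """Return (precision, recall, f1), each macro-averaged over all labels.

    Hypothetical helper, not the official scorer. A label with an empty
    denominator contributes 0.0 to the corresponding average.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with the two labels used in the submission format (hs / nhs).
gold = ["hs", "hs", "nhs", "nhs"]
pred = ["hs", "nhs", "nhs", "nhs"]
precision, recall, f1 = macro_scores(gold, pred)
print(f"macro P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Because every category has the same weight in the average, a system that ignores the minority category is penalised even if its overall accuracy is high.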
Given training data on LGTBI tweets, the participants will have to classify each of the LGTBI-related tweets of the test set into one of the following categories:
Given training data on LGTBI tweets, the participants will have to classify each of the tweets of the test set from unknown domains into one of the following categories:
Each team can participate with up to 10 submissions (except for development, where 100 submissions are allowed). The expected format for submissions is a .zip file (no folders within) containing the predictions in a comma-separated .csv file named predictions.csv.
id,category
0,hs
1,nhs
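The submission format above can be produced with the Python standard library alone. The sketch below is one possible way to do it, assuming the header row `id,category` is expected as in the example; the function name `build_submission` and the default archive name `submission.zip` are illustrative choices, not requirements stated by the organizers.

```python
import csv
import io
import zipfile


def build_submission(predictions, zip_path="submission.zip"):
    """Write (id, category) pairs to predictions.csv inside a flat .zip file.

    Hypothetical helper: the archive name is arbitrary, but the inner file
    is stored at the top level as predictions.csv (no folders within).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "category"])  # header, as in the example above
    writer.writerows(predictions)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # writestr with a bare file name keeps the file at the archive root
        archive.writestr("predictions.csv", buffer.getvalue())
    return zip_path


if __name__ == "__main__":
    build_submission([(0, "hs"), (1, "nhs")])
```

Writing the CSV into the archive with `writestr` avoids zipping a directory by accident, which is a common way of ending up with a folder inside the .zip file.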
Hope is characterized as "openness of spirit toward the future, a desire, expectation, and wish for something to happen or to be true" that remarkably affects a human’s state of mind, emotions, behaviors, and decisions (Bruininks and Malle, 2005; Balouchzahi et al., 2023). Nowadays, social media platforms significantly affect human life, and people freely express their thoughts on these platforms (Balouchzahi et al., 2022). Therefore, analyzing hope in social media is considered an essential determinant of well-being that can provide potentially valuable insights into the trajectory of goal-directed behaviors, persistence in the face of misfortunes, and the processes underlying adjustment to positive and negative life changes (Balouchzahi et al., 2023). This task focuses on expectations, and desirable and undesirable facts. Specifically, it is divided into two subtasks.
The participants will be allowed three submission runs through CodaLab, which will be evaluated on precision, recall, and F1 scores. To evaluate multi-class hope detection, we will use accuracy, macro-averaged F1 and weighted F1 scores, as in the original paper (Balouchzahi et al., 2023).
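To make the difference between the macro-averaged and weighted F1 scores concrete, here is a minimal sketch that computes both together with accuracy. It assumes the weighted score weights each category's F1 by its number of gold instances (its support), which is the usual definition; the function names and the placeholder labels `A`, `B` and `C` are hypothetical and are not the task's actual categories.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return a dict mapping each label to its F1 score."""
    scores = {}
    for label in set(gold) | set(pred):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(gold, pred):
    """Return accuracy, macro F1 and support-weighted F1 (illustrative only)."""
    f1 = per_class_f1(gold, pred)
    support = Counter(gold)  # number of gold instances per label
    total = len(gold)
    return {
        "accuracy": sum(g == p for g, p in zip(gold, pred)) / total,
        "macro_f1": sum(f1.values()) / len(f1),
        "weighted_f1": sum(f1[label] * support[label] for label in f1) / total,
    }


# Placeholder labels, not the real multi-class categories.
gold = ["A", "A", "A", "B", "C"]
pred = ["A", "A", "B", "B", "C"]
print(summarize(gold, pred))
```

On imbalanced data the two averages diverge: macro F1 treats every category equally, while weighted F1 is dominated by the most frequent categories.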
Given training data, the participant will classify each text into one of two categories. In this problem, each text will be assigned one of the following labels:
Given training data, the participant will classify each text into one of the following categories:
Each team can participate with up to 10 submissions. The expected format for submissions is a .zip file (no folders within) containing the predictions in a comma-separated .csv file named predictions.csv.
id,category
0,Hope
1,Not Hope
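Before uploading, it can help to check a predictions file against the format shown above. The sketch below is a hypothetical local check, not part of the official tooling: it assumes the first row must be the header `id,category`, that ids are unique, and that the caller supplies the set of allowed labels (for example `Hope` and `Not Hope` for the binary subtask).

```python
import csv
import io


def validate_predictions(csv_text, allowed_labels):
    """Return a list of problems found in the CSV text; empty means it looks fine.

    Hypothetical helper: the checks reflect the example format only.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows or rows[0] != ["id", "category"]:
        return ["first row must be the header id,category"]

    problems = []
    seen_ids = set()
    # Line numbers start at 2 because line 1 is the header.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != 2:
            problems.append(f"line {number}: expected 2 columns, found {len(row)}")
            continue
        row_id, category = row
        if row_id in seen_ids:
            problems.append(f"line {number}: duplicate id {row_id}")
        seen_ids.add(row_id)
        if category not in allowed_labels:
            problems.append(f"line {number}: unknown category {category!r}")
    return problems


binary_labels = {"Hope", "Not Hope"}
print(validate_predictions("id,category\n0,Hope\n1,Not Hope\n", binary_labels))
```

Labels are compared exactly, so differences in capitalisation or stray spaces are reported as unknown categories.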
By downloading the data or by accessing it in any manner, you agree not to redistribute the data except for non-commercial and academic-research purposes. The data must not be used for providing surveillance, analyses or research that isolates a group of individuals or any single individual for any unlawful or discriminatory purpose.
You should cite this paper if you are using our data:
Start: Feb. 16, 2024, midnight
Start: April 1, 2024, midnight
# | Username | Score |
---|---|---|
1 | ChauPhamQuocHung | 0.73 |
2 | olp | 0.72 |
3 | hongson04 | 0.71 |