To facilitate the use of Twitter data for monitoring personal experiences of COVID-19 in real time and on a large scale, this binary classification task involves automatically distinguishing tweets that self-report a COVID-19 diagnosis (annotated as "1"), for example, a positive test, clinical diagnosis, or hospitalization, from those that do not (annotated as "0"). By this definition, a tweet that merely states that the user has experienced COVID-19 would not be considered a self-report of a diagnosis. A benchmark classifier, based on a COVID-Twitter-BERT pretrained model, achieved an F1-score of 0.94 for the "positive" class (i.e., tweets that self-report a COVID-19 diagnosis).
To participate in #SMM4H 2023 Task 1, please register your team here with the same e-mail address as your CodaLab account. When your registration is approved, you will be invited to a Google group, where the training, validation, and test data will be made available. Please check the #SMM4H 2023 website for important dates.
Systems will be evaluated based on the F1-score for the class of tweets that self-report a COVID-19 diagnosis (annotated as "1"), where F1-score = 2 * (Precision * Recall) / (Precision + Recall), Precision = True Positives / (True Positives + False Positives), and Recall = True Positives / (True Positives + False Negatives).
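As a rough illustration of the metric defined above, the following Python sketch computes precision, recall, and F1-score for the class annotated as "1". It is not the organizers' official scoring script. The function name `positive_class_f1`, the toy label lists, and the choice to return 0.0 when a denominator is zero are all assumptions made for this example.

```python
def positive_class_f1(gold, predicted, positive_label=1):
    """Return (precision, recall, f1) for the positive class.

    Hypothetical helper for illustration; zero denominators yield 0.0,
    which is an assumption rather than a documented rule of the task.
    """
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    false_pos = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    false_neg = sum(1 for g, p in pairs if g == positive_label and p != positive_label)

    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


# Toy labels (made up): 2 true positives, 1 false positive, 1 false negative.
gold_labels = [1, 0, 1, 1, 0]
predicted_labels = [1, 1, 0, 1, 0]
precision, recall, f1 = positive_class_f1(gold_labels, predicted_labels)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.667 recall=0.667 f1=0.667
```

Because only the "1" class is scored, tweets correctly labeled "0" do not raise the score; they matter only through the false positives they avoid.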
By submitting results to this competition, you consent to the public release of your scores at the SMM4H'23 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers. You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science. You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers. You further agree to submit and present a short paper describing your system during the workshop. You agree not to redistribute the training and test data except in the manner prescribed by its licence.
System predictions should be submitted as a ZIP file containing a TSV file with only two columns: the tweet_id column first and the label column second. The TSV file should not be in a folder in the ZIP file, and the ZIP file should not contain any files or folders other than the TSV file. The TSV file should be named prediction_task1.tsv.
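One way to produce an archive in the layout described above is sketched below in Python, using only the standard library. The function name `write_submission`, the archive name `submission.zip`, and the `include_header` flag are assumptions for illustration. The description does not say whether a header row is expected, so the flag defaults to off and should be set according to the organizers' instructions.

```python
import csv
import io
import zipfile


def write_submission(predictions, zip_path="submission.zip", include_header=False):
    """Write (tweet_id, label) pairs to prediction_task1.tsv inside a ZIP file.

    Hypothetical helper: `zip_path` and `include_header` are assumptions,
    not part of the task description.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    if include_header:
        writer.writerow(["tweet_id", "label"])
    for tweet_id, label in predictions:
        # tweet_id column first, label column second
        writer.writerow([tweet_id, label])

    # Store the TSV at the top level of the archive: no folder, no other files.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("prediction_task1.tsv", buffer.getvalue())
    return zip_path
```

Calling `write_submission([("1234567890", 1), ("1234567891", 0)])` with made-up tweet IDs would create `submission.zip` in the current directory containing a single file, `prediction_task1.tsv`. Writing the file with `writestr` under a bare file name avoids accidentally nesting it inside a folder, which can happen when an existing directory is zipped.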
Ari Klein (ariklein@pennmedicine.upenn.edu)
Start: April 25, 2023, midnight
Start: July 10, 2023, midnight
Start: July 15, 2023, midnight
| # | Username | Score |
|---|---|---|
| 1 | sumam | 0.92 |