This competition page is dedicated to subtask I. In order to participate in subtask II, follow this link.
In subtask I, participants are asked to develop classifiers that take (a subset of) the available metadata of articles as input and output one of 123 predefined hierarchical classes from the ORKG taxonomy of research fields. Classifiers will be trained and tested using a dataset of 59.3K English scientific papers constructed by fetching metadata from ORKG (CC BY-SA 4.0) and arXiv (CC0 1.0).
The following metadata fields are available:
Systems will be evaluated using accuracy as well as weighted scores of recall, precision, and F1.
The phases of the competition are as follows:
1. Development phase: during this phase, participants will develop classification models using the provided train and validation sets. Results of this phase will not be used for final standings.
2. Testing phase: The test set will be released on January 10, 2024, and this phase runs until February 29, 2024 (extended from February 22, 2024). Participants are expected to run their developed models on the test set and upload their results. A leaderboard will be formed from these results and will decide the final standings of the challenge.
Participants are encouraged to submit a short paper describing their systems (up to 8 pages in length, excluding references) to the NSLP 2024 workshop, which will be co-located with ESWC 2024 in Crete. Please refer to the guidelines here.
The training and validation datasets can be accessed here: https://zenodo.org/records/10438530
The testing dataset can be accessed here: https://zenodo.org/records/10469550
Note: When uploading your predictions, make sure that they are in the "predictions.csv" format. For more details, please go to the Evaluation tab. Sample code for preparing predictions on the validation data can be accessed here: https://drive.google.com/file/d/1bqLz0Nt33pVOMV8LTGeFipwmupjy5cgb/view
Important: The total number of system submissions allowed is 10.
If you have any questions, feel free to contact us or write your question in the dedicated Forums tab.
Systems will be evaluated using accuracy as well as weighted scores of precision, recall, and F1.
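As a rough illustration of these metrics, the sketch below computes accuracy together with support-weighted precision, recall, and F1 in plain Python. It assumes that "weighted" means each class's score is weighted by its number of gold examples (the same convention as scikit-learn's `average="weighted"`); the official scoring script may differ in details. The function name `weighted_scores` and the example labels are placeholders, not part of the competition.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: each class is weighted by its share of gold labels.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)    # gold examples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Placeholder labels for illustration only.
    gold = ["Physics", "Physics", "Biology", "Chemistry"]
    pred = ["Physics", "Biology", "Biology", "Chemistry"]
    print(weighted_scores(gold, pred))
```

Under this weighting, weighted recall always equals accuracy, so the two numbers on the leaderboard should coincide if the same convention is used.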
When uploading your predictions, make sure they are in the correct format.
Upload a "predictions.csv" file inside a "predictions.zip" archive. Save the final predictions using the text of the original labels from the ORKG taxonomy (not numerical categories), in a column named "target".
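The packaging step described above could be sketched as follows, using only the Python standard library. The helper name `write_submission` is hypothetical, and the sketch assumes one prediction per row, in the same order as the test set, with "target" as the only column; check the Evaluation tab and the organizers' sample code for any additional requirements.

```python
import csv
import io
import zipfile


def write_submission(labels, zip_path="predictions.zip"):
    """Write label strings to predictions.csv inside a zip archive.

    Assumption: the CSV has a single "target" column holding the
    original ORKG label text, one prediction per row.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["target"])  # required column name
    for label in labels:
        writer.writerow([label])  # label text, not a numeric category

    # Store the CSV at the top level of the archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("predictions.csv", buffer.getvalue())
    return zip_path


if __name__ == "__main__":
    # Placeholder labels for illustration only.
    write_submission(["Physics", "Biology"])
```

Writing the CSV into the archive directly avoids leaving a stray intermediate file and keeps "predictions.csv" at the top level of "predictions.zip".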
Development phase start: Jan. 2, 2024, midnight
Testing phase start: Jan. 10, 2024, midnight
Competition ends: March 1, 2024, midnight