Generating summaries of scientific documents is known to be a challenging task. The majority of existing work in summarization assumes a single best gold summary for each document. Having only one gold summary limits our ability to evaluate the quality of summarization systems, since writing summaries is a subjective activity. At the same time, annotating multiple gold summaries for scientific documents is extremely expensive, as it requires domain experts to read and understand long scientific documents. This shared task enables exploring methods for generating multi-perspective summaries.

We introduce a novel summarization corpus that leverages data from scientific peer reviews to capture diverse perspectives from the reader's point of view. Peer reviews in many scientific fields include an introductory paragraph that summarizes the key contributions of a paper from the reviewer's standpoint, and each paper usually receives multiple reviews. We leverage data from OpenReview, an open and publicly available platform for scientific publishing, and collect a corpus of papers and their reviews from OpenReview venues such as ICLR, NeurIPS, and AKBC, primarily in the AI, Machine Learning, and Natural Language Processing fields. We use carefully designed heuristics to include only first paragraphs of reviews that are summary-like, and we manually check the summaries obtained from this approach on a subset of the data to ensure their high quality.

The corpus contains a total of 8.5K papers and 19K summaries (an average of 2.57 summaries per paper). The summaries are on average 100.1 words long (space-tokenized). Please refer to our GitHub page for further instructions: https://github.com/allenai/mup
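To give a flavor of the kind of filtering described above, here is a minimal, hypothetical heuristic for deciding whether a review's first paragraph is summary-like. The cue phrases and length threshold below are illustrative assumptions, not the task's actual rules, which are documented in the repository.

```python
import re

# Hypothetical cue phrases; the shared task's real heuristics may differ.
SUMMARY_CUES = re.compile(
    r"^(this (paper|work|submission)|the (paper|authors?))\b", re.IGNORECASE
)

def looks_like_summary(first_paragraph: str, min_words: int = 20) -> bool:
    """Keep a review's first paragraph only if it opens with a
    summary-like cue phrase and is long enough to be informative."""
    words = first_paragraph.split()
    return len(words) >= min_words and bool(SUMMARY_CUES.match(first_paragraph.strip()))
```

A filter like this would be run over the first paragraph of every review, with the surviving paragraphs then manually spot-checked, as described above.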
Intrinsic evaluation uses the ROUGE-1, ROUGE-2, and ROUGE-L metrics. For each paper, the ROUGE F1 scores are computed against each of the multiple gold summaries, and their average is used for the final ranking. Please refer to our GitHub page for further instructions: https://github.com/allenai/mup
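The multi-reference averaging scheme can be sketched as follows. This is a simplified ROUGE-N F1 for illustration only; the official evaluation uses the full ROUGE toolkit with its standard preprocessing (stemming, etc.).

```python
from collections import Counter

def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: n-gram overlap between candidate and reference."""
    def ngrams(text: str) -> Counter:
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    prec = overlap / sum(cand.values())
    rec = overlap / sum(ref.values())
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

def multi_reference_score(candidate: str, references: list[str], n: int = 1) -> float:
    """Average the per-reference F1 scores, mirroring the ranking scheme
    of averaging over a paper's multiple gold summaries."""
    return sum(rouge_n_f1(candidate, ref, n) for ref in references) / len(references)
```

A system summary is thus scored once per gold summary, and the per-paper average is what counts toward the leaderboard.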
This page enumerates the terms and conditions of the competition.
Start: Jan. 1, 2022, midnight
Description: Test phase: submit results on the test data by uploading a single zip file that contains your testing.csv file
End: Never
| # | Username | Score |
|---|---|---|
| 1 | guir | 41.36 |
| 2 | guneetAI | 41.36 |
| 3 | armanc | 40.80 |