A Task By CNLP-NITS
Automatic summarization is one of the most difficult tasks in Natural Language Processing, as it requires a comprehensive understanding of input documents, identification of relevant content, and generation of a condensed rendition of the document, often subject to a length constraint. Yet this task is very important given the deluge of information of varying quality our generation has to deal with. Building better summarization systems requires progress in summarization evaluation metrics, which are used to assess the quality of the summaries those systems produce.

There are two current approaches to summarization evaluation: manual and automatic. Manual evaluation consists of ranking summaries or parts of summaries according to a set of factors such as faithfulness to the original and linguistic fluency. Automatic evaluation focuses on comparing the system output to a set of human-authored summaries deemed a gold standard. Manual evaluation is more accurate but much more costly than automatic evaluation, and it is often not actionable in a machine learning setting, where systems require rapid and repeated evaluation of their output in order to learn how to summarize. Current methods for automatic evaluation fall short because they rely on an overly shallow representation of meaning (word n-grams in the case of ROUGE, for example), a problem which has been identified as a major hurdle for the advancement of the field.

The aim of this shared task is therefore to design an automatic evaluation system for system-generated summaries: the evaluation system will compare the test summaries with gold summaries and estimate their relevance in terms of BLEU, ROUGE, and semantic scores. Based on a rigorous review, the working notes of the shared task will be published in ICICSA-2022.
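For concreteness, the following is a minimal sketch of scoring a single predicted summary against its gold summary with BLEU and ROUGE. It assumes the `nltk` and `rouge-score` Python packages; any equivalent implementation of these metrics would serve.

```python
# Minimal sketch: BLEU and ROUGE between one predicted and one gold summary.
# The nltk and rouge-score packages are assumptions, not task requirements.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

gold = "the cat sat on the mat near the door"
predicted = "a cat was sitting on the mat by the door"

# BLEU measures word n-gram overlap; smoothing avoids zero scores when
# higher-order n-grams have no matches in short texts.
bleu = sentence_bleu(
    [gold.split()], predicted.split(),
    smoothing_function=SmoothingFunction().method1,
)

# ROUGE-1 (unigram overlap) and ROUGE-L (longest common subsequence).
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score(gold, predicted)

print(f"BLEU:       {bleu:.3f}")
print(f"ROUGE-1 F1: {rouge['rouge1'].fmeasure:.3f}")
print(f"ROUGE-L F1: {rouge['rougeL'].fmeasure:.3f}")
```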
Test Summaries and Gold Summaries.
An automatic evaluation system needs to be developed that compares the test summaries with the gold summaries and computes semantic scores, normalized between 0 and 100.
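As an illustration, here is one possible way to compute such a score, assuming the `sentence-transformers` library; the model name and the normalization (clipping negative cosine similarity to 0, then scaling by 100) are illustrative assumptions, not requirements of the task.

```python
# Sketch of one possible semantic score in [0, 100], assuming the
# sentence-transformers library. The model choice and the normalization
# scheme are illustrative, not prescribed by the task.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_score(gold: str, predicted: str) -> float:
    """Cosine similarity of sentence embeddings, mapped to 0..100."""
    emb = model.encode([gold, predicted], convert_to_tensor=True)
    cos = util.cos_sim(emb[0], emb[1]).item()
    return max(cos, 0.0) * 100.0  # clip negatives, scale to 0..100

print(semantic_score("the cat sat on the mat", "a cat was on the mat"))
```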
Participants need to submit their results in CSV format.
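A minimal sketch of writing such a submission file follows; the column names (`id`, `semantic_score`) are hypothetical, and participants should follow whatever layout the organizers specify.

```python
# Sketch of a CSV submission file. The column names are hypothetical;
# use the layout specified by the organizers.
import csv

scores = [("1", 87.4), ("2", 62.1), ("3", 91.0)]  # (summary id, score)

with open("submission.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "semantic_score"])
    for summary_id, score in scores:
        writer.writerow([summary_id, f"{score:.2f}"])
```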
The submitted scores will be verified by the organizers in terms of semantic and human evaluation scores.
Dataset | Gold Summaries | Predicted Summaries |
---|---|---|
CNN/Daily Mail | 2000 | 2000 |
MSMO | 2000 | 2000 |
XSUM | 2000 | 2000 |
Event | Date |
---|---|
Task Registration | |
Data Released | |
Registration Closed | |
Result Submission | |
Result Declaration | 15 June, 2022 |
Working Note Submission | 25 June, 2022 |
Vetagiri Advaitha
Shyambabu Pandey
Prachurya Nath
Nihar Jyoti Basisth
Md. Arshad Ahmed
Shayak Chakraborty
Dr. Sudip Kumar Naskar
Jadavpur University, India.
Dr. Somnath Mukhopadhyay
Assam University, India.
Dr. Alexander Gelbukh
Instituto Politécnico Nacional, Mexico.
Dr. Ajoy Kumar Khan
Mizoram University, India.
Dr. Amitava Das
Wipro AI Lab, India.
Dr. Rabiah Abdul Kadir
Universiti Kebangsaan Malaysia, Selangor.
Dr. Maya Silvi Lydia
University of Sumatera Utara, Medan.
Dr. Dipankar Das
Jadavpur University, India.