Automatic summarization is one of the most difficult tasks in Natural Language Processing, as it requires a comprehensive understanding of input documents, identification of relevant content, and generation of a synthetic perspective of the document, often subject to a length constraint. Yet this task is very important in the context of the deluge of information of varying quality that our generation has to tackle. Building better summarization systems requires progress in summarization evaluation metrics, which are used to assess the quality of the summaries those systems produce. There are two current trends in summarization evaluation: manual and automatic. Manual evaluation consists of ranking summaries, or parts of summaries, according to a set of factors such as faithfulness to the original and linguistic fluency. Automatic evaluation focuses on comparing the system output to a set of human-authored summaries deemed a gold standard. Manual evaluation is more accurate but much more costly than automatic evaluation, and it is often not actionable in a machine learning environment, where systems require rapid and repeated evaluation of their output in order to learn how to summarize. Current methods for automatic evaluation fall short because they rely on an overly simplistic representation of meaning (for example, word n-grams in ROUGE), a problem which has been identified as a major hurdle for the advancement of the field. The aim of this shared task is therefore to design an automatic evaluation system for system-generated summaries: the evaluation system will compare the test summaries with gold summaries and estimate their relevance in terms of BLEU, ROUGE, and a semantic score. After a rigorous review, the working notes of the shared task will be published in ICICSA-2022.
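To make concrete what the n-gram metrics mentioned above actually compute, here is a minimal toy implementation of ROUGE-1 (unigram overlap, reported as F1). It is an illustrative sketch only, not the official ROUGE package, and omits stemming, stopword handling, and multi-reference support:

```python
from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # each unigram counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge_1_f1("the cat sat on the mat", "the cat lay on the mat"), 3))
```

Because such metrics only count surface n-grams, a paraphrase that preserves meaning but shares few words with the gold summary scores poorly, which is exactly the limitation this task targets.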

Key Steps



Participants will be provided with test summaries and the corresponding gold summaries.

System Development

An automatic evaluation system needs to be developed that compares the test summaries with the gold summaries and produces semantic scores normalized to the range 0 to 100.
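The task does not prescribe how the semantic score is computed, so the choice of model is left to participants. As a minimal baseline sketch, a cosine similarity over bag-of-words vectors scaled to the required 0 to 100 range could look like this; a competitive system would replace the word-count vectors with sentence embeddings or another semantic representation:

```python
import math
from collections import Counter

def semantic_score(test_summary: str, gold_summary: str) -> float:
    """Cosine similarity of bag-of-words vectors, scaled to 0-100.

    A placeholder for whatever semantic model a participant chooses;
    identical texts score 100, texts with no shared words score 0.
    """
    a = Counter(test_summary.lower().split())
    b = Counter(gold_summary.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 100.0 * dot / norm if norm else 0.0
```

Scaling cosine similarity by 100 is one straightforward way to satisfy the 0 to 100 normalization requirement, since cosine similarity over non-negative count vectors already lies in [0, 1].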



Participants need to submit their results in CSV format.
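One possible way to write the results file with Python's standard `csv` module is sketched below. The column names (`summary_id`, `semantic_score`) are assumptions for illustration; the official submission guidelines should take precedence:

```python
import csv

# Hypothetical layout: one row per test summary with its semantic score (0-100).
rows = [("doc_001", 87.4), ("doc_002", 63.1)]

with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["summary_id", "semantic_score"])
    writer.writerows(rows)
```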

System Evaluation

The submitted scores will be verified by the organizers in terms of semantic and human evaluation scores.




Important Dates

Event	Date
Task Registration 14 March, 2022
Data Released 01 April, 2022
Registration Closed 30 April, 2022
Result Submission 30 May, 2022
Result Declaration 15 June, 2022
Working note submission 25 June, 2022




Dr. Partha Pakray


Prof. Sivaji Bandyopadhyay


Dr. Benoit Favre


Prof. Thierry Artières

Task Coordinators


Pankaj Dadure


Sahinur R. Laskar




Prottay Adhikary

Student Volunteers

Vetagiri Advaitha

Shyambabu Pandey

Prachurya Nath

Nihar Jyoti Basisth

Md. Arshad Ahmed

Shayak Chakraborty

Technical Program Committee

Dr. Sudip Kumar Naskar

Jadavpur University, India.

Dr. Somnath Mukhopadhyay

Assam University, India.

Dr. Alexander Gelbukh

Instituto Politécnico Nacional, Mexico.

Dr. Ajoy Kumar Khan

Mizoram University, India.

Dr. Amitava Das

Wipro AI Lab, India.

Dr. Rabiah Abdul Kadir

Universiti Kebangsaan Malaysia, Selangor.

Dr. Maya Silvi Lydia

University of Sumatera Utara, Medan.

Dr. Dipankar Das

Jadavpur University, India.