TextSumEval'2022

A Task By CNLP-NITS



Abstract

Automatic summarization is one of the most difficult tasks in Natural Language Processing, as it requires a comprehensive understanding of input documents, identification of relevant content, and generation of a synthetic perspective of the document, often subject to a length constraint. Yet this task is very important in the context of the deluge of information of varying quality that our generation has to tackle. Building better summarization systems requires progress in summarization evaluation metrics, which are used to assess the quality of the summaries they produce. There are two current trends in summarization evaluation: manual and automatic evaluation. Manual evaluation consists of ranking summaries or parts of summaries according to a set of factors such as faithfulness to the original and linguistic fluency. Automatic evaluation focuses on comparing the system output to a set of human-authored summaries deemed a gold standard. Manual evaluation is more accurate but much more costly than automatic evaluation, and it is often not actionable in a machine learning environment, since systems require rapid and repeated evaluation of their output in order to learn how to summarize. Current methods for automatic evaluation fall short because they rely on too shallow a representation of meaning (word n-grams in the case of ROUGE, for example), a problem which has been identified as a major hurdle for the advancement of the field. The aim of this shared task is therefore to design an automatic evaluation system for system-generated summaries: the evaluation system will compare the test summaries with gold summaries and estimate their relevance in terms of BLEU, ROUGE, and semantic scores. Based on a rigorous review, the working notes of the shared task will be published in ICICSA-2022.
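
As a rough illustration of the surface-overlap metrics mentioned above, the sketch below scores a single test summary against a single gold summary with ROUGE and BLEU. The rouge-score and nltk packages and the example texts are our own assumptions and are not prescribed by the task.

# Sketch: surface-overlap scoring of one test summary against one gold summary.
# Assumes the rouge-score and nltk packages are installed (illustrative choice only).
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

gold = "The committee approved the new budget after a short debate."
test = "The new budget was approved by the committee."

# ROUGE-1/2/L F-scores: word n-gram and longest-common-subsequence overlap.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
for name, score in scorer.score(gold, test).items():  # arguments: (target, prediction)
    print(f"{name}: F1 = {score.fmeasure:.3f}")

# Sentence-level BLEU with smoothing (short texts would otherwise score 0).
bleu = sentence_bleu([gold.split()], test.split(),
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {bleu:.3f}")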

Key Steps


Input

Test Summaries and Gold Summaries.

System Development

An automatic evaluation system needs to be developed that compares the test summaries with the gold summaries and computes a semantic score normalized between 0 and 100.
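
One possible realization of such a system, offered only as a sketch, embeds the two summaries and rescales their cosine similarity to the 0-100 range. The sentence-transformers library and the all-MiniLM-L6-v2 model are assumptions of this example, not requirements of the task.

# Sketch: a semantic score in [0, 100] from sentence-embedding cosine similarity.
# Assumes the sentence-transformers package; the model name is an illustrative choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_score(test_summary: str, gold_summary: str) -> float:
    """Cosine similarity of the two summaries' embeddings, rescaled to 0-100."""
    emb = model.encode([test_summary, gold_summary], convert_to_tensor=True)
    cos = util.cos_sim(emb[0], emb[1]).item()                # value in [-1, 1]
    return max(0.0, min(100.0, (cos + 1.0) / 2.0 * 100.0))   # rescale and clip

print(semantic_score("The committee approved the budget.",
                     "The new budget was approved after a short debate."))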


Output

Participants need to submit the results in CSV format.
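
A minimal sketch of a submission file is given below; the column names summary_id and semantic_score are hypothetical and should be replaced by the exact format announced by the organizers.

# Sketch: writing per-summary scores to a CSV submission file.
# Column names are hypothetical; follow the format specified by the organizers.
import csv

results = [("doc_001", 78.4), ("doc_002", 63.1), ("doc_003", 91.0)]

with open("submission.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["summary_id", "semantic_score"])
    writer.writerows(results)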

System Evaluation

The submitted scores will be verified by the organizers in terms of semantic and human evaluation scores.
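
The verification procedure is not spelled out here; one plausible check, sketched below under that assumption, is to correlate the submitted scores with human judgements.

# Sketch: correlating submitted scores with hypothetical human ratings.
# The verification procedure is an assumption; scores and ratings below are made up.
from scipy.stats import pearsonr, spearmanr

submitted = [78.4, 63.1, 91.0, 55.2]   # system-estimated semantic scores
human     = [80.0, 60.0, 88.0, 50.0]   # hypothetical human evaluation scores

print("Pearson:  %.3f" % pearsonr(submitted, human)[0])
print("Spearman: %.3f" % spearmanr(submitted, human)[0])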


References

  • [1] El-Kassas, Wafaa S., Cherif R. Salama, Ahmed A. Rafea, and Hoda K. Mohamed. "Automatic text summarization: A comprehensive survey." Expert Systems with Applications 165 (2021): 113679.
  • [2] Kryściński, Wojciech, Bryan McCann, Caiming Xiong, and Richard Socher. "Evaluating the factual consistency of abstractive text summarization." arXiv preprint arXiv:1910.12840 (2019).
  • [3] Gliwa, Bogdan, Iwona Mochol, Maciej Biesek, and Aleksander Wawer. "SAMSum corpus: A human-annotated dialogue dataset for abstractive summarization." arXiv preprint arXiv:1911.12237 (2019).
  • [4] Agirre, Eneko, Aitor Gonzalez Agirre, Inigo Lopez-Gazpio, Montserrat Maritxalar, German Rigau Claramunt, and Larraitz Uria. "SemEval-2016 task 2: Interpretable semantic textual similarity." In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 512-524. San Diego, CA: ACL, 2016.
  • [5] Majumder, Goutam, Partha Pakray, Ranjita Das, and David Eduardo Pinto Avendaño. "Interpretable Semantic Textual Similarity of sentences using alignment of chunks with Classification and Regression." Applied Intelligence (2021). Springer, published online: 08/03/2021.
  • [6] Graham, Yvette. "Re-evaluating automatic summarization with BLEU and 192 shades of ROUGE." In Proceedings of the 2015 conference on empirical methods in natural language processing, pp. 128-137. 2015.
  • [7] https://github.com/abisee/cnn-dailymail
  • [8] https://aclanthology.org/D18-1448/
  • [9] https://github.com/EdinburghNLP/XSum/tree/master/XSum-Dataset




Important Dates

Event                        Date
Task Registration            14 March, 2022
Data Released                01 April, 2022
Registration Closed          30 April, 2022
Result Submission            30 May, 2022
Result Declaration           15 June, 2022
Working Note Submission      25 June, 2022


Announcement

Organizers

Dr. Partha Pakray

Prof. Sivaji Bandyopadhyay

Dr. Benoit Favre

Pr. Thierry Artières

Task Coordinators

Pankaj Dadure

Sahinur R. Laskar

Tawmo

Prottay Adhikary

Student Volunteers

Vetagiri Advaitha

Shyambabu Pandey


Prachurya Nath

Nihar Jyoti Basisth


Md. Arshad Ahmed

Shayak Chakraborty



Technical Program Committee

Dr. Sudip Kumar Naskar

Jadavpur University, India.

Dr. Somnath Mukhopadhyay

Assam University, India.

Dr. Alexander Gelbukh

Instituto Politécnico Nacional, Mexico.

Dr. Ajoy Kumar Khan

Mizoram University, India.

Dr. Amitava Das

Wipro AI Lab, India.

Dr. Rabiah Abdul Kadir

Universiti Kebangsaan Malaysia, Selangor.

Dr. Maya Silvi Lydia

University of Sumatera Utara, Medan.

Dr. Dipankar Das

Jadavpur University, India.