Multi-granularity entity recognition based sentence ranking for multi-document summarization

Guowei Zhang*, Xuyun Zhang, Zhiyong Wang, Amin Beheshti

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

Text summarization aims to condense text documents into a concise textual summary, which helps people comprehend information more efficiently. While deep learning-based summarization methods for individual documents have achieved good performance, there is an increasing demand for summarizing multiple related documents on a topic or event, which can yield a more coherent and succinct summary of the document set. However, multiple documents carry more information, contain longer texts, and vary in style, and these characteristics pose new challenges for existing methods in handling the multiple aspects of a topic or event. Therefore, in this paper, we propose a novel multi-granularity model with entity recognition that improves sentence ranking and captures the key information of different documents in a comprehensive and accurate summary. Specifically, we use PRIMERA as a token encoder within an encoder-decoder framework. Then, a named entity recognition model is trained to identify key elements in documents, such as people, locations, and organizations, so that the proposed model focuses more on these key elements. Based on the named entity recognition results, we further devise a sentence ranking module that allows the model to assign different weights to different sentences based on the sum of the frequencies of the entities contained in each sentence. Finally, the decoder generates a comprehensive and accurate summary from the multi-granularity encoding vector. To evaluate the performance of our proposed model, we conducted experiments on CoNLL2003, DUC2003, and DUC2004, which demonstrated the performance improvement of our proposed method over four previous models.
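As a rough illustration of the sentence ranking idea described in the abstract, the following minimal sketch weights each sentence by the summed frequency of the entities it contains. It is not the authors' implementation: the function name `rank_sentences`, the input format (one list of entity strings per sentence, as an NER step might produce), the choice to count each distinct entity once per sentence, and the normalization to weights that sum to 1 are all assumptions made for this example.

```python
from collections import Counter


def rank_sentences(sentence_entities):
    """Weight sentences by the summed frequency of their entities.

    sentence_entities: list of lists; entry i holds the entity strings
    found in sentence i (hypothetical input format for this sketch).
    Returns one weight per sentence; weights sum to 1 when any entity exists.
    """
    # Count how often each entity is mentioned across the whole document set.
    freq = Counter(e for ents in sentence_entities for e in ents)

    # Raw score: sum of frequencies of the distinct entities in the sentence.
    scores = [sum(freq[e] for e in set(ents)) for ents in sentence_entities]

    total = sum(scores)
    if total == 0:
        # No entities anywhere: fall back to uniform weights (assumption).
        n = len(scores)
        return [1.0 / n] * n if n else []

    # Normalize so the weights are comparable across document sets.
    return [s / total for s in scores]


if __name__ == "__main__":
    entities = [["Paris", "UN"], ["Paris"], []]
    for i, w in enumerate(rank_sentences(entities)):
        print(f"sentence {i}: weight {w:.2f}")
```

In this toy input, "Paris" occurs twice and "UN" once, so the first sentence scores 3, the second scores 2, and the third scores 0 before normalization. How the paper combines such weights with the PRIMERA token encodings is not specified in the abstract and is not modeled here.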

Original language: English
Title of host publication: 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA)
Subtitle of host publication: proceedings
Editors: Yannis Manolopoulos, Zhi-Hua Zhou
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 547-556
Number of pages: 10
ISBN (Electronic): 9798350345032
ISBN (Print): 9798350345049
DOIs
Publication status: Published - 2023
Event: 10th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2023 - Thessaloniki, Greece
Duration: 9 Oct 2023 → 12 Oct 2023

Conference

Conference: 10th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2023
Country/Territory: Greece
City: Thessaloniki
Period: 9/10/23 → 12/10/23
