Transformer-based models for long document summarisation in financial domain

Urvashi Khanna, Samira Ghodratnama, Diego Mollá, Amin Beheshti

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

10 Citations (Scopus)
77 Downloads (Pure)

Abstract

Summarisation of long financial documents is a challenging task due to the lack of large-scale datasets and the need for domain knowledge experts to create human-written summaries. Traditional summarisation approaches that generate a summary based on the content cannot produce summaries comparable to human-written ones and thus are rarely used in practice. In this work, we use the Longformer-Encoder-Decoder (LED) model to handle long financial reports. We describe our experiments and participating systems in the financial narrative summarisation shared task. Multi-stage fine-tuning helps the model generalise better on niche domains and avoids the problem of catastrophic forgetting. We further investigate the effect of the staged fine-tuning approach on the FNS dataset. Our systems achieved promising results in terms of ROUGE scores on the validation dataset.
Original language: English
Title of host publication: Proceedings of the 4th Financial Narrative Processing Workshop (FNP 2022)
Editors: Mahmoud El-Haj, Paul Rayson, Nadhem Zmandar
Place of Publication: Paris
Publisher: European Language Resources Association (ELRA)
Pages: 73-78
Number of pages: 6
ISBN (Electronic): 9791095546740
Publication status: Published - 2022
Event: Financial Narrative Processing Workshop (4th : 2022) - Marseille, France
Duration: 24 Jun 2022 → 24 Jun 2022
Conference number: 4th

Conference

Conference: Financial Narrative Processing Workshop (4th : 2022)
Abbreviated title: FNP 2022
Country/Territory: France
City: Marseille
Period: 24/06/22 → 24/06/22

Bibliographical note

Copyright the Publisher. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Document summarisation
  • Financial documents
  • Longformer
  • LED
  • Sequential fine-tuning
