Abstract
Summarisation of long financial documents is a challenging task due to the lack of large-scale datasets and the need for domain knowledge experts to create human-written summaries. Traditional summarisation approaches that generate a summary based on the content cannot produce summaries comparable to human-written ones and thus are rarely used in practice. In this work, we use the Longformer-Encoder-Decoder (LED) model to handle long financial reports. We describe our experiments and participating systems in the financial narrative summarisation shared task. Multi-stage fine-tuning helps the model generalise better on niche domains and avoids the problem of catastrophic forgetting. We further investigate the effect of the staged fine-tuning approach on the FNS dataset. Our systems achieved promising results in terms of ROUGE scores on the validation dataset.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 4th Financial Narrative Processing Workshop (FNP 2022) |
| Editors | Mahmoud El-Haj, Paul Rayson, Nadhem Zmandar |
| Place of Publication | Paris |
| Publisher | European Language Resources Association (ELRA) |
| Pages | 73-78 |
| Number of pages | 6 |
| ISBN (Electronic) | 9791095546740 |
| Publication status | Published - 2022 |
| Event | Financial Narrative Processing Workshop (4th : 2022) - Marseille, France. Duration: 24 Jun 2022 → 24 Jun 2022. Conference number: 4th |
Conference
| Conference | Financial Narrative Processing Workshop (4th : 2022) |
|---|---|
| Abbreviated title | FNP 2022 |
| Country/Territory | France |
| City | Marseille |
| Period | 24/06/22 → 24/06/22 |
Bibliographical note
Copyright the Publisher. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Document summarisation
- Financial documents
- Longformer
- LED
- Sequential fine-tuning
Fingerprint
Dive into the research topics of 'Transformer-based models for long document summarisation in financial domain'. Together they form a unique fingerprint.