Abstract
Summarisation of long financial documents is a challenging task due to the lack of large-scale datasets and the need for domain knowledge experts to create human-written summaries. Traditional summarisation approaches that generate a summary based on the content cannot produce summaries comparable to human-written ones and thus are rarely used in practice. In this work, we use the Longformer-Encoder-Decoder (LED) model to handle long financial reports. We describe our experiments and participating systems in the financial narrative summarisation shared task. Multi-stage fine-tuning helps the model generalise better on niche domains and avoids the problem of catastrophic forgetting. We further investigate the effect of the staged fine-tuning approach on the FNS dataset. Our systems achieved promising results in terms of ROUGE scores on the validation dataset.
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th Financial Narrative Processing Workshop (FNP 2022) |
Editors | Mahmoud El-Haj, Paul Rayson, Nadhem Zmandar |
Place of Publication | Paris |
Publisher | European Language Resources Association (ELRA) |
Pages | 73-78 |
Number of pages | 6 |
ISBN (Electronic) | 9791095546740 |
Publication status | Published - 2022 |
Event | Financial Narrative Processing Workshop (4th : 2022) - Marseille, France. Duration: 24 Jun 2022 → 24 Jun 2022. Conference number: 4th |
Conference
Conference | Financial Narrative Processing Workshop (4th : 2022) |
---|---|
Abbreviated title | FNP 2022 |
Country/Territory | France |
City | Marseille |
Period | 24/06/22 → 24/06/22 |
Bibliographical note
Copyright the Publisher. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Document summarisation
- Financial documents
- Longformer
- LED
- Sequential fine-tuning