Abstract
Background: Datasets available for abstract sentence classification modelling consist predominantly of abstracts sourced from biomedical research.
Aims: To contribute a large non-biomedical multidisciplinary dataset for abstract sentence classification model research.
Method: Bulk extraction and transformation of Emerald Group Publishing structured abstracts indexed on Scopus.
Results: We present the largest multidisciplinary dataset for abstract sentence classification modelling, consisting of 1,050,397 sentences from 103,457 abstracts.
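As a rough illustration of the transformation step, the sketch below turns one structured abstract into labelled sentences. It is not the authors' pipeline: the function name `label_sentences`, the default heading list (borrowed from the headings of the abstract on this page) and the naive regex sentence splitter are all assumptions made for the example.

```python
import re

# Assumed heading labels, taken from the abstract on this page; the
# headings used in the actual dataset are not stated here.
DEFAULT_HEADINGS = ["Background", "Aims", "Method", "Results"]


def label_sentences(abstract, headings=DEFAULT_HEADINGS):
    """Split a structured abstract into (label, sentence) pairs."""
    # Capture each heading that is followed by a colon, so re.split
    # keeps the heading text in the result list.
    pattern = re.compile(r"(%s):\s*" % "|".join(re.escape(h) for h in headings))
    parts = pattern.split(abstract)
    # parts looks like [preamble, heading1, text1, heading2, text2, ...]
    pairs = []
    for label, text in zip(parts[1::2], parts[2::2]):
        # Naive sentence splitter (an assumption): break after '.', '!'
        # or '?' when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                pairs.append((label, sentence))
    return pairs
```

Each returned pair gives a sentence together with the heading it appeared under, which is the kind of label a sentence classification model would be trained to predict. A production pipeline would likely need a more robust sentence segmenter than the regex used here.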
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 17th Workshop of the Australasian Language Technology Association |
| Editors | Meladel Mistica, Massimo Piccardi, Andrew MacKinlay |
| Place of Publication | Melbourne, VIC |
| Publisher | Australasian Language Technology Association |
| Pages | 120-125 |
| Number of pages | 6 |
| Publication status | Published - 2019 |
| Event | 17th Annual Workshop of The Australasian Language Technology Association (ALTA 2019) - Sydney, Australia. Duration: 4 Dec 2019 → 6 Dec 2019 |
Conference
| Conference | 17th Annual Workshop of The Australasian Language Technology Association (ALTA 2019) |
|---|---|
| Country/Territory | Australia |
| City | Sydney |
| Period | 4/12/19 → 6/12/19 |
Bibliographical note
Copyright the Publisher 2019. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Structured abstracts
- Natural language processing
- Information systems