Abstract
A simple machine learning model of pluralisation as a linear regression problem minimising a p-adic metric substantially outperforms even the most robust of Euclidean-space regressors on languages in the Indo-European, Austronesian, Trans-New Guinea, Sino-Tibetan, Nilo-Saharan, Oto-Manguean and Atlantic-Congo language families. There is insufficient evidence to support modelling distinct noun declensions as a p-adic neighbourhood even in Indo-European languages.
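To make "linear regression minimising a p-adic metric" concrete, the following is a minimal sketch, not the paper's implementation. It assumes integer-valued data and a brute-force search over small integer coefficients. The function names (`p_adic_abs`, `p_adic_loss`, `fit_line_p_adic`) and the search range are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def p_adic_valuation(n: int, p: int) -> int:
    """Return the largest k such that p**k divides the non-zero integer n."""
    if n == 0:
        raise ValueError("the p-adic valuation of 0 is infinite")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def p_adic_abs(n: int, p: int) -> Fraction:
    """p-adic absolute value |n|_p = p**(-v_p(n)), with |0|_p = 0."""
    if n == 0:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(n, p))


def p_adic_loss(a: int, b: int, xs, ys, p: int) -> Fraction:
    """Sum of p-adic absolute residuals for the line y = a*x + b."""
    return sum((p_adic_abs(y - (a * x + b), p) for x, y in zip(xs, ys)),
               Fraction(0))


def fit_line_p_adic(xs, ys, p: int, search=range(-10, 11)):
    """Illustrative assumption: exhaustive search over integer (a, b)."""
    candidates = ((a, b) for a in search for b in search)
    return min(candidates, key=lambda ab: p_adic_loss(ab[0], ab[1], xs, ys, p))


if __name__ == "__main__":
    # A residual divisible by a high power of p counts as "small".
    print(p_adic_abs(8, 2), p_adic_abs(3, 2))
    print(fit_line_p_adic([1, 2, 3], [3, 5, 7], p=2))
```

Unlike a Euclidean loss, this loss treats a residual of 8 as smaller than a residual of 3 when p = 2, because 8 is divisible by a higher power of 2. An exhaustive search is used here only because the p-adic loss is not differentiable in the usual sense; the optimisation method actually used in the paper is not stated in this abstract.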
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) |
Place of Publication | Stroudsburg |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 24-32 |
Number of pages | 9 |
ISBN (Electronic) | 9781955917643 |
Publication status | Published - 2022 |
Event | The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing - Online. Duration: 20 Nov 2022 → 23 Nov 2022 |
Conference
Conference | The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing |
---|---|
Abbreviated title | AACL-IJCNLP 2022 |
City | Online |
Period | 20/11/22 → 23/11/22 |