Leveraging meta information in short text aggregation

He Zhao, Lan Du*, Guanfeng Liu, Wray Buntine

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

Abstract

Analysing topics in short texts (e.g., tweets and news headings) is a challenging task because short texts often contain insufficient word co-occurrence information, which is important for learning good topics in conventional topic models. To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. Our model can generate more interpretable topics as well as document clusters. We develop an effective Gibbs sampling algorithm favoured by the fully local conjugacy in the model. Extensive experiments demonstrate that our model achieves better performance in terms of document clustering and topic coherence.

Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Place of Publication: Stroudsburg, PA
Publisher: Association for Computational Linguistics (ACL)
Pages: 4042-4049
Number of pages: 8
ISBN (Print): 9781950737482
Publication status: Published - 2019
Event: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: 28 Jul 2019 to 2 Aug 2019

Publication series

Name: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country: Italy
City: Florence
Period: 28/07/19 to 2/08/19

Bibliographical note

Copyright the Publisher 2019. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
