Topic segmentation with an ordering-based topic model

Lan Du, John K. Pate, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › Research › peer-review

Abstract

Documents from the same domain usually discuss similar topics in a similar order. However, the number of topics and the exact topics discussed in each individual document can vary. In this paper we present a simple topic model that uses generalised Mallows models and incomplete topic orderings to incorporate this ordering regularity into the probabilistic generative process of the new model. We show how to reparameterise the new model so that a point-wise sampling algorithm from the Bayesian word segmentation literature can be used for inference. This algorithm jointly samples not only the topic orders and the topic assignments but also topic segmentations of documents. Experimental results show that our model performs significantly better than the other ordering-based topic models on nearly all the corpora that we used, and competitively with other state-of-the-art topic segmentation models on corpora that have a strong ordering regularity.
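
As a rough illustration of the ordering component mentioned in the abstract, the following Python sketch draws a topic ordering from a generalised Mallows model using the standard inversion-count representation. It is not the authors' implementation: it ignores incomplete orderings, topic assignments, and segmentation, and the function names (`sample_inversion_counts`, `counts_to_ordering`) and the dispersion values are assumptions chosen for the example.

```python
import math
import random


def sample_inversion_counts(rho, rng):
    """Sample one inversion count per topic (all but the last topic).

    rho[j] is an assumed dispersion parameter for topic j (0-indexed).
    The count for topic j ranges over 0..K-1-j and has probability
    proportional to exp(-rho[j] * count).
    """
    num_topics = len(rho) + 1
    counts = []
    for j, dispersion in enumerate(rho):
        max_count = num_topics - 1 - j  # number of topics larger than j
        weights = [math.exp(-dispersion * v) for v in range(max_count + 1)]
        counts.append(rng.choices(range(max_count + 1), weights=weights)[0])
    return counts


def counts_to_ordering(counts):
    """Convert inversion counts into a topic ordering.

    counts[j] is the number of topics larger than j that precede j.
    Topics are inserted from largest to smallest, so inserting topic j
    at index counts[j] puts exactly that many larger topics before it.
    """
    num_topics = len(counts) + 1
    ordering = [num_topics - 1]
    for j in range(num_topics - 2, -1, -1):
        ordering.insert(counts[j], j)
    return ordering


# Example: four topics with moderate dispersion (values are illustrative).
rng = random.Random(0)
rho = [1.0, 1.0, 1.0]
counts = sample_inversion_counts(rho, rng)
print(counts, counts_to_ordering(counts))
```

In this parameterisation, larger dispersion values concentrate probability on the canonical ordering 0, 1, ..., K-1, while values near zero make all orderings close to equally likely.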

Language: English
Title of host publication: Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Place of Publication: Palo Alto, CA
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 2232-2238
Number of pages: 7
Volume: 3
ISBN (Electronic): 9781577357018
Publication status: Published - 1 Jun 2015
Event: 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 - Austin, United States
Duration: 25 Jan 2015 - 30 Jan 2015

Other

Other: 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Country: United States
City: Austin
Period: 25/01/15 - 30/01/15

Cite this

Du, L., Pate, J. K., & Johnson, M. (2015). Topic segmentation with an ordering-based topic model. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 (Vol. 3, pp. 2232-2238). Palo Alto, CA: Association for the Advancement of Artificial Intelligence.
Du, Lan ; Pate, John K. ; Johnson, Mark. / Topic segmentation with an ordering-based topic model. Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. Vol. 3 Palo Alto, CA : Association for the Advancement of Artificial Intelligence, 2015. pp. 2232-2238
@inproceedings{7003e3a4843e49a0aa8806f4506e0869,
title = "Topic segmentation with an ordering-based topic model",
abstract = "Documents from the same domain usually discuss similar topics in a similar order. However, the number of topics and the exact topics discussed in each individual document can vary. In this paper we present a simple topic model that uses generalised Mallows models and incomplete topic orderings to incorporate this ordering regularity into the probabilistic generative process of the new model. We show how to reparameterise the new model so that a point-wise sampling algorithm from the Bayesian word segmentation literature can be used for inference. This algorithm jointly samples not only the topic orders and the topic assignments but also topic segmentations of documents. Experimental results show that our model performs significantly better than the other ordering-based topic models on nearly all the corpora that we used, and competitively with other state-of-the-art topic segmentation models on corpora that have a strong ordering regularity.",
author = "Du, Lan and Pate, {John K.} and Johnson, Mark",
year = "2015",
month = "6",
day = "1",
language = "English",
volume = "3",
pages = "2232--2238",
booktitle = "Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015",
publisher = "Association for the Advancement of Artificial Intelligence",
address = "Palo Alto, CA",

}

Du, L, Pate, JK & Johnson, M 2015, Topic segmentation with an ordering-based topic model. in Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. vol. 3, Association for the Advancement of Artificial Intelligence, Palo Alto, CA, pp. 2232-2238, 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015, Austin, United States, 25/01/15.

TY - GEN
T1 - Topic segmentation with an ordering-based topic model
AU - Du, Lan
AU - Pate, John K.
AU - Johnson, Mark
PY - 2015/6/1
Y1 - 2015/6/1
AB - Documents from the same domain usually discuss similar topics in a similar order. However, the number of topics and the exact topics discussed in each individual document can vary. In this paper we present a simple topic model that uses generalised Mallows models and incomplete topic orderings to incorporate this ordering regularity into the probabilistic generative process of the new model. We show how to reparameterise the new model so that a point-wise sampling algorithm from the Bayesian word segmentation literature can be used for inference. This algorithm jointly samples not only the topic orders and the topic assignments but also topic segmentations of documents. Experimental results show that our model performs significantly better than the other ordering-based topic models on nearly all the corpora that we used, and competitively with other state-of-the-art topic segmentation models on corpora that have a strong ordering regularity.
UR - http://www.scopus.com/inward/record.url?scp=84959934772&partnerID=8YFLogxK
M3 - Conference proceeding contribution
VL - 3
SP - 2232
EP - 2238
BT - Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
PB - Association for the Advancement of Artificial Intelligence
CY - Palo Alto, CA
ER -

Du L, Pate JK, Johnson M. Topic segmentation with an ordering-based topic model. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. Vol. 3. Palo Alto, CA: Association for the Advancement of Artificial Intelligence. 2015. p. 2232-2238