What and with whom? Identifying topics in Twitter through both interactions and text

Robertus Nugroho, Jian Yang, Weiliang Zhao, Cecile Paris, Surya Nepal

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)


The overwhelming amount of information continuously flowing through the Twitter environment makes topic derivation essential. It plays a valuable role in a variety of Twitter-based applications, including content recommendation, news summarization, and market analysis. Topic derivation methods are typically based on semantic features of tweet contents. Because tweets are short by nature, such methods suffer from data sparsity. To alleviate this problem, this paper proposes a topic derivation method that incorporates both tweet text similarity and interaction measures. Besides the tweet contents, the approach takes into account several types of interactions amongst tweets: tweets that mention the same people, replies, and retweets. Topic derivation is done through a two-step matrix factorization process. We conducted a number of experiments on several Twitter datasets to reveal both the individual and integrated effects of the various features being considered. Our experimental results on the TREC2014 dataset and our self-collected tweetMarch dataset demonstrate that the proposed method provides an improvement of more than 30 percent over other advanced topic derivation methods.
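The abstract does not spell out the factorization itself, so the following is only a minimal sketch of how a two-step, interaction-aware factorization could look, not the authors' algorithm. It assumes a tweet-to-tweet relationship matrix built from interactions and cosine text similarity, a first non-negative factorization that yields tweet-topic weights, and a second one that keeps those weights fixed to recover topic-term weights. The toy tweets, the `max` rule for combining interaction and similarity scores, and all function names are illustrative assumptions.

```python
import math
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, iters=300, seed=0, fixed_w=None):
    """Approximate non-negative V (n x m) by W (n x k) @ H (k x m) using
    multiplicative updates. If fixed_w is given, only H is updated."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    if fixed_w is not None:
        w = fixed_w
    else:
        w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        if fixed_w is None:
            ht = transpose(h)
            num, den = matmul(v, ht), matmul(w, matmul(h, ht))
            w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
                 for i in range(n)]
    return w, h

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Hypothetical toy data: 4 tweets, 4 terms (rows are term counts per tweet).
terms = ["nmf", "topic", "goal", "match"]
v = [[2, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 2],
     [0, 0, 1, 1]]
# Hypothetical interactions: tweet 1 replies to tweet 0, tweet 3 retweets tweet 2.
inter = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]

# Assumed combination rule: take the stronger of interaction and text similarity.
n = len(v)
a = [[max(inter[i][j], cosine(v[i], v[j])) for j in range(n)] for i in range(n)]

# Step 1: factor the tweet-to-tweet matrix to get tweet-topic weights W.
w, _ = nmf(a, k=2)
# Step 2: keep W fixed and factor the tweet-term matrix to get topic-term weights Z.
_, z = nmf(v, k=2, fixed_w=w)

topics = [max(range(2), key=lambda j: row[j]) for row in w]
for t in range(2):
    top = terms[max(range(len(terms)), key=lambda c: z[t][c])]
    members = [i for i, tp in enumerate(topics) if tp == t]
    print(f"topic {t}: tweets {members}, top term {top!r}")
```

In this sketch the interaction scores let tweets that share few words still land in the same topic, which is the intuition the abstract gives for countering the sparsity of short texts.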

Original language: English
Pages (from-to): 584-596
Number of pages: 13
Journal: IEEE Transactions on Services Computing
Issue number: 3
Publication status: Published - May 2020


  • joint-NMF
  • topic derivation
  • tweets interactions
  • Twitter

