Similarity metrics for clustering PubMed abstracts for evidence based medicine

Hamed Hassanzadeh, Diego Mollá, Tudor Groza, Anthony Nguyen, Jane Hunter

Research output: Contribution to journal, Conference paper, peer-review

1 Citation (Scopus)


We present a clustering approach for documents returned by a PubMed search, which enables the organisation of evidence underpinning clinical recommendations for Evidence Based Medicine. Our approach combines several document similarity metrics and feeds them to an agglomerative hierarchical clusterer. These metrics quantify the similarity of published abstracts from syntactic, semantic, and statistical perspectives. Several evaluations have been performed, including: an evaluation that uses ideal documents as selected and clustered by clinical experts; a method that maps the output of PubMed to the ideal clusters annotated by the experts; and an alternative evaluation that uses the manual clustering of abstracts. The results of using our similarity metrics approach show an improvement over K-means and hierarchical clustering methods using TFIDF.
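As a rough illustration of the general idea described in the abstract (several pairwise similarity scores combined and passed to an agglomerative hierarchical clusterer), the following minimal Python sketch may help. It is not the authors' implementation: the two metrics (Jaccard overlap and term-frequency cosine), the unweighted average used to combine them, the average-linkage rule, and the toy documents are all assumptions chosen for brevity.

```python
import math
from collections import Counter
from itertools import combinations


def tokens(text):
    # Naive whitespace tokenisation (assumption; real abstracts need more care).
    return text.lower().split()


def jaccard(a, b):
    # Set overlap of word types, standing in for a surface-level metric.
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    # Cosine similarity of raw term-frequency vectors, standing in for a statistical metric.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def combined_similarity(a, b):
    # Hypothetical combination rule: unweighted mean of the individual metrics.
    return (jaccard(a, b) + cosine(a, b)) / 2.0


def agglomerative(docs, n_clusters):
    # Precompute the combined similarity for every pair of documents.
    sim = {
        frozenset((i, j)): combined_similarity(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        # Average linkage: mean similarity over all cross-cluster pairs.
        return sum(sim[frozenset((i, j))] for i in c1 for j in c2) / (len(c1) * len(c2))

    # Repeatedly merge the two most similar clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy documents (invented for illustration, not PubMed data).
docs = [
    "aspirin reduces risk of stroke",
    "aspirin lowers stroke risk in adults",
    "insulin therapy for diabetes control",
    "diabetes control with insulin therapy",
]
clusters = agglomerative(docs, 2)
print(clusters)
```

On these toy inputs the two aspirin sentences and the two insulin sentences share no vocabulary across groups, so they end up in separate clusters. A semantic metric, as mentioned in the abstract, would need an external resource and is left out of this sketch.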
Original language: English
Pages (from-to): 48-56
Number of pages: 9
Journal: ALTA 2015 : Proceedings of Australasian Language Technology Association Workshop 2015
Publication status: Published - 2015
Event: Australasian Language Technology Association Workshop (13th : 2015) - Parramatta, NSW
Duration: 8 Dec 2015 to 9 Dec 2015

