We present a clustering approach for documents returned by a PubMed search, which enables the organisation of evidence underpinning clinical recommendations for Evidence Based Medicine. Our approach combines several document similarity metrics and feeds them to an agglomerative hierarchical clusterer. These metrics quantify the similarity of published abstracts from syntactic, semantic, and statistical perspectives. Several evaluations have been performed, including: an evaluation that uses ideal documents as selected and clustered by clinical experts; a method that maps the output of PubMed to the ideal clusters annotated by the experts; and an alternative evaluation that uses the manual clustering of abstracts. Our similarity metrics approach shows an improvement over K-means and hierarchical clustering methods that use TF-IDF.
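The general idea of combining similarity metrics and passing the result to an agglomerative hierarchical clusterer can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the individual metrics, so `jaccard` and `cosine_tf` are simple stand-ins, and the equal `weights`, the average-linkage criterion, and the toy `docs` are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def tokens(text):
    """Lowercase whitespace tokenisation (a deliberately crude assumption)."""
    return text.lower().split()


def jaccard(a, b):
    """Token-set overlap; a placeholder for a surface-level metric."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def cosine_tf(a, b):
    """Cosine of raw term-frequency vectors; a placeholder for a statistical metric."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def combined_similarity(a, b, weights=(0.5, 0.5)):
    """Weighted sum of the two placeholder metrics (weights are hypothetical)."""
    return weights[0] * jaccard(a, b) + weights[1] * cosine_tf(a, b)


def agglomerative(docs, n_clusters, similarity=combined_similarity):
    """Average-linkage agglomerative clustering over pairwise similarities.

    Returns a sorted list of clusters, each a sorted list of document indices.
    """
    sim = {
        (i, j): similarity(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        # Mean similarity over all cross-cluster document pairs.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two clusters with the highest average similarity.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return sorted(clusters)


# Toy abstracts, invented purely for illustration.
docs = [
    "aspirin reduces risk of stroke",
    "aspirin lowers stroke risk in adults",
    "insulin therapy for type 2 diabetes",
    "metformin and insulin in diabetes care",
]
print(agglomerative(docs, 2))
```

Swapping in other metrics only requires passing a different `similarity` function; the clustering loop itself depends only on the pairwise scores.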
Number of pages: 9
Journal: ALTA 2015 : Proceedings of Australasian Language Technology Association Workshop 2015
Publication status: Published - 2015
Event: Australasian Language Technology Association Workshop (13th : 2015) - Parramatta, NSW
Duration: 8 Dec 2015 → 9 Dec 2015
Hassanzadeh, H., Mollá, D., Groza, T., Nguyen, A., & Hunter, J. (2015). Similarity metrics for clustering PubMed abstracts for evidence based medicine. ALTA 2015 : Proceedings of Australasian Language Technology Association Workshop 2015, 48-56.