TY - JOUR
T1 - Outcome polarity identification of medical papers
AU - Sarker, Abeed
AU - Mollá-Aliod, Diego
AU - Paris, Cécile
N1 - Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2011
Y1 - 2011
N2 - A medical publication may or may not present an outcome. When an outcome is present, its polarity may be positive, negative or neutral. Information about the polarity of an outcome is a vital one, particularly for practitioners who use the outcome information for decision making. We model the problem of automatic outcome polarity identification as a three-way document classification problem and attempt to solve it via supervised machine learning. We combine domain knowledge and linguistic features of medical text, and apply natural language processing to extract features for the chosen classifiers. We introduce two novel features — Relative Average Negation Count and Sentence Signature — and show that they are effective in improving classification accuracy. We also include features, such as n-grams and semantic orientation of terms, that have been used for similar text classification problems in other domains. Using these features, we obtain a maximum accuracy of 74.9% for the classification problem. Our experiments suggest that through careful feature selection, machine learning can be used to solve this problem.
AB - A medical publication may or may not present an outcome. When an outcome is present, its polarity may be positive, negative or neutral. Information about the polarity of an outcome is a vital one, particularly for practitioners who use the outcome information for decision making. We model the problem of automatic outcome polarity identification as a three-way document classification problem and attempt to solve it via supervised machine learning. We combine domain knowledge and linguistic features of medical text, and apply natural language processing to extract features for the chosen classifiers. We introduce two novel features — Relative Average Negation Count and Sentence Signature — and show that they are effective in improving classification accuracy. We also include features, such as n-grams and semantic orientation of terms, that have been used for similar text classification problems in other domains. Using these features, we obtain a maximum accuracy of 74.9% for the classification problem. Our experiments suggest that through careful feature selection, machine learning can be used to solve this problem.
M3 - Conference paper
SP - 105
EP - 114
JO - Proceedings of the Australasian Language Technology Association Workshop 2011
JF - Proceedings of the Australasian Language Technology Association Workshop 2011
SN - 1834-7037
T2 - Australasian Language Technology Workshop (9th : 2011)
Y2 - 1 December 2011 through 2 December 2011
ER -