TY - JOUR
T1 - Learning multiple diagnosis codes for ICU patients with local disease correlation mining
AU - Wang, Sen
AU - Li, Xue
AU - Chang, Xiaojun
AU - Yao, Lina
AU - Sheng, Quan Z.
AU - Long, Guodong
PY - 2017
Y1 - 2017
AB - In the era of big data, there is demand for a mechanism that can automatically annotate patients' records in the medical information system with disease codes. The purpose of this work is to propose a framework that automatically annotates the disease labels of multi-source patient data in Intensive Care Units (ICUs). We extract features from two main sources, medical charts and notes. The Bag-of-Words model is used to encode the features. Unlike most existing multi-label learning algorithms, which consider correlations between diseases globally, our model learns disease correlation locally in the patient data. To achieve this, we derive a local disease correlation representation to enrich the discriminant power of each patient's data. This representation is embedded into a unified multi-label learning framework. We develop an alternating algorithm to iteratively optimize the objective function. Extensive experiments have been conducted on a real-world ICU database. We have compared our algorithm with representative multi-label learning algorithms. Evaluation results show that our proposed method achieves state-of-the-art performance in the annotation of multiple diagnostic codes for ICU patients. This study suggests that problems in automated diagnosis code annotation can be reliably addressed by using a multi-label learning model that exploits disease correlation. These findings stand to benefit health care and management in the ICU, since automated diagnosis code annotation can significantly improve the quality and management of health care for both patients and caregivers.
KW - diagnosis code annotation
KW - ICU data mining
KW - pattern discovery
KW - MIMIC II database
KW - multi-label learning
KW - local correlation exploiting
UR - http://purl.org/au-research/grants/arc/DP140100104
UR - http://purl.org/au-research/grants/arc/LP160100630
UR - http://www.scopus.com/inward/record.url?scp=85014767985&partnerID=8YFLogxK
U2 - 10.1145/3003729
DO - 10.1145/3003729
M3 - Article
AN - SCOPUS:85014767985
SN - 1556-4681
VL - 11
SP - 1
EP - 21
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 3
M1 - 31
ER -