Learning multiple diagnosis codes for ICU patients with local disease correlation mining

Sen Wang, Xue Li, Xiaojun Chang*, Lina Yao, Quan Z. Sheng, Guodong Long

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

45 Citations (Scopus)


In the era of big data, there is demand for a mechanism that can automatically annotate patients' records with disease codes in medical information systems. This work proposes a framework that automatically annotates multi-source patient data from Intensive Care Units (ICUs) with disease labels. We extract features from two main sources, medical charts and notes, and encode them with the Bag-of-Words model. Unlike most existing multi-label learning algorithms, which consider correlations between diseases globally, our model learns disease correlation locally in the patient data. To achieve this, we derive a local disease correlation representation that enriches the discriminant power of each patient's data. This representation is embedded into a unified multi-label learning framework. We develop an alternating algorithm to iteratively optimize the objective function. Extensive experiments have been conducted on a real-world ICU database, comparing our algorithm with representative multi-label learning algorithms. Evaluation results show that the proposed method achieves state-of-the-art performance in annotating multiple diagnostic codes for ICU patients. This study suggests that problems in automated diagnosis code annotation can be reliably addressed by a multi-label learning model that exploits disease correlation. The findings will benefit health care and management in ICUs, since automated diagnosis code annotation can significantly improve the quality and management of health care for both patients and caregivers.
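The two ideas named in the abstract, Bag-of-Words encoding and a locally derived disease correlation representation, can be illustrated with a small sketch. This is not the authors' formulation: the abstract does not give the objective function or the alternating algorithm, so the sketch swaps in a plain nearest-neighbour label average as a stand-in for "local" correlation. The notes, the label columns, and the function names `bag_of_words`, `cosine`, and `local_label_profile` are all hypothetical.

```python
from collections import Counter
import math


def bag_of_words(docs):
    """Encode whitespace-tokenized documents as term-count vectors."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, count in Counter(doc.lower().split()).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def local_label_profile(vectors, labels, k=1):
    """Assumed stand-in for a local correlation representation:
    average the label vectors of each patient's k most similar other patients."""
    n_labels = len(labels[0])
    profiles = []
    for i, vec in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        nearest = sorted(others, key=lambda j: cosine(vec, vectors[j]), reverse=True)[:k]
        profiles.append(
            [sum(labels[j][c] for j in nearest) / len(nearest) for c in range(n_labels)]
        )
    return profiles


# Hypothetical toy notes and labels (columns: sepsis, renal failure, infarction).
notes = [
    "sepsis fever hypotension",
    "sepsis fever renal failure",
    "chest pain infarction",
    "chest pain infarction hypotension",
]
labels = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]

vocab, vectors = bag_of_words(notes)
profiles = local_label_profile(vectors, labels, k=1)

# Enriched representation: Bag-of-Words counts followed by the local label profile.
enriched = [vec + profile for vec, profile in zip(vectors, profiles)]
```

In this toy setting each patient's profile reflects only the diseases seen among its nearest neighbours, rather than co-occurrence statistics over the whole cohort, which is the contrast with global correlation that the abstract draws. For unseen patients, neighbours would have to come from labelled training data.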
Original language: English
Article number: 31
Pages (from-to): 1-21
Number of pages: 21
Journal: ACM Transactions on Knowledge Discovery from Data
Issue number: 3
Publication status: Published - 2017


  • diagnosis code annotation
  • ICU data mining
  • pattern discovery
  • MIMIC II database
  • multi-label learning
  • local correlation exploiting

