Diagnosis code assignment using sparsity-based disease correlation embedding

Sen Wang, Xiao Jun Chang, Xue Li, Guodong Long, Lina Yao, Quan Z. Sheng

Research output: Contribution to journal › Article › peer-review

75 Citations (Scopus)


With the latest developments in database technologies, it has become easier to store the medical records of hospital patients from their first day of admission than was previously possible. In Intensive Care Units (ICU), modern medical information systems can record patient events in relational databases every second. Knowledge mining from these huge volumes of medical data is beneficial to both caregivers and patients. Given a set of electronic patient records, a system that effectively assigns disease labels can facilitate medical database management and also benefit other researchers, e.g., pathologists. In this paper, we propose a framework to achieve that goal. Medical chart and note data of a patient are used to extract distinctive features. To encode patient features, we apply a Bag-of-Words encoding method to both chart and note data. We also propose a model that takes into account both global information and local correlations between diseases. Correlated diseases are characterized by a graph structure that is embedded in our sparsity-based framework. Our algorithm captures disease relevance when labeling disease codes rather than making an individual decision with respect to each specific disease. At the same time, a globally optimal solution is guaranteed by our proposed convex objective function. Extensive experiments have been conducted on a real-world large-scale ICU database. The evaluation results demonstrate that our method improves multi-label classification results by successfully incorporating disease correlations.
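The abstract does not state the objective function, so the following is only an illustrative sketch of how a graph of correlated diseases might be embedded in a sparsity-regularized multi-label model. Everything specific here is an assumption rather than the authors' formulation: the least-squares loss, the l2,1 norm as the sparsity term, the Laplacian trace term built from cosine similarity between label columns, the proximal-gradient solver, and all function and parameter names. The patient features `X` are assumed to be already encoded (for example as Bag-of-Words histograms).

```python
import numpy as np

def label_laplacian(Y):
    """Graph Laplacian over labels; edge weights are the cosine
    similarity between label columns (an assumed choice of graph)."""
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0                      # guard against unused labels
    S = (Y.T @ Y) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)                     # no self-loops
    return np.diag(S.sum(axis=1)) - S

def objective(X, Y, W, L, alpha, beta):
    """Assumed convex objective:
    ||XW - Y||_F^2 + alpha * ||W||_{2,1} + beta * tr(W L W^T)."""
    fit_term = np.linalg.norm(X @ W - Y, "fro") ** 2
    sparsity = np.linalg.norm(W, axis=1).sum()   # l2,1 norm: sum of row norms
    smooth = np.trace(W @ L @ W.T)               # correlated labels get similar weights
    return fit_term + alpha * sparsity + beta * smooth

def fit(X, Y, alpha=0.1, beta=0.1, iters=200):
    """Proximal gradient descent on the assumed objective."""
    L = label_laplacian(Y)
    W = np.zeros((X.shape[1], Y.shape[1]))
    # Step size 1 / (Lipschitz constant of the smooth part's gradient).
    lip = 2.0 * (np.linalg.norm(X, 2) ** 2 + beta * np.linalg.norm(L, 2))
    step = 1.0 / lip
    for _ in range(iters):
        grad = 2.0 * X.T @ (X @ W - Y) + 2.0 * beta * W @ L
        V = W - step * grad
        # Proximal operator of the l2,1 norm: row-wise soft thresholding.
        row_norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * alpha / np.maximum(row_norms, 1e-12))
        W = shrink * V
    return W, L

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 6))                 # toy stand-in for encoded patient features
    Y = (X[:, [0, 0, 1]] > 0).astype(float)      # labels 0 and 1 are perfectly correlated
    W, L = fit(X, Y)
    scores = X @ W                               # one score per (patient, disease label)
    print(scores.shape)
```

Because every term in this assumed objective is convex (the Laplacian is positive semidefinite when the edge weights are non-negative), any minimizer found is a global one, which matches the kind of guarantee the abstract describes.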

Original language: English
Pages (from-to): 3191-3202
Number of pages: 12
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 12
Publication status: Published - Dec 2016
Externally published: Yes


  • ICD code labeling
  • multi-label learning
  • sparsity-based regularization
  • disease correlation embedding


