With the latest developments in database technologies, it becomes easier to store the medical records of hospital patients from their first day of admission than was previously possible. In Intensive Care Units (ICU), modern medical information systems can record patient events in relational databases every second. Knowledge mining from these huge volumes of medical data is beneficial to both caregivers and patients. Given a set of electronic patient records, a system that effectively assigns the disease labels can facilitate medical database management and also benefit other researchers, e.g., pathologists. In this paper, we have proposed a framework to achieve that goal. Medical chart and note data of a patient are used to extract distinctive features. To encode patient features, we apply a Bag-of-Words encoding method for both chart and note data. We also propose a model that takes into account both global information and local correlations between diseases. Correlated diseases are characterized by a graph structure that is embedded in our sparsity-based framework. Our algorithm captures the disease relevance when labeling disease codes rather than making individual decision with respect to a specific disease. At the same time, the global optimal values are guaranteed by our proposed convex objective function. Extensive experiments have been conducted on a real-world large-scale ICU database. The evaluation results demonstrate that our method improves multi-label classification results by successfully incorporating disease correlations.
|Number of pages||12|
|Journal||IEEE Transactions on Knowledge and Data Engineering|
|Publication status||Published - Dec 2016|
- ICD code labeling
- multi-label learning
- sparsity-based regularization
- disease correlation embedding