Classifying high dimensional data by interactive visual analysis

Ke Bing Zhang*, Mehmet A. Orgun, Rajan Shankaran, Du Zhang

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

1 Citation (Scopus)

Abstract

Data mining techniques such as classification algorithms are applied to data that are usually high dimensional and very large. To assist the user in performing a classification task, visual techniques can be employed to represent high dimensional data in a more comprehensible 2D or 3D space. However, representing high dimensional data in 2D or 3D space may unavoidably cause overlapping data and information loss. This issue can be addressed by interactive visualization. With expert domain knowledge, the user can interactively build classifiers that are competitive with automated ones using a 2D or 3D visual interface. Several visual techniques have been proposed for classifying high dimensional data. However, the user's interaction with those techniques depends heavily on the user's experience in visually identifying classes in the data, and as a result, the classification results of those techniques may vary and may not be repeatable. To address this deficiency, this article presents an interactive visual approach to the classification of high dimensional data. Our approach employs the enhanced separation feature of a visual technique called HOV3, by which the user plots the training dataset in a 2D space by applying statistical measurements in order to separate data points into groups with the same class labels. A data group, together with the statistical measurement that separated it from the others, is taken as a visual classifier. The user then mixes the data points in a classifier with the unlabeled dataset and plots them in HOV3 using the measurement of the classifier. The data points that overlap the labeled ones in the 2D space are assigned the corresponding label. Our approach avoids the randomness of existing interactive visual classification techniques, as the visual classifier in this approach depends only on the training dataset and its statistical measurement. As a result, this work provides an intuitive and effective approach to classifying high dimensional data by interactive visualization.
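
The classification step described in the abstract (mix labeled and unlabeled points, plot them with the classifier's measurement, and label whatever overlaps) can be illustrated with a minimal sketch. It is not the authors' implementation: the weighted star-coordinate projection below is only an assumed stand-in for the HOV3 plot, and the function names `project` and `classify`, the `measure` vector, and the `radius` threshold used to decide "overlap" are all hypothetical.

```python
import cmath
import math


def project(points, measure):
    """Project n-dimensional points to 2D on weighted star-coordinate axes.

    Dimension k gets a unit axis at angle 2*pi*k/n, scaled by measure[k].
    This projection is an assumption for illustration, not the HOV3 definition.
    """
    n = len(measure)
    axes = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    # Min-max normalise every dimension over all plotted points.
    lows = [min(p[k] for p in points) for k in range(n)]
    highs = [max(p[k] for p in points) for k in range(n)]
    projected = []
    for p in points:
        z = 0j
        for k in range(n):
            span = highs[k] - lows[k]
            norm = (p[k] - lows[k]) / span if span else 0.0
            z += norm * measure[k] * axes[k]
        projected.append((z.real, z.imag))
    return projected


def classify(labeled, labels, unlabeled, measure, radius=0.05):
    """Label each unlabeled point with its nearest labeled point's class.

    A point counts as overlapping only if that neighbour lies within
    `radius` in the 2D plot; otherwise it stays unlabeled (None).
    """
    # Plot both sets together so they share one normalisation.
    pts = project(labeled + unlabeled, measure)
    lab_xy, unl_xy = pts[:len(labeled)], pts[len(labeled):]
    result = []
    for ux, uy in unl_xy:
        best, best_d = None, radius
        for (lx, ly), lab in zip(lab_xy, labels):
            d = math.hypot(ux - lx, uy - ly)
            if d <= best_d:
                best, best_d = lab, d
        result.append(best)
    return result


if __name__ == "__main__":
    training = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
    classes = ["a", "b", "c"]
    unknown = [[10.0, 0.0, 0.0], [0.0, 0.0, 10.0]]
    print(classify(training, classes, unknown, measure=[1.0, 1.0, 1.0]))
```

Because the labels depend only on the training points, the measurement vector, and the overlap threshold, rerunning the sketch on the same inputs gives the same labels, which mirrors the repeatability argument made in the abstract.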

Original language: English
Pages (from-to): 24-36
Number of pages: 13
Journal: Journal of Visual Languages and Computing
Volume: 33
Publication status: Published - 1 Apr 2016

Keywords

  • Interactive Visual Analysis (IVA)
  • Classification
  • Visual classifier
  • Data projection
