Mix2Vec: unsupervised mixed data representation

Chengzhang Zhu, Qi Zhang, Longbing Cao, Arman Abrahamyan

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

7 Citations (Scopus)

Abstract

Unsupervised representation learning on mixed data is highly challenging but rarely explored. It has to tackle significant challenges related to common issues in real-life mixed data, including sparsity, dynamics and heterogeneity of attributes and values. This work introduces an effective and efficient unsupervised deep representer called Mix2Vec to automatically learn a universal representation of dynamic mixed data with the above complex characteristics. Mix2Vec is empowered with three effective mechanisms: random shuffling prediction, prior distribution matching, and structural informativeness maximization, to tackle the aforementioned challenges. These mechanisms are implemented as an unsupervised deep neural representer Mix2Vec. Mix2Vec converts complex mixed data into vector space-based representations that are universal and comparable to all data objects and transparent and reusable for both unsupervised and supervised learning tasks. Extensive experiments on four large mixed datasets demonstrate that Mix2Vec performs significantly better than state-of-the-art deep representation methods. We also empirically verify the designed mechanisms in terms of representation quality, visualization and capability of enabling better performance of downstream tasks.

Original language: English
Title of host publication: 2020 IEEE 7th International Conference on Data Science and Advanced Analytics
Subtitle of host publication: proceedings
Editors: Geoff Webb, Zhongfei Zhang, Vincent S. Tseng, Graham Williams, Michalis Vlachos, Longbing Cao
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 118-127
Number of pages: 10
ISBN (Electronic): 9781728182063
ISBN (Print): 9781728182070
Publication status: Published - 2020
Externally published: Yes
Event: 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020 - Sydney, Australia
Duration: 6 Oct 2020 to 9 Oct 2020

Conference

Conference: 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020
Country/Territory: Australia
City: Sydney
Period: 6/10/20 to 9/10/20

Keywords

  • Unsupervised Learning
  • Mixed Data
  • Neural Networks
  • Representation Learning
  • Deep Learning
