Abstract
Unsupervised representation learning on mixed data is highly challenging but rarely explored. It must address issues common in real-life mixed data, including sparsity, dynamics, and heterogeneity of attributes and values. This work introduces Mix2Vec, an effective and efficient unsupervised deep neural representer that automatically learns a universal representation of dynamic mixed data with these characteristics. Mix2Vec relies on three mechanisms to tackle these challenges: random shuffling prediction, prior distribution matching, and structural informativeness maximization. It converts complex mixed data into vector-space representations that are universal and comparable across all data objects, and transparent and reusable for both unsupervised and supervised learning tasks. Extensive experiments on four large mixed datasets demonstrate that Mix2Vec performs significantly better than state-of-the-art deep representation methods. We also empirically verify the designed mechanisms in terms of representation quality, visualization, and their ability to improve the performance of downstream tasks.
Original language | English |
---|---|
Title of host publication | 2020 IEEE 7th International Conference on Data Science and Advanced Analytics |
Subtitle of host publication | proceedings |
Editors | Geoff Webb, Zhongfei Zhang, Vincent S. Tseng, Graham Williams, Michalis Vlachos, Longbing Cao |
Place of Publication | Piscataway, NJ |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 118-127 |
Number of pages | 10 |
ISBN (Electronic) | 9781728182063 |
ISBN (Print) | 9781728182070 |
Publication status | Published - 2020 |
Externally published | Yes |
Event | 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020 - Sydney, Australia. Duration: 6 Oct 2020 → 9 Oct 2020 |
Conference
Conference | 7th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2020 |
---|---|
Country/Territory | Australia |
City | Sydney |
Period | 6 Oct 2020 → 9 Oct 2020 |
Keywords
- Unsupervised Learning
- Mixed Data
- Neural Networks
- Representation Learning
- Deep Learning