Abstract
Anomalous node detection in a static graph faces significant challenges due to the rarity of anomalies and the substantial cost of labeling their deviant structure and attribute patterns. These challenges give rise to data-centric problems, including extremely imbalanced data distributions and intricate graph learning, which significantly hinder machine learning and deep learning methods in discerning the patterns of graph anomalies with few labels. While these issues remain crucial, much of the current research focuses on addressing the induced technical challenges, treating the shortage of labeled data as a given. Distinct from previous efforts, this work focuses on tackling the data-centric problems by generating auxiliary training nodes that conform to the original graph topology and attribute distribution. We categorize this approach as data-centric, aiming to enhance existing anomaly detectors by training them on our synthetic data. However, the methods for generating nodes and the effectiveness of utilizing synthetic data for graph anomaly detection remain unexplored in this area. To answer these questions, we thoroughly investigate the denoising diffusion model. Drawing from our observations on the diffusion process, we illuminate the shifts in graph energy distribution and establish two principles for designing denoising neural networks tailored to graph anomaly generation. Building on these insights, we propose a diffusion-based graph generation method to synthesize training nodes, which can be promptly integrated to work with existing anomaly detectors. The empirical results on eight widely-used datasets demonstrate that our generated data can effectively enhance the performance of nine state-of-the-art graph detectors.
| Original language | English |
|---|---|
| Title of host publication | KDD '24 |
| Subtitle of host publication | proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining |
| Place of Publication | New York |
| Publisher | Association for Computing Machinery |
| Pages | 2153–2164 |
| Number of pages | 12 |
| ISBN (Electronic) | 9798400704901 |
| DOIs | |
| Publication status | Published - 2024 |
| Event | ACM SIGKDD Conference on Knowledge Discovery and Data Mining (30th : 2024) - Barcelona, Spain; Duration: 25 Aug 2024 → 29 Aug 2024 |
Conference
| Conference | ACM SIGKDD Conference on Knowledge Discovery and Data Mining (30th : 2024) |
|---|---|
| Abbreviated title | KDD '24 |
| Country/Territory | Spain |
| City | Barcelona |
| Period | 25/08/24 → 29/08/24 |
Bibliographical note
Copyright the Author(s) 2024. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Graph Anomaly Detection
- Generative Graph Diffusion
Fingerprint
Dive into the research topics of 'Graph anomaly detection with few labels: a data-centric approach'. Together they form a unique fingerprint.
Projects
- 1 Finished
- DP230100899: New Graph Mining Technologies to Enable Timely Exploration of Social Events
  Wu, J. (Primary Chief Investigator) & Yang, J. (Chief Investigator)
  1/01/23 → 31/12/25
  Project: Research