Abstract
The weakly supervised video anomaly detection aims to learn a detection model using only video-level labeled data. Prior studies ignore the complexity or duration of anomalies present in abnormal videos during temporal modeling. Moreover, existing works usually detect the most abnormal segments, potentially overlooking the completeness of anomalies. We propose a dynamic erasing network (DE-Net) for weakly supervised video anomaly detection, which learns video-specific temporal features via adaptive temporal modeling (ATM) to address these limitations. Specifically, to handle duration variations of abnormal events, we propose an ATM module capable of adaptively selecting and aggregating the most appropriate K temporal scale features for each video. Then, we design a dynamic erasing (DE) strategy that dynamically assesses the completeness of the detected anomalies and erases prominent abnormal segments to encourage the model to discover gentle abnormal segments. The proposed method achieves favorable performance compared to several state-of-the-art approaches on the widely used XD-Violence, TAD, and UCF-Crime datasets.
| Original language | English |
|---|---|
| Pages (from-to) | 16706-16720 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 36 |
| Issue number | 9 |
| Early online date | 8 Apr 2025 |
| DOIs | |
| Publication status | Published - Sept 2025 |
Fingerprint
Dive into the research topics of 'Dynamic erasing network with adaptive temporal modeling for weakly supervised video anomaly detection'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver