Abstract
The susceptibility of Deep Neural Networks (DNNs) to adversarial attacks has raised concerns regarding their practical applications in real-world scenarios. Although the vulnerability of DNNs to adversarial attacks has been extensively studied in the image domain, research in the audio domain, particularly in the black-box setting with Automatic Speech Recognition (ASR) models, remains limited. While various black-box attacks have been proposed for ASR models, such as transfer attacks, hardware attacks, and query-based attacks, this study concentrates on query-based black-box attacks. The article introduces a new gradient estimation technique, Temporal Natural Evolution Strategies (T-NES), to generate adversarial audio samples more efficiently than existing attacks. T-NES leverages the temporal correlation present in audio to speed up gradient estimation based on the probability scores returned by the target model. The empirical results on benchmark datasets, LibriSpeech and TEDLIUM, and two state-of-the-art ASR models, DeepSpeech2 and Wav2Letter, demonstrate that T-NES can generate successful attacks with up to 30% fewer queries than existing attacks within 500 queries. T-NES could provide a robust baseline for evaluating the black-box adversarial vulnerability of ASR systems.
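
The score-based gradient estimation mentioned in the abstract can be illustrated with a minimal sketch of plain Natural Evolution Strategies using antithetic sampling. This is not the paper's T-NES implementation. The function name `nes_gradient`, the `score_fn` callback, and all parameter values are hypothetical. The `frame_len` option, which shares one noise value across a block of consecutive samples, is only one assumed way that temporal correlation might be exploited.

```python
import random


def nes_gradient(score_fn, audio, sigma=0.01, n_pairs=20, frame_len=1, rng=None):
    """Estimate the gradient of score_fn at `audio` using score queries only.

    score_fn  : hypothetical black-box callable, list of floats -> float
    frame_len : ASSUMPTION for illustration; one Gaussian draw is shared by
                each block of `frame_len` consecutive samples
    Each antithetic pair costs two queries, so the total is 2 * n_pairs.
    """
    if rng is None:
        rng = random.Random(0)
    n = len(audio)
    n_frames = (n + frame_len - 1) // frame_len
    grad = [0.0] * n
    for _ in range(n_pairs):
        # Draw noise at frame resolution, then repeat it over each frame.
        coarse = [rng.gauss(0.0, 1.0) for _ in range(n_frames)]
        noise = [coarse[i // frame_len] for i in range(n)]
        # Query the model at the two mirrored perturbations.
        plus = score_fn([a + sigma * u for a, u in zip(audio, noise)])
        minus = score_fn([a - sigma * u for a, u in zip(audio, noise)])
        diff = plus - minus
        for i, u in enumerate(noise):
            grad[i] += diff * u
    scale = 1.0 / (2.0 * sigma * n_pairs)
    return [g * scale for g in grad]
```

With `frame_len=1` this reduces to ordinary NES over individual samples. Larger values lower the dimensionality of the search at the cost of a coarser estimate, which is the kind of trade-off a temporally aware estimator would have to manage.
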
Original language | English |
---|---|
Pages (from-to) | 3981-3992 |
Number of pages | 12 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 31 |
Publication status | Published - 2023 |
Title | Query-efficient black-box adversarial attacks on automatic speech recognition |

Projects
- 1 Finished
SUT led: Context-aware verification and validation framework for autonomous driving
Chen, T., Vu, H., Liu, H., Zheng, J. & Zhou, Z.
25/02/21 → 24/02/24
Project: Research