Abstract
Despite the impressive capabilities of Deep Neural Networks (DNNs), these systems remain fault-prone due to unresolved issues of robustness to perturbations and concept drift. Existing approaches to interpreting faults often provide only low-level abstractions and struggle to extract meaningful concepts for understanding the root cause. Furthermore, these prior methods lack integration and generalization across multiple types of faults. To address these limitations, we present DNN-GP, a fault diagnosis tool (akin to a General Practitioner): an integrated interpreter designed to diagnose various types of model faults through the interpretation of latent concepts. DNN-GP incorporates probing samples derived from adversarial attacks, semantic attacks, and samples exhibiting drifting issues to provide a comprehensible interpretation of a model's erroneous decisions. Armed with an awareness of the faults, DNN-GP derives countermeasures from the concept space to bolster the model's resilience. DNN-GP is trained once on a dataset and can be transferred to provide versatile, unsupervised diagnoses for other models, and is sufficiently general to effectively mitigate unseen attacks. DNN-GP is evaluated on three real-world datasets covering both attack and drift scenarios and demonstrates state-of-the-art detection accuracy (near 100%) with low false positive rates (< 5%).
Original language | English |
---|---|
Title of host publication | Proceedings of the 33rd USENIX Security Symposium |
Place of Publication | Online |
Publisher | USENIX Association |
Pages | 1297-1314 |
Number of pages | 18 |
ISBN (Electronic) | 9781939133441 |
Publication status | Published - 2024 |
Event | 33rd USENIX Security Symposium, Philadelphia, United States; Duration: 14 Aug 2024 → 16 Aug 2024; Conference number: 33rd |
Conference
Conference | 33rd USENIX Security Symposium |
---|---|
Country/Territory | United States |
City | Philadelphia |
Period | 14/08/24 → 16/08/24 |