Understanding model weaknesses: a path to strengthening DNN-based Android malware detection

Haodong Li, Xiao Cheng*, Yanjie Zhao, Guosheng Xu, Guoai Xu, Haoyu Wang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Android malware detection remains a critical challenge in cybersecurity research. Recent advancements leverage AI techniques, particularly deep neural networks (DNNs), to train a detection model, but their effectiveness is often compromised by the pronounced imbalance among malware families in commonly used training datasets. This imbalance leads to overfitting in dominant categories and poor performance in underrepresented ones, increasing predictive uncertainty for less common malware families. To address the suboptimal performance of many DNN models, we introduce MalTutor, a novel framework that enhances model robustness through an optimized training process. Our primary insight lies in transforming uncertainties from “liabilities” into “assets” by strategically incorporating them into DNN training methodologies. Specifically, we begin by evaluating the predictive uncertainty of DNN models throughout various training epochs, which guides our sample categorization. Incorporating Curriculum Learning strategies, we commence training with easy-to-learn samples with lower uncertainty, progressively incorporating difficult-to-learn ones with higher uncertainty. Our experimental results demonstrate that MalTutor significantly improves the performance of models trained on imbalanced datasets, increasing accuracy by 31.0%, elevating the F1 score by 138.8%, and specifically boosting the average accuracy in detecting various types of malicious apps by 133.9%. Our findings provide valuable insights into the potential benefits of incorporating uncertainty to improve the robustness of DNN models for prediction-oriented software engineering tasks.
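The uncertainty-guided curriculum described in the abstract can be illustrated with a minimal sketch. This is not MalTutor's implementation: the abstract does not specify the uncertainty measure, the number of difficulty levels, or the training schedule, so the choices below are assumptions. Specifically, the sketch assumes binary malware probabilities recorded per epoch, uses mean predictive entropy as the uncertainty score, splits samples into three equal-sized difficulty levels, and grows the training set cumulatively from easy to hard. All function names and the example probabilities are hypothetical.

```python
import math
from typing import Dict, List


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_uncertainty(probs_per_epoch: List[List[float]]) -> List[float]:
    """Average each sample's predictive entropy over the recorded epochs.

    Assumption: entropy stands in for whatever uncertainty measure is used.
    """
    n_epochs = len(probs_per_epoch)
    n_samples = len(probs_per_epoch[0])
    return [
        sum(binary_entropy(epoch[i]) for epoch in probs_per_epoch) / n_epochs
        for i in range(n_samples)
    ]


def categorize(uncertainty: List[float], n_levels: int = 3) -> Dict[int, List[int]]:
    """Split sample indices into n_levels groups, lowest uncertainty first."""
    order = sorted(range(len(uncertainty)), key=lambda i: uncertainty[i])
    size = math.ceil(len(order) / n_levels)
    return {lvl: order[lvl * size:(lvl + 1) * size] for lvl in range(n_levels)}


def curriculum_stages(levels: Dict[int, List[int]]) -> List[List[int]]:
    """Stage k contains every sample whose difficulty level is <= k."""
    stages: List[List[int]] = []
    seen: List[int] = []
    for lvl in sorted(levels):
        seen = seen + levels[lvl]
        stages.append(list(seen))
    return stages


# Hypothetical malware probabilities for 6 samples over 3 epochs.
probs = [
    [0.99, 0.60, 0.05, 0.50, 0.90, 0.45],
    [0.98, 0.70, 0.02, 0.55, 0.95, 0.40],
    [0.99, 0.65, 0.01, 0.48, 0.97, 0.52],
]
stages = curriculum_stages(categorize(mean_uncertainty(probs)))
for k, idx in enumerate(stages, start=1):
    print(f"stage {k}: train on samples {sorted(idx)}")
```

In a real training loop, each stage's index list would select the subset of the dataset used for that phase of training, so confidently predicted samples are seen first and the samples the model wavers on (probabilities near 0.5) are added last.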
Original language: English
Article number: ISSTA015
Pages (from-to): 1-23
Number of pages: 23
Journal: Proceedings of the ACM on Software Engineering
Volume: 2
Issue number: ISSTA
Publication status: Published - Jul 2025
Externally published: Yes

Bibliographical note

Copyright the Author(s) 2025. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Android malware detection
  • Uncertainty
  • Curriculum Learning
