TY - GEN
T1 - Facing the challenge of leveraging untrained humans in malware analysis
AU - Zhao, Benjamin Zi Hao
AU - Asghar, Hassan Jameel
AU - Ikram, Muhammad
AU - Kaafar, Mohamed Ali
AU - Lamont, Sean
AU - Coscia, Daniel
PY - 2025
Y1 - 2025
AB - Software binary analysis, tools, and machine learning aid security analysts in interpreting data by automated means that filter, prioritize, and arrange pertinent information for skilled analysts. In this work, we revisit cooperative human-machine teams and evaluate the possibility of enabling untrained humans to assist machines and skilled analysts in their analysis of software binaries. Specifically, we propose a pipeline to transform a complex input domain into facial images on which untrained individuals make similarity decisions. Our faces include realistic human, animal, artistic, and anime faces that preserve inherent distances between data points of the input domain. Our approach is evaluated through a human study, where untrained respondents with minimal training successfully flag machine misclassifications. The untrained human does not replace the machine or skilled analyst; instead, the untrained human is utilized in a triage setting to identify samples without historical precedent, deferring the decision to the skilled analyst for deeper inspection.
KW - Malware
KW - Binaries
KW - Data Visualization
KW - Human
KW - Faces
KW - Rapid
KW - Triage
UR - https://www.scopus.com/pages/publications/105005932994
U2 - 10.1007/978-3-031-92882-6_5
DO - 10.1007/978-3-031-92882-6_5
M3 - Conference proceeding contribution
SN - 9783031928819
SN - 9783031928840
T3 - IFIP Advances in Information and Communication Technology
SP - 61
EP - 75
BT - ICT Systems Security and Privacy Protection
A2 - Nemec Zlatolas, Lili
A2 - Rannenberg, Kai
A2 - Welzer, Tatjana
A2 - Garcia-Alfaro, Joaquin
PB - Springer, Springer Nature
CY - Cham, Switzerland
T2 - 40th IFIP International Conference on ICT Systems Security and Privacy Protection, SEC 2025
Y2 - 20 May 2025 through 23 May 2025
ER -