TY - GEN
T1 - Am I rare? An intelligent summarization approach for identifying hidden anomalies
AU - Ghodratnama, Samira
AU - Zakershahrak, Mehrdad
AU - Sobhanmanesh, Fariborz
PY - 2021
Y1 - 2021
N2 - Monitoring network traffic data to detect hidden patterns of anomalies is a challenging and time-consuming task that requires substantial computing resources. An appropriate summarization technique is therefore of great importance, since the summary can serve as a substitute for the original data. However, summarization risks removing anomalies, so it is vital to create a summary that reflects the same patterns as the original data. In this paper, we propose an INtelligent Summarization approach for IDENTifying hidden anomalies, called INSIDENT. The proposed approach guarantees that the original data distribution is kept in the summarized data. It is a clustering-based algorithm that dynamically maps the original feature space to a new feature space by locally weighting features in each cluster. In the new feature space, similar samples are closer together and, consequently, outliers are easier to detect. In addition, selecting representatives based on cluster size keeps the distribution of the summarized data the same as that of the original data. INSIDENT can be used both as a preprocessing step before running anomaly detection algorithms and as an anomaly detection algorithm in its own right. The experimental results on benchmark datasets show that a summary of the data can substitute for the original data in the anomaly detection task.
AB - Monitoring network traffic data to detect hidden patterns of anomalies is a challenging and time-consuming task that requires substantial computing resources. An appropriate summarization technique is therefore of great importance, since the summary can serve as a substitute for the original data. However, summarization risks removing anomalies, so it is vital to create a summary that reflects the same patterns as the original data. In this paper, we propose an INtelligent Summarization approach for IDENTifying hidden anomalies, called INSIDENT. The proposed approach guarantees that the original data distribution is kept in the summarized data. It is a clustering-based algorithm that dynamically maps the original feature space to a new feature space by locally weighting features in each cluster. In the new feature space, similar samples are closer together and, consequently, outliers are easier to detect. In addition, selecting representatives based on cluster size keeps the distribution of the summarized data the same as that of the original data. INSIDENT can be used both as a preprocessing step before running anomaly detection algorithms and as an anomaly detection algorithm in its own right. The experimental results on benchmark datasets show that a summary of the data can substitute for the original data in the anomaly detection task.
KW - Anomaly detection
KW - Summarization
KW - Network data
KW - Clustering
KW - Classification
UR - http://www.scopus.com/inward/record.url?scp=85111377766&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-76352-7_31
DO - 10.1007/978-3-030-76352-7_31
M3 - Conference proceeding contribution
AN - SCOPUS:85111377766
SN - 9783030763510
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 309
EP - 323
BT - Service-Oriented Computing – ICSOC 2020 Workshops
A2 - Hacid, Hakim
A2 - Outay, Fatma
A2 - Paik, Hye-young
A2 - Alloum, Amira
A2 - Petrocchi, Marinella
A2 - Bouadjenek, Mohamed Reda
A2 - Beheshti, Amin
A2 - Liu, Xumin
A2 - Maaradji, Abderrahmane
PB - Springer, Springer Nature
CY - Cham, Switzerland
T2 - AIOps, CFTIC, STRAPS, AI-PA, AI-IOTS, and Satellite Events held in conjunction with 18th International Conference on Service-Oriented Computing, ICSOC 2020
Y2 - 14 December 2020 through 17 December 2020
ER -