Peter Anthony
PhD student at Slovak.AI, Comenius University in Bratislava, Slovakia
Abstract: Malware detection is a critical task in cybersecurity, and traditional signature-based approaches are often ineffective against new and evolving threats. Recent research has shown that machine learning models can improve the accuracy of malware classification. However, existing methods often suffer from poor generalization and a lack of explainability, which makes it difficult to understand how they arrive at their predictions and, in turn, for cybersecurity experts to assess the reliability of the model. In this work, we aim to develop a novel approach that combines a graph-based representation of malware with a neural network classifier. Entities and relationships in a knowledge graph are projected into a low-dimensional space: the approach learns a vector representation for each entity and relationship in the knowledge base while preserving their semantic meaning, so that malicious and benign software can be accurately discriminated. Our main objectives are, first, to enhance performance by injecting prior symbolic knowledge and, second, to develop a mechanism for deriving meaningful explanations of the model's predictions from the resulting embeddings, giving cybersecurity experts insights into malware behavior and decision-making processes. Overall, we aim to deliver a valuable tool for malware detection and analysis in real-world settings, with accurate predictions and meaningful explanations.
Keywords: Malware detection, Explainable AI, Knowledge Graph Embedding, Knowledge Base Embedding
Scientific area: Artificial Intelligence, Cybersecurity
Visiting period: 01.11.2023 to 31.01.2024
Visiting Lab: Siena Artificial Intelligence Lab, University of Siena
Bio: Peter Anthony is a PhD student at Comenius University Bratislava, Slovakia, specializing in artificial intelligence and cybersecurity. His research focuses on enhancing malware detection through Explainable AI techniques, integrating graph-based malware representation, symbolic knowledge, and neural networks to achieve a more robust system with human-interpretable explanations.