Malware classification methodologies

Bibliographic Details
Main Author: Τσουβάλας, Βασίλειος
Other Authors: Tsouvalas, Vasileios
Language: English
Published: 2020
Subjects:
Online Access: http://hdl.handle.net/10889/14031
Description
Summary: Malware detection is the classification of a piece of software as malicious or benign. Many approaches, employing diverse techniques, have been proposed to tackle this problem. In this thesis, we present a graph-based solution to the malware detection problem that extracts resources from executable samples and applies machine learning algorithms to those resources in order to decide whether an executable is malicious or benign. Given an unknown Windows executable sample, we first extract the calls that the sample makes to the Windows Application Programming Interface (API) and arrange them in the form of an API Call Graph, from which an Abstract API Call Graph is constructed. Subsequently, using a Random Walk Graph Kernel, we quantify the similarity between the graph of the unknown sample and the corresponding graphs from a labeled dataset of known samples (benign and malicious Windows executables), and carry out the binary classification using Support Vector Machines. Following this process, we achieve accuracy of up to 98.25% using a substantially smaller dataset than those used in similar efforts, while requiring considerably less time and computational power.
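The similarity step described in the summary can be illustrated with a minimal sketch of a random walk graph kernel. This is not the thesis implementation: the graph encoding (a dictionary of node labels plus a set of directed edges), the truncated geometric weighting, the parameters `lam` and `max_len`, and all function names are assumptions made for illustration only. The kernel counts walks in the direct product of two labeled graphs, so two API call graphs score higher when they share longer identical call sequences.

```python
from math import sqrt


def product_graph(g1, g2):
    """Direct product of two labeled directed graphs.

    Each graph is a pair (labels, edges): labels maps node id -> label,
    edges is a set of (source, target) node ids. (Hypothetical encoding.)
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    # A product node pairs up two nodes that carry the same label.
    nodes = {(u, v) for u in labels1 for v in labels2
             if labels1[u] == labels2[v]}
    succ = {n: [] for n in nodes}
    # A product edge exists when both graphs have the corresponding edge.
    for a, b in edges1:
        for c, d in edges2:
            if (a, c) in nodes and (b, d) in nodes:
                succ[(a, c)].append((b, d))
    return succ


def random_walk_kernel(g1, g2, lam=0.5, max_len=4):
    """Sum over k = 0..max_len of lam**k * (number of common walks of length k)."""
    succ = product_graph(g1, g2)
    walks = {n: 1.0 for n in succ}      # walks of length 0 ending at each node
    total = float(sum(walks.values()))
    weight = 1.0
    for _ in range(max_len):
        nxt = {n: 0.0 for n in succ}
        for node, count in walks.items():
            for target in succ[node]:
                nxt[target] += count    # extend every walk by one edge
        walks = nxt
        weight *= lam
        total += weight * sum(walks.values())
    return total


def normalized_kernel(g1, g2, **kw):
    """Cosine-style normalization so that a graph has similarity 1 with itself."""
    denom = sqrt(random_walk_kernel(g1, g1, **kw) * random_walk_kernel(g2, g2, **kw))
    return random_walk_kernel(g1, g2, **kw) / denom if denom else 0.0


def gram_matrix(graphs, **kw):
    """Pairwise similarities, in the shape a precomputed-kernel SVM expects."""
    return [[normalized_kernel(a, b, **kw) for b in graphs] for a in graphs]


# Toy call graphs: nodes are labeled with Windows API function names.
file_writer = ({0: "CreateFileW", 1: "WriteFile", 2: "CloseHandle"},
               {(0, 1), (1, 2)})
file_opener = ({0: "CreateFileW", 1: "WriteFile"}, {(0, 1)})
reg_editor = ({0: "RegOpenKeyExW", 1: "RegSetValueExW"}, {(0, 1)})

if __name__ == "__main__":
    for row in gram_matrix([file_writer, file_opener, reg_editor]):
        print(["%.3f" % value for value in row])
```

A Gram matrix built this way over the labeled dataset could then be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The truncation at `max_len` is a simplification chosen to keep the sketch dependency-free; whether the thesis uses a truncated or closed-form variant is not stated in the summary.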