Malware classification methodologies

Malware detection is the task of classifying a piece of software as malicious or benign. Many approaches, employing diverse techniques, have been tried to tackle this problem. In this thesis, we present a graph-based solution to the malware detection problem, which implements resources extraction...

Bibliographic record details
Main author:	Τσουβάλας, Βασίλειος
Other authors:	Tsouvalas, Vasileios
Language:	English
Published:	2020
Subjects:	Malware detection; Malware; Machine learning; SVM; Graphs; Graph kernel
Available online:	http://hdl.handle.net/10889/14031

id	nemertes-10889-14031
record_format	dspace
spelling	nemertes-10889-14031 2022-09-05T06:57:10Z Malware classification methodologies Μεθοδολογίες ταξινόμησης κακόβουλου λογισμικού Τσουβάλας, Βασίλειος Tsouvalas, Vasileios Malware detection Machine learning SVM Graphs Graph kernel [Abstract, translated from Greek] Malware detection refers to the process by which, using various software analysis methods and techniques, we can categorize a program as malicious or benign. In this thesis, we present a solution in which we extract information from an executable sample, model that information as graphs and, by applying machine learning techniques, decide on the nature of the executable (benign or malicious). Starting from a dataset of benign and malicious Windows executable samples, we extract, through static analysis, the calls that each executable makes to the Windows API, and build an API Call Graph and subsequently an Abstract API Call Graph, in order to carry out the binary classification using Support Vector Machines (SVMs). Following this process, accuracy levels of up to 98.25% are achieved using a smaller dataset than those used in similar efforts, while also reducing the requirements in time and computational power. 2020-10-21T10:50:23Z 2020-10-21T10:50:23Z 2020-10-14 http://hdl.handle.net/10889/14031 en application/pdf
institution	UPatras
collection	Nemertes
description	Malware detection is the task of classifying a piece of software as malicious or benign. Many approaches, employing diverse techniques, have been tried to tackle this problem. In this thesis, we present a graph-based solution that extracts resources from executable samples and applies machine learning algorithms to those resources to decide on the nature of the executable (malicious or benign). Given an unknown Windows executable sample, we first extract the calls that the sample makes to the Windows Application Programming Interface (API) and arrange them in an API Call Graph, from which an Abstract API Call Graph is constructed. Subsequently, using a Random Walk Graph Kernel, we quantify the similarity between the graph of the unknown sample and the corresponding graphs from a labeled dataset of known samples (benign and malicious Windows executables), and carry out the binary classification using Support Vector Machines. Following this process, we achieve accuracy of up to 98.25% using a substantially smaller dataset than those used in similar efforts, while being considerably more efficient in time and computational power.
_version_	1771297181146284032
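
The abstract describes comparing API call graphs with a Random Walk Graph Kernel before classifying them with Support Vector Machines. The following is a minimal sketch of one common form of that kernel (a truncated, geometrically weighted walk count on the direct product graph), not the thesis's actual implementation. The graph encoding, the function names `product_graph`, `random_walk_kernel` and `gram_matrix`, the parameters `decay` and `max_len`, and the three toy graphs are all assumptions made for illustration; the record does not say how the Abstract API Call Graph is built.

```python
from itertools import product

# Assumed encoding: a graph is (labels, edges), where labels maps
# node id -> API name and edges is a set of directed (src, dst) pairs.


def product_graph(g1, g2):
    """Adjacency lists of the direct product of two labeled digraphs.

    A product node pairs one node from each graph with equal labels;
    a product edge exists when both graphs have the matching edge.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    nodes = [(u, v) for u, v in product(labels1, labels2)
             if labels1[u] == labels2[v]]
    index = {n: i for i, n in enumerate(nodes)}
    adj = [[] for _ in nodes]
    for (u1, v1), (u2, v2) in product(edges1, edges2):
        src, dst = (u1, u2), (v1, v2)
        if src in index and dst in index:
            adj[index[src]].append(index[dst])
    return adj


def random_walk_kernel(g1, g2, decay=0.5, max_len=4):
    """Sum over l = 0..max_len of decay**l times the number of
    common label-matching walks of length l in g1 and g2."""
    adj = product_graph(g1, g2)
    walks = [1.0] * len(adj)  # walks of length 0 starting at each node
    total, weight = 0.0, 1.0
    for _ in range(max_len + 1):
        total += weight * sum(walks)
        # Extend every walk by one step along the product graph edges.
        walks = [sum(walks[j] for j in adj[i]) for i in range(len(adj))]
        weight *= decay
    return total


def gram_matrix(graphs, **kwargs):
    """Pairwise kernel values for a list of graphs."""
    return [[random_walk_kernel(a, b, **kwargs) for b in graphs]
            for a in graphs]


# Toy call graphs (illustrative only; not taken from the thesis dataset).
file_writer = ({0: "CreateFileW", 1: "WriteFile", 2: "CloseHandle"},
               {(0, 1), (1, 2)})
file_writer_2 = ({7: "CreateFileW", 8: "WriteFile", 9: "CloseHandle"},
                 {(7, 8), (8, 9)})
registry_writer = ({0: "RegOpenKeyExW", 1: "RegSetValueExW"}, {(0, 1)})

gram = gram_matrix([file_writer, file_writer_2, registry_writer])
```

A Gram matrix like `gram` could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, together with the benign/malicious labels of the known samples. The decay factor and the walk-length cutoff here are arbitrary choices for the sketch, not values reported in the record.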
