Using Static and Dynamic Malware Features to Perform Malware Ascription
Abstract
Malware ascription is a relatively unexplored area, and attributing malware and detecting its authorship remain difficult. In this paper, we use static and dynamic features of malicious executables to classify malware by family. Static features are gathered from VirusTotal reports and dynamic features from Cuckoo Sandbox reports. After analysis, we train and test Naive Bayes and Support Vector Machine classifiers on these features. In a follow-up experiment, we convert the malware samples into grayscale and coloured images and feed them into a Convolutional Neural Network (CNN) for classification. For each classifier, we tune the hyper-parameters using exhaustive search. The resulting reports can be useful for malware ascription.
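The image-based experiment relies on turning an executable's raw bytes into pixel intensities. The abstract does not say how this conversion was done, so the following is only a minimal sketch of one common approach: each byte becomes one grayscale pixel (0 to 255), laid out row by row. The function name `bytes_to_grayscale`, the fixed `width` parameter, and the zero-padding of the last row are all assumptions made for illustration.

```python
import math
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay out raw bytes row by row as a 2-D grid of 0-255 pixel intensities."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the tail so that the final row has the full width (an assumption;
    # truncating the partial row would be another reasonable choice).
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Hypothetical stand-in for the contents of an executable file.
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
for row in image:
    print(row)
```

In practice the resulting grid would be resized to a fixed shape before being passed to a CNN, since executables differ in length; that step is omitted here.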
Classification using VirusTotal Features (95,000 Samples)
Accuracy (%)   Precision (%)   Recall (%)   F-score (%)   Time (s)
84.99          83.98           84.99        83.72         3341
Classification using VirusTotal and Cuckoo Features (1,936 Samples)
Accuracy (%)   Precision (%)   Recall (%)   F-score (%)   Time (s)
67.98          69.79           67.98        66.66         1946
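In both tables the reported recall equals the reported accuracy (84.99 and 67.98). That is what happens when per-class scores are averaged with weights proportional to class support, because support-weighted recall reduces to overall accuracy. The source does not state which averaging scheme was used, so the sketch below simply assumes support weighting. The function name `weighted_metrics` and the family labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f_score), support-weighted over classes."""
    n = len(y_true)
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f_score = 0.0
    for label, count in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n  # weight each class by its share of the samples
        precision += weight * p
        recall += weight * r
        f_score += weight * f
    return accuracy, precision, recall, f_score


# Hypothetical family labels, for illustration only.
y_true = ["family_a", "family_a", "family_a", "family_b", "family_b", "family_c"]
y_pred = ["family_a", "family_a", "family_b", "family_b", "family_b", "family_a"]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f_score={f1:.4f}")
```

Note that the weighted F-score is an average of per-class F-scores, so it need not lie between the weighted precision and recall.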
Keywords: malware, attribution, Cuckoo Sandbox, Naive Bayes, Support Vector Machine, Convolutional Neural Network