Section: New Software and Platforms


Semantic-driven Analysis of BinaRies

Keywords: Malware - Semantic - Binary analysis - Unsupervised graph clustering - SCDG - Machine learning

Functional Description: Toolchain for binary analysis that combines several techniques for capturing the semantics of binaries with machine learning-assisted analysis. Its primary use is malware analysis, for both detection and classification, based on either supervised or unsupervised learning.

This toolchain includes modules of the former BMA toolchain, specifically the SCDG extraction.

Our approach is based on artificial intelligence. We use concolic analysis to extract behavioral signatures from binaries in the form of system call dependency graphs (SCDGs). The software supports both supervised and unsupervised learning. In supervised mode, it learns the distinctive features of different malware families from a large training set in order to classify new binaries as malware or cleanware according to their behavioral signatures. In unsupervised mode, binaries are clustered according to the similarity of their graphs. The toolchain is orchestrated by an experiment manager that makes it easy to set up, launch, and view the results of all modules.
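
The idea of clustering binaries by the similarity of their SCDGs can be illustrated with a small sketch. It is not the toolchain's implementation: the traces are hand-written rather than extracted by concolic analysis, and the function names (build_scdg, jaccard, cluster), the dependency rule (a later call reuses an earlier call's return value or shares an argument), the Jaccard edge similarity, and the threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def build_scdg(trace):
    """Build a toy system call dependency graph as a set of labeled edges.

    `trace` is a list of (call_name, args, return_value) tuples (hypothetical
    format). An edge (A, B) is added when a later call B takes A's return
    value as an argument, or when A and B share an argument.
    """
    edges = set()
    for i, (name_a, args_a, ret_a) in enumerate(trace):
        for name_b, args_b, _ in trace[i + 1:]:
            uses_return = ret_a is not None and ret_a in args_b
            shares_arg = bool(set(args_a) & set(args_b))
            if uses_return or shares_arg:
                edges.add((name_a, name_b))
    return frozenset(edges)


def jaccard(g1, g2):
    """Jaccard similarity of two edge sets (assumed similarity measure)."""
    if not g1 and not g2:
        return 1.0
    return len(g1 & g2) / len(g1 | g2)


def cluster(graphs, threshold=0.5):
    """Group graphs whose pairwise similarity reaches `threshold`.

    This is simple single-linkage grouping with union-find, used here only
    as a stand-in for a real graph clustering method.
    """
    parent = {name: name for name in graphs}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(graphs, 2):
        if jaccard(graphs[a], graphs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in graphs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hand-written example traces (illustrative only, not real malware behavior).
traces = {
    "sample_a": [("open", ("/tmp/x",), "fd3"),
                 ("write", ("fd3", "data"), None),
                 ("close", ("fd3",), None)],
    "sample_b": [("open", ("/tmp/y",), "fd5"),
                 ("write", ("fd5", "payload"), None),
                 ("close", ("fd5",), None)],
    "sample_c": [("socket", (), "s1"),
                 ("connect", ("s1", "10.0.0.1"), None),
                 ("send", ("s1", "hello"), None)],
}
graphs = {name: build_scdg(trace) for name, trace in traces.items()}
clusters = cluster(graphs, threshold=0.5)
print(clusters)
```

In this sketch the two file-writing samples yield identical edge sets and fall into one cluster, while the network sample shares no edges with them and forms its own. A supervised variant would instead attach family labels to the training graphs and learn which subgraphs are distinctive for each family.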

  • Contact: Olivier Zendra