Section: New Results
Participants: Olivier Festor [Contact], Mohamed Nassar.
In the context of a cooperation with the University of Liege, we have addressed the problem of SPIT (Spam over Internet Telephony) from a new perspective. Based on end-user feedback, we have proposed a scheme for generating SPIT signatures from SIP INVITE messages, which makes it possible to filter subsequent SPIT calls before their destinations ring. The generated signatures adapt to the benign signaling traffic in the sense that they do not conflict with it. Signature generation relies on supervised machine learning techniques; in particular, we investigated decision trees with categorical attributes obtained by parsing the SIP messages.
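As an illustration of what "categorical attributes obtained by parsing the SIP messages" could look like, the following is a minimal sketch in Python. The attribute names (`user_agent`, `via_transport`, `from_display_name`, `header_order`) and the sample message are assumptions chosen for the example; the report does not list the attributes actually used.

```python
def parse_invite(raw: str) -> dict:
    """Turn a raw SIP INVITE into a flat dict of categorical attributes.

    The attribute set is hypothetical; it only illustrates the idea of
    feeding parsed header properties to a decision-tree learner.
    """
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, _request_uri, _version = lines[0].split(" ", 2)

    headers = {}   # first value seen for each (lower-cased) header name
    order = []     # header names in order of appearance
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip malformed lines without a colon
        name = name.strip().lower()
        order.append(name)
        headers.setdefault(name, value.strip())

    via = headers.get("via")
    from_value = headers.get("from", "")
    return {
        "method": method,
        "user_agent": headers.get("user-agent", "<missing>"),
        # "SIP/2.0/UDP host:port;..." -> "UDP"
        "via_transport": via.split()[0].rsplit("/", 1)[-1] if via else "<missing>",
        "from_display_name": "yes" if from_value.startswith('"') else "no",
        "header_order": ">".join(order),
    }


# Hypothetical INVITE used only to exercise the parser.
SAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.org SIP/2.0",
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    'From: "Alice" <sip:alice@example.com>;tag=1928301774',
    "To: <sip:bob@example.org>",
    "Call-ID: a84b4c76e66710",
    "CSeq: 314159 INVITE",
    "User-Agent: ExampleUA/1.0",
    "Content-Length: 0",
    "",
    "",
])

attrs = parse_invite(SAMPLE_INVITE)
```

Each resulting dictionary, together with a (spit, normal) label derived from end-user feedback, would form one training example for the decision tree.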
Our system works in two modes: a batch mode and an online mode. The batch mode consists of training the decision tree over a labeled (spit, normal) data-set and then transforming the tree into an if-else rule-set. In online mode, the successively learnt signatures are aggregated and possible conflicts are resolved. Experimentation on off-the-shelf SPIT tools showed that our approach efficiently finds good signatures. However, experiments also show that the J48 decision tree is easily defeated by some obfuscation techniques. We therefore proposed a generalisation approach for translating the tree into an if-else rule-set, which instead shows good robustness against such attacks. The overall framework provides suitable performance for operational deployment in terms of learning time, required memory, size of the rule-set and call setup delay. The different parameters of the system (i.e. the sizes of the different buffers and windows) are easily configurable.
Different SPIT signatures may imply different SPIT capabilities. For example, one spitter may break a CAPTCHA test by brute-forcing a DTMF guess, while another may start the call with a human-like congratulation in order to bypass a Turing test. One of the goals of our approach is to provide a framework for applying reinforcement learning techniques and hence increasing the efficiency of the filtering process. Reinforcement learning aims at selecting the best challenge to use when a given SPIT signature is detected. Basically, it maintains a table matching each signature with the best challenge-response test discovered so far. The table is continuously updated using a trial-and-error scheme.
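The table matching each signature with its best challenge can be sketched as follows. This is only an illustrative sketch, assuming an epsilon-greedy trial-and-error policy and a reward of 1 when the challenge stops the call and 0 otherwise; the report does not specify the reinforcement learning algorithm, and the class name, challenge names and signature identifiers are hypothetical.

```python
import random
from collections import defaultdict


class ChallengeSelector:
    """Per-signature table of estimated challenge effectiveness.

    Assumption: epsilon-greedy exploration with running-mean rewards;
    this is one common trial-and-error scheme, not necessarily the
    one used in the reported work.
    """

    def __init__(self, challenges, epsilon=0.1, seed=None):
        self.challenges = list(challenges)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # value[signature][challenge] = mean reward observed so far
        self.value = defaultdict(lambda: {c: 0.0 for c in self.challenges})
        # count[signature][challenge] = number of trials so far
        self.count = defaultdict(lambda: {c: 0 for c in self.challenges})

    def best(self, signature):
        """Challenge with the highest estimated reward for this signature."""
        values = self.value[signature]
        return max(self.challenges, key=lambda c: values[c])

    def choose(self, signature):
        """Explore with probability epsilon, otherwise exploit the table."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.challenges)
        return self.best(signature)

    def update(self, signature, challenge, reward):
        """Fold one observed outcome into the running mean."""
        self.count[signature][challenge] += 1
        n = self.count[signature][challenge]
        v = self.value[signature][challenge]
        self.value[signature][challenge] = v + (reward - v) / n
```

In use, the filter would call `choose(signature)` when a call matches a known signature, apply the returned challenge, and then call `update(...)` with the observed outcome, so that the table gradually converges on the most effective challenge for each signature.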
We validated the approach on multiple data-sets obtained from Voice over IP operators that are members of the SCAMSTOP project.