The aim of the CARTE research team is to take adversity into account in computations: adversity arises from actors whose behaviors are unknown or unclear. We call this notion adversary computation.

The project combines two approaches. The first one is the analysis of the behavior of systems, using tools coming from Continuous Computation Theory. The second approach is to build defenses with tools coming from logic, rewriting and, more generally, from Programming Theory.

The activities of the CARTE team are organized around two research actions:

Computation over Continuous Structures

Computer Virology.

Our team made remarkable progress on the difference between “real world” systems and artefacts due to exact (infinite) precision computations. Olivier Bournez, Daniel Graça and Emmanuel Hainry succeeded in proving an equivalence between robustness and computability: robust dynamical systems have computable dynamical properties, strong evidence that “real world” systems will not exhibit undecidability properties.

Another highlight of the year is a paper by Hugo Férée, Mathieu Hoyrup and Walid Gomaa, accepted at LICS 2013, which provides a systematic approach to defining and analysing the complexity of algorithms acting on infinite-precision numbers (infinite words).

From a historical point of view, the first official virus appeared in 1983 on a VAX/PDP-11. At the very same time, a series of papers was published that still remains a reference in computer virology: Thompson, Cohen and Adleman. The literature explaining and discussing practical issues is quite extensive. However, there are only a few theoretical or scientific studies that attempt to give a model of computer viruses.

A virus is essentially a self-replicating program inside an adversary environment. Self-replication has a solid theoretical background, based on work on fixed points in computability theory (Kleene's recursion theorem). In this setting:

a virus infects programs by modifying them,

a virus copies itself and can mutate,

it spreads throughout a system.

The above scientific foundation justifies our position of using the word virus as a generic word for self-replicating malware. There is, however, a difference: a malware carries a payload, while a virus may not. For example, a worm is an autonomous self-replicating malware and so falls under our definition. In fact, the current malware taxonomy (viruses, worms, trojans, ...) is unclear and subject to debate.
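To make the fixed-point mechanism concrete, here is a classical two-line Python quine — a program whose output is exactly its own source code (a textbook illustration, not one of the team's tools):

```python
src = 'src = %r\nprint(src %% src)'
print(src % src)
```

Running it prints the two lines above verbatim. By Kleene's recursion theorem, such self-referential programs exist in any Turing-complete language, which is part of what makes self-replication so hard to rule out.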

Classical recursion theory deals with computability over discrete structures (natural numbers, finite symbolic words). There is a growing community of researchers working on the extension of this theory to continuous structures arising in mathematics. One goal is to give foundations to numerical analysis, by studying the limitations of machines in terms of computability or complexity when computing with real numbers. A classical question is, for instance: if a function is computable, is its derivative also computable?

While the notion of a computable function over discrete data is captured by the model of Turing machines, the situation is more delicate when the data are continuous, and several non-equivalent models exist. Let us mention computable analysis, which relates computability to topology; the Blum-Shub-Smale model (BSS), where the real numbers are treated as elementary entities; and the General Purpose Analog Computer (GPAC), a continuous-time model introduced by Shannon.

The rewriting paradigm is now widely used for specifying, modeling, programming and proving. It makes it easy to express deduction systems in a declarative way, and to express complex relations on infinite sets of states in a finite way, provided they are countable. Programming languages and environments with a rewriting-based semantics have been developed; see ASF+SDF, Maude, and Tom.

For basic rewriting, many techniques have been developed to prove properties of rewrite systems such as confluence, completeness, consistency, or various notions of termination. Proof methods have also been proposed for extensions of rewriting: equational extensions, where rewriting is performed modulo a set of axioms; conditional extensions, where rules are applied only under certain conditions; typed extensions, where rules are applied only if the types of the rule and of the term to be rewritten correspond; and constrained extensions, where rules are enriched with formulas to be satisfied.
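As a toy illustration of the paradigm (our own sketch, unrelated to the tools above), a naive Python normalizer applies string-rewriting rules until a normal form is reached; the single rule "ba" → "ab" is terminating and confluent, so it sorts any word over {a, b} to a unique normal form:

```python
def normalize(term: str, rules, max_steps: int = 10_000) -> str:
    """Apply the first applicable rule (leftmost occurrence) until none applies."""
    for _ in range(max_steps):
        for lhs, rhs in rules:
            if lhs in term:
                term = term.replace(lhs, rhs, 1)
                break
        else:
            return term  # no rule applies: a normal form is reached
    raise RuntimeError("no normal form found; the system may not terminate")

# "ba" -> "ab": each step removes one inversion, so the system terminates,
# and the unique normal form of any {a,b}-word is its sorted version.
rules = [("ba", "ab")]
```

Proving that such termination and confluence arguments hold in general, for far richer rule formats, is precisely what the techniques above address.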

An interesting aspect of the rewriting paradigm is that it allows automatable or semi-automatable correctness proofs for systems or programs: properties of rewrite systems such as those cited above carry over to the deduction systems or programs they formalize, and the proof techniques may apply to them directly.

Another interesting aspect is that it allows characteristics or properties of the modelled systems to be expressed as equational theorems, often automatically provable using the rewriting mechanism itself or induction techniques based on completion. Note that the rewriting and completion mechanisms also enable transformation and simplification of formal systems or programs.

Applications of rewriting-based proofs to computer security are numerous. Approaches using rule-based specifications have recently been proposed for the detection of computer viruses. For several years, our team has also been working in this direction. We have already proposed an approach using rewriting techniques to abstract program behaviors for detecting suspicious or malicious programs.

It is legitimate to wonder why there are so few fundamental studies of computer viruses, while they constitute one of the major threats to software engineering. This lack of theoretical studies may explain the weakness of our anticipation of computer diseases and the difficulty of improving defenses. For these reasons, we think it is worth exploring fundamental aspects, and in particular self-reproducing behaviors.

The crucial question is how to detect viruses and self-replicating malware. Cohen demonstrated that this question is undecidable. Anti-virus heuristics are based on two methods. The first consists in searching for virus signatures. A signature is a regular expression that identifies a family of viruses. This method has obvious defects: an unknown virus, such as one related to a 0-day exploit, will not be detected. We strongly suggest having a look at independent audits in order to understand the limits of this method. The second method consists in analysing the behavior of a program by monitoring its execution. Methods of this kind are not yet widely implemented, and the large number of false positives makes them barely usable. To end this short survey: intrusion detection encompasses virus detection. However, unlike computer virology, which has a solid scientific foundation as we have seen, the IDS notion of “malware” with respect to some security policy is not well defined.
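A minimal sketch of the first method, with made-up signatures (the family names and byte patterns below are purely hypothetical, for illustration only):

```python
import re

# Hypothetical signatures: each maps a family name to a regular
# expression over the raw bytes of a scanned file.
SIGNATURES = {
    "EvilFamily.A": re.compile(rb"\x90{8,}\xeb\x1f"),  # long NOP sled, short jmp
    "EvilFamily.B": re.compile(rb"INFECT-v\d+"),       # a telltale marker string
}

def scan(data: bytes) -> list:
    """Return the names of every signature family matching the given bytes."""
    return [name for name, sig in SIGNATURES.items() if sig.search(data)]
```

A scan of unknown or mutated code simply returns an empty list — the blind spot discussed above.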

The aim is to define security policies in order to prevent malware propagation. For this, we need (i) to define what a computer virus is in different programming languages and settings, and (ii) to take into consideration resources such as time and space. We think that formal methods like rewriting, type theory, logic, or formal languages should help to define the notion of a formal immune system, which provides a certified protection.

This study of computer virology has led us to propose and construct a “high security lab” in which experiments can be done in compliance with French law. This “high security lab” is one of the main projects of the CPER 2007-2013.

Understanding computation theories for continuous systems leads to studying the hardness of verification and control of these systems. Such results have been used to discuss problems in fields as diverse as verification, control theory, and neural networks. We are interested in the formal decidability of properties of dynamical systems, such as reachability, the Skolem-Pisot problem, and the computability of dynamical invariants.

The other research direction on dynamical systems we are interested in is the study of properties of adversary systems or programs, i.e. systems whose behavior is unknown or indistinct, or which do not have the classical expected properties. We would like to offer proof and verification tools to guarantee the correctness of such systems. On the one hand, we are interested in continuous and hybrid systems. In a mathematical sense, a hybrid system can be seen as a dynamical system whose transition function does not satisfy the classical regularity hypotheses, such as continuity, or continuity of its derivative. The properties to be verified are often expressed as reachability properties. For example, a safety property is often equivalent to the (non-)reachability of a subset of unsafe states from an initial configuration, or to stability (with its numerous variants such as asymptotic stability, local stability, mortality, etc.). We will thus essentially focus on the verification of these properties in various classes of dynamical systems.

We are also interested in rewriting techniques used to describe dynamic systems, in particular in the adversary context. As they were initially developed in the context of automated deduction, rewriting proof techniques, although now numerous, are not yet adapted to the complex framework of modeling and programming. An important stake in the domain is then to enrich them to provide realistic validation tools, both by providing finer rewriting formalisms with their associated proof techniques, and by developing new validation concepts for the adversary case, i.e. when the usual properties of the systems, for example termination, are not verified. For several years, we have been developing specific procedures for proving properties of rewriting, for the sake of programming, in particular with an inductive technique already applied with success to termination under strategies, weak termination, sufficient completeness, and probabilistic termination. The last three results take place in the context of adversary computations, since they allow one to prove that even a divergent program, in the sense that it does not terminate, can give the expected results. A common mechanism has been extracted from the above works, providing a generic inductive proof framework for properties of reduction relations, which can be parametrized by the property to be proved. Provided program code can be translated into rule-based specifications, this approach can be applied to correctness proofs of software in a larger context.

A crucial element of the safety and security of software systems is the problem of resources. We are working in the field of Implicit Computational Complexity (ICC), an approach to the analysis of the resources used by a program, whose tools come essentially from proof theory. Interpretation-based methods, like quasi-interpretations (QI) or sup-interpretations, are the approach we have been developing these last years. The aim is to compile a program while certifying its complexity.
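As a standard textbook example of the method (not a team-specific result), consider addition on unary numerals; an additive quasi-interpretation certifies that the size of every computed value is linearly bounded by the size of the inputs:

```latex
\begin{align*}
&\mathit{add}(0, y) \to y \qquad\qquad \mathit{add}(s(x), y) \to s(\mathit{add}(x, y))\\[2pt]
&\llbracket 0 \rrbracket = 0,\qquad
 \llbracket s \rrbracket(X) = X + 1,\qquad
 \llbracket \mathit{add} \rrbracket(X, Y) = X + Y
\end{align*}
```

One checks that no rule increases the interpretation — e.g. ⟦add(s(x), y)⟧ = X + Y + 1 = ⟦s(add(x, y))⟧ — which is exactly the kind of certificate the compilation produces.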

MMDEX is a virus detector based on morphological analysis. It is composed of our own disassembler tool, a graph transformer, and a specific tree-automaton implementation. The tool is used in the EU Fiware project and by some other partners (e.g. the DAVFI project).

Written in C, 20k lines.

APP License, IDDN.FR.001.300033.000.R.P.2009.000.10000, 2009.

TraceSurfer is a self-modifying code analyzer coming with an IDA add-on. It works as a wave-builder. In the analysis of self-modifying programs, one basic task is indeed to separate parts of the code which are self-modifying into successive layers, called waves. TraceSurfer extracts waves from traces of program executions. Doing so drastically simplifies program verification.

Written in C, 5k lines.

CROCUS is a program interpretation synthesizer. Given a first-order program (possibly written in OCaml), it outputs a quasi-interpretation based on max, addition and product. It relies on a randomized algorithm. The interpretation is actually a certificate for the program's complexity. Its users include non-academics (some artists).

Written in Java, 5k lines.

We investigated the isomorphism (conjugacy) problem for dynamical systems. While decidability in the one-dimensional case is a long-standing open problem, we characterized its exact complexity in higher dimensions. Our results suggest that the isomorphism problem is easier than the factoring and embedding problems (deciding whether one dynamical system is a subsystem of another). A traditional approach to proving that two dynamical systems are not isomorphic is to show that they have different dynamical invariants. We characterized several well-known dynamical invariants (periodic points, Turing degrees) in terms of complexity and computability classes.

While Turing machines are usually used for computing, they also form an interesting class of dynamical systems, which look very much like two-dimensional piecewise-affine maps. We investigated dynamical invariants (entropy and Lyapunov exponents) for Turing machines, and proved, quite surprisingly, that they are computable. Essentially, this means that a Turing machine doing interesting computations must do them so slowly that they cannot be seen in its dynamics. This work will be presented at STACS 2014.

Computability and topology are closely related, as computability assumptions impose topological restrictions: on a topological space, computable functions are continuous, and continuous functions are computable relative to some oracle. In the same way, complexity assumptions such as bounds on the computation time impose analytical restrictions, but in a way that is not yet understood. For functions from the real numbers to the real numbers, it is known that polynomial-time computable functions correspond to functions with a polynomial modulus of continuity. However, for functions on other spaces, no such correspondence is known. We investigate the particular case of norms on the space of continuous real functions defined on the unit interval. We introduce analytical characteristics of a norm, namely its dependency on points and the concept of *relevant points*, and use them to characterize the polynomial-time computable norms. This work was presented at LICS 2013. A full version including further results on non-deterministic complexity classes is currently submitted.
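For readers outside the field, the correspondence invoked above is the classical Ko-Friedman characterization. A function m : ℕ → ℕ is a modulus of continuity for f : [0,1] → ℝ when

```latex
\forall x, y \in [0,1],\ \forall n \in \mathbb{N}:\quad
|x - y| \le 2^{-m(n)} \;\Longrightarrow\; |f(x) - f(y)| \le 2^{-n},
```

and f is polynomial-time computable if and only if it admits a polynomial modulus of continuity and its values at dyadic rational points can be approximated to precision 2^{-n} in time polynomial in n.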

While computability theory is well-developed and understood on large classes of topological spaces, complexity theory in analysis is still in its infancy. We argue that the usual way of representing mathematical objects by functions from finite strings to finite strings (order 1 functions) is not appropriate for general spaces. We show that as soon as the space becomes large in a topological sense, it cannot be represented by order 1 functions in a way that respects complexity notions, so we propose to represent objects using higher order functions over finite strings. However higher order complexity theory is not well-understood. The only known class to date is BFF, the class of Basic Feasible Functionals, which does not enjoy nice properties: some intuitively feasible functionals do not belong to the class. We develop a new way of carrying out complexity theory at higher order types, using an adaptation of game semantics. A preliminary version of this work was presented at CCA 2013.

As mentioned before, computable functions must be continuous. This gives a simple way of proving that some operator is not computable: show that it is discontinuous. We recall the distinction between a function that is *computable* — a *single* oracle Turing machine M produces, on each input, an approximation of the output — and a function that merely *preserves computability*, i.e. maps each computable element to a computable one.

In the setting of non-interference and implicit computational complexity, Emmanuel Hainry, Jean-Yves Marion, and Romain Péchoux presented a characterization of FPSPACE in a language with a fork/wait mechanism. The language used in this work is a classical imperative language with while loops, complemented with a mechanism to launch new processes through forks. The fork instruction is heavily inspired by C's fork/wait construction for Unix operating systems, which anchors this work in a down-to-earth setting. Using a type system that enforces a data ramification on variables, they show that all typable, terminating programs compute an FPSPACE function; that, with a natural evaluation strategy, they indeed use only polynomial space; and conversely that the type system is complete, as all FPSPACE functions can be implemented in this language in a typable way.

Emmanuel Hainry and Romain Péchoux also used data ramification combined with non-interference principles to effectively bound the memory used by object-oriented languages. This work introduces a type system for an object-oriented language (derived from Java). The type system makes it possible to compute polynomial bounds on the heap and stack used by a typable program, ensuring that if the program halts, it only uses memory under this explicit bound. As the typing procedure runs in time polynomial in the size of the program, these bounds are easy to obtain, though not tight. Interesting features of this work include inheritance (with overloading and overriding) and the ability to analyze programs whose control-flow statements are controlled by objects, contrary to most other works in implicit computational complexity.

Romain Péchoux has also shown that the notion of (polynomial) interpretation over term rewrite systems can be adapted to a process language, a variant of the pi-calculus with recursive process definitions. This work shows that the order induced by simulation can be used, with respect to a given process semantics, to infer time and space upper bounds on process resource usage (reduction length, size of sent values, ...).

The study of behavioral malware detection has been continued. Guillaume Bonfante, Isabelle Gnaedig and Jean-Yves Marion have been developing an approach that detects suspicious schemes on an abstract representation of the behavior of a program, obtained by abstracting program traces: given subtraces are rewritten into abstract symbols representing their functionality. Considering abstract behaviors allows us to be implementation-independent and robust to variants and mutations of malware. Suspicious behaviors are then detected by comparing trace abstractions to reference malicious behaviors.

Model checking is a strong point of our approach: the predefined behavior patterns used to abstract program traces are defined by first-order temporal logic formulas, as are the reference suspicious behaviors, given in a signature. The infection problem can then be seen as the satisfaction problem of the formula of the signature by an abstracted trace of the program, which can be checked using existing model checking techniques.
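The pipeline can be sketched as follows (our own illustration; the system-call names, patterns, and the plain subsequence check stand in for the team's trace automata and temporal-logic machinery):

```python
# Hypothetical behavior patterns: a concrete subtrace of system calls is
# rewritten into an abstract symbol naming its functionality.
PATTERNS = {
    ("open", "read", "close"): "READ_FILE",
    ("socket", "connect", "send"): "EXFILTRATE",
}

# A reference malicious behavior: read a file, then send data over the network.
MALICIOUS = ["READ_FILE", "EXFILTRATE"]

def abstract(trace):
    """Rewrite every occurrence of a known subtrace into its abstract symbol."""
    out, i = [], 0
    while i < len(trace):
        for sub, symbol in PATTERNS.items():
            if tuple(trace[i:i + len(sub)]) == sub:
                out.append(symbol)
                i += len(sub)
                break
        else:
            out.append(trace[i])  # no pattern applies: keep the raw event
            i += 1
    return out

def is_suspicious(trace):
    """Check whether MALICIOUS occurs as a subsequence of the abstracted trace."""
    it = 0
    for sym in abstract(trace):
        if it < len(MALICIOUS) and sym == MALICIOUS[it]:
            it += 1
    return it == len(MALICIOUS)
```

Because detection operates on the abstract symbols rather than on concrete calls, a variant that reads the file through different APIs but abstracts to the same symbols is still caught.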

Previous work by the team abstracted trace automata by rewriting them with respect to a set of predefined behavior patterns, first defined as a regular language described by a string rewriting system, and then by a term rewriting system, which also allows the detection of information leaks.

This work has been completed this year by designing a probabilistic generalization of our approach. Introducing probabilities into our technique makes it possible to express a degree of pertinence of the detection when the analysis of the program results in an incomplete or uncertain program dataflow, or when abstraction cannot be performed reliably. Proposing malware detection with a probabilistic rate is finer and more realistic in practice than giving a binary answer of whether a program is infected or not.

Using a tropical semiring over the reals, they have presented a formalism relying on a weighted term rewriting mechanism, where a weight quantifies the confidence attached to each abstraction step. Detection of an abstract behavior is then defined with respect to a threshold: a program is considered infected when the weight of a detected abstract behavior reaches that threshold.

The weighted abstraction formalism has the advantage of providing a detection algorithm with the same complexity as in the unweighted case, that is, linear in the size of the trace automaton.
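A hedged sketch of such threshold-based detection (all names and numbers are illustrative; multiplying confidences in [0,1], as below, corresponds to adding costs in a tropical (min,+) setting after a −log change of scale):

```python
# Hypothetical abstracted trace: (abstract symbol, confidence of the
# abstraction step that produced it).
trace = [("READ_FILE", 0.9), ("getpid", 1.0), ("EXFILTRATE", 0.7)]
MALICIOUS = ["READ_FILE", "EXFILTRATE"]  # reference abstract behavior
THRESHOLD = 0.5

def detection_weight(trace, signature):
    """Multiply the confidences of the signature steps found in order;
    return 0.0 if the signature is not fully matched."""
    weight, it = 1.0, 0
    for symbol, conf in trace:
        if it < len(signature) and symbol == signature[it]:
            weight *= conf
            it += 1
    return weight if it == len(signature) else 0.0

infected = detection_weight(trace, MALICIOUS) >= THRESHOLD
```

Here the behavior is detected with weight 0.9 × 0.7 = 0.63, above the threshold, so the program is flagged; had either abstraction step been too unreliable, the same trace would fall below the threshold instead of triggering a hard yes/no verdict.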

Guillaume Bonfante and Bruno Guillaume have provided a new graph rewriting framework adapted to Natural Language Processing. It involves a new form of edge transformation, and a new termination technique is also described. The extended paper has been accepted for publication in Mathematical Structures in Computer Science.

We are currently working with the consortium “malware.lu”.

The team was a partner in ANR Complice (Implicit Computational Complexity, Concurrency and Extraction), ref. ANR-08-BLANC-0211-01, which ended in April 2013 and whose aim was to extend the results of ICC to other paradigms (process languages, ...) and to benefit from proof extraction techniques in order to synthesize resource certificates. This ANR should be followed by a new submission (the ANR Elica proposal) involving the PPS team (Paris 7), the LCC team (Paris 13), the Plume team (ENS Lyon), and the Inria team Focus (Bologna).

The team is a partner in ANR Binsec, whose aim is to fill part of the gap between formal methods over executable code on one side and the binary-level security analyses currently used in the security industry on the other. Two main application domains are targeted: vulnerability analysis and virus detection. Two other closely related applications will also be investigated: crash analysis and program deobfuscation.

Emmanuel Jeandel is a member of ANR Blanche ANR-09-BLAN-0164 (EMC: *Emerging
Phenomena in Computation Models*), that ended in April 2013.

Simon Perdrix is a member of a PEPS INS2I project, “Information et Communication Quantique : Cryptographie et Calcul Quantiques Distribués”, with partners at Telecom ParisTech and other labs.

Mathieu Hoyrup is principal investigator of a PEPS INS2I “Approches Topologiques de l'Information et de la Calculabilité”, with Emmanuel Jeandel and Laurent Bienvenu (CNRS, LIAFA).

Title: Morphus

Type: COOPERATION

Defi: PPP FI: Technology Foundation: Future Internet Core Platform

Instrument: Integrated Project (IP)

Objectif: PPP FI: Technology Foundation:Future Internet Core Platform

Duration: September 2011 - May 2014

Coordinator: Telefonica (Spain)

Other Partners: Thales, SAP, Inria

Inria contact: Olivier Festor

Abstract: See also: http://

The team has an informal partnership with Pr. James Royer (Syracuse University) and Dr. Norman Danner (Wesleyan University) on the study of higher-order program complexity (an Inria associated team proposal has been submitted in this domain). On the Implicit Computational Complexity side, the team has strong contacts with Università di Torino (Pr. Simona Ronchi Della Rocca), the University of Dundee (Dr. Marco Gaboardi), and Università di Bologna (Pr. Simone Martini and Dr. Ugo Dal Lago).

Subramanian Kumbakonam Govindarajan, professor at Universiti Sains Malaysia, visited the Carte team in February. He works on computational models and Parikh matrices.

Neil Jones, professor at the University of Copenhagen, visited the Carte team for one month in March. He is currently working on program transformation and program obfuscation, which have obvious applications to computer virology.

Mathieu Hoyrup visited Universidad Andres Bello in Santiago de Chile in February. He worked there with Cristobal Rojas on extending their results from functions to relations.

Mathieu Hoyrup was on the Program Committee of Computability and Complexity in Analysis (CCA) 2013. He is a guest editor for the post-proceedings of CCA 2013, a special issue of Logical Methods in Computer Science. He was invited to organize the special session on Algorithmic Randomness at the conference Computability in Europe (CiE) 2013.

Simon Perdrix is a member of the Program Committee of the first Workshop on Parallel Quantum Computing (ParQ).

Guillaume Bonfante was in the Program Committee of Malware 2013 and Symposium on Foundations & Practice of Security (FPS) 2013.

The Carte Team has organized a few conferences and workshops in Nancy this year:

Journées Calculabilités 2013, April

Journées Informatique Quantique, October.

Emmanuel Jeandel gave an invited talk at the conference Computability, Complexity and Randomness (CCR) 2013 on the entropy of Turing machines. He also presented his work on multidimensional symbolic dynamics at the PIMS Workshop on Automata Theory and Symbolic Dynamics.

Guillaume Bonfante was invited to present his work at the workshops “Proof Theory and Rewriting” (Kanazawa, February) and “Journées Francophones d'Investigation Numérique” (Neuchâtel, October).

Guillaume Bonfante, Hugo Férée and Jean-Yves Marion were invited to the workshop “Advances in Implicit Computational Complexity” in Shonan Village in November.

Mathieu Hoyrup, Emmanuel Hainry and Emmanuel Jeandel were invited to present their work at the DySyCo workshop in Lyon, December 2013.

Emmanuel Hainry presented *Complexité d'ordre supérieur, de
l'Analyse Récursive aux Basic Feasible Functionals* at the Séminaire
d'algorithmique et de complexité du plateau de Saclay
(http://

Emmanuel Hainry reviewed articles for the journal *Computability*, for
*SIAM Journal on Computing*, and for the *STACS 2014* conference.

Emmanuel Jeandel reviewed articles for the *LICS 2013* and the
*STACS 2014* conference.

Romain Péchoux reviewed articles for the *WORDS 2013* conference.

Mathieu Hoyrup reviewed articles for the CiE 2013, STACS 2014, STOC 2014 conferences and the journals Logical Methods in Computer Science and Theory of Computing Systems.

Isabelle Gnaedig is a member of the scientific mediation committee of Inria Nancy Grand-Est and is the researchers' social referee at Inria Nancy Grand-Est.

Unless specified otherwise, all teaching is done at Université de Lorraine, France.

Licence :

Guillaume Bonfante

Java, L3, Mines Nancy

Emmanuel Hainry

Operating Systems, 60 hours, L1, IUT Nancy Brabois

Algorithms and Programs, 60 hours, L1, IUT Nancy Brabois

Object Oriented Programming, 24 hours, L1, IUT Nancy Brabois

Databases, 24 hours, L2, IUT Nancy Brabois

Complexity, 28 hours, L1, IUT Nancy Brabois

Algorithmics, 12 hours, DU PFST (eq. L1), IUT Nancy Brabois

Emmanuel Jeandel

Statistics for Computer Science, 46 hours, L3 Informatique

Linear Programming, 46 hours, L3 Informatique

Algorithmics and Programming 1, 60 hours, L1 Maths-Info

Algorithmics and Programming 4, 30 hours, L3 Informatique

Networking, 20 hours, L2 and L3 Informatique

Romain Péchoux

Introduction to OO programming, 55 hours, L3 MIASHS parcours MIAGE.

Databases, 42 hours, L3 SG, ISAM-IAE

Propositional logic, 35 hours, L1 MIASHS

Algorithmic complexity, 30 hours, L3 MIASHS parcours MIAGE, IGA Casablanca, Morocco.

Master

Guillaume Bonfante

Modelling and UML, M1, Mines Nancy

Video Games, M1, Mines Nancy

Semantics, M1, Mines Nancy

Safety of Software, M2, Mines Nancy

Isabelle Gnaedig

Design of Safe Software, Coordination of the module, M2, Telecom-Nancy

Rule-based Programming, 20 hours, M2, Telecom-Nancy

Emmanuel Jeandel

Algorithmics and Complexity, M1 Informatique and M1 ENSEM, 60 hours

Combinatorial Optimization, M1 Informatique, 30 hours.

Romain Péchoux

Mathematics for computer science, 20 hours, M1 SC

Advanced Java, 52 hours, M1 MIAGE

Simon Perdrix

Pépites Algorithmiques — Informatique Quantique, 14 hours, M1/M2, Ecole des Mines de Nancy.

PhD : Joan Calvet, Analyse Dynamique de Logiciels Malveillants, Université de Lorraine and Ecole Polytechnique de Montreal, defended August 23rd, supervised by Jean-Yves Marion and José M. Fernandez.

PhD in progress: David Cattanéo, Combinatorial Modelization in Quantum Computation and Generalized Cover Problems, started sept. 2012, Pablo Arrighi (director), Simon Perdrix (co-advisor)

PhD in progress: Hugo Férée, Computational Complexity in Analysis, defense planned in December 2014, Jean-Yves Marion (director) and Mathieu Hoyrup (co-advisor).

PhD in progress: Hubert Godfroy, Semantics of Self-modifying Programs, Jean-Yves Marion

PhD in progress: Jérôme Javelle, Quantum Cryptography: Protocols and Graphs, started Jan. 2011, Pablo Arrighi (director), Mehdi Mhalla (co-advisor), Simon Perdrix (co-advisor)

PhD in progress: Thanh Dinh Ta, Malware Algebraic Modeling and Detection, started Sept. 2010, Jean-Yves Marion (director) and Guillaume Bonfante (co-advisor)

PhD in progress: Aurélien Thierry, Morphological Analysis of Malware, started Oct. 2011 supervised by Jean-Yves Marion.

Isabelle Gnaedig:

participation in the Telecom-Nancy admission committee.

Emmanuel Jeandel

Selection committee for a research assistant position in Nice (MCF 1114).

Jury of Razvan Barbulescu's PhD defense on “Algorithmes de logarithmes discrets dans les corps finis”, defended at Université de Lorraine, December 5th.

Isabelle Gnaedig is a member of the scientific popularization committee of Inria Nancy Grand-Est. This committee is an advisory and guidance body that helps the center's management and the person in charge of popularization events to elaborate a strategy, organize events, and help researchers get involved in various actions aimed at popularizing our research themes and, more generally, computer science and mathematics.

This year in particular, the center participated in the organization of mathematics competitions and projects for high school students, in conferences for high school computer science teachers, in the “Fête de la Science”, and in the "Moments d'invention" exhibition of the “Nancy Renaissance” event, and received several high school classes in various research teams of Inria Nancy Grand-Est. Details can be found at https://