The goal of the Indes team is to study models for diffuse computing and develop languages for secure diffuse applications. Diffuse applications, of which Web 2.0 applications are a notable example, are the new applications emerging from the convergence of broad network accessibility, rich personal digital environment, and vast sources of information. Strong security guarantees are required for these applications, which intrinsically rely on sharing private information over networks of mutually distrustful nodes connected by unreliable media.
Diffuse computing requires an original combination of nearly all previous computing paradigms, ranging from classical sequential computing to parallel and concurrent computing in both their synchronous / reactive and asynchronous variants. It also benefits from the recent advances in mobile computing, since devices involved in diffuse applications are often mobile or portable.
The Indes team contributes to the whole chain of research
on models and languages for diffuse computing, going from the study of
foundational models and formal semantics to the design and
implementation of new languages to be put to work on concrete
applications. Emphasis is placed on correct-by-construction
mechanisms to guarantee correct, efficient and secure implementation
of high-level programs. The research is partly inspired by and
built around Hop, the web programming model proposed by the
former Mimosa team, which takes the web as its execution platform and
targets interactive and multimedia applications.
Concurrency management is at the heart of diffuse programming. Since
the execution platforms are highly heterogeneous, many different
concurrency principles and models may be involved. Asynchronous
concurrency is the basis of shared-memory process handling within
multiprocessor or multicore computers, of direct or fifo-based message
passing in distributed networks, and of fifo- or interrupt-based event
handling in web-based human-machine interaction or sensor
handling. Synchronous or quasi-synchronous concurrency is the basis of
signal processing, of real-time control, and of safety-critical
information acquisition and display. Interfacing existing devices
based on these different concurrency principles within Hop or other
diffuse programming languages will require better understanding of the
underlying concurrency models and of the way they can nicely
cooperate, a currently ill-resolved problem.
We are studying new paradigms for programming Web applications that
rely on multi-tier functional programming. We have created a Web
programming environment named Hop. It relies on a single formalism
for programming the server-side and the client-side of the
applications as well as for configuring the execution engine.
Hop is a functional language based on the Scheme programming
language. That is, it is a strict functional language, fully
polymorphic, supporting side effects, and dynamically
type-checked. Hop is implemented as an extension of the Bigloo
Scheme compiler that we develop. In the past, we have extensively studied
static analyses (type systems and inference, abstract interpretations,
as well as classical compiler optimizations) to improve the efficiency
of compilation in both space and time.
As a Hop DSL, we have created HipHop, a synchronous
orchestration language for web and IoT applications. HipHop facilitates
the design and programming of complex web/IoT applications by smoothly
integrating three computation models and programming styles that have
been historically developed in different communities and for different
purposes: i) transformational programs that simply compute
output values from input values, with comparatively simple interaction
with their environment; ii) asynchronous concurrent programs
that perform interactions between their components or with their
environment with uncontrollable timing, using typically network-based
communication; and iii) synchronous reactive programs that
react to external events in a conceptually instantaneous and
deterministic way.
The main goal of our security research is to provide scalable and rigorous language-based techniques that can be integrated into multi-tier compilers to enforce the security of diffuse programs. Research on language-based security has previously been carried out in former Inria teams. In particular, previous research has focused on controlling information flow to ensure confidentiality.
Typical language-based solutions to these problems are founded on
static analysis, logics, provable cryptography, and compilers that
generate correct code by construction. Relying on the multi-tier
programming language Hop that tames the complexity of writing and
analysing secure diffuse applications, we are studying language-based
solutions to prominent web security problems such as code injection
and cross-site scripting, to name a few.
The Web is the natural application domain of the team. We are designing and implementing multitier languages for helping the development of Web applications. We are creating static and dynamic analyses for Web security. We are conducting empirical studies about privacy preservation on the Web.
More recently, we have started focusing on Internet
of Things (IoT) applications. They share many similarities
with Web applications, so most of the methodologies and
expertise we have developed for the Web apply to
IoT, but the restricted hardware resources made available
by many IoT devices demand new developments and new
research explorations.
Let us describe new/updated software.
The Hop programming environment consists of a web broker that intuitively combines in a single architecture a web server and a web proxy. The broker embeds a Hop interpreter for executing server-side code and a Hop client-side compiler for generating the code that will get executed by the client.
An important effort is devoted to providing Hop with a realistic and efficient implementation. The Hop implementation is validated against web applications that are used on a daily basis. In particular, we have developed Hop applications for authoring and projecting slides, editing calendars, reading RSS streams, and managing blogs.
We present a new web application architecture that allows web developers to gain control over certain types of third party content. In the traditional web application architecture, a web application developer has no control over third party content. This allows the exchange of tracking information between the browser and the third party content provider.
To prevent this, our solution automatically rewrites the web application so that third-party requests are redirected to a trusted third-party server, called the Middle Party Server. This server may be controlled either by a trusted party or by the main site owner, and it automatically eliminates third-party tracking cookies and other tracking technologies that may be exchanged between the browser and the third-party server.
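The redirection step can be illustrated with a minimal sketch. It assumes a hypothetical first-party host (`shop.example`), a hypothetical Middle Party endpoint (`https://middle.example/fetch?url=`), and a naive regular-expression rewriting of `src` and `href` attributes. None of these names come from the actual tool, whose rewriting is certainly more thorough.

```python
import re
from urllib.parse import quote, urlparse

# Hypothetical values, chosen only for illustration.
FIRST_PARTY_HOST = "shop.example"
MIDDLE_PARTY = "https://middle.example/fetch?url="

URL_ATTR = re.compile(r'(src|href)="([^"]+)"')


def redirect_third_party(html: str) -> str:
    """Rewrite absolute third-party URLs so that they go through the Middle Party."""

    def rewrite(match: re.Match) -> str:
        attr, url = match.group(1), match.group(2)
        host = urlparse(url).netloc
        if host and host != FIRST_PARTY_HOST:
            # Third-party request: the browser now only talks to the Middle Party.
            return f'{attr}="{MIDDLE_PARTY}{quote(url, safe="")}"'
        # Relative or first-party URL: left untouched.
        return match.group(0)

    return URL_ATTR.sub(rewrite, html)


page = '<img src="https://ads.example/banner.png"><script src="/app.js"></script>'
rewritten = redirect_third_party(page)
```

In this sketch the Middle Party would then fetch the original resource on behalf of the browser and drop any tracking cookies before relaying the response.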
We have pursued the development of Hop and our study on efficient
JavaScript implementations as well as our development of analyses for
distributed language sessions and security.
Computer systems that react continuously to their environment at a
rate set by the environment form a class of the so-called
reactive systems. They differ from classical computing systems
which take their input at the start of execution and produce output
before terminating. Furthermore, they also differ from traditional
interactive systems like operating systems which endlessly interact
with their environment at their own speed (in contrast to the speed
determined by the environment). A reactive system can be perceived as
a black box that perpetually receives some input events as external
stimuli and reacts to them by producing some output events as their
behavior. This output may successively affect the production of later
stimuli by the environment.
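As a toy illustration of this black-box view, the sketch below models a reactive system as a generator that performs one reaction per environment step. The lamp example and the event names (`press`, `on`, `off`) are invented for the illustration.

```python
from typing import Iterable, Iterator, Set


def reactive_lamp(stimuli: Iterable[Set[str]]) -> Iterator[Set[str]]:
    """One reaction per environment step: input events in, output events out."""
    lit = False  # control state kept between reactions
    for inputs in stimuli:
        outputs: Set[str] = set()
        if "press" in inputs:
            lit = not lit
            outputs.add("on" if lit else "off")
        yield outputs


# The environment sets the pace: three steps, the second one without any event.
trace = list(reactive_lamp([{"press"}, set(), {"press"}]))
```

The system never decides when to run: it only answers each stimulus, and its outputs depend on the state left by earlier reactions.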
HipHop, which is the language used in this study, is based on Esterel's
semantics. It is a synchronous reactive DSL for JavaScript, built on
top of Hop.js for applying the Esterel programming model to the world
of web and IoT where programs are sent over networks. HipHop can be used
to develop complex web application interfaces and IoT controllers,
which are dynamic in nature. HipHop blends Esterel's synchrony with
JavaScript's asynchrony, simplifying the cooperation between
synchronous and asynchronous activities that are typical in these
application domains. HipHop differs from Esterel in having its own syntax
and programming model adapted to the web. For example, HipHop supports
partial reconfiguration of programs between two synchronous reactions,
while maintaining consistency of the control state.
The expressiveness and the flexibility of Esterel dialects come with a
downside: debugging, and more precisely error reporting, is
difficult because errors detected by the runtime system are loosely
connected with locations in the program source code. This is a major
difficulty, especially for programmers not deeply accustomed to the
programming model. Improving the error messages that the compiler and the
runtime system report is therefore a major issue and is the subject of
ongoing research in the team. This year we have developed and
implemented a technique that isolates the fragments of the program
that are responsible for an error when it occurs. The technique we
propose applies to the compilation technique HipHop uses to transform a
source program into an equivalent electric circuit using techniques
developed for the Esterel programming language. The improved error
messages are built by isolating parts of the generated circuit and
minimizing the size of causality error cycles using an iterative
process.
The method of causality error analysis and debugging proceeds by
building on classical graph algorithms, which are applied to the graph
of nets composing the circuit generated by the HipHop compiler. This
enables programmers to narrow errors down to smaller regions of the
source code. We have demonstrated the benefits of this debugging
approach on a real-life project developed using
HipHop. This work has been presented at the Principles and Practice
of Declarative Programming conference 5.
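To give a flavour of the kind of classical graph algorithm involved, the sketch below searches for a shortest cycle in a directed graph by running a breadth-first search from every node. This is only an illustration under our own assumptions: the graph encoding and the net names are hypothetical, and it is not the iterative minimization procedure implemented in HipHop.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return a shortest directed cycle as a list of nodes, or None if acyclic."""
    best: Optional[List[str]] = None
    for start in graph:
        parent: Dict[str, Optional[str]] = {start: None}
        queue = deque([start])
        closing = None  # node whose edge leads back to `start`
        while queue and closing is None:
            node = queue.popleft()
            for succ in graph.get(node, []):
                if succ == start:
                    closing = node
                    break
                if succ not in parent:
                    parent[succ] = node
                    queue.append(succ)
        if closing is not None:
            # Rebuild the path start -> ... -> closing from the BFS tree.
            path: List[str] = []
            cur: Optional[str] = closing
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            path.reverse()
            if best is None or len(path) < len(best):
                best = path
    return best


# Hypothetical net graph with a short cycle (a, b) and a longer one (a, b, c, d).
nets = {"a": ["b"], "b": ["c", "a"], "c": ["d"], "d": ["a"]}
```

Reporting the two-net cycle rather than the four-net one is what lets an error message point at a smaller region of the source program.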
Dynamic languages are particularly difficult to implement efficiently
because most of their expressions have all sorts of different meanings
that involve all sorts of different executions that are not
distinguished by any syntactic or type annotation. For instance,
in JavaScript, “obj.prop” might i) fetch property
prop from obj, ii) scan the linked list of
obj's prototype chain and fetch prop from another
object, iii) call a user defined function if prop is
an accessor, iv) allocate a fresh object if obj is a
primitive value, or v) evaluate yet another user function if
obj is a proxy object. Checking all the possible
interpretations and executing the appropriate one literally, that is
treating the language specification as an algorithm, delivers
unacceptably slow performance. All fast implementations use
alternative strategies. Amongst all the possible interpretations, they
favor the one that corresponds to the most frequent situation, for
which they elaborate a faster execution plan, and, as importantly, for
which they elaborate a fast guard that ensures the preservation of the
language semantics. Typically, that is what inline caches and
hidden classes achieve. Using a single test, the comparison of
the object's hidden class with the inline cache, we know if the
property is to be read directly from the object and, if so, at which
offset. The common intuition is that only dynamic compilers,
a.k.a., JIT compilers, can handle dynamic languages
efficiently because this heuristic-based strategy requires having the
program and the data on hand in order to generate efficient
code. We view this position as too extreme, as it is oblivious to
other characteristics of static compilers (AoT) that might make them
competitive.
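The guard described above can be mimicked in a few lines. The sketch below is a simplified Python model, not code from any JavaScript engine: the class names (`HiddenClass`, `Obj`, `AccessSite`) are ours, accessors and proxies are ignored, and only own properties are cached.

```python
class HiddenClass:
    """Maps property names to slot offsets; shared by objects with the same layout."""

    def __init__(self, offsets):
        self.offsets = dict(offsets)


class Obj:
    def __init__(self, hidden_class, slots, proto=None):
        self.hidden_class = hidden_class
        self.slots = list(slots)
        self.proto = proto


def generic_get(obj, name):
    """Slow path: own property first, then walk the prototype chain."""
    while obj is not None:
        offset = obj.hidden_class.offsets.get(name)
        if offset is not None:
            return obj.slots[offset]
        obj = obj.proto
    return None  # stands in for JavaScript's undefined


class AccessSite:
    """One `obj.prop` occurrence in the program, with its inline cache."""

    def __init__(self, name):
        self.name = name
        self.cached_class = None
        self.cached_offset = None
        self.hits = 0

    def get(self, obj):
        # Guard: a single identity test on the hidden class.
        if obj.hidden_class is self.cached_class:
            self.hits += 1
            return obj.slots[self.cached_offset]
        # Miss: remember the layout if the property is an own property,
        # then answer through the generic lookup.
        offset = obj.hidden_class.offsets.get(self.name)
        if offset is not None:
            self.cached_class, self.cached_offset = obj.hidden_class, offset
        return generic_get(obj, self.name)


point = HiddenClass({"x": 0, "y": 1})
site = AccessSite("y")
values = [site.get(Obj(point, [i, 2 * i])) for i in range(4)]
```

After the first miss, every access at this site to an object with the same hidden class takes the fast path, which is the frequent situation the text refers to.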
As few are committed to developing optimizing AoT JavaScript compilers, we rely too heavily on intuition to answer the question of whether AoT compilers can deliver performance comparable to JIT compilers. To provide the elements of a proper scientific comparison, we have built Hopc, an AoT compiler for JavaScript. We have compared its performance with that of production JIT compilers and we have shown that, on many new tests, its performance is close to that of JIT compilers 1. We read this as a strong indication that an AoT compiler that optimizes the whole core language and the whole set of libraries could compete with the fastest JIT compilers. We intend to pursue this exploration in the coming years.
Session types describe communication protocols involving two or more
participants by specifying the sequence of exchanged messages and their
functionality (sender, receiver and type of carried data). They may
be viewed as the analogue, for concurrency and distribution, of data
types for sequential computation. Originally conceived as a static
analysis technique for an enhanced version of the
The aim of session types is to ensure safety properties for
sessions, such as the absence of communication errors (no
type mismatch in exchanged data) and deadlock-freedom (no
standstill until all participants are terminated). Multiparty
session types often also target the liveness property
of progress or lock-freedom (no participant waits
forever), which is stronger than deadlock-freedom.
While binary sessions can be described by a single session type,
multiparty sessions require two kinds of types: a global type
that describes the whole session protocol, and local types
that describe the individual contributions of the participants to
the protocol. The key requirement to achieve safety properties such
as deadlock-freedom is that the local types of the processes
implementing the participants be obtained as projections from the
same global type. To ensure progress, global types must satisfy
additional well-formedness requirements.
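For the simplest case, projection can be sketched as follows. The sketch assumes a global type that is a plain sequence of communications, with no choice and no recursion, and the participant and message names are invented. Realistic multiparty session types need a more elaborate projection.

```python
from typing import List, Tuple

Communication = Tuple[str, str, str]  # (sender, receiver, message label)
Action = Tuple[str, str, str]         # ("!" or "?", partner, message label)


def project(global_type: List[Communication], participant: str) -> List[Action]:
    """Keep only the communications involving `participant`, seen from its side."""
    local: List[Action] = []
    for sender, receiver, label in global_type:
        if participant == sender:
            local.append(("!", receiver, label))   # output to the receiver
        elif participant == receiver:
            local.append(("?", sender, label))     # input from the sender
        # Communications between other participants are invisible locally.
    return local


# Hypothetical three-party protocol.
protocol = [
    ("buyer", "seller", "title"),
    ("seller", "buyer", "price"),
    ("buyer", "bank", "pay"),
]
buyer_type = project(protocol, "buyer")
bank_type = project(protocol, "bank")
```

Because all local types are derived from the same global description, every output of one participant has a matching input in the local type of its partner.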
What makes session types particularly attractive is that they offer several advantages at once: 1) static safety guarantees, 2) automatic check of protocol implementation correctness, based on local types, and 3) a strong connection with linear logics and with concurrency models such as communicating automata, graphical choreographies and message-sequence charts.
During the past year we have further investigated the relationship between multiparty session types and concurrency models, focussing on Event Structures 17, a canonical model for concurrent computation. As with most of our previous work on this subject, this research has been pursued in collaboration with colleagues from the Universities of Eastern Piedmont and Turin.
In the two papers 10 and 11, we explored the relationship between multiparty session calculi and Event Structures (ESs), a well-known concurrency model introduced in the early 80's 18, 15.
In the first paper 10, we considered a
core multiparty session calculus with synchronous
communication, where sessions are described as networks of
sequential processes (each process implementing a participant),
equipped with standard global types. We proposed an interpretation of
networks as Flow Event Structures (FESs) 13, a
subclass of Winskel's Stable Event Structures 18, as
well as an interpretation of global types as Prime Event
Structures (PESs) 15, the simplest class of
ESs. Since global types are sequential specifications, which are not
able to explicitly represent the concurrency among communications, the
events of the associated PES need to be defined as equivalence classes
of communication sequences up to permutation equivalence. We
showed that when a network is typable with a global type, the FES
semantics of the former is equivalent, in a precise technical sense,
to the PES semantics of its type.
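The idea of identifying communication sequences up to permutation can be sketched as follows, under the assumption that two adjacent communications may be swapped exactly when their participants are disjoint. The encoding and the exhaustive exploration are ours and are only meant for very short sequences.

```python
from typing import FrozenSet, Tuple

Comm = Tuple[str, str, str]  # (sender, receiver, message label)
Trace = Tuple[Comm, ...]


def independent(a: Comm, b: Comm) -> bool:
    """Two communications commute when they share no participant."""
    return {a[0], a[1]}.isdisjoint({b[0], b[1]})


def equivalence_class(trace: Trace) -> FrozenSet[Trace]:
    """All traces reachable by swapping adjacent independent communications."""
    seen = {trace}
    todo = [trace]
    while todo:
        current = todo.pop()
        for i in range(len(current) - 1):
            if independent(current[i], current[i + 1]):
                swapped = current[:i] + (current[i + 1], current[i]) + current[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    todo.append(swapped)
    return frozenset(seen)


t1 = (("p", "q", "a"), ("r", "s", "b"), ("q", "r", "c"))
t2 = (("r", "s", "b"), ("p", "q", "a"), ("q", "r", "c"))  # first two swapped
t3 = (("p", "q", "a"), ("q", "r", "c"), ("r", "s", "b"))  # causal order changed
```

Here `t1` and `t2` describe the same concurrent behaviour, whereas `t3` does not, because the last two communications both involve `r`.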
In the second paper 11, we undertook a
similar endeavour for asynchronous communication. This led us
to devise a new notion of global type for asynchronous sessions. The
type system for asynchronous sessions is expected to be more
permissive than the one for synchronous sessions. For instance,
consider a session with two participants each of which wishes to first
send a message and then receive a message. This session is stuck if
communication is synchronous but not if communication is
asynchronous. Hence it should be typable in the latter case but not in
the former.
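The example can be replayed with a small simulator. This is a sketch under our own encoding: each participant is a list of `send`/`recv` actions, message contents are ignored, and asynchronous channels are modelled by counters of pending messages.

```python
from collections import deque


def run(programs, asynchronous):
    """Return True if every participant finishes, False if the session gets stuck."""
    progs = {name: deque(actions) for name, actions in programs.items()}
    pending = {}  # (sender, receiver) -> number of messages in transit
    progress = True
    while progress:
        progress = False
        for name, actions in progs.items():
            if not actions:
                continue
            kind, partner = actions[0]
            if asynchronous and kind == "send":
                # An output never blocks: the message is stored in the queue.
                pending[(name, partner)] = pending.get((name, partner), 0) + 1
                actions.popleft()
                progress = True
            elif asynchronous and kind == "recv":
                # An input needs a message from the partner in the queue.
                if pending.get((partner, name), 0) > 0:
                    pending[(partner, name)] -= 1
                    actions.popleft()
                    progress = True
            elif kind == "send":
                # Synchronous: the partner must be ready to receive right now.
                other = progs[partner]
                if other and other[0] == ("recv", name):
                    actions.popleft()
                    other.popleft()
                    progress = True
    return all(not actions for actions in progs.values())


# Both participants first send, then receive.
session = {
    "p": [("send", "q"), ("recv", "q")],
    "q": [("send", "p"), ("recv", "p")],
}
```

With `asynchronous=False` neither output finds a matching input and the session is stuck, whereas with `asynchronous=True` both outputs are queued and both inputs then succeed.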
We started by considering a core session calculus as in the
synchronous case, where networks are now endowed with a queue, and
they act on this queue by performing outputs or inputs: an output
stores a message in the queue, while an input fetches a message from
the queue. Then, the idea for our asynchronous global types is
quite simple: to split communications in the type into outputs and
inputs, and to equip the type with a queue, thus mimicking very
closely the behaviour of asynchronous networks. The well-formedness
conditions for global types must now take into account also the
queue. Essentially, this amounts to requiring that each input
appearing in the type be justified by a preceding output in the type
or by a message in the queue, and vice versa.
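The matching requirement between inputs, outputs and queued messages can be approximated by the check below. It is a deliberately simplified sketch: the type is a flat sequence of `out`/`in` actions without choice or recursion, messages are ordered per sender-receiver pair, and the tuple encoding is ours. The actual well-formedness conditions are richer.

```python
from collections import defaultdict, deque


def balanced(actions, initial_queue=()):
    """Check that inputs and queued or output messages match up one-to-one.

    `actions` is a sequence of (kind, sender, receiver, label) with kind
    "out" or "in"; `initial_queue` is a sequence of (sender, receiver, label).
    """
    pending = defaultdict(deque)  # (sender, receiver) -> labels in FIFO order
    for sender, receiver, label in initial_queue:
        pending[(sender, receiver)].append(label)
    for kind, sender, receiver, label in actions:
        if kind == "out":
            pending[(sender, receiver)].append(label)
        else:
            # "in": must consume the oldest pending message on this channel.
            channel = pending[(sender, receiver)]
            if not channel or channel[0] != label:
                return False
            channel.popleft()
    # Vice versa: no message may remain unread at the end.
    return all(not channel for channel in pending.values())


# Both participants output first, then input (fine asynchronously).
crossed = [
    ("out", "p", "q", "a"),
    ("out", "q", "p", "b"),
    ("in", "p", "q", "a"),
    ("in", "q", "p", "b"),
]
```

An input with no preceding output is accepted only when the initial queue already holds the corresponding message, for instance `balanced([("in", "p", "q", "a")], [("p", "q", "a")])`.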
The contribution of 11 is twofold: 1) We propose an original type system for asynchronous multiparty sessions, which accounts for asynchronous communication more directly than existing approaches, while remaining decidable; 2) We present an Event Structure semantics for asynchronous sessions and asynchronous global types, and we show that these two semantics agree.
Both these papers have been submitted for journal publication.
Previously, in 2020, we had developed an analyzer for constant-time called Binsec/Rel.
Binsec/Rel detects timing-leak attacks. These attacks can be captured via a security property called constant-time, which states that the execution time
of an application does not depend on the dynamic path of the execution. Our analyzer works at the binary level and is based on symbolic execution with
dedicated optimizations for constant-time analysis. In particular, we complement relational symbolic execution with a new on-the-fly simplification to maximize sharing in the memory, and we formally prove that our analysis is correct for bug finding and bounded verification.
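The kind of leak targeted by constant-time analysis can be illustrated at source level. In the sketch below, a step counter stands in for execution time, which is an assumption of the illustration: Binsec/Rel itself works on binaries, and Python gives no real timing guarantees. For actual Python code, the standard library provides `hmac.compare_digest`.

```python
import hmac


def leaky_equal(secret: bytes, guess: bytes):
    """Early-exit comparison: the number of steps depends on the secret."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            # Control flow depends on secret data: this is the leak.
            return False, steps
    return True, steps


def constant_time_equal(secret: bytes, guess: bytes):
    """Accumulate differences so that every byte is always inspected."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    diff = 0
    for s, g in zip(secret, guess):
        steps += 1
        diff |= s ^ g
    return diff == 0, steps


early = leaky_equal(b"secret", b"sXcret")         # stops at the first mismatch
late = leaky_equal(b"secret", b"secreX")          # runs to the last byte
fixed = constant_time_equal(b"secret", b"sXcret")
```

An attacker who can observe the step count of `leaky_equal` learns how long a prefix of the guess is correct, whereas `constant_time_equal` performs the same work for every guess of the right length.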
In 2021,
we extended this analyzer to handle microarchitectural attacks. The new analyzer is called Binsec/Haunted 3.
We first modeled the semantics of hardware with microarchitectural features in order to capture timing side channels and attacks such as Spectre, which can be used, e.g., in the cloud, to learn all kinds of
confidential information from the cloud's customers. The resulting hardware semantics supports out-of-order and speculative execution by modeling reorder buffers and transient instructions, respectively. It assumes that attackers have complete control
over microarchitectural features (e.g., the branch target predictor), and uses adversarial execution directives to model the adversary's control over predictors. The Binsec/Haunted analyzer is based on this semantics and scales to detect Spectre-PHT vulnerabilities in binaries of cryptographic libraries. It also works for Spectre-STL attacks and helped to uncover inconsistencies between different Spectre defenses.
The analyzers
have helped to disclose that popular compilers cannot be trusted to preserve constant-time and that popular counter-measures for Spectre vulnerabilities
may also introduce other variants of the vulnerability.
Binsec/Rel and Binsec/Haunted constitute the contributions in the PhD thesis of Lesly-Ann Daniel, who defended on November 12th, 2021.
We have also worked on adding phases to the Jazmin compiler in order to obtain code certified
as free of Spectre PHT and STL attacks.
Our work on microarchitectural attacks has been presented at
major security conferences 3, 2.
Modern client-side web applications often include external third-party code, namely gadgets such as advertisement banners.
Prior to 2021, we developed a compiler, called Mashic, that takes a client-side web application as input and transforms it in such a way that gadgets are included with iframes. The compiler provides the following guarantees: if the gadget is not malicious, then the functionality of the compiled code is the same as that of the original code; if the gadget is malicious, all the attacks are neutralized and have no effect on trusted code. During the current period, with the intention of obtaining a compiler applicable to IoT applications, we have generalized the Mashic compiler to be independent of browsers and, in particular, of any browser built-in isolation mechanism. We called the generalization SecureJS 4, and it can be used for JavaScript running on IoT devices. SecureJS passes the test suites of the ECMAScript5 specification in diverse JavaScript engines such as V8, Hop, and JerryScript.
As a continuation of this work, we have studied more flexible JavaScript security policies, in particular a security policy that will allow us to express the sharing of JavaScript objects via declassification. This kind of policy can be implemented, for example, by membrane patterns from the object capability model and could be used to formally prove the security property that the secure subset of ECMAScript provides.
We also studied JavaScript and PHP security by means of analyzers that detect XSS vulnerabilities using neural networks 6. We compare two different code representations based on Natural Language Processing (NLP) and Programming Language Processing (PLP), and experiment with models based on different neural network architectures for the static detection of vulnerabilities in PHP and Node.js. We train and evaluate the models using synthetic databases. Using the generated PHP and Node.js databases, we compare our results with a well-known static analyzer for PHP code, ProgPilot, and a known scanner for Node.js, AppScan in static mode. Our analyzers based on neural networks outperform existing tools in all cases, but are limited to programs with a small number of lines of code.
For many different reasons, this makes programming much more difficult than it was when only a single computer was involved:
The Indes, Northwestern, and Collège de France teams study programming languages and have each created complementary solutions that address the aforementioned problems. Combined, they could lead to a robust and secure execution environment for web and IoT programming. Indes will bring its expertise in secure web programming, Collège de France its expertise in synchronous reactive programming, and Northwestern its expertise in secure execution environments and run-time validation of security properties of program executions. Finally, Northwestern will contribute its expertise in medical prescriptions, which will be the main application domain of the secure execution environment the participants aim to develop.
The main objective of the collaboration is the development of a robust and secure integrated programming environment for reactive applications suitable for web and IoT applications. The programming of medical prescriptions will be our favored application domain. We will base our work on three pillars: Hop.js, the contract system designed for the Racket language, and HipHop.js, a domain specific language for reactive programming within Hop.js.
SPARTA establishes a strategic research and innovation roadmap to stimulate the development and deployment of key technologies in cybersecurity and to retain digital sovereignty and autonomy of the European industries.
SPARTA Roadmap serves as common ground for the alignment of research, education and certification priorities of the European Cybersecurity Competence Network.
The CISC project (Certified IoT Secure Compilation) is funded by the ANR for 42 months, starting in April 2018. The goal of the CISC project is to provide strong security and privacy guarantees for IoT applications by means of a language to orchestrate IoT applications from the microcontroller to the cloud. Tamara Rezk coordinates this project, and Manuel Serrano, Ilaria Castellani and Nataliia Bielova participate in the project. The partners of this project are Inria teams Celtique, Indes and Privatics, and Collège de France.
Tamara Rezk gave a talk for undergraduates at University of Córdoba in May 2021 to promote scientific careers.
Tamara Rezk co-organized PLMW at PLDI'21.
Ilaria Castellani participated in the following PhD jury:
Tamara Rezk participated in the following PhD juries: