The goal of the Indes team is to study models for diffuse computing and develop languages for secure diffuse applications. Diffuse applications, of which Web 2.0 applications are a notable example, are the new applications emerging from the convergence of broad network accessibility, rich personal digital environment, and vast sources of information. Strong security guarantees are required for these applications, which intrinsically rely on sharing private information over networks of mutually distrustful nodes connected by unreliable media.
Diffuse computing requires an original combination of nearly all previous computing paradigms, ranging from classical sequential computing to parallel and concurrent computing in both their synchronous / reactive and asynchronous variants. It also benefits from the recent advances in mobile computing, since devices involved in diffuse applications are often mobile or portable.
The Indes team contributes to the whole chain of research
on models and languages for diffuse computing, going from the study of
foundational models and formal semantics to the design and
implementation of new languages to be put to work on concrete
applications. Emphasis is placed on correct-by-construction
mechanisms to guarantee correct, efficient and secure implementation
of high-level programs. The research is partly inspired by and
built around Hop, the web programming model proposed by the
former Mimosa team, which takes the web as its execution platform and
targets interactive and multimedia applications.
Concurrency management is at the heart of diffuse programming. Since
the execution platforms are highly heterogeneous, many different
concurrency principles and models may be involved. Asynchronous
concurrency is the basis of shared-memory process handling within
multiprocessor or multicore computers, of direct or fifo-based message
passing in distributed networks, and of fifo- or interrupt-based event
handling in web-based human-machine interaction or sensor
handling. Synchronous or quasi-synchronous concurrency is the basis of
signal processing, of real-time control, and of safety-critical
information acquisition and display. Interfacing existing devices
based on these different concurrency principles within Hop or other
diffuse programming languages will require better understanding of the
underlying concurrency models and of the way they can nicely
cooperate, a currently ill-resolved problem.
We are studying new paradigms for programming Web applications that
rely on multi-tier functional programming. We have created a Web
programming environment named Hop. It relies on a single formalism
for programming the server-side and the client-side of the
applications as well as for configuring the execution engine.
Hop is a functional language based on the Scheme programming
language. That is, it is a strict functional language, fully
polymorphic, supporting side effects, and dynamically
type-checked. Hop is implemented as an extension of the Bigloo
Scheme compiler that we develop. In the past, we have extensively studied
static analyses (type systems and inference, abstract interpretations,
as well as classical compiler optimizations) to improve the efficiency
of compilation in both space and time.
As a Hop DSL, we have created HipHop, a synchronous
orchestration language for web and IoT applications. HipHop facilitates
the design and programming of complex web/IoT applications by smoothly
integrating three computation models and programming styles that have
been historically developed in different communities and for different
purposes: i) transformational programs that simply compute
output values from input values, with comparatively simple interaction
with their environment; ii) asynchronous concurrent programs
that perform interactions between their components or with their
environment with uncontrollable timing, using typically network-based
communication; and iii) synchronous reactive programs that
react to external events in a conceptually instantaneous and
deterministic way.
The main goal of our security research is to provide scalable and rigorous language-based techniques that can be integrated into multi-tier compilers to enforce the security of diffuse programs. Research on language-based security was previously carried out in former Inria teams. In particular, that research focused on controlling information flow to ensure confidentiality.
Typical language-based solutions to these problems are founded on
static analysis, logics, provable cryptography, and compilers that
generate correct code by construction. Relying on the multi-tier
programming language Hop that tames the complexity of writing and
analysing secure diffuse applications, we are studying language-based
solutions to prominent web security problems such as code injection
and cross-site scripting, to name a few.
The Web is the natural application domain of the team. We are designing and implementing multi-tier languages to support the development of Web applications. We are creating static and dynamic analyses for Web security. We are conducting empirical studies on privacy preservation on the Web.
More recently, we have started focusing on Internet
of Things (IoT) applications. They share many similarities
with Web applications, so most of the methodologies and
expertise we have developed for the Web apply to
IoT. However, the restricted hardware resources available
on many IoT devices demand new developments and new
research explorations.
This section should rather be called “Lowlights”, given its negative assessments.
We point out several issues and institutional dysfunctions which impaired and slowed down our team activity, and also badly affected the general atmosphere in the research centre and in the institute at large, causing a great deal of anxiety and strain in the scientific, technical and administrative staff.
The Hop programming environment consists of a web broker that intuitively combines a web server and a web proxy in a single architecture. The broker embeds a Hop interpreter for executing server-side code and a Hop client-side compiler for generating the code that will be executed by the client.
An important effort is devoted to providing Hop with a realistic and efficient implementation. The Hop implementation is validated against web applications that are used on a daily basis. In particular, we have developed Hop applications for authoring and projecting slides, editing calendars, reading RSS streams, and managing blogs.
We present a new web application architecture that allows web developers to gain control over certain types of third party content. In the traditional web application architecture, a web application developer has no control over third party content. This allows the exchange of tracking information between the browser and the third party content provider.
To prevent this, our solution automatically rewrites the web application so that third-party requests are redirected to a trusted server, called the Middle Party Server. This server may be controlled either by a trusted party or by the main site owner, and it automatically eliminates third-party tracking cookies and other tracking technologies that may be exchanged between the browser and the third-party server.
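The redirection step can be illustrated with a minimal sketch. It is not the architecture's actual implementation: the server address, the first-party host name, the `target` query parameter, and both function names are hypothetical, and only cookie headers are filtered, whereas the text also mentions other tracking technologies.

```python
from urllib.parse import urlsplit, quote

# Hypothetical values, chosen only for illustration.
MIDDLE_PARTY = "https://middle.example.org/fetch"
FIRST_PARTY_HOST = "shop.example.com"


def rewrite_url(url: str) -> str:
    """Redirect a third-party URL through the Middle Party Server.

    Relative and first-party URLs are returned unchanged.
    """
    host = urlsplit(url).hostname
    if host is None or host == FIRST_PARTY_HOST:
        return url
    # The original target travels as a query parameter, so the browser
    # only ever contacts the Middle Party Server.
    return f"{MIDDLE_PARTY}?target={quote(url, safe='')}"


def strip_tracking_headers(headers: dict) -> dict:
    """Drop cookie headers before a request or response is forwarded."""
    blocked = {"cookie", "set-cookie"}
    return {k: v for k, v in headers.items() if k.lower() not in blocked}
```

In this sketch, `rewrite_url` would be applied to every URL found while rewriting the application, and `strip_tracking_headers` would run on the Middle Party Server in both directions.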
We have pursued the development of Hop and our study on efficient
JavaScript implementations as well as our development of analyses for
distributed language sessions and security.
We proposed JavaScript sealed classes, which differ from regular
classes in a few ways that allow ahead-of-time (AoT) compilers to
implement them more efficiently. Sealed classes are compatible with
the rest of the language so that they can be combined with all other
structures, including regular classes, and can be gradually integrated
into existing code bases.
Sealed classes trade a little of the dynamicity of JavaScript classes for faster and more predictable execution. All the benchmarks we tested benefit from sealed classes: some from a code size reduction, others from a speedup, and some from both.
Sealed classes are compatible with the rest of the JavaScript runtime system. They can be passed to functions, returned by them, stored in data structures, and used as the super classes of sealed and ordinary classes. Thus, in existing programs, sealed classes can gradually replace those classes that naturally respect the restrictions they impose. Infringements of the rules of sealed classes are detected, so that sealing classes does not present the risk of silently corrupting operational programs.
The dynamic semantics of sealed classes that do not raise errors is
identical to that of regular classes. They can therefore already be
used by unmodified JavaScript engines, although in this case there is no
runtime acceleration. To benefit from this acceleration, we have
modified the AoT Hopc compiler. We have shown that the average
speedup due to sealed classes is 19% on a variety of
programs using classes. We have detailed this implementation in a
conference paper 19. It is simple and required fewer than 1,000
new lines of code for the compiler and a few
hundred lines of code for the runtime system. Sealed classes deliver
better performance than regular classes and they are easy to
implement.
The DefinitelyTyped repository hosts type declarations for thousands
of JavaScript libraries. Given the lack of formal connection between
the types and the corresponding code, a natural question is: are
the types right? An equally important question, as DefinitelyTyped
and the libraries it supports change over time, is: how can we
keep the types from becoming wrong?
To tackle this problem, we have created Scotty, a tool that detects mismatches between the types and code in the DefinitelyTyped repository. More specifically, Scotty checks each package by converting its types into contracts and installing the contracts on the boundary between the library and its test suite. Running the test suite in this environment can reveal mismatches between the types and the JavaScript code. As automation and generality are both essential if such a tool is going to remain useful in the long term, we focus on techniques that sacrifice completeness, instead preferring to avoid false positives. Scotty currently handles about 26% of the 8806 packages on DefinitelyTyped (61% of the packages whose code is available and whose test suite passes).
Perhaps unsurprisingly, running the tests with these contracts in place revealed many errors in DefinitelyTyped. More surprisingly, despite the inherent limitations of the techniques we use, this exercise led to one hundred accepted pull requests that fix errors in DefinitelyTyped, demonstrating the value of this approach for the long-term maintenance of DefinitelyTyped. It also revealed a number of lessons about working in the JavaScript ecosystem and how details beyond the semantics of the language can be surprisingly important. Best of all, it also revealed a few places where programmers preferred incorrect types, suggesting some avenues of research to improve TypeScript.
Scotty, its design, its architecture, and also its limits, have been described in a publication 14.
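The idea of turning declared types into contracts installed on a library boundary can be sketched as follows. This is only an illustration under stated assumptions, not Scotty's implementation: Scotty works on TypeScript declarations and JavaScript code, whereas this sketch uses Python run-time types, and `contract`, `ContractViolation`, `left_pad` and `declaration_matches` are hypothetical names.

```python
import functools


class ContractViolation(Exception):
    """A value crossing the library boundary did not match its declared type."""


def contract(arg_types, ret_type):
    """Turn a declared signature into run-time checks around a function."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            if len(args) != len(arg_types):
                raise ContractViolation(f"{fn.__name__}: wrong number of arguments")
            for position, (value, expected) in enumerate(zip(args, arg_types)):
                if not isinstance(value, expected):
                    # The caller and the declaration disagree on this argument.
                    raise ContractViolation(
                        f"{fn.__name__}: argument {position} is not {expected.__name__}")
            result = fn(*args)
            if not isinstance(result, ret_type):
                # The library code and the declaration disagree on the result.
                raise ContractViolation(
                    f"{fn.__name__}: result is not {ret_type.__name__}")
            return result
        return checked
    return wrap


def left_pad(text, width):
    """Hypothetical library function: it really returns a str."""
    return text.rjust(width)


# Two hypothetical declarations for the same code: one right, one wrong.
well_typed = contract((str, int), str)(left_pad)
mistyped = contract((str, int), int)(left_pad)


def declaration_matches(wrapped) -> bool:
    """Stand-in for the library's test suite, run through the contract."""
    try:
        wrapped("ab", 4)
        return True
    except ContractViolation:
        return False
```

Because a violation is only reported when a test actually exercises the mismatching value, a check of this kind can miss errors but does not report a mismatch that never occurs at run time.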
Session types describe communication protocols involving two or more
participants by specifying the sequence of exchanged messages and
their functionality (sender, receiver and type of carried data). They
may be viewed as the analogue, for concurrency and distribution, of
data types for sequential computation. Session types were originally
conceived as a static analysis technique for a variant of the π-calculus.
The aim of session types is to ensure safety properties for
sessions, such as the absence of communication errors (no
type mismatch in exchanged data) and deadlock-freedom (no
standstill until all participants are terminated). When describing multiparty
protocols, session types often also target the liveness property
of progress or lock-freedom (no participant waits
forever).
While binary sessions can be described by a single session type,
multiparty sessions require two kinds of types: a global type
that describes the whole session protocol, and local types
that describe the individual contributions of the participants to
the protocol. The key requirement to achieve safety properties such
as deadlock-freedom is that the local types of the processes
implementing the participants be obtained as projections from the
same global type. To ensure progress, global types must satisfy
additional well-formedness requirements.
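The relation between a global type and its local types can be sketched on a deliberately restricted case. The sketch below assumes a global type that is a plain sequence of messages, without choice or recursion (for which projection needs side conditions not shown here); the tuple encodings, the `project` function and the `purchase` protocol are hypothetical.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]      # (sender, receiver, payload type)
LocalAction = Tuple[str, str, str]  # ("send" | "recv", partner, payload type)


def project(global_type: List[Message], participant: str) -> List[LocalAction]:
    """Project a choice-free global type onto one participant."""
    local = []
    for sender, receiver, payload in global_type:
        if sender == participant:
            local.append(("send", receiver, payload))
        elif receiver == participant:
            local.append(("recv", sender, payload))
        # Messages that do not involve the participant are skipped.
    return local


# Hypothetical three-party protocol, used only as an example.
purchase = [
    ("buyer", "seller", "str"),   # buyer sends a title to the seller
    ("seller", "buyer", "int"),   # seller answers with a price
    ("buyer", "bank", "int"),     # buyer forwards the price to the bank
]
```

Projecting `purchase` onto each of `buyer`, `seller` and `bank` yields three local types obtained from the same global type, which is the requirement stated above.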
What makes session types particularly attractive is that they offer several advantages at once: 1) static safety guarantees, 2) automatic check of protocol implementation correctness, based on local types, and 3) a strong connection with linear logics and with concurrency models such as communicating automata, graphical choreographies and message-sequence charts.
During the past year we have further investigated the relationship between multiparty session types and concurrency models, focussing on Event Structures 27, a canonical model for concurrent computation with explicit notions of causality and concurrency. We have also addressed the issue of input races in multiparty sessions, and proposed a new type system that accepts some kinds of “innocuous” input races, thus enlarging the class of protocols that can be specified by session types.
Like most of our previous work on this subject, this research has been pursued in collaboration with colleagues from the Universities of Eastern Piedmont and Turin.
We proposed a denotational semantics for multiparty session calculi by means of Event Structures (ESs), a well-known concurrency model introduced in the early 1980s 28, 26.
We considered a core multiparty session calculus with
synchronous communication, where sessions are described as
networks of sequential processes (each process implementing a
participant), equipped with standard global types. We proposed an
interpretation of networks as Flow Event Structures (FESs)
25, a subclass of Winskel's Stable Event Structures
28, as well as an interpretation of global types as
Prime Event Structures (PESs) 26, the simplest
class of ESs. Concurrency between network communications may be
directly reflected in the events of the associated FES. On the other
hand, since global types are sequential specifications, which are not
able to explicitly represent concurrency between communications, the
events of the associated PES need to be defined as equivalence classes
of communication sequences up to permutation equivalence. We
showed that when a network is typable with a global type, the FES
semantics of the former is equivalent to the PES semantics of its
type.
This work has been published in the journal JLAMP 12.
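Permutation equivalence of communication sequences can be illustrated with a small sketch. It assumes that two adjacent communications may be swapped exactly when they share no participant; this independence criterion, the tuple encoding and the function names are assumptions of the sketch, and the precise definitions are those of the paper.

```python
from typing import FrozenSet, Tuple

Comm = Tuple[str, str, str]  # (sender, receiver, label)
Trace = Tuple[Comm, ...]


def independent(a: Comm, b: Comm) -> bool:
    """Assumed criterion: the two communications share no participant."""
    return {a[0], a[1]}.isdisjoint({b[0], b[1]})


def equivalence_class(trace: Trace) -> FrozenSet[Trace]:
    """All traces reachable by repeatedly swapping adjacent independent communications."""
    seen = {trace}
    todo = [trace]
    while todo:
        current = todo.pop()
        for i in range(len(current) - 1):
            if independent(current[i], current[i + 1]):
                swapped = current[:i] + (current[i + 1], current[i]) + current[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    todo.append(swapped)
    return frozenset(seen)
```

Two sequences denote the same class when `equivalence_class` returns the same set for both. The exhaustive search is exponential in the worst case and is only meant for small examples.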
The original papers on multiparty session types
imposed strong restrictions on the syntax of global types, requiring
all initial communications in the branches of a choice to have the
same sender and the same receiver, and every other participant to be
independent from the choice, i.e., to have the same behaviour in all
branches. Although these were useful simplifying assumptions in order
to achieve multiparty session correctness, they limited the
expressiveness of global types, ruling out relevant protocols. For
this reason, more permissive choice constructors were investigated in
subsequent work.
However, input races, namely the possibility for a receiver
to choose between inputs from different senders, continued
to be viewed as problematic and to be forbidden by typing. As a
consequence, common protocols such as a server shared by different
clients could not be specified by global types.
In paper 16, we propose a more flexible
type system for asynchronous multiparty sessions, which allows two
kinds of innocuous input races, which we call respectively
confluent races and fake races, while still rejecting
dangerous races that could lead to deadlock or starvation.
Cross-site Scripting (XSS) is one of the most dangerous software weaknesses due to its constant popularity through the years. Several dynamic and static approaches for detection and prevention have been explored in the past. In this work, we explore static approaches to detect XSS vulnerabilities using neural networks. We compare two different code representations based on Natural Language Processing (NLP) and Programming Language Processing (PLP) and experiment with models based on different neural network architectures for static analysis detection in PHP and Node.js. We train and evaluate the models using synthetic databases. Using the generated PHP and Node.js databases, we compare our results with a well-known static analyzer for PHP code, ProgPilot, and a known scanner for Node.js, AppScan static mode. Our analyzers using neural networks improve on the results of existing tools in all cases.
This work was part of the PhD thesis of Héloise Maurel, defended in November 2022. It is described in the thesis 23 and in two publications: 18 and a journal article to appear.
We tackle the problem of designing efficient binary-level verification for a subset of information flow properties encompassing constant-time and secret-erasure. These properties are crucial for cryptographic implementations, but are generally not preserved by compilers. Our proposal builds on relational symbolic execution enhanced with new optimizations dedicated to information flow and binary-level analysis, yielding a dramatic improvement over prior work based on symbolic execution. We implement a prototype, Binsec/Rel, for bug-finding and bounded-verification of constant-time and secret-erasure, and perform extensive experiments on a set of 338 cryptographic implementations, demonstrating the benefits of our approach. Using Binsec/Rel, we also automate two prior manual studies on preservation of constant-time and secret-erasure by compilers for a total of 4148 and 1156 binaries respectively. Interestingly, our analysis highlights incorrect usages of volatile data pointers for secret erasure and shows that scrubbing mechanisms based on volatile function pointers can introduce additional register spilling which might break secret-erasure. We also discovered that gcc -O0 and backend passes of clang introduce violations of constant-time in implementations that were previously deemed secure by a state-of-the-art constant-time verification tool operating at LLVM level, showing the importance of reasoning at binary-level. We have published this work in an important journal for computer security, TOPS
13.
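The constant-time property is relational: two executions with the same public input but different secrets must follow the same control flow. The toy sketch below illustrates that idea only. It is not how Binsec/Rel works (which relies on relational symbolic execution of binaries), it assumes inputs of equal length, and since Python offers no timing guarantees it compares recorded branch conditions rather than timings. All function names are hypothetical.

```python
def leaky_equal(secret: bytes, guess: bytes, trace: list) -> bool:
    """Early-exit comparison: the control flow depends on the secret."""
    for s, g in zip(secret, guess):
        mismatch = s != g
        trace.append(mismatch)  # record the secret-dependent branch condition
        if mismatch:
            return False
    return True


def ct_equal(secret: bytes, guess: bytes, trace: list) -> bool:
    """Accumulating comparison: no branch depends on the secret."""
    diff = 0
    for s, g in zip(secret, guess):
        diff |= s ^ g
        trace.append(None)  # one entry per iteration, independent of the data
    return diff == 0


def same_trace(fn, secret_a: bytes, secret_b: bytes, public: bytes) -> bool:
    """Run fn on two secrets with the same public input and compare traces."""
    trace_a, trace_b = [], []
    fn(secret_a, public, trace_a)
    fn(secret_b, public, trace_b)
    return trace_a == trace_b
```

A single pair of secrets with differing traces is enough to show a violation, whereas equal traces on a few pairs prove nothing. This is the gap that bounded verification over all secrets aims to close, and the source-level view used here says nothing about what a compiler does to the code.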
Nowadays most applications are distributed, that is, they run on several computers: a mobile device for the graphical user interface; a gateway for storing data in a local area; a remote server of a large cloud platform for resource-demanding computing; an object connected to the Internet in the IoT (Internet of Things); etc. For many different reasons, this makes programming much more difficult than it was when only a single computer was involved.
The Indes, Northwestern, and Collège de France teams are studying programming languages and have each created complementary solutions that address the aforementioned problems. Combined, they could lead to a robust and secure execution environment for web and IoT programming. Indes will bring its expertise in secure web programming, Collège de France its expertise in synchronous reactive programming, and Northwestern its expertise in secure execution environments and run-time validation of security properties of program executions. Finally, Northwestern will contribute its expertise in medical prescriptions, which will be the main application domain of the secure execution environment the participants aim to develop.
The main objective of the collaboration is the development of a robust and secure integrated programming environment for reactive applications suitable for web and IoT applications. The programming of medical prescriptions will be our favored application domain. We will base our work on three pillars: Hop.js, the contract system designed for the Racket language, and HipHop.js, a domain specific language for reactive programming within Hop.js.
SPARTA project on cordis.europa.eu
The CISC project (Certified IoT Secure Compilation) is funded by the ANR for 42 months, ending in September 2023. The goal of the CISC project is to provide strong security and privacy guarantees for IoT applications by means of a language to orchestrate IoT applications from the microcontroller to the cloud. Tamara Rezk coordinates this project, and Manuel Serrano, Ilaria Castellani and Nataliia Bielova participate in it. The partners of this project are the Inria teams Celtique, Indes and Privatics, and Collège de France.
Tamara Rezk organized and chaired PLMW at PLDI'22.
Tamara Rezk taught 56 hours ETD of courses at Université Côte d'Azur, at master's level.
Tamara Rezk participated in the W@PLDI panel at PLDI'22.