Section: New Results

Foundations of information hiding

Information hiding refers to the problem of protecting private information while performing certain tasks or interactions, so that an adversary cannot infer it. This is one of the main areas of research in Comète; we are exploring several topics, described below.

Information Leakage Games

In [19] we studied a game-theoretic setting to model the interplay between attacker and defender in the context of information flow, and to reason about their optimal strategies. In contrast with standard game theory, in our games the utility of a mixed strategy is a convex function of the distribution on the defender's pure actions, rather than the expected value of their utilities. Nevertheless, the important properties of game theory, notably the existence of a Nash equilibrium, still hold for our (zero-sum) leakage games, and we provided algorithms to compute the corresponding optimal strategies. As is typical in (simultaneous) game theory, the optimal strategy is usually mixed, i.e., probabilistic, for both the attacker and the defender. From the point of view of information flow, this was to be expected in the case of the defender, since it is well known that randomization at the level of the system design may help to reduce information leaks. Regarding the attacker, however, this seems to be the first work in the information flow literature to prove formally that in certain cases the optimal attack strategy is necessarily probabilistic.
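The difference between a convex utility and an expected utility can be seen on a toy example. The sketch below is not taken from [19]: it assumes a one-bit secret with a uniform prior, uses posterior Bayes vulnerability as the leakage measure, and lets the defender secretly choose between two hypothetical channels (IDENTITY and SWAP). All names are illustrative.

```python
def posterior_bayes_vulnerability(prior, channel):
    """Sum over outputs y of max_x prior[x] * channel[x][y]."""
    n_outputs = len(channel[0])
    return sum(
        max(prior[x] * channel[x][y] for x in range(len(prior)))
        for y in range(n_outputs)
    )


def hidden_mix(p, c1, c2):
    """Channel seen by an attacker who cannot observe the defender's choice:
    c1 is run with probability p, c2 with probability 1 - p."""
    return [
        [p * a + (1 - p) * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(c1, c2)
    ]


# Assumed toy setting: one-bit secret, uniform prior, two deterministic channels.
PRIOR = [0.5, 0.5]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # output equals the secret
SWAP = [[0.0, 1.0], [1.0, 0.0]]      # output is the flipped secret


def mixed_vulnerability(p):
    """Vulnerability of the defender's mixed strategy (hidden choice)."""
    return posterior_bayes_vulnerability(PRIOR, hidden_mix(p, IDENTITY, SWAP))


def expected_vulnerability(p):
    """What standard game theory would use: the average of the pure utilities."""
    v1 = posterior_bayes_vulnerability(PRIOR, IDENTITY)
    v2 = posterior_bayes_vulnerability(PRIOR, SWAP)
    return p * v1 + (1 - p) * v2


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, mixed_vulnerability(p), expected_vulnerability(p))
```

In this toy setting each pure action leaks the secret completely, so the expected value is constant, whereas the vulnerability of the hidden mixture dips in the middle of the range. This is the kind of gap that makes randomization useful to the defender.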

Efficient Utility Improvement for Location Privacy

The continuously increasing use of location-based services poses an important threat to the privacy of users. A natural defense is to employ an obfuscation mechanism, such as those providing geo-indistinguishability [24], a framework for obtaining formal privacy guarantees that has become popular in recent years. Ideally, one would like to employ an optimal obfuscation mechanism, providing the best utility among those satisfying the required privacy level. In theory, optimal mechanisms can be constructed via linear programming. In practice, however, this is only feasible for a very small number of locations. As a consequence, all known applications of geo-indistinguishability simply use noise drawn from a planar Laplace distribution.
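For reference, planar Laplace noise can be sampled in polar coordinates. The following is a minimal sketch, assuming locations are given in planar coordinates (for instance meters) and that epsilon is expressed per unit of distance; the function names are illustrative and not taken from [24].

```python
import math
import random


def planar_laplace_noise(epsilon, rng=random):
    """Draw a 2D offset whose density is proportional to exp(-epsilon * r).

    In polar coordinates the angle is uniform and the radius has density
    epsilon^2 * r * exp(-epsilon * r), i.e. a Gamma distribution with
    shape 2 and scale 1 / epsilon.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return radius * math.cos(theta), radius * math.sin(theta)


def obfuscate(x, y, epsilon, rng=random):
    """Report a noisy version of the planar location (x, y)."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return x + dx, y + dy


if __name__ == "__main__":
    # Assumed example value: epsilon = 0.01 per meter, so the mean displacement is 200 m.
    print(obfuscate(0.0, 0.0, epsilon=0.01))
```

The expected displacement is 2 / epsilon, which gives a quick way to relate the privacy parameter to the utility loss of the plain mechanism.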

In [12], we studied methods for substantially improving the utility of location obfuscation, while maintaining practical applicability as a main goal. We provided such solutions both for infinite (continuous or discrete) and for large but finite domains of locations, using a Bayesian remapping procedure as a key ingredient. We evaluated our techniques on two complete real-world datasets, without any restriction on the evaluation area, and showed important utility improvements with respect to the standard planar Laplace approach.
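The idea of Bayesian remapping can be sketched on a finite set of candidate locations. The code below is a generic illustration and not the procedure of [12]: it assumes a known prior over the candidates, planar Laplace noise with parameter epsilon, and Euclidean distance as the utility loss; all names are hypothetical. Since the remapping only looks at the reported point, it is a post-processing step and does not affect the privacy guarantee of the underlying mechanism.

```python
import math


def bayesian_remap(z, locations, prior, epsilon):
    """Replace the reported point z by the candidate location that minimizes
    the expected distance to the true location under the posterior given z."""
    distances = [math.dist(x, z) for x in locations]
    d_min = min(distances)  # shift exponents to avoid floating-point underflow
    # Planar Laplace likelihood of z given x is proportional to exp(-epsilon * d(x, z)).
    weights = [p * math.exp(-epsilon * (d - d_min)) for p, d in zip(prior, distances)]
    total = sum(weights)
    posterior = [w / total for w in weights]

    def expected_loss(candidate):
        return sum(q * math.dist(x, candidate) for x, q in zip(locations, posterior))

    return min(locations, key=expected_loss)


if __name__ == "__main__":
    # Assumed toy data: two places, the first one visited far more often.
    places = [(0.0, 0.0), (100.0, 0.0)]
    prior = [0.9, 0.1]
    print(bayesian_remap((40.0, 0.0), places, prior, epsilon=0.05))
```

With a prior concentrated on frequently visited places, a noisy point that falls between two candidates is pulled toward the more likely one, which is where the utility gain comes from.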

Trading Optimality for Performance in Location Privacy

Location-Based Services (LBSs) provide invaluable aid in the everyday activities of many individuals; however, they also pose serious threats to users' privacy. There is, therefore, a growing interest in the development of mechanisms to protect location privacy during the use of LBSs. Nowadays, the most popular methods are probabilistic, and the so-called optimal method achieves an optimal trade-off between privacy and utility by using linear optimization techniques.

Unfortunately, due to the complexity of linear programming, the method is infeasible for a large number N of locations, because the number of constraints is O(N^3). In [20], we have proposed a technique to reduce the number of constraints to O(N^2), at the price of giving up perfect optimality. We have shown, however, that in practical situations the utility loss is quite acceptable, while the gain in performance is significant.

Methods for Location Privacy: A comparative overview

The growing popularity of location-based services, which allow the collection of huge amounts of information regarding users' locations, has started raising serious privacy concerns. In [13] we analyzed the various kinds of privacy breaches that may arise in connection with the use of location-based services, and we surveyed and compared the metrics and the mechanisms that have been proposed in the literature.

Quantifying Leakage in the Presence of Unreliable Sources of Information

Belief and min-entropy leakage are two well-known approaches to quantify information flow in security systems. Both concepts stand as alternatives to the traditional approaches founded on Shannon entropy and mutual information, which were shown to provide inadequate security guarantees. In [16] we unified the two concepts in one model so as to cope with the side information that attackers frequently hold about individuals, which is potentially inaccurate, misleading or outdated, and which comes from social networks, online forums, blogs and other forms of online communication and information sharing. To this end we proposed a new metric based on min-entropy that takes into account the adversary's beliefs.

Differential Inference Testing: A Practical Approach to Evaluate Anonymized Data

In order to protect individuals' privacy, governments and institutions impose some obligations on data sharing and publishing. Mainly, they require the data to be “anonymized”. In this paper, we have briefly discussed the criteria introduced by the European General Data Protection Regulation to assess anonymized data. We have argued that the evaluation of anonymized data should be based on whether the data allows individual-based inferences, instead of being centered around the concept of re-identification as the regulation has proposed.

Then, we have proposed an inference-based framework that can be used to evaluate the robustness of a given anonymized dataset against a specific inference model, e.g. a machine learning model.

Our approach evaluates the anonymized data itself, and treats the related anonymization technique as a black box. Thus, it can be used to assess datasets that are anonymized by organizations which may prefer not to provide access to their techniques. Finally, we have used our framework to evaluate two datasets anonymized using k-anonymity and l-diversity.

Formal Analysis and Offline Monitoring of Electronic Exams

More and more universities are moving toward electronic exams (e-exams for short). This migration exposes exams to additional threats, which may come from the use of information and communication technology. In [17], we have identified and defined several security properties for e-exam systems. Then, we have shown how to use these properties in two complementary approaches: model-checking and monitoring.

We have illustrated the validity of our definitions by analyzing a real e-exam used at the pharmacy faculty of University Grenoble Alpes (UGA) to assess students. On the one hand, we have instantiated our properties as queries for ProVerif, an automatic verifier for cryptographic protocols based on a process calculus, and we have used it to check our model of the UGA exam specifications. ProVerif found some attacks. On the other hand, we have expressed our properties as Quantified Event Automata (QEAs), and we have synthesized them into monitors using MarQ, a Java tool designed to implement QEAs. Then, we have used these monitors to verify real exam executions conducted by UGA. Our monitors found fraudulent students and discrepancies between the UGA exam specifications and their implementation.

On the Compositionality of Quantitative Information Flow

In the min-entropy approach to quantitative information flow, the leakage is defined in terms of a minimization problem, which, in the case of large systems, can be computationally rather heavy. The same happens for the recently proposed generalization called g-vulnerability. In [18] we studied the case in which the channel associated with the system can be decomposed into simpler channels, which typically happens when the observables consist of several components. Our main contribution is the derivation of bounds on the g-leakage of the whole system in terms of the g-leakages of its components. We also considered the particular cases of min-entropy leakage and of parallel channels, generalizing and systematizing results from the literature. We demonstrated the effectiveness of our method and evaluated the precision of our bounds using examples.
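To give a feeling for this kind of compositional bound, the sketch below checks a simple special case that is known from the literature, not the g-leakage bounds of [18]: under a uniform prior, the min-entropy leakage of two channels run in parallel on the same secret is at most the sum of their individual leakages. The channel matrices are made-up examples and the function names are illustrative.

```python
import math
from itertools import product


def min_entropy_leakage_uniform(channel):
    """Min-entropy leakage in bits under a uniform prior.

    For a uniform prior, the ratio of posterior to prior Bayes vulnerability
    equals the sum of the column maxima of the channel matrix.
    """
    n_outputs = len(channel[0])
    return math.log2(sum(max(row[y] for row in channel) for y in range(n_outputs)))


def parallel(c1, c2):
    """Parallel composition: both channels receive the same secret and the
    attacker observes the pair of outputs (y1, y2)."""
    return [
        [a * b for a, b in product(row1, row2)]
        for row1, row2 in zip(c1, c2)
    ]


# Made-up component channels on a one-bit secret (rows: secrets, columns: outputs).
C1 = [[0.8, 0.2], [0.3, 0.7]]
C2 = [[0.6, 0.4], [0.1, 0.9]]

if __name__ == "__main__":
    l1 = min_entropy_leakage_uniform(C1)
    l2 = min_entropy_leakage_uniform(C2)
    whole = min_entropy_leakage_uniform(parallel(C1, C2))
    print(f"components: {l1:.3f} + {l2:.3f} bits, composed system: {whole:.3f} bits")
```

The composed matrix has as many columns as there are pairs of component outputs, so its size grows multiplicatively with the number of components. Computing leakage on the components alone and combining the results avoids building it.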