The scientific objectives of ASPI are the design, analysis and implementation of interacting Monte Carlo methods, also known as particle methods, with a focus on

statistical inference in hidden Markov models and particle filtering,

risk evaluation and simulation of rare events,

global optimization.

The whole subject is multidisciplinary, not only because of the many scientific and engineering areas in which particle methods are used, but also because of the diversity of the scientific communities which have already contributed to establishing the foundations of the field:

target tracking, interacting particle systems, empirical processes, genetic algorithms (GA), hidden Markov models and nonlinear filtering, Bayesian statistics, Markov chain Monte Carlo (MCMC) methods, etc.

Intuitively speaking, interacting Monte Carlo methods are sequential simulation methods, in which particles

*explore* the state space by mimicking the evolution
of an underlying random process,

*learn* their environment by evaluating a fitness function,

and *interact* so that only the most successful particles
(in view of the fitness function) are allowed to survive
and to produce offspring at the next generation.

The effect of this mutation / selection mechanism is to automatically concentrate particles (i.e. the available computing power) in regions of interest of the state space. In the special case of particle filtering, which has numerous applications under the generic heading of positioning, navigation and tracking, in

target tracking, computer vision, mobile robotics, wireless communications, ubiquitous computing and ambient intelligence, sensor networks, etc.,

each particle represents a possible hidden state, and is replicated or terminated at the next generation on the basis of its consistency with the current observation, as quantified by the likelihood function. With these genetic–type algorithms, it becomes easy to efficiently combine a prior model of displacement with or without constraints, sensor–based measurements, and a base of reference measurements, for example in the form of a digital map (digital elevation map, attenuation map, etc.). In the most general case, particle methods provide approximations of Feynman–Kac distributions, a pathwise generalization of Gibbs–Boltzmann distributions, by means of the weighted empirical probability distribution associated with an interacting particle system, with applications that go far beyond filtering, in

simulation of rare events, global optimization, molecular simulation, etc.

The main applications currently considered are geolocalisation and tracking of mobile terminals, terrain–aided navigation, data fusion for indoor localisation, optimization of sensor location and activation, risk assessment in air traffic management, and protection of digital documents.

Monte Carlo methods are numerical methods that are widely used
in situations where
(i) a stochastic (usually Markovian) model is given for some underlying
process, and (ii) some quantity of interest should be evaluated, that
can be expressed in terms of the expected value of a functional of the
process trajectory, which includes as an important special case the
probability that a given event has occurred.
Numerous examples can be found, e.g. in financial engineering (pricing of options and derivative
securities),
in performance evaluation of communication networks (probability of buffer
overflow), in statistics of hidden Markov models (state estimation,
evaluation of contrast and score functions), etc.
Very often in practice, no analytical expression is available for
the quantity of interest, but it is possible to simulate trajectories
of the underlying process. The idea behind Monte Carlo methods is
to generate independent trajectories of this process
or of an alternate instrumental process,
and to build an approximation (estimator) of the quantity of interest
in terms of the weighted empirical probability distribution
associated with the resulting independent sample.
By the law of large numbers, the above estimator converges
as the size $N$ of the sample goes to infinity.
However, the trajectories are generated *blindly*,
and only afterwards are the corresponding weights evaluated.
Some of the weights can happen to be negligible, in which case the
corresponding trajectories are not going to contribute to the estimator,
i.e. computing power has been wasted.
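
As a minimal sketch of the weighted estimator described above, the following toy example estimates the tail probability of a standard Gaussian variable by sampling from an alternate instrumental distribution. The target, the instrumental distribution and the function names are assumptions chosen for illustration only. The second returned value is the fraction of samples that actually contribute to the estimator: with `shift=0.0` (independent sampling from the model itself) almost every sample is wasted.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of the Gaussian distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_tail(threshold, shift, n, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1), sampling from N(shift, 1).

    Returns the estimate and the fraction of samples with a nonzero
    contribution, i.e. the samples that fall in the event of interest.
    """
    total = 0.0
    contributing = 0
    for _ in range(n):
        y = rng.gauss(shift, 1.0)  # draw from the instrumental distribution
        if y > threshold:
            # importance weight = target density / instrumental density
            total += normal_pdf(y) / normal_pdf(y, mean=shift)
            contributing += 1
    return total / n, contributing / n


if __name__ == "__main__":
    rng = random.Random(0)
    print(importance_sampling_tail(4.0, 0.0, 20000, rng))  # blind sampling
    print(importance_sampling_tail(4.0, 4.0, 20000, rng))  # shifted sampling
```

The shift towards the rare region is chosen by hand here; the interacting methods discussed next aim at concentrating the samples automatically.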

A recent and major breakthrough
has been the introduction of interacting Monte Carlo methods,
also known as sequential Monte Carlo (SMC) methods,
in which a whole (possibly weighted) sample,
called *system of particles*, is propagated in time, where
the particles

*explore* the state space under the effect of
a *mutation* mechanism which mimics the evolution of the
underlying process,

and are *replicated* or *terminated*, under
the effect of a *selection* mechanism which automatically
concentrates the particles, i.e. the available computing power,
into regions of interest of the state space.

In full generality, the underlying process is a discrete–time Markov chain, whose state space can be

finite, continuous, hybrid (continuous / discrete), graphical, constrained, time varying, pathwise, etc.,

the only condition being that it can easily be *simulated*.

In the special case of particle filtering,
originally developed within the tracking community,
the algorithms yield a numerical approximation of the optimal Bayesian
filter, i.e. of the conditional probability distribution
of the hidden state given the past observations, as a (possibly
weighted) empirical probability distribution of the system of particles.
In its simplest version, introduced in several different scientific
communities under the name of
*bootstrap filter*,
*Monte Carlo filter*
or *condensation* (conditional density propagation)
algorithm,
and which historically has been the first algorithm to include
a redistribution step,
the selection mechanism is governed by the likelihood function:
at each time step, a particle is more likely to survive
and to replicate at the next generation if it is consistent with
the current observation.
The algorithms also provide as a by–product a numerical approximation
of the likelihood function, and of many other contrast functions for
parameter estimation in hidden Markov models, such as the prediction
error or the conditional least–squares criterion.
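
The mutation / selection loop of the simplest version can be sketched as follows. This is an illustrative sketch only: the scalar linear Gaussian model, its parameters `a`, `sigma_state`, `sigma_obs` and the function names are assumptions made for the example, not a model taken from the text. The function also accumulates the by–product mentioned above, a numerical approximation of the log–likelihood of the observations.

```python
import math
import random


def gauss_pdf(x, sigma):
    """Density of the centered Gaussian distribution N(0, sigma^2) at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def bootstrap_filter(observations, n_particles, rng,
                     a=0.9, sigma_state=1.0, sigma_obs=0.5):
    """Bootstrap filter for the toy model
        X_k = a X_{k-1} + sigma_state V_k,   Y_k = X_k + sigma_obs W_k,
    with X_0 ~ N(0, 1) and independent standard Gaussian noises V_k, W_k.

    Returns the approximate filter means and the log-likelihood estimate.
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    log_likelihood = 0.0
    for y in observations:
        # mutation: each particle mimics the evolution of the hidden state
        particles = [a * x + rng.gauss(0.0, sigma_state) for x in particles]
        # weighting: consistency with the current observation (likelihood)
        weights = [gauss_pdf(y - x, sigma_obs) for x in particles]
        total = sum(weights)
        # by-product: the average weight estimates the predictive density of y
        log_likelihood += math.log(total / n_particles)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # selection: multinomial redistribution according to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means, log_likelihood
```

Multinomial redistribution at every time step is the simplest choice; other redistribution schemes are among the algorithmic variants that exist.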

Particle methods are currently being used in many scientific and engineering areas:

positioning, navigation, and tracking, visual tracking, mobile robotics, ubiquitous computing and ambient intelligence, sensor networks, risk evaluation and simulation of rare events, genetics, molecular simulation, etc.

Other examples of the many applications of particle filtering can be
found in a contributed volume, in the special
issue of *IEEE Transactions on Signal Processing* devoted
to *Monte Carlo Methods for Statistical Signal Processing*
in February 2002,
which includes a tutorial paper,
and in a textbook devoted
to applications in target tracking.
Applications of sequential Monte Carlo methods to other areas,
beyond signal and image processing, e.g. to genetics,
have also been described, and recent overviews are available.

Particle methods are very easy to implement, since it is sufficient in principle to simulate independent trajectories of the underlying process. The whole subject is multidisciplinary, not only because of the already mentioned diversity of the scientific and engineering areas in which particle methods are used, but also because of the diversity of the scientific communities which have contributed to establishing the foundations of the field:

target tracking, interacting particle systems, empirical processes, genetic algorithms (GA), hidden Markov models and nonlinear filtering, Bayesian statistics, Markov chain Monte Carlo (MCMC) methods.

These algorithms can be interpreted as numerical approximation schemes
for Feynman–Kac distributions, a pathwise generalization of Gibbs–Boltzmann
distributions,
in terms of the weighted empirical probability distribution
associated with a system of particles.
This abstract point of view
has proved to be extremely fruitful in providing a very general
framework to the design and analysis of numerical approximation schemes,
based on systems of branching and / or interacting particles,
for nonlinear dynamical systems with values in the space of probability
distributions, associated with Feynman–Kac distributions.
Many asymptotic results, including convergence results, have been proved as the number $N$ of particles goes to infinity.

The objective here is to systematically study the impact of the many algorithmic variants on the convergence results.
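
For reference, the Feynman–Kac distributions referred to above are usually written as follows. The notation is an assumption introduced here for illustration: $X_0, \dots, X_n$ denotes the underlying Markov chain, $g_k \geq 0$ are fitness (potential) functions, and $(\xi_{0:n}^i, w_n^i)$ are the particle trajectories and their normalized weights.

$$
\gamma_n(f) = \mathbb{E}\Big[ f(X_{0:n}) \prod_{k=0}^{n} g_k(X_k) \Big],
\qquad
\eta_n(f) = \frac{\gamma_n(f)}{\gamma_n(1)},
\qquad
\eta_n(f) \approx \eta_n^N(f) = \sum_{i=1}^{N} w_n^i \, f(\xi_{0:n}^i).
$$

In particle filtering, taking $g_k$ to be the likelihood of the $k$–th observation makes $\gamma_n(1)$ the likelihood of the observations up to time $n$, and $\eta_n$ the conditional probability distribution of the hidden trajectory given these observations.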

Hidden Markov models (HMM) form a special case of partially observed stochastic dynamical systems, in which the state of a Markov process (in discrete or continuous time, with finite or continuous state space) should be estimated from noisy observations. The conditional probability distribution of the hidden state given past observations is a well–known example of a normalized (nonlinear) Feynman–Kac distribution. These models are very flexible, because the introduction of latent (unobserved) variables makes it possible to model complex time–dependent structures, to take constraints into account, etc. In addition, the underlying Markovian structure makes it possible to use numerical algorithms (particle filtering, Markov chain Monte Carlo (MCMC) methods, etc.) which are computationally intensive but whose complexity is rather small. Hidden Markov models are widely used in various applied areas, such as speech recognition, alignment of biological sequences, tracking in complex environments, modeling and control of networks, digital communications, etc.

Beyond the recursive estimation of a hidden state from noisy observations, the problem arises of statistical inference of HMM with general state space, including estimation of model parameters, early monitoring and diagnosis of small changes in model parameters, etc.

**Large time asymptotics** A fruitful approach is the asymptotic study, when the observation
time increases to infinity, of an extended Markov chain, whose
state includes (i) the hidden state, (ii) the observation,
(iii) the prediction filter (i.e. the conditional probability
distribution of the hidden state given observations at all previous
time instants), and possibly (iv) the derivative of the prediction
filter with respect to the parameter.
Indeed, it is easy to express the log–likelihood function,
the conditional least–squares criterion, and many other classical
contrast processes, as well as their derivatives with respect to
the parameter, as additive functionals of the extended Markov chain.
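
To make the additive structure concrete, here is a minimal sketch for a finite state space, in which the prediction filter is an explicit probability vector. The Gaussian emission model and all names (`initial`, `transition`, `means`, `sigma`) are assumptions made for the example. Each term of the sum depends only on the current observation and on the current prediction filter, which are two components of the extended Markov chain.

```python
import math


def gauss_pdf(y, mean, sigma):
    """Density of the Gaussian distribution N(mean, sigma^2) at y."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(observations, initial, transition, means, sigma):
    """Log-likelihood of a finite-state HMM with Gaussian emissions.

    `initial[i]` is the probability of state i at the first observation time,
    `transition[i][j]` the probability of moving from state i to state j, and
    an observation in state i is distributed as N(means[i], sigma^2).
    Returns the log-likelihood and the final prediction filter.
    """
    n_states = len(initial)
    prediction = list(initial)  # prediction filter: state given past observations
    log_likelihood = 0.0
    for y in observations:
        joint = [p * gauss_pdf(y, m, sigma) for p, m in zip(prediction, means)]
        norm = sum(joint)  # predictive density of y given past observations
        log_likelihood += math.log(norm)  # one additive term per time step
        filtered = [v / norm for v in joint]  # correction by the observation
        prediction = [
            sum(filtered[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return log_likelihood, prediction
```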

The following general approach has been proposed:

first, prove an exponential stability property (i.e. an exponential forgetting property of the initial condition) of the prediction filter and its derivative, for a misspecified model,

from this, deduce a geometric ergodicity property and the existence of a unique invariant probability distribution for the extended Markov chain, hence a law of large numbers and a central limit theorem for a large class of contrast processes and their derivatives, and a local asymptotic normality property,

finally, obtain the consistency (i.e. the convergence to the set of minima of the associated contrast function), and the asymptotic normality of a large class of minimum contrast estimators.

This programme has been completed in the case of a finite state space, and has been generalized under a uniform minorization assumption for the Markov transition kernel, which typically only holds when the state space is compact. Clearly, the whole approach relies on the existence of an exponential stability property of the prediction filter, and the main challenge currently is to get rid of this uniform minorization assumption for the Markov transition kernel, so as to be able to consider more interesting situations, where the state space is noncompact.

**Small noise asymptotics** Another asymptotic approach can also be used, where it is rather easy
to obtain interesting explicit results, in terms close to the language
of nonlinear deterministic control theory.
Taking the simple example where the hidden state is the solution to
an ordinary differential equation, or a nonlinear state model, and
where the observations are subject to additive Gaussian white noise,
this approach consists in assuming that the covariance matrices
of the state noise and of the observation noise go simultaneously
to zero. Although it is reasonable in many applications to consider that
noise covariances are small, this asymptotic approach is less natural
than the large time asymptotics, where it is enough (provided a
suitable ergodicity assumption holds) to accumulate observations
and to see the expected limit laws (law of large numbers, central
limit theorem, etc.). On the other hand, the expressions obtained in the
limit (Kullback–Leibler divergence, Fisher information matrix, asymptotic
covariance matrix, etc.) take here a much more explicit form than in the
large time asymptotics.

The following results have been obtained using this approach

the consistency of the maximum likelihood estimator (i.e. the convergence to the set

if the abovementioned set

it has been shown
that (i) the parameter dependent
probability distributions of the observations are locally asymptotically
normal (LAN), from which the asymptotic
normality of the maximum likelihood estimator follows, with an explicit
expression for the asymptotic covariance matrix, i.e. for the Fisher
information matrix

The estimation of the small probability of a rare but critical event is a crucial issue in industrial areas such as

nuclear power plants, food industry, telecommunication networks, finance and insurance industry, air traffic management, etc.

In such complex systems, analytical methods cannot be used, and naive Monte Carlo methods are clearly inefficient for accurately estimating very small probabilities. Besides importance sampling, an alternative widespread technique is multilevel splitting, where trajectories going towards the critical set are given offspring, thus increasing the number of trajectories that eventually reach the critical set. The Feynman–Kac formalism is well suited for the design and analysis of splitting algorithms for rare event simulation.

**Propagation of uncertainty** Multilevel splitting can be used in static situations. Here, the
objective is to learn the probability distribution of an output random
variable

The key issue is to learn as fast as possible the regions of the input space which contribute most to the computation of the target quantity. The proposed splitting method consists in (i) introducing a sequence of intermediate regions in the input space, implicitly defined by exceeding an increasing sequence of thresholds or levels, (ii) counting the fraction of samples that reach a level given that the previous level has already been reached, and (iii) improving the diversity of the selected samples, usually using an artificial Markovian dynamics. In this way, the algorithm learns

the transition probability between successive levels, hence the probability of reaching each intermediate level,

and the probability distribution of the input random variable, conditioned on the output variable reaching each intermediate level.

A further remark is that this conditional probability distribution is precisely the optimal (zero variance) importance distribution needed to compute the probability of reaching the considered intermediate level.
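
Steps (i) to (iii) can be sketched on a toy static problem. Everything specific below is an assumption made for illustration: the input is a standard Gaussian variable, the output is `score(x)` (the identity by default), the levels are fixed in advance, and the artificial Markovian dynamics is a random walk Metropolis kernel that leaves the Gaussian distribution restricted to the current intermediate region invariant.

```python
import math
import random


def splitting_static(levels, n_particles, n_moves, rng,
                     score=lambda x: x, step=0.5):
    """Fixed-level splitting estimate of P(score(X) > levels[-1]), X ~ N(0, 1).

    Returns the probability estimate and the final sample, approximately
    distributed as X conditioned on score(X) > levels[-1].
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    probability = 1.0
    for level in levels:  # (i) increasing sequence of thresholds
        survivors = [x for x in particles if score(x) > level]
        if not survivors:
            return 0.0, []  # extinction: no sample reached this level
        # (ii) fraction reaching this level, given the previous one was reached
        probability *= len(survivors) / n_particles
        particles = [rng.choice(survivors) for _ in range(n_particles)]
        # (iii) restore diversity with Metropolis moves inside the region
        for _ in range(n_moves):
            for i, x in enumerate(particles):
                proposal = x + rng.gauss(0.0, step)
                if score(proposal) <= level:
                    continue  # proposals leaving the region are rejected
                log_ratio = 0.5 * (x * x - proposal * proposal)
                if rng.random() < math.exp(min(0.0, log_ratio)):
                    particles[i] = proposal
    return probability, particles
```

For instance, `splitting_static([1.0, 2.0, 3.0, 4.0], 5000, 10, random.Random(0))` targets a probability of about $3.2 \times 10^{-5}$, a value for which naive Monte Carlo with the same number of samples would most likely return zero.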

**Rare event simulation** To be specific, consider a complex dynamical system modelled as a Markov
process, whose state can possibly contain continuous components and
finite components (mode, regime, etc.), and the objective is to
compute the probability, hopefully very small, that a critical region
of the state space is reached by the Markov process before a final
time

The proposed splitting method consists in (i) introducing a decreasing
sequence of intermediate, more and more critical, regions in the state
space, (ii) counting the fraction of trajectories that reach an
intermediate region before time

the branching rate (number of offspring allocated to a successful trajectory) is fixed, which allows for depth–first exploration of the branching tree, but raises the issue of controlling the population size,

the population size is fixed, which requires a breadth–first exploration of the branching tree, with random (multinomial) or deterministic allocation of offspring, etc.

Just as in the static case, the algorithm learns

the transition probability between successive levels, hence the probability of reaching each intermediate level,

and the entrance probability distribution of the Markov process in each intermediate region.
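
The fixed population size variant with multinomial allocation of offspring can be sketched on a toy Markov chain. The model is an assumption chosen for illustration: a random walk on the integers starting from 0, with upward probability `p_up` below one half, and the rare event is reaching the last level within `horizon` steps. Each particle stores the state and time at which its trajectory entered the current intermediate region, so the population approximates the entrance distribution.

```python
import random


def simulate_until(state, time, level, horizon, p_up, rng):
    """Run the walk from (state, time) until it reaches `level` or `horizon`."""
    while state < level and time < horizon:
        state += 1 if rng.random() < p_up else -1
        time += 1
    return state, time


def splitting_random_walk(levels, horizon, p_up, n_particles, rng):
    """Estimate the probability that the walk reaches levels[-1] by `horizon`."""
    particles = [(0, 0)] * n_particles  # (state, time) at entrance
    probability = 1.0
    for level in levels:  # more and more critical intermediate regions
        reached = []
        for state, time in particles:
            new_state, new_time = simulate_until(
                state, time, level, horizon, p_up, rng)
            if new_state >= level:
                reached.append((new_state, new_time))
        if not reached:
            return 0.0  # extinction: no trajectory reached the next region
        # transition probability between successive levels
        probability *= len(reached) / n_particles
        # fixed population size: multinomial allocation of offspring
        particles = [rng.choice(reached) for _ in range(n_particles)]
    return probability
```

The levels, the population size and the allocation rule are set by hand in this sketch; they are precisely the design parameters whose choice affects the asymptotic variance.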

Contributions have been made to

minimizing the asymptotic variance, obtained through a central limit theorem, with respect to the shape of the intermediate regions (selection of the importance function), to the thresholds (levels), to the population size, etc.

controlling the probability of extinction (when not even one trajectory reaches the next intermediate level),

designing and studying variants suited for hybrid state space (resampling per mode, marginalization, mode aggregation),

and in the static case, to

minimizing the asymptotic variance, obtained through a central limit theorem, with respect to intermediate levels, to the Metropolis kernel introduced in the mutation step, etc.

A related issue is global optimization. Indeed, the difficult problem
of finding the set

In pattern recognition and statistical learning, also known as machine
learning, nearest neighbor (NN) algorithms are amongst the simplest but
also very powerful algorithms available.
Basically, given a training set of data, i.e. an

In general, there is no way to guess exactly the value of the feature
associated with the new object, and the minimal error that can be made
is that of the Bayes estimator, which cannot be computed for lack of knowledge
of the distribution of the object–feature pair, but the Bayes estimator
can be useful to characterize the strength of the method.
So the best that can be expected is that the NN estimator converges, say
when the sample size

The asymptotic behavior when the sample size grows is well understood in finite dimension, but the situation is radically different in general infinite dimensional spaces, when the objects to be classified are functions, images, etc.
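
A minimal sketch of the nearest neighbor estimator for objects that are functions is given below. It assumes, for illustration only, that each curve is sampled on a common grid and compared with a discretized $L^2$ distance; the names `l2_distance` and `knn_estimate` are hypothetical. Any other distance between objects can be passed through the `distance` argument.

```python
import math


def l2_distance(f, g):
    """Discretized L2 distance between two curves sampled on the same grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))


def knn_estimate(training, query, k, distance=l2_distance):
    """Average the features of the k training objects closest to `query`.

    `training` is a list of (object, feature) pairs with numerical features.
    """
    neighbors = sorted(training, key=lambda pair: distance(pair[0], query))[:k]
    return sum(feature for _, feature in neighbors) / len(neighbors)


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    # toy training set: the curve t -> a * t carries the feature a
    training = [([a * t for t in grid], float(a)) for a in range(10)]
    query = [4.2 * t for t in grid]
    print(knn_estimate(training, query, k=3))
```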

**Nearest neighbor classification in infinite dimension** In finite dimension, the

**Rates of convergence of the functional $k$–nearest neighbor
estimator** Motivated by a broad range of potential applications, such as regression
on curves, rates of convergence of the

This topic has produced several theoretical advances, in collaboration with Gérard Biau (université Pierre et Marie Curie, ENS Paris and EPI CLASSIC, Inria Paris—Rocquencourt). A few possible target application domains have been identified in

the statistical analysis of recommendation systems,

the design of reduced–order models and analog samplers,

that would be a source of interesting problems.

Among the many application domains of particle methods, or interacting Monte Carlo methods, ASPI has decided to focus on applications in localisation (or positioning), navigation and tracking, which already covers a very broad spectrum of application domains. The objective here is to estimate the position (and also velocity, attitude, etc.) of a mobile object, from the combination of different sources of information, including

a prior dynamical model of typical evolutions of the mobile, such as inertial estimates and prior model for inertial errors,

measurements provided by sensors,

and possibly a digital map providing some useful feature (terrain altitude, power attenuation, etc.) at each possible position.

In some applications, another useful source of information is provided by

a map of constrained admissible displacements, for instance in the form of an indoor building map,

which particle methods can easily handle (map-matching). This Bayesian dynamical estimation problem is also called filtering, and its numerical implementation using particle methods, known as particle filtering, was introduced by the target tracking community, which has already contributed many of the most interesting algorithmic improvements and is still very active, and has found applications in

target tracking, integrated navigation, points and / or objects tracking in video sequences, mobile robotics, wireless communications, ubiquitous computing and ambient intelligence, sensor networks, etc.

ASPI is contributing (or has contributed recently) to several applications of particle filtering in positioning, navigation and tracking, such as geolocalisation and tracking in a wireless network, terrain–aided navigation, and data fusion for indoor localisation.

Another application domain of particle methods, or interacting Monte Carlo methods, that ASPI has decided to focus on is the estimation of the small probability of a rare but critical event, in complex dynamical systems. This is a crucial issue in industrial areas such as

nuclear power plants, food industry, telecommunication networks, finance and insurance industry, air traffic management, etc.

In such complex systems, analytical methods cannot be used, and naive Monte Carlo methods are clearly inefficient for accurately estimating very small probabilities. Besides importance sampling, an alternative widespread technique is multilevel splitting, where trajectories going towards the critical set are given offspring, thus increasing the number of trajectories that eventually reach the critical set. This approach not only makes it possible to estimate the probability of the rare event, but also provides realizations of the random trajectory, given that it reaches the critical set, i.e. provides realizations of typical critical trajectories, an important feature that methods based on importance sampling usually miss.

ASPI is contributing (or has contributed recently) to several applications of multilevel splitting for rare event simulation, such as risk assessment in air traffic management, detection in sensor networks, and protection of digital documents.

We showed last year that an adaptive version of multilevel splitting
for rare events is strongly consistent and that the estimates satisfy
a CLT (central limit theorem), with the same asymptotic variance as the
non–adaptive algorithm with the optimal choice of the parameters.
This year we have generalized these results to the case where the Markov kernels used
to move the particles (or *shakers*) are of Metropolis–Hastings type.
This is a non–trivial generalization to a very important case.

This is a collaboration with Bernard Delyon (université de Rennes 1) and Mathias Rousset (EPI MATHERIALS, Inria Paris Rocquencourt).

By considering the adaptive multilevel splitting algorithm as a Fleming–Viot particle system for a stochastic wave, we have shown the mean square convergence using a general result about the convergence of Fleming–Viot particle systems (Villemonais, 2013). We are currently working on the proof of a central limit theorem, which is not yet complete. We have nevertheless identified the expression of the asymptotic variance.

This is a collaboration with Damien Jacquemart (ONERA, Palaiseau) and Jérôme Morio (ONERA, Toulouse).

We highlight a bias induced by the discretization of the sampled Markov paths in the splitting algorithm, and we propose to correct this bias using a deformation of the intermediate regions. Moreover, we propose two numerical methods to design intermediate regions in the splitting algorithm that minimise the variance. One is connected with a partial differential equation approach, the other one is based on the discretization of the state space of the process.

This is a collaboration with Christian Musso (ONERA, Palaiseau) and with Sébastien Paris (LSIS, université du Sud Toulon Var).

The problem considered here can be described as follows: a limited number of sensors should be deployed by a carrier in a given area, and should be activated at a limited number of time instants within a given time period, so as to maximize the probability of detecting a target (present in the given area during the given time period). There is an information asymmetry in the problem: if the target is sufficiently close to a sensor position when it is activated, then the target can learn about the presence and exact position of the sensor, and can temporarily modify its trajectory so as to escape before it is detected. This is referred to as the target intelligence. Two different simulation–based algorithms have been designed to solve this optimization problem, separately or jointly, with different and complementary features. One is fast and sequential: it proceeds by running a population of targets and by dropping and activating a new sensor (or re–activating a sensor already available) where and when this action seems appropriate. The other is slow, iterative, and non–sequential: it proceeds by updating a population of deployment plans with guaranteed and increasing criterion value at each iteration, and for each given deployment plan, a population of targets is run to evaluate the criterion. Finally, the two algorithms can cooperate in many different ways, to try and get the best of both approaches. A simple and efficient way is to use the deployment plans provided by the sequential algorithm as the initial population for the iterative algorithm.

This is a collaboration with Paul Bui Quang (CEA, Bruyères–le–Châtel) and Christian Musso (ONERA, Palaiseau).

This is a collaboration with Pierre Ailliot (université de Bretagne Occidentale), Ronan Fablet and Pierre Tandéo (Télécom Bretagne), Anne Cuzol (université de Bretagne Sud) and Bernard Chapron (IFREMER, Brest).

Nowadays, ocean and atmosphere sciences face a deluge of data from spatial observations, in situ monitoring as well as numerical simulations. The availability of these different data sources offers new opportunities, still largely underexploited, to improve the understanding, modeling and reconstruction of geophysical dynamics. The classical way to reconstruct the space–time variations of a geophysical system from observations relies on data assimilation methods using multiple runs of the known dynamical model. This classical framework may have severe limitations, including its computational cost, the lack of adequacy of the model with observed data, and modeling uncertainties. We explore an alternative approach and develop a fully data–driven framework, which combines machine learning and statistical sampling to simulate the dynamics of complex systems. As a proof of concept, we address the assimilation of the chaotic Lorenz–63 model. We demonstrate that a nonparametric sampler from a catalog of historical datasets, namely a nearest neighbor or analog sampler, combined with a classical stochastic data assimilation scheme, the ensemble Kalman filter and smoother, reaches state–of–the–art performance, without online evaluations of the physical model.
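
The analog sampler idea can be sketched as follows. This is a hedged illustration, not the implementation used in the work described above: the catalog is built here by integrating the Lorenz–63 equations with a Runge–Kutta scheme (in a data–driven setting it would come from historical data), the forecast applies the increment of a randomly chosen analog, and all function names and parameter values are assumptions. The ensemble Kalman filter and smoother are not shown; in such a scheme, `analog_forecast` would replace the physical model in the forecast step of each ensemble member.

```python
import random


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations (classical parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step of the Lorenz-63 model."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2))
    k3 = lorenz63(shifted(state, k2, dt / 2))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def build_catalog(n_pairs, dt=0.01, start=(1.0, 1.0, 1.0), spin_up=1000):
    """Catalog of (state, successor) pairs along one simulated trajectory."""
    state = start
    for _ in range(spin_up):  # discard the transient towards the attractor
        state = rk4_step(state, dt)
    catalog = []
    for _ in range(n_pairs):
        successor = rk4_step(state, dt)
        catalog.append((state, successor))
        state = successor
    return catalog


def analog_forecast(catalog, state, k, rng):
    """Sample a one-step forecast from the k nearest analogs of `state`."""
    def squared_distance(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], state))
    nearest = sorted(catalog, key=squared_distance)[:k]
    analog, successor = rng.choice(nearest)
    # apply the increment observed for the selected analog to the current state
    return tuple(s + (b - a) for s, a, b in zip(state, analog, successor))
```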

This is a collaboration with Pierre Ailliot (université de Bretagne Occidentale), Julie Bessac (Argonne National Laboratory, Chicago) and Julien Cattiaux (Météo–France, Toulouse).

Multivariate time series are of interest in many fields including economics and environment. The most popular tools for studying multivariate time series are the vector autoregressive (VAR) models because of their simple specification and the existence of efficient methods to fit these models. However, VAR models cannot describe time series that mix different dynamics. For instance, when meteorological variables are observed, the resulting time series exhibit an alternation of different temporal dynamics corresponding to weather regimes. The regime is often not observed directly and is thus introduced as a latent process in time series models in the spirit of hidden Markov models. Markov switching vector autoregressive (MSVAR) models have been introduced as a generalization of autoregressive models and hidden Markov models. They lead to flexible and interpretable models. In this multivariate context, several questions arise.

The discrete hidden variable, also called the regime, has to be correctly defined. Indeed the regime can be local (e.g. linked to a subset of the variables) or global (e.g. the same for all the variables). It can also be observed and inferred a priori or hidden. In the second case, it has to be estimated at the same time as the model parameters.

The question of the definition of the regime is investigated for the specific problem of multi–site wind modeling.

Markov switching VAR (MSVAR) models suffer from the same dimensionality problem as VAR models. For large (and even moderate) dimensions, the number of autoregressive coefficients in each regime can be prohibitively large, which results in noisy estimates. When the variables are correlated, which is the standard situation in multivariate time series, over–learning is frequent. The estimated parameters contain spurious non–zero coefficients and are then difficult to interpret. The predictions associated with the model are usually unstable. Collinearity also causes ill–conditioning of the innovation covariance. We propose a likelihood penalization method with hard thresholding for MSVAR models, leading to sparse MSVAR. Both autoregressive matrices and precision matrices are penalized using smoothly clipped absolute deviation (SCAD) penalties.
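
For reference, the SCAD penalty applied to a single coefficient can be written as below, in its usual form. The default value `a=3.7` is the conventional choice and is an assumption here, as is the function name; how the penalty is combined with the MSVAR likelihood is not specified in the text and is not shown.

```python
def scad_penalty(theta, lam, a=3.7):
    """Smoothly clipped absolute deviation (SCAD) penalty of one coefficient.

    Linear (lasso-like) near zero, quadratic in between, and constant for
    large coefficients, so that large coefficients are not shrunk further.
    """
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

The three pieces agree at `lam` and at `a * lam`, so the penalty is continuous, and it is flat beyond `a * lam`.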

This is a collaboration with Pierre Ailliot (université de Bretagne Occidentale), Bernard Delyon (université de Rennes 1) and Marc Prevosto (IFREMER, Brest).

Many records in environmental sciences exhibit asymmetric trajectories and there is a need for simple and tractable models which can reproduce such features. We explore an approach based on applying both a time change and a marginal transformation to Gaussian processes. The main originality of the proposed model is that the time change depends on the observed trajectory. We first show that the proposed model is stationary and ergodic and provide an explicit characterization of the stationary distribution. This result is then used to build both parametric and non–parametric estimates of the time change function, whereas the estimation of the marginal transformation is based on up–crossings. Simulation results are provided to assess the quality of the estimates. The model is applied to wave data and it is shown that the fitted model is able to reproduce important statistics of the data, such as its spectrum and marginal distribution, which are important quantities for practical applications. An important benefit of the proposed model is its ability to reproduce the observed asymmetries between the crests and the troughs and between the front and the back of the waves by accelerating the chronometer in the crests and in the front of the waves.

This is a collaboration with Angélique Drémeau (ENSTA Bretagne, Brest) and Cédric Herzet (EPI FLUMINANCE, Inria Rennes–Bretagne Atlantique).

This is a collaboration with Cédric Herzet (EPI FLUMINANCE, Inria Rennes–Bretagne Atlantique).

Inria contract ALLOC 7326 — April 2013 to December 2016.

This is a collaboration with Christian Musso (ONERA, Palaiseau) and with Sébastien Paris (LSIS, université du Sud Toulon Var).

The objective of this project is to optimize the position and activation times of a few sensors deployed by one or several platforms over a search zone, so as to maximize the probability of detecting a moving target. The difficulty here is that the target can detect an activated sensor before it is detected itself, and it can then modify its own trajectory to escape from the sensor. This makes the optimization problem a spatio–temporal problem. Our contribution has been to study different ways to merge two different solutions to the optimization problem: a fast, though suboptimal, solution developed by ONERA in which sensors are deployed where and when the probability of presence of a target is high enough, and the optimal population–based solution developed by LSIS and Inria in a previous contract (Inria contract ALLOC 4233) with DGA / Techniques navales.

This is a collaboration with Christophe Villien (CEA LETI, Grenoble).

The issue here is user localization, and more generally location–based services (LBS). This problem is addressed by GPS for outdoor applications, but no such general solution has been provided so far for indoor applications. The desired solution should rely on sensors that are already available on smartphones and tablet computers. Inertial solutions that use MEMS (microelectromechanical systems, such as accelerometers, magnetometers, gyroscopes and barometers) are already studied at CEA. An increase in performance should be possible, provided these data are combined with other available data: map of the building, WiFi signal, modeling of perturbations of the magnetic field, etc. To be successful, advanced data fusion techniques should be used, such as particle filtering and the like, to take into account displacement constraints due to walls in the building, to manage several possible trajectories, and to deal with rather heterogeneous information (map, radio signals, sensor signals).

The main objective of this thesis is to design and tune localization algorithms that will be tested on platforms already available at CEA. Special attention is paid to particle smoothing and particle MCMC algorithms, to exploit some very precise information available at special time instants, e.g. when the user is clearly localized near a landmark point.
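How a building map can act as a displacement constraint in a particle filter is illustrated by the following minimal sketch. It is not the algorithm developed in the thesis: it is a plain bootstrap particle filter on a hypothetical one–dimensional corridor, where the "map" is reduced to two walls at positions 0 and `length`, the motion model is a noisy constant step, and the observation is a noisy position reading. All names and parameter values are assumptions made for illustration.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, length=10.0,
                     step_mean=0.5, step_std=0.2, obs_std=1.0, seed=0):
    """Bootstrap particle filter for a walker in a corridor [0, length].

    Returns the list of posterior-mean position estimates, one per
    observation. Particles that cross a wall receive zero weight.
    """
    rng = random.Random(seed)
    # Initial particles: uniform over the corridor (no prior knowledge).
    particles = [rng.uniform(0.0, length) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Mutation: propagate each particle with the displacement model.
        particles = [x + rng.gauss(step_mean, step_std) for x in particles]
        # Weighting: Gaussian likelihood, set to zero outside the walls.
        weights = [
            math.exp(-0.5 * ((y - x) / obs_std) ** 2)
            if 0.0 <= x <= length else 0.0
            for x in particles
        ]
        total = sum(weights)
        if total == 0.0:
            # Crude fallback if every particle was eliminated:
            # restart from a uniform cloud with equal weights.
            particles = [rng.uniform(0.0, length) for _ in range(n_particles)]
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Selection: multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    truth = [1.0 + 0.5 * k for k in range(1, 11)]
    noise = random.Random(1)
    obs = [x + noise.gauss(0.0, 1.0) for x in truth]
    for x, e in zip(truth, bootstrap_filter(obs)):
        print(f"true={x:.2f}  estimate={e:.2f}")
```

In a realistic setting the wall test would be replaced by a check of each particle displacement against the floor plan, and the likelihood would combine several heterogeneous sources (radio signals, magnetic field model, inertial sensors).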

January 2015 to December 2017.

This is a joint research initiative supported by the three labex active in Brittany, CominLabs (Communication and Information Sciences Laboratory), Lebesgue (Centre de Mathématiques Henri Lebesgue) and LabexMER (Frontiers in Marine Research).

This project aims at exploring novel statistical and stochastic methods to address the emulation, reconstruction and forecast of fine–scale upper ocean dynamics. The key objective is to investigate new tools and methods for the calibration and implementation of novel sound and efficient oceanic dynamical models, combining

recent advances in the theoretical understanding, modeling and simulation of upper ocean dynamics,

and the mass of data routinely available to observe the evolution of the ocean.

In this respect, the emphasis will be on stochastic frameworks that encompass multi–scale/multi–source approaches and benefit from the massive observation and simulation data available. The scientific questions addressed are basic research issues at the frontiers of several disciplines, in particular advanced data analysis, physical oceanography and stochastic representations. To develop such an interdisciplinary initiative, the project gathers research groups associated with these different scientific domains, which have demonstrated for several years their capacity to interact and collaborate on topics related to oceanic data and models. This project will give Brittany an innovative and leading expertise at the frontiers of computer science, statistics and oceanography. This transdisciplinary research initiative is expected to result in significant advances that challenge the current thinking in computational oceanography.

Inria contract ALLOC 9452 — January 2015 to December 2017.

The COSMOS project aims at developing numerical techniques dedicated to the sampling of high–dimensional probability measures describing a system of interest. There are two application fields of interest: computational statistical physics (a field also known as molecular simulation), and computational statistics. These two fields share some common history, but it seems that, in view of the quite recent specialization of the scientists and the techniques used in these respective fields, the communication between molecular simulation and computational statistics is not as intense as it should be.

We believe that there are therefore many opportunities in considering both fields at the same time: in particular, the adaptation of a successful simulation technique from one field to the other first requires some abstraction process, where the features specific to the original field of application are discarded and only the heart of the method is kept. Such a cross–fertilization is however only possible if the techniques developed in a specific field are sufficiently mature: this is why some fundamental studies specific to one of the application fields are still required. Our belief is that embedding specific developments from a given field in a more general framework will accelerate and facilitate their diffusion to the other field.

Inria contract ALLOC 8102 — March 2014 to February 2018.

The GERONIMO project aims at devising new efficient and effective techniques for the design of geophysical reduced–order models (ROMs) from image data. The project arises both from the crucial need for accurate low–order descriptions of highly complex geophysical phenomena and from the recent numerical revolution, which has supplied geophysical scientists with an unprecedented volume of image data. Our research activities are concerned with exploiting the huge amount of information contained in image data in order to reduce the uncertainty on the unknown parameters of the models and to improve the accuracy of the reduced models. In other words, the objective of our research is to process the large amount of incomplete and noisy image data captured daily by satellite sensors, in order to devise new advanced model reduction techniques. The construction of ROMs is placed in a probabilistic Bayesian inference context, allowing for the handling of uncertainties associated with image measurements and the characterization of the parameters of the reduced dynamical system.

François Le Gland has been invited by Joaquín Míguez to visit the department of signal theory and communications of Universidad Carlos III de Madrid, in February 2015.

Valérie Monbet has co–organized the workshop
on *Stochastic Model-Data Coupled Representations
for the Upper Ocean
Dynamics*,
the kick–off meeting of the SEACS project,
held in Landeda in May 2015.

Valérie Monbet has been the guest editor of a special issue
(volume 156, number 1) on stochastic weather generators,
in *Journal de la Société Française de Statistique*.

Valérie Monbet has given an invited talk on
Markov–switching vector autoregressive models for multivariate time series
of air temperature,
at *47èmes Journées de Statistique*,
held in Lille in June 2015.

Patrick Héas gives a course on Monte Carlo simulation methods in image analysis at université de Rennes 1, within the SISEA (signal, image, systèmes embarqués, automatique, école doctorale MATISSE) track of the master in electronic engineering and telecommunications.

François Le Gland gives

a course on Kalman filtering and hidden Markov models, at université de Rennes 1, within the SISEA (signal, image, systèmes embarqués, automatique, école doctorale MATISSE) track of the master in electronic engineering and telecommunications,

a 3rd year course on Bayesian filtering and particle approximation, at ENSTA (école nationale supérieure de techniques avancées), Paris, within the systems and control module,

a 3rd year course on linear and nonlinear filtering, at ENSAI (école nationale de la statistique et de l'analyse de l'information), Ker Lann, within the statistical engineering track,

and a 3rd year course on hidden Markov models, at Télécom Bretagne, Brest.

Valérie Monbet gives several courses on data analysis, on time series, and on mathematical statistics, all at université de Rennes 1 within the master on statistics and econometrics.

François Le Gland and Valérie Monbet are jointly supervising one PhD student

Chau Thi Tuyet Trang,
provisional title: *Non parametric filtering for Metocean multi–source
data fusion*,
université de Rennes 1,
started in October 2015,
expected defense in October 2018,
co–direction: Pierre Ailliot (université de Bretagne Occidentale).

François Le Gland is supervising two other PhD students

Alexandre Lepoutre,
provisional title: *Detection issues in track–before–detect*,
université de Rennes 1,
started in October 2010,
expected defense in 2016,
funding: ONERA grant,
co–direction: Olivier Rabaste (ONERA, Palaiseau),

Kersane Zoubert–Ousseni,
provisional title: *Particle filters for hybrid indoor navigation
with smartphones*,
université de Rennes 1,
started in December 2014,
expected defense in 2017,
funding: CEA grant,
co–direction: Christophe Villien (CEA LETI, Grenoble).

Valérie Monbet is supervising one other PhD student

Audrey Poterie,
provisional title: *Régression d'une variable ordinale par des données
longitudinales de grande dimension : application à la modélisation des
effets secondaires suite à un traitement par radiothérapie*,
université de Rennes 1,
started in October 2015,
expected defense in October 2018,
co–direction: Jean–François Dupuy (INSA de Rennes),
Laurent Rouvière (université de Haute Bretagne).

François Le Gland has been a reviewer for the PhD theses of Jana Kalawoun (université Paris Sud, Orsay, advisors: Gilles Celeux and Patrick Pamphile) and Antoine Campi (université Paul Sabatier, Toulouse, advisors: Christophe Baehr, Alain Dabas and Pierre Del Moral). He has also been a member of the committee for the PhD thesis of Eugenia Koblents (Universidad Carlos III de Madrid, advisor: Joaquín Míguez).

Valérie Monbet has been a member of the committee for the PhD theses of Xavier Kergadallan (École des Ponts ParisTech, advisor: Michel Benoit) and Khalil El Waled (université de Haute Bretagne, advisor: Dominique Dehay).

In addition to presentations with a publication in the proceedings, which are listed at the end of the document in the bibliography, members of ASPI have also given the following presentations.

Frédéric Cérou has presented the results about the convergence of ABC at the probability and stochastic processes seminar of université de Rennes 1, and at the applied mathematics seminar of université de Nantes, both in November 2015.

Patrick Héas has given a talk on 3D wind field reconstruction by infrared sounding, at EUMETSAT (European Organisation for the Exploitation of Meteorological Satellites) in Darmstadt, Germany, in June 2015, and a talk on reduced–order modeling of hidden dynamics, at the international workshop on reduced basis, POD and PGD model reduction techniques, held in Cachan in November 2015.

François Le Gland has given a talk on simulation–based algorithms for the optimization of sensor deployment at the department of signal theory and communications of Universidad Carlos III de Madrid, in February 2015, and a talk on marginalization in rare event simulation for switching diffusions at the ONERA workshop on particle algorithms, held in Toulouse in May 2015.

Valérie Monbet has given a talk on switching autoregressive models for stochastic weather generators, and application to temperature series, at the kick–off meeting of the SEACS project, held in Landeda in May 2015.

Kersane Zoubert–Ousseni has given a poster presentation at the summer school on Foundations and Advances in Stochastic Filtering, held in Barcelona in June 2015.