The scientific objectives of ASPI are the design, analysis and implementation of interacting Monte Carlo methods, also known as particle methods, with focus on
statistical inference in hidden Markov models, e.g. state or parameter estimation, including particle filtering,
risk evaluation, including simulation of rare events.
The subject as a whole is multidisciplinary, not only because of the many scientific and engineering areas in which particle methods are used, but also because of the diversity of the scientific communities that have contributed to establishing the foundations of the field: target tracking, interacting particle systems, empirical processes, genetic algorithms (GA), hidden Markov models and nonlinear filtering, Bayesian statistics, Markov chain Monte Carlo (MCMC) methods, etc. Intuitively speaking, interacting Monte Carlo methods are sequential simulation methods, in which particles
explore the state space by mimicking the evolution of an underlying random process,
learn the environment by evaluating a fitness function,
and interact so that only the most successful particles (in view of the value of the fitness function) are allowed to survive and to produce offspring at the next generation.
The effect of this mutation / selection mechanism is to automatically concentrate particles (i.e. the available computing power) in regions of interest of the state space. In the special case of particle filtering, which has numerous applications under the generic heading of positioning, navigation and tracking (in target tracking, computer vision, mobile robotics, ubiquitous computing and ambient intelligence, sensor networks, etc.), each particle represents a possible hidden state, and is multiplied or terminated at the next generation on the basis of its consistency with the current observation, as quantified by the likelihood function. In the most general case, particle methods provide approximations of probability distributions associated with a Feynman–Kac flow, by means of the weighted empirical probability distribution associated with an interacting particle system.
ASPI will essentially carry out methodological research activities, rather than activities oriented towards a single application area, with the objective of obtaining generic results with high potential for applications, and of carrying these results (and other results found in the literature) through to implementation on a few appropriate examples, in collaboration with industrial partners.
The main applications currently considered are geolocalisation and tracking of mobile terminals, calibration of models for electricity price, and risk assessment for complex hybrid systems such as those used in air traffic management.
The objective here is to explain how interacting Monte Carlo methods differ from classical Monte Carlo methods, and to introduce the general and extremely fruitful framework of Feynman–Kac flows.
Monte Carlo methods are numerical methods that are widely used in situations where (i) a stochastic (usually Markovian) model is given for some underlying process, and (ii) some quantity of interest should be evaluated, that can be expressed in terms of the expected value of a functional of the process trajectory, or the probability that a given event has occurred. Numerous examples can be found, e.g. in financial engineering (pricing of options and derivative securities) , in performance evaluation in communication networks (probability of buffer overflow), in statistics of hidden Markov models (state estimation, evaluation of contrast and score functions), etc. Very often in practice, no analytical expression is available for the quantity of interest, but it is possible to simulate trajectories of the underlying process. The idea behind Monte Carlo methods is to generate independent trajectories of the underlying process, and to use as an approximation (estimator) of the quantity of interest the average of the functional over the resulting independent sample. For instance, if
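$$\langle \eta_n, f \rangle = \mathbb{E}[f(X_n)]$$
where $X_n$ denotes a Markov chain with (possibly) time dependent transition kernels $Q_n$ and initial probability distribution $\eta_0$, then
$$\langle \eta_n, f \rangle \approx \frac{1}{N} \sum_{i=1}^N f(\xi_n^i)$$
where $(\xi_n^1, \dots, \xi_n^N)$ is an $N$–sample whose common probability distribution is precisely $\eta_n$, which can be easily achieved as follows: independently for any $i = 1, \dots, N$
$$\xi_0^i \sim \eta_0(dx) \quad\text{and}\quad \xi_k^i \sim Q_k(\xi_{k-1}^i, dx')$$
for any $k \ge 1$. By the law of large numbers, the above estimator converges to $\langle \eta_n, f \rangle$ as the size $N$ of the sample goes to infinity, with rate $1/\sqrt{N}$, and the asymptotic variance can be estimated.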
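The plain Monte Carlo estimator described above can be sketched as follows. This is only an illustration under stated assumptions: the AR(1) chain, its parameters `a` and `sigma`, and the function names are hypothetical choices made here, not taken from the report.

```python
import math
import random

def simulate_chain(n, rng, a=0.8, sigma=1.0):
    """Simulate the hypothetical AR(1) chain X_k = a * X_{k-1} + sigma * noise,
    started from X_0 ~ N(0, 1), and return the terminal value X_n."""
    x = rng.gauss(0.0, 1.0)                 # X_0 drawn from the initial law
    for _ in range(n):
        x = a * x + rng.gauss(0.0, sigma)   # X_k drawn from Q_k(X_{k-1}, .)
    return x

def monte_carlo(f, n, N, seed=0):
    """Average f(X_n) over N independent trajectories.
    Returns the estimate and an estimate of its standard error."""
    rng = random.Random(seed)
    values = [f(simulate_chain(n, rng)) for _ in range(N)]
    mean = sum(values) / N
    var = sum((v - mean) ** 2 for v in values) / (N - 1)
    return mean, math.sqrt(var / N)         # standard error decays like 1/sqrt(N)

if __name__ == "__main__":
    estimate, stderr = monte_carlo(lambda x: x * x, n=5, N=20000)
    print(estimate, stderr)
```

The returned standard error is the usual empirical estimate of the asymptotic variance divided by the sample size, which is what makes the rate $1/\sqrt{N}$ visible in practice.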
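To reduce the asymptotic variance of the estimator, many variance reduction techniques are routinely used, among which importance sampling can be defined as follows: for any given importance decomposition
$$\eta_0(dx) = W_0(x)\, p_0(dx) \quad\text{and}\quad Q_k(x, dx') = W_k(x, x')\, P_k(x, dx')$$
for any $k \ge 1$, it holds
$$\langle \eta_n, f \rangle = \mathbb{E}[f(X_n)] = \widetilde{\mathbb{E}}\Big[f(X_n)\, W_0(X_0) \prod_{k=1}^n W_k(X_{k-1}, X_k)\Big] \qquad \text{(IS)}$$
where $\widetilde{\mathbb{E}}$ denotes expectation under the model with initial probability distribution $p_0$ and transition kernels $P_k$, hence the alternative Monte Carlo estimator
$$\langle \eta_n, f \rangle \approx \frac{1}{N} \sum_{i=1}^N w_{0:n}^i\, f(\xi_n^i) \quad\text{with}\quad w_{0:n}^i = W_0(\xi_0^i) \prod_{k=1}^n W_k(\xi_{k-1}^i, \xi_k^i)$$
where independently for any $i = 1, \dots, N$
$$\xi_0^i \sim p_0(dx) \quad\text{and}\quad \xi_k^i \sim P_k(\xi_{k-1}^i, dx')$$
for any $k \ge 1$, i.e. independent trajectories are generated under an alternate wrong model, and are weighted according to their likelihood for the true model. For a given test function $f$, there are some adequate choices of the importance decomposition for which the asymptotic variance of the alternative Monte Carlo estimator is smaller than the asymptotic variance of the original Monte Carlo estimator.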
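As an illustration of importance sampling along a trajectory, the sketch below estimates a small tail probability for a Gaussian random walk. The model, the shifted proposal with drift `theta`, and the function name are assumptions made for this example only.

```python
import math
import random

def importance_sampling_tail(a, n, N, theta, seed=0):
    """Estimate P[X_n > a] for the hypothetical random walk
    X_k = X_{k-1} + N(0, 1), X_0 ~ N(0, 1).
    Increments are simulated under the wrong model N(theta, 1) and each
    trajectory is weighted by its likelihood ratio for the true model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(N):
        x = rng.gauss(0.0, 1.0)        # X_0 from the true initial law, so W_0 = 1
        log_w = 0.0
        for _ in range(n):
            step = rng.gauss(theta, 1.0)                # move under the proposal kernel
            log_w += -theta * step + 0.5 * theta ** 2   # log of N(step; 0, 1) / N(step; theta, 1)
            x += step
        if x > a:
            total += math.exp(log_w)   # weighted contribution of a successful trajectory
    return total / N

if __name__ == "__main__":
    exact = 0.5 * math.erfc(12.0 / math.sqrt(10.0) / math.sqrt(2.0))
    print(importance_sampling_tail(12.0, 9, 20000, theta=12.0 / 9), exact)
```

Choosing the drift so that the proposal mean of $X_n$ sits near the threshold is one common heuristic; with no drift (`theta = 0`) the code reduces to the plain Monte Carlo estimator, for which most trajectories contribute nothing.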
However, running independent Monte Carlo simulations can lead to very poor results, because trajectories are generated blindly, and the corresponding weight $w_{0:n}^i$ is evaluated only afterwards: this weight may turn out to be negligible, in which case the corresponding trajectory does not contribute to the estimator, i.e. computing power has been wasted.
A recent and major breakthrough, a brief mathematical presentation of which is given in , has been the introduction of interacting Monte Carlo methods, also known as sequential Monte Carlo (SMC) methods, in which a whole (possibly weighted) sample, called system of particles, is propagated in time, where the particles
explore the state space under the effect of a mutation mechanism which mimics the evolution of the underlying process,
and are replicated or terminated, under the effect of a selection mechanism which automatically concentrates the particles, i.e. the available computing power, into regions of interest of the state space.
In full generality, the underlying process is a Markov chain, whose state space can be finite, continuous (Euclidean), hybrid (continuous / discrete), constrained, time varying, pathwise, etc., the only condition being that it can easily be simulated. The very important case of a sampled continuous–time Markov process, e.g. the solution of a stochastic differential equation driven by a Wiener process or a more general Lévy process, is also covered.
In the special case of particle filtering, originally developed within the tracking community, the algorithms yield a numerical approximation of the optimal filter, i.e. of the conditional probability distribution of the hidden state given the past observations, as a (possibly weighted) empirical probability distribution of the system of particles. In its simplest version, introduced in several different scientific communities under the name of interacting particle filter, bootstrap filter, Monte Carlo filter or condensation (conditional density propagation) algorithm , and which historically has been the first algorithm to include a redistribution step, the selection mechanism is governed by the likelihood function : at each time step, a particle is more likely to survive and to replicate at the next generation if it is consistent with the current observation. The algorithms also provide as a by–product a numerical approximation of the likelihood function, and of many other contrast functions for parameter estimation in hidden Markov models, such as the prediction error or the conditional least–squares criterion.
Particle methods are currently being used in many scientific and engineering areas : positioning, navigation, and tracking , visual tracking , mobile robotics , ubiquitous computing and ambient intelligence , sensor networks , risk evaluation and simulation of rare events , genetics, molecular dynamics, etc. Other examples of the many applications of particle filtering can be found in the contributed volume and in the special issue of IEEE Transactions on Signal Processing devoted to Monte Carlo Methods for Statistical Signal Processing in February 2002, which contains in particular the tutorial paper , and in the textbook devoted to applications in target tracking. Applications of sequential Monte Carlo methods to other areas, beyond signal and image processing, e.g. to genetics, and molecular dynamics, can be found in .
Particle methods are very easy to implement, since it is sufficient in principle to simulate independent trajectories of the underlying process. The subject as a whole is multidisciplinary, not only because of the already mentioned diversity of the scientific and engineering areas in which particle methods are used, but also because of the diversity of the scientific communities that have contributed to establishing the foundations of the field: target tracking, interacting particle systems, empirical processes, genetic algorithms (GA), hidden Markov models and nonlinear filtering, Bayesian statistics, Markov chain Monte Carlo (MCMC) methods.
The following abstract point of view, developed and extensively studied by Pierre Del Moral, has proved to be extremely fruitful in providing a very general framework for the design and analysis of numerical approximation schemes, based on systems of branching and / or interacting particles, for nonlinear dynamical systems with values in the space of probability distributions, associated with Feynman–Kac flows of the form
$$\mu_n = \frac{\gamma_n}{\langle \gamma_n, 1 \rangle} \quad\text{with}\quad \langle \gamma_n, f \rangle = \mathbb{E}\Big[f(X_n) \prod_{k=0}^n g_k(X_k)\Big] \qquad \text{(FK)}$$
where $X_n$ denotes a Markov chain with (possibly) time dependent state spaces $E_n$ and with transition kernels $Q_n$, and where the nonnegative potential functions $g_n$ play the role of selection functions.
Feynman–Kac flows (FK) naturally arise whenever importance sampling
is used, as seen from (IS) above : this applies for instance
to simulation of rare events, to filtering, i.e. to state estimation
in hidden Markov models (HMM), etc.
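Clearly, the unnormalized linear flow $\gamma_n$ satisfies the dynamical system
$$\gamma_k = \gamma_{k-1} R_k$$
with the nonnegative kernel $R_k(x, dx') = Q_k(x, dx')\, g_k(x')$, and the associated normalized nonlinear flow $\mu_n$ of probability distributions satisfies the dynamical system
$$\mu_k = \frac{\mu_{k-1} R_k}{\langle \mu_{k-1} R_k, 1 \rangle}$$
which can be decomposed in the following two steps
$$\mu_{k-1} \xrightarrow{\ \text{mutation}\ } \eta_k \xrightarrow{\ \text{selection}\ } \mu_k$$
i.e.
$$\eta_k = \mu_{k-1} Q_k \quad\text{and}\quad \mu_k = \frac{g_k\, \eta_k}{\langle \eta_k, g_k \rangle}.$$
Conversely, the normalizing constant $\langle \gamma_n, 1 \rangle$, hence the unnormalized (linear) flow as well, can be expressed in terms of the normalized (nonlinear) flow: indeed
$$\langle \gamma_k, 1 \rangle = \langle \mu_{k-1} R_k, 1 \rangle\, \langle \gamma_{k-1}, 1 \rangle = \langle \eta_k, g_k \rangle\, \langle \gamma_{k-1}, 1 \rangle.$$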
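On a finite state space the Feynman–Kac recursion can be evaluated exactly, which gives a convenient way to check the relation between the normalized flow and the normalizing constant. The sketch below is an illustration only: the function names and the list-based representation of distributions and kernels are choices made here, and the brute-force routine is included purely as a cross-check.

```python
import itertools

def fk_flow(eta0, Q, g, n):
    """Exact Feynman-Kac recursion on a finite state space.
    eta0: initial law (list), Q: transition matrix (list of rows),
    g: function k -> list of potential values g_k(x).
    Returns (mu_n, Z_n), where Z_n is the normalizing constant <gamma_n, 1>.
    Assumes every <eta_k, g_k> is positive (otherwise the flow is undefined)."""
    eta = list(eta0)
    Z = 1.0
    mu = eta
    for k in range(n + 1):
        if k > 0:
            # mutation step: eta_k = mu_{k-1} Q_k
            eta = [sum(mu[x] * Q[x][y] for x in range(len(mu)))
                   for y in range(len(Q[0]))]
        gk = g(k)
        c = sum(e * w for e, w in zip(eta, gk))     # <eta_k, g_k>
        Z *= c                                      # product formula for <gamma_k, 1>
        mu = [e * w / c for e, w in zip(eta, gk)]   # selection step
    return mu, Z

def fk_brute_force(eta0, Q, g, n):
    """Unnormalized flow gamma_n obtained by summing over all trajectories."""
    S = len(eta0)
    gamma = [0.0] * S
    for path in itertools.product(range(S), repeat=n + 1):
        p = eta0[path[0]] * g(0)[path[0]]
        for k in range(1, n + 1):
            p *= Q[path[k - 1]][path[k]] * g(k)[path[k]]
        gamma[path[-1]] += p
    return gamma
```

Comparing `fk_flow` with `fk_brute_force` on a small example confirms that the product of the successive factors $\langle \eta_k, g_k \rangle$ reproduces the total mass of $\gamma_n$.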
To solve these equations numerically, and in view of the basic assumption that it is easy to simulate r.v.'s according to the probability distributions $Q_k(x, dx')$, i.e. to mimic the evolution of the Markov chain, the original idea behind particle methods consists of looking for an approximation of the probability distribution $\mu_n$ in the form of a (possibly weighted) empirical probability distribution associated with a system of particles:
$$\mu_n \approx \mu_n^N = \sum_{i=1}^N w_n^i\, \delta_{\xi_n^i} \quad\text{with}\quad \sum_{i=1}^N w_n^i = 1.$$
The approximation is completely characterized by the set $\Sigma_n = (\xi_n^i, w_n^i,\ i = 1, \dots, N)$ of particle positions and weights, and the algorithm is completely described by the mechanism which builds $\Sigma_k$ from $\Sigma_{k-1}$.
In practice, in the simplest version of the algorithm, known as the bootstrap algorithm, particles
are selected according to their respective weights (selection step),
move according to the Markov kernel $Q_k$ (mutation step),
and are weighted by evaluating the fitness function $g_k$ (weighting step).
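The three steps above can be sketched as follows for a filtering problem. The linear Gaussian model, its parameters `a`, `sv`, `sw`, and the function name are hypothetical and chosen only because the exact answer is then available from the Kalman filter for comparison; selection is done here by sampling with replacement.

```python
import math
import random

def bootstrap_filter(ys, N, a=0.9, sv=1.0, sw=0.5, seed=0):
    """Bootstrap particle filter for the hypothetical model
    X_k = a X_{k-1} + sv V_k,  Y_k = X_k + sw W_k,  X_0 ~ N(0, 1),
    with V_k, W_k independent standard Gaussian noises.
    Returns the particle estimates of E[X_k | Y_0, ..., Y_k]."""
    rng = random.Random(seed)

    def g(y, x):
        # fitness function: likelihood of observation y, up to a constant factor
        return math.exp(-0.5 * ((y - x) / sw) ** 2)

    xs = [rng.gauss(0.0, 1.0) for _ in range(N)]        # initial particles
    ws = [g(ys[0], x) for x in xs]                       # initial weighting
    means = [sum(w * x for w, x in zip(ws, xs)) / sum(ws)]
    for y in ys[1:]:
        xs = rng.choices(xs, weights=ws, k=N)            # selection step
        xs = [a * x + rng.gauss(0.0, sv) for x in xs]    # mutation step
        ws = [g(y, x) for x in xs]                       # weighting step
        means.append(sum(w * x for w, x in zip(ws, xs)) / sum(ws))
    return means

if __name__ == "__main__":
    print(bootstrap_filter([0.3, -0.1, 0.8, 1.5, 1.1, 0.4], N=5000))
```

Note that `random.choices` raises an error when all weights are zero, which is the extinction situation that can occur when the fitness function is allowed to vanish.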
The algorithm yields a numerical approximation of the probability distribution $\mu_n$ as the weighted empirical probability distribution $\mu_n^N$ associated with a system of particles, and many asymptotic results have been proved as the number $N$ of particles (sample size) goes to infinity, using techniques coming from applied probability (interacting particle systems, empirical processes), see e.g. the survey article or the recent textbook, and references therein:
convergence in $L^p$,
convergence as empirical processes indexed by classes of functions,
uniform convergence in time (see also , ),
central limit theorem (see also ),
propagation of chaos,
large deviations principle,
moderate deviations principle (see ), etc.
Beyond the simplest bootstrap version of the algorithm, many algorithmic variations have been proposed, and are commonly used in practice:
in the redistribution step, sampling with replacement could be replaced with other redistribution schemes so as to reduce the variance (this issue has also been addressed in genetic algorithms),
to reduce the variance and to save computational effort, it is often a good idea not to redistribute the particles at each time step, but only when the weights are too uneven.
Most of the results proved in the literature assume that particles are redistributed (i) at each time step, and (ii) using sampling with replacement. A systematic study of the impact of these algorithmic variations on the convergence results remains to be done.
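The two variations mentioned above can be illustrated by a short sketch. The report names neither a specific redistribution scheme nor a specific criterion for uneven weights, so both choices here are assumptions: systematic resampling as the alternative scheme, and the effective sample size with a threshold of half the sample size as the criterion.

```python
import random

def effective_sample_size(weights):
    """(sum w)^2 / sum w^2: equals N for uniform weights and 1 when a
    single particle carries all the weight."""
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

def systematic_resample(weights, rng):
    """Draw N particle indices using one uniform variable and N evenly
    spaced points; particle i is selected either floor(N w_i) or
    ceil(N w_i) times (w_i being the normalized weight)."""
    N = len(weights)
    total = sum(weights)
    u = rng.random() / N
    indices, i, cum = [], 0, weights[0] / total
    for j in range(N):
        point = u + j / N
        while point > cum and i < N - 1:
            i += 1
            cum += weights[i] / total
        indices.append(i)
    return indices

def maybe_resample(particles, weights, rng, threshold=0.5):
    """Redistribute only when the weights are too uneven, i.e. when the
    effective sample size falls below threshold * N."""
    N = len(particles)
    if effective_sample_size(weights) >= threshold * N:
        return particles, weights          # keep the weighted sample as it is
    idx = systematic_resample(weights, rng)
    return [particles[i] for i in idx], [1.0] * N
```

When redistribution is skipped, the weights must be carried over and multiplied by the next fitness values, which is why `maybe_resample` returns the weights along with the particles.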
Even with interacting Monte Carlo methods, it could happen that some particle $\xi_k^i$ generated in one time step has a negligible weight $g_k(\xi_k^i)$: if this happens for too many particles in the sample $(\xi_k^1, \dots, \xi_k^N)$, then computing power has been wasted, and it has been suggested to use importance sampling again in the mutation step, i.e. to let particles explore the state space under the action of an alternate wrong mutation kernel, and to weight the particles according to their likelihood for the true model, so as to compensate for the wrong modeling. More specifically, using an arbitrary importance decomposition
$$Q_k(x, dx')\, g_k(x') = W_k(x, x')\, P_k(x, dx')$$
results in the following general algorithm, known as the sampling with importance resampling (SIR) algorithm, in which particles
are selected according to their respective weights (selection step),
move according to the importance Markov kernel $P_k$ (mutation step),
and are weighted by evaluating the importance weight function $W_k$ on the resulting transition (weighting step).
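A possible sketch of the SIR algorithm is given below, on a hypothetical linear Gaussian model chosen here for illustration. The importance kernel $P_k$ used in the sketch is an assumption: it is the Gaussian conditional distribution of the next state given the current state and the new observation, and the weight $W_k$ is computed numerically as the ratio $Q_k\, g_k / P_k$ of densities.

```python
import math
import random

def normal_pdf(z, mean, sd):
    """Density of N(mean, sd^2) at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def sir_filter(ys, N, a=0.9, sv=1.0, sw=0.5, seed=0):
    """SIR filter for the hypothetical model
    X_k = a X_{k-1} + sv V_k,  Y_k = X_k + sw W_k,  X_0 ~ N(0, 1).
    Particles move under an importance kernel that takes the new
    observation into account, and are weighted by W_k = Q_k g_k / P_k."""
    rng = random.Random(seed)
    s2 = 1.0 / (1.0 / sv ** 2 + 1.0 / sw ** 2)   # variance of the importance kernel
    sp = math.sqrt(s2)
    xs = [rng.gauss(0.0, 1.0) for _ in range(N)]
    ws = [normal_pdf(ys[0], x, sw) for x in xs]
    means = [sum(w * x for w, x in zip(ws, xs)) / sum(ws)]
    for y in ys[1:]:
        parents = rng.choices(xs, weights=ws, k=N)           # selection step
        xs, ws = [], []
        for xp in parents:
            m = s2 * (a * xp / sv ** 2 + y / sw ** 2)        # mean of P_k(xp, .)
            x = rng.gauss(m, sp)                             # mutation step under P_k
            w = (normal_pdf(x, a * xp, sv) * normal_pdf(y, x, sw)
                 / normal_pdf(x, m, sp))                     # importance weight W_k(xp, x)
            xs.append(x)
            ws.append(w)
        means.append(sum(w * x for w, x in zip(ws, xs)) / sum(ws))
    return means
```

With this particular kernel the weight depends only on the parent particle, so the weights are much more even than in the bootstrap algorithm when the observation noise is small; any other kernel could be substituted as long as the density ratio can be evaluated.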
Hidden Markov models (HMM) form a special case of partially observed stochastic dynamical systems, in which the state of a Markov process (in discrete or continuous time, with finite or continuous state space) should be estimated from noisy observations. The conditional probability distribution of the hidden state given past observations is a well–known example of a normalized (nonlinear) Feynman–Kac flow, see . These models are very flexible, because of the introduction of latent (unobserved) variables, which makes it possible to model complex time dependent structures, to take constraints into account, etc. In addition, the underlying Markovian structure makes it possible to use numerical algorithms (particle filtering, Markov chain Monte Carlo methods (MCMC), etc.) which are computationally intensive but whose complexity is rather small. Hidden Markov models are widely used in various applied areas, such as speech recognition, alignment of biological sequences, tracking in complex environments, modeling and control of networks, digital communications, etc.
Beyond the recursive estimation of a hidden state from noisy observations, the problem arises of statistical inference of HMM with general state space, including estimation of model parameters, early monitoring and diagnosis of small changes in model parameters, etc.
Large time asymptotics. A fruitful approach is the asymptotic study, when the observation time increases to infinity, of an extended Markov chain, whose state includes (i) the hidden state, (ii) the observation, (iii) the prediction filter (i.e. the conditional probability distribution of the hidden state given observations at all previous time instants), and possibly (iv) the derivative of the prediction filter with respect to the parameter. Indeed, it is easy to express the log–likelihood function, the conditional least–squares criterion, and many other classical contrast processes, as well as their derivatives with respect to the parameter, as additive functionals of the extended Markov chain.
The following general approach has been proposed :
first, prove an exponential stability property (i.e. an exponential forgetting property of the initial condition) of the prediction filter and its derivative, for a misspecified model,
from this, deduce a geometric ergodicity property and the existence of a unique invariant probability distribution for the extended Markov chain, hence a law of large numbers and a central limit theorem for a large class of contrast processes and their derivatives, and a local asymptotic normality property,
finally, obtain the consistency (i.e. the convergence to the set of minima of the associated contrast function), and the asymptotic normality of a large class of minimum contrast estimators.
This programme has been completed in the case of a finite state space, and has been generalized under a uniform minorization assumption for the Markov transition kernel, which typically holds only when the state space is compact. Clearly, the whole approach relies on the existence of an exponential stability property of the prediction filter, and the main challenge currently is to get rid of this uniform minorization assumption for the Markov transition kernel, so as to be able to consider more interesting situations, where the state space is noncompact.
Small noise asymptotics. Another asymptotic approach can also be used, where it is rather easy to obtain interesting explicit results, in terms close to the language of nonlinear deterministic control theory. Taking the simple example where the hidden state is the solution of an ordinary differential equation, or a nonlinear state model, and where the observations are subject to additive Gaussian white noise, this approach consists in assuming that the covariance matrices of the state noise and of the observation noise go simultaneously to zero. Although it is reasonable in many applications to consider that noise covariances are small, this asymptotic approach is less natural than the large time asymptotics, where it is enough (provided a suitable ergodicity assumption holds) to accumulate observations and to see the expected limit laws (law of large numbers, central limit theorem, etc.). On the other hand, the expressions obtained in the limit (Kullback–Leibler divergence, Fisher information matrix, asymptotic covariance matrix, etc.) take here a much more explicit form than in the large time asymptotics.
The following results have been obtained using this approach :
the consistency of the maximum likelihood estimator (i.e. the convergence to the set M of global minima of the Kullback–Leibler
divergence), has been obtained using large deviations techniques,
with an analytical approach ,
if the abovementioned set M does not reduce to the true
parameter value, i.e. if the model is not identifiable, it is still
possible to describe precisely the asymptotic behavior of the
estimators : in the simple case where the state
equation is a noise–free ordinary differential equation and using
a Bayesian framework,
it has been shown that (i) if the rank r of the Fisher
information matrix is constant in a neighborhood of the
set M, then this set is a differentiable submanifold of
codimension r, (ii) the posterior probability distribution of the
parameter converges to a random probability distribution in the limit,
supported by the manifold M, absolutely continuous w.r.t. the Lebesgue measure on M, with an explicit expression for the density,
and (iii) the posterior probability distribution of the suitably normalized difference between the parameter and its projection on the manifold M, converges to a mixture of Gaussian probability distributions on the normal spaces to the manifold M, which generalizes the usual asymptotic normality property,
it has been shown that (i) the parameter dependent probability distributions of the observations are locally asymptotically normal (LAN), from which the asymptotic normality of the maximum likelihood estimator follows, with an explicit expression for the asymptotic covariance matrix, i.e. for the Fisher information matrix, in terms of the Kalman filter associated with the tangent linear Gaussian model, and (ii) the score function (i.e. the derivative of the log–likelihood function w.r.t. the parameter), evaluated at the true value of the parameter and suitably normalized, converges to a Gaussian r.v. with zero mean and covariance matrix equal to the Fisher information matrix.
Among the many application domains of particle methods, or interacting Monte Carlo methods, ASPI has decided to focus on applications in localisation (or positioning), navigation and tracking , which already covers a very broad spectrum of application domains. The objective here is to estimate position (and also velocity, attitude, etc.) of a mobile object, from the combination of different sources of information, including
a prior dynamical model of typical evolutions of the mobile,
measurements provided by sensors,
and possibly a digital map providing some useful feature (altitude, gravity, power attenuation, etc.) at each possible position,
see . This Bayesian dynamical estimation problem is also called filtering, and its numerical implementation using particle methods, known as particle filtering, has found applications in target tracking, integrated navigation, points and / or objects tracking in video sequences, mobile robotics, wireless communications, ubiquitous computing and ambient intelligence, sensor networks, etc. Particle filtering was definitely invented by the target tracking community, which has already contributed to many of the most interesting algorithmic improvements and is still very active. Beyond target tracking, ASPI is willing to consider all possible applications of particle filtering in positioning, navigation and tracking. To be more specific, the objective of ASPI is to implement and assess the performance of particle filtering in localisation and tracking of mobile terminals in a wireless network, using network measurements (received power level and possibly TDOA (time difference of arrival)) and a database of reference measurements of the power level, available in a few points or in the form of a digital map (power attenuation map). Generic algorithms will be proposed and specialized to the indoor context (wireless local area network, e.g. WiFi) and to the outdoor context (cellular network, e.g. GSM) when necessary. Constraints and obstacles such as building walls in an indoor environment, street, road or railway networks in an outdoor environment, will be represented in a simplified manner, using a prior model on a graph, e.g. a Voronoï graph as in similar experiments in mobile robotics. To assess the performance of the proposed localisation and tracking algorithms, posterior Cramér–Rao bounds for a Markov process on a graph will be derived.
Another objective, somewhat reminiscent of the SLAM (simultaneous localisation and mapping) problem in mobile robotics, is to update and enrich the initial database of reference measurements, using network measurements collected on–the–fly.
To illustrate that particle filtering algorithms are efficient, easy to implement, and extremely visual and intuitive by nature, several demos have been programmed by Fabien Campillo, with the corresponding MATLAB scripts available on the site http://www.irisa.fr/aspi/campillo/site-pf/. This material has proved very useful in training sessions and seminars that have been organized in response to demand from industrial partners (SAGEM, CNES and EDF), and this effort will be continued. At the moment, the following four demos are available :
Navigation of an aircraft using altimeter measurements and an elevation map of the terrain: a noisy measurement of the terrain height below the aircraft is obtained as the difference between (i) the aircraft altitude above sea level (provided by a pressure sensor) and (ii) the aircraft altitude above the terrain (provided by an altimetric radar), and is compared with the terrain height at any possible point (read on the elevation map). In this demo, a cloud (swarm) of particles explores multiple possible trajectories according to some rough model, and particles are replicated or discarded depending on whether or not the terrain height below the particle (i.e. at the same horizontal position) matches the available noisy measurement of the terrain height below the aircraft.
Tracking a dim point target in a sequence of noisy images. In this track–before–detect demo, a point, which cannot be detected in a single image of the sequence, can be automatically tracked in a sequence of noisy images.
Positioning and tracking in the presence of obstacles. In this interactive demo, presented by Simon Maskell (QinetiQ and CUED, Cambridge University Engineering Department) at a GDR ISIS event co–organized by François Le Gland and Jean–Pierre Le Cadre in December 2002, several stations (the number and locations of which are chosen interactively) try to position and track a mobile from noisy angle measurements, in the presence of obstacles (walls, tunnels, etc., the number, locations and orientations of which are also chosen interactively), which make the mobile temporarily invisible from one or several stations. This nonlinear filtering problem in a complex environment, with many constraints, would be practically impossible to implement using Kalman filters.
Positioning and tracking of a mobile in an urban area. In this interactive demo, power attenuation maps associated with several base stations (the number and locations of which are chosen interactively) are combined with power measurements of the signal received from the base stations, and with a random walk prior model for the motion of the mobile user, in order to position and track a user in an urban Manhattan–like environment. The user is allowed to enter buildings, where no signal at all is received, and the particle filter is able in principle to lock quickly onto the user position whenever he / she leaves the building.
This is a collaboration with Nadia Oudjane, from the OSIRIS (Optimisation, simulation, risque et statistiques) department of Électricité de France R&D.
We consider the special case of a Feynman–Kac flow, see , where the selection functions can possibly take the zero value, which may occur in many important practical situations:
simulation of rare events using an importance splitting approach, see ,
simulation of a Markov chain conditioned or constrained to visit a given sequence of subspaces of the state space,
simulation of a r.v. in the tail of a given probability distribution,
nonlinear filtering with bounded observation noise,
implementation of a robustification approach in nonlinear filtering, using a truncation of the likelihood function , ,
algorithms of approximate nonlinear filtering, where hidden state and observation are simulated jointly, and where the simulated observation is validated against the actual observation , e.g. if there is no explicit expression available for the likelihood function, or if there does not even exist a likelihood function (nonadditive observation noise, noise–free observations, etc.).
If the selection function gk can possibly take the zero value, and
even if , it can happen that
the evaluation of the function gk returns the zero value for
all the particles generated at the end of the mutation step, i.e. the particle system dies out and the algorithm cannot go on.
A reinitialization procedure has been proposed and studied
in , in which the particle system is generated if
necessary from an arbitrary restarting probability distribution .
Alternatively, one could be interested in the behavior of the algorithm
until the extinction time $\tau^N$ of the particle system.
Under the assumption ,
the probability that the algorithm can not go on
until the time instant n goes to zero with exponential
rate .
Using a global approach and a central limit theorem for triangular
arrays of martingale increments, the following central limit theorem
has been proved in
for the nonsequential particle algorithm with a constant number N
of particles
We have studied a sequential particle algorithm,
already proposed in ,
which automatically keeps the particle system alive,
i.e. which ensures its non–extinction.
In some sense, the sequential algorithm
is a fixed performance policy,
as opposed to the usual nonsequential algorithm
which is a fixed effort policy.
For any level H>0, and for any , the random number
of particles is defined by
where the r.v.'s are
i.i.d. with common probability distribution 0 (for k = 0), and
common probability distribution k-1HQk (for ).
The particle approximation is now parameterized by the level H>0,
and under the additional
assumption ,
the random number NkH of particles is a.s. finite.
By construction, the particle system never dies out and the
algorithm can always go on, and in addition
in probability, with rate .
For the sequential particle algorithm, with a random number of particles
defined by the level H>0, we have obtained the following central limit
theorem
The proof follows the approach of using an induction argument, and relies on a central limit theorem for the sum of a random number of random variables , which is known in sequential analysis since the 1950's, see also or . To get a fair comparison of the nonsequential and sequential particle approximations, we can use as a normalizing factor the time–average
of the number of particles, which is an indication of how much computing power has been used, and we obtain
This is a collaboration with Pierre Del Moral, from université de Nice–Sophia Antipolis, and with Pascal Lezaud, from CENA (Centre d'Études de la Navigation Aérienne) in Toulouse.
The numerical evaluation of extremely small probabilities, such as the probability of occurrence of a rare event (typically the probability that a set is reached by a continuous–time strong Markov process before a fixed or a random time), is a challenging numerical problem whose applications are numerous: analysis and performance evaluation of a telecommunication network, evaluation of conflict or collision risk in air traffic management, see , etc. To deal with this class of problems, there are on one hand probabilistic methods, which provide asymptotic results and are based on large deviations theory, and on the other hand simulation methods, the most widely used of which is importance sampling, where independent trajectories (i) are generated under a proposal probability distribution for which the considered event is not so rare, and (ii) are weighted by the Radon–Nikodym derivative of the true probability distribution w.r.t. the proposal probability distribution.
An alternative method is importance splitting, in which a sequence of increasingly rare events is defined, and a selection mechanism is introduced where trajectories for which an intermediate event holds true split / branch into several offspring, while trajectories for which none of the intermediate events holds true are terminated. This selection mechanism makes it possible to generate many trajectories for which the rare event holds true, and to evaluate statistics of such trajectories.
To be more specific, the objective is to compute the probability of the rare (but critical) event, and the probability distribution of the critical trajectories, i.e.
$$\mathbb{P}[T_B \le T] \quad\text{and}\quad \mathrm{Law}(X_t,\ 0 \le t \le T_B \mid T_B \le T)$$
where
$$T_B = \inf\{t \ge 0 : X_t \in B\}$$
is the first hitting time of some critical region $B$ in the state–space and $T$ is some deterministic final time, or an a.s.–finite stopping time.
If the probability is extremely small, say $10^{-9}$ or even smaller, and if independent trajectories are simulated, it is very likely that none of these trajectories will manage to reach $B$, and in any case there will be too few such trajectories to get accurate estimates.
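As a rough order of magnitude (a standard computation, added here for illustration), write $p$ for the probability of the rare event and $\hat p_N$ for the proportion of successful trajectories among $N$ independent ones. Then
$$\mathrm{Var}(\hat p_N) = \frac{p\,(1-p)}{N}, \qquad \frac{\sqrt{\mathrm{Var}(\hat p_N)}}{p} = \sqrt{\frac{1-p}{p\,N}} \approx \frac{1}{\sqrt{p\,N}},$$
so that a relative error of 10% requires $N \approx 100 / p$, i.e. about $10^{11}$ trajectories for $p = 10^{-9}$. Similarly, the probability that no trajectory at all reaches $B$ is
$$(1-p)^N \approx e^{-pN},$$
which for $N = 10^6$ and $p = 10^{-9}$ is $e^{-10^{-3}} \approx 0.999$.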
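Introducing a decreasing sequence of subsets
$$B \subset \cdots \subset B_{k+1} \subset B_k \subset \cdots \subset B_1$$
and the associated increasing sequence
$$T_1 \le \cdots \le T_k \le T_{k+1} \le \cdots \le T_B$$
of first hitting times
$$T_k = \inf\{t \ge 0 : X_t \in B_k\}$$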
the problem can be formulated in terms of Feynman–Kac flows, see , as
and
where
for a discrete–time Markov chain
with values in the set of trajectories, and for a selection function gk
with value 1 if the trajectory has managed to reach the set Bk
before time T, i.e. if its terminal value belongs to the set Bk,
and 0 otherwise. Genealogical models can also be considered, and allow
to address the approximation of the probability distribution of the
critical trajectories.
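As a sketch of what this formulation amounts to, the product formula behind importance splitting can be written as follows. The notation is an assumption made for this illustration and is not taken from the text: the last subset is taken to be the critical region, $B_n = B$, the symbol $\mathcal{X}_k$ denotes the trajectory of the process stopped at time $T_k \wedge T$, and by convention $T_{-1} = 0$.

```latex
\{T_B \le T\} = \bigcap_{k=0}^{n} \{T_k \le T\},
\qquad
\mathbb{P}[T_B \le T]
= \mathbb{E}\Big[\prod_{k=0}^{n} g_k(\mathcal{X}_k)\Big]
= \prod_{k=0}^{n} \mathbb{P}[T_k \le T \mid T_{k-1} \le T]
```

The first identity holds because the subsets are nested, so that $T_0 \le T_1 \le \cdots \le T_n = T_B$. Each factor of the product is a transition probability which is typically much larger than the probability of the rare event itself, and which can therefore be estimated by simulation.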
Within this general framework, it is straightforward to implement
interacting Monte Carlo methods.
Specializing the simple bootstrap algorithm to this context,
trajectories of the continuous–time Markov process are generated
independently until either they reach the set $B_k$ or the final time $T$
is reached. Let $I_k^N$ denote the set of successful trajectories:
these are allowed to survive at the next generation,
where their offspring will try to reach the set $B_{k+1}$, while
unsuccessful trajectories are terminated.
Under such a fixed effort policy,
transition probabilities are estimated as the ratio $|I_k^N| / N$
of the number of successful trajectories to the total number $N$ of simulated trajectories, and
the probability of the rare event is estimated as the product of these ratios.
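The fixed effort policy can be illustrated by the following minimal sketch. It is not the authors' implementation: the continuous–time Markov process is replaced by a discrete–time Gaussian random walk with negative drift, the sets $B_k$ are assumed to be half–lines above increasing thresholds, and the names `run_to_level` and `fixed_effort_splitting` are hypothetical.

```python
import random

def run_to_level(x, t, level, horizon, rng, drift=-0.1, sigma=1.0):
    """Advance a toy Gaussian random walk from (x, t) until it reaches
    `level` or the final time `horizon` (assumed stand-in for the process)."""
    while x < level and t < horizon:
        x += drift + sigma * rng.gauss(0.0, 1.0)
        t += 1
    return x, t, x >= level

def fixed_effort_splitting(levels, horizon, n_particles, seed=0):
    """Estimate P[last level reached before `horizon`] with N trajectories per stage."""
    rng = random.Random(seed)
    particles = [(0.0, 0)] * n_particles       # all trajectories start at x = 0, t = 0
    estimate = 1.0
    for level in levels:                        # B_k = [level_k, +infinity)
        survivors = []
        for x, t in particles:
            x, t, hit = run_to_level(x, t, level, horizon, rng)
            if hit:
                survivors.append((x, t))        # successful trajectories
        if not survivors:
            return 0.0                          # extinction of the particle system
        estimate *= len(survivors) / n_particles  # ratio successes / N
        # offspring of the survivors start the next stage (uniform resampling)
        particles = [rng.choice(survivors) for _ in range(n_particles)]
    return estimate

if __name__ == "__main__":
    print(fixed_effort_splitting([2.0, 4.0, 6.0], horizon=50, n_particles=1000))
```

The returned value is the product of the estimated transition probabilities; the early return on an empty set of survivors is the extinction event.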
It is also possible to adapt a more general SIR algorithm
to this context, which combines importance splitting
and importance sampling: trajectories are generated under a proposal
probability distribution for which reaching the set $B_k$ is not so rare
an event, and successful trajectories are given a number of offspring
at the next generation related to their weight, i.e. to
the Radon–Nikodym derivative of the true probability distribution
w.r.t. the proposal probability distribution. In other words, among all
the successful trajectories, those which are closest to a typical
trajectory from the true probability distribution are given more
offspring than others.
Because the selection functions are indicator functions,
special attention should be paid to the problem of extinction of the
particle system, which arises when $I_k^N$ is empty, i.e. when none of the trajectories is able to reach the set $B_k$.
A possible solution is to implement a sequential version
of the algorithms.
For the simple bootstrap algorithm, given an integer $H$,
trajectories are generated independently until exactly $H$ of them
reach the set $B_k$, and $N_k^H$ denotes the total number of simulated
trajectories.
Under such a fixed performance policy,
transition probabilities are again estimated as the ratio $H / N_k^H$
of the number of successful trajectories to the total number of simulated trajectories, and
the probability of the rare event is estimated as the product of these ratios.
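The fixed performance policy can be sketched in the same spirit. Again this is only an illustration under stated assumptions: the process is a toy discrete–time Gaussian random walk, the sets $B_k$ are half–lines above increasing thresholds, each new trajectory starts from a uniformly chosen survivor of the previous stage, and the cap `max_trials` is a safeguard added here, not part of the description above. All function names are hypothetical.

```python
import random

def run_to_level(x, t, level, horizon, rng, drift=-0.1, sigma=1.0):
    """Advance a toy Gaussian random walk from (x, t) until it reaches
    `level` or the final time `horizon` (assumed stand-in for the process)."""
    while x < level and t < horizon:
        x += drift + sigma * rng.gauss(0.0, 1.0)
        t += 1
    return x, t, x >= level

def fixed_performance_splitting(levels, horizon, n_success, seed=0, max_trials=10**6):
    """Simulate at each stage until exactly `n_success` (H) trajectories succeed."""
    rng = random.Random(seed)
    entrance = [(0.0, 0)]                      # starting points for the current stage
    estimate = 1.0
    for level in levels:
        survivors, trials = [], 0
        while len(survivors) < n_success:
            if trials >= max_trials:           # safeguard (assumption of this sketch)
                raise RuntimeError("too many trials at level %r" % level)
            x, t = rng.choice(entrance)
            x, t, hit = run_to_level(x, t, level, horizon, rng)
            trials += 1                        # trials ends up as N_k^H
            if hit:
                survivors.append((x, t))
        estimate *= n_success / trials         # ratio H / N_k^H
        entrance = survivors
    return estimate

if __name__ == "__main__":
    print(fixed_performance_splitting([1.0, 2.0, 3.0], horizon=100, n_success=100))
```

Extinction cannot occur here, since each stage ends only once $H$ successes have been collected; the price is a random computational effort.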
For each of these algorithms, many convergence results have been obtained within the general framework of Feynman–Kac flows, such as central limit theorems (CLT) providing expressions for the asymptotic variance of the approximation error.
On the occasion of the summer project of Nordine El Baraka (EGIM,
École Généraliste d'Ingénieurs de Marseille),
under the direction of Frédéric Cérou, we have started to study
a variant of the above algorithms, which is closer to the
original importance splitting algorithm.
Here, any trajectory which has managed to reach the set $B_k$ before
time $T$ is given a fixed number $R_k$ of offspring at the next
generation, each offspring receiving the fraction $1/R_k$ of the weight
of its ancestor. Other variants where trajectories which go in the
wrong direction for too long are eliminated, such as the RESTART
algorithm, have
also been considered.
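The fixed splitting variant, where each successful trajectory receives $R_k$ offspring carrying the fraction $1/R_k$ of its weight, can be sketched as follows. This is an illustration only, with the same assumptions as for any toy model: a discrete–time Gaussian random walk, sets $B_k$ given by increasing thresholds, and hypothetical function names. The RESTART variant is not covered.

```python
import random

def run_to_level(x, t, level, horizon, rng, drift=-0.1, sigma=1.0):
    """Advance a toy Gaussian random walk from (x, t) until it reaches
    `level` or the final time `horizon` (assumed stand-in for the process)."""
    while x < level and t < horizon:
        x += drift + sigma * rng.gauss(0.0, 1.0)
        t += 1
    return x, t, x >= level

def fixed_splitting(levels, offspring, horizon, n_initial, seed=0):
    """levels[k] is the threshold of B_k, offspring[k] is R_k (one entry per
    level except the last one). Returns the total weight reaching the last level."""
    rng = random.Random(seed)
    population = [(0.0, 0, 1.0 / n_initial) for _ in range(n_initial)]
    for k, level in enumerate(levels):
        reached = []
        for x, t, w in population:
            x, t, hit = run_to_level(x, t, level, horizon, rng)
            if hit:
                reached.append((x, t, w))
        if k < len(levels) - 1:
            r = offspring[k]
            # each successful trajectory gets R_k offspring of weight w / R_k
            population = [(x, t, w / r) for x, t, w in reached for _ in range(r)]
        else:
            population = reached
    return sum(w for _, _, w in population)

if __name__ == "__main__":
    print(fixed_splitting([2.0, 4.0, 6.0], [3, 3], horizon=50, n_initial=1000))
```

In contrast with the fixed effort policy, the population size is random here, and the total weight is conserved at each splitting.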
We have started a general comparison of all these different methods.
Preliminary empirical results suggested that methods which are more
elaborate than simple importance splitting do not perform
significantly better.
It may also be noted that imposing a finite time horizon changes
the behavior of some of the proposed algorithms, which were initially
designed assuming that the final time is some renewal time, associated
with some recurrent event.
Another issue is how to choose the intermediate
subsets $B_0 \supset B_1 \supset \cdots \supset B_n$. This choice is rather
critical, and we are working towards a method for building these level
sets in an adaptive manner, at least for one–dimensional models.
This is a collaboration with Nadia Oudjane, from the OSIRIS (Optimisation, simulation, risque et statistiques) department of Électricité de France R&D.
Given nonnegative kernels $R_n$ and a nonnegative initial measure,
we consider the unnormalized (linear) Feynman–Kac flow obtained by applying these kernels successively to the initial measure.
A well–known example is provided by the unnormalized conditional
probability distribution of the hidden state given past observations,
when the hidden state and the observation form jointly a Markov chain:
this includes HMM and switching AR models as special cases,
with a decomposition of the kernel $R_k$
in which $Q_k$ is the Markov transition kernel
and the selection function $g_k$ is the likelihood function.
If the nonnegative kernels depend smoothly (continuously or differentiably) on a parameter, in such a way that the Feynman–Kac flow depends smoothly on the parameter, we would like to design a particle approximation that depends smoothly on the parameter as well. The need for such a regularity property arises for instance
in sensitivity analysis, e.g. in the computation of Greeks in option pricing,
in statistics of HMM, e.g. in the evaluation of the derivative w.r.t. the parameter of any contrast function that can be expressed in terms of the conditional probability distribution of the hidden state given past observations.
Running a particle algorithm for each different value of the parameter would result in using different particle systems, and it is very unlikely that the approximation will be smooth in any reasonable sense w.r.t. the parameter.
To be specific, we consider only the HMM case,
and we assume that the Markov transition kernel $Q_k$ admits a representation
in which it is easy to simulate jointly the hidden state $X_k$ and an auxiliary r.v.,
given $X_{k-1} = x$, under the probability measure corresponding
to a fixed pivot value of the parameter (loosely speaking,
this auxiliary r.v. is related to the Radon–Nikodym derivative
of the probability measure w.r.t. the pivot probability
measure).
Introducing the importance weight function $r_k$,
which may have an explicit expression or not,
shows that the Markov transition kernel $Q_k$ is absolutely
continuous w.r.t. the pivot Markov transition kernel $Q_k^0$, and
that an importance decomposition holds.
It is therefore possible to design a particle approximation
which is completely characterized by
a single set
of particle positions and weights, which depend only on the pivot
value of the parameter, and for each different value of the parameter
by a set $S_k$ of secondary weights.
The algorithm is completely described by the mechanism
which builds the pivot particle system and the secondary weights $S_k$ at generation $k$ from those at generation $k-1$.
Using an arbitrary importance decomposition
results in a smooth SIR algorithm, in which particles
are selected according to their respective primary weights (selection step),
move according to the importance Markov kernel $P_k^0$ (mutation
step),
are weighted by evaluating the importance weight
function $W_k^0$ on the resulting transitions (weighting step),
and in addition, for each different value of the parameter,
are further weighted by evaluating the importance weight
function $r_k$ on the resulting transitions (secondary weighting step).
In other words, a single particle system is propagated, which
depends only on the pivot value of the parameter, and for each different
value of the parameter this single particle system is further weighted
with a different set of secondary weights.
Notice that this last step does not bring any additional source
of randomness into the algorithm:
if the importance weight function $r_k$ depends smoothly (continuously
or differentiably) on the parameter, then the approximation depends
smoothly on the parameter as well.
This can be thought of as an interacting particle implementation of
the MCML (Monte Carlo maximum likelihood)
algorithm.
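The four steps of the smooth SIR algorithm can be sketched on a toy example. Everything specific in this sketch is an assumption made for illustration: a scalar linear Gaussian HMM $X_k = \theta X_{k-1} + W_k$, $Y_k = X_k + V_k$ with unit noise variances, an initial state that does not depend on the parameter, and an importance kernel equal to the pivot transition kernel, so that the primary weight is the likelihood and the secondary weight is the ratio of transition densities. The function names are hypothetical.

```python
import math
import random

def gauss_density(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def smooth_sir_likelihood(observations, theta_pivot, thetas, n_particles=500, seed=0):
    """Estimate the likelihood at each value in `thetas` from ONE particle
    system simulated under `theta_pivot` (toy model, illustration only)."""
    rng = random.Random(seed)
    n = n_particles
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # X_0 ~ N(0, 1), assumed parameter-free
    primary = [1.0] * n                             # primary weights
    secondary = {th: [1.0] * n for th in thetas}    # one set of secondary weights per theta
    log_norm = 0.0                                  # log of product of past mean primary weights
    for y in observations:
        # Selection step: resample according to the primary weights only.
        log_norm += math.log(sum(primary) / n)
        idx = rng.choices(range(n), weights=primary, k=n)
        xs = [xs[i] for i in idx]
        secondary = {th: [s[i] for i in idx] for th, s in secondary.items()}
        # Mutation step: move under the pivot transition kernel.
        new_xs = [theta_pivot * x + rng.gauss(0.0, 1.0) for x in xs]
        # Secondary weighting step: density ratio, no additional randomness.
        for th, s in secondary.items():
            for i in range(n):
                a = new_xs[i] - th * xs[i]
                b = new_xs[i] - theta_pivot * xs[i]
                s[i] *= math.exp(-0.5 * (a * a - b * b))
        # Weighting step: primary weight = likelihood of the observation.
        primary = [gauss_density(y - x) for x in new_xs]
        xs = new_xs
    scale = math.exp(log_norm) / n
    return {th: scale * sum(w * v for w, v in zip(primary, s))
            for th, s in secondary.items()}

if __name__ == "__main__":
    obs = [0.3, -0.1, 0.5, 0.2]
    print(smooth_sir_likelihood(obs, theta_pivot=0.5, thetas=[0.4, 0.5, 0.6]))
```

Since the random numbers are consumed only by the selection and mutation steps, which involve the pivot value alone, the estimate is a smooth function of the parameter for a fixed seed, and at the pivot value the secondary weights are all equal to 1.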
Alternatively, one could differentiate the Feynman–Kac flow w.r.t. the parameter, so as to obtain a linear tangent Feynman–Kac flow,
and one could design a joint particle approximation for
the Feynman–Kac flow and the linear tangent Feynman–Kac flow,
using a single system of particles and signed weights.
To be specific, this would require that the Markov transition
kernel $Q_k$ be differentiable w.r.t. the parameter, and that
the linear tangent kernel admit a representation
in which it is easy to simulate jointly the hidden state $X_k$ and an auxiliary r.v.,
given $X_{k-1} = x$, under the probability measure (loosely speaking,
this auxiliary r.v. is the logarithmic derivative w.r.t. the parameter
of the r.v. considered above).
It is therefore possible to design a joint particle approximation
of the Feynman–Kac flow
and of the linear tangent Feynman–Kac flow.
The approximation is completely characterized by a single set of particle positions and weights, and by a set of signed weights, and the algorithm is completely described by the mechanism which builds these sets at one generation from those at the previous generation.
This alternative approach has already been taken, and it is remarkable that the particle approximation obtained in this way for the linear tangent Feynman–Kac flow coincides exactly, for the pivot value of the parameter, with the particle approximation that would be obtained by differentiating the smooth SIR algorithm described above w.r.t. the parameter. This property provides solid justification for all these different algorithms.
Contract INRIA 1 02 C 0037 — January 2002/December 2004
In view of the ongoing evolution in management and control of large complex real–time systems towards an increasing distribution of sensors, decisions, etc., and an increasing concern for safety criticality, the IST project HYBRIDGE addresses methodological issues in stochastic analysis and distributed control of hybrid systems, with conflict management in air traffic as its target application area. It is coordinated by National Aerospace Laboratory (NLR, Netherlands) and its partners are Cambridge University (United Kingdom), Università di Brescia and Università dell'Aquila (Italy), Twente University (Netherlands), National Technical University of Athens (NTUA, Greece), Centre d'Études de la Navigation Aérienne (CENA), Eurocontrol Experimental Center (EEC), AEA Technology and BAe Systems (United Kingdom), and INRIA.
Our contribution to this project concerns the work package on modeling accident risks with hybrid stochastic systems, and the work package on risk decomposition and risk assessment methods, and their implementation using conditional Monte Carlo methods. This problem has motivated our work on the importance splitting approach to the simulation of rare events.
Contract INRIA 1 04 C 0862 — October 2004/September 2005
This is a collaboration with Nadia Oudjane, from the OSIRIS (Optimisation, simulation, risque et statistiques) department of Électricité de France R&D.
The objective is to estimate parameters in various multi–factor models for electricity spot price, from the observation of the prices of futures contracts that are traded in the market. This problem fits within the general framework of parameter estimation in hidden Markov models, and we propose to rely on joint particle approximation schemes for the optimal filter and the linear tangent filter, so as to maximize the likelihood function, or other suitable contrast functions, w.r.t. the parameters. In the simple case where the futures contracts are written for electricity delivery over a single period of time, and if multi–factor models are based on Ornstein–Uhlenbeck processes driven by a Brownian motion, then the problem is linear Gaussian and explicit expressions provided by the Kalman filter can be used to assess the performance of the proposed approach. In practice however, futures contracts are usually written for electricity delivery over a long period of time, and realistic multi–factor models should be used, based on Ornstein–Uhlenbeck processes driven by a Lévy process, which makes the problem non–linear, with a non–Gaussian noise structure. The performance of the proposed approach will be assessed on real data provided by the industrial partner.
Affiliation to the French partner of the network — September 2000/August 2004.
Members of ASPI participate in the European network DYNSTOCH «Statistical Methods for Dynamical Stochastic Models», which gathers nine European research groups: Københavns Universitet (coordinator, Denmark), Universiteit van Amsterdam (Netherlands), Humboldt Universität zu Berlin and Albert Ludwigs Universität Freiburg (Germany), Universidad Politécnica de Cartagena (Spain), Helsingin Yliopisto (Finland), University College London (United Kingdom), LADSEB/CNR (Italy), université de Paris 6 (France), within the IHP program. The annual workshop was held in Copenhagen in June 2004. Our contribution within the French team of the network (PMA, laboratoire de Probabilités et Modèles Aléatoires, université de Paris 6/7) is focused on asymptotic statistics of HMM with finite or continuous state space, and their particle implementation. The proposal of a follow–up Marie Curie research training network DYNSTOCH, coordinated by Peter Spreij (UvA, Amsterdam), has been submitted to the November 2003 call of the FP6, with INRIA Rennes as a research group on its own, and with additional research groups from SZTAKI (Hungary), Universiteit Gent (Belgium), Ruprecht Karls Universität Heidelberg (Germany), and Linköpings Universitet (Sweden).
Since September 2002, F. Le Gland has been coordinating with Olivier Cappé (ENST Paris) a project (action spécifique) «Méthodes particulaires» supported by the STIC department of CNRS, and promoted by the RTP 24 «Mathématiques de l'Information et des Systèmes». This project follows another project «Chaînes de Markov cachées et filtrage particulaire», which started in December 2001 within the inter–departmental CNRS programme Math–STIC, and was coordinated by F. Le Gland and Éric Moulines (ENST Paris). A two–day workshop on «Particle and Monte Carlo Methods» was organized in July 2004 with support from AS 67. The workshop was held in Barcelona, as a satellite event to the 6th world congress of the Bernoulli Society and to the 67th annual meeting of the Institute of Mathematical Statistics (IMS), and attracted about 50 participants. At the closing meeting of the RTP 24 in November 2004 at ENST Paris, F. Le Gland gave an overview presentation of the activities and results of AS 67.
Since September 2002, F. Le Gland has been coordinating a project (action spécifique, AS 67) «Méthodes particulaires» supported by the STIC department of CNRS, and promoted by the RTP 24 «Mathématiques de l'Information et des Systèmes». He has co–organized with Pierre Del Moral (LSP Toulouse, now at université de Nice Sophia Antipolis) and with Éric Moulines (ENST Paris) a two–day workshop in July 2004 on «Particle and Monte Carlo Methods» with support from AS 67 and from the project «Chaînes de Markov cachées et filtrage particulaire», awarded within the inter–departmental CNRS programme Math–STIC. The workshop was held in Barcelona, as a satellite event to the 6th world congress of the Bernoulli Society and to the 67th annual meeting of the Institute of Mathematical Statistics (IMS), and attracted about 50 participants.
F. Campillo has organized a special session on «Méthodes particulaires et applications» at the Journées SMAI / MAS (Modélisation Aléatoire et Statistique), held in September 2004 at IECN (Institut Élie Cartan, Nancy).
F. Le Gland has reported on the PhD theses of Jean–Jacques Szkolnik (ENSIETA and université de Bretagne Occidentale, advisor : André Quinquis), Vivien Rossi (ENSAM and université de Montpellier 2, advisor : Jean–Pierre Vila), and Karim Dahia (ONERA and université Joseph Fourier, advisors : Christian Musso and Dinh–Tuan Pham).
F. Le Gland gives a course on Kalman filtering, particle filtering and hidden Markov models, within the Master STI (école doctorale MATISSE, université de Rennes 1).
Within the continuing education programme «École Chercheurs» organized by IRISA on the theme of signal processing, and held in Cesson Sévigné in October 2004, F. Campillo has given two introductory lectures on particle filtering and its application to mobile tracking in a cellular network.
In addition to presentations with a publication in the proceedings, and which are listed at the end of the document, members of ASPI have also given the following presentations.
F. Campillo has been invited to LMA (Laboratoire de Mécanique et d'Acoustique, CNRS), Marseilles, for a week in June 2004, and has given a seminar there on particle filtering.
N. Caylus has given a talk on statistical inference of HMM using Monte Carlo methods with interaction in the IRMAR seminar «Processus Stochastiques et Statistiques», Rennes in June 2004.
A. Guyader has given a talk on the k–nearest neighbours algorithm
at the «Journées Données Fonctionnelles» held in September 2004
at UHB (université de Haute–Bretagne, Rennes).
F. Le Gland has given a talk on statistical inference of HMM using Monte Carlo methods with interaction in the joint université de Montpellier 2 / ENSAM / INRA seminar «Probabilités et Statistiques» in February 2004, and in the seminar «Méthodes Particulaires pour l'Estimation et la Commande Optimale Stochastique» organized by Nadia Oudjane at EDF, Clamart in March 2004. He has also given a talk on tracking mobiles using Monte Carlo methods with interaction, at a seminar on stochastic approaches held at LORIA, Nancy in April 2004, within the regional project (Plan Etat Région Lorraine) TOAI «Télé–Opération et Assistants Intelligents».
At the joint 6th world congress of the Bernoulli Society and 67th annual meeting of the Institute of Mathematical Statistics (IMS), held in Barcelona in July 2004, F. Campillo has given a talk on local asymptotic normality for partially observed small noise diffusions, and F. Le Gland has given a talk on smooth interacting particle approximation of Feynman–Kac flows depending on a parameter, in the invited session organized by Arnaud Doucet on «Applications of Particle Methods in Statistics». At the satellite workshop on «Particle and Monte Carlo Methods», F. Le Gland has given a talk on the simulation of rare events using particle methods.
At the «Journées SMAI / MAS» (Modélisation Aléatoire et Statistique), held in September 2004 at IECN (Institut Élie Cartan, Nancy), F. Le Gland has given a talk on the simulation of rare events using particle methods, in the special session «Méthodes particulaires et applications» organized by F. Campillo.
Diego Salmeron Martinez, a PhD student of Mathieu Kessler at Universidad de Murcia, has visited us for one month from mid–June to mid–July 2004, in the framework and with the support of the DYNSTOCH European research network.
Stéphane Sénécal, post–doc at the Institute of Statistical Mathematics in Tokyo, has visited us for one week in May 2004, and has given a talk on his joint work with Arnaud Doucet on sampling strategies for sequential Monte Carlo methods.
Alexander Yu. Veretennikov, professor at the University of Leeds, has visited us for two days in December 2004, and has given a talk on the invariant probability distribution of an ergodic diffusion process, and its regularity w.r.t. some parameter.
Rivo Rakotozafy, assistant professor at the University of Fianarantsoa, has been awarded a grant by the French embassy in Antananarivo, Madagascar, to support three visits (one per year, each lasting three months) to prepare a Madagascar habilitation thesis (HDR) under the supervision of Fabien Campillo. A related objective is to set up a collaboration between the University of Fianarantsoa and INRIA. This collaboration is mainly focused on Bayesian inference applied to engineering for renewable resources, see the 2003 activity report of the former SIGMA2 project. These results were presented at the 36èmes «Journées de Statistique» (SFdS'04), held in May 2004 in Montpellier, and at the 7ème CARI (Colloque Africain sur la Recherche en Informatique), held in November 2004 in Hammamet. Rivo Rakotozafy was also involved in the topic of mobile tracking in an urban cellular network.