## Section: Scientific Foundations

### Multilevel splitting for rare event simulation

See 4.2, and 6.1, 6.6, 6.10 and 6.11.

The estimation of the small probability of a rare but critical event is a crucial issue in industrial areas such as nuclear power plants, the food industry, telecommunication networks, the finance and insurance industry, air traffic management, etc.

In such complex systems, analytical methods cannot be used, and naive Monte Carlo methods are clearly inefficient for the accurate estimation of very small probabilities. Besides importance sampling, a widespread alternative technique is multilevel splitting [60], where trajectories going towards the critical set are given offspring, thus increasing the number of trajectories that eventually reach the critical set. As shown in [5], the Feynman–Kac formalism of 3.1 is well suited for the design and analysis of splitting algorithms for rare event simulation.
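To make the inefficiency of naive Monte Carlo concrete: the relative standard error of the empirical frequency estimator of a probability $p$ from $n$ independent samples is $\sqrt{(1-p)/(np)}$, so the required sample size grows like $1/p$. A minimal illustration (the helper names below are ours, not from the text):

```python
import math

# Relative standard error of the naive Monte Carlo estimator of a
# probability p from n i.i.d. samples: sqrt((1 - p) / (n * p)).
def naive_mc_relative_error(p, n):
    return math.sqrt((1.0 - p) / (n * p))

# Sample size needed to reach a target relative error.
def samples_needed(p, target_rel_err):
    return math.ceil((1.0 - p) / (p * target_rel_err ** 2))

# For p = 1e-9, a 10% relative error already requires about 1e11 samples,
# i.e. 1e11 calls to the (possibly expensive) simulator.
print(samples_needed(1e-9, 0.10))
```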

**Propagation of uncertainty** Multilevel splitting can be used in static situations. Here, the
objective is to learn the probability distribution of an output random
variable $Y=F\left(X\right)$, where the function $F$ is only defined pointwise,
for instance by a computer programme, and where the probability distribution
of the input random variable $X$ is known and easy to simulate from.
More specifically, the objective
could be to compute the probability of the output random variable
exceeding a threshold, or more generally to evaluate the
cumulative distribution function of the output random variable for
different output values.
This problem is characterized by
the lack of an analytical expression for the function, the
computational cost of a single pointwise evaluation of the function,
which means that the number of calls to the function should be limited as
much as possible, and finally the complexity and/or unavailability of the
source code of the computer programme, which makes any modification
very difficult or even impossible, for instance to change the model as in
importance sampling methods.

The key issue is to learn as fast as possible the regions of the input space which contribute most to the computation of the target quantity. The proposed splitting method consists in (i) introducing a sequence of intermediate regions in the input space, implicitly defined by exceeding an increasing sequence of thresholds or levels, (ii) counting the fraction of samples that reach a level given that the previous level has already been reached, and (iii) improving the diversity of the selected samples, usually using an artificial Markovian dynamics. In this way, the algorithm learns the transition probability between successive levels, hence the probability of reaching each intermediate level, and the probability distribution of the input random variable, conditioned on the output variable reaching each intermediate level.

A further remark is that this conditional probability distribution is precisely the optimal (zero variance) importance distribution needed to compute the probability of reaching the considered intermediate level.
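The static scheme described in steps (i)–(iii) can be sketched on a toy example, here estimating $P(X > 4)$ for a standard Gaussian input with $F$ the identity. The quantile-based level selection, population size, and random-walk Metropolis kernel below are illustrative choices, not prescribed by the text:

```python
import math
import random

random.seed(0)

def F(x):            # toy scalar "computer code": identity map
    return x

def sample_input():  # input distribution: standard Gaussian, easy to simulate
    return random.gauss(0.0, 1.0)

def multilevel_splitting(threshold, n=2000, p0=0.1, n_mcmc=5, sigma=1.0):
    """Estimate P(F(X) > threshold) by adaptive multilevel splitting:
    (i) levels are the empirical (1 - p0)-quantiles of the current outputs,
    (ii) the conditional probability of each level is estimated by counting,
    (iii) diversity is restored by a Metropolis kernel targeting N(0, 1)
    conditioned on exceeding the current level."""
    xs = [sample_input() for _ in range(n)]
    prob = 1.0
    while True:
        ys = sorted((F(x) for x in xs), reverse=True)
        k = int(p0 * n)
        level = ys[k - 1]                  # next intermediate level
        if level >= threshold:             # final stage: count exceedances
            n_exceed = sum(1 for x in xs if F(x) > threshold)
            return prob * n_exceed / n
        prob *= k / n                      # conditional probability estimate
        seeds = [x for x in xs if F(x) >= level][:k]
        # Regenerate a population of size n above the level (mutation step).
        xs = []
        while len(xs) < n:
            x = random.choice(seeds)
            for _ in range(n_mcmc):
                prop = x + random.gauss(0.0, sigma)
                accept = math.exp(-0.5 * (prop * prop - x * x))
                if F(prop) >= level and random.random() < accept:
                    x = prop
            xs.append(x)

p_hat = multilevel_splitting(4.0)
print(p_hat)  # true value: P(N(0,1) > 4) is about 3.17e-5
```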

**Rare event simulation** To be specific, consider a complex dynamical system modelled as a Markov
process, whose state can contain both continuous components and
finite components (mode, regime, etc.). The objective is to
compute the probability, hopefully very small, that a critical region
of the state space is reached by the Markov process before a final
time $T$, which can be deterministic and fixed, or random (for instance
the time of return to a recurrent set, corresponding to a nominal
behaviour).

The proposed splitting method consists in (i) introducing a decreasing sequence of intermediate, more and more critical, regions in the state space, (ii) counting the fraction of trajectories that reach an intermediate region before time $T$, given that the previous intermediate region has been reached before time $T$, and (iii) regenerating the population at each stage, through redistribution. In addition to the non–intrusive behaviour of the method, splitting methods make it possible to learn the probability distribution of typical critical trajectories, which reach the critical region before final time $T$, an important feature that methods based on importance sampling usually miss. Many variants have been proposed, depending on whether

- the branching rate (number of offspring allocated to a successful trajectory) is fixed, which allows for depth–first exploration of the branching tree, but raises the issue of controlling the population size, or
- the population size is fixed, which requires a breadth–first exploration of the branching tree, with random (multinomial) or deterministic allocation of offspring, etc.

Just as in the static case, the algorithm learns the transition probability between successive levels, hence the probability of reaching each intermediate level, and the entrance probability distribution of the Markov process in each intermediate region.
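A minimal sketch of the fixed-population variant with multinomial redistribution, on a toy Markov chain (a random walk with negative drift) and hand-picked intermediate levels; the dynamics, levels, and population size below are illustrative assumptions:

```python
import random

random.seed(1)

N, T = 1000, 100
levels = [2.0, 4.0, 6.0, 8.0]   # increasing levels; the last one is critical

def step(x):
    # toy Markov dynamics: random walk drifting away from the levels
    return x + random.gauss(-0.3, 1.0)

def advance(state, level):
    """Run the chain from entrance state (x, t) until it crosses the level
    or time T expires; return the entrance state in the region, or None."""
    x, t = state
    if x >= level:              # possible overshoot past the next level
        return (x, t)
    while t < T:
        x, t = step(x), t + 1
        if x >= level:
            return (x, t)
    return None

particles = [(0.0, 0)] * N      # N copies of the initial state
prob = 1.0
for level in levels:
    hits = [s for s in (advance(p, level) for p in particles) if s is not None]
    if not hits:                # extinction: no trajectory reached the level
        prob = 0.0
        break
    prob *= len(hits) / N       # conditional probability of this level
    # redistribution: resample N entrance states from the successes
    particles = [random.choice(hits) for _ in range(N)]

print(prob)  # estimate of P(walk exceeds 8 before time T)
```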

Contributions have been given to

- minimizing the asymptotic variance, obtained through a central limit theorem, with respect to the shape of the intermediate regions (selection of the importance function), to the thresholds (levels), to the population size, etc.,
- controlling the probability of extinction (when not even one trajectory reaches the next intermediate level), and
- designing and studying variants suited for hybrid state space (resampling per mode, marginalization, mode aggregation),

and in the static case, to

- minimizing the asymptotic variance, obtained through a central limit theorem, with respect to intermediate levels, to the Metropolis kernel introduced in the mutation step, etc.

A related issue is global optimization. Indeed, the difficult problem of finding the set $M$ of global minima of a real–valued function $V$ can be replaced by the apparently simpler problem of sampling a population from a probability distribution depending on a small parameter, and asymptotically supported by the set $M$ as the small parameter goes to zero. The usual approach here is to use the cross–entropy method [68], [39], which relies on learning the optimal importance distribution within a prescribed parametric family. On the other hand, multilevel splitting methods could provide an alternate nonparametric approach to this problem.
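One way such a nonparametric alternative might look: apply the static splitting mechanism (select an elite fraction, then regenerate by random-walk moves constrained below a decreasing level) to minimize a toy function $V$. This is only a sketch under assumed tuning parameters, not a method described in the text:

```python
import random

random.seed(2)

def V(x):  # toy objective with a unique global minimum at x = 3
    return (x - 3.0) ** 2

def splitting_minimize(n=500, p0=0.2, n_iter=30, n_mcmc=5, sigma0=2.0):
    """Splitting for minimization: keep the best fraction p0 of the
    population (defining a decreasing sequence of levels for V), then
    regenerate n points by random-walk moves accepted only if they stay
    below the current level; the proposal scale shrinks over iterations."""
    xs = [random.uniform(-10.0, 10.0) for _ in range(n)]
    sigma = sigma0
    for _ in range(n_iter):
        xs.sort(key=V)
        k = max(1, int(p0 * n))
        level = V(xs[k - 1])          # current level (decreases over time)
        seeds = xs[:k]                # elite samples below the level
        xs = []
        while len(xs) < n:
            x = random.choice(seeds)
            for _ in range(n_mcmc):
                prop = x + random.gauss(0.0, sigma)
                if V(prop) <= level:  # constrained move below the level
                    x = prop
            xs.append(x)
        sigma *= 0.8                  # shrink the proposal scale
    return min(xs, key=V)

x_star = splitting_minimize()
print(x_star)  # close to the global minimizer x = 3
```

In the cross–entropy method the population would instead be used to refit a parametric sampling distribution at each stage; here the elite samples themselves play that role.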