In recent years, a new generation of numerical algorithms has begun to spread through the scientific community. Surprisingly enough, with a few exceptions, many of these modern ideas do not come from physics but from biology and ethology. In a growing number of scientific disciplines, researchers now interpret real-world processes and engineering systems less as purely deterministic, crude clockwork mechanisms and more as random, sophisticated, biology-inspired processes. This new generation of engineering models is based on stochastic ideas and natural principles such as chance, randomness, interactions, reinforcement strategies, exploration rules, biology-inspired adaptation and selection transitions, learning, reproduction, birth and death, ancestral lines and historical processes, genealogical tree evolutions, self-organization principles, and many others.

These biology-inspired stochastic algorithms are often presented as natural heuristic simulation schemes without any mathematical foundation, without a performance analysis ensuring their convergence, and without even a theoretical discussion that clarifies their applicability. An important aspect of our project is to create a concrete bridge between pure and applied probability, statistics, biology, stochastic engineering and computer science. This bridging effort is probably one of the most important keys to turning real natural processes into engineering devices and stochastic algorithms, by learning what can be abstracted, copied or adapted. Conversely, these abstracted models, which adapt natural mechanisms and biological capabilities, also provide a better understanding of the real processes.

By its nature, *the team project is not a single application-driven research project*. The reasons are threefold. First, the same stochastic algorithm is very often used in a variety of application areas. Second, every application domain offers a series of different perspectives that can be used to improve the design and the performance of the algorithms. Last but not least, concrete industrial applications, as well as most concrete problems arising in biology, physics and chemistry, require specific attention. In general, we do not use a single class of stochastic algorithm but a broader set of stochastic search algorithms that incorporates facets of nature-inspired strategies.

Our research project is centered on two central problems in advanced stochastic engineering: *Bayesian inference and rare event simulation*, and more particularly *unsupervised learning, multi-target tracking, data assimilation and forecasting, as well as infection spread inference*. These important and natural research directions have emerged as logical parts of the team project, combined with interdisciplinary approaches well represented on the Bordeaux university campus.

The fundamental and theoretical aspects of our research project are essentially concerned with the stochastic analysis of the following three classes of biology-inspired stochastic algorithms: *branching and interacting particle systems, reinforced random walks and self-interacting processes, random tree based models*. One of our prospective research projects is to apply the Bayesian learning methodology and the recent particle filter technology to the design of a new generation of interactive `evolutionary computation` and `stochastic art composition` models.

The team co-organized the Machine Learning Summer School in Bordeaux, from September 4 to September 17, 2011. This event gathered 100 participants, mostly PhD students, from 18 different countries and covered various topics (support vector machines, Monte Carlo methods, Bayesian inference, boosting, etc.) presented by world-class experts in their fields. MLSS is a recurrent and important event of the machine learning community.

This idea of analyzing natural systems and transferring the underlying principles into stochastic algorithms and technical implementations is one of the central components of the ALEA team project. Adapting natural mechanisms and biological capabilities clearly provides a better understanding of the real processes, and it also improves the performance and the power of engineering devices. Our project is centered, on the one hand, on the understanding of biological processes in terms of mathematical, physical and chemical models and, on the other hand, on the use of these biology-inspired stochastic algorithms to solve complex engineering problems.

A large number of virtual interfaces, robotic devices, numerical schemes and stochastic algorithms were invented by mimicking biological processes or simulating natural mechanisms. The terminology *"mimicking or simulating"* does not mean finding an exact copy of natural processes, but rather *elaborating the mathematical principles so that they can be abstracted from the original biological or physical model*. In our context, the whole series of evolutionary type principles discussed in previous sections can be abstracted into only three different and natural classes of stochastic algorithms, depending on the nature of the biology-inspired interaction mechanism used in the stochastic evolution model. These three stochastic search models are listed below:

1)
*Branching and interacting particle systems (birth and death chains, spatial branching processes, mean-field interaction between generations):*

The first generation of adaptive branching-selection algorithms is very often built on the same genetic type paradigm: when exploring a state space with many particles, we duplicate better-fitted individuals, while light particles with poor fitness die. From a computational point of view, we generate a large number of random problem solvers. Each one is then rated according to a fitness or performance function defined by the developer. Mimicking natural selection, an evolutionary algorithm selects the best solvers in each generation and breeds them.

2)
*Reinforced random walks and self-interacting chains (reinforced learning strategies, interaction processes with respect to the occupation measure of the past visited sites):*

This type of reinforcement is observed frequently in nature and society, where "beneficial" interactions with the past history tend to be repeated. A new class of historical mean field type interpretation models of reinforced processes was developed by the team project leader in a pair of articles. Self-interaction gives the opportunity to build new stochastic search algorithms with the ability to, in a sense, re-initialize their exploration from the past, restarting from some better-fitted initial value already met in the past.

3)
*Random tree based stochastic exploration models (coalescent and genealogical tree search explorations techniques on path space):*

The last generation of stochastic random tree models is concerned with biology-inspired algorithms on path and excursion spaces. These genealogical adaptive search algorithms coincide with genetic type particle models in excursion spaces. They have been applied successfully to generate the excursion distributions of Markov processes evolving in critical and rare event regimes, as well as to path estimation and related smoothing problems arising in advanced signal processing. We underline the fact that the complete mathematical analysis of these random tree models, including their long time behavior, their propagation of chaos properties and their combinatorial structures, is far from complete. This class of genealogical tree based models was introduced for solving smoothing problems and, more generally, Feynman-Kac semigroups on path spaces.
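The selection and mutation mechanism of the first class, combined with the genealogical (path space) point of view of the third class, can be illustrated with a minimal sketch. It is not the team's implementation: the function name `genealogical_particle_search`, the Gaussian-shaped fitness, the random-walk mutation and all numerical values are assumptions chosen only for illustration. Each particle carries its whole ancestral line, so selection duplicates or kills entire paths.

```python
import math
import random

def genealogical_particle_search(n_particles, n_generations, fitness, mutate, init, rng):
    """Genetic-type particle model on path space (illustrative sketch).

    Every particle is stored together with its ancestral line, i.e. the
    list of states visited by its ancestors.
    """
    paths = [[init(rng)] for _ in range(n_particles)]
    for _ in range(n_generations):
        # Rate each particle by the fitness of its current (last) state.
        weights = [fitness(path[-1]) for path in paths]
        # Selection: better-fitted particles are duplicated, poorly fitted ones die.
        parents = rng.choices(paths, weights=weights, k=n_particles)
        # Mutation: each offspring extends a copy of its parent's ancestral line.
        paths = [parent + [mutate(parent[-1], rng)] for parent in parents]
    return paths

# Hypothetical example: search for the maximum of a fitness peaked at x = 3.
def example_fitness(x):
    return math.exp(-(x - 3.0) ** 2)

def example_mutation(x, rng):
    return x + rng.gauss(0.0, 0.1)

def example_init(rng):
    return rng.uniform(-5.0, 5.0)

paths = genealogical_particle_search(
    200, 50, example_fitness, example_mutation, example_init, random.Random(0)
)
mean_position = sum(path[-1] for path in paths) / len(paths)
distinct_roots = len({path[0] for path in paths})
print(mean_position, distinct_roots)
```

In this sketch the number of distinct roots shrinks over the generations, which is the coalescence of ancestral lines that makes the genealogical tree structure visible.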

This short section is only concerned with the list of concrete application domains developed by our team project on Bayesian inference and unsupervised learning, nonlinear filtering and rare event analysis. Most of these application areas result from fruitful collaborations with other national institutes through a series of four recently selected ANR research projects and one INRIA-INRA joint research project.

Three application domains are directly related to evolutionary computing, particle filtering and Bayesian inference. They are currently investigated by our team project:

**Multi-target tracking**. Multi-target tracking deals with the task of estimating the states of a set of moving targets from a set of measurements obtained sequentially. These
measurements may either arise from one of the targets or from clutter and the measurement-to-target association is generally unknown. This problem can then be recast as a dynamic clustering
one where the clusters are the clutter and the different targets. The targets actually move in time, some targets may appear/disappear over time and the number of targets is generally
unknown and time-varying. We are running this research project with the DCNS-SIS division in Toulon.

**Forecasting and data assimilation:** This new application domain concerns the application of the particle filter technology and more general sequential Monte Carlo methods to data assimilation problems arising in forecasting. The ALEA team project is involved in the `ANR 2008 selected project PREVASSEMBLE` with Météo France Toulouse, the INRIA Rennes and the LMD in Paris.

**Virtual prairie:** This application domain of evolutionary computing is concerned with the design of ecological systems, mixed-species models and prairial ecosystems. For more details, we refer the reader to the web site of the `Virtual Prairie project`. The ALEA project is a partner of the 2008 ANR SYSCOM project named MODECOL.
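As an illustration of the particle filter technology mentioned above for tracking and data assimilation, here is a minimal sketch of a bootstrap particle filter. The scalar linear Gaussian model, the function names `simulate` and `bootstrap_filter`, and every parameter value are assumptions made for this example; they are not taken from the projects listed here.

```python
import math
import random

def simulate(n_steps, a, sigma_x, sigma_y, rng):
    """Simulate a hypothetical model x_t = a x_{t-1} + noise, y_t = x_t + noise."""
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, sigma_x)
        states.append(x)
        observations.append(x + rng.gauss(0.0, sigma_y))
    return states, observations

def bootstrap_filter(observations, n_particles, a, sigma_x, sigma_y, rng):
    """Return the filtering mean estimate after each observation."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Prediction: move every particle with the signal dynamics.
        particles = [a * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Updating: weight each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / sigma_y) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # Guard against numerical underflow of all weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Selection: resample particles proportionally to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means

rng = random.Random(0)
states, observations = simulate(100, 0.9, 0.5, 0.5, rng)
estimates = bootstrap_filter(observations, 500, 0.9, 0.5, 0.5, rng)
```

Multinomial resampling at every step is the simplest choice; practical implementations often resample only when the weights degenerate.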

Three other application domains are directly related to rare event analysis using stochastic particle simulation techniques. These projects are currently investigated by our team project; two of them are 2008 ANR research projects:

**Watermarking of digital contents:**

The term watermarking refers to a set of techniques for embedding/hiding information in a digital audio or video file, such that the change is not noticed and is very hard to remove. To be used in an application, a watermarking technique must be reliable. Estimating the probabilities of protection false alarms and of traceability code failures is practically not achievable without rare event analysis. This application domain of particle rare event technology is the subject of a joint ANR 2007 research project with IRISA-INRIA in Rennes and LIS INPG in Grenoble.

**Epidemic propagation analysis:**

This project aims at developing stochastic mathematical models for the spread of transmissible infectious diseases, together with dedicated statistical methodologies, with the intent to deliver efficient diagnostic/prediction tools for epidemiologists. This application domain of particle rare event technology is the subject of a joint ANR 2008 research project with Telecom Paristech, the Laboratoire Paul Painlevé in Lille 1 and the University of Paris 5 (cf. `Programme Systèmes Complexes et Modélisation Mathématique, list of 2008 selected projects`).

**Statistical eco-microbiology predictions:**

This project aims at developing stochastic models and algorithms for the analysis of bacterial ecosystems, especially in food safety. The objective is to predict and control critical proliferation risks. This is the subject of a joint research project with the INRA of Paris and Montpellier (Appel d'Offre INRIA-INRA 2008: Systèmes Complexes).

A discrete time stochastic model for a multiagent system given in terms of a large collection of interacting Markov chains is studied. The evolution of the interacting particles is described
through a time inhomogeneous transition probability kernel that depends on the 'gradient' of the potential field. The particles, in turn, dynamically modify the potential field through their
cumulative input. Interacting Markov processes of the above form have been suggested as models for active biological transport in response to external stimulus such as a chemical gradient. One
of the basic mathematical challenges is to develop a general theory of stability for such interacting Markovian systems and for the corresponding nonlinear Markov processes that arise in the
large agent limit. Such a theory would be key to a mathematical understanding of the interactive structure formation that results from the complex feedback between the agents and the potential
field. It will also be a crucial ingredient in developing simulation schemes that are faithful to the underlying model over long periods of time. The goal of this work
is to study qualitative properties of the above stochastic system as
the number of particles (N) and the time parameter (n) approach infinity. In this regard, asymptotic properties of a deterministic nonlinear dynamical system, that arises in the propagation of
chaos limit of the stochastic model, play a key role. We show that under suitable conditions this dynamical system has a unique fixed point. This result allows us to study stability properties
of the underlying stochastic model. We show that as N

While statisticians are well accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, the paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature which both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models.
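To make the Wang-Landau idea at the heart of this algorithm concrete, here is a deliberately simplified, single-chain sketch on a small discrete state space. It does not reproduce the interacting parallel chains or the adaptive proposal of the paper: the one-bin-per-state partition, the step-size schedule `gamma0 / (1 + t / t0)`, the bimodal target and all names are assumptions made for illustration. The chain penalizes the bins it has already visited, which pushes it across low-probability barriers.

```python
import math
import random

def wang_landau(log_density, n_states, n_steps, rng, gamma0=1.0, t0=1000.0):
    """Wang-Landau-style random walk on states 0..n_states-1, one bin per state."""
    log_theta = [0.0] * n_states   # running log-penalty of each bin
    visits = [0] * n_states
    x = 0
    for t in range(n_steps):
        # Symmetric nearest-neighbour proposal; moves outside the range are rejected.
        y = x + rng.choice((-1, 1))
        if 0 <= y < n_states:
            # Metropolis ratio for the penalized target density(s) / theta(s).
            log_ratio = (log_density(y) - log_theta[y]) - (log_density(x) - log_theta[x])
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = y
        # Penalize the bin currently occupied, with a slowly decreasing step size.
        log_theta[x] += gamma0 / (1.0 + t / t0)
        visits[x] += 1
    return visits, log_theta

def bimodal_log_density(s):
    """Hypothetical target with two modes (s = 4 and s = 15) and a deep barrier."""
    return -0.5 * min((s - 4) ** 2, (s - 15) ** 2)

visits, log_theta = wang_landau(bimodal_log_density, 20, 20000, random.Random(1))
print(visits)
```

A plain Metropolis chain with the same proposal would rarely cross the barrier between the two modes in this many steps; the accumulated penalties are what flatten the visit histogram here.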

For the Ornstein-Uhlenbeck process, the asymptotic behavior of the maximum likelihood estimator of the drift parameter is totally different in the stable, unstable, and explosive cases. Notwithstanding this trichotomy, we investigate sharp large deviation principles for this estimator in the three situations. In the explosive case, we exhibit a very unusual rate function with a flat valley and an abrupt discontinuity point at its minimum.
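For reference, the estimator in question can be written explicitly under the standard normalization, which is an assumption here since the text above does not fix it: unit diffusion coefficient and initial state $X_0 = 0$, observed over $[0, T]$. The second expression follows from Itô's formula applied to $X_t^2$.

```latex
dX_t = \theta X_t \, dt + dW_t, \qquad X_0 = 0,
\qquad
\hat{\theta}_T
  = \frac{\int_0^T X_t \, dX_t}{\int_0^T X_t^2 \, dt}
  = \frac{X_T^2 - T}{2 \int_0^T X_t^2 \, dt}.
```

With this convention the stable, unstable and explosive cases correspond to $\theta < 0$, $\theta = 0$ and $\theta > 0$ respectively.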

Recently, it has been stated that the complexity of a solution is a good indicator of the amount of overfitting it incurs. However, measuring the complexity of a program in Genetic Programming is not a trivial task. We study functional complexity and how it relates to overfitting on symbolic regression problems. We consider two measures of complexity: Slope-based Functional Complexity, inspired by the concept of curvature, and Regularity-based Functional Complexity, based on the concept of Hölderian regularity. In general, both complexity measures appear to be poor indicators of program overfitting. However, results suggest that Regularity-based Functional Complexity could provide a good indication of overfitting in extreme cases.

During the development of applied systems, an important problem that must be addressed is that of choosing the correct tools for a given domain or scenario. This general task has been addressed by the genetic programming (GP) community by attempting to determine the intrinsic difficulty that a problem poses for a GP search. We present an approach to predict the performance of GP applied to data classification, one of the most common problems in computer science. The novelty of the proposal is to extract statistical descriptors and complexity descriptors of the problem data, and from these estimate the expected performance of a GP classifier. We derive two types of predictive models: linear regression models and symbolic regression models evolved with GP. The experimental results show that both approaches provide good estimates of classifier performance, using synthetic and real-world problems for validation. In conclusion, this paper shows that it is possible to accurately predict the expected performance of a GP classifier using a set of descriptors that characterize the problem data.

The analysis of image regularity using Hölder exponents can be used to characterize singular structures contained within an image, and provide a compact description of local shape and appearance. However, estimating the Hölder exponent is not a trivial task and current methods tend to be slow and complex. Therefore, the goal is to automatically synthesize image operators that can be used to estimate the Hölder regularity of an image. We pose this task as an optimization problem and use Genetic Programming (GP) to search for operators that can approximate a traditional estimator, the oscillations method. In our experiments, GP was able to evolve estimators that achieve a low error and a high correlation with the ground truth estimation. Furthermore, most of the GP estimators are faster than the traditional approaches; in some cases their runtime is orders of magnitude smaller. This result allowed us to implement a real-time estimation of the Hölder exponent on a live video signal, the first such implementation in the current literature. Moreover, the evolved estimators are used to generate local descriptors of salient image regions, a task for which we obtain a stable and robust matching that is comparable with state-of-the-art methods. In conclusion, the evolved estimators produced by GP could help expand the application domain of Hölderian regularity within the fields of image analysis and signal processing.
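The principle of the oscillations method used above as the reference estimator can be sketched in one dimension: the oscillation of the signal in a window of radius r around a point behaves like r raised to the Hölder exponent, so the exponent is estimated as the slope of log-oscillation against log-radius. The sketch below is an assumption-laden illustration (a 1D signal instead of an image, a hypothetical function name, radii expressed in samples), not the estimator evaluated in the study.

```python
import math

def holder_exponent_oscillations(signal, index, radii):
    """Estimate the pointwise Hölder exponent of `signal` at position `index`.

    For each radius r (in samples), the oscillation is the difference between
    the maximum and the minimum of the signal over the window of radius r.
    The estimate is the least-squares slope of log(oscillation) versus log(r).
    """
    xs, ys = [], []
    for r in radii:
        window = signal[max(0, index - r): index + r + 1]
        oscillation = max(window) - min(window)
        if oscillation > 0:
            xs.append(math.log(r))
            ys.append(math.log(oscillation))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Test signal |t - 1/2|^(1/2) on [0, 1], whose Hölder exponent at t = 1/2 is 1/2.
n = 1024
signal = [abs(i / n - 0.5) ** 0.5 for i in range(n + 1)]
estimate = holder_exponent_oscillations(signal, n // 2, [2, 4, 8, 16, 32, 64])
print(estimate)
```

Since a rescaling of the radii only shifts the regression intercept, expressing them in samples rather than in physical units does not change the estimated slope.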

In ISAR processing, post-processing of the range-Doppler image is useful to help the practitioner with ship recognition. Among the image post-processing tools, interpolation methods can be of interest, especially when zooming. We study the relevance of Hölderian regularity-based interpolation. In that case, interpolating consists in adding a new scale in the wavelet transform, and the new wavelet coefficients can be estimated from the others. In the original method, initially proposed by two of the authors, the image is first interpolated along the rows and then along the columns. The diagonal pixels are estimated as the mean of the adjacent original and interpolated pixels. Here, we propose a variant where the diagonal pixels are estimated by taking into account the local orientation of the image. It has the advantage of conserving local regularity on all interpolated pixels of the image. A comparative study on synthetic data and real range-Doppler images is then carried out with alternative interpolation techniques such as linear, bicubic and nearest-neighbour interpolation. The simulation results confirm the effectiveness of the approach.

The objective of this contract (2009-2011) between the ALEA and CQFD teams and EDF is to develop algorithms for the recursive prediction of electricity consumption. The team will organize a workshop on this subject at the Institut Henri Poincaré.

The objective of this contract is to develop particle algorithms for the pricing of American-style options.

The objective of this contract (2010-2011) is to propose algorithms for the estimation of uncertainties in laser experiments.

The PSI project (Psychology and sounds interactions), headed by P. Legrand, received a grant from the Aquitaine region for a PhD thesis on "Dimension reduction in supervised learning. Application to the study of brain activity".

To counter dramatic events such as the one that happened in Bombay last year (a terrorist commando coming from the sea killed more than 200 people in the city of Bombay), authorities have decided to deploy efficient sea surveillance systems to protect coastal zones, including sensitive infrastructures that are often in the vicinity of major cities.

Regulations on frequency allocation and on coastal construction are strong constraints to be taken into account when installing the technical capabilities needed to permanently survey vulnerable littoral zones. For example, any new active sensor must be frequency-compatible with the numerous existing ones in inhabited regions. In this context, an attractive solution for coastal surveillance is to deploy passive sensor networks, because they:

Do not need to be made compatible with existing active sensor networks.

Offer wide latitude for installation: passive sensors need not be on the shoreline and can be deployed inland. This flexibility offers more potential sites and thus makes it possible to optimise the deployment for the best coverage of the sensitive zone.

Are totally undetectable by the external technical means available to structured criminal organisations.

To meet these objectives, the PROPAGATION project will study, develop and test a demonstrator that builds a maritime traffic picture from a set of passive sensors (passive radar, AIS and optronic cameras) deployed over a coastal site. This is a joint ANR project with DCNS, Thalès, Ecomer and Exavision, accepted in 2009.

This is an interdisciplinary exploratory research project between the Institut de Mathématiques de Bordeaux and the Laboratory Ecologie & Evolution, UMR 7625 CNRS-UPMC-ENS (responsible: B. Cazelles). The objective of this project, which concerns the dynamics of epidemic diseases characterized by multiple strains of pathogens, is to use the competencies of the ALEA team to obtain efficient Bayesian optimization techniques. An opening workshop on stochastic models and Bayesian inference in epidemiology was organized in Bordeaux in November 2011.

Partner 1: Oxford University, Department of Statistics (UK)

Interacting Particle Systems

Bayesian nonparametrics

Partner 2: Imperial College, Department of Statistics (UK)

Interacting Particle Systems

The following researchers visited the Team ALEA during 2011: M. Ludkovski (Univ. UCSB), A. Doucet (Univ. Oxford), C. Holmes (Oxford), C. Archambeau (Xerox), N. Whiteley (Univ. Bristol), S. Singh (Cambridge), L. Bornn (UBC), Leonardo Trujillo (Cicese).

P. Del Moral is currently associate editor/editor for the following journals:

Chief editor: `ESAIM: Proceedings`, since 2006.

Associate editor: `Applied Mathematics and Optimization`, since 2009.

Associate editor: Revista de Matemàtica: Teoria y Aplicaciones, since 2009.

Associate editor: `Stochastic Analysis and Applications`, since 2001.

P. Del Moral participated in the following committees:

Scientific expert for the PES research grant selection committee of the French Ministry of Research in 2010, section CNU 25 and 26.

Responsible, with Xavier Warin (EDF R&D Clamart), for the theme *modélisation stochastique et incertitude* of the EDF-INRIA strategic action, since 2010.

P. Del Moral co-organized an interdisciplinary workshop on stochastic models and Bayesian inference in epidemiology in Bordeaux.

B. Bercu is responsible for the thematic group MAS (Modélisation Aléatoire et Statistique) at SMAI.

B. Bercu is an assistant director of the Institut de Mathématiques de Bordeaux (IMB). He is also a member of the IMB council and the UFR council of the University of Bordeaux. He is a member of the CNU section 26.

B. Bercu is co-responsible for the specialty "Modélisation Statistique et Stochastique" of the Master MIMSE.

F. Caron, P. Legrand and P. Del Moral co-organized the Machine Learning Summer School 2011, held near Bordeaux on September 4-17, 2011.

F. Caron gave a practical session on parametric and nonparametric Bayesian clustering at the Machine Learning Summer School 2011 in Bordeaux.

F. Caron was on the senior program committee of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011).

F. Caron was in the program committee of the NIPS workshop on Choice Models and Preference Learning.

F. Caron was a reviewer for the following international machine learning conferences and journals this year: International Conference on Machine Learning (ICML 2011), Neural Information Processing Systems (NIPS 2011), Journal of the Royal Statistical Society B, Bayesian Analysis, IEEE Transactions on Signal Processing, Statistics and Computing, Journal of Machine Learning Research, Automatica.

P. Legrand was a reviewer for the following international conferences and journals this year: Signal Processing, Evolve 2011, EA 2011.

P. Legrand and P. Del Moral were in the organizing committee of the international conference Evolve 2011.

P. Legrand was on the organizing committee of the French summer school EA 2011 and the international conference EA 2011.

B. Bercu is teaching the following courses (142 hours)

Licence: Mathématiques générales, Analyse et Algèbre SVE, 36h, L1, University of Bordeaux, France

Master: Séries Chronologiques, 48h, M2, University of Bordeaux, France

Master: Processus aléatoires à temps discret, Martingales, 30h, M1, University of Bordeaux, France

Master: Probabilités, 30h, L3, University of Bordeaux, France

F. Caron is teaching the following courses (50 hours)

Master : Unsupervised Learning, 25 hours, M2, University of Bordeaux, France

Master : Bayesian Methods, 13 hours, M2, University of Bordeaux, France

Master: Projet Informatique, 12 hours, M2, University of Bordeaux, France

P. Del Moral gives the following courses

**Since September 2011**: Professeur chargé de cours, Polytechnique, `CMAP` (58h).

Tutorials (travaux dirigés/petites classes):

1) Stochastic methods and Monte Carlo methods.

2) Random models in ecology and evolution.

P. Legrand is teaching the following courses (244 hours)

Licence: Analyse, 32h, L1, University of Bordeaux, France

Licence: Mathématiques générales, 72h, L1, University of Bordeaux, France

Licence: Informatique pour les mathématiques, 72h, L1, University of Bordeaux, France

Licence: Complément d'algèbre, 72h, L2, University of Bordeaux, France

A. Richou is teaching the following courses (128 hours)

Master: Probabilité, 32h, M1, University of Bordeaux 1, France

Licence: Probabilités et Statistiques, 32h, L3, University of Bordeaux 1, France

Licence: Probabilité et Statistiques, L3, 32h, University of Bordeaux 1, France

Licence: Probabilité et Statistiques, L1, 32h, University of Bordeaux 1, France

PhD & HdR:

PhD: Michele Pace, Stochastic models and methods for multi-object tracking, University of Bordeaux, July 13, 2011, supervised by P. Del Moral and F. Caron