The core component of our scientific agenda focuses on the development of statistical and probabilistic methods for the modeling and the optimization of complex systems. These systems require dynamic and stochastic mathematical representations with discrete and/or continuous variables. Their complexity poses genuine scientific challenges that can be addressed through complementary approaches and methodologies:

*Modeling:* design and analysis of realistic and tractable models for such complex real-life systems taking into account various probabilistic phenomena;

*Estimation:* developing theoretical and computational methods in order to estimate the parameters of the model and to evaluate the performance of the system;

*Control:* developing theoretical and numerical control tools to optimize the performance.

These three approaches are strongly connected and the most important feature of the team is to consider these topics as a whole. This enables the team to deal with real industrial problems in several contexts such as biology, production planning, trajectory generation and tracking, performance and reliability.

The scientific objectives of the team are to provide mathematical tools for modeling and optimization of complex systems. These systems require mathematical representations which are in essence dynamic, multi-model and stochastic. This increasing complexity poses genuine scientific challenges in the domain of modeling and optimization. More precisely, our research activities are focused on stochastic optimization and (parametric, semi-parametric, multidimensional) statistics which are complementary and interlinked topics. It is essential to develop simultaneously statistical methods for the estimation and control methods for the optimization of the models.

**Stochastic modeling**: Markov chains, Piecewise Deterministic Markov Processes (PDMP), Markov Decision Processes (MDP).

The mathematical representation of complex systems is a preliminary step to our final goal, which is the optimization of their performance. The team CQFD focuses on two complementary types of approaches. The first approach is based on mathematical representations built upon physical models where the dynamics of the real system are described by *stochastic processes*. The second one consists in studying the modeling issue in an abstract framework where the real system is considered as a black box. In this context, the outputs of the system are related to its inputs through a *statistical model*.
Regarding stochastic processes, the team studies Piecewise Deterministic Markov Processes (PDMPs) and Markov Decision Processes (MDPs). These two classes of Markov processes form general families of controlled stochastic models suitable for the design of sequential decision-making problems. They appear in many fields such as biology, engineering, computer science, economics, operations research and provide powerful classes of processes for the modeling of complex systems. Our contribution to this topic consists in expressing real-life industrial problems into these mathematical frameworks.
Regarding statistical methods, the team works on dimension reduction models. They provide a way to understand and visualize the structure of complex data sets. Furthermore, they are important tools in several different areas such as data analysis and machine learning, and appear in many applications such as biology, genetics, environment and recommendation systems. Our contribution to this topic consists in studying semiparametric modeling which combines the advantages of parametric and nonparametric models.

**Estimation methods**: estimation for PDMP; estimation in non- and semi- parametric regression modeling.

To the best of our knowledge, there is no general theory for estimating the parameters of PDMPs, although a large number of tools already exist for sub-classes of PDMPs such as point processes and marked point processes. To fill the gap between these specific models and the general class of PDMPs, new theoretical and mathematical developments will be on the agenda of the whole team. In the framework of non-parametric regression or quantile regression, we focus on kernel estimators or kernel local linear estimators for complete data or censored data. New strategies for estimating semi-parametric models via recursive estimation procedures have also received increasing interest recently. The advantage of the recursive estimation approach is that it takes into account the successive arrivals of the information and refines, step after step, the implemented estimation algorithms. These recursive methods do not require restarting the calculation of the parameter estimates from scratch when new data are added to the base. The idea is to use only the previous estimates and the new data to refresh the estimation. The gain in computing time can be substantial, and there are many applications of such approaches.
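The recursive idea described above can be illustrated with a minimal sketch. It assumes a recursive Nadaraya-Watson-type kernel regression estimator evaluated on a fixed grid, with a Gaussian kernel and a bandwidth of the form `scale * n ** (-exponent)`. The class name, the grid, the bandwidth constants and the simulated data are all illustrative assumptions, not the team's estimators. Each update touches only the stored running sums and the new observation.

```python
import math
import random


class RecursiveKernelRegression:
    """Recursive Nadaraya-Watson-type estimate of E[Y | X = x] on a fixed grid.

    Illustrative sketch: Gaussian kernel, bandwidth h_n = scale * n ** (-exponent).
    """

    def __init__(self, grid, bandwidth_scale=0.3, bandwidth_exponent=0.2):
        self.grid = list(grid)
        self.bandwidth_scale = bandwidth_scale
        self.bandwidth_exponent = bandwidth_exponent
        self.n = 0
        self.num = [0.0] * len(self.grid)  # running sums of K_h(x - X_n) * Y_n
        self.den = [0.0] * len(self.grid)  # running sums of K_h(x - X_n)

    def update(self, x_new, y_new):
        """Refresh the estimate using only the stored sums and the new pair."""
        self.n += 1
        h = self.bandwidth_scale * self.n ** (-self.bandwidth_exponent)
        for i, x in enumerate(self.grid):
            u = (x - x_new) / h
            weight = math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))
            self.num[i] += weight * y_new
            self.den[i] += weight

    def estimate(self):
        return [
            n / d if d > 0.0 else float("nan")
            for n, d in zip(self.num, self.den)
        ]


def demo(n_obs=5000, seed=0):
    """Stream simulated data (hypothetical model: Y = sin(2*pi*X) + noise)."""
    rng = random.Random(seed)
    model = RecursiveKernelRegression(grid=[0.25, 0.5, 0.75])
    for _ in range(n_obs):
        x = rng.random()
        y = math.sin(2.0 * math.pi * x) + rng.gauss(0.0, 0.3)
        model.update(x, y)
    return model.estimate()
```

Calling `demo()` streams the observations one at a time. The cost of each update is proportional to the grid size and does not depend on how many observations have already been processed.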

**Dimension reduction**: dimension-reduction via SIR and related methods, dimension-reduction via multidimensional and classification methods.

Most dimension reduction approaches seek lower-dimensional subspaces that minimize the loss of some statistical information. This can be achieved in a modeling framework or in an exploratory data analysis context.

In the modeling framework, we focus our attention on semi-parametric models in order to combine the advantages of parametric and nonparametric modeling. On the one hand, the parametric part of the model allows a suitable interpretation for the user. On the other hand, the functional part of the model offers a lot of flexibility.
In this project, we are especially interested in the semi-parametric regression model

Methods of dimension reduction are also important tools in the fields of data analysis, data mining and machine learning. They provide a way to understand and visualize the structure of complex data sets. Traditional methods include, among others, principal component analysis for quantitative variables and multiple correspondence analysis for qualitative variables. New techniques have also been proposed to address these challenging tasks involving many irrelevant and redundant variables and often comparably few observation units. In this context, we focus on the problem of synthetic variable construction, whose goals include increasing the predictor performance and building more compact variable subsets. Clustering of variables is used for feature construction. The idea is to replace a group of "similar" variables by a cluster centroid, which becomes a feature. The most popular algorithms include K-means and hierarchical clustering. For a review, see, e.g., the textbook of Duda.

**Stochastic control**: optimal stopping, impulse control, continuous control, linear programming.

The main objective is to develop *approximation techniques* to provide quasi-optimal feasible solutions and to derive *optimality results* for control problems related to MDPs and PDMPs:

*Approximation techniques*.
The analysis and the resolution of such decision models mainly rely on the maximum principle and/or the dynamic/linear programming techniques together with their various extensions such as the value iteration (VIA) and the policy iteration (PIA) algorithms. However, it is well known that these approaches are hardly applicable in practice and suffer from the so-called *curse of dimensionality*. Hence, solving numerically a PDMP or an MDP is a difficult and important challenge.
Our goal is to obtain results which are both consistent from a theoretical point of view and computationally tractable and accurate from an application standpoint.
It is important to emphasize that these research objectives were not planned in our initial 2009 program.
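As a point of reference for the value iteration algorithm mentioned above, the following is a minimal sketch for a finite, discounted-cost MDP. The two-state maintenance example (states "working" and "failed", actions "continue" and "repair") and all its costs and transition probabilities are hypothetical, chosen only to make the sketch runnable. On general state and action spaces this table-based form is exactly what becomes intractable.

```python
def value_iteration(costs, transitions, discount=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a finite MDP with a discounted cost criterion.

    costs[s][a]        : one-step cost of action a in state s
    transitions[s][a]  : list of probabilities over next states
    Returns the approximate optimal value function and a greedy policy.
    """
    n_states = len(costs)

    def q_value(s, a, values):
        expected_next = sum(p * values[t] for t, p in enumerate(transitions[s][a]))
        return costs[s][a] + discount * expected_next

    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            min(q_value(s, a, values) for a in range(len(costs[s])))
            for s in range(n_states)
        ]
        gap = max(abs(u - v) for u, v in zip(new_values, values))
        values = new_values
        if gap < tol:
            break

    policy = [
        min(range(len(costs[s])), key=lambda a, s=s: q_value(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Hypothetical toy model. States: 0 = working, 1 = failed.
# Actions: 0 = continue, 1 = repair.
toy_costs = [[0.0, 2.0], [5.0, 4.0]]
toy_transitions = [
    [[0.8, 0.2], [1.0, 0.0]],  # from "working"
    [[0.0, 1.0], [1.0, 0.0]],  # from "failed"
]
toy_values, toy_policy = value_iteration(toy_costs, toy_transitions)
```

In this toy model the greedy policy continues while the machine works and repairs it once it has failed. Each sweep costs on the order of (number of states)² × (number of actions) operations, which is the scaling that the approximation techniques discussed here aim to avoid.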

Our objective is to propose approximation techniques to efficiently compute the optimal value function and to get quasi-optimal controls for different classes of constrained and unconstrained MDPs with general state/action spaces, and possibly unbounded cost function. Our approach is based on combining the linear programming formulation of an MDP with probabilistic approximation techniques related to quantization techniques and the theory of empirical processes. Another aim is to apply our methods to specific industrial applications in collaboration with industrial partners such as Airbus Defence & Space, Naval Group and Thales.

Asymptotic approximations are also developed in the context of queueing networks, a class of models where the decision policy of the underlying MDP is in some sense fixed a priori, and our main goal is to study the transient or stationary behavior of the induced Markov process. Even though the decision policy is fixed, these models usually remain intractable to solve. Given this complexity, the team has developed analyses in some limiting regime of practical interest, i.e., queueing models in the large-network, heavy-traffic, fluid or mean-field limit. This approach is helpful to obtain a simpler mathematical description of the system under investigation, which is often given in terms of ordinary differential equations or convex optimization problems.

*Optimality results*.
Our aim is to investigate new important classes of optimal stochastic control problems including constraints and combining continuous and impulse actions for MDPs and PDMPs. In this framework, our objective is to obtain different types of optimality results. For example, we intend to provide conditions to guarantee the existence and uniqueness of the optimality equation for the problem under consideration and to ensure existence of an optimal (and

Our abilities in probability and statistics apply naturally to industry, in particular in studies of dependability and safety. An illustrative example is the collaboration that started in September 2014 with THALES Optronique. The goal of this project is the optimization of the maintenance of an onboard system equipped with a HUMS (Health Unit Monitoring Systems). The physical system under consideration is modeled by a piecewise deterministic Markov process. In the context of impulse control, we propose a dynamic maintenance policy, adapted to the state of the system and taking into account both random failures and those related to the degradation phenomenon.

The spectrum of applications of the topics that the team can address is large and can concern many other fields. Indeed, non-parametric and semi-parametric regression methods can be used in biometry, econometrics or engineering, for instance. Gene selection from microarray data and text categorization are two typical application domains of dimension reduction, among others. We had, for instance, the opportunity via the scientific program PRIMEQUAL to work on air quality data and to use dimension reduction techniques such as principal component analysis (PCA) or positive matrix factorization (PMF) for pollution sources identification and quantization.

*Bayesian Inference with Interacting Particle Systems*

Functional Description: Biips is a software platform for automatic Bayesian inference with interacting particle systems. Biips allows users to define their statistical model in the probabilistic programming language BUGS, as well as to add custom functions or samplers within this language. It then runs sequential Monte Carlo based algorithms (particle filters, particle independent Metropolis-Hastings, particle marginal Metropolis-Hastings) in a black-box manner so as to approximate the posterior distribution of interest as well as the marginal likelihood. The software is developed in C++ with interfaces to R, Matlab and Octave.

Participants: Adrien Todeschini and François Caron

Contact: Adrien Todeschini
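To give an idea of the kind of sequential Monte Carlo algorithm that Biips runs, here is a minimal sketch of a bootstrap particle filter. It is written in plain Python rather than BUGS and does not use or reproduce the Biips interface. The linear Gaussian state-space model, the function names and all parameter values are assumptions made for illustration. The filter returns the filtering means and an estimate of the log marginal likelihood.

```python
import math
import random


def simulate_state_space(n_steps, a=0.9, sigma_x=1.0, sigma_y=0.5, seed=0):
    """Hypothetical model: x_t = a * x_{t-1} + N(0, sigma_x^2), y_t = x_t + N(0, sigma_y^2)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigma_x / math.sqrt(1.0 - a * a))  # stationary start
    states, observations = [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, sigma_x)
        states.append(x)
        observations.append(x + rng.gauss(0.0, sigma_y))
    return states, observations


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              sigma_x=1.0, sigma_y=0.5, seed=0):
    """Bootstrap particle filter with multinomial resampling at every step."""
    rng = random.Random(seed)
    particles = [
        rng.gauss(0.0, sigma_x / math.sqrt(1.0 - a * a)) for _ in range(n_particles)
    ]
    log_norm = math.log(sigma_y * math.sqrt(2.0 * math.pi))
    log_marginal = 0.0
    filtering_means = []
    for y in observations:
        # Propagate each particle through the state dynamics.
        particles = [a * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight by the observation likelihood (log scale for stability).
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 - log_norm for x in particles]
        shift = max(log_w)
        weights = [math.exp(lw - shift) for lw in log_w]
        total = sum(weights)
        # The average unnormalized weight estimates p(y_t | y_1, ..., y_{t-1}).
        log_marginal += shift + math.log(total / n_particles)
        filtering_means.append(
            sum(w * x for w, x in zip(weights, particles)) / total
        )
        # Resample particles proportionally to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtering_means, log_marginal
```

A typical use is `states, obs = simulate_state_space(100, seed=1)` followed by `means, log_ml = bootstrap_particle_filter(obs)`. For this linear Gaussian model the result could be checked against a Kalman filter, which is why the model was chosen for the sketch.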

Keyword: Statistical analysis

Functional Description: Mixed data arise when observations are described by a mixture of numerical and categorical variables. The R package PCAmixdata extends standard multivariate analysis methods to incorporate this type of data. The key techniques included in the package are PCAmix (PCA of a mixture of numerical and categorical variables), PCArot (rotation in PCAmix) and MFAmix (multiple factor analysis with mixed data within a dataset). The MFAmix procedure handles a mixture of numerical and categorical variables within a group, something which was not possible in the standard MFA procedure. The new version of the package also includes techniques to project new observations onto the principal components of the three methods.

Contact: Marie Chavent

URL: https://

Keyword: Regression

Functional Description: QuantifQuantile is an R package for performing quantization-based quantile regression. The different functions of the package allow the user to construct an optimal grid of N quantizers and to estimate conditional quantiles. This estimation requires a data-driven selection of the size N of the grid, which is implemented in the functions. An illustration of the selection of N is available, and graphical output of the resulting estimated curves or surfaces (depending on the dimension of the covariate) is directly provided via the plot function.

Contact: Jérôme Saracco

URL: https://

Piecewise-deterministic Markov processes form a general class of non-diffusion stochastic models that involve both deterministic trajectories and random jumps at random times. In this paper, we state a new characterization of the jump rate of such a process with discrete transitions. We deduce from this result a nonparametric technique for estimating this feature of interest. We state the uniform convergence in probability of the estimator. The methodology is illustrated on a numerical example.

Authors: Alexandre Genadot (Inria CQFD) and Romain Azaïs.
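The structure of a piecewise-deterministic Markov process, as described in the abstract above, can be made concrete with a minimal simulation sketch. The toy model is an assumption chosen because its jump times can be sampled exactly: the state grows along the flow dx/dt = 1, jumps occur at rate λ(x) = x, and each jump halves the state. It is not the model studied in the paper, and the sketch only simulates trajectories; it does not implement the proposed jump rate estimator.

```python
import math
import random


def simulate_pdmp(x0, horizon, seed=0):
    """Simulate a toy PDMP on [0, horizon].

    Assumed dynamics (illustrative only):
      flow      : dx/dt = 1 between jumps
      jump rate : lambda(x) = x
      jump      : x -> x / 2
    Returns a list of (jump_time, pre_jump_state, post_jump_state).
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    jumps = []
    while True:
        # The next jump occurs when the rate integrated along the flow,
        # x * s + s**2 / 2, reaches an independent Exp(1) variable e.
        e = rng.expovariate(1.0)
        s = -x + math.sqrt(x * x + 2.0 * e)
        if t + s > horizon:
            break
        t += s
        pre_jump = x + s       # state reached by the deterministic flow
        x = pre_jump / 2.0     # random-time jump with deterministic size
        jumps.append((t, pre_jump, x))
    return jumps
```

Observed inter-jump times and pre-jump states such as those returned by `simulate_pdmp(1.0, 200.0)` are the type of data from which a jump rate would be estimated.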

Assume that you observe trajectories of a non-diffusive non-stationary process and that you are interested in the average number of times the process crosses some threshold (in dimension d = 1) or hypersurface (in dimension d ≥ 2). Of course, you can estimate this quantity by its empirical version, counting the number of observed crossings. But is there a better way? In this paper, for a wide class of piecewise smooth processes, we propose estimators of the average number of continuous crossings of a hypersurface based on Kac-Rice formulae. We revisit these formulae in the uni- and multivariate framework in order to be able to handle non-stationary processes. Our statistical method is tested on both simulated and real data.

Authors: Alexandre Genadot (Inria CQFD) and Romain Azaïs.
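For comparison, the naive empirical estimator mentioned in the abstract above (counting observed crossings) can be sketched as follows in dimension d = 1. The function names are illustrative, and trajectories are assumed to be given as lists of values sampled on a time grid. This baseline counts every sign change between consecutive samples, so it cannot separate continuous crossings from crossings caused by jumps, and it is not the Kac-Rice-based estimator proposed in the paper.

```python
def count_crossings(path, level):
    """Number of sign changes of (path - level) between consecutive samples."""
    return sum(
        1
        for left, right in zip(path, path[1:])
        if (left - level) * (right - level) < 0.0
    )


def average_crossings(paths, level):
    """Empirical mean number of crossings over several observed trajectories."""
    return sum(count_crossings(path, level) for path in paths) / len(paths)
```

For example, the sampled path `[0, 1, 2, 1, 0, 1]` crosses the level 0.5 three times.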

In this paper, we propose a Ward-like hierarchical clustering algorithm including spatial/geographical constraints. Two dissimilarity matrices

Authors: Marie Chavent (Inria CQFD), Vanessa Kuentz-Simonet, Amaury Labenne and Jérôme Saracco (Inria CQFD).

We consider a change-point detection problem for a simple class of Piecewise Deterministic Markov Processes (PDMPs). A continuous-time PDMP is observed in discrete time and through noise, and the aim is to propose a numerical method to accurately detect both the date of the change of dynamics and the new regime after the change. To do so, we state the problem as an optimal stopping problem for a partially observed discrete-time Markov decision process taking values in a continuous state space and provide a discretization of the state space based on quantization to approximate the value function and build a tractable stopping policy. We provide error bounds for the approximation of the value function and numerical simulations to assess the performance of our candidate policy.

Authors: Alice Cleynen and Benoîte de Saporta (Inria CQFD).

This article provides a new theory for the analysis of forward and backward particle approximations of Feynman–Kac models. Such formulae are found in a wide variety of applications and their numerical (particle) approximation is required due to their intractability. Under mild assumptions, we provide sharp and non-asymptotic first order expansions of these particle methods, potentially on path space and for possibly unbounded functions. These expansions allow one to consider upper and lower bound bias type estimates for a given time horizon

Authors: Pierre Del Moral (Inria CQFD) and Ajay Jasra.

This article provides a new theory for the analysis of the particle Gibbs (PG) sampler (Andrieu et al., 2010). Following the work of Del Moral and Jasra (2017) we provide some analysis of the particle Gibbs sampler, giving first order expansions of the kernel and minorization estimates. In addition, first order propagation of chaos estimates are derived for empirical measures of the dual particle model with a frozen path, also known as the conditional sequential Monte Carlo (SMC) update of the PG sampler. Backward and forward PG samplers are discussed, including a first comparison of the contraction estimates obtained by first order estimates. We illustrate our results with an example of fixed parameter estimation arising in hidden Markov models.

Authors: Pierre Del Moral (Inria CQFD) and Ajay Jasra.

We consider an elliptic and time-inhomogeneous diffusion process with time-periodic coefficients evolving in a bounded domain of $\mathbb{R}^d$ with a smooth boundary. The process is killed when it hits the boundary of the domain (hard killing) or after an exponential time (soft killing) associated with some bounded rate function. The branching particle interpretation of the non-absorbed diffusion again behaves as a set of interacting particles evolving in an absorbing medium. Between absorption times, the particles evolve independently of each other according to the diffusion semigroup; when a particle is absorbed, another selected particle splits into two offspring. This article is concerned with the stability properties of these non-absorbed processes. Under some classical ellipticity properties on the diffusion process and some mild regularity properties of the hard obstacle boundaries, we prove a uniform exponential strong mixing property of the process conditioned not to be killed. We also provide uniform estimates w.r.t. the time horizon for the interacting particle interpretation of these non-absorbed processes, yielding what seems to be the first result of this type for this class of diffusion processes evolving in soft and hard obstacles, both in homogeneous and non-homogeneous time settings.

Authors: Pierre Del Moral (Inria CQFD) and Denis Villemonais.

The data we analyze derive from the observation of numerous cells of the bacterium Escherichia coli (E. coli) growing and dividing. Single cells grow and divide to give birth to two daughter cells, which in turn grow and divide. Thus, a colony of cells from a single ancestor is structured as a binary genealogical tree. At each node the measured data is the growth rate of the bacterium. In this paper, we study two different data sets. One set corresponds to small complete trees, whereas the other one corresponds to long specific sub-trees. Our aim is to compare both sets. This paper is accessible to postgraduate students and readers with advanced knowledge in statistics.

Authors: Bernard Delyon, Benoîte De Saporta (Inria CQFD), Nathalie Krell, Lydia Robert.

Restoring hazy images is challenging since it must account for several physical factors that are related to the image formation process. Existing analytical methods can only provide partial solutions because they rely on assumptions that may not be valid in practice. This research presents an effective method for restoring hazy images based on genetic programming. Using basic mathematical operators, several computer programs that estimate the medium transmission function of hazy scenes are automatically evolved. Afterwards, image restoration is performed using the estimated transmission function in a physics-based restoration model. The proposed estimators are optimized with respect to the mean absolute error. Thus, the effects of haze are effectively removed while minimizing overprocessing artifacts. The performance of the evolved GP estimators, given in terms of objective metrics and a subjective visual criterion, is evaluated on synthetic and real-life hazy images. Comparisons are carried out with state-of-the-art methods, showing that the evolved estimators can outperform these methods without incurring a loss in efficiency, and in most scenarios achieve improved performance that is statistically significant.

Authors: Jose Enrique Hernandez-Beltran, Victor H. Diaz-Ramirez, Leonardo Trujillo and Pierrick Legrand (Inria CQFD).

Immune interventions consisting of repeated injections are broadly used, as they are thought to improve the quantity and the quality of the immune response. However, they also raise several questions that remain unanswered, in particular the number of injections to make or the delay to respect between injections to achieve this goal. Practical and financial considerations add constraints to these questions, especially in the framework of human studies. We specifically focus here on the use of interleukin-7 (IL-7) injections in HIV-infected patients who are under antiretroviral treatment but still unable to restore normal levels of CD4+ T lymphocytes. Clinical trials have already shown that repeated cycles of IL-7 injections could help maintain CD4+ T lymphocyte levels above the limit of 500 cells per microL, by affecting proliferation and survival of CD4+ T cells. We then aim at answering the question: how can a patient's level of CD4+ T lymphocytes be maintained using a minimum number of injections (i.e., by optimizing the injection strategy)? Based on mechanistic models that were previously developed for the dynamics of CD4+ T lymphocytes in this context, we model the process by a piecewise deterministic Markov model. We then address the question by using some recently established theory on impulse control problems in order to develop a numerical tool determining the optimal strategy. Results are obtained on a reduced model, as a proof of concept: the method makes it possible to define an optimal strategy for a given patient. This method could be applied to optimize injection schedules in clinical trials.

Authors: Chloé Pasin, François Dufour (Inria CQFD), Laura Villain, Huilong Zhang (Inria CQFD), Rodolphe Thiébaut.

The authors present in this study a numerical method which computes the optimal trajectory of an underwater vehicle subject to some mission objectives. The method is applied to a submarine whose goal is to best detect one or several targets and/or to minimise its own detection range perceived by the other targets. The signal considered is acoustic propagation attenuation. This approach is based on dynamic programming of a finite horizon Markov decision process. A quantisation method is applied to fully discretise the problem and allows a numerically tractable solution. Different scenarios are considered. The authors first suppose that the position and the velocity of the targets are known, and then that they are unknown and estimated by a Kalman type filter in a context of passive tracking.

Authors: Huilong Zhang (Inria CQFD), Benoite de Saporta (Inria CQFD), Francois Dufour (Inria CQFD), Dann Laneuville and Adrien Nègre.

In this paper we study the numerical approximation of the optimal long-run average cost of a continuous-time Markov decision process, with Borel state and action spaces, and with bounded transition and reward rates. Our approach uses a suitable discretization of the state and action spaces to approximate the original control model. The approximation error for the optimal average reward is then bounded by a linear combination of coefficients related to the discretization of the state and action spaces, namely, the Wasserstein distance between an underlying probability measure

Authors: Jonatha Anselmi (Inria CQFD), François Dufour (Inria CQFD) and Tomás Prieto-Rumeau.

This paper deals with the zero-sum game with a discounted reward criterion for piecewise deterministic Markov processes (PDMPs) in general Borel spaces. The two players can act on the jump rate and transition measure of the process, with the decisions being taken just after a jump of the process. The goal of this paper is to derive conditions for the existence of min-max strategies for the infinite horizon total expected discounted reward function, which is composed of running and boundary parts. The basic idea is, by using the special features of the PDMPs, to rewrite the problem via an embedded discrete-time Markov chain associated with the PDMP and reformulate the problem as a discrete-stage zero-sum game problem.

Authors: Oswaldo Costa, François Dufour (Inria CQFD) and Tomás Prieto-Rumeau.

This paper is concerned with a minimax control problem (also known as a robust Markov decision process (MDP) or a game against nature) with general state and action spaces under the discounted cost optimality criterion. We are interested in approximating numerically the value function and an optimal strategy of this general discounted minimax control problem. To this end, we derive structural Lipschitz continuity properties of the solution of this robust MDP by imposing suitable conditions on the model, including Lipschitz continuity of the elements of the model and absolute continuity of the Markov transition kernel with respect to some probability measure. Then, we are able to provide an approximating minimax control model with finite state and action spaces, and hence computationally tractable, by combining these structural properties with a suitable discretization procedure of the state space (related to a probabilistic criterion) and the action spaces (associated to a geometric criterion). Finally, it is shown that the corresponding approximation errors for the value function and the optimal strategy can be controlled in terms of the discretization parameters. These results are also extended to a two-player zero-sum Markov game.

Authors: François Dufour (Inria CQFD) and Tomás Prieto-Rumeau.

We consider a discrete-time Markov decision process with Borel state and action spaces. The performance criterion is to maximize a total expected utility determined by an unbounded return function. The existence of optimal strategies is shown under general conditions allowing the reward function to be unbounded both from above and below and the action sets available at each step to the decision maker to be not necessarily compact. To deal with unbounded reward functions, a new characterization for the weak convergence of probability measures is derived. Our results are illustrated by examples.

Authors: François Dufour (Inria CQFD) and Alexandre Genadot (Inria CQFD).

A young subfield of Evolutionary Computing that has gained the attention of many researchers in recent years is Genetic Improvement. It uses an automated search method that directly modifies the source code or binaries of a software system to find improved versions based on some given criteria. Genetic Improvement has achieved notable results and the acceptance of several research communities, namely software engineering and evolutionary computation. Over the past 10 years there have been core publications on the subject; however, to the best of our knowledge, there is no work on applying Genetic Improvement to a meta-heuristic system. In this work we apply the GI framework called GISMO to the Beagle Puppy library version 0.1 in C++, a Genetic Programming system configured to perform symbolic regression on several benchmark and real-world problems. The objective is to improve the processing time while maintaining a similar or better test-fitness of the best individual produced by the unmodified Genetic Programming search. Results show that GISMO can generate individuals that present an improvement on those two key aspects over some problems, while also reducing the effects of bloat, one of the main issues in Genetic Programming.

Authors: Victor R. López-López, Leonardo Trujillo, Pierrick Legrand (Inria CQFD).

The increasing complexity of warfare submarine missions has led Naval Group to study new tactical help functions for underwater combat management systems. In this context, the objective is to find optimal trajectories according to the current mission type by taking into account sensors, environment and surrounding targets. This problem has been modeled as a discrete-time Markov decision process with finite horizon. A quantization technique has been applied to discretize the problem in order to get a finite MDP for which standard methods such as the dynamic and/or the linear programming approaches can be applied. Different kinds of scenarios have been considered and studied.
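The two-step scheme described above (discretize the state space, then solve the resulting finite MDP by dynamic programming) can be sketched on a one-dimensional toy problem. Everything in this sketch is an assumption made for illustration: the controlled random walk, the quadratic cost, the fixed uniform grid with nearest-neighbor projection (used here in place of an optimized quantization grid) and the Monte Carlo estimate of the transition matrices. It bears no relation to the actual submarine model.

```python
import random


def nearest_index(grid, x):
    """Index of the grid point closest to x (nearest-neighbor projection)."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - x))


def quantized_backward_induction(grid, actions, horizon,
                                 noise_sd=0.3, n_samples=200, seed=0):
    """Finite-horizon dynamic programming on a discretized state space.

    Assumed toy dynamics : x' = x + a + N(0, noise_sd^2)
    Assumed costs        : x^2 + 0.1 * |a| per stage, x^2 at the horizon
    """
    rng = random.Random(seed)
    n = len(grid)

    # Step 1: Monte Carlo estimate of the transition matrix of each action
    # between grid points, projecting each simulated state on the grid.
    transition = {}
    for a in actions:
        rows = []
        for x in grid:
            counts = [0] * n
            for _ in range(n_samples):
                counts[nearest_index(grid, x + a + rng.gauss(0.0, noise_sd))] += 1
            rows.append([c / n_samples for c in counts])
        transition[a] = rows

    # Step 2: backward induction on the resulting finite MDP.
    values = [x * x for x in grid]  # terminal cost
    policy = []                     # policy[k][i]: action at stage k, grid point i
    for _ in range(horizon):
        new_values, decisions = [], []
        for i, x in enumerate(grid):
            q = {
                a: x * x + 0.1 * abs(a)
                + sum(p * v for p, v in zip(transition[a][i], values))
                for a in actions
            }
            best = min(q, key=q.get)
            new_values.append(q[best])
            decisions.append(best)
        values = new_values
        policy.insert(0, decisions)
    return values, policy


toy_grid = [-2.0 + 0.5 * i for i in range(9)]
toy_cost_to_go, toy_decisions = quantized_backward_induction(
    toy_grid, actions=(-0.5, 0.0, 0.5), horizon=5
)
```

In this toy problem the first-stage decision pushes the state back toward the origin from both ends of the grid. Refining the grid or increasing `n_samples` trades computing time for accuracy, which is the trade-off that error bounds for quantization-based schemes are meant to quantify.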

Maintenance, optimization, fleet of industrial equipment. The topic of this collaboration with Université de Montpellier and Thales Optronique is the application of Markov decision processes to the maintenance optimization of a fleet of industrial equipment.

Stochastic modelling, Optimization. This project started in November 2017. The topic of this collaboration with Lyre, l'Agence de l'eau Adour-Garonne and ENSEGID is the modeling of the uncertainties in water demand adequacy in a context of global climate change. A PhD thesis (2018-2021) is part of this project.

The involved research groups are Inria Rennes/IRISA Team SUMO; Inria Rocquencourt Team Lifeware; LIAFA University Paris 7; Bordeaux University.

The aim of this research project is to develop scalable model checking techniques that can handle large stochastic systems. Large stochastic systems arise naturally in many different contexts, from network systems to systems biology. A key stochastic model we will consider comes from the biological pathway of apoptosis, the programmed cell death.

Statistical methods have become more and more popular in signal and image processing over the past decades. These methods have been able to tackle various applications such as speech recognition, object tracking, image segmentation or restoration, classification, clustering, etc. We propose here to investigate the use of Bayesian nonparametric methods in statistical signal and image processing. Similarly to Bayesian parametric methods, this set of methods is concerned with the elicitation of prior and computation of posterior distributions, but now on infinite-dimensional parameter spaces. Although these methods have become very popular in statistics and machine learning over the last 15 years, their potential is largely underexploited in signal and image processing. The aim of the overall project, which gathers researchers in applied probabilities, statistics, machine learning and signal and image processing, is to develop a new framework for the statistical signal and image processing communities. Based on results from statistics and machine learning we aim at defining new models, methods and algorithms for statistical signal and image processing. Applications to hyperspectral image analysis, image segmentation, GPS localization, image restoration or space-time tomographic reconstruction will allow various concrete illustrations of the theoretical advances and validation on real data coming from realistic contexts.

The involved research groups are Inria Bordeaux Sud-Ouest Team CQFD and Thales Optronique. This new collaboration with Thales Optronique, which started in October 2017, is funded by the Fondation Mathématique Jacques Hadamard. It is the continuation of the PhD thesis of A. Geeraert. The objective of this project is to optimize the maintenance of multi-component equipment that can break down randomly. The underlying problem is to choose the best dates to repair or replace components in order to minimize a cost criterion that takes into account the costs of maintenance but also the cost associated with the unavailability of the system for the customer. In the PhD thesis of A. Geeraert, the model under consideration was rather simple and only a numerical approximation of the value function was provided. Here, our objective is more ambitious. A more realistic model will be considered and our aim is to provide a tractable quasi-optimal control strategy that can be applied in practice to optimize the maintenance of such equipment.

Program: Dirección General de Investigación Científica y Técnica, Gobierno de España

Project acronym: GAMECONAPX

Project title: Numerical approximations for Markov decision processes and Markov games

Duration: 01/2017 - 12/2019

Coordinator: Tomas Prieto-Rumeau, Department of Statistics and Operations Research, UNED (Spain)

Abstract:

This project is funded by the Gobierno de España, Dirección General de Investigación Científica y Técnica (reference number: MTM2016-75497-P) for three years to support the scientific collaboration between Tomas Prieto-Rumeau, Jonatha Anselmi and François Dufour. This research project is concerned with numerical approximations for Markov decision processes and Markov games. Our goal is to propose techniques for numerically approximating the optimal value function and the optimal strategies of such problems. Although such decision models have been widely studied theoretically and, in general, it is well known how to characterize their optimal value function and their optimal strategies, the explicit calculation of these optimal solutions is not possible except in a few particular cases. This shows the need for numerical procedures to estimate or approximate the optimal solutions of Markov decision processes and Markov games, so that the decision maker actually has at hand some approximation of the optimal strategies and of the optimal value function. This project will explore areas of research that have so far received very little attention. In this sense, we expect our techniques to be a breakthrough in the field of numerical methods for continuous-time Markov decision processes, and particularly in the area of numerical methods for Markov game models. Our techniques will cover a wide range of models, including discrete- and continuous-time models and problems with unbounded cost and transition rates, even allowing for discontinuities of these rate functions. Our research results will combine, on the one hand, mathematical rigor (with the application of advanced tools from probability and measure theory) and, on the other hand, computational efficiency (providing accurate and "applicable" numerical methods).
Particular attention will be paid to models of practical interest, including population dynamics, queueing systems and birth-and-death processes, among others. We thus expect to develop a generic and robust methodology in which, once the data of the decision problem are suitably specified, an algorithm will provide approximations of the value function and of the optimal strategies. The results that we intend to obtain in this research project will therefore be of interest to researchers in the fields of Markov decision processes and Markov games, in both the theoretical and the applied or practitioner communities.
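As an illustration of what numerically approximating the value function and an optimal strategy can mean in the simplest setting, the sketch below runs textbook value iteration on a small discrete-time, discounted-cost queueing model. It is not the methodology developed in the project, which targets far more general models (continuous time, unbounded rates, games). The queue capacity, arrival and service probabilities, cost structure and all function names are assumptions chosen for the example.

```python
CAPACITY = 10                            # assumed finite truncation of the queue length
ARRIVAL = 0.3                            # assumed arrival probability per time step
SERVICE = {"slow": 0.2, "fast": 0.6}     # assumed service probability per action
EFFORT = {"slow": 0.0, "fast": 1.0}      # assumed cost of using each service mode
DISCOUNT = 0.95


def transition(x, a):
    """Return the list of (next_state, probability) pairs from state x under action a."""
    up = ARRIVAL if x < CAPACITY else 0.0     # arrivals are blocked at capacity
    down = SERVICE[a] if x > 0 else 0.0       # nothing to serve in an empty queue
    moves = [(x, 1.0 - up - down)]
    if up > 0.0:
        moves.append((x + 1, up))
    if down > 0.0:
        moves.append((x - 1, down))
    return moves


def cost(x, a):
    """One-step cost: unit holding cost per customer plus the effort cost."""
    return float(x) + EFFORT[a]


def bellman_backup(values, x):
    """Return the action values at state x for the current value estimate."""
    return {
        a: cost(x, a) + DISCOUNT * sum(p * values[y] for y, p in transition(x, a))
        for a in SERVICE
    }


def value_iteration(tol=1e-8, max_iter=10_000):
    """Iterate the Bellman operator until successive estimates differ by less than tol."""
    values = [0.0] * (CAPACITY + 1)
    policy = ["slow"] * (CAPACITY + 1)
    for _ in range(max_iter):
        new_values = []
        for x in range(CAPACITY + 1):
            q = bellman_backup(values, x)
            policy[x] = min(q, key=q.get)     # greedy (cost-minimizing) action
            new_values.append(q[policy[x]])
        gap = max(abs(u - v) for u, v in zip(new_values, values))
        values = new_values
        if gap < tol:
            break
    return values, policy


values, policy = value_iteration()
```

Because the discount factor is strictly below one, the Bellman operator is a contraction and the iteration converges geometrically. This guarantee is exactly what is lost, or becomes delicate, for the unbounded and continuous-time models that the project addresses.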

**Tree-Lab, ITT**. TREE-LAB is part of the Cybernetics research line within the Engineering Science graduate program offered by the Department of Electric and Electronic Engineering at Tijuana's Institute of Technology (ITT), in Tijuana, Mexico. TREE-LAB focuses on scientific and engineering research at the intersection of broad scientific fields, particularly Computer Science, Heuristic Optimization and Pattern Analysis. Specific domains studied at TREE-LAB include Genetic Programming, Classification, Feature Based Recognition, Bio-Medical signal analysis and Behavior-Based Robotics. Currently, TREE-LAB brings together several top researchers as well as graduate (doctoral and masters) and undergraduate students from ITT. Moreover, TREE-LAB actively collaborates with top researchers from around the world, including Mexico, France, Spain, Portugal and the USA.

Oswaldo Costa (Escola Politécnica da Universidade de São Paulo, Brazil) collaborates with the team on the theoretical aspects of continuous control of piecewise-deterministic Markov processes. He visited the team for two weeks in December 2018.

Tomas Prieto-Rumeau (Department of Statistics and Operations Research, UNED, Madrid, Spain) visited the team for two weeks in 2018. The main subject of the collaboration is the approximation of Markov Decision Processes.

F. Dufour is the chair of the Program Committee of the SIAM Conference on Control and Its Applications (CT19) in Pittsburgh, USA, 2019.

J. Anselmi was a member of the technical program committee of the international conference VALUETOOLS 2018.

P. Del Moral has been an associate editor for the journal Stochastic Analysis and Applications since 2001.

P. Del Moral has been an associate editor for the journal Revista de Matematica: Teoria y aplicaciones since 2009.

P. Del Moral has been an associate editor for the journal Applied Mathematics and Optimization since 2009.

F. Dufour has been corresponding editor of the SIAM Journal on Control and Optimization since 2018. F. Dufour has been associate editor of the journal Applied Mathematics & Optimization (AMO) since 2018. F. Dufour has been associate editor of the journal Stochastics: An International Journal of Probability and Stochastic Processes since 2018.

F. Dufour has been the representative of the SIAM activity group in control and system theory for the journal SIAM News since 2014.

J. Saracco has been an associate editor of the journal Case Studies in Business, Industry and Government Statistics (CSBIGS) since 2006.

All the members of CQFD are regular reviewers for several international journals and conferences in applied probability, statistics and operations research.

In March 2018, Jonatha Anselmi was invited to give a talk on load balancing for parallel systems at the Inria team Polaris (Grenoble).

François Dufour was invited to give a talk at the IMA Conference on Stochastic Control, Computational Methods, and Applications at the University of Minnesota, May 2018.

François Dufour was invited to give a talk at the Symposium on Optimal Stopping, Rice University, Houston, Texas, June 2018.

J. Saracco is an elected member of the council of the *Société Française de Statistique* (SFdS, French Statistical Society).

J. Saracco has been deputy director of IMB (Institut de Mathématiques de Bordeaux, UMR CNRS 5251) since 2015.

M. Chavent is a member of the national evaluation committee of Inria.

M. Chavent and Pierrick Legrand are members of the council of the Institut de Mathématiques de Bordeaux.

F. Dufour was the coordinator for the Inria evaluation of the theme "Stochastic Approaches".

Licence: J. Anselmi, Probability and statistics, 20 hours, L3, Institut Polytechnique de Bordeaux, ENSEIRB-MATMECA school, Telecommunications track, France.

Licence: J. Anselmi, Probability and statistics, 16 hours, L3, Institut Polytechnique de Bordeaux, ENSEIRB-MATMECA school, Electronics track, France.

Licence: J. Anselmi, Probability and statistics, 48 hours, L3, Institut Polytechnique de Bordeaux, ENSEIRB-MATMECA school, Mathematics and Mechanics track, France.

Licence: F. Dufour, Probability and statistics, 70 hours, first year of the ENSEIRB-MATMECA school, Institut Polytechnique de Bordeaux, France.

Master: F. Dufour, Numerical methods for reliability, 36 hours, third year of the ENSEIRB-MATMECA school, Institut Polytechnique de Bordeaux, France.

PhD completed : Alizé Geeraert, Contrôle optimal des processus Markoviens déterministes par morceaux et application à la maintenance, University of Bordeaux, supervised by B. de Saporta and F. Dufour (defense in October 2018).

PhD in progress : Tiffany Cherchi, “Automated optimal fleet management policy for airborne equipment”, Montpellier University, since 2017, supervised by B. de Saporta and F. Dufour.