Section: New Results

Estimation and control for stochastic processes

Inference for dynamical systems driven by Gaussian noises

Participant: S. Tindel

External collaborators: K. Chouk, A. Deya, Y. Hu, L. Khoa, D. Nualart, E. Nualart, F. Xu (US).

The problem of estimating the coefficients of a general differential equation driven by a Gaussian process is still largely unsolved. To be more specific, the most general (ℝ-valued) equation handled up to now, as far as parameter estimation is concerned, is of the form:

X_t^\theta = a + \theta \int_0^t b(X_u) \, du + B_t,

where θ is the unknown parameter, b is a smooth enough coefficient and B is a one-dimensional fractional Brownian motion. In contrast with this simple situation, our applications of interest (motivated by anomalous diffusion phenomena in protein fluctuations) require the analysis of the following ℝⁿ-valued equation:

X_t^\theta = a + \int_0^t b(\theta; X_u) \, du + \int_0^t \sigma(\theta; X_u) \, dB_u, \qquad (2)

where θ enters nonlinearly in the coefficients, σ is a non-trivial diffusion term and B is a d-dimensional fractional Brownian motion. We have thus decided to tackle this important scientific challenge first.
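As a concrete illustration, a scalar special case of (2) with σ ≡ 1 and b(θ; x) = −θx is the fractional Ornstein-Uhlenbeck model. The sketch below (all function names and numerical choices are ours, purely illustrative) simulates it with an Euler scheme, generating the fractional Brownian motion by a Cholesky factorisation of its covariance, and computes a naive least-squares estimate of θ; this naive estimator is not the efficient procedure studied in [41].

```python
import math
import random

def fbm_path(n, T, H, rng):
    """Sample fractional Brownian motion at t_i = i*T/n via a Cholesky
    factorisation of its covariance (O(n^3), fine for small n)."""
    t = [T * (i + 1) / n for i in range(n)]
    cov = [[0.5 * (t[i] ** (2 * H) + t[j] ** (2 * H)
                   - abs(t[i] - t[j]) ** (2 * H)) for j in range(n)]
           for i in range(n)]
    # plain Cholesky factorisation cov = L L^T
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    B = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [0.0] + B

def simulate_and_estimate(theta=1.0, a=1.0, n=200, T=2.0, H=0.7, seed=0):
    """Euler scheme for X_t = a + theta * int_0^t b(X_u) du + B_t with
    b(x) = -x, then a naive least-squares estimate of theta."""
    rng = random.Random(seed)
    B = fbm_path(n, T, H, rng)
    dt = T / n
    X = [a]
    for i in range(n):
        X.append(X[-1] - theta * X[-1] * dt + (B[i + 1] - B[i]))
    # least squares on increments: dX_i ~ theta * b(X_i) dt + dB_i
    num = sum(-X[i] * (X[i + 1] - X[i]) for i in range(n))
    den = sum(X[i] ** 2 * dt for i in range(n))
    return X, num / den
```

The least-squares estimator is known to require corrections in the fractional setting; this block is only meant to make the model (2) concrete.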

To this aim, here are the steps we have focused on in 2015:

  • Some limit theorems for general functionals of Gaussian sequences [6], and for functionals of a Brownian motion [3], which give some insight into the asymptotic behavior of systems like (2).

  • Extension of pathwise stochastic integration to processes indexed by the plane in [1], which helps in defining noisy systems such as partial differential equations.

  • Definition of new systems driven by a (spatial) fractional Brownian motion, such as the stochastic PDE considered in [37].

  • The local asymptotic normality obtained for system (2), which implies a lower bound on general estimators of the coefficient θ. This is the content of the preprint [41].

Optimal estimation of the jump rate of a piecewise-deterministic Markov process

Participants: R. Azaïs, A. Muller-Gueudin

A piecewise-deterministic Markov process is a stochastic process whose behavior is governed by an ordinary differential equation punctuated by random jumps occurring at random times. In a recent preprint [33], we focus on the nonparametric estimation of the jump rate for such a stochastic model, observed over a long time interval under an ergodicity condition. More precisely, we introduce an uncountable class (indexed by the deterministic flow) of recursive kernel estimates of the jump rate, and we establish their strong pointwise consistency as well as their asymptotic normality. In addition, we propose to choose among this class the estimator with the minimal variance, which is unfortunately unknown and thus remains to be estimated. We also discuss the choice of the bandwidth parameters by cross-validation methods. This work has also been presented at two national workshops.
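To fix ideas, here is a minimal sketch of one kernel estimator of this type on a toy TCP-like PDMP (unit-speed growth, jump rate λ(x) = x, halving at jumps; these specifications are our illustrative choices, not the general setting of [33]). The jump rate at x is estimated as the number of jumps observed near x divided by the occupation time of the trajectory near x.

```python
import math
import random

def simulate_pdmp(T=200.0, x0=1.0, dt=0.01, seed=1):
    """Toy TCP-like PDMP: unit-speed growth, jump rate lam(x) = x, halving
    at jumps.  Returns the pre-jump states, a fine grid of visited states
    (for occupation-time integrals) and the grid step."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    jumps, grid = [], []
    next_grid = 0.0
    while t < T:
        E = rng.expovariate(1.0)
        # solve int_0^s (x + u) du = E for the next inter-jump time s
        s = -x + math.sqrt(x * x + 2.0 * E)
        while next_grid < min(t + s, T):
            grid.append(x + (next_grid - t))
            next_grid += dt
        t += s
        if t >= T:
            break
        x += s                # pre-jump state
        jumps.append(x)
        x /= 2.0              # TCP-style halving
    return jumps, grid, dt

def jump_rate_estimate(x, jumps, grid, dt, h=0.3):
    """Kernel estimate: (# jumps observed near x) / (occupation time near x)."""
    K = lambda u: math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
    num = sum(K(x - z) for z in jumps)
    den = dt * sum(K(x - g) for g in grid)
    return num / den if den > 0.0 else float("nan")
```

With λ(x) = x, the estimate at a frequently visited point such as x = 1.5 should be close to 1.5 for long horizons, up to kernel bias.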

Estimation and optimal control for the TCP process

Participant: R. Azaïs

External collaborators: N. Krell (Rennes), B. de Saporta (Montpellier)

In [33], we assume that the transition kernel is continuous with respect to the Lebesgue measure. This condition may not be satisfied in some applications, for instance for the well-known TCP process that appears in the modeling of the Transmission Control Protocol used for data transmission over the Internet. As a consequence, we propose to investigate estimation followed by optimal control for this ergodic process. The particular framework defined by this process allows us to define an optimal policy for the estimation of its jump rate. We currently have an efficient method for estimating, in an optimal way, the moments of the conditional distribution of the inter-congestion times. This work is in progress.
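Under a toy specification of this process (jump rate λ(w) = w for a window growing at unit speed; an illustrative choice of ours, not the calibrated TCP model), the conditional moments of the inter-congestion times can be estimated by plain Monte-Carlo, since the integrated rate can be inverted in closed form:

```python
import math
import random

def inter_congestion_time(w, rng):
    """Sample the next inter-congestion time from window w, for a window
    growing at unit speed with jump rate lam(w) = w: invert the integrated
    rate w*s + s^2/2 at an Exp(1) level."""
    E = rng.expovariate(1.0)
    return -w + math.sqrt(w * w + 2.0 * E)

def conditional_moments(w, k_max=2, n=100000, seed=3):
    """Monte-Carlo estimates of E[S^k | W = w] for k = 1..k_max."""
    rng = random.Random(seed)
    samples = [inter_congestion_time(w, rng) for _ in range(n)]
    return [sum(s ** k for s in samples) / n for k in range(1, k_max + 1)]
```

For w = 1 the first moment is E[S] = ∫₀^∞ exp(−s − s²/2) ds ≈ 0.656, which the Monte-Carlo estimate recovers.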

Estimation of integrals from a Markov design

Participant: R. Azaïs

External collaborators: B. Delyon, F. Portier

Monte-Carlo methods for estimating an integral assume that the distribution of the random design is known. Unfortunately, some applications generate a design whose density function f is unknown. In this case, a solution is to perform the classical Monte-Carlo estimate of the integral, replacing f by a leave-one-out kernel estimator, and one may expect the convergence

\frac{1}{n} \sum_{i=1}^{n} \frac{\phi(X_i)}{\hat{f}^{(-i)}(X_i)} \longrightarrow \int \phi \, d\lambda,

when the number n of independent data X_i goes to infinity. This difficult question has been investigated by François Portier and Bernard Delyon in a recent paper. We propose to extend this work to the more general case of a Markov design. This new model covers a large variety of applications, in particular in biology and climatology. Indeed, the data (X_i, ϕ(X_i)) are often obtained from a measuring instrument that is launched in its environment and thus follows a random walk within it. A paper on this work will be submitted soon.
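A minimal sketch of the estimator above, in the independent-design case (the bandwidth and test integrand are our illustrative choices):

```python
import math
import random

def kernel(u, h):
    """Gaussian kernel with bandwidth h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def integral_estimate(xs, phi, h):
    """(1/n) * sum_i phi(X_i) / f_hat^{(-i)}(X_i), where f_hat^{(-i)} is
    the leave-one-out kernel density estimate of the unknown design
    density at X_i."""
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        f_loo = sum(kernel(x - xs[j], h) for j in range(n) if j != i) / (n - 1)
        total += phi(x) / f_loo
    return total / n

# Usage: the design density (standard normal here) is never given to the
# estimator; the target is int exp(-x^2) dx = sqrt(pi).
rng = random.Random(7)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
est = integral_estimate(xs, lambda x: math.exp(-x * x), 0.3)
```

The Markov-design extension discussed above replaces the i.i.d. sample by a trajectory of a Markov chain; the estimator itself is unchanged.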

Method of control for radiotherapy treatment using Markov Decision Processes

Participants: R. Azaïs, B. Scherrer, S. Tindel, S. Wantz-Mézières

In recent years, Bastogne, Keinj and Vallois designed a Markov model of the evolution of cells under a radiotherapy treatment. We are currently investigating the problem of optimizing the radiotherapy intensity sequence in order to kill as many cancerous cells as possible while preserving as many healthy cells as possible, a problem that fits into the framework of stochastic optimal control. Our preliminary efforts suggest that, since we are dealing with large populations of cells, the problem can be well approximated by a limiting deterministic optimal control problem. We can solve this problem numerically with a Pontryagin approach, and symbolically (in the simplest cases) by identifying the critical points of some multivariate polynomials. The latter approach allows us to validate that the former actually finds globally optimal solutions. This is work in progress.
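The large-population approximation can be illustrated on a toy thinning model (ours, not the Bastogne-Keinj-Vallois chain): if each cell independently survives a dose step with probability 1 − p, the population after a dose sequence concentrates on its deterministic law-of-large-numbers limit.

```python
import random

def stochastic_survivors(n0, kill_probs, seed=5):
    """Each cell survives a treatment step independently with probability
    1 - p (toy thinning model): simulate the population cell by cell."""
    rng = random.Random(seed)
    n = n0
    for p in kill_probs:
        n = sum(1 for _ in range(n) if rng.random() > p)
    return n

def deterministic_limit(n0, kill_probs):
    """Law-of-large-numbers limit of the population after the same doses."""
    x = float(n0)
    for p in kill_probs:
        x *= 1.0 - p
    return x
```

For large initial populations the relative gap between the two is small, which is what justifies replacing the stochastic control problem by a deterministic one.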

Numerical approximate schemes for large optimal control problems and zero-sum two player games

Participant: B. Scherrer

External collaborators: V. Gabillon, M. Ghavamzadeh, M. Geist, B. Lesner, J. Perolat, O. Pietquin, M. Tagorti

We have provided in [23] (ICML 2015) the first finite-sample analysis of the LSTD(λ) algorithm, which approximates the value of a fixed policy in a large MDP by estimating, from samples, the projected fixed point of the linear Bellman equation. This analysis highlights the influence of the main parameter λ of the algorithm.
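A tabular sketch of LSTD(λ) (a toy instance of ours, not the function-approximation setting analysed in [23]): accumulate the statistics A and b along a trajectory using eligibility traces, then solve the linear system A w = b.

```python
def lstd_lambda(trajectory, features, gamma, lam, d):
    """LSTD(lambda): accumulate A = sum_t z_t (phi_t - gamma*phi_{t+1})^T
    and b = sum_t z_t r_t along a trajectory of (s, r, s_next) transitions,
    where z_t is the eligibility trace, then solve A w = b."""
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    z = [0.0] * d
    for (s, r, s_next) in trajectory:
        phi, phi_next = features(s), features(s_next)
        z = [gamma * lam * zi + pi for zi, pi in zip(z, phi)]
        for i in range(d):
            for j in range(d):
                A[i][j] += z[i] * (phi[j] - gamma * phi_next[j])
            b[i] += z[i] * r
    return solve(A, b)

def solve(A, b):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][k] * w[k] for k in range(r + 1, n))) / M[r][r]
    return w

# Demo: deterministic two-state chain 0 -> 1 -> 0 with r(0) = 1, r(1) = 0.
# With one-hot features the solution is the exact tabular value function.
onehot = lambda s: [1.0 if i == s else 0.0 for i in range(2)]
traj = [(0, 1.0, 1), (1, 0.0, 0)] * 50
w = lstd_lambda(traj, onehot, gamma=0.9, lam=0.0, d=2)
```

On this deterministic chain, w[0] = 1/(1 − 0.81) and w[1] = 0.9/(1 − 0.81), the fixed point of the Bellman equation.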

The long version of our previous work on the analysis of approximate modified policy iteration for optimal control, and its application to the Tetris domain, is now published in JMLR [13]. Extending this algorithm family to compute approximately-optimal non-stationary policies makes it possible to improve the dependency with respect to the discount factor: we provide such improved bounds in [19], as well as examples showing that our analysis is tight (and cannot be further improved).

An original analysis of a variant of approximate modified policy iteration for computing approximate Nash equilibria in the more general setting of two-player zero-sum games was published at ICML 2015 [22].
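For intuition, here is exact value iteration on a toy turn-based zero-sum game, a simple special case of the general (approximate, simultaneous-move) setting treated in [22]; the game and its parameters are our illustrative choices.

```python
def zs_value_iteration(trans, player, gamma=0.9, iters=400):
    """Exact value iteration for a turn-based zero-sum game: in each state
    one player picks the action; player[s] = +1 for the maximizer, -1 for
    the minimizer.  trans[s] is a list of (reward, next_state) pairs."""
    V = [0.0] * len(trans)
    for _ in range(iters):
        V = [(max if player[s] == 1 else min)(
                 r + gamma * V[sn] for (r, sn) in trans[s])
             for s in range(len(trans))]
    return V

# Toy game: in state 0 the maximizer chooses, in state 1 the minimizer.
trans = [[(1.0, 1), (0.0, 0)],   # state 0: collect 1 and pass, or wait
         [(1.0, 0), (0.0, 1)]]   # state 1: concede 1 and pass, or wait
V = zs_value_iteration(trans, player=[1, -1])
```

Here the minimizer simply waits forever in state 1, so the fixed point is V[0] = 1 and V[1] = 0; approximate schemes as in [22] replace the exact max/min backup by a sampled or projected one.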