Bibliography

Publications of the year

Doctoral Dissertations and Habilitation Theses

  • [1] M. Loth.

    Active Set Algorithms for the LASSO, Université Lille 1 Sciences et Technologies, 2011.
  • [2] O.-A. Maillard.

    Apprentissage séquentiel: bandits, statistique et renforcement, Université Lille 1, Lille, France, October 2011.
  • [3] D. Ryabko.

    Learnability in Problems of Sequential Inference, Université Lille 1 Sciences et Technologies, 2011.
  • [4] N. Viandier.

    Modélisation et utilisation des erreurs de pseudodistances GNSS en environnement transport pour l'amélioration des performances de localisation, Ecole Centrale de Lille, June 2011.

Articles in International Peer-Reviewed Journals

  • [5] S. Bubeck, R. Munos, G. Stoltz.

    Pure Exploration in Finitely-Armed and Continuous-Armed Bandits, in: Theoretical Computer Science, 2011, vol. 412, p. 1832-1852.
  • [6] S. Bubeck, R. Munos, G. Stoltz, C. Szepesvári.

    X-Armed Bandits, in: Journal of Machine Learning Research, 2011, vol. 12, p. 1655-1695.
  • [7] P. Chainais, E. Kœnig, V. Delouille, J.-F. Hochedez.

    Virtual Super Resolution of Scale Invariant Textured Images Using Multifractal Stochastic Processes, in: Journal of Mathematical Imaging and Vision, 2011, vol. 39, no 1, p. 28-44.
  • [8] A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, S. Mannor.

    L2-Regularized Policy Iteration, in: Journal of Machine Learning Research, 2011, submitted.
  • [9] S. Girgin, J. Mary, P. Preux, O. Nicol.

    Managing Advertising Campaigns – an Approximate Planning approach, in: Frontiers in Computer Science, October 2011.
  • [10] A. Lazaric, M. Ghavamzadeh, R. Munos.

    Finite-Sample Analysis of Least-Squares Policy Iteration, in: Journal of Machine Learning Research, 2011, to appear.
  • [11] A. Lazaric, R. Munos.

    Learning with Stochastic Inputs and Adversarial Outputs, in: Journal of Computer and System Sciences, 2011, To appear.
  • [12] B. Lebental, P. Chainais, P. Chenevier, N. Chevalier, E. Delevoye, J.-M. Fabbri, S. Nicoletti, P. Renaux, A. Ghis.

    Aligned carbon nanotube based ultrasonic microtransducers for durability monitoring in civil engineering, in: Nanotechnology, 2011, vol. 22, no 39.
  • [13] D. Ryabko.

    On the relation between realizable and non-realizable cases of the sequence prediction problem, in: Journal of Machine Learning Research, 2011, vol. 12, p. 2161-2180.
  • [14] D. Ryabko.

    Testing composite hypotheses about discrete ergodic processes, in: Test, 2011, (to appear).
  • [15] B. Ryabko, D. Ryabko.

    Constructing Perfect Steganographic Systems, in: Information and Computation, 2011, vol. 209, no 9, p. 1223-1230.

International Conferences with Proceedings

  • [16] M. G. Azar, R. Munos, M. Ghavamzadeh, H. Kappen.

    Speedy Q-Learning, in: Proceedings of Advances in Neural Information Processing Systems 24, MIT Press, 2011.
  • [17] L. Busoniu, R. Munos, B. De Schutter, R. Babuska.

    Optimistic Planning for Sparsely Stochastic Systems, in: IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2011.
  • [18] A. Carpentier, A. Lazaric, M. Ghavamzadeh, R. Munos, P. Auer.

    Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits, in: Proceedings of the Twenty-Second International Conference on Algorithmic Learning Theory, 2011, p. 189-203.
  • [19] A. Carpentier, O.-A. Maillard, R. Munos.

    Sparse Recovery with Brownian Sensing, in: Advances in Neural Information Processing Systems, 2011.
  • [20] A. Carpentier, R. Munos.

    Finite Time Analysis of Stratified Sampling for Monte Carlo, in: Advances in Neural Information Processing Systems, 2011.
  • [21] P. Chainais, V. Delouille, J.-F. Hochedez.

    Scale invariant images in astronomy through the lens of multifractal modeling, in: 2011 IEEE International Conference on Image Processing (IEEE ICIP2011), Brussels, Belgium, September 2011.
  • [22] E. Delande, E. Duflos, P. Vanheeghe, D. Heurguier.

    Multi-Sensor PHD by Space Partitioning: Computation of a True Reference Density Within The PHD Framework, in: Statistical Signal Processing Workshop (SSP), 2011, Nice, France, IEEE - Signal Processing Society (editor), IEEE - Signal Processing Society, June 2011, p. 333-336. [ DOI : 10.1109/SSP.2011.5967695 ]

    http://hal.inria.fr/hal-00639710/en/
  • [23] E. Delande, E. Duflos, P. Vanheeghe, D. Heurguier.

    Multi-Sensor PHD: Construction and Implementation by Space Partitioning, in: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, Prague, Czech Republic, IEEE - Signal Processing Society (editor), IEEE - Signal Processing Society, May 2011, p. 3632-3635. [ DOI : 10.1109/ICASSP.2011.5947137 ]

    http://hal.inria.fr/hal-00639724/en/
  • [24] V. Gabillon, M. Ghavamzadeh, A. Lazaric, S. Bubeck.

    Multi-Bandit Best Arm Identification, in: Proceedings of Advances in Neural Information Processing Systems 24, MIT Press, 2011.
  • [25] V. Gabillon, A. Lazaric, M. Ghavamzadeh, B. Scherrer.

    Classification-based Policy Iteration with a Critic, in: Proceedings of the Twenty-Eighth International Conference on Machine Learning, 2011, p. 1049-1056.
  • [26] N. Gatti, A. Lazaric, F. Trovó.

    A Truthful Learning Mechanism for Contextual Multi-Slot Sponsored Search Auctions with Externalities, in: AAMAS'12, 2011, submitted.
  • [27] M. Ghavamzadeh, A. Lazaric, R. Munos, M. Hoffman.

    Finite-Sample Analysis of Lasso-TD, in: Proceedings of the Twenty-Eighth International Conference on Machine Learning, 2011, p. 1177-1184.
  • [28] M. Ghavamzadeh, A. Lazaric, R. Munos, O.-A. Maillard.

    LSTD with Random Projections, in: Proceedings of the Twenty-Fourth Annual Conference on Advances in Neural Information Processing Systems, 2011.
  • [29] N. Jaoua, E. Duflos, P. Vanheeghe, L. Clavier, F. Septier.

    Impulsive interference mitigation in ad hoc networks based on alpha-stable modeling and particle filtering, in: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, Prague, Czech Republic, IEEE - Signal Processing Society (editor), IEEE - Signal Processing Society, May 2011, p. 3548-3551. [ DOI : 10.1109/ICASSP.2011.5946244 ]

    http://hal.inria.fr/hal-00640682/en/
  • [30] H. Kadri, E. Duflos, P. Preux, S. Canu.

    Multiple functional regression with both discrete and continuous covariates, in: Proc. of the 2nd International Workshop on Functional and Operatorial Statistics, F. Ferraty (editor), Contributions to Statistics, Physica-Verlag HD, June 2011, p. 189-195, Recent Advances in Functional Data Analysis and Related Topics, Chapter 29.
  • [31] H. Kadri, E. Duflos, P. Preux.

    Learning Vocal Tract Variables With Multi-Task Kernels, in: Proc. 36th IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, May 2011.
  • [32] A. Lazaric, M. Restelli.

    Transfer from Multiple MDPs, in: Advances in Neural Information Processing Systems, August 2011.
  • [33] O.-A. Maillard, R. Munos.

    Adaptive bandits: Towards the best history-dependent strategy, in: International conference on Artificial Intelligence and Statistics, 2011.
  • [34] O.-A. Maillard, R. Munos, D. Ryabko.

    Selecting the State-Representation in Reinforcement Learning, in: Advances in Neural Information Processing Systems, 2011.
  • [35] O.-A. Maillard, R. Munos, G. Stoltz.

    Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences, in: Conference On Learning Theory, 2011.
  • [36] R. Munos.

    Optimistic Optimization of Deterministic Functions without the Knowledge of its Smoothness, in: Advances in Neural Information Processing Systems, 2011.
  • [37] A. Rabaoui, E. Duflos, N. Viandier, J. Marais.

    Selecting the Hyperparameters of the DPM Models for the density estimation of observation errors, in: International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, Prague, Czech Republic, IEEE - Signal Processing Society (editor), IEEE - Signal Processing Society, May 2011, p. 4092-4095.
  • [38] A. Rabaoui, H. Kadri, P. Preux, E. Duflos, A. Rakotomamonjy.

    Functional Regularized Least Squares Classification with Operator-Valued Kernels, in: Proc. 28th International Conference on Machine Learning (ICML), New York, NY, USA, L. Getoor, T. Scheffer (editors), ACM, June 2011, p. 993–1000.
  • [39] B. Ryabko, D. Ryabko.

    Confidence Sets in Time–Series Filtering, in: Proc. 2011 IEEE International Symposium on Information Theory (ISIT), St. Petersburg, Russia, 2011, p. 2436-2438.
  • [40] C. Salperwyck, V. Lemaire.

    Incremental discretization for supervised learning, in: CLADAG: CLAssification and Data Analysis Group — 8th International Meeting of the Italian Statistical Society, September 2011.
  • [41] C. Salperwyck, V. Lemaire.

    Learning with few examples: an empirical study on leading classifiers, in: International Joint Conference on Neural Networks (IJCNN), IEEE, August 2011.

National Conferences with Proceedings

  • [42] P. Chainais, M. Chevaldonné, J.-M. Favreau.

    Synthèse de textures multifractales directement sur des surfaces 3D, in: Proc. of GRETSI, 2011.
  • [43] P. Chainais, B. Lebental.

    Caractérisation statistique d'une assemblée de nanotubes en imagerie microscopique, in: Proc. of GRETSI, 2011.

Scientific Books (or Scientific Book chapters)

  • [44] G. Dulac-Arnold, L. Denoyer, P. Preux, P. Gallinari.

    Datum-Wise Classification: A Sequential Approach to Sparsity, in: Machine Learning and Knowledge Discovery in Databases, D. Gunopulos, T. Hofmann, D. Malerba, M. Vazirgiannis (editors), Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2011, vol. 6911, p. 375-390, Proc. European Conference on Machine Learning (ECML).

    http://dx.doi.org/10.1007/978-3-642-23780-5_34
  • [45] L. Busoniu, A. Lazaric, M. Ghavamzadeh, R. Munos, R. Babuska, B. De Schutter.

    Least-squares methods for policy iteration, in: Reinforcement Learning: State of the Art, M. Wiering, M. van Otterlo (editors), Springer, 2011.
  • [46] L. Busoniu, R. Munos, R. Babuska.

    Optimistic Planning in Markov decision processes, in: Reinforcement Learning and Adaptive Dynamic Programming for feedback control, F. Lewis, D. Liu (editors), Wiley, 2011, To appear.
  • [47] S. Clémençon, R. Gaudel, J. Jakubowicz.

    Clustering Rankings in the Fourier Domain, in: Machine Learning and Knowledge Discovery in Databases: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD'11), D. Gunopulos, T. Hofmann, D. Malerba, M. Vazirgiannis (editors), Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2011, p. 343–358.

    http://dx.doi.org/10.1007/978-3-642-23780-5_32

Internal Reports

  • [48] M. G. Azar, R. Munos, M. Ghavamzadeh, H. Kappen.

    Reinforcement Learning with a Near Optimal rate of Convergence, INRIA, 2011, no inria-00636615.

    http://hal.inria.fr/inria-00636615/en/
  • [49] A. Carpentier, A. Lazaric, M. Ghavamzadeh, R. Munos, P. Auer.

    Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits, INRIA, 2011, no inria-00594131.
  • [50] V. Gabillon, M. Ghavamzadeh, A. Lazaric, S. Bubeck.

    Multi-Bandit Best Arm Identification, INRIA, 2011, no inria-00632523.

    http://hal.inria.fr/hal-00632523_v3/
  • [51] V. Gabillon, M. Ghavamzadeh, A. Lazaric, B. Scherrer.

    Classification-based Policy Iteration with a Critic, INRIA, 2011, no inria-00590972.

    http://hal.inria.fr/hal-00590972_v1/
  • [52] M. Ghavamzadeh, A. Lazaric, O.-A. Maillard, R. Munos.

    LSPI with Random Projections, INRIA, 2011, no inria-00530762.

    http://hal.inria.fr/inria-00530762_v1/
  • [53] H. Kadri, P. Preux, E. Duflos, S. Canu.

    Operator-valued Kernels for Nonparametric Operator Estimation, INRIA, 2011, no 7607.

    http://hal.inria.fr/inria-00587649/en

Other Publications

  • [54] G. Dulac-Arnold, L. Denoyer, P. Preux, P. Gallinari.

    Sequential Approaches for Learning Datum-Wise Sparse Representations, October 2011, (submitted).
  • [55] J. Daquin.

    Factorisation non-négative de matrices, Université de Lille, 2011, Master's thesis in applied mathematics, advisors: Ph. Preux, B. Beckermann.
  • [56] E. Duflos, N. Viandier, J. Marais, A. Rabaoui, P. Vanheeghe.

    GNSS Urban Localization Enhancement using Dirichlet Process Mixture Modelling, December 2011, Workshop Non Parametric Bayes at NIPS 2011 (Granada, Spain).
  • [57] A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, S. Mannor.

    L2-Regularized Fitted-Q Iteration Algorithm, 2011, in preparation.
  • [58] M. Ghavamzadeh, S. Mannor, P. Poupart.

    Bayesian Reinforcement Learning: A Survey, 2011, in preparation.
  • [59] S. Girgin.

    VVU Project report, July 2011, Deliverable, VVU project, PICOM, France.
  • [60] S. Girgin, P. Preux.

    Identification of prospective clients, October 2011, Report for the contract with Addressing Business (confidential).
  • [61] M. Hoffman, A. Lazaric, M. Ghavamzadeh, R. Munos.

    Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization, in: Ninth European Workshop on Reinforcement Learning, 2011.
  • [62] N. Jaoua, E. Duflos, P. Vanheeghe.

    Nonparametric Bayesian state estimation in nonlinear dynamic systems with alpha-stable measurement noise, December 2011, Workshop Non Parametric Bayes at NIPS 2011 (Granada, Spain).
  • [63] O. Nicol.

    On-line Trading of Exploration and Exploitation, June 2011, invited speech Exploration/Exploitation challenge ICML workshop.
  • [64] C. Salperwyck, V. Lemaire.

    Impact de la taille de l'ensemble d'apprentissage : une étude empirique, January 2011, Atelier CIDN : Clustering incrémental et méthodes de détection de nouveauté de la conférence Extraction et Gestion des Connaissances (EGC).
  • [65] C. Salperwyck, T. Urvoy.

    On-line Trading of Exploration and Exploitation, June 2011, invited speech Exploration/Exploitation challenge ICML workshop.

References in notes

  • [66] P. Auer, N. Cesa-Bianchi, P. Fischer.

    Finite-time analysis of the multi-armed bandit problem, in: Machine Learning, 2002, vol. 47, no 2/3, p. 235–256.
  • [67] R. Bellman.

    Dynamic Programming, Princeton University Press, 1957.
  • [68] D. Bertsekas, S. Shreve.

    Stochastic Optimal Control (The Discrete Time Case), Academic Press, New York, 1978.
  • [69] D. Bertsekas, J. Tsitsiklis.

    Neuro-Dynamic Programming, Athena Scientific, 1996.
  • [70] E. Delande, E. Duflos, D. Heurguier, P. Vanheeghe.

    Multi-target PHD filtering: proposition of extensions to the multi-sensor case, INRIA, 2010, no 7337.
  • [71] E. Duflos, S. Razavi, C. Haas, P. Vanheeghe.

    Belief Function Based Algorithm for Material Detection and Tracking in Construction, in: Proceedings of Workshop on the theory of belief functions, April 2010, CDROM - 6 pages.
  • [72] T. Ferguson.

    A Bayesian Analysis of Some Nonparametric Problems, in: The Annals of Statistics, 1973, vol. 1, no 2, p. 209–230.
  • [73] T. Hastie, R. Tibshirani, J. Friedman.

    The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2001.
  • [74] H. Kadri, E. Duflos, P. Preux, S. Canu, M. Davy.

    Nonlinear functional regression: a functional RKHS approach, in: Proc. of the 13th Artificial Intelligence and Statistics (AI & Stats), JMLR: W&CP 9, May 13-15 2010, p. 374–380.
  • [75] J. Marais, E. Duflos, N. Viandier, D. Nahimana, A. Rabaoui.

    Advanced signal processing techniques for multipath mitigation in land transportation environment, in: Proceedings of ITSC 2010, September 2010, Proceedings on CD ROM (6 pages).
  • [76] J. Marais, N. Viandier, A. Rabaoui, E. Duflos.

    GNSS multipath bias models for accurate positioning in urban environments, in: Proceedings of ITST 2010, November 2010, Proceedings on CD ROM (6 pages).
  • [77] W. Powell.

    Approximate Dynamic Programming, Wiley, 2007.
  • [78] M. Puterman.

    Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, 1994.
  • [79] H. Robbins.

    Some aspects of the sequential design of experiments, in: Bull. Amer. Math. Soc., 1952, vol. 58, p. 527–535.
  • [80] J. Rust.

    How Social Security and Medicare Affect Retirement Behavior in a World of Incomplete Markets, in: Econometrica, July 1997, vol. 65, no 4, p. 781–831.

    http://gemini.econ.umd.edu/jrust/research/rustphelan.pdf
  • [81] J. Rust.

    On the Optimal Lifetime of Nuclear Power Plants, in: Journal of Business & Economic Statistics, 1997, vol. 15, no 2, p. 195–208.

    http://129.3.20.41/eprints/io/papers/9512/9512002.abs
  • [82] R. Sutton, A. Barto.

    Reinforcement learning: an introduction, MIT Press, 1998.
  • [83] G. Tesauro.

    Temporal Difference Learning and TD-Gammon, in: Communications of the ACM, March 1995, vol. 38, no 3.

    http://www.research.ibm.com/massive/tdl.html
  • [84] N. Viandier, A. Rabaoui, J. Marais, E. Duflos.

    GNSS pseudorange error density tracking using Dirichlet Process Mixture, in: Proceedings of FUSION 2010, July 2010, Proceedings on CD ROM (7 pages).
  • [85] N. Viandier, A. Rabaoui, J. Marais, E. Duflos.

    Studies on DPM for the density estimation of pseudorange noises and evaluations on real data, in: Proceedings of IEEE Plans, May 2010, Proceedings on CD ROM (8 pages).
  • [86] P. Werbos.

    ADP: Goals, Opportunities and Principles, IEEE Press, 2004, p. 3–44, Handbook of learning and approximate dynamic programming.