Bibliography

Publications of the year

Doctoral Dissertations and Habilitation Theses

Articles in International Peer-Reviewed Journals

  • [3] T. Collet, O. Pietquin.

    Optimism in Active Learning, in: Computational Intelligence and Neuroscience, August 2015.

    https://hal.inria.fr/hal-01225798
  • [4] L. Devroye, G. Lugosi, G. Neu.

    Random-Walk Perturbations for Online Combinatorial Optimization, in: IEEE Transactions on Information Theory, June 2015, vol. 61, no 7, pp. 4099 - 4106. [ DOI : 10.1109/TIT.2015.2428253 ]

    https://hal.inria.fr/hal-01214987
  • [5] N. Gatti, A. Lazaric, M. Rocco, F. Trovò.

    Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities, in: Artificial Intelligence, October 2015, vol. 227, pp. 93-139.

    https://hal.inria.fr/hal-01237670
  • [6] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.

    Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2015.

    https://hal.archives-ouvertes.fr/hal-01221329
  • [7] A. Khaleghi, D. Ryabko.

    Nonparametric multiple change point estimation in highly dependent time series, in: Theoretical Computer Science, November 2015. [ DOI : 10.1016/j.tcs.2015.10.041 ]

    https://hal.inria.fr/hal-01235330
  • [8] B. Scherrer, M. Ghavamzadeh, V. Gabillon, B. Lesner, M. Geist.

    Approximate Modified Policy Iteration and its Application to the Game of Tetris, in: Journal of Machine Learning Research, 2015, vol. 16, pp. 1629-1676, to appear.

    https://hal.inria.fr/hal-01091341

International Conferences with Proceedings

  • [9] J. Audiffren, M. Valko, A. Lazaric, M. Ghavamzadeh.

    Maximum Entropy Semi-Supervised Inverse Reinforcement Learning, in: International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

    https://hal.inria.fr/hal-01146187
  • [10] M. Barlier, J. Perolat, R. Laroche, O. Pietquin.

    Human-Machine Dialogue as a Stochastic Game, in: 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2015), Prague, Czech Republic, September 2015.

    https://hal.inria.fr/hal-01225848
  • [11] A. Carpentier, M. Valko.

    Simple regret for infinitely many armed bandits, in: International Conference on Machine Learning, Lille, France, July 2015.

    https://hal.inria.fr/hal-01153538
  • [12] J. Chemali, A. Lazaric.

    Direct Policy Iteration with Demonstrations, in: IJCAI - 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

    https://hal.inria.fr/hal-01237659
  • [13] T. Collet, O. Pietquin.

    Bayesian Credible Intervals for Online and Active Learning of Classification Trees, in: ADPRL 2015 - Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Cape Town, South Africa, Proceedings of the Symposium Series on Computational Intelligence, IEEE, December 2015.

    https://hal.inria.fr/hal-01225850
  • [14] T. Collet, O. Pietquin.

    Optimism in Active Learning with Gaussian Processes, in: 22nd International Conference on Neural Information Processing (ICONIP2015), Istanbul, Turkey, November 2015.

    https://hal.inria.fr/hal-01225826
  • [15] B. Derbel, P. Preux.

    Simultaneous Optimistic Optimization on the Noiseless BBOB Testbed, in: The 17th IEEE Congress on Evolutionary Computation (CEC), Sendai, Japan, May 2015.

    https://hal.inria.fr/hal-01246420
  • [16] C. Dhanjal, R. Gaudel, S. Clémençon.

    Collaborative Filtering with Localised Ranking, in: Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI'15), Austin, United States, January 2015, 7 p.

    https://hal.inria.fr/hal-01255890
  • [17] H. Glaude, C. Enderli, J.-F. Grandin, O. Pietquin.

    Learning of scanning strategies for electronic support using predictive state representations, in: International Workshop on Machine Learning for Signal Processing (MLSP 2015), Boston, United States, September 2015.

    https://hal.inria.fr/hal-01225807
  • [18] H. Glaude, C. Enderli, O. Pietquin.

    Non-negative Spectral Learning for Linear Sequential Systems, in: 22nd International Conference on Neural Information Processing (ICONIP2015), Istanbul, Turkey, November 2015.

    https://hal.inria.fr/hal-01225838
  • [19] H. Glaude, C. Enderli, O. Pietquin.

    Spectral learning with proper probabilities for finite state automaton, in: ASRU 2015 - Automatic Speech Recognition and Understanding Workshop, Scottsdale, United States, Proceedings of the Automatic Speech Recognition and Understanding Workshop, IEEE, December 2015.

    https://hal.inria.fr/hal-01225810
  • [20] J.-B. Grill, M. Valko, R. Munos.

    Black-box optimization of noisy functions with unknown smoothness, in: Neural Information Processing Systems, Montréal, Canada, December 2015.

    https://hal.inria.fr/hal-01222915
  • [21] M. K. Hanawal, V. Saligrama, M. Valko, R. Munos.

    Cheap Bandits, in: International Conference on Machine Learning, Lille, France, 2015.

    https://hal.inria.fr/hal-01153540
  • [22] K. Lakshmanan, R. Ortner, D. Ryabko.

    Improved Regret Bounds for Undiscounted Continuous Reinforcement Learning, in: International Conference on Machine Learning (ICML), Lille, France, July 2015.

    https://hal.inria.fr/hal-01165966
  • [23] J. Mary, R. Gaudel, P. Preux.

    Bandits and Recommender Systems, in: First International Workshop on Machine Learning, Optimization, and Big Data (MOD'15), Taormina, Italy, Lecture Notes in Computer Science, Springer International Publishing, July 2015, vol. 9432, pp. 325-336. [ DOI : 10.1007/978-3-319-27926-8_29 ]

    https://hal.inria.fr/hal-01256033
  • [24] T. Munzer, B. Piot, M. Geist, O. Pietquin, M. Lopes.

    Inverse Reinforcement Learning in Relational Domains, in: International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

    https://hal.archives-ouvertes.fr/hal-01154650
  • [25] V. Musco, M. Monperrus, P. Preux.

    An Experimental Protocol for Analyzing the Accuracy of Software Error Impact Analysis, in: Tenth IEEE/ACM International Workshop on Automation of Software Test, Florence, Italy, May 2015.

    https://hal.inria.fr/hal-01120913
  • [26] G. Neu.

    Explore no more: Improved high-probability regret bounds for non-stochastic bandits, in: Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, Canada, December 2015, pp. 3150-3158.

    https://hal.inria.fr/hal-01223501
  • [27] G. Neu.

    First-order regret bounds for combinatorial semi-bandits, in: Proceedings of the 28th Annual Conference on Learning Theory (COLT), Paris, France, JMLR Workshop and Conference Proceedings, July 2015, vol. 40, pp. 1360-1375.

    https://hal.inria.fr/hal-01215001
  • [28] J. Perolat, B. Scherrer, B. Piot, O. Pietquin.

    Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, in: International Conference on Machine Learning (ICML 2015), Lille, France, July 2015.

    https://hal.inria.fr/hal-01153270
  • [29] B. Piot, M. Geist, O. Pietquin.

    Imitation Learning Applied to Embodied Conversational Agents, in: 4th Workshop on Machine Learning for Interactive Systems (MLIS 2015), Lille, France, JMLR Workshop and Conference Proceedings, July 2015, vol. 43.

    https://hal.inria.fr/hal-01225816
  • [30] D. Ryabko, B. Ryabko.

    Predicting the outcomes of every process for which an asymptotically accurate stationary predictor exists is impossible, in: International Symposium on Information Theory, Hong Kong, Hong Kong SAR China, IEEE, June 2015, pp. 1204-1206.

    https://hal.inria.fr/hal-01165876
  • [31] A. Sani, A. Lazaric, D. Ryabko.

    The Replacement Bootstrap for Dependent Data, in: Proceedings of the IEEE International Symposium on Information Theory, Hong Kong, Hong Kong SAR China, June 2015.

    https://hal.inria.fr/hal-01144547
  • [32] B. Szorenyi, R. Busa-Fekete, P. Weng, E. Hüllermeier.

    Qualitative Multi-Armed Bandits: A Quantile-Based Approach, in: Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, July 2015, pp. 1660-1668.

    https://hal.inria.fr/hal-01204708
  • [33] A. C. Y. Tossou, C. Dimitrakakis.

    Algorithms for Differentially Private Multi-Armed Bandits, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.

    https://hal.inria.fr/hal-01234427
  • [34] Z. Zhang, B. Rubinstein, C. Dimitrakakis.

    On the Differential Privacy of Bayesian Inference, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.

    https://hal.inria.fr/hal-01234215

Conferences without Proceedings

  • [35] F. Guillou, R. Gaudel, P. Preux.

    Collaborative Filtering as a Multi-Armed Bandit, in: NIPS'15 Workshop: Machine Learning for eCommerce, Montréal, Canada, December 2015.

    https://hal.inria.fr/hal-01256254
  • [36] F. Strub, J. Mary.

    Collaborative Filtering with Stacked Denoising AutoEncoders and Sparse Inputs, in: NIPS Workshop on Machine Learning for eCommerce, Montreal, Canada, December 2015.

    https://hal.inria.fr/hal-01256422

Scientific Books (or Scientific Book chapters)

Scientific Popularization

  • [38] P. Preux, M. Tommasi, T. Viéville, C. De La Higuera.

    L’apprentissage automatique : le diable n’est pas dans l’algorithme, June 2015, article on http://binaire.blog.lemonde.fr.

    https://hal.inria.fr/hal-01246178

Other Publications

References in notes
  • [42] P. Auer, N. Cesa-Bianchi, P. Fischer.

    Finite-time analysis of the multi-armed bandit problem, in: Machine Learning, 2002, vol. 47, no 2/3, pp. 235–256.
  • [43] R. Bellman.

    Dynamic Programming, Princeton University Press, 1957.
  • [44] D. Bertsekas, S. Shreve.

    Stochastic Optimal Control (The Discrete Time Case), Academic Press, New York, 1978.
  • [45] D. Bertsekas, J. Tsitsiklis.

    Neuro-Dynamic Programming, Athena Scientific, 1996.
  • [46] M. Puterman.

    Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, 1994.
  • [47] H. Robbins.

    Some aspects of the sequential design of experiments, in: Bull. Amer. Math. Soc., 1952, vol. 58, pp. 527–535.
  • [48] R. Sutton, A. Barto.

    Reinforcement learning: an introduction, MIT Press, 1998.
  • [49] P. Werbos.

    ADP: Goals, Opportunities and Principles, in: Handbook of Learning and Approximate Dynamic Programming, IEEE Press, 2004, pp. 3–44.