Inria | Raweb 2016 | Presentation of the Project-Team SEQUEL | SEQUEL Web Site



Bibliography

Major publications by the team in recent years

[1] O. Cappé, A. Garivier, O.-A. Maillard, R. Munos, G. Stoltz.
Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation, in: Annals of Statistics, 2013, vol. 41, no. 3, pp. 1516-1541.
https://hal.archives-ouvertes.fr/hal-00738209
[2] A. Carpentier, M. Valko.
Revealing graph bandits for maximizing local influence, in: International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01304020
[3] N. Gatti, A. Lazaric, M. Rocco, F. Trovò.
Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities, in: Artificial Intelligence, October 2015, vol. 227, pp. 93-139.
https://hal.inria.fr/hal-01237670
[4] M. Ghavamzadeh, Y. Engel, M. Valko.
Bayesian Policy Gradient and Actor-Critic Algorithms, in: Journal of Machine Learning Research, January 2016, vol. 17, no. 66, pp. 1-53.
https://hal.inria.fr/hal-00776608
[5] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.
Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2016.
https://hal.archives-ouvertes.fr/hal-01221329
[6] E. Kaufmann, O. Cappé, A. Garivier.
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models, in: Journal of Machine Learning Research, January 2016, vol. 17, pp. 1-42.
https://hal.archives-ouvertes.fr/hal-01024894
[7] A. Lazaric, M. Ghavamzadeh, R. Munos.
Analysis of Classification-based Policy Iteration Algorithms, in: Journal of Machine Learning Research, 2016, vol. 17, pp. 1-30.
https://hal.inria.fr/hal-01401513
[8] R. Munos.
From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning, 2014, 130 pages.
https://hal.archives-ouvertes.fr/hal-00747575
[9] R. Ortner, D. Ryabko, P. Auer, R. Munos.
Regret bounds for restless Markov bandits, in: Theoretical Computer Science, 2014, vol. 558, pp. 62-76. [ DOI : 10.1016/j.tcs.2014.09.026 ]
https://hal.inria.fr/hal-01074077
[10] D. Ryabko, J. Mary.
A Binary-Classification-Based Metric between Time-Series Distributions and Its Use in Statistical and Learning Problems, in: Journal of Machine Learning Research, 2013, vol. 14, pp. 2837-2856.
https://hal.inria.fr/hal-00913240

Publications of the year

Doctoral Dissertations and Habilitation Theses

[11] H. Glaude.
Learning rational linear sequential systems using the method of moments, Université de Lille 1 - Sciences et Technologies, July 2016.
https://tel.archives-ouvertes.fr/tel-01374080
[12] F. Guillou.
On Recommendation Systems in a Sequential Context, Université Lille 3, December 2016.
https://tel.archives-ouvertes.fr/tel-01407336
[13] V. Musco.
Propagation Analysis based on Software Graphs and Synthetic Data, Université Lille 3, November 2016.
https://tel.archives-ouvertes.fr/tel-01398903
[14] M. Valko.
Bandits on graphs and structures, École normale supérieure de Cachan - ENS Cachan, June 2016, Habilitation à diriger des recherches.
https://hal.inria.fr/tel-01359757

Articles in International Peer-Reviewed Journals

[15] M. Ghavamzadeh, Y. Engel, M. Valko.
Bayesian Policy Gradient and Actor-Critic Algorithms, in: Journal of Machine Learning Research, January 2016, vol. 17, no. 66, pp. 1-53.
https://hal.inria.fr/hal-00776608
[16] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.
Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2016.
https://hal.archives-ouvertes.fr/hal-01221329
[17] E. Kaufmann, O. Cappé, A. Garivier.
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models, in: Journal of Machine Learning Research, January 2016, vol. 17, pp. 1-42.
https://hal.archives-ouvertes.fr/hal-01024894
[18] A. Khaleghi, D. Ryabko.
Nonparametric multiple change point estimation in highly dependent time series, in: Theoretical Computer Science, 2016, vol. 620, pp. 119-133. [ DOI : 10.1016/j.tcs.2015.10.041 ]
https://hal.inria.fr/hal-01235330
[19] A. Khaleghi, D. Ryabko, J. Mary, P. Preux.
Consistent Algorithms for Clustering Time Series, in: Journal of Machine Learning Research, 2016, vol. 17, no. 3, pp. 1-32.
https://hal.inria.fr/hal-01399613
[20] A. Lazaric, M. Ghavamzadeh, R. Munos.
Analysis of Classification-based Policy Iteration Algorithms, in: Journal of Machine Learning Research, 2016, vol. 17, pp. 1-30.
https://hal.inria.fr/hal-01401513
[21] V. Musco, M. Monperrus, P. Preux.
A Large-scale Study of Call Graph-based Impact Prediction using Mutation Testing, in: Software Quality Journal, 2016. [ DOI : 10.1007/s11219-016-9332-8 ]
https://hal.inria.fr/hal-01346046
[22] G. Neu, G. Bartók.
Importance Weighting Without Importance Weights: An Efficient Algorithm for Combinatorial Semi-Bandits, in: Journal of Machine Learning Research, August 2016, vol. 17, no. 154, pp. 1-21.
https://hal.archives-ouvertes.fr/hal-01380278

International Conferences with Proceedings

[23] K. Azizzadenesheli, A. Lazaric, A. Anandkumar.
Reinforcement Learning of POMDPs using Spectral Methods, in: Proceedings of the 29th Annual Conference on Learning Theory (COLT 2016), New York City, United States, June 2016.
https://hal.inria.fr/hal-01322207
[24] M. Barlier, R. Laroche, O. Pietquin.
A Stochastic Model for Computer-Aided Human-Human Dialogue, in: Interspeech 2016, San Francisco, United States, September 2016, vol. 2016, pp. 2051-2055.
https://hal.inria.fr/hal-01406894
[25] M. Barlier, R. Laroche, O. Pietquin.
Learning Dialogue Dynamics with the Method of Moments, in: Workshop on Spoken Language Technology (SLT 2016), San Diego, United States, December 2016.
https://hal.inria.fr/hal-01406904
[26] D. Calandriello, A. Lazaric, M. Valko.
Analysis of Nyström method with sequential ridge leverage score sampling, in: Uncertainty in Artificial Intelligence, New York City, United States, June 2016.
https://hal.inria.fr/hal-01343674
[27] A. Carpentier, M. Valko.
Revealing graph bandits for maximizing local influence, in: International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01304020
[28] L. El Asri, R. Laroche, O. Pietquin.
Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory, in: 7th International Workshop on Spoken Dialogue Systems (IWSDS 2016), Saariselka, Finland, January 2016.
https://hal.inria.fr/hal-01406873
[29] L. El Asri, B. Piot, M. Geist, R. Laroche, O. Pietquin.
Score-based Inverse Reinforcement Learning, in: International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), Singapore, Singapore, May 2016.
https://hal.inria.fr/hal-01406886
[30] A. Erraqabi, M. Valko, A. Carpentier, O.-A. Maillard.
Pliable rejection sampling, in: International Conference on Machine Learning, New York City, United States, June 2016.
https://hal.inria.fr/hal-01322168
[31] C. Z. Felício, K. V. R. Paixão, C. A. Z. Barcelos, P. Preux.
Preference-like Score to Cope with Cold-Start User in Recommender Systems, in: Proceedings of the IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), San Jose, United States, November 2016.
https://hal.inria.fr/hal-01390762
[32] V. Gabillon, A. Lazaric, M. Ghavamzadeh, R. Ortner, P. Bartlett.
Improved Learning Complexity in Combinatorial Pure Exploration Bandits, in: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01322198
[33] A. Garivier, E. Kaufmann.
Optimal Best Arm Identification with Fixed Confidence, in: 29th Annual Conference on Learning Theory (COLT), New York, United States, JMLR Workshop and Conference Proceedings, June 2016, vol. 49.
https://hal.archives-ouvertes.fr/hal-01273838
[34] A. Garivier, E. Kaufmann, W. M. Koolen.
Maximin Action Identification: A New Bandit Framework for Games, in: 29th Annual Conference on Learning Theory (COLT), New York, United States, JMLR Workshop and Conference Proceedings, June 2016, vol. 49.
https://hal.archives-ouvertes.fr/hal-01273842
[35] A. Garivier, E. Kaufmann, T. Lattimore.
On Explore-Then-Commit Strategies, in: NIPS, Barcelona, Spain, Advances in Neural Information Processing Systems (NIPS), December 2016, vol. 29.
https://hal.archives-ouvertes.fr/hal-01322906
[36] H. Glaude, O. Pietquin.
PAC learning of Probabilistic Automaton based on the Method of Moments, in: International Conference on Machine Learning (ICML 2016), New York, United States, June 2016.
https://hal.inria.fr/hal-01406889
[37] J.-B. Grill, M. Valko, R. Munos.
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, in: NIPS 2016 - Thirtieth Annual Conference on Neural Information Processing Systems, Barcelona, Spain, December 2016.
https://hal.inria.fr/hal-01389107
[38] F. Guillou, R. Gaudel, P. Preux.
Large-scale Bandit Recommender System, in: the 2nd International Workshop on Machine Learning, Optimization and Big Data (MOD'16), Volterra, Italy, August 2016.
https://hal.inria.fr/hal-01406389
[39] F. Guillou, R. Gaudel, P. Preux.
Scalable explore-exploit Collaborative Filtering, in: Pacific Asia Conference on Information Systems (PACIS'16), Chiayi, Taiwan, 2016.
https://hal.inria.fr/hal-01406418
[40] F. Guillou, R. Gaudel, P. Preux.
Sequential Collaborative Ranking Using (No-)Click Implicit Feedback, in: The 23rd International Conference on Neural Information Processing (ICONIP'16), Kyoto, Japan, Lecture Notes in Computer Science, October 2016, vol. 9948, pp. 288-296. [ DOI : 10.1007/978-3-319-46672-9_33 ]
https://hal.inria.fr/hal-01406338
[41] E. Kaufmann, T. Bonald, M. Lelarge.
A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks, in: ALT 2016 - Algorithmic Learning Theory, Bari, Italy, R. Ortner, H. U. Simon, S. Zilles (editors), Lecture Notes in Computer Science, Springer, October 2016, vol. 9925, pp. 355-370. [ DOI : 10.1007/978-3-319-46379-7_24 ]
https://hal.archives-ouvertes.fr/hal-01163147
[42] T. Kocák, G. Neu, M. Valko.
Online learning with Erdős-Rényi side-observation graphs, in: Uncertainty in Artificial Intelligence, New York City, United States, June 2016.
https://hal.inria.fr/hal-01320588
[43] T. Kocák, G. Neu, M. Valko.
Online learning with noisy side observations, in: International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01303377
[44] V. Musco, A. Carette, M. Monperrus, P. Preux.
A Learning Algorithm for Change Impact Prediction, in: 5th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, Austin, United States, May 2016.
https://hal.inria.fr/hal-01279620
[45] V. Musco, M. Monperrus, P. Preux.
Mutation-Based Graph Inference for Fault Localization, in: International Working Conference on Source Code Analysis and Manipulation, Raleigh, United States, October 2016.
https://hal.inria.fr/hal-01350515
[46] J. Pérolat, B. Piot, M. Geist, B. Scherrer, O. Pietquin.
Softened Approximate Policy Iteration for Markov Games, in: ICML 2016 - 33rd International Conference on Machine Learning, New York City, United States, June 2016.
https://hal.inria.fr/hal-01393328
[47] J. Pérolat, B. Piot, B. Scherrer, O. Pietquin.
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games, in: 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), Cadiz, Spain, Proceedings of the International Conference on Artificial Intelligence and Statistics, May 2016.
https://hal.inria.fr/hal-01291495
[48] D. Ryabko.
Things Bayes can't do, in: Proceedings of the 27th International Conference on Algorithmic Learning Theory (ALT'16), Bari, Italy, Lecture Notes in Computer Science, October 2016, vol. 9925, pp. 253-260. [ DOI : 10.1007/978-3-319-46379-7_17 ]
https://hal.inria.fr/hal-01380063
[49] F. Strub, R. Gaudel, J. Mary.
Hybrid Recommender System based on Autoencoders, in: the 1st Workshop on Deep Learning for Recommender Systems, Boston, United States, September 2016, pp. 11-16. [ DOI : 10.1145/2988450.2988456 ]
https://hal.inria.fr/hal-01336912
[50] A. C. Y. Tossou, C. Dimitrakakis.
Algorithms for Differentially Private Multi-Armed Bandits, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.
https://hal.inria.fr/hal-01234427
[51] Z. Zhang, B. Rubinstein, C. Dimitrakakis.
On the Differential Privacy of Bayesian Inference, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.
https://hal.inria.fr/hal-01234215

Conferences without Proceedings

[52] A. Bérard, C. Servan, O. Pietquin, L. Besacier.
MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP, in: The 10th edition of the Language Resources and Evaluation Conference (LREC), Portoroz, Slovenia, May 2016.
https://hal.archives-ouvertes.fr/hal-01335930
[53] F. Guillou, R. Gaudel, P. Preux.
Compromis exploration-exploitation pour système de recommandation à grande échelle, in: Conférence francophone sur l'Apprentissage Automatique (CAp'16), Marseille, France, July 2016.
https://hal.inria.fr/hal-01406439
[54] F. Strub, J. Mary, R. Gaudel.
Filtrage Collaboratif Hybride avec des Auto-encodeurs, in: Conférence francophone sur l'Apprentissage Automatique (CAp'16), Marseille, France, July 2016.
https://hal.inria.fr/hal-01406432

Internal Reports

[55] B. Danglot, P. Preux, B. Baudry, M. Monperrus.
Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation, HAL, 2016, no. hal-01378523.
https://hal.archives-ouvertes.fr/hal-01378523

Other Publications

[56] S. Bubeck, R. Eldan, J. Lehec.
Sampling from a log-concave distribution with Projected Langevin Monte Carlo, January 2017, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01428950
[57] C. Dimitrakakis, F. Jarboui, D. Parkes, L. Seeman.
Multi-view Sequential Games: The Helper-Agent Problem, December 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01408294
[58] E. Kaufmann.
On Bayesian index policies for sequential resource allocation, September 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01251606
[59] A. R. Luedtke, E. Kaufmann, A. Chambaz.
Asymptotically Optimal Algorithms for Multiple Play Bandits with Partial Feedback, June 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01338733
[60] F. Strub, J. Mary, R. Gaudel.
Hybrid Collaborative Filtering with Autoencoders, July 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01281794

References in notes

[61] P. Auer, N. Cesa-Bianchi, P. Fischer.
Finite-time analysis of the multi-armed bandit problem, in: Machine Learning, 2002, vol. 47, no. 2/3, pp. 235–256.
[62] R. Bellman.
Dynamic Programming, Princeton University Press, 1957.
[63] D. Bertsekas, S. Shreve.
Stochastic Optimal Control (The Discrete Time Case), Academic Press, New York, 1978.
[64] D. Bertsekas, J. Tsitsiklis.
Neuro-Dynamic Programming, Athena Scientific, 1996.
[65] M. Puterman.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, 1994.
[66] H. Robbins.
Some aspects of the sequential design of experiments, in: Bull. Amer. Math. Soc., 1952, vol. 58, pp. 527–535.
[67] R. Sutton, A. Barto.
Reinforcement learning: an introduction, MIT Press, 1998.
[68] P. Werbos.
ADP: Goals, Opportunities and Principles, in: Handbook of Learning and Approximate Dynamic Programming, IEEE Press, 2004, pp. 3–44.
