SEQUEL - 2018 - Annual activity report

Section: Partnerships and Cooperations

International Initiatives

Inria Associate Teams Not Involved in an Inria International Lab

Allocate

Participants: Pierre Perrault, Julien Seznec, Michal Valko, Émilie Kaufmann, Odalric Maillard.

Title: Adaptive allocation of resources for recommender systems
Inria contact: Michal Valko
International Partner (Institution - Laboratory - Researcher):
- Otto-von-Guericke-Universität Magdeburg A. Carpentier
Start year: 2017
We aim to improve a practical resource-allocation scenario in market surveys, such as product appraisals and music recommendation. In practice, the market is typically divided into segments such as geographic regions or age groups. These groups are then queried for their preferences under a fixed rule that assigns a set number of queries to each group. This testing is costly and non-adaptive: some groups are easier to estimate than others, but which ones cannot be known a priori. Our challenge is to adaptively allocate the optimal number of samples to each group and thereby improve the efficiency of market studies with sample-efficient solutions. In 2018 we made substantial progress, leading to two new research results that are currently under review.
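To make the contrast with a fixed per-group quota concrete, here is a minimal sketch of one possible adaptive rule. It is not the associate team's method: it is a simple plug-in greedy heuristic, and the function name, the group names, and the noise levels are all hypothetical. After a short warm-up, each new query goes to the group whose current mean estimate looks least reliable.

```python
import random
import statistics


def mean_uncertainty(scores):
    """Estimated variance of the sample mean: sample variance / sample size."""
    return statistics.variance(scores) / len(scores)


def adaptive_allocation(samplers, budget, warmup=5):
    """Greedily spend `budget` queries across groups (illustrative heuristic).

    `samplers` maps a group name to a zero-argument function that returns one
    noisy preference score. Every group first receives `warmup` queries; each
    remaining query goes to the group with the largest estimated variance of
    its mean estimate.
    """
    if warmup < 2:
        raise ValueError("warmup must be at least 2 to estimate a variance")
    if budget < warmup * len(samplers):
        raise ValueError("budget too small for the warm-up phase")

    observations = {name: [draw() for _ in range(warmup)]
                    for name, draw in samplers.items()}
    spent = warmup * len(samplers)

    while spent < budget:
        # Pick the group whose mean is currently estimated least precisely.
        target = max(observations,
                     key=lambda name: mean_uncertainty(observations[name]))
        observations[target].append(samplers[target]())
        spent += 1
    return observations


# Hypothetical market segments: one is easy to estimate, one is hard.
rng = random.Random(0)
samplers = {
    "quiet": lambda: rng.gauss(0.6, 0.1),  # low-noise group
    "noisy": lambda: rng.gauss(0.4, 2.0),  # high-noise group
}
obs = adaptive_allocation(samplers, budget=200)
counts = {name: len(scores) for name, scores in obs.items()}
```

Under these assumed noise levels, the rule spends most of the budget on the high-noise group, whereas a fixed rule would give both groups the same number of queries. A plug-in rule like this can be misled by an unlucky warm-up, which is one reason the general problem calls for more careful analysis.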

Inria International Partners

Declared Inria International Partners

SequeL
Title: The multi-armed bandit problem
International Partner (Institution - Laboratory - Researcher):
- University of Leoben (Austria) Peter Auer
Duration: 2014 - 2018
Start year: 2014
In a nutshell, the collaboration focuses on nonparametric algorithms for active learning problems, mainly involving the theoretical analysis of reinforcement learning and bandit problems beyond the traditional settings of finite-state MDPs (for RL) or i.i.d. rewards (for bandits). Peter Auer from the University of Leoben is a worldwide leader in the field, having introduced the UCB approach around 2000, along with its finite-time analysis. Today, SequeL is likely to be the largest research group working in this field in the world, and it enjoys worldwide recognition. SequeL and P. Auer's group have been collaborating for a couple of years now: they have co-authored papers, visited each other (sabbatical stay, post-doc), and co-organized workshops. The STREP Complacs partially funds this very active collaboration.

CWI

We also collaborate with P. Grünwald and W. Koolen through the associate team headed by Benjamin Guedj from Modal.

Participation in Other International Programs

In 2017, we reported many collaborations, including with Adobe, MIT, Stanford, and Leoben.

Massachusetts Institute of Technology

Victor-Emmanuel Brunel Collaborator
M. Valko collaborated with V.-E. Brunel on the estimation of low-rank determinantal point processes, which are useful for diverse recommender systems.

Otto-von-Guericke-Universität Magdeburg

Alexandra Carpentier Collaborator
M. Valko collaborated with A. Carpentier on adaptive estimation of block-diagonal matrices, with application to market segmentation. This collaboration was formalized in September 2017 by creating a North-European associate team, which has produced two finished results.

Adobe Research

Y. Abbasi-Yadkori Collaborator
M. Valko collaborated with Y. Abbasi-Yadkori on learning in unpredictable but potentially easy environments. This led to a publication in COLT 2018.

University of California, Berkeley

Peter Bartlett Collaborator
Victor Gabillon Collaborator
Alain Malek Collaborator
M. Valko collaborated with P. Bartlett, V. Gabillon, and A. Malek on sample complexity in environments of unknown type.

DeepMind London

Rémi Munos Collaborator
M. Valko collaborated with R. Munos on Brownian motion maximization, important for stock value predictions. This led to a publication in NIPS 2018.

Mila, Université de Montréal

A. Courville Collaborator
A. Touati Collaborator
F. Strub and O. Pietquin collaborate on deep reinforcement learning for language acquisition. This led to several papers at IJCAI, CVPR, and NIPS, as well as the GuessWhat?! dataset and protocol, and the HOME dataset.
M. Valko collaborates on faster submodular learning with limited feedback. This setting has applications in marketing, where the goal is to select an inventory that maximizes profit.

McGill University, Montreal

A. Durand, J. Pineau Collaborators
A. Durand and O.-A. Maillard collaborate on a project on structured bandits, with application in physics (calibration).

Northeastern University, Boston

M. Aziz, J. Anderton, J. Aslam Collaborators
E. Kaufmann collaborates with M. Aziz, J. Anderton, and J. Aslam on a project on infinite bandits, which led to an ALT 2018 publication. E. Kaufmann also collaborates with M. Aziz on bandits for phase I clinical trials. This led to the submission of a paper to the Biometrics journal.
