Section: New Results
Batched Bandit Problems
Participant: Vianney Perchet [correspondent].
Collaboration with Philippe Rigollet, Sylvain Chassang and Erik Snowberg.
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches already yields close to minimax-optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
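To make the batching constraint concrete, the following is a minimal sketch of a two-armed explore-then-commit policy whose decisions may change only at batch boundaries. It is an illustration under stated assumptions, not necessarily the policy analyzed in this work: the function name `batched_etc`, the equally spaced batch grid, the Bernoulli reward model, and the Hoeffding-style confidence radius are all hypothetical choices made for the example.

```python
import math
import random


def batched_etc(means, horizon, num_batches, seed=0):
    """Two-armed batched explore-then-commit sketch (Bernoulli rewards).

    The arm played inside a batch depends only on data from earlier batches.
    Returns pseudo-regret, committed arm (or None), switches and pull counts.
    """
    if len(means) != 2 or num_batches < 1 or horizon < 2 * num_batches:
        raise ValueError("need two arms and horizon >= 2 * num_batches >= 2")
    rng = random.Random(seed)
    # Assumed grid: equally spaced batch ends (an illustrative choice only).
    ends = [horizon * (k + 1) // num_batches for k in range(num_batches)]
    pulls, sums = [0, 0], [0.0, 0.0]
    committed, last_arm, switches, regret = None, None, 0, 0.0
    start = 0
    for k, end in enumerate(ends):
        # Entering the last batch undecided: commit to the empirical leader.
        if committed is None and k == num_batches - 1 and k > 0:
            committed = 0 if sums[0] / pulls[0] >= sums[1] / pulls[1] else 1
        half = start + (end - start) // 2
        for t in range(start, end):
            if committed is not None:
                arm = committed
            else:
                # Explore in two blocks per batch to keep switches low.
                arm = 0 if t < half else 1
            reward = 1.0 if rng.random() < means[arm] else 0.0
            pulls[arm] += 1
            sums[arm] += reward
            regret += max(means) - means[arm]  # pseudo-regret
            if last_arm is not None and arm != last_arm:
                switches += 1
            last_arm = arm
        # Feedback from the batch is used only here, at its end.
        if committed is None:
            gap = sums[0] / pulls[0] - sums[1] / pulls[1]
            # Assumed Hoeffding-style radius for rewards in [0, 1].
            radius = math.sqrt(2.0 * math.log(horizon) / min(pulls))
            if abs(gap) > radius:
                committed = 0 if gap > 0 else 1
        start = end
    return {"regret": regret, "committed": committed,
            "switches": switches, "pulls": pulls}


if __name__ == "__main__":
    print(batched_etc((0.9, 0.1), horizon=10000, num_batches=4))
```

In this sketch the number of arm switches is at most two per exploring batch, which loosely mirrors the low-switching-cost property mentioned above. The grid and the stopping test are placeholders, and a carefully tuned grid would be needed to approach the regret guarantees described in this section.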