SEQUEL - 2019 - Annual activity report

Section: Partnerships and Cooperations

European Initiatives

Collaborations in European Programs, Except FP7 & H2020

DELTA

Participants: Michal Valko, Émilie Kaufmann, Omar Darwiche Domingues, Pierre Ménard.

Program: CHIST-ERA
Project acronym: DELTA
Project title: Dynamically Evolving Long-Term Autonomy
Duration: October 2017 - December 2021
Coordinator: Anders Jonsson (PI)
Inria Coordinator: Michal Valko
Other partners: UPF (Spain), MUL (Austria), ULG (Belgium)
Abstract: Many complex autonomous systems (e.g., electrical distribution networks) repeatedly select actions with the aim of achieving a given objective. Reinforcement learning (RL) offers a powerful framework for acquiring adaptive behaviour in this setting, associating a scalar reward with each action and learning from experience which action to select to maximise long-term reward. Although RL has produced impressive results recently (e.g., achieving human-level play in Atari games and beating the human world champion in the board game Go), most existing solutions only work under strong assumptions: the environment model is stationary, the objective is fixed, and trials end once the objective is met. The aim of this project is to advance the state of the art of fundamental research in lifelong RL by developing several novel RL algorithms that relax the above assumptions. The new algorithms should be robust to environmental changes, both in terms of the observations that the system can make and the actions that the system can perform. Moreover, the algorithms should be able to operate over long periods of time while achieving different objectives. The proposed algorithms will address three key problems related to lifelong RL: planning, exploration, and task decomposition. Planning is the problem of computing an action selection strategy given a (possibly partial) model of the task at hand. Exploration is the problem of selecting actions with the aim of mapping out the environment rather than achieving a particular objective. Task decomposition is the problem of defining different objectives and assigning a separate action selection strategy to each. The algorithms will be evaluated in two realistic scenarios: active network management for electrical distribution networks, and microgrid management. A test protocol will be developed to evaluate each individual algorithm, as well as their combinations.
