LARSEN - 2017 - Annual activity report

Section: New Results

Lifelong Autonomy

Sensorized environment

Localisation of Robots on a Load-sensing Floor

Participants : François Charpillet, Francis Colas, Vincent Thomas.

The use of floor sensors in ambient intelligence contexts began in the late 1990s. We designed such a sensing floor in Nancy in collaboration with the Hikob company (http://www.hikob.com) and Inria SED. This load-sensing floor is composed of square tiles, each equipped with two ARM processors (Cortex M3 and A8), 4 load cells, and a wired connection to its four neighboring tiles. Ninety tiles cover the floor of our experimental platform (HIS).

This year, with Aurelien Andre (master's student from Univ. Lorraine), we focused on tracking robots in several scenarios, using data from the sensing tiles collected in previous years. We proposed a new approach to build relevant clusters of tiles (based on connectivity). For single-robot scenarios, we focused on basic algorithms (for instance, the Kalman filter) and on the Probabilistic Data Association Filter (PDAF), which accounts for possible false positives in the Bayesian filter. Then, for multi-target tracking, we investigated more elaborate strategies for associating atomic measurements with the tracked targets, such as JPDAF (Joint Probabilistic Data Association Filter [58]) and JPDAMF (Joint Probabilistic Data Association Merged Filter [45]), in order to handle measurements resulting from several targets.
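As an illustration of clustering tiles by connectivity, the sketch below groups loaded tiles into 4-connected clusters and computes a load-weighted centre for each cluster, which could then serve as a measurement for a tracking filter. It is a minimal sketch and not the team's implementation: the `(row, col)` tile indexing, the load threshold, and the function names are all assumptions made for the example.

```python
from collections import deque


def cluster_active_tiles(loads, threshold):
    """Group tiles whose load exceeds `threshold` into 4-connected clusters.

    `loads` maps hypothetical (row, col) tile coordinates to a measured load.
    Returns a list of clusters, each a sorted list of tile coordinates.
    """
    active = {tile for tile, load in loads.items() if load > threshold}
    seen = set()
    clusters = []
    for start in sorted(active):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        cluster = []
        # Breadth-first search over neighbouring active tiles.
        while queue:
            row, col = queue.popleft()
            cluster.append((row, col))
            for neighbor in ((row - 1, col), (row + 1, col),
                             (row, col - 1), (row, col + 1)):
                if neighbor in active and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(cluster))
    return clusters


def cluster_centroid(cluster, loads):
    """Load-weighted centre of a cluster, in tile coordinates."""
    total = sum(loads[tile] for tile in cluster)
    row = sum(tile[0] * loads[tile] for tile in cluster) / total
    col = sum(tile[1] * loads[tile] for tile in cluster) / total
    return row, col


# Made-up readings: two adjacent loaded tiles, one isolated tile, one below threshold.
loads = {(0, 0): 12.0, (0, 1): 8.0, (3, 3): 20.0, (1, 1): 0.5}
clusters = cluster_active_tiles(loads, threshold=1.0)
centroids = [cluster_centroid(c, loads) for c in clusters]
```

With these made-up readings, the two adjacent tiles form one cluster and the isolated tile forms another; each centroid is one candidate position measurement.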

High Integrity Personal Tracking Using Fault Tolerant Multi-Sensor Data Fusion

Participants : François Charpillet, Maan Badaoui El-Najjar.

Maan Badaoui El Najjar is a professor at the University of Lille and the head of the DiCOT team (“Diagnostic, Control and Observation for fault Tolerant Systems”) of the CRIStAL Laboratory.

The objective of this PhD work is to study the possibilities offered by the above-mentioned load-sensing floor. The idea is to combine the information from each sensor (load sensors and accelerometers) to identify activities of daily living (walking, standing, lying down, sitting, falling) and to create a positioning system for the person in the apartment. The approach relies on information theory to detect outliers during the fusion process: informational filters and fault detection are used to identify and eliminate faulty measurements. This work was carried out as part of the PhD thesis of Mohamad Daher under the supervision of François Charpillet and Maan Badaoui El Najjar. The thesis was defended at the University of Lille on 13 December 2017.

Publication: [14]

Active Sensing and Multi-Camera Tracking

Participants : François Charpillet, Vincent Thomas.

The problem of active sensing is of paramount interest for building self-awareness in robotic systems. It consists of having a system make decisions in order to gather information (measured through the entropy of the probability distribution over unknown variables) in an optimal way.

The problem we are focusing on consists of following the trajectories of persons with the help of several controllable cameras in the smart environment. The approach we are working on is based on probabilistic decision processes under partial observability (POMDPs, Partially Observable Markov Decision Processes) and particle filters. In the past, we proposed an original formalism, $ρ$-POMDP, and new algorithms for representing and solving active sensing problems [43] by tracking several persons with fixed cameras, based on particle filters and a Simultaneous Tracking and Activity Recognition approach [49].

This year, approaches based on Monte-Carlo Tree Search algorithms (MCTS) like POMCP [60] have been used to build policies for following a single person with several controllable cameras in a simulated environment.
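To make the entropy-based criterion concrete, the sketch below scores a sensing action by the expected entropy of the belief after observing, and greedily picks the action with the lowest score. It is a one-step illustration under strong assumptions (a small discrete state space, known observation models, hypothetical action names); it is neither the $ρ$-POMDP algorithms nor the POMCP-based policies mentioned above, which plan over sequences of actions.

```python
import math


def entropy(belief):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)


def posterior(belief, likelihood):
    """Bayes update, where likelihood[s] = P(observation | state s)."""
    unnormalized = [p * l for p, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def expected_entropy(belief, observation_model):
    """Expected posterior entropy for one sensing action.

    observation_model[o][s] = P(observation o | state s) under that action.
    """
    result = 0.0
    for likelihood in observation_model:
        prob_obs = sum(p * l for p, l in zip(belief, likelihood))
        if prob_obs > 0.0:
            result += prob_obs * entropy(posterior(belief, likelihood))
    return result


def best_action(belief, actions):
    """Name of the action whose expected posterior entropy is lowest."""
    return min(actions, key=lambda name: expected_entropy(belief, actions[name]))


# Hypothetical example: a person is in one of 4 zones, two camera settings.
belief = [0.25, 0.25, 0.25, 0.25]
actions = {
    # "left" tells zones {0, 1} apart from zones {2, 3} without error.
    "left": [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],
    # "blind" returns the same observation whatever the zone.
    "blind": [[1.0, 1.0, 1.0, 1.0]],
}
chosen = best_action(belief, actions)
```

In this toy case the uniform belief has 2 bits of entropy; the "left" setting is expected to leave 1 bit while "blind" leaves all 2 bits, so "left" is selected.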

Partially Observable Markovian Decision Processes (POMDP)

Solving $ρ$-POMDP using Lipschitz Properties

Participant : Vincent Thomas.

We are currently investigating how to solve continuous MDPs and $ρ$-POMDPs by using the Lipschitz property (rather than the classical piecewise-linear and convex property used to solve POMDPs). We have proven that if the transition and reward functions are Lipschitz-continuous, the value function has the same property.

With Mathieu Fehr (a student at ENS Ulm), we have studied new algorithms based on HSVI (Heuristic Search Value Iteration [62]) that take advantage of the Lipschitz-continuity property. The properties of these algorithms are currently being investigated.
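One reason Lipschitz continuity is useful for bounding a value function: if $|f(x) - f(y)| \le \lambda\,|x - y|$ for all $x, y$, then a few evaluated points give upper and lower bounds everywhere. The sketch below shows this standard construction for a scalar argument. It is only an illustration of the property, not the HSVI-based algorithms under study; the function name and the sample values are made up.

```python
def lipschitz_bounds(samples, lipschitz_constant, x):
    """Bounds on f(x) for a function with the given Lipschitz constant.

    samples: list of (x_i, f(x_i)) pairs with scalar x_i.
    Returns (lower, upper) such that lower <= f(x) <= upper.
    """
    # Each sample defines a cone above and below its value; intersect them.
    upper = min(v + lipschitz_constant * abs(x - xi) for xi, v in samples)
    lower = max(v - lipschitz_constant * abs(x - xi) for xi, v in samples)
    return lower, upper


# Made-up samples of a function assumed to be 1-Lipschitz.
samples = [(0.0, 1.0), (2.0, 2.0)]
lower, upper = lipschitz_bounds(samples, lipschitz_constant=1.0, x=1.0)
```

The bounds are tight at the sampled points and widen linearly with distance from them, so adding samples where the gap is largest narrows the bracket around the true function.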

Distributed Exploration of an Unknown Environment by a Swarm of Robots

Participants : Nassim Kalde, François Charpillet, Olivier Simonin.

Olivier Simonin is a professor at INSA Lyon and the scientific leader of the Chroma team.

In this PhD, we explored how a team of cooperating mobile robots can intelligently explore an unknown environment. This question was addressed in the frameworks of both sequential decision making and frontier-based exploration. The environments considered include both static and populated environments.

This work was carried out as part of the PhD thesis of Nassim Kalde under the supervision of François Charpillet and Olivier Simonin. The thesis was defended on 12 December 2017.
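For readers unfamiliar with frontier-based exploration, the sketch below shows its basic ingredient: on an occupancy grid, a frontier is a known free cell adjacent to an unknown cell, and robots are sent towards frontiers to uncover new space. It is a generic textbook-style illustration, not the method developed in the thesis; the cell encoding, the greedy nearest-frontier assignment, and the robot names are assumptions.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid):
    """Return free cells that touch at least one unknown cell (4-neighbourhood)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def assign_frontiers(robots, frontiers):
    """Greedy assignment: each robot, in order, takes the nearest free frontier.

    Distances are Manhattan distances and ignore obstacles (a simplification).
    """
    remaining = list(frontiers)
    assignment = {}
    for name, (r, c) in robots.items():
        if not remaining:
            break
        target = min(remaining, key=lambda f: abs(f[0] - r) + abs(f[1] - c))
        assignment[name] = target
        remaining.remove(target)
    return assignment


# Small made-up map: the right-hand column is partly unexplored.
grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
frontiers = find_frontiers(grid)
assignment = assign_frontiers({"r1": (0, 0), "r2": (2, 0)}, frontiers)
```

Assigning distinct frontiers to distinct robots is what spreads the team over the unexplored area; a real system would use path lengths rather than Manhattan distances and would have to account for moving people in populated environments.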

Robot Learning

Black-box Data-efficient RObot Policy Search (Black-DROPS)

Participants : Konstantinos Chatzilygeroudis, Dorian Goepp, Rituraj Kaushik, Jean-Baptiste Mouret.

The most data-efficient algorithms for reinforcement learning (RL) in robotics are based on uncertain dynamical models: after each episode, they first learn a dynamical model of the robot, then they use an optimization algorithm to find a policy that maximizes the expected return given the model and its uncertainties. It is often believed that this optimization can be tractable only if analytical, gradient-based algorithms are used; however, these algorithms require using specific families of reward functions and policies, which greatly limits the flexibility of the overall approach. We introduced a novel model-based RL algorithm [23], called Black-DROPS (Black-box Data-efficient RObot Policy Search), that: (1) does not impose any constraint on the reward function or the policy (they are treated as black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical approaches when several cores are available. The key idea is to replace the gradient-based optimization algorithm with a parallel, black-box algorithm that takes into account the model uncertainties. We demonstrate the performance of our new algorithm on two standard control benchmark problems (in simulation) and a low-cost robotic manipulator (with a real robot).

Publications: [23]

Reset-free Data-efficient Trial-and-error for Robot Damage Recovery

Participants : Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret, Vassilis Vassiliades.

The state-of-the-art RL algorithms for robotics require the robot and the environment to be reset to an initial state after each episode, that is, the robot is not learning autonomously. In addition, most of the RL methods for robotics do not scale well with complex robots (e.g., walking robots) and either cannot be used at all or take too long to converge to a solution (e.g., hours of learning). We introduced a novel learning algorithm called “Reset-free Trial-and-Error” (RTE) that (1) breaks the complexity by pre-generating hundreds of possible behaviors with a dynamics simulator of the intact robot, and (2) allows complex robots to quickly recover from damage while completing their tasks and taking the environment into account [13]. We evaluated our algorithm on a simulated wheeled robot, a simulated six-legged robot, and a real six-legged walking robot that are damaged in several ways (e.g., a missing leg, a shortened leg, or a faulty motor) and whose objective is to reach a sequence of targets in an arena. Our experiments show that the robots can recover most of their locomotion abilities in an environment with obstacles, and without any human intervention.

Publications: [13]

Illumination & Quality Diversity Algorithms

Using Centroidal Voronoi Tessellations to Scale up the MAP-Elites Algorithm

Participants : Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret, Vassilis Vassiliades.

The MAP-Elites algorithm [55] is a key step of our “Intelligent Trial and Error” approach [46] for data-efficient damage recovery. It works by discretizing a continuous feature space into unique regions according to the desired discretization per dimension. While simple, this algorithm has one main drawback: it cannot scale to high-dimensional feature spaces since the number of regions increases exponentially with the number of dimensions. We addressed this limitation by introducing a simple extension of MAP-Elites that has a constant, pre-defined number of regions irrespective of the dimensionality of the feature space [21]. Our main insight is that methods from computational geometry could partition a high-dimensional space into well-spread geometric regions. In particular, our algorithm uses a centroidal Voronoi tessellation (CVT) to divide the feature space into a desired number of regions; it then places every generated individual in its closest region, replacing a less fit one if the region is already occupied. We demonstrated the effectiveness of the new “CVT-MAP-Elites” algorithm in high-dimensional feature spaces through comparisons against MAP-Elites in maze navigation and hexapod locomotion tasks.
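The two steps described above (build a fixed number of well-spread regions, then keep the fittest individual per region) can be sketched in a few lines. This is a simplified sketch and not the published implementation: the CVT is approximated with Lloyd's algorithm on uniform random samples of the unit hypercube, variation is plain Gaussian mutation, and every name and parameter value (`num_samples`, `sigma`, the 100 initial random genomes) is a hypothetical choice for the example.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest(centroids, point):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], point))


def cvt_centroids(num_regions, dim, num_samples=1000, iterations=10, rng=random):
    """Approximate a CVT of [0, 1]^dim with Lloyd's algorithm on random samples."""
    samples = [[rng.random() for _ in range(dim)] for _ in range(num_samples)]
    centroids = [list(s) for s in rng.sample(samples, num_regions)]
    for _ in range(iterations):
        groups = [[] for _ in range(num_regions)]
        for s in samples:
            groups[closest(centroids, s)].append(s)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a region received no sample
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def add_to_archive(archive, centroids, genome, features, fitness):
    """Place an individual in its closest region; keep it only if the region
    is empty or currently holds a less fit individual."""
    region = closest(centroids, features)
    if region not in archive or archive[region][1] < fitness:
        archive[region] = (genome, fitness)
        return True
    return False


def cvt_map_elites(evaluate, genome_dim, feature_dim, num_regions=50,
                   evaluations=1000, sigma=0.1, seed=0):
    """evaluate(genome) must return (features in [0, 1]^feature_dim, fitness)."""
    rng = random.Random(seed)
    centroids = cvt_centroids(num_regions, feature_dim, rng=rng)
    archive = {}  # region index -> (genome, fitness)
    for step in range(evaluations):
        if step < 100 or not archive:
            # Random initialisation phase.
            genome = [rng.random() for _ in range(genome_dim)]
        else:
            # Mutate a randomly chosen elite, clamped to [0, 1].
            parent = rng.choice(list(archive.values()))[0]
            genome = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma)))
                      for g in parent]
        features, fitness = evaluate(genome)
        add_to_archive(archive, centroids, genome, features, fitness)
    return centroids, archive
```

Because the number of regions is an explicit argument, the archive size stays bounded by `num_regions` whatever the value of `feature_dim`, which is the property that a per-dimension grid lacks. A caller supplies an `evaluate` function, for instance one that returns the first two genome values as features and a task score as fitness.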

Publications: [21], [37], [38]

Aerodynamic Design Exploration through Surrogate-Assisted Illumination

Participants : Adam Gaier, Jean-Baptiste Mouret.

Design optimization techniques are often used at the beginning of the design process to explore the space of possible designs. In these domains, illumination algorithms, such as MAP-Elites, are promising alternatives to classic optimization algorithms because they produce diverse, high-quality solutions in a single run, instead of a single, near-optimal solution. Unfortunately, these algorithms currently require a large number of function evaluations, limiting their applicability. In our recent work [27], [26], we introduced a new illumination algorithm, called Surrogate-Assisted Illumination (SAIL), that creates a map of the design space according to user-defined features by leveraging surrogate modeling and intelligent sampling to minimize the number of evaluations. On a 2-dimensional airfoil optimization problem, SAIL produces hundreds of diverse but high-performing designs with several orders of magnitude fewer evaluations than MAP-Elites [55] or CMA-ES [52]. We also showed that SAIL can produce maps of high-performing designs in a more realistic 3-dimensional aerodynamic task with an accurate flow simulation. Overall, SAIL can help designers understand what is possible, beyond what is optimal, by considering more than pure objective-based optimization.

Publications: [27], [26]

Applications – civil robotics

Minimally Invasive Exploration of Heritage Buildings

Participants : Jean-Baptiste Mouret, Lucien Renaud, Kapil Sawant.

In 2017, the team officially joined the ScanPyramids mission, which aims to better understand how the pyramids of the Old Kingdom were built and to encourage innovations in various fields (muography, virtual reality, simulation, ...) that could be useful for the pyramids as well as for other monuments. The ScanPyramids team has discovered several previously unknown voids in the pyramid of Cheops, one of them, called the “ScanPyramids' Big Void”, with a size similar to that of the Grand Gallery.

We contributed to the article about the ScanPyramids' Big Void [17] and designed several prototypes for minimally invasive exploration. We envision exploration taking place in two stages. In the first stage, a tubular robot fitted with an omnidirectional camera would be inserted to take high-resolution pictures of the inaccessible place. In the second stage, the team would use the same hole to send in a remotely operated exploration robot to travel through corridors and help map the interior. For this second stage, we are currently designing a miniature blimp that would be folded during insertion, then remotely inflated once in the inaccessible place. When the exploration is over, the blimp would come back to its base, be deflated, and be extracted through the insertion hole.

Publications: [17]

Humanoid Robotics

Trial-and-error Learning of Repulsors for Humanoid QP-based Whole-Body Control

Participants : Karim Bouyarmane, Serena Ivaldi, Jean-Baptiste Mouret, Jonathan Spitz, Vassilis Vassiliades.

Whole-body controllers based on quadratic programming allow humanoid robots to achieve complex motions. However, they rely on the assumption that the model perfectly captures the dynamics of the robot and its environment, whereas even the most accurate models are never perfect. We introduced a trial-and-error learning algorithm that allows whole-body controllers to operate in spite of inaccurate models, without needing to update these models [35]. The main idea is to encourage the controller to perform the task differently after each trial by introducing repulsors in the quadratic program cost function. We demonstrated our algorithm on (1) a simple 2D case and (2) a simulated iCub robot for which the model used by the controller and the one used in simulation do not match.

Publications: [35]

Safe Trajectory Optimization for Whole-body Motion of Humanoids

Participants : Serena Ivaldi, Valerio Modugno.

Multi-task prioritized controllers generate complex behaviors for humanoids that concurrently satisfy several tasks and constraints. In our previous work, we automatically learned the task priorities that maximized the robot performance in whole-body reaching tasks, ensuring that the optimized priorities led to safe behaviors. Here, we take the opposite approach: we optimize the task trajectories for whole-body balancing tasks with switching contacts, ensuring that the optimized movements are safe and never violate any of the robot and problem constraints. We use (1+1)-CMA-ES with Constrained Covariance Adaptation as a constrained black-box stochastic optimization algorithm, with an instance of (1+1)-CMA-ES for bootstrapping the search. We apply our learning framework to the prioritized whole-body torque controller of iCub, to optimize the robot’s movement for standing up from a chair.

Publications: [29]

Humanoid Robot Fall Control

Participant : Karim Bouyarmane.

Falling is a major skill to be mastered by an autonomous humanoid robot, since no matter what balance controller we use, a humanoid robot will end up falling in certain circumstances. We proposed new approaches to control humanoid robots in general fall configurations and in general cluttered environments. From the instant a fall is detected, a pre-impact phase is triggered in which a real-time configuration adaptation routine makes the robot quickly analyze the surrounding environment, choose the best impact points on the environment, and adapt its configuration accordingly to meet the desired impact points (all calculations are performed within the 0.7 s to 1 s that the fall lasts). Then, right after impact, a real-time motor PD gain adaptation controller sets the gains so as to comply actively with the impact while minimizing the peak torque at impact. Finally, a model-predictive approach combined with a novel formulation of admissible force polytopes, accounting for both torque limits and Coulomb friction limits, ensures that the robot safely comes to a steady resting state at the end of the fall.

Publications: [41], [34], [33]

Stability Proof of Weighted Multi-Task Humanoid QP Controller

Participant : Karim Bouyarmane.

We proved that weighted multi-task controllers are locally exponentially stable under appropriate conditions on the task gain matrices. We also derived a number of stability properties of the underlying QP optimization problem.

Publications: [12]

Theoretical Study of Commonalities between Locomotion and Manipulation in Humanoid-like Locomotion-and-manipulation Integration System

Participant : Karim Bouyarmane.

We published our theoretical study on common ground formulations of locomotion and manipulation, and thereby their extension to integrated locomotion-and-manipulation systems, by analytically deriving their planning and control solutions in low-dimensional proof-of-concept examples based on nonlinear control and differential geometry tools.

Publications: [11]

Embodied Evolutionary Robotics

Online Distributed Learning for a Swarm of Robots

Participants : Iñaki Fernández Pérez, Amine Boumaza, François Charpillet.

We study how a swarm of robots adapts over time to solve a collaborative task using a distributed Embodied Evolutionary approach, where each robot runs an evolutionary algorithm and locally exchanges genomes and fitness values. In particular, we study a collaborative foraging task, where the robots are rewarded for collecting food items that are too heavy to be collected individually and need at least two robots to be collected. Furthermore, to promote collaboration, agents must agree on a signal in order to collect the items. Our experiments show that the distributed algorithm is able to evolve swarm behavior to collect items cooperatively. The experiments also reveal that effective cooperation evolves mostly thanks to the ability of robots to jointly reach food items, while learning to display the right color that matches the item is done suboptimally. However, a closer analysis shows that, without a mechanism to avoid neglecting any kind of item, robots collect all of them, which means that there is some degree of learning to choose the right value for the color effector depending on the situation.

This work was carried out as part of the PhD thesis of Iñaki Fernández Pérez under the supervision of François Charpillet and Amine Boumaza. The thesis was defended on 19 December 2017.

Publications: [25]

Phylogeny of Embodied Evolutionary Robotics

Participant : Amine Boumaza.

We explore the idea of analyzing Embodied Evolutionary Robotics from the perspective of genes and their dynamics using phylogenetic trees. We illustrate a general approach on a simple question regarding the dynamics of the fittest and most copied genes, using tools from spectral graph theory or computational phylogenetics, and argue that such an approach may give interesting insights into the behavior of these algorithms. This idea seems promising and further investigations are underway, especially on the links with coalescence theory.

Publications: [22]
