Section: New Software and Platforms

Explauto: Autonomous Exploration and Learning Benchmarking

An autonomous exploration library

 

Scientific Description

An important challenge in Developmental Robotics is how robots can efficiently learn sensorimotor mappings by experience, i.e. the mappings between the motor actions they make and the sensory effects they produce. This can be a robot learning how arm movements make physical objects move, or how movements of a virtual vocal tract modulate vocalization sounds. The way the robot collects its own sensorimotor experience has a strong impact on learning efficiency, because for most robotic systems the involved spaces are high dimensional, the mapping between them is non-linear and redundant, and the time allowed for learning is limited. If robots explore the world in an unorganized manner, e.g. randomly, learning algorithms will often be ineffective because only very sparse data points will be collected. Data are precious due to the high dimensionality and the limited time, and they are not all equally useful due to the non-linearity and redundancy. This is why learning has to be guided using efficient exploration strategies that allow the robot to actively drive its own interaction with the environment in order to gather maximally informative data to feed the sensorimotor model.

In recent years, work in developmental learning has explored various families of algorithmic principles that allow learning and exploration to be guided efficiently.

Explauto is a framework developed to study, model and simulate curiosity-driven learning and exploration in virtual and robotic agents. Explauto's scientific roots trace back to the Intelligent Adaptive Curiosity algorithmic architecture [120], which was extended into a more general family of autonomous exploration architectures in [73] and recently expressed as a compact and unified formalism [114]. The library is detailed in [115].

In Explauto, the strategies used to explore sensorimotor models are called interest models. They implement the active exploration process, in which sensorimotor experiments are chosen to improve the forward or inverse predictions of the sensorimotor model. The simplest strategy is to randomly draw goals in the motor or sensory space. More efficient strategies are based on the active choice of learning experiments that maximize learning progress, e.g. the improvement of predictions or of competences to reach goals [120]. This automatically drives the system to first explore and learn easy skills, and then to explore skills of progressively increasing complexity. Both the random and the learning-progress models can act either on the motor or on the sensory space, resulting in motor babbling or goal babbling strategies.

  • Motor babbling consists in sampling commands in the motor space according to a given strategy (random or learning progress), predicting the expected sensory consequence, executing the command through the environment and observing the actual sensory effect. Both sensorimotor and interest models are finally updated according to this experience.

  • Goal babbling consists in sampling goals in the sensory effect space and using the current state of the sensorimotor model to infer a motor action expected to reach each goal (inverse prediction). The robot/agent then executes the command through the environment and observes the actual sensory effect. Both sensorimotor and interest models are finally updated according to this experience.

It has been shown that this second strategy allows the reachable sensory space to be covered progressively and much more uniformly than with a motor babbling strategy, in which the agent samples directly in the motor space [73].
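As an illustration, the following minimal sketch shows a goal babbling loop written against Explauto's high-level API. The configuration names used here ('simple_arm', 'mid_dimensional', 'nearest_neighbor', 'random') are assumed to be among the library's built-in configurations; sampling over environment.conf.m_dims instead of environment.conf.s_dims would turn the same loop into motor babbling.

    # Sketch of a goal-babbling exploration loop with Explauto's high-level API.
    import numpy as np
    from explauto import Environment, SensorimotorModel, InterestModel

    # Simulated multi-DoF arm acting on a 2D plane
    environment = Environment.from_configuration('simple_arm', 'mid_dimensional')

    # Iteratively trained sensorimotor model (nearest-neighbor look-up)
    sm_model = SensorimotorModel.from_configuration(environment.conf,
                                                    'nearest_neighbor', 'default')

    # Interest model sampling goals in the sensory space S (goal babbling)
    im_model = InterestModel.from_configuration(environment.conf,
                                                environment.conf.s_dims, 'random')

    for _ in range(1000):
        s_goal = im_model.sample()                 # choose a sensory goal
        m = sm_model.inverse_prediction(s_goal)    # infer a motor command for it
        s = environment.compute_sensori_effect(m)  # execute and observe the effect
        sm_model.update(m, s)                      # update the sensorimotor model
        im_model.update(np.hstack((m, s_goal)), np.hstack((m, s)))  # update interest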

Figure 6. Complex sensorimotor mappings involve high-dimensional sensorimotor spaces. For the sake of visualization, the motor space M and the sensory space S are only 2D each in this example. The relationship between M and S is non-linear, dividing the sensorimotor space into regions of unequal stability: small regions of S can be reached very precisely from large regions of M, while large regions of S can be very sensitive to variations in M. This non-linearity can also imply redundancy, where the same sensory effect can be attained using distinct regions of M.
IMG/explStratIllustr2D.png

Functional Description

This library provides a high-level API for the easy definition of:

  • Virtual and robotics setups (Environment level),

  • Sensorimotor learning iterative models (Sensorimotor level),

  • Active choice of sensorimotor experiments (Interest level).

The library comes with several built-in environments. Two of them correspond to simulated environments: a multi-DoF arm acting on a 2D plane, and an under-actuated torque-controlled pendulum. The third one allows controlling real robots based on Dynamixel actuators using the Pypot library.
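A built-in environment can be instantiated and driven roughly as in the sketch below, assuming the 'simple_arm' and 'mid_dimensional' configuration names from the built-in set:

    # Minimal sketch: instantiate a built-in simulated environment and execute
    # one random motor command.
    from explauto import Environment

    environment = Environment.from_configuration('simple_arm', 'mid_dimensional')
    m = environment.random_motors(n=1)[0]        # one random motor command
    s = environment.compute_sensori_effect(m)    # resulting sensory effect (hand position)
    print(m, s)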

Learning sensorimotor mappings involves machine learning algorithms: typically, regression algorithms to learn forward models, from motor controllers to sensory effects, and optimization algorithms to learn inverse models, from sensory effects (or goals) to the motor programs allowing them to be reached. We call these sensorimotor learning algorithms sensorimotor models. The library comes with several built-in sensorimotor models: simple nearest-neighbor look-up, non-parametric models combining classical regression and optimization algorithms, online mixtures of Gaussians, and discrete Lidstone distributions. Explauto's sensorimotor models are online learning algorithms, i.e. they are trained iteratively during the interaction of the robot with the environment in which it evolves.
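For instance, a built-in sensorimotor model can be trained online and queried for forward and inverse predictions roughly as follows; this is a sketch assuming the 'nearest_neighbor' configuration name and the forward_prediction/inverse_prediction accessors of the SensorimotorModel class:

    # Sketch: online training of a sensorimotor model and forward/inverse queries.
    from explauto import Environment, SensorimotorModel

    environment = Environment.from_configuration('simple_arm', 'mid_dimensional')
    sm_model = SensorimotorModel.from_configuration(environment.conf,
                                                    'nearest_neighbor', 'default')

    # Bootstrap the model with a few random motor experiments
    for m in environment.random_motors(n=100):
        s = environment.compute_sensori_effect(m)
        sm_model.update(m, s)                     # online (iterative) update

    s_pred = sm_model.forward_prediction(m)       # predicted effect of a command
    m_inv = sm_model.inverse_prediction(s_pred)   # command proposed to reach that effect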

Explauto also provides a unified interface for defining exploration strategies, through the InterestModel class. The library comes with two built-in interest models: random sampling, and sampling that maximizes the learning progress in forward or inverse predictions.
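The sketch below shows how the two built-in interest models might be instantiated over the sensory space, assuming 'random' and 'discretized_progress' are their built-in configuration names:

    # Sketch: instantiating interest models over the sensory space S.
    from explauto import Environment, InterestModel

    environment = Environment.from_configuration('simple_arm', 'mid_dimensional')

    im_random = InterestModel.from_configuration(environment.conf,
                                                 environment.conf.s_dims, 'random')
    im_progress = InterestModel.from_configuration(environment.conf,
                                                   environment.conf.s_dims,
                                                   'discretized_progress')

    s_goal = im_progress.sample()   # goal drawn where learning progress is high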

This library has been used in many experiments including:

  • the control of a 2D simulated arm,

  • the exploration of the inverse kinematics of a Poppy humanoid (both on the real robot and on the simulated version),

  • the acoustic model of a vocal tract.

Explauto is cross-platform and has been tested on Linux, Windows and Mac OS. It has been released under the GPLv3 license.