Section: New Results

Robotic and Computational Models of Human Development

Computational models of information-seeking, curiosity and attention

Participants : Pierre-Yves Oudeyer, Manuel Lopes.

An associated team, called Neurocuriosity, was created between Flowers and the Cognitive Neuroscience lab of Jacqueline Gottlieb at Columbia University, New York. The goal of this associated team is to investigate mechanisms of spontaneous exploration and learning in humans by setting up experiments designed to confirm or falsify predictions made by computational models previously developed by the team. This constitutes a crucial collaboration between developmental robotics and cognitive neuroscience. This joint work has already led to a major publication on curiosity and information seeking in the journal Trends in Cognitive Sciences (impact factor: 16.5) [27].

Abstract: Intelligent animals devote much time and energy to exploring and obtaining information, but the underlying mechanisms are poorly understood. We review recent developments on this topic that have emerged from the traditionally separate fields of machine learning, eye movements in natural behavior, and studies of curiosity in psychology and neuroscience. These studies show that exploration may be guided by a family of mechanisms that range from automatic biases toward novelty or surprise to systematic searches for learning progress and information gain in curiosity-driven behavior. In addition, eye movements reflect visual information searching in multiple conditions and are amenable for cellular-level investigations. This suggests that the oculomotor system is an excellent model system for understanding information-sampling mechanisms.

Formalizing Imitation Learning

Participants : Thomas Cederborg, Pierre-Yves Oudeyer.

An original formalization of imitation learning was elaborated. Previous attempts to systematize imitation learning have been limited to categorizing different types of demonstrator goals (for example, defining success in terms of the sequential joint positions of a dance, or in terms of environmental end states), and/or have been limited to a smaller subset of imitation (such as learning from tele-operated demonstrations). The proposed formalism attempts to describe a large number of different types of learning algorithms using the same notation. Any algorithm that modifies a policy based on observations of a human who wants an agent to do something or to act in some way is treated as an interpretation hypothesis of this behavior. One example would be an update algorithm that updates a policy partially based on the hypothesis that the demonstrator succeeds at demonstrations with probability 0.8; another would be an update algorithm that assumes that a scalar value is an accurate evaluation of an action compared to the latest seven actions. The formalism aims to give a principled way of updating these hypotheses, either by rejecting some members of a set of hypotheses regarding the same type of behavior, or by rejecting some of the parameter values of a single hypothesis. If the learning algorithm is static, this simply corresponds to a hypothesis that is not updated based on observations. This work is presented in a journal article [26].
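As an illustration of what updating interpretation hypotheses could look like, here is a minimal sketch in Python. It is not the notation or the method of the article: it assumes, for the sake of the example, that each hypothesis is simply a candidate value for the demonstrator's success probability (0.8 being the value mentioned above), that demonstrations are labeled as successes or failures, and that beliefs are revised with a plain Bayesian update. The function names `update_hypotheses` and `reject_unlikely`, the candidate values 0.5 and 0.95, and the rejection threshold are all hypothetical.

```python
def update_hypotheses(prior, outcomes):
    """Bayesian update of beliefs over candidate success probabilities.

    prior: dict mapping a hypothesised success probability to its weight.
    outcomes: iterable of booleans, True when a demonstration succeeded.
    """
    posterior = dict(prior)
    for success in outcomes:
        # Multiply each weight by the likelihood of the observed outcome.
        for p in posterior:
            posterior[p] *= p if success else (1.0 - p)
        # Renormalise so that the weights sum to one.
        total = sum(posterior.values())
        posterior = {p: w / total for p, w in posterior.items()}
    return posterior


def reject_unlikely(posterior, threshold=0.05):
    """Drop hypotheses whose weight fell below `threshold` (an assumed rule)."""
    kept = {p: w for p, w in posterior.items() if w >= threshold}
    if not kept:  # never reject every hypothesis
        return dict(posterior)
    total = sum(kept.values())
    return {p: w / total for p, w in kept.items()}


# Three competing hypotheses about the demonstrator (illustrative values).
prior = {0.5: 1 / 3, 0.8: 1 / 3, 0.95: 1 / 3}
# Ten observed demonstrations: eight successes and two failures.
outcomes = [True, True, False, True, True, True, False, True, True, True]
posterior = update_hypotheses(prior, outcomes)
surviving = reject_unlikely(posterior, threshold=0.15)
```

With these observations, the hypothesis of a 0.8 success rate receives the largest posterior weight. A static learning algorithm corresponds to never calling `update_hypotheses`, so that the prior is kept unchanged.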

Self-Organization of Early Vocal Development in Infants and Machines: The Role of Intrinsic Motivation

Participants : Clément Moulin-Frier, Sao Mai Nguyen, Pierre-Yves Oudeyer.

We bridge the gap between two issues in infant development: vocal development and intrinsic motivation. We propose and experimentally test the hypothesis that general mechanisms of intrinsically motivated spontaneous exploration, also called curiosity-driven learning, can self-organize developmental stages during early vocal learning and explain several aspects observed in infants (Figure 20). We introduce a computational model of intrinsically motivated vocal exploration, which allows the learner to autonomously structure its own vocal experiments, and thus its own learning schedule, through a drive to maximize competence progress. This model relies on a physical model of the vocal tract, the auditory system and the agent's motor control, as well as vocalizations of social peers. We present computational experiments that show how such a mechanism can explain the adaptive transition from vocal self-exploration with little influence from the speech environment, to a later stage where vocal exploration becomes influenced by vocalizations of peers (Figure 21). Within the initial self-exploration phase, we show that a sequence of vocal production stages self-organizes, and shares properties with data from infant developmental psychology: the vocal learner first discovers how to control phonation, then focuses on vocal variations of unarticulated sounds, and finally automatically discovers and focuses on babbling with articulated proto-syllables (Figure 22). As the vocal learner becomes more proficient at producing complex sounds, imitating vocalizations of peers starts to provide high learning progress, which explains an automatic shift from self-exploration to vocal imitation.

This work was recently accepted for publication in the journal Frontiers in Psychology, Cognitive Science [30].
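The drive to maximize competence progress described in this section can be sketched as follows. This is only a toy illustration under strong assumptions, not the published model: competence is assumed to be a number between 0 and 1 recorded after each trial, progress is estimated as the difference between the mean competence over two consecutive windows of trials, and the learner greedily picks the activity with the highest progress. The function names, the window size and the activity names (chosen to echo the stages listed above) are hypothetical.

```python
def competence_progress(history, window=3):
    """Mean competence over the last `window` trials minus the mean over
    the `window` trials before them; 0.0 when there is too little data."""
    if len(history) < 2 * window:
        return 0.0
    recent = history[-window:]
    older = history[-2 * window:-window]
    return sum(recent) / window - sum(older) / window


def choose_activity(histories, window=3):
    """Pick the activity whose competence is currently improving fastest."""
    return max(histories,
               key=lambda name: competence_progress(histories[name], window))


# Illustrative competence histories for three kinds of vocal activity.
histories = {
    "phonation": [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],        # already mastered
    "unarticulated": [0.2, 0.25, 0.3, 0.4, 0.5, 0.6],   # improving
    "articulated": [0.05, 0.05, 0.05, 0.05, 0.05, 0.05] # still too hard
}
selected = choose_activity(histories)
```

In this example the learner selects the activity that is neither mastered nor out of reach. As competence on that activity saturates, its progress drops and another activity takes over, which is the kind of mechanism that can produce a sequence of stages.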

Figure 20. Rapid view of the first year of infant vocal development.
Figure 21. Our model displays an adaptive transition from vocal self-exploration with little influence from the speech environment, to a later stage where vocal exploration becomes influenced by vocalizations of peers.
Figure 22. Within the self-exploration phase, our model first discovers how to control phonation, then focuses on vocal variations of unarticulated sounds, and finally automatically discovers and focuses on babbling with articulated proto-syllables.

Emergent Proximo-Distal Motor Development through Adaptive Exploration, applied to Reaching and Vocal Learning

Participants : Freek Stulp, Pierre-Yves Oudeyer, Jules Brochard, Clément Moulin-Frier.

Life-long robot learning in the high-dimensional real world requires guided and structured exploration mechanisms. In this developmental context, we have investigated the use of the PI2-CMAES episodic reinforcement learning algorithm, which is able to learn high-dimensional motor tasks through adaptive control of exploration. By studying PI2-CMAES in a reaching task on a simulated arm, we observe two developmental properties. First, we show how PI2-CMAES autonomously and continuously tunes the global exploration/exploitation trade-off, allowing it to re-adapt to changing tasks. Second, we show how PI2-CMAES spontaneously self-organizes a maturational structure whilst exploring the degrees-of-freedom (DOFs) of the motor space. In particular, it automatically demonstrates the so-called proximo-distal maturation observed in humans: after first freezing distal DOFs while exploring predominantly the most proximal DOF, it progressively frees exploration in DOFs along the proximo-distal body axis. These emergent properties suggest the use of PI2-CMAES as a general tool for studying reinforcement learning of skills in life-long developmental learning contexts. This work was published in the Paladyn Journal of Behavioral Robotics [36].
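To give an intuition of what adaptive control of exploration means here, the following Python sketch shows a heavily simplified reward-weighted update in the spirit of PI2-CMAES. It is an assumption-laden toy, not the algorithm used in the study: it optimizes a plain parameter vector rather than a movement policy, keeps one variance per dimension instead of a full covariance matrix, and minimizes a hypothetical quadratic cost. The function names, the eliteness constant `h` and the variance floor `min_var` are illustrative choices.

```python
import math
import random


def pi2_cma_update(mean, samples, costs, h=10.0, min_var=1e-6):
    """One reward-weighted update of the mean and per-dimension variance."""
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span < 1e-12:
        # All samples are equally good: use uniform weights.
        weights = [1.0 / len(costs)] * len(costs)
    else:
        # Low-cost samples receive exponentially larger weights.
        raw = [math.exp(-h * (c - lo) / span) for c in costs]
        total = sum(raw)
        weights = [r / total for r in raw]
    dims = range(len(mean))
    new_mean = [sum(w * s[d] for w, s in zip(weights, samples)) for d in dims]
    # Exploration magnitude is re-estimated per dimension from the weighted
    # deviations of the samples around the previous mean.
    new_var = [max(min_var,
                   sum(w * (s[d] - mean[d]) ** 2
                       for w, s in zip(weights, samples)))
               for d in dims]
    return new_mean, new_var


def cost(theta, target=(1.0, -2.0)):
    """Hypothetical task cost: squared distance to a target parameter vector."""
    return sum((t - g) ** 2 for t, g in zip(theta, target))


def optimize(iterations=50, n_samples=20, seed=0):
    """Sample around the mean, evaluate, update; repeat."""
    rng = random.Random(seed)
    mean, var = [0.0, 0.0], [1.0, 1.0]
    for _ in range(iterations):
        samples = [[rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
                   for _ in range(n_samples)]
        costs = [cost(s) for s in samples]
        mean, var = pi2_cma_update(mean, samples, costs)
    return mean, var


final_mean, final_var = optimize()
```

Because the variance is updated separately for each dimension, the amount of exploration can differ across parameters and change over time. In the reaching experiments, it is this kind of per-DOF exploration magnitude that can be read as freezing or freeing a degree of freedom.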

This PI2-CMAES-based model of emergent developmental freezing and unfreezing of degrees of freedom was then applied to infant vocal development. To this end, we used an articulatory synthesizer, which is a computer model of the human vocal tract and the ear. While testing different possibilities, the algorithm eventually creates learning structures that are more efficient than random motor babbling. Using the algorithm with a vocal synthesizer, we show that it can reproduce a characteristic of infant babbling: the predominance of the jaw over the other articulators, as observed in canonical babbling.

To our knowledge, this is the first study of emergent maturation in speech. Without presupposing any biological or social constraint, we give a new explanation of the jaw predominance in babbling, based on freezing and freeing the degrees of freedom in an adaptive maturation scheme that improves learning. This provides an original hypothesis regarding the emergence of canonical babbling in infant vocal development.

This latter work was carried out during the internship of Jules Brochard in 2013, and a journal article is currently being written.

COSMO (“Communicating about Objects using Sensory-Motor Operations”): a Bayesian modeling framework for studying speech communication and the emergence of phonological systems

Participants : Clément Moulin-Frier, Jean-Luc Schwartz, Julien Diard, Pierre Bessière.

This work began with the PhD thesis of Clément Moulin-Frier at GIPSA-Lab, Grenoble, France, supervised by Jean-Luc Schwartz (GIPSA-Lab, CNRS), Julien Diard (LPNC, CNRS) and Pierre Bessière (Collège de France, CNRS). A few papers were finalized during his post-doc at FLOWERS in 2012. Firstly, an international journal paper based on the PhD thesis work of Raphael Laurent (GIPSA-Lab), extending Moulin-Frier's model, was published [108], as well as a commentary in Behavioral and Brain Sciences [97]. Both papers provide computational arguments, based on a sensory-motor cognitive model, that feed the age-old debate between motor and auditory theories of speech perception. Secondly, in another journal paper currently under submission, we attempt to derive some properties of phonological systems (the sound systems of human languages) from the mere properties of speech communication. We introduce a model of the cognitive architecture of a communicating agent, called COSMO (for “Communicating about Objects using Sensory-Motor Operations”), which expresses in a probabilistic way the main theoretical trends found in the speech production and perception literature. This allows a computational comparison of these theoretical trends, helping to identify the conditions that favor the emergence of linguistic codes. We present realistic simulations of phonological system emergence showing that COSMO is able to predict the main regularities in vowel, stop consonant and syllable systems in human languages.

This work is currently under consideration as a target article for a special issue in an international journal. Pierre-Yves Oudeyer joined this process as a member of the editing committee.

Recognizing speech in a novel accent: the Motor Theory of Speech Perception reframed

Participants : Clément Moulin-Frier, Michael Arbib.

Clément Moulin-Frier began this work with Michael Arbib during a 6-month visit in 2009 to the USC Brain Project, University of Southern California, Los Angeles, USA, in the course of his PhD thesis at GIPSA-Lab, Grenoble. He continued writing the corresponding journal article during his post-doc in the Flowers team in 2012-2013. The paper, recently published in Biological Cybernetics [29], offers a novel computational model of foreign-accented speech adaptation, together with a thorough analysis of its implications with respect to the motor theory of speech perception.