Section: New Results

HRI and Robot Language Teaching

Intuitive and Robust Physical Human-Robot Interaction with Acroban

Participants : Olivier Ly, Pierre-Yves Oudeyer, Pierre Rouanet, Matthieu Lapeyre, Jérome Béchu, Paul Fudal, Haylee Fogg.

We have shown experimentally how the humanoid robot Acroban allows robust, natural and intuitive whole-body physical interaction with both adults and children. These physical human-robot interactions are made possible by the combination of several properties of Acroban: 1) it is whole-body compliant thanks to variable impedance control and to the use of elastics and springs; 2) it has a bio-inspired vertebral column allowing more flexibility in postural and equilibrium control; 3) it is lightweight; 4) it has simple low-level controllers that leverage the first three properties. Moreover, the capabilities for physical human-robot interaction that we show do not use a model of the human, and in this sense are “model-free”: 1) the capability of the robot to keep its equilibrium while being manipulated or pushed by humans results from the intrinsic capability of the whole body to absorb unpredicted external perturbations; 2) the capability of leading Acroban by the hand is an emergent human-robot interface made possible by the self-organizing properties of the body and its low-level controllers; it was observed a posteriori, only after the robot was designed, and without any initial plan to make it possible. Finally, an original feature of Acroban is that it is made of relatively low-cost components, whose lack of precision is counterbalanced by the robustness due to global geometry and compliance. These results were presented in [28]. A dedicated web page with videos is available at http://flowers.inria.fr/acroban.php.
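
As a rough illustration of why compliance helps absorb pushes, the following sketch simulates a single joint driven by a virtual spring-damper, a common textbook form of impedance control. It is not Acroban's controller: the one-joint model, the function names and all numerical values (inertia, gains, push duration) are assumptions chosen only for the example.

```python
def impedance_torque(q, qd, q_des, stiffness, damping):
    """Virtual spring-damper pulling the joint angle q toward q_des."""
    return -stiffness * (q - q_des) - damping * qd


def simulate_push(stiffness, damping, push_torque=1.0, inertia=0.05,
                  dt=0.001, steps=5000):
    """Simulate a 1-DOF joint that is pushed for 0.2 s, then released.

    Returns (peak deflection, final angle). All parameters are
    hypothetical values picked for illustration.
    """
    q, qd = 0.0, 0.0
    peak = 0.0
    for i in range(steps):
        external = push_torque if i < 200 else 0.0  # push during first 0.2 s
        tau = impedance_torque(q, qd, 0.0, stiffness, damping) + external
        qdd = tau / inertia
        qd += qdd * dt  # semi-implicit Euler: update velocity first,
        q += qd * dt    # then position with the new velocity
        peak = max(peak, abs(q))
    return peak, q
```

Calling `simulate_push` with a low and a high `stiffness` shows the trade-off: the softer joint yields more under the same push, and both return to the rest posture once the push stops.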

A Real World User Study of Different Interfaces for Teaching New Visually Grounded Words to a Robot

Participants : Pierre Rouanet, Pierre-Yves Oudeyer, Fabien Danieau, David Filliat.

We have continued to develop and experiment with an integrated system, based on a combination of advanced Human-Robot Interaction, visual perception and machine learning methods, that allows non-expert users to intuitively and robustly teach new visually grounded words to robots. This system is based on the state-of-the-art bag-of-words technique but focuses on the different mediator-based interfaces that we can propose to users. Indeed, we argue that by focusing on interaction we can help users collect good learning examples and thus improve the performance of the overall learning system. We compared four different interfaces and their impact on the overall system through a real-world study in which we asked participants to show a robot five different objects and teach it their names. Three interfaces were based on mediator objects such as an iPhone, a Wiimote and a laser pointer, and provided users with different kinds of feedback about what the robot is perceiving. The fourth interface was gesture-based, with a Wizard-of-Oz recognition system, and was included in order to compare our mediator interfaces with a more natural interaction. We showed that the interface may indeed strongly impact the quality of the learning examples collected by users, especially for small objects. More precisely, we showed that interfaces such as the iPhone interface not only give feedback about what the robot is perceiving but also drive users to pay attention to the learning examples they are collecting. Thus, this interface allows non-expert users to intuitively and easily collect learning examples almost as good as those collected by expert users trained for this task and aware of the different visual perception and machine learning issues. Finally, we showed that the mediator-based interfaces were judged easier to use than the a priori more natural gesture-based interface. This work was presented in [29].
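
The bag-of-words idea mentioned above can be sketched in a few lines: local image descriptors are quantized against a codebook of "visual words", each learning example becomes a normalized histogram of word counts, and a new view is named after the closest stored histogram. This is a minimal sketch, not the system used in the study: the codebook is assumed to be given, descriptors are plain tuples of floats, and the names `nearest_word`, `bow_histogram` and `recognize` are hypothetical.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to a local descriptor."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = sum(counts.values()) or 1  # avoid dividing by zero
    return [counts[i] / total for i in range(len(codebook))]


def recognize(descriptors, codebook, labelled_histograms):
    """Name of the taught object whose histogram is closest to the query.

    labelled_histograms maps an object name to the histogram built
    from the learning example collected by the user.
    """
    query = bow_histogram(descriptors, codebook)
    return min(labelled_histograms,
               key=lambda name: math.dist(query, labelled_histograms[name]))
```

In such a pipeline, a learning example that frames the object poorly mostly contributes background descriptors to the histogram, which is one way to see why the quality of the examples collected through the interface matters.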

Language Acquisition as a Particular Case of Context-Dependent Motor Skills Acquisition

Participants : Thomas Cederborg, Pierre-Yves Oudeyer.

Imitation learning, or robot programming by demonstration, has made important advances in recent years. We have proposed to extend the contexts usually investigated to also include linguistic expressions. We have proposed a modification to existing algorithms within the imitation learning framework so that they can handle learning from the demonstration of several unlabelled tasks (or motor primitives) without having to inform the imitator of which task is being demonstrated or of how many tasks there are, which then directly allows for relatively complex language learning. A mechanism for detecting whether or not linguistic/speech input is relevant to the task has also been proposed. With these additions it becomes possible to build an imitator that bridges the gap between imitation learning and language learning by learning linguistic expressions using methods from the imitation learning community. In this sense the imitator learns a word by knowing that a certain speech pattern present in the context means that a specific task is to be executed. The imitator is however not assumed to know that speech is relevant and has to figure this out on its own by looking at the demonstrations. To demonstrate this ability to find the relevance of speech, non-linguistic tasks are learnt along with linguistic tasks, and the imitator has to figure out when speech is relevant (in some tasks speech should be completely ignored and in other tasks the entire policy is determined by speech). A simulated experiment demonstrates that an imitator can indeed find the number of tasks it has been shown, discover which demonstrations belong to which task, determine for which tasks speech is relevant, and successfully reproduce those tasks. This work is presented in a publication under review.
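
To make the notion of "speech relevance" concrete, here is a deliberately simplified sketch. It is not the mechanism proposed in this work: it assumes that demonstrations have already been grouped by a discrete non-speech context (here called `scene`), that speech and actions are discrete tokens, and it declares speech relevant in a scene when different speech inputs lead to different demonstrated actions. All names are hypothetical.

```python
from collections import defaultdict


def speech_relevant_contexts(demos):
    """Decide, per scene, whether speech changes the demonstrated action.

    demos is a list of (speech, scene, action) tuples.
    Returns a dict mapping each scene to True (speech matters)
    or False (speech can be ignored).
    """
    actions = defaultdict(lambda: defaultdict(set))
    for speech, scene, action in demos:
        actions[scene][speech].add(action)

    relevance = {}
    for scene, by_speech in actions.items():
        # Speech is judged relevant when two speech inputs are followed
        # by different sets of demonstrated actions in the same scene.
        distinct = {frozenset(acts) for acts in by_speech.values()}
        relevance[scene] = len(distinct) > 1
    return relevance
```

With demonstrations where the word heard at a table selects which object is grasped, while the action at a door is always the same, the function marks speech as relevant for the table only.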

Robot Learning by Imitation of Internal Cognitive Operations in the Context of Language Acquisition

Participants : Thomas Cederborg, Pierre-Yves Oudeyer.

We have examined the problem of learning socio-linguistic skills through imitation when those skills involve both observable motor patterns and internal unobservable cognitive operations. This approach is framed in a research program investigating novel links between context-dependent motor learning by imitation and language acquisition. More precisely, the paper presents an algorithm for learning how to respond to communicative/linguistic actions of one human, called an interactant, by observing how another human, called a demonstrator, responds. The response of the demonstrator, which depends on the context, including the signs of the interactant, is assumed to be appropriate, and the robotic imitator uses these observations to build a general policy of how to respond to interactant actions. In this paper the communicative actions of the interactant are hand signs, and the learnt behavior consists of how to respond to the hand signs of a small and simple sign language, both in terms of adequately focusing attention on the right part of the scene, and in terms of responding physically. As a response to two continuous signs of the interactant, the demonstrator focuses on one out of three objects, and then performs a movement in relation to the object focused on. An algorithm is proposed based on a similarity metric between demonstrations, and a simulated experiment is presented where the unseen “focus on object” operation and the hand movements are successfully imitated, including in situations where there are no demonstrations. This work has been published in [21].
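
One simple way to use a similarity metric between demonstrations is a kernel-weighted vote: each stored demonstration votes for its response with a weight that decays with the distance between its context and the current one. The sketch below illustrates that general idea only; the Gaussian kernel, the `bandwidth` value, the representation of a context as a tuple of floats and the response labels are assumptions, not the metric or the algorithm of [21].

```python
import math
from collections import defaultdict


def similarity(a, b, bandwidth=0.5):
    """Gaussian similarity between two context vectors (assumed kernel)."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * bandwidth ** 2))


def respond(context, demonstrations, bandwidth=0.5):
    """Pick the response supported by the most similar demonstrations.

    demonstrations is a list of (context_vector, response) pairs, where
    the response could stand for an internal operation such as
    "focus on object A" as well as for a movement.
    """
    scores = defaultdict(float)
    for demo_context, response in demonstrations:
        scores[response] += similarity(context, demo_context, bandwidth)
    return max(scores, key=scores.get)
```

Because the vote is defined for any context vector, the imitator can still produce a response in a situation that was never demonstrated, by leaning on the nearest demonstrations.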

Learning Simultaneously New Tasks and Feedback Models in Socially Guided Robot Learning

Participants : Manuel Lopes, Thomas Cederborg, Pierre-Yves Oudeyer.

We have developed a system that allows a robot to learn simultaneously new tasks and feedback models from ambiguous feedback in the context of robot learning by imitation. We have considered an inverse reinforcement learner that receives feedback from a user with an unknown and noisy protocol. The system needs to estimate simultaneously what the task is and how the user is providing the feedback. We have further explored the problem of ambiguous protocols by considering that the words used by the teacher have an unknown relation with the action and meaning expected by the robot. This allows the system to start with a set of known symbols and learn the meaning of new ones. We have presented computational results showing that it is possible to learn the task under noisy and ambiguous feedback. Using an active learning approach, the system is able to reduce the length of the training period [24], [26].
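
The joint estimation problem can be illustrated with a small Bayesian sketch over discrete hypotheses, where each hypothesis pairs a candidate task with a candidate meaning for every unknown word. This is a toy stand-in, not the inverse reinforcement learner described above: tasks are reduced to a single target action, feedback words mean either "correct" or "wrong", the prior is uniform, and the `noise` level and all names are assumptions.

```python
from itertools import product

MEANINGS = ("correct", "wrong")


def joint_posterior(observations, tasks, known, unknown, noise=0.1):
    """Posterior over (task, meanings of unknown words).

    observations: list of (robot_action, teacher_word) pairs.
    tasks:        candidate target actions.
    known:        dict mapping already-understood words to a meaning.
    unknown:      list of words whose meaning must be learned.
    """
    posterior = {}
    for task in tasks:
        for assignment in product(MEANINGS, repeat=len(unknown)):
            mapping = dict(known)
            mapping.update(zip(unknown, assignment))
            likelihood = 1.0
            for action, word in observations:
                # Under this hypothesis, what should the teacher have meant?
                expected = "correct" if action == task else "wrong"
                likelihood *= (1 - noise) if mapping[word] == expected else noise
            posterior[(task, assignment)] = likelihood
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}
```

Without any known word, the hypothesis "task A, word means correct" and its mirror image "task B, word means wrong" explain the data equally well, which is why starting from a few known symbols is useful for learning the meaning of new ones.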