

Section: Partnerships and Cooperations

National Initiatives

ANR DYCI2

  • Project acronym: DYCI2 (http://repmus.ircam.fr/dyci2/)

  • Project title: Creative Dynamics of Improvised Interaction

  • Duration: March 2015 - February 2018

  • Coordinator: Ircam (Paris)

  • Other partners: Inria (Nancy), University of La Rochelle

  • Participants: Ken Déguernel, Nathan Libermann, Emmanuel Vincent

  • Abstract: The goal of this project was to design a music improvisation system able to listen to the other musicians, to improvise in their style, and to modify its improvisation according to their feedback in real time.

    MULTISPEECH was responsible for designing a system able to improvise on multiple musical dimensions (melody, harmony) across multiple time scales.

ANR ArtSpeech

  • Project acronym: ArtSpeech

  • Project title: Synthèse articulatoire phonétique (phonetic articulatory synthesis)

  • Duration: October 2015 - March 2019

  • Coordinator: Yves Laprie

  • Other partners: Gipsa-Lab (Grenoble), IADI (Nancy), LPP (Paris)

  • Participants: Ioannis Douros, Yves Laprie, Anastasiia Tsukanova

  • Abstract: The objective is to synthesize speech from text via numerical simulation of the human speech production process, i.e., its articulatory, aerodynamic and acoustic aspects. Corpus-based approaches have come to dominate text-to-speech synthesis. They exploit speech databases of very good acoustic quality that cover a large number of expressions and phonetic contexts, which is sufficient to produce intelligible speech. However, these approaches face almost insurmountable obstacles as soon as parameters intimately related to the physical process of speech production have to be modified. By contrast, an approach that rests on simulating the physical speech production process makes explicit use of source parameters, of the anatomy and geometry of the vocal tract, and of a temporal supervision strategy. It thus offers direct control over the nature of the synthetic speech.

    Static MRI acquisitions of vowels (images plus acoustic signal) were carried out this year, and their exploitation began with a study of the impact of the articulatory modeling and of the plane wave assumption. Approximately 1000 images were manually delineated and used to generate speech signals via articulatory copy synthesis.

ANR JCJC KAMoulox

  • Project acronym: KAMoulox

  • Project title: Kernel additive modelling for the unmixing of large audio archives

  • Duration: January 2016 - September 2019

  • Coordinator: Antoine Liutkus (Inria Zenith)

  • Participants: Mathieu Fontaine, Antoine Liutkus

  • Abstract: The objective is to develop the theoretical and applied tools required to embed audio denoising and separation tools in web-based audio archives. The target application is the processing of large audio archives, and more precisely the well-known “Archives du CNRS — Musée de l'homme”, which gathers about 50,000 recordings dating back to the early 1900s.
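    As a rough illustration of the kernel additive modelling idea named in the project title (not the project's actual implementation; the function name, kernel shapes and iteration count below are illustrative), each source can be modelled by median filtering its spectrogram estimate over a source-specific proximity kernel, then re-estimated from the mixture with soft masks:

```python
import numpy as np
from scipy.ndimage import median_filter

def kam_separate(mix_spec, kernels, n_iter=5, eps=1e-10):
    """Toy kernel additive modelling (KAM) sketch.

    mix_spec : (F, T) magnitude spectrogram of the mixture.
    kernels  : one (f, t) median-filter footprint per source, encoding
               that source's local regularity, e.g. a wide horizontal
               kernel for stationary harmonic content and a tall
               vertical one for broadband percussive content.
    Returns a list of estimated source magnitude spectrograms.
    """
    n_src = len(kernels)
    # Initialize all sources as equal shares of the mixture.
    sources = [mix_spec / n_src for _ in range(n_src)]
    for _ in range(n_iter):
        # Model step: median-filter each estimate over its own kernel.
        models = [median_filter(s, size=k) for s, k in zip(sources, kernels)]
        total = sum(models) + eps
        # Separation step: redistribute the mixture with soft masks.
        sources = [mix_spec * m / total for m in models]
    return sources
```

    For a harmonic/percussive split one would typically pass `kernels=[(1, 17), (17, 1)]`; by construction the estimates sum back to the mixture.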

PIA2 ISITE LUE

  • Project acronym: ISITE LUE

  • Project title: Lorraine Université d’Excellence

  • Duration: starting in 2016

  • Coordinator: Univ. Lorraine

  • Participants: Ioannis Douros, Yves Laprie

  • Abstract: The initiative aims to develop and strengthen the initial perimeter of excellence, within the scope of current social and economic challenges, so as to build an original model for a leading global engineering university, with a strong emphasis on technological research and education through research. To this end, LUE has been designed as an “engine” for the development of excellence, stimulating an original dialogue between fields of knowledge.

    MULTISPEECH is mainly concerned with challenge number 6, “Knowledge engineering”, i.e., engineering applied to the fields of knowledge and language, which constitute our intangible wealth and are a critical factor in the soundness of future choices. This project funds the PhD thesis of Ioannis Douros.

E-FRAN METAL

  • Project acronym: E-FRAN METAL

  • Project title: Modèles Et Traces au service de l’Apprentissage des Langues (models and traces in the service of language learning)

  • Duration: October 2016 - September 2020

  • Coordinator: Anne Boyer (LORIA)

  • Other partners: Interpsy, LISEC, ESPE de Lorraine, D@NTE (Univ. Versailles Saint Quentin), Sailendra SAS, ITOP Education, Rectorat.

  • Participants: Theo Biasutto-Lervat, Anne Bonneau, Vincent Colotte, Dominique Fohr, Denis Jouvet, Odile Mella, Slim Ouni, Anne-Laure Piat-Marchand, Elodie Gauthier, Thomas Girod

  • Abstract: METAL aims at improving the learning of languages (both written and oral components) through the development of new tools and the analysis of the digital traces associated with students' learning, in order to adapt to the needs and rhythm of each learner.

    MULTISPEECH is concerned with the oral language learning aspects.

ANR VOCADOM

  • Project acronym: VOCADOM (http://vocadom.imag.fr/)

  • Project title: Robust voice command adapted to the user and to the context for ambient assisted living

  • Duration: January 2017 - December 2020

  • Coordinator: CNRS - LIG (Grenoble)

  • Other partners: Inria (Nancy), Univ. Lyon 2 - GREPS, THEORIS (Paris)

  • Participants: Dominique Fohr, Md Sahidullah, Sunit Sivasankaran, Emmanuel Vincent

  • Abstract: The goal of this project is to design a robust voice control system for smart home applications.

    MULTISPEECH is responsible for wake-up word detection, overlapping speech separation, and speaker recognition.

ANR JCJC DiSCogs

  • Project acronym: DiSCogs

  • Project title: Distant speech communication with heterogeneous unconstrained microphone arrays

  • Duration: September 2018 – March 2022

  • Coordinator: Romain Serizel

  • Participants: Nicolas Furnon, Irina Illina, Romain Serizel, Emmanuel Vincent

  • Collaborators: Télécom ParisTech, 7sensing

  • Abstract: The objective is to solve fundamental sound processing issues in order to exploit the many microphone-equipped devices that populate our everyday life. The proposed solution is to recast the problem of synchronizing the devices at the signal level as a multi-view learning problem, applying deep learning methods to extract complementary information from the devices at hand.
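    A classical signal-level baseline for the synchronization problem mentioned above is cross-correlation lag estimation between two devices' recordings (a minimal sketch under idealized conditions, not the project's deep-learning approach; the function name and parameters are illustrative):

```python
import numpy as np

def estimate_lag(ref, sig, max_lag):
    """Estimate the integer-sample delay of `sig` relative to `ref`
    by locating the peak of their cross-correlation.

    A positive return value means `sig` lags behind `ref`.
    `max_lag` bounds the search to plausible delays.
    """
    # Full cross-correlation; index len(ref) - 1 corresponds to zero lag.
    corr = np.correlate(sig, ref, mode="full")
    zero = len(ref) - 1
    # Restrict the peak search to [-max_lag, +max_lag].
    window = corr[zero - max_lag : zero + max_lag + 1]
    return int(np.argmax(window)) - max_lag
```

    In practice this only recovers a constant integer offset; handling clock drift, reverberation and fractional delays is precisely what makes the ad-hoc microphone array setting hard.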