<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/>
    <title>Project-Team:MULTISPEECH</title>
    <link rel="stylesheet" href="../static/css/raweb.css" type="text/css"/>
    <meta name="description" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.title" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.subject" content=""/>
    <meta name="dc.publisher" content="INRIA"/>
    <meta name="dc.date" content="(SCHEME=ISO8601) 2019-01"/>
    <meta name="dc.type" content="Report"/>
    <meta name="dc.language" content="(SCHEME=ISO639-1) en"/>
    <meta name="projet" content="MULTISPEECH"/>
    <script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
      <!-- MathJax -->
    </script>
    <script type="text/javascript" src="../static/js/piwik.js">
      <!-- Piwik JS -->
    </script>
    <noscript>
      <p>
        <img src="https://piwik.inria.fr/matomo.php?idsite=49&amp;rec=1" style="border:0;" alt=""/>
      </p>
      <!-- Piwik Img -->
    </noscript>
  </head>
  <body>
    <div class="tdmdiv">
      <div class="logo">
        <a href="http://www.inria.fr">
          <img style="align:bottom; border:none" src="../static/img/icons/logo_INRIA-coul.jpg" alt="Inria"/>
        </a>
      </div>
      <div class="TdmEntry">
        <div class="tdmentete">
          <a href="uid0.html">Project-Team Multispeech</a>
        </div>
        <span>
          <a href="uid1.html">Team, Visitors, External Collaborators</a>
        </span>
      </div>
      <div class="tdmActPage">
        <a href="./uid3.html">Overall Objectives</a>
      </div>
      <div class="TdmEntry">Research Program<ul><li><a href="uid11.html">Beyond black-box supervised learning</a></li><li><a href="uid15.html">Speech production and perception</a></li><li><a href="uid20.html">Speech in its environment</a></li></ul></div>
      <div class="TdmEntry">Application Domains<ul><li><a href="uid25.html">Introduction</a></li><li><a href="uid26.html">Multimodal Computer Interactions</a></li><li><a href="uid27.html">Annotation and Processing of Spoken Documents and Audio Archives</a></li><li><a href="uid28.html">Aided Communication and Monitoring</a></li><li><a href="uid29.html">Computer Assisted Learning</a></li></ul></div>
      <div class="TdmEntry">
        <a href="./uid31.html">Highlights of the Year</a>
      </div>
      <div class="TdmEntry">New Software and Platforms<ul><li><a href="uid34.html">dnnsep</a></li><li><a href="uid37.html">KATS</a></li><li><a href="uid40.html">SOJA</a></li><li><a href="uid43.html">LORIA-PHON</a></li><li><a href="uid46.html">Dynalips-Player</a></li><li><a href="uid50.html">VisArtico</a></li><li><a href="uid56.html">Xarticulators</a></li><li><a href="uid59.html">DCASE 2019 baseline</a></li></ul></div>
      <div class="TdmEntry">New Results<ul><li><a href="uid65.html">Beyond black-box supervised learning</a></li><li><a href="uid69.html">Speech production and perception</a></li><li><a href="uid81.html">Speech in its environment</a></li></ul></div>
      <div class="TdmEntry">Bilateral Contracts and Grants with Industry<ul><li><a href="uid92.html">Bilateral Contracts with Industry</a></li><li><a href="uid109.html">Bilateral Grants with Industry</a></li></ul></div>
      <div class="TdmEntry">Partnerships and Cooperations<ul><li><a href="uid131.html">Regional Initiatives</a></li><li><a href="uid151.html">National Initiatives</a></li><li><a href="uid273.html">European Initiatives</a></li><li><a href="uid321.html">International Initiatives</a></li><li><a href="uid329.html">International Research Visitors</a></li></ul></div>
      <div class="TdmEntry">Dissemination<ul><li><a href="uid334.html">Promoting Scientific Activities</a></li><li><a href="uid430.html">Teaching - Supervision - Juries</a></li><li><a href="uid508.html">Popularization</a></li></ul></div>
      <div class="TdmEntry">
        <div>Bibliography</div>
      </div>
      <div class="TdmEntry">
        <ul>
          <li>
            <a id="tdmbibentyear" href="bibliography.html">Publications of the year</a>
          </li>
        </ul>
      </div>
    </div>
    <div id="main">
      <div class="mainentete">
        <div id="head_agauche">
          <small><a href="http://www.inria.fr">Inria</a> | <a href="../index.html">Raweb 2019</a> | <a href="http://www.inria.fr/en/teams/multispeech">Presentation of the Project-Team MULTISPEECH</a> | <a href="https://team.inria.fr/multispeech/">MULTISPEECH Web Site</a></small>
        </div>
        <div id="head_adroite">
          <table class="qrcode">
            <tr>
              <td>
                <a href="multispeech.xml">
                  <img style="align:bottom; border:none" alt="XML" src="../static/img/icons/xml_motif.png"/>
                </a>
              </td>
              <td>
                <a href="multispeech.pdf">
                  <img style="align:bottom; border:none" alt="PDF" src="IMG/qrcode-multispeech-pdf.png"/>
                </a>
              </td>
              <td>
                <a href="../multispeech/multispeech.epub">
                  <img style="align:bottom; border:none" alt="e-pub" src="IMG/qrcode-multispeech-epub.png"/>
                </a>
              </td>
            </tr>
            <tr>
              <td>XML</td>
              <td>PDF</td>
              <td>e-Pub</td>
            </tr>
          </table>
        </div>
      </div>
      <!--FIN du corps du module-->
      <br/>
      <div class="bottomNavigation">
        <div class="tail_aucentre">
          <a href="./uid1.html" accesskey="P"><img style="align:bottom; border:none" alt="previous" src="../static/img/icons/previous_motif.jpg"/> Previous | </a>
          <a href="./uid0.html" accesskey="U"><img style="align:bottom; border:none" alt="up" src="../static/img/icons/up_motif.jpg"/>  Home</a>
          <a href="./uid11.html" accesskey="N"> | Next <img style="align:bottom; border:none" alt="next" src="../static/img/icons/next_motif.jpg"/></a>
        </div>
        <br/>
      </div>
      <div id="textepage">
        <!--DEBUT2 du corps du module-->
        <h2>Section: Overall Objectives</h2>
        <h3 class="titre3">Overall Objectives</h3>
        <p>The goal of the project is to model speech in order to facilitate oral communication.
The name MULTISPEECH reflects the following aspects, which receive particular attention.</p>
        <ul>
          <li>
            <p class="notaparagraph"><a name="uid4"> </a><b>Multisource aspects</b> - dealing with speech signals originating from several sources, such as a speaker plus noise, or overlapping speech signals produced by multiple speakers; sounds captured by several microphones are also considered.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid5"> </a><b>Multilingual aspects</b> - dealing with speech in a multilingual context, for example in computer-assisted language learning, where the pronunciation of words in a foreign language (i.e., non-native speech) is strongly influenced by the mother tongue.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid6"> </a><b>Multimodal aspects</b> - considering the acoustic and visual modalities of speech signals simultaneously, in particular for the expressive synthesis of audio-visual speech.</p>
          </li>
        </ul>
        <p>Our objectives are structured in three research axes, which have evolved since the project proposal was finalized in 2014. Owing to the now ubiquitous use of deep learning, the distinction between ‘explicit modeling’ and ‘statistical modeling’ is no longer relevant, and the fundamental issues raised by deep learning have grown into a new research axis, ‘beyond black-box supervised learning’.
The three research axes are now the following.</p>
        <ul>
          <li>
            <p class="notaparagraph"><a name="uid7"> </a><b>Beyond black-box supervised learning</b>
This research axis focuses on fundamental, domain-agnostic challenges related to deep learning, such as the integration of domain knowledge, data efficiency, and privacy preservation. Its results naturally apply to the domains studied in the other two research axes.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid8"> </a><b>Speech production and perception</b>
This research axis covers the topics of the ‘Explicit modeling of speech production and perception’ axis of the project proposal, but now makes extensive use of deep learning approaches. It also includes topics around prosody that previously belonged to the ‘Uncertainty estimation and exploitation in speech processing’ axis of the project proposal.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid9"> </a><b>Speech in its environment</b>
The themes covered by this research axis mainly correspond to those of the ‘Statistical modeling of speech’ axis of the project proposal, plus the acoustic modeling topic that previously belonged to the ‘Uncertainty estimation and exploitation in speech processing’ axis.</p>
          </li>
        </ul>
        <p>A large part of the research is conducted on French and English speech data; German and Arabic are also considered, either in speech recognition experiments or in language learning. The machine learning based approaches can be adapted to other languages, depending on the availability of speech corpora.</p>
      </div>
      <!--FIN du corps du module-->
      <br/>
      <div class="bottomNavigation">
        <div class="tail_aucentre">
          <a href="./uid1.html" accesskey="P"><img style="align:bottom; border:none" alt="previous" src="../static/img/icons/previous_motif.jpg"/> Previous | </a>
          <a href="./uid0.html" accesskey="U"><img style="align:bottom; border:none" alt="up" src="../static/img/icons/up_motif.jpg"/>  Home</a>
          <a href="./uid11.html" accesskey="N"> | Next <img style="align:bottom; border:none" alt="next" src="../static/img/icons/next_motif.jpg"/></a>
        </div>
        <br/>
      </div>
    </div>
  </body>
</html>
