<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/>
    <title>Team:MULTISPEECH</title>
    <link rel="stylesheet" href="../static/css/raweb.css" type="text/css"/>
    <meta name="description" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.title" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.subject" content=""/>
    <meta name="dc.publisher" content="INRIA"/>
    <meta name="dc.date" content="(SCHEME=ISO8601) 2014-01"/>
    <meta name="dc.type" content="Report"/>
    <meta name="dc.language" content="(SCHEME=ISO639-1) en"/>
    <meta name="projet" content="MULTISPEECH"/>
    <!-- Piwik -->
    <script type="text/javascript" src="/rapportsactivite/piwik.js"></script>
    <noscript><p><img src="//piwik.inria.fr/piwik.php?idsite=49" style="border:0;" alt="" /></p></noscript>
    <!-- End Piwik Code -->
  </head>
  <body>
    <div class="tdmdiv">
      <div class="logo">
        <a href="http://www.inria.fr">
          <img style="align:bottom; border:none" src="../static/img/icons/logo_INRIA-coul.jpg" alt="Inria"/>
        </a>
      </div>
      <div class="TdmEntry">
        <div class="tdmentete">
          <a href="uid0.html">Team Multispeech</a>
        </div>
        <span>
          <a href="uid1.html">Members</a>
        </span>
      </div>
      <div class="tdmActPage">
        <a href="./uid3.html">Overall Objectives</a>
      </div>
      <div class="TdmEntry">Research Program<ul><li><a href="uid11.html">Introduction</a></li><li><a href="uid12.html">Explicit modeling of speech production and perception</a></li><li><a href="uid16.html">Statistical modeling of speech</a></li><li><a href="uid20.html">Uncertainty estimation and exploitation in speech processing</a></li></ul></div>
      <div class="TdmEntry">Application Domains<ul><li><a href="uid25.html">Introduction</a></li><li><a href="uid26.html">Computer assisted learning</a></li><li><a href="uid27.html">Aided communication and monitoring</a></li><li><a href="uid28.html">Annotation and processing of spoken documents</a></li><li><a href="uid29.html">Multimodal computer interactions</a></li></ul></div>
      <div class="TdmEntry">New Software and Platforms<ul><li><a href="uid31.html">Introduction</a></li><li><a href="uid32.html">Speech processing tools</a></li><li><a href="uid42.html">Speech visualization tools</a></li><li><a href="uid48.html">Data acquisition</a></li></ul></div>
      <div class="TdmEntry">New Results<ul><li><a href="uid53.html">Highlights of the Year</a></li><li><a href="uid54.html">Explicit modeling of speech production and perception</a></li><li><a href="uid63.html">Complex statistical modeling of speech</a></li><li><a href="uid81.html">Uncertainty estimation and exploitation in speech processing</a></li></ul></div>
      <div class="TdmEntry">Bilateral Contracts and Grants with Industry<ul><li><a href="uid89.html">Bilateral Contracts with Industry</a></li></ul></div>
      <div class="TdmEntry">Partnerships and Cooperations<ul><li><a href="uid102.html">National Initiatives</a></li><li><a href="uid149.html">European Initiatives</a></li><li><a href="uid158.html">International Initiatives</a></li><li><a href="uid162.html">International Research Visitors</a></li></ul></div>
      <div class="TdmEntry">Dissemination<ul><li><a href="uid180.html">Promoting Scientific Activities</a></li><li><a href="uid228.html">Teaching - Supervision - Juries</a></li><li><a href="uid300.html">Popularization</a></li></ul></div>
      <div class="TdmEntry">
        <div>Bibliography</div>
      </div>
      <div class="TdmEntry">
        <ul>
          <li>
            <a id="tdmbibentmajor" href="bibliography.html">Major publications</a>
          </li>
          <li>
            <a id="tdmbibentyear" href="bibliography.html#year">Publications of the year</a>
          </li>
          <li>
            <a id="tdmbibentfoot" href="bibliography.html#References">References in notes</a>
          </li>
        </ul>
      </div>
    </div>
    <div id="main">
      <div class="mainentete">
        <div id="head_agauche">
          <small><a href="http://www.inria.fr">Inria</a> | <a href="../index.html">Raweb 2014</a> | <a href="http://www.inria.fr/en/teams/multispeech">Presentation of the Team MULTISPEECH</a></small>
        </div>
        <div id="head_adroite">
          <table class="qrcode">
            <tr>
              <td>
                <a href="multispeech.xml">
                  <img style="align:bottom; border:none" alt="XML" src="../static/img/icons/xml_motif.png"/>
                </a>
              </td>
              <td>
                <a href="multispeech.pdf">
                  <img style="align:bottom; border:none" alt="PDF" src="IMG/qrcode-multispeech-pdf.png"/>
                </a>
              </td>
              <td>
                <a href="../multispeech/multispeech.epub">
                  <img style="align:bottom; border:none" alt="e-pub" src="IMG/qrcode-multispeech-epub.png"/>
                </a>
              </td>
            </tr>
            <tr>
              <td/>
              <td>PDF
</td>
              <td>e-Pub
</td>
            </tr>
          </table>
        </div>
      </div>
      <!--FIN du corps du module-->
      <br/>
      <div class="bottomNavigation">
        <div class="tail_aucentre">
          <a href="./uid1.html" accesskey="P"><img style="align:bottom; border:none" alt="previous" src="../static/img/icons/previous_motif.jpg"/> Previous | </a>
          <a href="./uid0.html" accesskey="U"><img style="align:bottom; border:none" alt="up" src="../static/img/icons/up_motif.jpg"/>  Home</a>
          <a href="./uid11.html" accesskey="N"> | Next <img style="align:bottom; border:none" alt="next" src="../static/img/icons/next_motif.jpg"/></a>
        </div>
        <br/>
      </div>
      <div id="textepage">
        <!--DEBUT2 du corps du module-->
        <h2>Section: Overall Objectives</h2>
        <h3 class="titre3">Overall Objectives</h3>
        <p>MULTISPEECH is a joint project between Inria, CNRS, and the University of Lorraine, hosted in the LORIA laboratory (UMR 7503).
The goal of the project is to model speech so as to facilitate oral communication.
The name MULTISPEECH reflects the three aspects that receive particular attention:</p>
        <ul>
          <li>
            <p class="notaparagraph"><a name="uid4"> </a><i><b>Multisource aspects</b></i> - dealing with speech signals originating from several sources, such as a speaker plus noise, or overlapping speech signals from multiple speakers; sounds captured by several microphones will also be considered.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid5"> </a><i><b>Multilingual aspects</b></i> - dealing with speech in a multilingual context, for example in computer assisted language learning, where the pronunciation of words in a foreign language (i.e., non-native speech) is strongly influenced by the mother tongue.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid6"> </a><i><b>Multimodal aspects</b></i> - considering simultaneously the various modalities of speech signals, acoustic and visual, in particular for the expressive synthesis of audio-visual speech.</p>
          </li>
        </ul>
        <p class="notaparagraph">The project is organized along the three following scientific challenges:</p>
        <ul>
          <li>
            <p class="notaparagraph"><a name="uid7"> </a><i><b>The explicit modeling of speech.</b></i> - Speech signals result from the movements of articulators. Good knowledge of the articulators' positions with respect to the sounds produced is essential to improve, on the one hand, articulatory speech synthesis and, on the other hand, the relevance of the diagnosis and associated feedback in computer assisted language learning. Production and perception processes are interrelated, so a better understanding of how humans perceive speech will lead to more relevant diagnoses in language learning and will help identify critical parameters for expressive speech synthesis. Moreover, since expressivity translates into both visual and acoustic effects that must be considered simultaneously, the multimodal components of expressivity, which affect both the voice and the face, will be addressed to produce expressive multimodal speech.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid8"> </a><i><b>The statistical modeling of speech.</b></i> - Statistical approaches are common in speech processing, and they achieve levels of performance that make them usable in real applications. However, speech recognition systems still have limited capabilities (for example, the vocabulary, even if large, remains limited), and their performance drops significantly when dealing with degraded speech, such as noisy signals and spontaneous speech. Approaches based on source separation will be investigated as a way of making speech recognition systems more robust to noise. Dealing with spontaneous speech and handling new proper names are two critical aspects that will be tackled, along with the use of statistical models for automatic speech-text alignment and for speech production.</p>
          </li>
          <li>
            <p class="notaparagraph"><a name="uid9"> </a><i><b>The estimation and the exploitation of uncertainty in speech processing.</b></i> - Speech signals are highly variable and often disturbed by noise or other spurious signals (such as music or undesired extra speech). In addition, the output of speech enhancement and source separation techniques is not exactly the original "clean" signal, and the estimation errors have to be taken into account in further processing; this is the purpose of computing and handling the uncertainty of the reconstructed signal provided by source separation approaches. Confidence measures associated with word recognition results aim at providing information on the reliability of the hypothesized words. Finally, no such reliability information is yet available for phonetic segment boundaries or prosodic parameters.</p>
          </li>
        </ul>
        <p>Although interdependent, each of these three scientific challenges constitutes a founding research direction for the MULTISPEECH project. Consequently, the research program is organized along three research directions, each one matching a scientific challenge.
A large part of the research is conducted on French speech data; English and German are also considered in speech recognition experiments and language learning.
The machine-learning-based approaches can be adapted to other languages, provided that corresponding speech corpora are available.</p>
      </div>
      <!--FIN du corps du module-->
      <br/>
      <div class="bottomNavigation">
        <div class="tail_aucentre">
          <a href="./uid1.html" accesskey="P"><img style="align:bottom; border:none" alt="previous" src="../static/img/icons/previous_motif.jpg"/> Previous | </a>
          <a href="./uid0.html" accesskey="U"><img style="align:bottom; border:none" alt="up" src="../static/img/icons/up_motif.jpg"/>  Home</a>
          <a href="./uid11.html" accesskey="N"> | Next <img style="align:bottom; border:none" alt="next" src="../static/img/icons/next_motif.jpg"/></a>
        </div>
        <br/>
      </div>
    </div>
  </body>
</html>
