Bibliography

Publications of the year

Doctoral Dissertations and Habilitation Theses

Articles in International Peer-Reviewed Journals

  • [4] N. Bertin, E. Camberlein, R. Lebarbenchon, E. Vincent, S. Sivasankaran, I. Illina, F. Bimbot.

    VoiceHome-2, an extended corpus for multichannel speech processing in real homes, in: Speech Communication, January 2019, vol. 106, pp. 68-78. [ DOI : 10.1016/j.specom.2018.11.002 ]

    https://hal.inria.fr/hal-01923108
  • [5] A. Deleforge, D. Di Carlo, M. Strauss, R. Serizel, L. Marcenaro.

    Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition, in: IEEE Signal Processing Magazine, September 2019, vol. 36, no 5, pp. 138-144, https://arxiv.org/abs/1907.04655. [ DOI : 10.1109/MSP.2019.2924687 ]

    https://hal.archives-ouvertes.fr/hal-02161897
  • [6] K. Déguernel, E. Vincent, J. Nika, G. Assayag, K. Smaïli.

    Learning of Hierarchical Temporal Structures for Guided Improvisation, in: Computer Music Journal, 2019, vol. 43, no 2.

    https://hal.inria.fr/hal-02378273
  • [7] A. Mesaros, A. Diment, B. Elizalde, T. Heittola, E. Vincent, B. Raj, T. Virtanen.

    Sound event detection in the DCASE 2017 Challenge, in: IEEE/ACM Transactions on Audio, Speech and Language Processing, June 2019, vol. 27, no 6, pp. 992-1006. [ DOI : 10.1109/TASLP.2019.2907016 ]

    https://hal.inria.fr/hal-02067935
  • [8] Q. V. Nguyen, F. Colas, E. Vincent, F. Charpillet.

    Motion planning for robot audition, in: Autonomous Robots, December 2019, vol. 43, no 8, pp. 2293-2317. [ DOI : 10.1007/s10514-019-09880-1 ]

    https://hal.inria.fr/hal-02188342
  • [9] L. Perotin, R. Serizel, E. Vincent, A. Guérin.

    CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings, in: IEEE Journal of Selected Topics in Signal Processing, February 2019, vol. 13, no 1, pp. 22-33. [ DOI : 10.1109/jstsp.2019.2900164 ]

    https://hal.inria.fr/hal-01839883
  • [10] A. Poddar, M. Sahidullah, G. Saha.

    Quality Measures for Speaker Verification with Short Utterances, in: Digital Signal Processing, January 2019, vol. 88, pp. 66-79. [ DOI : 10.1016/j.dsp.2019.01.023 ]

    https://hal.inria.fr/hal-01998376
  • [11] K. Smaïli, D. Fohr, C.-E. González-Gallardo, M. L. Grega, L. Janowski, D. Jouvet, A. Koźbiał, D. Langlois, M. Leszczuk, O. Mella, M.-A. Menacer, A. Mendez, E. L. L. Pontes, E. Sanjuan, J.-M. Torres-Moreno, B. Garcia-Zapirain.

    Summarizing videos into a target language: Methodology, architectures and evaluation, in: Journal of Intelligent and Fuzzy Systems, July 2019, vol. 1, pp. 1-12. [ DOI : 10.3233/JIFS-179350 ]

    https://hal.archives-ouvertes.fr/hal-02271287
  • [12] V. Vestman, T. Kinnunen, R. G. Hautamäki, M. Sahidullah.

    Voice Mimicry Attacks Assisted by Automatic Speaker Verification, in: Computer Speech and Language, June 2019, vol. 59, pp. 36-54. [ DOI : 10.1016/j.csl.2019.05.005 ]

    https://hal.archives-ouvertes.fr/hal-02161773

Invited Conferences

  • [13] C. Dodane, D. Boutet, F. Hirsch, S. Ouni, A. Morgenstern.

    MODALISA une plateforme intégrative pour capturer l’orchestration des gestes et de la parole, in: Défi Instrumentation aux Limites, Colloque de restitution, Paris, France, CNRS, September 2019.

    https://hal.archives-ouvertes.fr/hal-02375011
  • [14] F. Forbes, A. Deleforge, R. Horaud, E. Perthame.

    Robust non-linear regression approach for generalized inverse problems in a high dimensional setting, in: AIP 2019 - Applied Inverse Problem conference, Grenoble, France, July 2019.

    https://hal.archives-ouvertes.fr/hal-02415115
  • [15] D. Jouvet.

    Speech Processing and Prosody, in: TSD 2019 - 22nd International Conference of Text, Speech and Dialogue, Ljubljana, Slovenia, September 2019.

    https://hal.inria.fr/hal-02177210
  • [16] R. Serizel, N. Turpault.

    Sound Event Detection from Partially Annotated Data: Trends and Challenges, in: IcETRAN conference, Srebrno Jezero, Serbia, June 2019.

    https://hal.inria.fr/hal-02114652
  • [17] E. Vincent.

    COMPRISE, in: META-FORUM, Brussels, Belgium, October 2019.

    https://hal.inria.fr/hal-02377051
  • [18] E. Vincent.

    Grands défis scientifiques et technologiques en traitement de la parole: quelles initiatives chez Inria et au niveau européen?, in: Voice Tech Paris 2019, Paris, France, November 2019.

    https://hal.inria.fr/hal-02377036
  • [19] E. Vincent.

    Parole & deep learning : succès et grands défis, in: Journée IA, Langage et Citoyens, Nancy, France, March 2019.

    https://hal.inria.fr/hal-02090623

International Conferences with Proceedings

  • [20] K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, K. Smaïli.

    A Fine-grained Multilingual Analysis Based on the Appraisal Theory: Application to Arabic and English Videos, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, Communications in Computer and Information Science (CCIS), vol. 1108, pp. 49-61. [ DOI : 10.1007/978-3-030-32959-4_4 ]

    https://hal.archives-ouvertes.fr/hal-02314244
  • [21] T. Biasutto-Lervat, S. Dahmani, S. Ouni.

    Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.inria.fr/hal-02175780
  • [22] A. Bonneau.

    German obstruent sequences by French L2 learners, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02143360
  • [23] S. Dahmani, V. Colotte, V. Girard, S. Ouni.

    Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.inria.fr/hal-02175776
  • [24] D. Di Carlo, A. Deleforge, N. Bertin.

    Mirage: 2D Source Localization Using Microphone Pair Augmentation with Echoes, in: ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, United Kingdom, IEEE, May 2019, pp. 775-779, https://arxiv.org/abs/1906.08968. [ DOI : 10.1109/ICASSP.2019.8683534 ]

    https://hal.archives-ouvertes.fr/hal-02160940
  • [25] C. Dodane, D. Boutet, I. Didirkova, F. Hirsch, S. Ouni, A. Morgenstern.

    An integrative platform to capture the orchestration of gesture and speech, in: GeSpIn 2019 - Gesture and Speech in Interaction, Paderborn, Germany, September 2019.

    https://hal.inria.fr/hal-02278345
  • [26] I. K. Douros, J. Felblinger, J. Frahm, K. Isaieva, A. Joseph, Y. Laprie, F. Odille, A. Tsukanova, D. Voit, P.-A. Vuissoz.

    A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.inria.fr/hal-02167756
  • [27] I. K. Douros, Y. Laprie, P.-A. Vuissoz, B. Elie.

    Acoustic Evaluation of Simplifying Hypotheses Used in Articulatory Synthesis, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.

    https://hal.inria.fr/hal-02180617
  • [28] I. K. Douros, A. Tsukanova, K. Isaieva, P.-A. Vuissoz, Y. Laprie.

    Towards a method of dynamic vocal tract shapes generation by combining static 3D and dynamic 2D MRI speech data, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.inria.fr/hal-02181333
  • [29] I. K. Douros, P.-A. Vuissoz, Y. Laprie.

    Acoustic impacts of geometric approximation at the level of velum and epiglottis on French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02180566
  • [30] I. K. Douros, P.-A. Vuissoz, Y. Laprie.

    Comparison between 2D and 3D models for speech production: a study of French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02180606
  • [31] I. K. Douros, P.-A. Vuissoz, Y. Laprie.

    Effect of head posture on phonation of French vowels, in: ICPhS 2019 - Proceedings of International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02180486
  • [32] A. Dufraux, E. Vincent, A. Hannun, A. Brun, M. Douze.

    Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition, in: ASRU 2019 - IEEE Automatic Speech Recognition and Understanding Workshop, Singapore, Singapore, December 2019.

    https://hal.inria.fr/hal-02316572
  • [33] B. Elie, A. Amelot, Y. Laprie, S. Maeda.

    Glottal Opening Measurements in VCV and VCCV Sequences, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.

    https://hal.inria.fr/hal-02180626
  • [34] M. Fontaine, A. A. Nugraha, R. Badeau, K. Yoshii, A. Liutkus.

    Cauchy Multichannel Speech Enhancement with a Deep Speech Prior, in: EUSIPCO 2019 - 27th European Signal Processing Conference, A Coruña, Spain, September 2019.

    https://hal.telecom-paristech.fr/hal-02288063
  • [35] T. Kinnunen, R. G. Hautamäki, V. Vestman, M. Sahidullah.

    Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019.

    https://hal.inria.fr/hal-02051701
  • [36] A. Kulkarni, V. Colotte, D. Jouvet.

    Layer adaptation for transfer of expressivity in speech synthesis, in: LTC'19 - 9th Language & Technology Conference, Poznan, Poland, May 2019.

    https://hal.inria.fr/hal-02177945
  • [37] L. Lee, K. Bartkova, D. Jouvet, M. Dargnat, Y. Keromnes.

    Can prosody meet pragmatics? Case of discourse particles in French, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02177202
  • [38] K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.archives-ouvertes.fr/hal-02280151
  • [39] T. Léonova, G. Coffe, A. Tarasconi, A. Piquard-Kipffer, D. Sardin, A. Gosse, J. Boré.

    L'impact du trouble du spectre de l'autisme sur le bien-être psychologique des parents, in: XVIIIème Congrès de l'Association Internationale de Formation et de Recherche en Éducation Familiale, Schoelcher, Martinique, France, May 2019.

    https://hal.inria.fr/hal-02179616
  • [40] M. A. Menacer, C. E. González-Gallardo, K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, F. Sadat, J. M. Torres-Moreno, K. Smaïli.

    Extractive Text-Based Summarization of Arabic videos: Issues, Approaches and Evaluations, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, Communications in Computer and Information Science (CCIS), vol. 1108, pp. 65-78. [ DOI : 10.1007/978-3-030-32959-4_5 ]

    https://hal.archives-ouvertes.fr/hal-02314238
  • [41] M. Menacer, D. Langlois, D. Jouvet, D. Fohr, O. Mella, K. Smaïli.

    Machine Translation on a parallel Code-Switched Corpus, in: Canadian AI 2019 - 32nd Conference on Canadian Artificial Intelligence, Ontario, Canada, Lecture Notes in Artificial Intelligence, May 2019.

    https://hal.archives-ouvertes.fr/hal-02106010
  • [42] M. Pariente, A. Deleforge, E. Vincent.

    A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders, in: INTERSPEECH, Graz, Austria, September 2019, https://arxiv.org/abs/1905.01209.

    https://hal.inria.fr/hal-02116165
  • [44] D. Ribas, E. Vincent.

    An improved uncertainty propagation method for robust i-vector based speaker recognition, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019, https://arxiv.org/abs/1902.05761.

    https://hal.inria.fr/hal-02010199
  • [45] B. M. L. Srivastava, A. Bellet, M. Tommasi, E. Vincent.

    Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.inria.fr/hal-02166434
  • [46] M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee.

    ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.

    https://hal.archives-ouvertes.fr/hal-02172099
  • [47] A. Tsukanova, I. K. Douros, A. Shimorina, Y. Laprie.

    Can static vocal tract positions represent articulatory targets in continuous speech? Matching static MRI captures against real-time MRI for the French language, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.

    https://hal.inria.fr/hal-02181314
  • [48] N. Turpault, R. Serizel, A. Parag Shah, J. Salamon.

    Sound event detection in domestic environments with weakly labeled data and soundscape synthesis, in: Workshop on Detection and Classification of Acoustic Scenes and Events, New York City, United States, October 2019.

    https://hal.inria.fr/hal-02160855
  • [49] N. Turpault, R. Serizel, E. Vincent.

    Semi-supervised triplet loss based learning of ambient audio embeddings, in: ICASSP, Brighton, United Kingdom, 2019.

    https://hal.archives-ouvertes.fr/hal-02025824
  • [50] I. Zangar, Z. Mnasri, V. Colotte, D. Jouvet.

    F0 modeling using DNN for Arabic parametric speech synthesis, in: INNSBDDL 2019 - INNS Big Data and Deep Learning, Sestri Levante, Italy, April 2019.

    https://hal.inria.fr/hal-02177496

Scientific Books (or Scientific Book chapters)

  • [51] M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K. A. Lee.

    Introduction to Voice Presentation Attack Detection and Recent Advances, in: Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, S. Marcel, M. S. Nixon, J. Fierrez, N. Evans (editors), Advances in Computer Vision and Pattern Recognition, Springer, 2019, pp. 321-361. [ DOI : 10.1007/978-3-319-92627-8_15 ]

    https://hal.inria.fr/hal-01974528

Internal Reports

  • [52] B. Caramiaux, F. Lotte, J. Geurts, G. Amato, M. Behrmann, F. Bimbot, F. Falchi, A. Garcia, J. Gibert, G. Gravier, H. Holken, H. Koenitz, S. Lefebvre, A. Liutkus, A. Perkis, R. Redondo, E. Turrin, T. Viéville, E. Vincent.

    AI in the media and creative industries, New European Media (NEM), April 2019, pp. 1-35, https://arxiv.org/abs/1905.04175.

    https://hal.inria.fr/hal-02125504
  • [53] G. Carbajal, R. Serizel, E. Vincent, E. Humbert.

    Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise: Supporting Document, Inria Nancy, Multispeech team; Invoxia SAS, November 2019, no RR-9303.

    https://hal.inria.fr/hal-02372431
  • [54] K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, I4U Consortium, April 2019.

    https://hal.archives-ouvertes.fr/hal-02174317
  • [55] M. Pariente, A. Deleforge, E. Vincent.

    A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders: Supporting Document, Inria, April 2019, no RR-9268, pp. 1-8.

    https://hal.inria.fr/hal-02089062

Software

Other Publications

  • [57] G. Carbajal, R. Serizel, E. Vincent, E. Humbert.

    Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise, December 2019, working paper or preprint.

    https://hal.inria.fr/hal-02372579
  • [58] N. Furnon, R. Serizel, I. Illina, S. Essid.

    DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays, October 2019, Submitted to ICASSP 2020.

    https://hal.archives-ouvertes.fr/hal-02389159
  • [59] M. Pariente, S. Cornell, A. Deleforge, E. Vincent.

    Filterbank design for end-to-end speech separation, October 2019, Submitted to ICASSP 2020.

    https://hal.archives-ouvertes.fr/hal-02355623
  • [60] M. Sahidullah, J. Patino, S. Cornell, R. Yin, S. Sivasankaran, H. Bredin, P. Korshunov, A. Brutti, R. Serizel, E. Vincent, N. Evans, S. Marcel, S. Squartini, C. Barras.

    The Speed Submission to DIHARD II: Contributions & Lessons Learned, November 2019, working paper or preprint.

    https://hal.inria.fr/hal-02352840
  • [61] R. Serizel, N. Turpault, A. Shah, J. Salamon.

    Sound event detection in synthetic domestic environments, November 2019, working paper or preprint.

    https://hal.inria.fr/hal-02355573
  • [62] S. Sivasankaran, E. Vincent, D. Fohr.

    Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition, November 2019, Submitted to ICASSP 2020.

    https://hal.inria.fr/hal-02355669
  • [63] S. Sivasankaran, E. Vincent, D. Fohr.

    SLOGD: Speaker Location Guided Deflation Approach to Speech Separation, November 2019, Submitted to ICASSP 2020.

    https://hal.inria.fr/hal-02355613
  • [64] B. M. L. Srivastava, N. Vauquier, M. Sahidullah, A. Bellet, M. Tommasi, E. Vincent.

    Evaluating Voice Conversion-based Privacy Protection against Informed Attackers, November 2019, working paper or preprint.

    https://hal.inria.fr/hal-02355115