Inria | Raweb 2019 | Presentation of the Project-Team MULTISPEECH | MULTISPEECH Web Site



Bibliography

Publications of the year

Doctoral Dissertations and Habilitation Theses

[1] M. Fontaine.
Alpha-stable processes for signal processing, Université de Lorraine, June 2019.
https://tel.archives-ouvertes.fr/tel-02188304
[2] L. Perotin.
Localization and enhancement of speech from the Ambisonics format : analyse de scènes sonores pour faciliter la commande vocale, Université de Lorraine, October 2019.
https://hal.univ-lorraine.fr/tel-02393258
[3] A. Tsukanova.
Articulatory speech synthesis, Université de Lorraine, December 2019.
https://hal.archives-ouvertes.fr/tel-02433528

Articles in International Peer-Reviewed Journals

[4] N. Bertin, E. Camberlein, R. Lebarbenchon, E. Vincent, S. Sivasankaran, I. Illina, F. Bimbot.
VoiceHome-2, an extended corpus for multichannel speech processing in real homes, in: Speech Communication, January 2019, vol. 106, pp. 68-78. [ DOI : 10.1016/j.specom.2018.11.002 ]
https://hal.inria.fr/hal-01923108
[5] A. Deleforge, D. Di Carlo, M. Strauss, R. Serizel, L. Marcenaro.
Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition, in: IEEE Signal Processing Magazine, September 2019, vol. 36, no. 5, pp. 138-144, https://arxiv.org/abs/1907.04655. [ DOI : 10.1109/MSP.2019.2924687 ]
https://hal.archives-ouvertes.fr/hal-02161897
[6] K. Déguernel, E. Vincent, J. Nika, G. Assayag, K. Smaïli.
Learning of Hierarchical Temporal Structures for Guided Improvisation, in: Computer Music Journal, 2019, vol. 43, no. 2.
https://hal.inria.fr/hal-02378273
[7] A. Mesaros, A. Diment, B. Elizalde, T. Heittola, E. Vincent, B. Raj, T. Virtanen.
Sound event detection in the DCASE 2017 Challenge, in: IEEE/ACM Transactions on Audio, Speech and Language Processing, June 2019, vol. 27, no. 6, pp. 992-1006. [ DOI : 10.1109/TASLP.2019.2907016 ]
https://hal.inria.fr/hal-02067935
[8] Q. V. Nguyen, F. Colas, E. Vincent, F. Charpillet.
Motion planning for robot audition, in: Autonomous Robots, December 2019, vol. 43, no. 8, pp. 2293-2317. [ DOI : 10.1007/s10514-019-09880-1 ]
https://hal.inria.fr/hal-02188342
[9] L. Perotin, R. Serizel, E. Vincent, A. Guérin.
CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings, in: IEEE Journal of Selected Topics in Signal Processing, February 2019, vol. 13, no. 1, pp. 22-33. [ DOI : 10.1109/jstsp.2019.2900164 ]
https://hal.inria.fr/hal-01839883
[10] A. Poddar, M. Sahidullah, G. Saha.
Quality Measures for Speaker Verification with Short Utterances, in: Digital Signal Processing, January 2019, vol. 88, pp. 66-79. [ DOI : 10.1016/j.dsp.2019.01.023 ]
https://hal.inria.fr/hal-01998376
[11] K. Smaïli, D. Fohr, C.-E. González-Gallardo, M. L. Grega, L. Janowski, D. Jouvet, A. Koźbiał, D. Langlois, M. Leszczuk, O. Mella, M.-A. Menacer, A. Mendez, E. L. L. Pontes, E. Sanjuan, J.-M. Torres-Moreno, B. Garcia-Zapirain.
Summarizing videos into a target language: Methodology, architectures and evaluation, in: Journal of Intelligent and Fuzzy Systems, July 2019, vol. 1, pp. 1-12. [ DOI : 10.3233/JIFS-179350 ]
https://hal.archives-ouvertes.fr/hal-02271287
[12] V. Vestman, T. Kinnunen, R. G. Hautamäki, M. Sahidullah.
Voice Mimicry Attacks Assisted by Automatic Speaker Verification, in: Computer Speech and Language, June 2019, vol. 59, pp. 36-54. [ DOI : 10.1016/j.csl.2019.05.005 ]
https://hal.archives-ouvertes.fr/hal-02161773

Invited Conferences

[13] C. Dodane, D. Boutet, F. Hirsch, S. Ouni, A. Morgenstern.
MODALISA une plateforme intégrative pour capturer l’orchestration des gestes et de la parole, in: Défi Instrumentation aux Limites, Colloque de restitution, Paris, France, CNRS, September 2019.
https://hal.archives-ouvertes.fr/hal-02375011
[14] F. Forbes, A. Deleforge, R. Horaud, E. Perthame.
Robust non-linear regression approach for generalized inverse problems in a high dimensional setting, in: AIP 2019 - Applied Inverse Problem conference, Grenoble, France, July 2019.
https://hal.archives-ouvertes.fr/hal-02415115
[15] D. Jouvet.
Speech Processing and Prosody, in: TSD 2019 - 22nd International Conference of Text, Speech and Dialogue, Ljubljana, Slovenia, September 2019.
https://hal.inria.fr/hal-02177210
[16] R. Serizel, N. Turpault.
Sound Event Detection from Partially Annotated Data: Trends and Challenges, in: IcETRAN conference, Srebrno Jezero, Serbia, June 2019.
https://hal.inria.fr/hal-02114652
[17] E. Vincent.
COMPRISE, in: META-FORUM, Brussels, Belgium, October 2019.
https://hal.inria.fr/hal-02377051
[18] E. Vincent.
Grands défis scientifiques et technologiques en traitement de la parole: quelles initiatives chez Inria et au niveau européen?, in: Voice Tech Paris 2019, Paris, France, November 2019.
https://hal.inria.fr/hal-02377036
[19] E. Vincent.
Parole & deep learning : succès et grands défis, in: Journée IA, Langage et Citoyens, Nancy, France, March 2019.
https://hal.inria.fr/hal-02090623

International Conferences with Proceedings

[20] K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, K. Smaïli.
A Fine-grained Multilingual Analysis Based on the Appraisal Theory: Application to Arabic and English Videos, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, Communications in Computer and Information Science, vol. 1108, pp. 49-61. [ DOI : 10.1007/978-3-030-32959-4_4 ]
https://hal.archives-ouvertes.fr/hal-02314244
[21] T. Biasutto-Lervat, S. Dahmani, S. Ouni.
Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02175780
[22] A. Bonneau.
German obstruent sequences by French L2 learners, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02143360
[23] S. Dahmani, V. Colotte, V. Girard, S. Ouni.
Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02175776
[24] D. Di Carlo, A. Deleforge, N. Bertin.
Mirage: 2D Source Localization Using Microphone Pair Augmentation with Echoes, in: ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, United Kingdom, IEEE, May 2019, pp. 775-779, https://arxiv.org/abs/1906.08968. [ DOI : 10.1109/ICASSP.2019.8683534 ]
https://hal.archives-ouvertes.fr/hal-02160940
[25] C. Dodane, D. Boutet, I. Didirkova, F. Hirsch, S. Ouni, A. Morgenstern.
An integrative platform to capture the orchestration of gesture and speech, in: GeSpIn 2019 - Gesture and Speech in Interaction, Paderborn, Germany, September 2019.
https://hal.inria.fr/hal-02278345
[26] I. K. Douros, J. Felblinger, J. Frahm, K. Isaieva, A. Joseph, Y. Laprie, F. Odille, A. Tsukanova, D. Voit, P.-A. Vuissoz.
A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02167756
[27] I. K. Douros, Y. Laprie, P.-A. Vuissoz, B. Elie.
Acoustic Evaluation of Simplifying Hypotheses Used in Articulatory Synthesis, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.
https://hal.inria.fr/hal-02180617
[28] I. K. Douros, A. Tsukanova, K. Isaieva, P.-A. Vuissoz, Y. Laprie.
Towards a method of dynamic vocal tract shapes generation by combining static 3D and dynamic 2D MRI speech data, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02181333
[29] I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Acoustic impacts of geometric approximation at the level of velum and epiglottis on French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180566
[30] I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Comparison between 2D and 3D models for speech production: a study of French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180606
[31] I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Effect of head posture on phonation of French vowels, in: ICPhS 2019 - Proceedings of International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180486
[32] A. Dufraux, E. Vincent, A. Hannun, A. Brun, M. Douze.
Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition, in: ASRU 2019 - IEEE Automatic Speech Recognition and Understanding Workshop, Singapore, December 2019.
https://hal.inria.fr/hal-02316572
[33] B. Elie, A. Amelot, Y. Laprie, S. Maeda.
Glottal Opening Measurements in VCV and VCCV Sequences, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.
https://hal.inria.fr/hal-02180626
[34] M. Fontaine, A. A. Nugraha, R. Badeau, K. Yoshii, A. Liutkus.
Cauchy Multichannel Speech Enhancement with a Deep Speech Prior, in: EUSIPCO 2019 - 27th European Signal Processing Conference, A Coruña, Spain, September 2019.
https://hal.telecom-paristech.fr/hal-02288063
[35] T. Kinnunen, R. G. Hautamäki, V. Vestman, M. Sahidullah.
Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019.
https://hal.inria.fr/hal-02051701
[36] A. Kulkarni, V. Colotte, D. Jouvet.
Layer adaptation for transfer of expressivity in speech synthesis, in: LTC'19 - 9th Language & Technology Conference, Poznan, Poland, May 2019.
https://hal.inria.fr/hal-02177945
[37] L. Lee, K. Bartkova, D. Jouvet, M. Dargnat, Y. Keromnes.
Can prosody meet pragmatics? Case of discourse particles in French, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02177202
[38] K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.archives-ouvertes.fr/hal-02280151
[39] T. Léonova, G. Coffe, A. Tarasconi, A. Piquard-Kipffer, D. Sardin, A. Gosse, J. Boré.
L'impact du trouble du spectre de l'autisme sur le bien-être psychologique des parents, in: XVIIIème Congrès de l'Association Internationale de Formation et de Recherche en Éducation Familiale, Schoelcher, Martinique, France, May 2019.
https://hal.inria.fr/hal-02179616
[40] M. A. Menacer, C. E. González-Gallardo, K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, F. Sadat, J. M. Torres-Moreno, K. Smaïli.
Extractive Text-Based Summarization of Arabic videos: Issues, Approaches and Evaluations, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, Communications in Computer and Information Science, vol. 1108, pp. 65-78. [ DOI : 10.1007/978-3-030-32959-4_5 ]
https://hal.archives-ouvertes.fr/hal-02314238
[41] M. Menacer, D. Langlois, D. Jouvet, D. Fohr, O. Mella, K. Smaïli.
Machine Translation on a parallel Code-Switched Corpus, in: Canadian AI 2019 - 32nd Conference on Canadian Artificial Intelligence, Ontario, Canada, Lecture Notes in Artificial Intelligence, May 2019.
https://hal.archives-ouvertes.fr/hal-02106010
[42] M. Pariente, A. Deleforge, E. Vincent.
A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders, in: INTERSPEECH, Graz, Austria, September 2019, https://arxiv.org/abs/1905.01209.
https://hal.inria.fr/hal-02116165
[44] D. Ribas, E. Vincent.
An improved uncertainty propagation method for robust i-vector based speaker recognition, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019, https://arxiv.org/abs/1902.05761.
https://hal.inria.fr/hal-02010199
[45] B. M. L. Srivastava, A. Bellet, M. Tommasi, E. Vincent.
Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02166434
[46] M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee.
ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.archives-ouvertes.fr/hal-02172099
[47] A. Tsukanova, I. K. Douros, A. Shimorina, Y. Laprie.
Can static vocal tract positions represent articulatory targets in continuous speech? Matching static MRI captures against real-time MRI for the French language, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02181314
[48] N. Turpault, R. Serizel, A. Parag Shah, J. Salamon.
Sound event detection in domestic environments with weakly labeled data and soundscape synthesis, in: Workshop on Detection and Classification of Acoustic Scenes and Events, New York City, United States, October 2019.
https://hal.inria.fr/hal-02160855
[49] N. Turpault, R. Serizel, E. Vincent.
Semi-supervised triplet loss based learning of ambient audio embeddings, in: ICASSP, Brighton, United Kingdom, 2019.
https://hal.archives-ouvertes.fr/hal-02025824
[50] I. Zangar, Z. Mnasri, V. Colotte, D. Jouvet.
F0 modeling using DNN for Arabic parametric speech synthesis, in: INNSBDDL 2019 - INNS Big Data and Deep Learning, Sestri Levante, Italy, April 2019.
https://hal.inria.fr/hal-02177496

Scientific Books (or Scientific Book chapters)

[51] M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K. A. Lee.
Introduction to Voice Presentation Attack Detection and Recent Advances, in: Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, S. Marcel, M. S. Nixon, J. Fierrez, N. Evans (editors), Advances in Computer Vision and Pattern Recognition, Springer, 2019, pp. 321-361. [ DOI : 10.1007/978-3-319-92627-8_15 ]
https://hal.inria.fr/hal-01974528

Internal Reports

[52] B. Caramiaux, F. Lotte, J. Geurts, G. Amato, M. Behrmann, F. Bimbot, F. Falchi, A. Garcia, J. Gibert, G. Gravier, H. Holken, H. Koenitz, S. Lefebvre, A. Liutkus, A. Perkis, R. Redondo, E. Turrin, T. Viéville, E. Vincent.
AI in the media and creative industries, New European Media (NEM), April 2019, pp. 1-35, https://arxiv.org/abs/1905.04175.
https://hal.inria.fr/hal-02125504
[53] G. Carbajal, R. Serizel, E. Vincent, E. Humbert.
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise: Supporting Document, Inria Nancy, Multispeech team; Invoxia SAS, November 2019, no. RR-9303.
https://hal.inria.fr/hal-02372431
[54] K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, I4U Consortium, April 2019.
https://hal.archives-ouvertes.fr/hal-02174317
[55] M. Pariente, A. Deleforge, E. Vincent.
A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders: Supporting Document, Inria, April 2019, no. RR-9268, pp. 1-8.
https://hal.inria.fr/hal-02089062

Software

[56] M. Kowalski, E. Vincent, R. Gribonval.
Underdetermined Reverberant Source Separation, October 2019, Software. [ SWH-ID : swh:1:dir:ec4ae097465d9ea51589537ea94b2ea50e8d134d ]
https://hal.archives-ouvertes.fr/hal-02309043

Other Publications

[57] G. Carbajal, R. Serizel, E. Vincent, E. Humbert.
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise, December 2019, working paper or preprint.
https://hal.inria.fr/hal-02372579
[58] N. Furnon, R. Serizel, I. Illina, S. Essid.
DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays, October 2019, Submitted to ICASSP2020.
https://hal.archives-ouvertes.fr/hal-02389159
[59] M. Pariente, S. Cornell, A. Deleforge, E. Vincent.
Filterbank design for end-to-end speech separation, October 2019, Submitted to ICASSP2020.
https://hal.archives-ouvertes.fr/hal-02355623
[60] M. Sahidullah, J. Patino, S. Cornell, R. Yin, S. Sivasankaran, H. Bredin, P. Korshunov, A. Brutti, R. Serizel, E. Vincent, N. Evans, S. Marcel, S. Squartini, C. Barras.
The Speed Submission to DIHARD II: Contributions & Lessons Learned, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02352840
[61] R. Serizel, N. Turpault, A. Shah, J. Salamon.
Sound event detection in synthetic domestic environments, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02355573
[62] S. Sivasankaran, E. Vincent, D. Fohr.
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition, November 2019, Submitted to ICASSP 2020.
https://hal.inria.fr/hal-02355669
[63] S. Sivasankaran, E. Vincent, D. Fohr.
SLOGD: Speaker Location Guided Deflation Approach to Speech Separation, November 2019, Submitted to ICASSP 2020.
https://hal.inria.fr/hal-02355613
[64] B. M. L. Srivastava, N. Vauquier, M. Sahidullah, A. Bellet, M. Tommasi, E. Vincent.
Evaluating Voice Conversion-based Privacy Protection against Informed Attackers, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02355115
