Bibliography
Publications of the year
Doctoral Dissertations and Habilitation Theses
1. M. Fontaine.
Alpha-stable processes for signal processing, Université de Lorraine, June 2019.
https://tel.archives-ouvertes.fr/tel-02188304
2. L. Perotin.
Localization and enhancement of speech from the Ambisonics format : analyse de scènes sonores pour faciliter la commande vocale, Université de Lorraine, October 2019.
https://hal.univ-lorraine.fr/tel-02393258
3. A. Tsukanova.
Articulatory speech synthesis, Université de Lorraine, December 2019.
https://hal.archives-ouvertes.fr/tel-02433528
Articles in International Peer-Reviewed Journals
4. N. Bertin, E. Camberlein, R. Lebarbenchon, E. Vincent, S. Sivasankaran, I. Illina, F. Bimbot.
VoiceHome-2, an extended corpus for multichannel speech processing in real homes, in: Speech Communication, January 2019, vol. 106, pp. 68-78. [ DOI : 10.1016/j.specom.2018.11.002 ]
https://hal.inria.fr/hal-01923108
5. A. Deleforge, D. Di Carlo, M. Strauss, R. Serizel, L. Marcenaro.
Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition, in: IEEE Signal Processing Magazine, September 2019, vol. 36, no 5, pp. 138-144, https://arxiv.org/abs/1907.04655. [ DOI : 10.1109/MSP.2019.2924687 ]
https://hal.archives-ouvertes.fr/hal-02161897
6. K. Déguernel, E. Vincent, J. Nika, G. Assayag, K. Smaïli.
Learning of Hierarchical Temporal Structures for Guided Improvisation, in: Computer Music Journal, 2019, vol. 43, no 2.
https://hal.inria.fr/hal-02378273
7. A. Mesaros, A. Diment, B. Elizalde, T. Heittola, E. Vincent, B. Raj, T. Virtanen.
Sound event detection in the DCASE 2017 Challenge, in: IEEE/ACM Transactions on Audio, Speech and Language Processing, June 2019, vol. 27, no 6, pp. 992-1006. [ DOI : 10.1109/TASLP.2019.2907016 ]
https://hal.inria.fr/hal-02067935
8. Q. V. Nguyen, F. Colas, E. Vincent, F. Charpillet.
Motion planning for robot audition, in: Autonomous Robots, December 2019, vol. 43, no 8, pp. 2293-2317. [ DOI : 10.1007/s10514-019-09880-1 ]
https://hal.inria.fr/hal-02188342
9. L. Perotin, R. Serizel, E. Vincent, A. Guérin.
CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings, in: IEEE Journal of Selected Topics in Signal Processing, February 2019, vol. 13, no 1, pp. 22-33. [ DOI : 10.1109/jstsp.2019.2900164 ]
https://hal.inria.fr/hal-01839883
10. A. Poddar, M. Sahidullah, G. Saha.
Quality Measures for Speaker Verification with Short Utterances, in: Digital Signal Processing, January 2019, vol. 88, pp. 66-79. [ DOI : 10.1016/j.dsp.2019.01.023 ]
https://hal.inria.fr/hal-01998376
11. K. Smaïli, D. Fohr, C.-E. González-Gallardo, M. L. Grega, L. Janowski, D. Jouvet, A. Koźbiał, D. Langlois, M. Leszczuk, O. Mella, M.-A. Menacer, A. Mendez, E. L. L. Pontes, E. Sanjuan, J.-M. Torres-Moreno, B. Garcia-Zapirain.
Summarizing videos into a target language: Methodology, architectures and evaluation, in: Journal of Intelligent and Fuzzy Systems, July 2019, vol. 1, pp. 1-12. [ DOI : 10.3233/JIFS-179350 ]
https://hal.archives-ouvertes.fr/hal-02271287
12. V. Vestman, T. Kinnunen, R. G. Hautamäki, M. Sahidullah.
Voice Mimicry Attacks Assisted by Automatic Speaker Verification, in: Computer Speech and Language, June 2019, vol. 59, pp. 36-54. [ DOI : 10.1016/j.csl.2019.05.005 ]
https://hal.archives-ouvertes.fr/hal-02161773
Invited Conferences
13. C. Dodane, D. Boutet, F. Hirsch, S. Ouni, A. Morgenstern.
MODALISA une plateforme intégrative pour capturer l’orchestration des gestes et de la parole, in: Défi Instrumentation aux Limites, Colloque de restitution, Paris, France, CNRS, September 2019.
https://hal.archives-ouvertes.fr/hal-02375011
14. F. Forbes, A. Deleforge, R. Horaud, E. Perthame.
Robust non-linear regression approach for generalized inverse problems in a high dimensional setting, in: AIP 2019 - Applied Inverse Problems Conference, Grenoble, France, July 2019.
https://hal.archives-ouvertes.fr/hal-02415115
15. D. Jouvet.
Speech Processing and Prosody, in: TSD 2019 - 22nd International Conference of Text, Speech and Dialogue, Ljubljana, Slovenia, September 2019.
https://hal.inria.fr/hal-02177210
16. R. Serizel, N. Turpault.
Sound Event Detection from Partially Annotated Data: Trends and Challenges, in: IcETRAN conference, Srebrno Jezero, Serbia, June 2019.
https://hal.inria.fr/hal-02114652
17. E. Vincent.
COMPRISE, in: META-FORUM, Brussels, Belgium, October 2019.
https://hal.inria.fr/hal-02377051
18. E. Vincent.
Grands défis scientifiques et technologiques en traitement de la parole: quelles initiatives chez Inria et au niveau européen?, in: Voice Tech Paris 2019, Paris, France, November 2019.
https://hal.inria.fr/hal-02377036
19. E. Vincent.
Parole & deep learning : succès et grands défis, in: Journée IA, Langage et Citoyens, Nancy, France, March 2019.
https://hal.inria.fr/hal-02090623
International Conferences with Proceedings
20. K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, K. Smaïli.
A Fine-grained Multilingual Analysis Based on the Appraisal Theory: Application to Arabic and English Videos, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, vol. Communications in Computer and Information Science book series (CCIS, volume 1108), pp. 49-61. [ DOI : 10.1007/978-3-030-32959-4_4 ]
https://hal.archives-ouvertes.fr/hal-02314244
21. T. Biasutto–Lervat, S. Dahmani, S. Ouni.
Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02175780
22. A. Bonneau.
German obstruent sequences by French L2 learners, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02143360
23. S. Dahmani, V. Colotte, V. Girard, S. Ouni.
Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02175776
24. D. Di Carlo, A. Deleforge, N. Bertin.
Mirage: 2D Source Localization Using Microphone Pair Augmentation with Echoes, in: ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, United Kingdom, IEEE, May 2019, pp. 775-779, https://arxiv.org/abs/1906.08968. [ DOI : 10.1109/ICASSP.2019.8683534 ]
https://hal.archives-ouvertes.fr/hal-02160940
25. C. Dodane, D. Boutet, I. Didirkova, F. Hirsch, S. Ouni, A. Morgenstern.
An integrative platform to capture the orchestration of gesture and speech, in: GeSpIn 2019 - Gesture and Speech in Interaction, Paderborn, Germany, September 2019.
https://hal.inria.fr/hal-02278345
26. I. K. Douros, J. Felblinger, J. Frahm, K. Isaieva, A. Joseph, Y. Laprie, F. Odille, A. Tsukanova, D. Voit, P.-A. Vuissoz.
A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02167756
27. I. K. Douros, Y. Laprie, P.-A. Vuissoz, B. Elie.
Acoustic Evaluation of Simplifying Hypotheses Used in Articulatory Synthesis, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.
https://hal.inria.fr/hal-02180617
28. I. K. Douros, A. Tsukanova, K. Isaieva, P.-A. Vuissoz, Y. Laprie.
Towards a method of dynamic vocal tract shapes generation by combining static 3D and dynamic 2D MRI speech data, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02181333
29. I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Acoustic impacts of geometric approximation at the level of velum and epiglottis on French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180566
30. I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Comparison between 2D and 3D models for speech production: a study of French vowels, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180606
31. I. K. Douros, P.-A. Vuissoz, Y. Laprie.
Effect of head posture on phonation of French vowels, in: ICPhS 2019 - Proceedings of International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02180486
32. A. Dufraux, E. Vincent, A. Hannun, A. Brun, M. Douze.
Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition, in: ASRU 2019 - IEEE Automatic Speech Recognition and Understanding Workshop, Singapore, Singapore, December 2019.
https://hal.inria.fr/hal-02316572
33. B. Elie, A. Amelot, Y. Laprie, S. Maeda.
Glottal Opening Measurements in VCV and VCCV Sequences, in: ICA 2019 - 23rd International Congress on Acoustics, Aachen, Germany, September 2019.
https://hal.inria.fr/hal-02180626
34. M. Fontaine, A. A. Nugraha, R. Badeau, K. Yoshii, A. Liutkus.
Cauchy Multichannel Speech Enhancement with a Deep Speech Prior, in: EUSIPCO 2019 - 27th European Signal Processing Conference, A Coruña, Spain, September 2019.
https://hal.telecom-paristech.fr/hal-02288063
35. T. Kinnunen, R. G. Hautamäki, V. Vestman, M. Sahidullah.
Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019.
https://hal.inria.fr/hal-02051701
36. A. Kulkarni, V. Colotte, D. Jouvet.
Layer adaptation for transfer of expressivity in speech synthesis, in: LTC'19 - 9th Language & Technology Conference, Poznan, Poland, May 2019.
https://hal.inria.fr/hal-02177945
37. L. Lee, K. Bartkova, D. Jouvet, M. Dargnat, Y. Keromnes.
Can prosody meet pragmatics? Case of discourse particles in French, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02177202
38. K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.archives-ouvertes.fr/hal-02280151
39. T. Léonova, G. Coffe, A. Tarasconi, A. Piquard-Kipffer, D. Sardin, A. Gosse, J. Boré.
L'impact du trouble du spectre de l'autisme sur le bien-être psychologique des parents, in: XVIIIème Congrès de l'Association Internationale de Formation et de Recherche en Éducation Familiale, Schoelcher, Martinique, France, May 2019.
https://hal.inria.fr/hal-02179616
40. M. A. Menacer, C. E. González-Gallardo, K. Abidi, D. Fohr, D. Jouvet, D. Langlois, O. Mella, F. Sadat, J. M. Torres-Moreno, K. Smaïli.
Extractive Text-Based Summarization of Arabic videos: Issues, Approaches and Evaluations, in: ICALP: International Conference on Arabic Language Processing, Nancy, France, Springer, August 2019, vol. Communications in Computer and Information Science book series (CCIS, volume 1108), pp. 65-78. [ DOI : 10.1007/978-3-030-32959-4_5 ]
https://hal.archives-ouvertes.fr/hal-02314238
41. M. Menacer, D. Langlois, D. Jouvet, D. Fohr, O. Mella, K. Smaïli.
Machine Translation on a parallel Code-Switched Corpus, in: Canadian AI 2019 - 32nd Conference on Canadian Artificial Intelligence, Ontario, Canada, Lecture Notes in Artificial Intelligence, May 2019.
https://hal.archives-ouvertes.fr/hal-02106010
42. M. Pariente, A. Deleforge, E. Vincent.
A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders, in: INTERSPEECH, Graz, Austria, September 2019, https://arxiv.org/abs/1905.01209.
https://hal.inria.fr/hal-02116165
44. D. Ribas, E. Vincent.
An improved uncertainty propagation method for robust i-vector based speaker recognition, in: ICASSP 2019 - 44th International Conference on Acoustics, Speech, and Signal Processing, Brighton, United Kingdom, May 2019, https://arxiv.org/abs/1902.05761.
https://hal.inria.fr/hal-02010199
45. B. M. L. Srivastava, A. Bellet, M. Tommasi, E. Vincent.
Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.inria.fr/hal-02166434
46. M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee.
ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection, in: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Graz, Austria, September 2019.
https://hal.archives-ouvertes.fr/hal-02172099
47. A. Tsukanova, I. K. Douros, A. Shimorina, Y. Laprie.
Can static vocal tract positions represent articulatory targets in continuous speech? Matching static MRI captures against real-time MRI for the French language, in: ICPhS 2019 - International Congress of Phonetic Sciences, Melbourne, Australia, August 2019.
https://hal.inria.fr/hal-02181314
48. N. Turpault, R. Serizel, A. Parag Shah, J. Salamon.
Sound event detection in domestic environments with weakly labeled data and soundscape synthesis, in: Workshop on Detection and Classification of Acoustic Scenes and Events, New York City, United States, October 2019.
https://hal.inria.fr/hal-02160855
49. N. Turpault, R. Serizel, E. Vincent.
Semi-supervised triplet loss based learning of ambient audio embeddings, in: ICASSP, Brighton, United Kingdom, 2019.
https://hal.archives-ouvertes.fr/hal-02025824
50. I. Zangar, Z. Mnasri, V. Colotte, D. Jouvet.
F0 modeling using DNN for Arabic parametric speech synthesis, in: INNSBDDL 2019 - INNS Big Data and Deep Learning, Sestri Levante, Italy, April 2019.
https://hal.inria.fr/hal-02177496
Scientific Books (or Scientific Book chapters)
51. M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K. A. Lee.
Introduction to Voice Presentation Attack Detection and Recent Advances, in: Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, S. Marcel, M. S. Nixon, J. Fierrez, N. Evans (editors), Advances in Computer Vision and Pattern Recognition, Springer, 2019, pp. 321-361. [ DOI : 10.1007/978-3-319-92627-8_15 ]
https://hal.inria.fr/hal-01974528
Internal Reports
52. B. Caramiaux, F. Lotte, J. Geurts, G. Amato, M. Behrmann, F. Bimbot, F. Falchi, A. Garcia, J. Gibert, G. Gravier, H. Holken, H. Koenitz, S. Lefebvre, A. Liutkus, A. Perkis, R. Redondo, E. Turrin, T. Viéville, E. Vincent.
AI in the media and creative industries, New European Media (NEM), April 2019, pp. 1-35, https://arxiv.org/abs/1905.04175.
https://hal.inria.fr/hal-02125504
53. G. Carbajal, R. Serizel, E. Vincent, E. Humbert.
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise: Supporting Document, Inria Nancy, équipe Multispeech ; Invoxia SAS, November 2019, no RR-9303.
https://hal.inria.fr/hal-02372431
54. K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.-M. B. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. Ngo Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. Kuan Teh, H. Dat Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.-F. Bonastre, C. Xu, Z. H. Lim, S. Chng, S. Ranjan, J. H. L. Hansen, M. Todisco, N. Evans.
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences, I4U Consortium, April 2019.
https://hal.archives-ouvertes.fr/hal-02174317
55. M. Pariente, A. Deleforge, E. Vincent.
A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders: Supporting Document, Inria, April 2019, no RR-9268, pp. 1-8.
https://hal.inria.fr/hal-02089062
Software
56. M. Kowalski, E. Vincent, R. Gribonval.
Underdetermined Reverberant Source Separation, October 2019, [ SWH-ID : swh:1:dir:ec4ae097465d9ea51589537ea94b2ea50e8d134d ], Software.
https://hal.archives-ouvertes.fr/hal-02309043
Other Publications
57. G. Carbajal, R. Serizel, E. Vincent, E. Humbert.
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise, December 2019, working paper or preprint.
https://hal.inria.fr/hal-02372579
58. N. Furnon, R. Serizel, I. Illina, S. Essid.
DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays, October 2019, Submitted to ICASSP 2020.
https://hal.archives-ouvertes.fr/hal-02389159
59. M. Pariente, S. Cornell, A. Deleforge, E. Vincent.
Filterbank design for end-to-end speech separation, October 2019, Submitted to ICASSP 2020.
https://hal.archives-ouvertes.fr/hal-02355623
60. M. Sahidullah, J. Patino, S. Cornell, R. Yin, S. Sivasankaran, H. Bredin, P. Korshunov, A. Brutti, R. Serizel, E. Vincent, N. Evans, S. Marcel, S. Squartini, C. Barras.
The Speed Submission to DIHARD II: Contributions & Lessons Learned, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02352840
61. R. Serizel, N. Turpault, A. Shah, J. Salamon.
Sound event detection in synthetic domestic environments, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02355573
62. S. Sivasankaran, E. Vincent, D. Fohr.
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition, November 2019, Submitted to ICASSP 2020.
https://hal.inria.fr/hal-02355669
63. S. Sivasankaran, E. Vincent, D. Fohr.
SLOGD: Speaker Location Guided Deflation Approach to Speech Separation, November 2019, Submitted to ICASSP 2020.
https://hal.inria.fr/hal-02355613
64. B. M. L. Srivastava, N. Vauquier, M. Sahidullah, A. Bellet, M. Tommasi, E. Vincent.
Evaluating Voice Conversion-based Privacy Protection against Informed Attackers, November 2019, working paper or preprint.
https://hal.inria.fr/hal-02355115