Bibliography

Major publications by the team in recent years

[2] A. Bonneau, D. Fohr, I. Illina, D. Jouvet, O. Mella, L. Mesbahi, L. Orosanu.

Gestion d'erreurs pour la fiabilisation des retours automatiques en apprentissage de la prosodie d'une langue seconde, in: Traitement Automatique des Langues, 2013, vol. 53, no. 3.

https://hal.inria.fr/hal-00834278
[3] D. Jouvet, D. Fohr.

Combining Forward-based and Backward-based Decoders for Improved Speech Recognition Performance, in: InterSpeech - 14th Annual Conference of the International Speech Communication Association - 2013, Lyon, France, August 2013.

https://hal.inria.fr/hal-00834282
[4] K. Nathwani, J. A. Morales-Cordovilla, S. Sivasankaran, I. Illina, E. Vincent.

An extended experimental investigation of DNN uncertainty propagation for noise robust ASR, in: 5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2017), San Francisco, United States, March 2017.

https://hal.inria.fr/hal-01446441
[5] A. A. Nugraha, A. Liutkus, E. Vincent.

Multichannel audio source separation with deep neural networks, in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, June 2016, vol. 24, no. 10, pp. 1652-1664. [ DOI : 10.1109/TASLP.2016.2580946 ]

https://hal.inria.fr/hal-01163369
[6] A. Ozerov, E. Vincent, F. Bimbot.

A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, in: IEEE Transactions on Audio, Speech and Language Processing, May 2012, vol. 20, no. 4, pp. 1118-1133, 16.

https://hal.archives-ouvertes.fr/hal-00626962
[7] A. Piquard-Kipffer, C. Blonz.

Je peux voir les mots que tu dis ! Histoire d'un projet, in: 13ème édition du Festival du film de chercheur CNRS 2012, Nancy, France, June 2012.

https://hal.inria.fr/hal-01263907
[8] A. Piquard-Kipffer, L. Sprenger-Charolles.

Predicting reading level at the end of Grade 2 from skills assessed in kindergarten: contribution of phonemic discrimination (Follow-up of 85 French-speaking children from 4 to 8 years old), in: Topics in Cognitive Psychology, 2013.

https://hal.inria.fr/hal-00833951

Publications of the year

Doctoral Dissertations and Habilitation Theses

[9] B. Dumortier.

Acoustic control of wind farms, Université de Lorraine, September 2018.

https://tel.archives-ouvertes.fr/tel-01897853
[10] K. Déguernel.

Learning of musical structures in the context of improvisation, Université de Lorraine, March 2018.

https://tel.archives-ouvertes.fr/tel-01735308

Articles in International Peer-Reviewed Journals

[11] N. Bertin, E. Camberlein, R. Lebarbenchon, E. Vincent, S. Sivasankaran, I. Illina, F. Bimbot.

VoiceHome-2, an extended corpus for multichannel speech processing in real homes, in: Speech Communication, 2018.

https://hal.inria.fr/hal-01923108
[12] K. Déguernel, E. Vincent, G. Assayag.

Probabilistic Factor Oracles for Multidimensional Machine Improvisation, in: Computer Music Journal, June 2018, vol. 42, no. 2, pp. 52-66. [ DOI : 10.1162/comj_a_00460 ]

https://hal.inria.fr/hal-01693750
[13] A. Houidhek, V. Colotte, Z. Mnasri, D. Jouvet.

Evaluation of speech unit modelling for HMM-based speech synthesis for Arabic, in: International Journal of Speech Technology, November 2018, pp. 1-12. [ DOI : 10.1007/s10772-018-09558-6 ]

https://hal.inria.fr/hal-01936963
[14] D. Jouvet, D. Langlois, M. A. Menacer, D. Fohr, O. Mella, K. Smaïli.

Adaptation of speech recognition vocabularies for improved transcription of YouTube videos, in: Journal of International Science and General Applications, March 2018, vol. 1, no. 1, pp. 1-9.

https://hal.archives-ouvertes.fr/hal-01873801
[15] K. Nathwani, E. Vincent, I. Illina.

DNN Uncertainty Propagation using GMM-Derived Uncertainty Features for Noise Robust ASR, in: IEEE Signal Processing Letters, January 2018. [ DOI : 10.1109/LSP.2018.2791534 ]

https://hal.inria.fr/hal-01680658
[16] S. Ouni, G. Gris.

Dynamic Lip Animation from a Limited number of Control Points: Towards an Effective Audiovisual Spoken Communication, in: Speech Communication, February 2018, vol. 96, pp. 49-57. [ DOI : 10.1016/j.specom.2017.11.006 ]

https://hal.inria.fr/hal-01631397
[17] Z. Wang, E. Vincent, R. Serizel, Y. Yan.

Rank-1 Constrained Multichannel Wiener Filter for Speech Recognition in Noisy Environments, in: Computer Speech and Language, May 2018, vol. 49, pp. 37-51. [ DOI : 10.1016/j.csl.2017.11.003 ]

https://hal.inria.fr/hal-01634449

International Conferences with Proceedings

[18] B. Abdullah, I. Illina, D. Fohr.

Dynamic Extension of ASR Lexicon Using Wikipedia Data, in: IEEE Workshop on Spoken and Language Technology (SLT), Athens, Greece, Proceedings of IEEE SLT, December 2018.

https://hal.archives-ouvertes.fr/hal-01874495
[19] J. Barker, S. Watanabe, E. Vincent, J. Trmal.

The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines, in: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, September 2018, https://arxiv.org/abs/1803.10609.

https://hal.inria.fr/hal-01744021
[20] K. Bartkova, D. Jouvet.

Analysis of prosodic correlates of emotional speech data, in: ExLing 2018 - 9th Tutorial and Research Workshop on Experimental Linguistics, Paris, France, August 2018.

https://hal.inria.fr/hal-01889932
[21] T. Biasutto-Lervat, S. Ouni.

Phoneme-to-Articulatory mapping using bidirectional gated RNN, in: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, September 2018.

https://hal.inria.fr/hal-01862587
[22] A. Bonneau.

Impact of fluency and segmental categorization in L2: the case of French final fricatives uttered by German speakers, in: Speech Prosody 2018, Poznań, Poland, June 2018. [ DOI : 10.21437/speechprosody.2018-189 ]

https://hal.inria.fr/hal-01926657
[23] G. Carbajal, R. Serizel, E. Vincent, E. Humbert.

Multiple-input neural network-based residual echo suppression, in: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, April 2018, pp. 1-5.

https://hal.inria.fr/hal-01723630
[24] H. Delgado, M. Todisco, M. Sahidullah, N. Evans, T. Kinnunen, K. A. Lee, J. Yamagishi.

ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements, in: Odyssey 2018 - The Speaker and Language Recognition Workshop, Les Sables d'Olonne, France, June 2018.

https://hal.inria.fr/hal-01880206
[25] D. Di Carlo, A. Liutkus, K. Déguernel.

Interference reduction on full-length live recordings, in: ICASSP 2018 - IEEE International Conference on Acoustics, Speech, and Signal Processing, Calgary, Canada, IEEE, April 2018, pp. 736-740. [ DOI : 10.1109/ICASSP.2018.8462621 ]

https://hal.inria.fr/hal-01713889
[26] F. Fang, J. Yamagishi, I. Echizen, M. Sahidullah, T. Kinnunen.

Transforming acoustic characteristics to deceive playback spoofing countermeasures of speaker verification systems, in: WIFS 2018 - IEEE International Workshop on Information Forensics and Security, Hong Kong, Hong Kong SAR China, December 2018.

https://hal.inria.fr/hal-01889910
[27] M. Fontaine, F.-R. Stöter, A. Liutkus, U. Simsekli, R. Serizel, R. Badeau.

Multichannel Audio Modeling with Elliptically Stable Tensor Decomposition, in: LVA ICA 2018 - 14th International Conference on Latent Variable Analysis and Signal Separation, Surrey, United Kingdom, July 2018.

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01766795
[28] B. Garcia-Zapirain, C. Castillo, A. Badiola, S. Zahia, A. Mendez, D. Langlois, D. Jouvet, J.-M. Torres-Moreno, M. Leszczuk, K. Smaïli.

A Proposed Methodology for Subjective Evaluation of Video and Text Summarization, in: MISSI 2018 - 11th edition of the International Conference on Multimedia and Network Information Systems, Wrocław, Poland, Advances in Intelligent Systems and Computing, Springer, September 2018, vol. 833, pp. 396-404. [ DOI : 10.1007/978-3-319-98678-4_40 ]

https://hal.archives-ouvertes.fr/hal-01873685
[29] M. L. Grega, K. Smaïli, M. Leszczuk, C.-E. González-Gallardo, J.-M. Torres-Moreno, E. Linhares Pontes, D. Fohr, O. Mella, M. A. Menacer, D. Jouvet.

An Integrated AMIS Prototype for Automated Summarization and Translation of Newscasts and Reports, in: MISSI 2018 - 11th International Conference on Multimedia and Network Information Systems, Wrocław, Poland, K. Choroś, M. Kopel, E. Kukla, A. Siemiński (editors), Springer, September 2018, vol. 833, pp. 415-423. [ DOI : 10.1007/978-3-319-98678-4_42 ]

https://hal.archives-ouvertes.fr/hal-01873680
[30] A. Houidhek, V. Colotte, Z. Mnasri, D. Jouvet.

DNN-Based Speech Synthesis for Arabic: Modelling and Evaluation, in: SLSP 2018 - 6th International Conference on Statistical Language and Speech Processing, Mons, Belgium, October 2018.

https://hal.inria.fr/hal-01904512
[31] N. Keriven, A. Deleforge, A. Liutkus.

Blind Source Separation Using Mixtures of Alpha-Stable Distributions, in: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, IEEE, April 2018, pp. 771-775, https://arxiv.org/abs/1711.04460. [ DOI : 10.1109/ICASSP.2018.8462095 ]

https://hal.inria.fr/hal-01633215
[32] T. Kinnunen, K. A. Lee, H. Delgado, N. Evans, M. Todisco, M. Sahidullah, J. Yamagishi, D. A. Reynolds.

t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification, in: Odyssey 2018 - The Speaker and Language Recognition Workshop, Les Sables d'Olonne, France, June 2018.

https://hal.inria.fr/hal-01880306
[33] Y. Laprie, B. Elie, A. Tsukanova, P.-A. Vuissoz.

Centerline articulatory models of the velum and epiglottis for articulatory synthesis of speech, in: EUSIPCO 2018 - 26th European Signal Processing Conference, Rome, Italy, September 2018.

https://hal.inria.fr/hal-01921928
[34] L. Lee, K. Bartkova, M. Dargnat, D. Jouvet.

Prosodic and Pragmatic Values of Discourse Particles in French, in: ExLing 2018 - 9th Tutorial and Research Workshop on Experimental Linguistics, Paris, France, August 2018.

https://hal.inria.fr/hal-01889925
[35] A. Liutkus, C. Rohlfing, A. Deleforge.

Audio source separation with magnitude priors: the BEADS model, in: ICASSP: International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, Signal Processing and Artificial Intelligence: Changing the World, April 2018, pp. 1-5. [ DOI : 10.1109/ICASSP.2018.8462515 ]

https://hal.inria.fr/hal-01713886
[36] H. Peic Tukuljac, A. Deleforge, R. Gribonval.

MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval, in: NeurIPS 2018 - Thirty-second Conference on Neural Information Processing Systems, Montréal, Canada, December 2018, pp. 1-11, https://arxiv.org/abs/1810.13338.

https://hal.inria.fr/hal-01906385
[37] L. Perotin, R. Serizel, E. Vincent, A. Guérin.

CRNN-based joint azimuth and elevation localization with the Ambisonics intensity vector, in: IWAENC 2018 - 16th International Workshop on Acoustic Signal Enhancement, Tokyo, Japan, September 2018.

https://hal.inria.fr/hal-01840453
[38] L. Perotin, R. Serizel, E. Vincent, A. Guérin.

Multichannel speech separation with recurrent neural networks from high-order ambisonics recordings, in: 43rd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2018), Calgary, Canada, April 2018.

https://hal.inria.fr/hal-01699759
[39] R. Scheibler, D. Di Carlo, A. Deleforge, I. Dokmanić.

Separake: Source Separation with a Little Help From Echoes, in: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, April 2018.

https://hal.inria.fr/hal-01909531
[40] R. Serizel, N. Turpault, H. Eghbal-Zadeh, A. Parag Shah.

Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments, in: Workshop on Detection and Classification of Acoustic Scenes and Events, Woking, United Kingdom, November 2018, https://arxiv.org/abs/1807.10501 - Submitted to DCASE2018 Workshop.

https://hal.inria.fr/hal-01850270
[41] S. Sieranoja, M. Sahidullah, T. Kinnunen, J. Komulainen, A. Hadid.

Audiovisual Synchrony Detection with Optimized Audio Features, in: ICSIP 2018 - 3rd International Conference on Signal and Image Processing, Shenzhen, China, July 2018.

https://hal.inria.fr/hal-01889918
[42] S. Sivasankaran, B. M. L. Srivastava, S. Sitaram, K. Bali, M. Choudhury.

Phone Merging for Code-switched Speech Recognition, in: Third Workshop on Computational Approaches to Linguistic Code-switching, Melbourne, Australia, collocated with ACL 2018, July 2018.

https://hal.inria.fr/hal-01800466
[43] S. Sivasankaran, E. Vincent, D. Fohr.

Keyword-based speaker localization: Localizing a target speaker in a multi-speaker environment, in: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, September 2018.

https://hal.archives-ouvertes.fr/hal-01817519
[45] M. Strauss, P. Mordel, V. Miguet, A. Deleforge.

DREGON: Dataset and Methods for UAV-Embedded Sound Source Localization, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018), Madrid, Spain, October 2018.

https://hal.inria.fr/hal-01854878
[46] L. Terissi, G. Sad, M. Cerda, S. Ouni, R. Galvez, J. B. Gómez, B. Girau, N. Hitschfeld-Kahler.

A French-Spanish Multimodal Speech Communication Corpus Incorporating Acoustic Data, Facial, Hands and Arms Gestures Information, in: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, September 2018.

https://hal.inria.fr/hal-01862585
[47] M. Todisco, H. Delgado, K. A. Lee, M. Sahidullah, N. Evans, T. Kinnunen, J. Yamagishi.

Integrated Presentation Attack Detection and Automatic Speaker Verification: Common Features and Gaussian Back-end Fusion, in: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, ISCA, September 2018. [ DOI : 10.21437/Interspeech.2018-2289 ]

https://hal.inria.fr/hal-01889934
[48] Z. Wang, J. Li, Y. Yan, E. Vincent.

Semi-supervised learning with deep neural networks for relative transfer function inverse regression, in: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Canada, April 2018.

https://hal.inria.fr/hal-01797886
[49] I. Zangar, Z. Mnasri, V. Colotte, D. Jouvet, A. Houidhek.

Duration modeling using DNN for Arabic speech synthesis, in: Speech Prosody 2018 - 9th International Conference on Speech Prosody, Poznań, Poland, June 2018.

https://hal.inria.fr/hal-01889917

National Conferences with Proceedings

[50] N. Libermann, F. Bimbot, E. Vincent.

Exploration de dépendances structurelles mélodiques par réseaux de neurones récurrents, in: JIM 2018 - Journées d'Informatique Musicale, Amiens, France, May 2018, pp. 81-86.

https://hal.archives-ouvertes.fr/hal-01791381

Conferences without Proceedings

[51] M. Vacher, E. Vincent, M.-E. Bobillier Chaumon, T. Joubert, F. Portet, D. Fohr, S. Caffiau, T. Desot.

The VocADom Project: Speech Interaction for Well-being and Reliance Improvement, in: MobileHCI 2018 - 20th International Conference on Human-Computer Interaction with Mobile Devices and Services, Barcelona, Spain, September 2018.

https://hal.archives-ouvertes.fr/hal-01830217

Scientific Books (or Scientific Book chapters)

[52] A. Deleforge, A. Schmidt, W. Kellermann.

Audio-Motor Integration for Robot Audition, in: Multimodal Behavior Analysis in the Wild, Academic Press, November 2018, pp. 1-27.

https://hal.inria.fr/hal-01929388
[53] C. Févotte, E. Vincent, A. Ozerov.

Single-channel audio source separation with NMF: divergences, constraints and algorithms, in: Audio Source Separation, Springer, March 2018.

https://hal.inria.fr/hal-01631185
[54] T. Gerkmann, E. Vincent.

Spectral masking and filtering, in: Audio source separation and speech enhancement, E. Vincent, T. Virtanen, S. Gannot (editors), Wiley, August 2018.

https://hal.inria.fr/hal-01881425
[55] A. A. Nugraha, A. Liutkus, E. Vincent.

Deep neural network based multichannel audio source separation, in: Audio Source Separation, Springer, March 2018.

https://hal.inria.fr/hal-01633858
[56] A. Ozerov, C. Févotte, E. Vincent.

An introduction to multichannel NMF for audio source separation, in: Audio Source Separation, Signals and Communication Technology, Springer, March 2018.

https://hal.inria.fr/hal-01631187
[57] M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K.-A. Lee.

Introduction to Voice Presentation Attack Detection and Recent Advances, in: Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, S. Marcel, M. S. Nixon, J. Fierrez, N. Evans (editors), Advances in Computer Vision and Pattern Recognition, Springer, 2019, pp. 321-361.

https://hal.inria.fr/hal-01974528
[58] A. Tsukanova, B. Elie, Y. Laprie.

Articulatory Speech Synthesis from Static Context-Aware Articulatory Targets, in: Studies on Speech Production, Q. Fang, J. Dang, P. Perrier, J. Wei, L. Wang, N. Yan (editors), Lecture Notes in Computer Science, Springer, 2018, no. 10733, pp. 37-47, Revised Selected Papers of the 11th International Seminar, ISSP 2017, Tianjin, China, October 16-19, 2017. [ DOI : 10.1007/978-3-030-00126-1_4 ]

https://hal.archives-ouvertes.fr/hal-01937950
[59] E. Vincent, S. Gannot, T. Virtanen.

Acoustics - Spatial properties, in: Audio source separation and speech enhancement, E. Vincent, T. Virtanen, S. Gannot (editors), Wiley, August 2018.

https://hal.inria.fr/hal-01881423
[60] E. Vincent, S. Gannot, T. Virtanen.

Introduction, in: Audio source separation and speech enhancement, E. Vincent, T. Virtanen, S. Gannot (editors), Wiley, August 2018.

https://hal.inria.fr/hal-01881422
[61] E. Vincent, T. Virtanen, S. Gannot.

Perspectives, in: Audio source separation and speech enhancement, E. Vincent, T. Virtanen, S. Gannot (editors), Wiley, August 2018.

https://hal.inria.fr/hal-01881424
[62] T. Virtanen, E. Vincent, S. Gannot.

Time-frequency processing - Spectral properties, in: Audio source separation and speech enhancement, E. Vincent, T. Virtanen, S. Gannot (editors), Wiley, August 2018.

https://hal.inria.fr/hal-01881426
[63] M. Zitt, A. Lelu, M. Cadot, G. Cabanac.

Bibliometric delineation of scientific fields, in: Handbook of Science and Technology Indicators, W. Glänzel, H. F. Moed, U. Schmoch, M. Thelwall (editors), Springer International Publishing, 2018. [ DOI : 10.1007/978-3-030-02511-3 ]

https://hal.archives-ouvertes.fr/hal-01942528

Books or Proceedings Editing

[64] E. Vincent, T. Virtanen, S. Gannot (editors)

Audio source separation and speech enhancement, Wiley, August 2018, 504 p. [ DOI : 10.1002/9781119279860 ]

https://hal.inria.fr/hal-01881431

Internal Reports

[65] M. Cadot, A. Lelu, M. Zitt.

Benchmarking seventeen clustering methods on a text dataset: Comparaison empirique de dix-sept méthodes de classification non-supervisée sur un corpus textuel, LORIA, March 2018, French version in supplementary file.

https://hal.archives-ouvertes.fr/hal-01532894

Patents

[66] S. Ouni, G. Gris.

Image processing device, March 2018, no. US2018/0061109 A1.

https://hal.inria.fr/hal-01862639

Other Publications

[67] T. Kinnunen, R. G. Hautamäki, V. Vestman, M. Sahidullah.

Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection, 2018, a slightly shorter version has been submitted to IEEE ICASSP 2019.

https://hal.inria.fr/hal-01937767
[68] L. Perotin, R. Serizel, E. Vincent, A. Guérin.

CRNN-based multiple DoA estimation using Ambisonics acoustic intensity features, July 2018, Submitted to the IEEE Journal of Selected Topics in Signal Processing, Special Issue on Acoustic Source Localization and Tracking in Dynamic Real-life Scenes.

https://hal.inria.fr/hal-01839883

References in notes

[69] L. Sprenger-Charolles, P. Colé, D. Béchennec, A. Kipffer-Piquard.

French normative data on reading and related skills from EVALEC, a new computerized battery of tests (end Grade 1, Grade 2, Grade 3, and Grade 4), in: European Review of Applied Psychology / Revue Européenne de Psychologie Appliquée, 2005, no. 55, pp. 157-186.

https://hal.inria.fr/inria-00184979
