Research Scientist / Chargé de Recherche
Inria Nancy – Grand Est
Lecture title: “Taking the Best of Physics and Machine Learning in Robot Audition”.
In the era of deep learning, building autonomous systems that perform highly specific tasks in the harsh conditions of the physical world may seem incompatible with the ever-growing need for large volumes of cleanly annotated training data. How can these two worlds be reconciled? In this lecture, we will shed light on this question through the study of several concrete applications in the exciting field of robot audition. We will show how both physical modelling and machine learning can be leveraged and even combined to tackle some of the key challenges in this field.
Robot audition has received growing research interest over the past two decades, sparked by the need for robots that can naturally interact with humans via speech. This includes identifying who talks to whom and when, and recognizing speech in real-world conditions. While these high-level goals include many conventional audio signal processing tasks, robot audition also comes with unique challenges and opportunities: How to handle the noise and the possibly fast movements created by the robot itself? How to leverage motor control? How to fuse information from different modalities? We will introduce methodological foundations for these questions that stem from both physics and machine learning, address the topic of building and leveraging datasets, and illustrate the lecture with research results on different robotic platforms, from social robots to rescue drones.
Antoine Deleforge has been a tenured research scientist at Inria since January 2016. He started in the PANAMA research group (Rennes, France) before moving to his current team, MULTISPEECH (Nancy, France), in April 2018. His research lies at the interface between statistical machine learning, acoustics and audio signal processing, with main applications to auditory scene analysis and robot audition. He received the engineering B.Sc. (2008) and M.Sc. (2010) degrees in computer science and mathematics from the school Ensimag (Grenoble, France), as well as the specialized M.Sc. research degree in computer graphics, vision and robotics from the Université Joseph Fourier (Grenoble, France). In November 2013, he received the Ph.D. degree in computer science and applied mathematics from the University of Grenoble (France). His thesis was awarded the GRETSI-EEA-ISIS French PhD prize in signal, image and vision in 2014. He was employed as a postdoctoral fellow (2014-2015) at the chair of Multimedia Communications and Signal Processing of the Friedrich-Alexander-University (Erlangen, Germany). He co-chaired and co-organized numerous special sessions on various aspects of audio scene analysis and signal processing at the international conferences ICASSP (2016, 2018) and LVA/ICA (2017, 2018). He was the main organizer of the 2019 IEEE Signal Processing Cup on Search & Rescue with Drone-Embedded Sound Source Localization. He serves as an elected member of the IEEE technical committee on Audio and Acoustic Signal Processing, which he also represents as a member of the IEEE Autonomous Systems Initiative.