Speech production is a complex motor process involving several physiological phenomena, such as the neural and muscular activities that drive our respiratory, laryngeal and articulatory movements. Modeling speech production, and in particular the relationship between articulatory gestures (tongue, lips, jaw, velum) and the acoustic realizations of speech, is a challenging and still evolving research question. From an applied point of view, such models could be embedded into assistive devices that restore oral communication when part of the speech production chain is damaged (articulatory synthesis). They could also help rehabilitate speech sound disorders through biofeedback-based therapy (relying on articulatory inversion). From a more fundamental research perspective, such models can be used to investigate the cognitive mechanisms underlying speech perception and motor control. In this talk, I will present different studies conducted in our group that aim at learning acoustic-articulatory models from real-world data using machine learning (deep learning, but not only). First, I will focus on different attempts to adapt a direct or inverse model, pre-trained on a reference speaker, to any new speaker. Then, I will present recent work on the integration of articulatory priors into the latent space of a variational auto-encoder, with potential application to speech enhancement. Finally, I will describe a recent line of research that studies, through modeling and simulation, how a child learns the acoustic-to-articulatory inverse mapping in a self-supervised manner when repeating auditory-only speech stimuli.
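To make the variational auto-encoder idea concrete, here is a minimal sketch of one way an articulatory prior can be injected into a VAE latent space: instead of regularizing the approximate posterior toward a standard normal, the KL term pulls it toward a Gaussian prior conditioned on articulatory features. This is an illustrative assumption, not the actual model presented in the talk; all dimensions, layer sizes and names (e.g. `ArticulatoryPriorVAE`, `n_articulatory`) are hypothetical.

```python
# Minimal sketch (PyTorch): a VAE whose latent prior is conditioned on
# articulatory features instead of the usual N(0, I). Purely illustrative;
# not the architecture described in the talk.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArticulatoryPriorVAE(nn.Module):
    def __init__(self, n_acoustic=80, n_articulatory=12, n_latent=16):
        super().__init__()
        # Acoustic encoder: parameters of q(z | x)
        self.enc = nn.Sequential(nn.Linear(n_acoustic, 128), nn.ReLU())
        self.enc_mu = nn.Linear(128, n_latent)
        self.enc_logvar = nn.Linear(128, n_latent)
        # Articulatory prior network: parameters of p(z | a), replacing N(0, I)
        self.prior_mu = nn.Linear(n_articulatory, n_latent)
        self.prior_logvar = nn.Linear(n_articulatory, n_latent)
        # Decoder: p(x | z)
        self.dec = nn.Sequential(nn.Linear(n_latent, 128), nn.ReLU(),
                                 nn.Linear(128, n_acoustic))

    def forward(self, x, a):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_hat = self.dec(z)
        # Closed-form KL between diagonal Gaussians q(z|x) and p(z|a)
        p_mu, p_logvar = self.prior_mu(a), self.prior_logvar(a)
        kl = 0.5 * (p_logvar - logvar
                    + (logvar.exp() + (mu - p_mu).pow(2)) / p_logvar.exp()
                    - 1).sum(dim=-1)
        recon = F.mse_loss(x_hat, x, reduction="none").sum(dim=-1)
        return (recon + kl).mean()

# Usage: one training step on random stand-in data
model = ArticulatoryPriorVAE()
x = torch.randn(32, 80)   # batch of acoustic frames (e.g. mel-spectrogram)
a = torch.randn(32, 12)   # matching articulatory features (e.g. EMA-style)
loss = model(x, a)
loss.backward()
```

The design choice illustrated here is that the articulatory data shapes the latent space only through the prior, so at test time the decoder can still be driven from acoustics alone, which is what makes such a model attractive for downstream tasks like speech enhancement.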