Poster E69, Thursday, August 22, 2019, 3:45 – 5:30 pm, Restaurant Hall

Prediction of continuous speech from intracranial cortical recordings

Gaël Le Godais1,2,4,5, Philémon Roussel1,2, Florent Bocquelet1,2, Marc Aubert1,2, Thomas Hueber4,5, Philippe Kahane3, Stéphan Chabardès3, Blaise Yvert1,2;1Inserm, Braintech Lab, 2University Grenoble Alpes, Braintech Lab, 3CHU Grenoble Alpes, 4CNRS, Gipsa Lab, 5University Grenoble Alpes, Gipsa Lab

Intracranial brain-computer interfaces (BCIs) could restore continuous speech in paralyzed patients who have lost the ability to speak. A straightforward design for a speech BCI is a regression model that maps brain activity recordings directly to a mathematical representation of acoustic speech. An indirect design instead maps brain activity to a model of the speech articulators and then uses an articulatory-to-speech synthesizer to infer the corresponding sound. Here, we compare both approaches in an offline setting. We used electrocorticographic recordings from two French-speaking patients: one undergoing brain surgery (acute recordings) and one epileptic patient (subchronic recordings). In both cases, brain activity and speech were recorded while the participant read or repeated short sentences. These recordings were aligned, using dynamic time warping, to a previously recorded acoustic and electromagnetic articulography corpus of a French speaker containing the same sentences. We used linear models and neural networks to 1) directly predict mel-generalized coefficients from spectrogram features of brain activity; and 2) predict articulatory trajectories from these same features, which were then converted to mel-generalized coefficients by a deep neural network trained on the acoustic-articulatory corpus. Correlations between 0.2 and 0.4 were obtained between predicted and ground-truth mel-generalized coefficients, allowing the direct and indirect predictions to be compared.
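As a minimal illustration of the "direct" design, the sketch below fits a ridge regression from frame-wise neural features to mel coefficients and scores it with per-coefficient Pearson correlation, the evaluation used above. All shapes, feature counts, the synthetic data, and the ridge penalty are illustrative assumptions, not values or code from the study.

```python
import numpy as np

# Hypothetical setup: n_frames time frames, n_neural spectrogram features of
# brain activity per frame, n_mel mel coefficients to predict per frame.
rng = np.random.default_rng(0)
n_frames, n_neural, n_mel = 500, 64, 25

X = rng.standard_normal((n_frames, n_neural))                 # neural features
W_true = rng.standard_normal((n_neural, n_mel))               # synthetic mapping
Y = X @ W_true + 0.5 * rng.standard_normal((n_frames, n_mel)) # noisy mel targets

# Train/test split along time.
X_tr, X_te = X[:400], X[400:]
Y_tr, Y_te = Y[:400], Y[400:]

# Ridge regression in closed form: W = (X'X + lam*I)^-1 X'Y.
lam = 1.0
W = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(n_neural), X_tr.T @ Y_tr)
Y_hat = X_te @ W

def corr(a, b):
    """Pearson correlation per column (i.e. per mel coefficient)."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / (
        np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    )

r = corr(Y_hat, Y_te)
print(f"mean correlation over {n_mel} mel coefficients: {r.mean():.2f}")
```

The indirect design would insert an intermediate step, predicting articulatory trajectories first and mapping them to mel coefficients with a separately trained network, but the correlation-based evaluation is the same.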

Themes: Computational Approaches, Speech Motor Control
Method: Electrophysiology (MEG/EEG/ECOG)
