
Poster B61, Tuesday, August 20, 2019, 3:15 – 5:00 pm, Restaurant Hall

Hang on the lips: The listeners’ brain entrains to the theta rhythms conveyed by lip and auditory streams during naturalistic multimodal speech perception.

Emmanuel Biau, Danying Wang, Hyojin Park, Ole Jensen, Simon Hanslmayr; School of Psychology, Centre for Human Brain Health (CHBH), University of Birmingham

In this electroencephalogram (EEG) study, we investigated whether neural oscillatory responses induced in sensory areas during natural audiovisual speech perception reflect the predominant theta activity (4-8 Hz) that aligns lip and auditory streams. Using short segments taken from real interviews, we calculated the mutual information (MI) between speakers' lip movements and the speech envelope to assess instantaneous phase dependence and establish their alignment at theta rhythms. We selected audiovisual clips in which the MI peaks were dominant in the theta band of interest and created two conditions: a synchronous condition, in which the natural alignment between video and audio onsets was kept intact, and an asynchronous condition, in which video and audio onsets were shifted by 180 degrees of theta phase. Participants were presented with the audiovisual clips in an asynchrony detection task while their EEG was recorded. After each clip, participants had to indicate whether video and sound were synchronous or asynchronous, based on the speech information. They were also presented with two additional unimodal conditions (silent videos and sounds only) in order to establish the locations of the neural responses induced by lip movements and by the auditory signal. The quality of lip-speech multimodal integration was assessed by asynchrony detection performance (d-prime). At the scalp level, we calculated the MI between single-trial EEG and the corresponding lip movements or auditory information to quantify the neural entrainment induced by the dominant theta activity in each modality separately. We expected an increase of MI in the auditory cortex to reflect neural entrainment to the auditory signal, and in the occipital cortex to reflect lip-based visual entrainment. Behavioral results revealed d-prime scores significantly above chance, showing that participants were able to determine whether lips and utterance were synchronous or asynchronous in the clips (although they were significantly more accurate in the synchronous condition). EEG analysis of the sound-only trials revealed a clear increase of theta-band power with a central topography, in line with the auditory speech processing literature. In contrast, lip movements in silent videos induced an increase of theta power over occipital areas, although this effect was less salient. Further, greater MI between EEG epochs and auditory trials was observed in central regions compared to MI computed with mismatched data, confirming neural entrainment to the dominant theta activity in the auditory speech signal. Greater MI between EEG epochs and video trials was observed in occipital areas compared to mismatched data, suggesting that the visual cortex entrained to the theta activity conveyed by lip movements. Interestingly, we also found localized MI in the central area, overlapping with the topography observed in the auditory modality, suggesting that lip movements may also entrain the auditory cortex even when participants watched silent videos. Both the behavioral and the neurophysiological results suggest, first, that listeners match visual and auditory streams based on speech features that convey information at theta rates. Second, the alignment of lip movements and voice modulations onto dominant theta rhythms may tune specialized sensory areas together, as hypothesized in previous audiovisual studies, and facilitate later multimodal binding during natural speech processing.
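As a rough illustration of the phase-coupling analysis described above, the Python sketch below band-passes two signals in the 4-8 Hz theta range, extracts their instantaneous phases with the Hilbert transform, and estimates the mutual information between the two phase series from a binned joint histogram. The function names, sampling rate, and bin count are hypothetical choices for illustration, not the authors' actual pipeline; under these assumptions the same estimate could be applied to lip aperture versus speech envelope when selecting clips, or to single-trial EEG versus either stimulus stream.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_phase(signal, fs, band=(4.0, 8.0), order=4):
    # Band-pass the signal in the theta range and return its instantaneous phase.
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    return np.angle(hilbert(filtfilt(b, a, signal)))

def phase_mutual_information(phase_x, phase_y, n_bins=18):
    # Estimate MI (in bits) between two phase series from a binned joint histogram.
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    joint, _, _ = np.histogram2d(phase_x, phase_y, bins=[edges, edges])
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal distribution of phase_x
    py = joint.sum(axis=0, keepdims=True)   # marginal distribution of phase_y
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))

# Hypothetical usage: lip_aperture and speech_envelope resampled to a common rate fs.
# fs = 250.0
# mi = phase_mutual_information(theta_phase(lip_aperture, fs),
#                               theta_phase(speech_envelope, fs))

Under the same assumptions, the asynchronous condition could be approximated by delaying the audio onset by half a theta cycle (roughly 60-125 ms for 4-8 Hz); the abstract only specifies the 180-degree phase shift, so the exact manipulation used in the study may differ.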

Themes: Perception: Speech Perception and Audiovisual Integration, Multisensory or Sensorimotor Integration
Method: Electrophysiology (MEG/EEG/ECOG)
