Presentation

Is face perception necessary for audio-visual speech integration?

Poster D114 in Poster Session D, Wednesday, October 25, 4:45 - 6:30 pm CEST, Espace Vieux-Port
This poster is part of the Sandbox Series.

Enrico Varano¹, Alexis Hervais-Adelman¹; ¹UZH

Understanding spoken language is substantially facilitated by seeing the speaker's face, a particularly salient effect when speech intelligibility is compromised. Factors such as degraded auditory signals, background noise, older age, and hearing impairment often necessitate a greater reliance on audiovisual (AV) cues for comprehension. The neural mechanisms underlying this phenomenon are thought to involve multi-stage feedback between the visual and auditory processing streams, but, despite ongoing research, a gap remains in our understanding of the precise neural pathways that contribute to successful AV integration. This Sandbox Series study aims to elucidate the relationship between the effects of orofacial articulatory signals and temporal signals on the comprehension of degraded audiovisual speech. Evidence from the literature and our previous findings suggests that facial context significantly impacts speech comprehension. Specifically, cartoon mouths based on facial landmarks were found to improve speech comprehension only when presented within the context of a cartoon face. This implies that recognizing dynamic visual signals as belonging to a speaker's face is crucial for audiovisual integration. Our proposed research entails a systematic exploration of the feature space indexed by three parameters. Firstly, given several conflicting results in the literature regarding the viability of simple, speech-envelope-driven visual features for improving speech-in-noise comprehension, the AV comprehension gain conferred by a cartoon face will be assessed for five different drivers of mouth movement. Secondly, the putative role of face-sensitive neural circuitry in mediating audiovisual integration will be addressed by repeating the previously defined conditions with diffeomorphed versions of each cartoon type; such controls preserve local motion cues but lack the dynamic configural information of a face. Lastly, to differentiate between effects driven by separable and non-separable speech degradation, this matrix-sentence behavioural experiment will be repeated with both a noise masker and noise-vocoded speech. This research is anticipated to enhance our understanding of sensory integration mechanisms, with potential implications for clinical interventions and for our understanding of the evolution of language perception. Further, these results are expected to clarify conflicting evidence regarding the influence of acoustic-energy visualisations on AV gain. If a difference in comprehension gain between the original and diffeomorphed videos were found, the authors plan to investigate the putative integration-gating role of loci such as the fusiform face area (FFA), superior temporal sulcus (STS), and anterior cingulate cortex (ACC) in a follow-up M/EEG study.
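
As a purely illustrative sketch (not part of the study materials), the three-parameter design described above can be enumerated as a 5 × 2 × 2 crossing; the driver labels below are placeholders, since the abstract does not name the five mouth-movement drivers:

```python
from itertools import product

# Placeholder labels: the abstract specifies five mouth-movement drivers
# but does not name them, so generic identifiers are used here.
mouth_drivers = [f"driver_{i}" for i in range(1, 6)]

# Original cartoon faces vs. diffeomorphed controls (local motion preserved,
# dynamic configural face information removed).
face_versions = ["original", "diffeomorphed"]

# Separable vs. non-separable speech degradation.
degradations = ["noise_masker", "noise_vocoded"]

# Full crossing of the three factors: 5 x 2 x 2 = 20 conditions.
conditions = list(product(mouth_drivers, face_versions, degradations))
print(len(conditions))  # 20
for driver, face, degradation in conditions:
    print(driver, face, degradation)
```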

Topic Areas: Multisensory or Sensorimotor Integration, Speech Perception
