You are viewing the SNL 2017 Archive Website.

Poster D62, Thursday, November 9, 6:15 – 7:30 pm, Harborview and Loch Raven Ballrooms

Tracking phoneme processing during continuous speech perception with MEG

Christian Brodbeck¹, Jonathan Z. Simon¹; ¹University of Maryland, College Park

During speech comprehension, phonemes incrementally provide information about the words making up the linguistic message. Phonemes can thus be analyzed in terms of the information they convey, using measures such as phoneme surprisal and entropy. Previous research has used these measures to study the processing of single-word stimuli. However, behavioral measures require overt responses, which interrupt continuous processing, and electrophysiological measures have so far been limited by analysis techniques restricted to single-word stimuli. This limits the naturalness of the experimental paradigms, and hence the potential generality of the results. We demonstrate a new technique for estimating responses to phonemes in continuous speech from source-localized magnetoencephalography (MEG) data. We estimated subject-specific linear kernels to predict MEG responses from multiple concurrent predictor variables, allowing us to exploit the high temporal resolution of MEG to track brain responses to phonemes in continuous speech. Each predictor variable was evaluated by whether it significantly improved the model fit compared to a permuted version of itself, while controlling for contributions from the other variables. Model improvements, as well as the response functions, were tested for significant regions in anatomical space and response time, controlling for multiple comparisons with permutation tests. We analyzed MEG data from 17 participants listening to 3 repetitions each of 2 one-minute audiobook segments. We created continuous predictor variables for acoustic power and for phoneme information content based on a full listing model. Phoneme entropy is highly correlated with the current size of the cohort; to account for this, we also computed covariates reflecting the current cohort size and the number of competitors that the current phoneme removes from the cohort.
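The cohort-based information measures can be sketched as follows. This is a minimal toy model: the lexicon, its phoneme transcriptions, the frequency values, and the frequency-weighted probability estimates are all illustrative assumptions, not the authors' actual lexicon or implementation.

```python
import math

# Hypothetical toy lexicon: phoneme sequence -> frequency count (illustrative only)
LEXICON = {
    ("t", "ey", "b", "ax", "l"): 50,   # "table"
    ("t", "ey", "k"): 120,             # "take"
    ("t", "ih", "p"): 30,              # "tip"
    ("d", "ow", "g"): 80,              # "dog"
}

def cohort(prefix):
    """Words still consistent with the phoneme prefix heard so far."""
    return {w: f for w, f in LEXICON.items() if w[:len(prefix)] == tuple(prefix)}

def phoneme_measures(phonemes):
    """Per-phoneme surprisal (bits), cohort entropy (bits), cohort size,
    and cohort reduction (number of competitors removed by this phoneme)."""
    results = []
    prefix = []
    for ph in phonemes:
        before = cohort(prefix)
        prefix.append(ph)
        after = cohort(prefix)
        # Surprisal: -log2 of the frequency-weighted probability of this phoneme
        p = sum(after.values()) / sum(before.values())
        surprisal = -math.log2(p)
        # Entropy over the updated cohort (frequency-weighted word probabilities)
        mass = sum(after.values())
        entropy = -sum((f / mass) * math.log2(f / mass) for f in after.values())
        results.append({
            "phoneme": ph,
            "surprisal": surprisal,
            "entropy": entropy,
            "cohort_size": len(after),
            "cohort_reduction": len(before) - len(after),
        })
    return results

for m in phoneme_measures(["t", "ey", "k"]):
    print(m)
```

In this sketch the final /k/ of "take" leaves a single-word cohort, so its entropy drops to zero while cohort reduction registers the competitor it eliminated; this is the distinction between cohort-size covariates and cohort reduction that the analysis controls for.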
Results for a model containing only phoneme surprisal and entropy indicated significant contributions from both variables, in addition to acoustic power. However, when the cohort-size covariates were included, cohort reduction emerged as the most relevant variable: its contribution remained significant after controlling for cohort size and entropy, neither of which was significant when controlling for cohort reduction. The effect of phoneme surprisal also remained significant. All responses were centered on the superior temporal gyrus, suggesting sources in or close to auditory cortex. The time course of the estimated kernels can inform models of speech perception: cohort reduction was associated with an early effect, around 70 ms after phoneme onset, whereas a robust response to phoneme surprisal started around 200 ms. This could suggest that the cohort is updated quickly upon perception of new information, while updating of probabilistic expectations is a slower process. Our results demonstrate the feasibility of analyzing brain responses related to the information content of individual phonemes in continuous speech, opening the possibility of contrasting predictions from different models of speech perception in a more natural setting than was hitherto possible. Crucially, the method we demonstrate takes full advantage of the online nature of MEG measurements, which can record brain responses to continuous speech without interrupting the comprehension process by requiring behavioral responses.
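The idea of estimating a linear kernel (temporal response function) relating a continuous predictor to the recorded response can be sketched with simulated data. The sampling rate, lag range, noise level, and ridge-regression estimator below are illustrative assumptions for the sketch, not the study's actual settings or fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins for the real data (all parameters are assumptions)
fs = 100                  # sampling rate in Hz
n_samples = fs * 60       # one minute of signal
n_lags = 40               # kernel spans 0-400 ms after predictor events
predictor = rng.standard_normal(n_samples)        # e.g. a phoneme-level predictor
true_kernel = np.exp(-np.arange(n_lags) / 10.0)   # ground-truth response function
response = np.convolve(predictor, true_kernel)[:n_samples]
response += 0.5 * rng.standard_normal(n_samples)  # measurement noise

# Time-lagged design matrix: column j holds the predictor delayed by j samples
X = np.zeros((n_samples, n_lags))
for j in range(n_lags):
    X[j:, j] = predictor[:n_samples - j]

# Ridge-regularized least squares: kernel = (X'X + lam*I)^(-1) X'y
lam = 1.0
kernel = np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ response)
```

With multiple concurrent predictors, the lagged columns for each predictor are simply concatenated into one design matrix, so each kernel is estimated while controlling for the others; the permutation test described above then compares the model fit against fits obtained with a shuffled version of one predictor.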

Topic Area: Perception: Speech Perception and Audiovisual Integration
