Poster B8, Thursday, August 16, 3:05 – 4:50 pm, Room 2000AB
Neural coding schemes for lexically-driven prediction in superior temporal cortex
Ediz Sohoglu¹, Matthew Davis¹; ¹University of Cambridge
We use fMRI and representational similarity analysis to determine how lexical predictions are combined with speech input in superior temporal cortex. Is expected speech content enhanced to leave a ‘sharpened’ version of the sensory input (McClelland and Elman, 1986)? Or is expected content subtracted away to leave only those parts that are unexpected, i.e. ‘prediction error’ (Rao and Ballard, 1999)? Recent work suggests that for degraded speech, multivoxel patterns in posterior superior temporal cortex are best explained by prediction error representations (Blank and Davis, 2016). However, that study used an artificial listening situation in which speech was highly distorted and strong predictions were obtained from prior written text. In the current work we apply similar multivoxel pattern analysis methods to a more naturalistic listening situation in which speech is clearly presented and predictions are obtained directly from the speech signal itself (i.e. from lexical content). Listeners (N=21) heard 64 bisyllabic words in which the second (offset) syllable was strongly or weakly predicted by the first syllable based on long-term lexical knowledge, e.g. items like “Cac-tus” where “tus” is the only syllable that follows “Cac” (Strong prediction) and items like “Be-ta” where “ta” is amongst many syllables that follow “Be” in English (Weak prediction). By cross-splicing between items, we also created 64 non-words in which the second syllable mismatched listeners’ predictions (e.g. “Cac-ta”). To maintain attention, listeners performed an incidental task, pressing a button whenever they heard a brief pause within the spoken word. We used a sparse event-related fMRI sequence and the cross-validated Mahalanobis distance (Walther et al., 2016) to test how the representational fidelity of offset-syllables differed as a function of prior strength (Strong/Weak) and congruency (Matching/Mismatching).
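As an illustration of the distance measure named above, the following is a minimal sketch of a cross-validated Mahalanobis ("crossnobis") estimate between two conditions. It is not the authors' pipeline: the function name `crossnobis`, the per-run pattern lists, and the assumption that an inverse noise covariance (`precision`) has already been estimated are all hypothetical choices made for this example.

```python
from itertools import permutations


def crossnobis(runs_a, runs_b, precision):
    """Cross-validated Mahalanobis distance between two conditions.

    runs_a, runs_b: one voxel pattern (list of floats) per run and condition.
    precision: inverse noise covariance, n_voxels x n_voxels nested list
               (assumed to be estimated elsewhere, e.g. from GLM residuals).
    """
    n_vox = len(precision)
    # Pattern difference between the two conditions, separately for each run.
    diffs = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(runs_a, runs_b)]

    def quad(u, v):
        # Quadratic form u' * precision * v.
        return sum(u[i] * precision[i][j] * v[j]
                   for i in range(n_vox) for j in range(n_vox))

    # Only combine differences from *different* runs, so that run-specific
    # noise does not contribute a positive bias to the estimate.
    pairs = list(permutations(range(len(diffs)), 2))
    return sum(quad(diffs[i], diffs[j]) for i, j in pairs) / len(pairs)


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
# Consistent difference across runs gives a positive distance.
consistent = crossnobis([[1.0, 0.0], [1.0, 0.0]], zeros, identity)
# A difference that flips sign across runs gives a negative estimate.
inconsistent = crossnobis([[1.0, 0.0], [-1.0, 0.0]], zeros, identity)
```

Because the two difference vectors in each product come from independent runs, the estimate is centred on zero when the conditions evoke the same pattern, and it can be negative by chance. Some implementations additionally divide by the number of voxels; that normalisation is omitted here.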
If multivoxel patterns in superior temporal cortex represent a sharpened version of the sensory input, prior strength and congruency should have additive effects on representational fidelity, whereas if multivoxel patterns represent prediction error, these two manipulations should interact (Blank and Davis, 2016). In a searchlight analysis, offset-syllables could be reliably distinguished in Heschl’s Gyrus bilaterally (HG; p < .05 FWE corrected within a bilateral temporal lobe search volume). Using the resulting left and right HG clusters as functional regions of interest (ROIs), we observed a significant interaction between prior strength and congruency in the right HG ROI such that the representational fidelity of offset-syllables decreased with increasing prior strength when predictions were matching but increased when mismatching (p < .05 FWE small volume corrected). However, neither a significant interaction nor any main effects were observed in left HG. These preliminary findings suggest that prediction strength and congruency have interactive effects on multivoxel response patterns, consistent with auditory prediction error representations of speech in superior temporal cortex.
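The additive-versus-interactive logic described above can be made concrete with a 2 x 2 contrast over fidelity estimates. The sketch below is only an illustration: the function name `two_by_two_effects`, the dictionary keys, and the cell values are hypothetical and are not data from this study.

```python
def two_by_two_effects(fidelity):
    """Main effects and interaction for a prior strength x congruency design.

    fidelity: dict mapping (strength, congruency) to a fidelity estimate,
              with strength in {"strong", "weak"} and
              congruency in {"match", "mismatch"}.
    """
    sm = fidelity[("strong", "match")]
    wm = fidelity[("weak", "match")]
    sx = fidelity[("strong", "mismatch")]
    wx = fidelity[("weak", "mismatch")]
    return {
        # Strong minus weak, averaged over congruency.
        "strength": ((sm + sx) - (wm + wx)) / 2,
        # Matching minus mismatching, averaged over prior strength.
        "congruency": ((sm + wm) - (sx + wx)) / 2,
        # Effect of prior strength when mismatching minus when matching.
        "interaction": (sx - wx) - (sm - wm),
    }


# Hypothetical additive pattern: each factor shifts fidelity independently.
additive = two_by_two_effects({
    ("strong", "match"): 4.0, ("weak", "match"): 3.0,
    ("strong", "mismatch"): 2.0, ("weak", "mismatch"): 1.0,
})

# Hypothetical crossover pattern: stronger priors lower fidelity when
# matching but raise it when mismatching.
crossover = two_by_two_effects({
    ("strong", "match"): 1.0, ("weak", "match"): 2.0,
    ("strong", "mismatch"): 3.0, ("weak", "mismatch"): 2.0,
})
```

In the additive case the interaction term is zero while both main effects are non-zero. In the crossover case the interaction term is positive, which is the direction of the effect reported for the right HG ROI.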
Topic Area: Perception: Speech Perception and Audiovisual Integration