Slide Slam C5 Sandbox Series
Cortical tracking of linguistic units at different speech rates
Ryan Law1, Greta Kaufeld1, Hans Rutger Bosker1, Andrea E. Martin1,2; 1Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands, 2Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, the Netherlands
Language comprehension from speech involves mapping continuous acoustic signals onto discrete linguistic representations. Recent evidence suggests the brain “tracks” higher-level linguistic structures, such as words and phrases, whose boundaries are not injectively present in the acoustic input. Seminal frequency-tagging work by Ding and colleagues (2016) showed tracking of words and phrases in artificial syllable sequences lacking any acoustic cues to higher-level structure. Kaufeld and colleagues (2020) extended this finding to naturally spoken stimuli: at the timescale corresponding to the occurrence of phrases in the input, tracking was enhanced for sentences compared to word lists and prosodic jabberwocky. This pattern suggests that linguistic structure and meaning shape the neural tracking of speech over and above timing, lexical content, and prosodic information alone. Consistent with this observation, recent accounts of cortical speech tracking propose that neural oscillations “entrain” to physical acoustics but “synchronize” with internally generated, abstract linguistic structures (Martin, 2016, 2020; Meyer, Sun, & Martin, 2020), even at different speech rates (Kösem et al., 2018). An open question is whether the tracking differences observed by Kaufeld et al. (2020) are tied to specific frequency bands (e.g., delta for phrases, theta for syllables), or whether they relate to the occurrence of linguistic units independently of the timing of acoustic landmarks. Put differently, does the effect scale with the timing of the input? If the neural signal corresponding to tracking of higher-level structures scales injectively with input speech rate, this would suggest that the tracking readout of higher-level structures is closely tied to the sensory representation of the stimulus.
In contrast, if the timing relationship between input speech rate and the tracking readout of higher-level structures is surjective, it would suggest that the tracking signal reflects the brain’s transformation of sensory input into other coordinate systems (i.e., that of higher-level linguistic structures), where stimulus timing is not as closely tied to neural representation as it is in sensation (Martin, 2020). To this end, we varied the speech rate of the three experimental conditions (sentence, prosodic jabberwocky, word list) from Kaufeld et al. (2020) across a range of rates yielding intelligible (k = 1, 2), less intelligible (k = 3), and unintelligible (k = 4) speech. In an independent sample, we validated the behavioral impact of these speech rates on intelligibility via a transcription task. Twenty-eight participants will passively listen to speech at the different rates during MEG recording and perform an offline transcription task. In sensor space, we will perform a mutual information analysis of the speech signal and the neural response (Keitel & Gross, 2016; Kaufeld et al., 2020) as a function of condition and speech rate. We will also examine effects of structure, meaning, and speech rate on source-localized oscillatory activity within language-related areas. We will then use transfer entropy (Park et al., 2015) to describe information flow within the source-localized networks. Our speech rate manipulation aims to replicate previous findings, to map the neural transduction from sensation to comprehension, and to evaluate whether the tracking readout distinguishes stimulus ‘entrainment’ from intrinsic synchronization (Martin, 2016, 2020; Meyer, Sun, & Martin, 2020).
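For readers unfamiliar with the mutual information approach, the core computation can be sketched as follows. This is a minimal, illustrative Gaussian-copula MI estimator of the kind commonly used in speech-tracking work such as Keitel & Gross (2016); the variable names and the toy signals are hypothetical and do not reflect the authors' actual analysis pipeline.

```python
import numpy as np
from scipy.stats import rankdata, norm

def gcmi_1d(x, y):
    """Gaussian-copula mutual information (in bits) between two 1-D signals.

    Each signal is rank-transformed to a standard Gaussian (the copula
    transform), after which the closed-form MI for bivariate Gaussians
    applies: MI = -0.5 * log2(1 - r^2), where r is the correlation.
    """
    def copnorm(v):
        # map ranks into (0, 1), then through the inverse normal CDF
        return norm.ppf(rankdata(v) / (len(v) + 1))
    cx, cy = copnorm(np.asarray(x)), copnorm(np.asarray(y))
    r = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log2(1.0 - r**2)

# toy example: a "neural" signal that partially follows a speech envelope
rng = np.random.default_rng(0)
envelope = rng.standard_normal(5000)          # stand-in for a speech envelope
neural = 0.6 * envelope + 0.8 * rng.standard_normal(5000)
print(gcmi_1d(envelope, neural))              # larger for stronger coupling
```

In an actual MEG analysis, `envelope` would be replaced by a band-limited speech feature (e.g., the amplitude envelope filtered at the phrase or syllable timescale) and `neural` by the similarly filtered sensor or source time course, with MI compared across conditions and speech rates.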