Poster E15, Saturday, August 18, 3:00 – 4:45 pm, Room 2000AB
Human cortical encoding of a discrete temporal landmark for processing syllables in continuous speech
Yulia Oganian¹, Edward F. Chang¹; ¹University of California, San Francisco
A crucial component of the speech signal is its slow amplitude envelope (4-16 Hz), and speech comprehension is severely impaired if that temporal envelope is smeared or reduced. Numerous electrophysiological studies have found that brain activity is correlated with the amplitude envelope of speech and that this neural envelope tracking deteriorates when speech comprehension is impaired. A common assumption is that the auditory cortex, in particular speech cortex in the superior temporal gyrus (STG), encodes an analog, continuous representation of the envelope. However, not all periods in the speech signal are equally informative: discrete landmarks such as amplitude peaks and peaks in the rate of amplitude change mark high-intensity periods in speech, i.e., stressed syllables, which are most informative for speech comprehension. STG might rely on them to extract information from the speech signal or to segment continuous speech into syllabic units. To test this, we directly recorded neuronal responses from the surface of STG using electrocorticography while participants (n = 26, 13 right hemispheric) listened to continuous speech. Neural populations in bilateral mid-STG represented the speech envelope. We found that neural responses reflected consecutive evoked responses triggered by local peaks in the rate of change of the amplitude envelope (peakRate), but not by local peaks in the envelope or its continuous shape. Notably, encoding of peakRate events in mid-STG was anatomically and functionally dissociated from speech onset tracking in more posterior STG and from phonetic feature encoding in mid-to-anterior STG. To address the role of the envelope in speech comprehension, we analyzed the spectral and phonetic content of speech around peakRate events. We found that peakRate events indicated the timing of consonant-vowel transitions and that the magnitude of peakRate predicted whether a syllable was stressed.
Encoding of peakRate events can thus provide an internal reference point to the temporal structure of a syllable and indicate its prominence within a sentence. A follow-up experiment (n = 8, 5 right hemispheric) revealed that STG tracks peakRate not only in speech, but also in amplitude-modulated tones, where peakRate events appear in isolation without concurrent spectral changes. By parametrically varying the rate of amplitude change, we found that neural response magnitude monotonically encoded the rate of amplitude change, thus differentiating between stressed and unstressed syllables in speech. Strikingly, we found that two distinct neural populations detected peakRate events at sound onsets and in ongoing sounds: neural populations that represented sound onset dynamics (in speech and tones) did not respond to amplitude modulations of ongoing sounds. Neural populations that encoded peakRate in ongoing speech or tones, however, did not discriminate between peakRates at stimulus onset. Overall, our results demonstrate that the representation of the speech envelope in STG emerges from its sensitivity to peaks in the rate of amplitude change in the acoustic signal. Peaks in the rate of change of the envelope function as temporal landmarks, the detection of which cues neural processing towards prominent syllables in the speech signal.
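As a rough illustration of the peakRate landmark described above, the following sketch marks local maxima in the positive rate of change of an amplitude envelope and reports their times and magnitudes. It is a minimal example under stated assumptions, not the analysis pipeline used in this study: the function names `peak_rate_events` and `demo_envelope`, the first-difference estimate of the derivative, and the synthetic 5 Hz envelope are all hypothetical choices made for illustration.

```python
import math


def demo_envelope(fs=1000, mod_hz=5.0, duration_s=1.0):
    """Synthetic amplitude envelope (assumption): a raised cosine at mod_hz."""
    n_samples = int(fs * duration_s)
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * mod_hz * n / fs))
            for n in range(n_samples)]


def peak_rate_events(envelope, fs):
    """Return (time_s, magnitude) for each local peak in the rising envelope slope.

    The rate of change is approximated by a first difference scaled by the
    sampling rate fs, so magnitudes are in envelope units per second.
    """
    # rate[i] approximates the slope halfway between samples i and i + 1
    rate = [(envelope[i + 1] - envelope[i]) * fs
            for i in range(len(envelope) - 1)]
    events = []
    for i in range(1, len(rate) - 1):
        is_local_max = rate[i] > rate[i - 1] and rate[i] >= rate[i + 1]
        # Keep only peaks while the envelope is rising (positive slope)
        if is_local_max and rate[i] > 0.0:
            events.append(((i + 0.5) / fs, rate[i]))
    return events


if __name__ == "__main__":
    fs = 1000
    for time_s, magnitude in peak_rate_events(demo_envelope(fs), fs):
        print(f"peakRate at {time_s:.4f} s, magnitude {magnitude:.2f} per s")
```

In this toy envelope each peakRate event falls a quarter cycle before the corresponding envelope peak, which mirrors the distinction the abstract draws between peaks in the envelope and peaks in its rate of change. Real speech envelopes would typically need band-limiting or smoothing before differentiation; that step is omitted here.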
Topic Area: Perception: Auditory