You are viewing the SNL 2017 Archive Website. For the latest information, see the Current Website.

Poster B76, Wednesday, November 8, 3:00 – 4:15 pm, Harborview and Loch Raven Ballrooms

N400 modulated by word onset duration but not information content during spoken word recognition

Jonathan Brennan¹, Emma Saraff¹, Max Cantor², Dave Embick³; ¹University of Michigan, ²University of Colorado, ³University of Pennsylvania

Computational models of spoken word recognition emphasize the role that word onsets play in guiding lexical activation rapidly and incrementally. These models predict, correctly, that overall reaction times will be faster for more informative onsets. They also predict that initial word activation will vary as a function of either the first phoneme's duration (e.g. COHORT) or its information content (TRACE), but these finer-grained predictions have not been tested. The amplitude of the N400 event-related potential (ERP) component is sensitive to word frequency and other factors that affect the ease of word recognition. For spoken words, the N400 begins about 200–300 ms after stimulus onset, notably before word offset and consistent with the timing of early stages of lexical activation. Phonological cues to lexical identity that arrive later produce N400 latency effects. We examine the ERP response to time-dilated words to test, first, whether the N400 is modulated by word duration when semantic content is held constant. We then test whether N400 latency is sensitive to the duration of the first phoneme or to its information content, quantified as surprisal.

Methods: N = 17 participants listened to spoken words while electroencephalography (EEG) data were recorded. Stimuli were 100 high-frequency monosyllabic target nouns (mean length: 492 ms) and 68 filler words. Target words were dilated to 80% ("compressed") or 120% ("expanded") of their original duration using the pitch-preserving PSOLA algorithm (Fig. 1A). Dilated and non-dilated stimuli were presented at 45 dB HTL (ISI 900–1100 ms) in 12-item mini-blocks to avoid mixing speech rates. Non-dilated stimuli were binned according to first-phoneme characteristics: (1) by a median split of first-phoneme duration (Fig. 1B), or (2) by a median split of first-phoneme surprisal, -log2 Pr(phoneme | word boundary), with probabilities estimated from the English Lexicon Project (Fig. 1C). These bins minimally overlap: r(duration, surprisal) = -0.13. Participants made a semantic judgment on 16% of trials. EEG data were recorded at 500 Hz from 61 active electrodes. Epochs spanning -300 to 1000 ms around word onset were re-referenced to linked mastoids, cleaned of artifacts with visual inspection and ICA, band-pass filtered from 0.5–40 Hz, and baseline-corrected. A non-parametric statistical analysis was conducted across all electrodes from 0–800 ms.

Results: Time dilation significantly modulates the N400 such that "compressed" words show an earlier negativity on central electrodes (Fig. 1D, 264–432 ms, p < 0.05). The effect is predominant on the leading edge of the deflection: the latency difference at the 25th percentile is 46 ms. Among non-dilated words, those with shorter initial phonemes also modulate the leading edge of the N400 (Fig. 1E, 258–480 ms, p < 0.05); the latency difference at the 25th percentile is 52 ms. There are no significant effects of first-phoneme surprisal (Fig. 1F). These data indicate that the N400 is modulated by the speed with which lexical information unfolds. We extend prior work to show sensitivity to first-phoneme duration, consistent with the COHORT model, but not to first-phoneme information content, contra TRACE.
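As an illustration of the surprisal-based binning described in the Methods, the following is a minimal sketch, not the authors' analysis code. It computes -log2 Pr(phoneme | word boundary) from word-initial phoneme counts and then applies a median split across words. The toy lexicon, its weights, the function names, and the choice to send ties at the median to the "low" bin are all assumptions made for the example; the abstract does not say whether probabilities were estimated from type or token counts.

```python
import math
from collections import Counter
from statistics import median


def first_phoneme_surprisal(lexicon):
    """Return {phoneme: -log2 Pr(phoneme | word boundary)}.

    `lexicon` is an iterable of (phoneme_tuple, weight) pairs. The weight is
    hypothetical: use 1 for type counts or a corpus frequency for token counts.
    """
    counts = Counter()
    for phonemes, weight in lexicon:
        counts[phonemes[0]] += weight  # only the word-initial phoneme matters
    total = sum(counts.values())
    return {p: -math.log2(c / total) for p, c in counts.items()}


def median_split(scores):
    """Label each key 'high' or 'low' relative to the median of its score.

    Ties at the median go to 'low' (an arbitrary choice for this sketch).
    """
    cut = median(scores.values())
    return {k: ("high" if v > cut else "low") for k, v in scores.items()}


# Toy lexicon (made up for illustration, not English Lexicon Project data).
toy_lexicon = [
    (("k", "ae", "t"), 4),
    (("k", "ah", "p"), 2),
    (("d", "ao", "g"), 1),
    (("s", "ah", "n"), 1),
]
surprisal = first_phoneme_surprisal(toy_lexicon)

# Hypothetical target words, each mapped to its first phoneme.
targets = {"cat": "k", "cup": "k", "dog": "d", "sun": "s"}
word_surprisal = {w: surprisal[p] for w, p in targets.items()}
bins = median_split(word_surprisal)
```

In this toy case, words beginning with the frequent onset /k/ fall in the low-surprisal bin and the rarer onsets /d/ and /s/ fall in the high-surprisal bin. The same `median_split` helper could be applied to measured first-phoneme durations to form the duration-based bins.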

Topic Area: Perception: Speech Perception and Audiovisual Integration
