Neural correlates of phonotactic context effect in speech categorization
Chih-Chao Chang1, Chia-Hsuan Liao2, Yu-An Lu1; 1National Yang Ming Chiao Tung University, 2National Tsing Hua University
Listeners perceive speech sounds in a categorical manner, mapping varying acoustic information onto discrete linguistic representations (Goldstone & Hendrickson, 2010). Earlier studies have provided evidence of the neurophysiological mechanisms underlying this non-linear transformation in bottom-up processing, demonstrating that acoustic information is faithfully encoded at the sub-cortical level but is warped into categories at the cortical level (Bidelman et al., 2013; Ou & Yu, 2021). However, listeners normally perceive sounds in context, not in isolation. Despite previous reports of the neural manifestation of phonetic (Zhang & Peng, 2021) and lexical context effects (Bidelman et al., 2021), the top-down influence of phonotactic restrictions on the neural encoding of categorical perception remains unclear. To fill this gap, we conducted a pilot behavioral study (N=10) and are running an ERP experiment to examine the effects of phonotactic restrictions on vowel categorization in Mandarin. A five-step /i/-/u/ vowel continuum was resynthesized from natural tokens by parameterizing F2 values in five equal steps between 680 and 2580 Hz while keeping F0, F1, and F3 values constant. In Mandarin, /s/ and /ɕ/ provide phonotactically constrained contexts in that they form illegal sequences with /i/ (*/si/) and /u/ (*/ɕu/), respectively, whereas aspirated /p/ and /t/ provide neutral contexts. Each of these four onsets (phonotactic: /ɕ/, /s/; neutral: aspirated /p/, /t/) was combined with the vowels from the five-step continuum to construct twenty CV syllables. In the behavioral experiment, participants heard each of the CV syllables and were asked to categorize the vowel (/i/ or /u/) they perceived as quickly as possible. Preliminary results showed that Mandarin listeners' vowel categorizations were biased towards phonotactically legal sequences (/ɕi/ and /su/), particularly for fast responses (RT < 1000 ms).
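The continuum construction and the standard analysis of identification data can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the five F2 values follow directly from the equal-step parameterization described above, while the response proportions and the two-parameter logistic fit are hypothetical, included only to show how a category boundary is typically estimated from identification curves.

```python
import numpy as np
from scipy.optimize import curve_fit

# Five equal F2 steps between 680 and 2580 Hz (F0, F1, F3 held constant),
# i.e., steps of 475 Hz: 680, 1155, 1630, 2105, 2580
f2_steps = np.linspace(680, 2580, 5)

# Hypothetical proportions of /u/ responses per continuum step (lower F2
# cues /u/), illustrating a sigmoidal, categorical identification curve
p_u = np.array([0.98, 0.92, 0.55, 0.10, 0.02])

def logistic(x, x0, k):
    """Two-parameter logistic psychometric function with boundary x0."""
    return 1.0 / (1.0 + np.exp(k * (x - x0)))

# Fit the identification curve; x0 estimates the /i/-/u/ category boundary
(x0, k), _ = curve_fit(logistic, f2_steps, p_u, p0=[1630.0, 0.005])
print(f"estimated category boundary ≈ {x0:.0f} Hz")
```

A phonotactic context effect of the kind reported above would surface as a shift of the fitted boundary x0 between the /ɕ/, /s/, and neutral onset conditions, largest at the ambiguous middle step.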
This reaction-time-dependent effect was most prominent for the ambiguous tokens (Step 3). These findings reflect a strong but transient phonotactic effect, motivating our ongoing ERP experiment. The P2 component, which is typically associated with category information at the phonetic or phonological level, will serve as the neural index of the study. We predict: (1) larger P2 amplitudes for the non-ambiguous (Steps 1 and 5) than the ambiguous vowels (Step 3) in the neutral contexts (aspirated /p/, /t/), in which listeners must resort to bottom-up acoustic information for perceptual categorization, replicating Bidelman et al. (2020); and (2) larger P2 amplitudes for the ambiguous vowels (Step 3) in the phonotactic contexts (/ɕ/, /s/) relative to the neutral ones (aspirated /p/, /t/), since categorical vowel percepts could be modulated by phonological biases through top-down processing. We will further run a correlation analysis to investigate how the bottom-up and top-down processes described above interact during speech categorization at the individual level. Finally, the topographic distribution of the ERPs will be examined to determine whether different contexts engage different cognitive processes. Our findings are expected to shed light on the role of top-down processing in speech categorization and on individual differences in general cognitive processing.
Topic Areas: Speech Perception, Phonology and Phonological Working Memory