Poster B56, Tuesday, August 20, 2019, 3:15 – 5:00 pm, Restaurant Hall

Classification of consonants and vowels with fast oscillation-based fMRI

Mikkel Wallentin1,2,3, Torben Ellegaard Lund3, Camilla M. Andersen1, Roberta Rocca1,2; 1Department of Linguistics, Cognitive Science and Semiotics, Aarhus University, 2Interacting Minds Centre, Aarhus University, 3Centre of Functionally Integrative Neuroscience, Aarhus University Hospital

INTRODUCTION The auditory cortices contain tonotopic representations of sound (e.g. Saenz and Langers, 2014), but what about the functional organization of speech sounds? Evidence for "phonotopic" representations in the auditory cortices has been reported (e.g. Formisano et al., 2008), but the temporal and spatial resolution of neuroimaging has made it difficult to study phonemes in the brain, consonants especially. Here, we test an oscillation-based method (Wallentin et al., 2018) using a fast fMRI protocol with syllable-rate temporal resolution.

METHODS Stimuli consisted of syllables, each pairing a vowel (v) with a consonant (c). Two conditions combined 9 Danish vowels/consonants with 5 consonants/vowels to create 45 unique syllables per condition (i.e. 9vx5c and 9cx5v). In each condition, consonants and vowels were repeated in a fixed order: in condition 9v5c, for instance, a given vowel recurred every 9th trial and a given consonant every 5th trial, so that every combination was new across the 45 trials while vowels and consonants followed two independent rhythms (illustrated in the first code sketch below). Sessions consisted of 6x4 blocks ([6x9v5c, 6x9c5v, 6x9v5c, 6x9c5v]) and lasted 18 minutes. Three sessions were acquired from each of 23 participants. fMRI data were acquired at 3T using a whole-brain fast acquisition sequence (multi-band EPI, TR = 371 ms) to capture signal changes at syllable resolution. Data were modelled using sine and cosine waves at the presentation rates for vowels and consonants, i.e. 1/9 Hz or 1/5 Hz (second sketch below). The fitted sine and cosine waves were used to generate amplitude maps for each participant and condition, which were submitted to a 2nd-level repeated-measures ANOVA in SPM. Single-participant classification of vowels/consonants was conducted by creating a phase map for each 45 s block. Phase reflects the delay of a voxel's response to a repetitive stimulus and thus indicates differences in phoneme sensitivity. The phase maps from the 72 blocks were divided into two halves: the first half was used in a searchlight analysis (using Nilearn) to select the 1000 most predictive voxels, and these voxels were then used in a pattern classification analysis on the second half. Both steps used a Gaussian Naïve Bayes classifier, with cross-validation and permutation tests to determine significance (third sketch below).

RESULTS The univariate amplitude analysis showed a main effect of phoneme type (vowel vs. consonant) in Planum Temporale bilaterally (P<0.05, FWE-corrected). The same areas also differentiated between phonemes oscillating at 1/5 Hz and 1/9 Hz, regardless of phoneme type. In individual participants, classification tests assigned 45-second phase maps to consonants versus vowels with a mean accuracy of 69% (SD 10%) in the 9 s condition and 64% (SD 15%) in the 5 s condition.

CONCLUSIONS This protocol provides a first step towards mapping a phonotopic "fingerprint" at the individual participant level. Such a map may potentially predict native language or foreign-language exposure. It is also a step towards using the fMRI signal to decode events at near speech-rate temporal resolution.
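First sketch: a minimal Python illustration of the cyclic stimulus structure. The phoneme inventories below are placeholders (the abstract does not list the actual Danish phonemes used); the point is that co-prime cycle lengths of 9 and 5 yield 45 unique pairs while each phoneme class repeats at its own rate, which is what lets the two rhythms be frequency-tagged independently.

```python
# Placeholder inventories: 9 vowels and 5 consonants (the actual Danish
# phonemes used in the study are not listed in the abstract)
vowels = list("aeiouyæøå")        # 9 items
consonants = list("bdgkt")        # 5 items

# Cycling both lists in a fixed order gives two independent rhythms:
# a vowel recurs every 9 trials, a consonant every 5. Because 9 and 5
# are co-prime, all 45 vowel-consonant pairs occur before any repeats.
syllables = [(vowels[i % 9], consonants[i % 5]) for i in range(45)]
assert len(set(syllables)) == 45  # every pair in a 45-trial block is unique
```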
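Second sketch: the oscillation model amounts to fitting sine and cosine regressors at each rhythm's frequency and converting the paired betas to amplitude and phase. A minimal NumPy version, with simulated data standing in for preprocessed BOLD time series (the voxel count and the plain per-voxel OLS fit are illustrative assumptions, not the study's SPM pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

TR = 0.371                   # repetition time in seconds (from the abstract)
n_scans = int(24 * 45 / TR)  # one 18-minute session of 24 x 45 s blocks
n_voxels = 500               # illustrative voxel count
t = np.arange(n_scans) * TR

# Simulated stand-in for preprocessed BOLD data (scans x voxels)
Y = rng.standard_normal((n_scans, n_voxels))

# Sine/cosine regressors at the two stimulus rhythms: 1/9 Hz and 1/5 Hz
X = np.column_stack([
    np.sin(2 * np.pi * t / 9), np.cos(2 * np.pi * t / 9),
    np.sin(2 * np.pi * t / 5), np.cos(2 * np.pi * t / 5),
    np.ones_like(t),         # intercept
])

# Ordinary least-squares fit of all voxels at once
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Amplitude is the length of each (sin, cos) beta pair; phase is its angle,
# i.e. the delay of the voxel's response relative to the stimulus cycle
amp_9, phase_9 = np.hypot(beta[0], beta[1]), np.arctan2(beta[1], beta[0])
amp_5, phase_5 = np.hypot(beta[2], beta[3]), np.arctan2(beta[3], beta[2])
```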
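Third sketch: the two-step classification, prototyped with scikit-learn. This is a simplified stand-in rather than the study's pipeline: simulated phase maps replace real data, a univariate SelectKBest replaces the Nilearn searchlight for the voxel-selection step, and the labels and dimensions are assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score, permutation_test_score

rng = np.random.default_rng(0)

# Simulated phase maps: 72 blocks x voxels, with hypothetical labels for
# which phoneme type (0 = vowel, 1 = consonant) drives the tagged rhythm
n_blocks, n_voxels = 72, 20_000
X = rng.standard_normal((n_blocks, n_voxels))
y = np.tile([0, 1], n_blocks // 2)

# Half-split: voxel selection on the first 36 blocks, classification on the rest
X_sel, y_sel, X_clf, y_clf = X[:36], y[:36], X[36:], y[36:]

# Stand-in for the searchlight: keep the 1000 most discriminative voxels
selector = SelectKBest(f_classif, k=1000).fit(X_sel, y_sel)
X_clf_red = selector.transform(X_clf)

# Gaussian Naive Bayes with cross-validation on the held-out half
clf = GaussianNB()
acc = cross_val_score(clf, X_clf_red, y_clf, cv=6).mean()

# Permutation test: is the observed accuracy better than chance?
score, perm_scores, pval = permutation_test_score(
    clf, X_clf_red, y_clf, cv=6, n_permutations=500, random_state=0)
print(f"mean CV accuracy = {acc:.2f}, permutation p = {pval:.3f}")
```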

Themes: Perception: Auditory, Phonology and Phonological Working Memory
Method: Functional Imaging
