You are viewing the SNL 2017 Archive Website. For the latest information, see the Current Website.

Poster D76, Thursday, November 9, 6:15 – 7:30 pm, Harborview and Loch Raven Ballrooms

Investigating voice imitation using fMRI and real-time anatomical MRI of the vocal tract

Carolyn McGettigan1, Sheena Waters1, Clare Lally1, Daniel Carey1,2, Elise Kanber1, Valentina Cartei3, Marc Miquel4,5; 1Royal Holloway, University of London, UK, 2Trinity College Dublin, IRE, 3University of Sussex, UK, 4Queen Mary University of London, UK, 5Barts NHS Trust, London, UK

Perceptually, fundamental frequency (F0; closely related to pitch) and formant spacing (an index of vocal tract length; VTL) are important cues for the extraction of indexical characteristics such as sex and body size from the voice. Behavioural research has further shown that talkers instinctively modulate these cues to emulate various physical and social attributes (Cartei et al., 2012; Hughes et al., 2014). Here, I will give an updated report of the first combined acoustic, articulatory and neurobiological investigation of these paralinguistic aspects of vocal behaviour. We scanned native speakers of British English while they performed a voice imitation task. Using synthetic modulations of participants’ own speech (recordings of the monosyllables “BEAD” and “BARD”), we generated target voices with varying F0 and apparent VTL. There were four modulated voice targets: 1) LowF0-LongVTL, 2) HighF0-ShortVTL, 3) LowF0-ShortVTL and 4) HighF0-LongVTL. A pilot study (McGettigan et al., 2015, SNL abstract) had shown that participants were more accurate at reproducing the acoustic properties of the biologically typical combinations of F0 and VTL (i.e. targets 1 and 2) than the less typical combinations (i.e. targets 3 and 4); further, vocal tract MR images (collected at 8 frames per second) indicated that participants were more successful at lengthening and shortening the vocal tract appropriately for the more typical voice targets. In the current study, we collected sagittal real-time images of the vocal tract and whole-brain BOLD fMRI from a new set of participants (N=26 to date) while they listened to, and repeated, the four target voices as well as their unmodulated voice. From the real-time anatomical MRI images, we tracked the frame-by-frame coordinates of the lips and larynx to index the relative changes in vocal tract length for each modulated voice.
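As an illustration only, the lip and larynx tracking described above could be turned into a relative length index along the following lines. The abstract does not specify how the index was computed, so this is a minimal sketch under stated assumptions: the straight-line lip-to-larynx distance stands in for vocal tract length (the real tract is curved), the unmodulated voice serves as the baseline, and the function names and coordinates are hypothetical.

```python
import math

def lip_larynx_distance(lips, larynx):
    """Per-frame straight-line distance between lip and larynx points.

    Assumption: each argument is a list of (x, y) image coordinates,
    one pair per frame, in the same units (e.g. pixels).
    """
    return [math.dist(lip, lar) for lip, lar in zip(lips, larynx)]

def relative_vtl_change(target_lengths, baseline_lengths):
    """Proportional change in mean length relative to a baseline condition.

    Assumption: the baseline is the participant's unmodulated voice.
    Positive values indicate lengthening, negative values shortening.
    """
    baseline = sum(baseline_lengths) / len(baseline_lengths)
    target = sum(target_lengths) / len(target_lengths)
    return (target - baseline) / baseline

# Made-up coordinates for two frames per condition (not data from the study).
baseline = lip_larynx_distance([(10, 50), (10, 50)], [(40, 90), (40, 90)])
long_vtl = lip_larynx_distance([(7, 50), (7, 50)], [(40, 94), (40, 94)])

# For these made-up numbers the index shows roughly 10% lengthening.
print(f"{relative_vtl_change(long_vtl, baseline):+.1%}")
```

A proportional index of this kind has the advantage of being comparable across participants with different absolute vocal tract sizes, although other normalisations would be equally plausible.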
In separate runs, we recorded BOLD fMRI to measure activations related to sensorimotor transformation (ST) and to imitation of the voices. The functional data are consistent with previous findings (Carey et al., 2017, Cerebral Cortex) for ST and imitation, where ST was associated with activation in bilateral sensorimotor cortex, superior temporal gyri and sulci, cerebellum, hippocampus and subcortical nuclei, while activation during imitation itself was more restricted to sensorimotor cortex and anterior cerebellar sites. Comparisons of the different target voices reveal significant effects of modulations in apparent VTL, in particular during ST for the atypical voices: here, voices with a longer apparent VTL generated greater activation in bilateral insulae extending to ventral IFG in the left hemisphere, and in the left posterior planum temporale. Comparisons based solely on differences in pitch have revealed little activation; however, contrasts exploring the interaction of F0 and VTL have indicated modulations in the response of bilateral STG/STS and subcortical sites including the caudate and thalamus. Ongoing analyses will link the BOLD response during ST and imitation to individual differences in the degree of vocal tract modulation (as determined from vocal tract images). Further, we will explore the effects of expertise in vocal control of pitch and VTL by comparing performance across control participants and highly trained singers.

Topic Area: Speech Motor Control and Sensorimotor Integration
