
Poster Slam Session E, Thursday, August 22, 2019, 3:30 – 3:45 pm, Finlandia Hall, Chair: Brenda Rapp

Neural characteristics of acoustic prosody during continuous real-life speech

Satu Saalasti (1,2), Enrico Glerean (2), Antti Suni (1), Jussi Alho (2), Juraj Šimko (1), Iiro P. Jääskeläinen (2), Martti Vainio (1), Mikko Sams (2); (1) University of Helsinki, (2) Aalto University School of Science

When we exclaim “I told you so!” we mark the word “told” acoustically: it is longer, pronounced with more effort, and has higher pitch and a different voice quality than the rest of the utterance. The acoustic parameters of pitch, length, and loudness support linguistic distinctions during continuous speech and help the listener interpret the linguistic meaning of the utterance, as prosody serves several linguistic functions, e.g., marking word stress, phrase boundaries, and sentence type. Studying the neural processing of prosody during continuous real-life speech has been hindered by the lack of efficient analysis methods. The current study aims to quantify the prosodic characteristics of speech with a method based on the continuous wavelet transform (CWT) and to explore how the resulting prosodic signals correlate with brain activity. We estimated prosodic events from continuous real-life speech with a recently developed unsupervised unified account based on CWT scale-space analysis. Brain activity of 29 female participants (age 19-49) was recorded with 3T functional magnetic resonance imaging (fMRI) while they listened to an 8-minute narrative. The CWT-based scale-space analysis was used to extract the prosodic characteristics of the narrative, and the obtained wavelet timeseries were used as regressors for the fMRI data to reveal how they map onto the brain recordings. More specifically, we used the CWT (Morlet mother wavelet with parameter 5) to identify the frequency bands containing most of the energy of the magnitude timeseries. We then used ridge regression to compute the similarity between the obtained magnitude timeseries and the individual brain time series. T-values were computed for the regression scores across subjects with 5000 permutations (FSL randomise, TFCE, at p=0.05). We found that the acoustic-prosodic properties of speech aligned in a hierarchical fashion encompassing syllable and short-phrase density in the narrated speech, and that they predicted distinct brain activity. Syllable density predicted brain activity in medial temporal as well as superior fronto-lateral areas, suggesting involvement of the speech motor areas. Phrase density predicted brain activity in the medial temporal area. Our findings are in line with what is known about brain activity related to speech and language processing. In conclusion, CWT-based scale-space analysis enabled automatic quantification of acoustic prosody during continuous real-life speech. Importantly, the automatically created model identified different levels of the linguistic hierarchy, and different wavelet scales of the model elicited different brain activity.
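The following minimal Python sketch (not the authors' implementation) illustrates the two core analysis steps described above: a Morlet continuous wavelet transform of a prosodic signal, whose per-scale magnitudes serve as fMRI regressors, and a ridge regression score quantifying how well those regressors predict a single voxel's time series. It assumes PyWavelets and scikit-learn; the prosodic envelope, the chosen scales, and the ridge penalty `alpha` are placeholders rather than the study's actual settings, and the group-level permutation test (FSL randomise, TFCE) is not shown.

```python
# Hypothetical sketch of the CWT + ridge-regression pipeline; the signal,
# scales, and alpha below are illustrative, not the authors' exact settings.
import numpy as np
import pywt
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

def wavelet_regressors(prosodic_signal, scales):
    """Morlet CWT of a 1-D prosodic signal; the magnitude at each scale
    becomes one regressor column (timepoints x scales)."""
    coefs, _freqs = pywt.cwt(prosodic_signal, scales, "morl")
    return np.abs(coefs).T

def ridge_similarity(regressors, voxel_timeseries, alpha=1.0):
    """Ridge regression score: how well the wavelet-scale regressors
    predict one voxel's BOLD time series."""
    model = Ridge(alpha=alpha).fit(regressors, voxel_timeseries)
    return r2_score(voxel_timeseries, model.predict(regressors))

# Toy usage with random data standing in for the narrative envelope and BOLD:
rng = np.random.default_rng(0)
envelope = rng.standard_normal(240)   # e.g., 8-minute stimulus sampled at the TR
bold = rng.standard_normal(240)       # one voxel's time series
X = wavelet_regressors(envelope, scales=np.arange(2, 32))
print(ridge_similarity(X, bold))
```

In the study, per-subject regression scores for each wavelet scale would then be taken to a group-level permutation test; the sketch stops at the single-voxel similarity score.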

Themes: Prosody, Speech Perception
Method: Functional Imaging

Poster E79
