Poster A70, Tuesday, August 20, 2019, 10:15 am – 12:00 pm, Restaurant Hall

Modeling the EEG from the audio signal: a methodological investigation

Katerina Danae Kandylaki1, Athanasios Lykartsis2, Sonja A. Kotz1; 1Maastricht University, 2Technische Universität Berlin

In the study of the neurobiology of language, researchers either use artificial or manipulated stimuli, or they quantify existing features of texts, such as word frequency. Even when the continuous EEG response is modelled from word features, modelling is limited to one value per word. In the current project, we investigated how the auditory signal can provide a finer level of granularity for modelling EEG responses in a continuous fashion. This granularity is needed to model rapid neurocognitive processes such as beat perception in speech.

Inspired by computational algorithms for text processing, we applied audio content analysis algorithms to speech signals, previously used to model speech rhythm in automatic language identification (Lykartsis & Weinzierl, 2015). As this method performs significantly better on noise-free speech, we recorded our stimuli based on existing poems and stories. Additionally, as the method is sensitive to voice characteristics, we recorded the stimuli with four different voices (two male, two female). In this way, we created a set of 33 files: 19 with regular rhythm, based on poems, and 14 with irregular rhythm, based on prose. Regular speech rhythm is quantified as equal distances between strong syllables, as for example in isochronous poetry, whereas irregular speech rhythm has varied distances between stressed syllables.

First, we calculated the following six novelty functions: Spectral Flux, Spectral Flatness, RMS Energy, Pitch (F0), Spectral Centroid, and a novel compound feature denoting vowel/non-vowel excerpts. Then, we extracted beat histograms and features from them, as in Lykartsis & Weinzierl (2015), for 10 s segments (with 50% overlap) and for full files. Next, as a proof of concept, we performed two classifications using all the features with two classifiers: a 1-nearest-neighbour (1-NN) classifier and a Support Vector Machine (SVM). First, we classified the contrast male vs. female and found that the 10 s segments performed better at this classification, with up to 84.8% accuracy (SVM), whereas both classifiers performed at chance level on the full files. The reverse effect was found for the contrast poem vs. story: the 1-NN classifier reached 93.9% accuracy on the full files, while both classifiers performed at chance level on the 10 s segments.

The next step is to use these novelty functions, or combinations thereof, as predictors of the EEG signal. Importantly, we focused on neurocognitively relevant features of the auditory signal, specifically those related to beat perception (Kotz, Ravignani, & Fitch, 2018). Phonological theory suggests that strong syllables in language are realized by alterations in pitch, loudness, and duration. We therefore calculated a theoretical beat ("theobeat") novelty function as our factor of interest, combining F0, RMS Energy, and Spectral Flux in equal proportions (33% each). As a control factor, we will use the Spectral Centroid novelty function, which, as it only denotes the spectral center of gravity of the signal, is not expected to encode beat- or rhythm-related information. This exploratory analysis could provide promising results and open new ways of analyzing speech for modelling neurocognitive data.
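The feature-extraction step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it computes two of the six novelty functions (Spectral Flux and RMS Energy) from a short-time Fourier transform and averages them with equal weights, in the spirit of the theobeat combination. The F0 component is omitted here because pitch tracking requires a dedicated estimator; all function names, frame sizes, and the test signal are illustrative assumptions.

```python
import numpy as np

def stft_mag(x, frame_len=1024, hop=512):
    """Magnitude spectrogram via a simple Hann-windowed STFT."""
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_flux(mag):
    """Half-wave-rectified frame-to-frame spectral difference."""
    diff = np.diff(mag, axis=0)
    return np.sum(np.maximum(diff, 0.0), axis=1)

def rms_energy(x, frame_len=1024, hop=512):
    """Root-mean-square energy per frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.sqrt(np.mean(x[i * hop : i * hop + frame_len] ** 2))
                     for i in range(n_frames)])

def normalize(f):
    """Scale a novelty function to [0, 1] so equal weights are meaningful."""
    f = f - f.min()
    return f / (f.max() + 1e-12)

# Synthetic "speech-like" test signal: amplitude-modulated noise at 16 kHz.
sr = 16000
t = np.arange(sr * 2) / sr
x = np.random.randn(len(t)) * (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t))

flux = normalize(spectral_flux(stft_mag(x)))
rms = normalize(rms_energy(x))[1:]   # drop first frame to align with the flux diff
# Equal-weight combination; in the study, F0 would join as a third component (33% each).
theobeat = (flux + rms) / 2.0
```

Normalizing each novelty function before averaging is what makes the "equal proportions" weighting well defined, since the raw features live on very different scales.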
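The beat-histogram and classification steps can likewise be sketched. Again a hedged toy example rather than the paper's pipeline: the beat histogram is approximated as the normalized autocorrelation of a novelty function mapped to beats per minute, and a 1-NN classifier separates a pulse-like ("regular") novelty function from a noise-like ("irregular") one. The frame rate, feature choice, and signal generation are all illustrative assumptions.

```python
import numpy as np

def beat_histogram(novelty, frame_rate, bpm_range=(30, 300)):
    """Autocorrelation-based beat histogram: periodicity strength per BPM."""
    n = novelty - novelty.mean()
    ac = np.correlate(n, n, mode="full")[len(n) - 1:]  # non-negative lags
    ac = ac / (ac[0] + 1e-12)                          # normalize by zero-lag energy
    bpms, strengths = [], []
    for lag in range(1, len(ac)):
        bpm = 60.0 * frame_rate / lag
        if bpm_range[0] <= bpm <= bpm_range[1]:
            bpms.append(bpm)
            strengths.append(ac[lag])
    return np.array(bpms), np.array(strengths)

def nn1_classify(train_X, train_y, x):
    """1-nearest-neighbour classification under Euclidean distance."""
    d = np.linalg.norm(train_X - x, axis=1)
    return train_y[np.argmin(d)]

rng = np.random.default_rng(0)
frame_rate = 31.25                       # e.g. 16 kHz audio with a hop of 512 samples
t = np.arange(int(frame_rate * 10)) / frame_rate

def make_novelty(regular):
    """Toy novelty functions: a 4 Hz pulse (regular rhythm) vs. flat noise."""
    if regular:
        return np.maximum(np.sin(2 * np.pi * 4.0 * t), 0.0) \
               + 0.1 * rng.standard_normal(len(t))
    return rng.random(len(t))

def features(novelty):
    """Beat-histogram summary features: peak and mean periodicity strength."""
    _, s = beat_histogram(novelty, frame_rate)
    return np.array([s.max(), s.mean()])

X = np.stack([features(make_novelty(r)) for r in [True] * 5 + [False] * 5])
y = np.array([1] * 5 + [0] * 5)          # 1 = regular, 0 = irregular
pred = nn1_classify(X, y, features(make_novelty(True)))
```

A strongly periodic novelty function yields a pronounced autocorrelation peak near its pulse rate (here 4 Hz, i.e. 240 BPM), while noise yields uniformly weak strengths, which is what makes even a 1-NN classifier effective on these summary features.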

Themes: Perception: Auditory, Methods
Method: Electrophysiology (MEG/EEG/ECoG)