Slide Sessions

Slide Session C

Thursday, October 26, 4:30 - 5:30 pm CEST, Auditorium

Chair: Yanchao Bi, Beijing Normal University

Neural tracking in the visual domain: the role of different articulators in sign language comprehension.

Chiara Luna Rivolta1, Brendan Costello1, Mikel Lizarazu1, Manuel Carreiras1,2,3; 1Basque Center on Cognition, Brain and Language (BCBL), 2Ikerbasque, Basque Foundation for Science, 3Universidad del País Vasco (UPV/EHU)

In sign languages the linguistic information is transmitted through the simultaneous movement of several bodily and facial articulators, some of which, like the right hand, are more informative than others. This study investigates the relative contribution of each articulator during sign language processing by exploiting the phenomenon of language-brain entrainment. Specifically, we examine the extent to which the brain activity synchronizes with (different parts of) the incoming linguistic visual signal. To measure the temporal periodicity that characterises the visual signal in sign language, we used marker-free motion tracking: a custom-built Kinect system allowed us to record videos of semi-spontaneous sign language narratives (by native deaf signers) while registering the position in three-dimensional space of 21 different body points. These videos served as stimuli for the subsequent experiment. We used magnetoencephalography (MEG) to record the neurophysiological activity of two groups of hearing participants – 15 proficient signers of LSE (Spanish Sign Language) and 15 sign-naive individuals – while they watched 20 videos, 10 in a known (LSE) and 10 in an unknown sign language (Russian Sign Language - RSL). We calculated coherence between the preprocessed MEG data and the visual linguistic signal, and used cluster-based permutation tests to assess statistical differences in coherence between groups and experimental conditions. The motion tracking data made it possible to characterize the visual linguistic signal: for each motion-tracked point, its three-dimensional coordinates were normalised and used to create a speed vector to identify that point’s time-frequency profile. 
As an overall measure of the visual signal, we used PCA to extract the main speed vector for all upper-body points combined (head, neck, shoulders, elbows, wrists, hands, torso); additionally, we selected three linguistically relevant articulators (left hand, right hand and head) to investigate their individual contribution to neural tracking, along with a point with little linguistic import (torso) to provide a sanity check. The results show that neural activity tracks sign language input, and this tracking is dependent upon sign language knowledge. When comparing entrainment to a known and an unknown sign language, the signers show greater coherence for LSE compared to RSL only when considering the combined visual signal and right hand but not for the left hand, head or torso. Furthermore, proficient signers show stronger synchronisation compared to sign-naive controls when considering the head, left hand, right hand and the combined visual signal, but not the torso. Thus, our findings point to the differential role of the body articulators in sign language processing. Entrainment in sign language occurs in the delta frequency band (0.5 - 2.5 Hz), reflecting the slower periodicity associated with articulator movements, mainly over centro-parietal regions linked with biological motion processing. These findings confirm that language-brain entrainment is a feature of language processing beyond the auditory domain, and depends upon language experience. Furthermore, the neural activity tracks visual articulators that are linguistically most informative, namely, the dominant, right hand. The multilayered sign language signal makes it possible to tease apart those linguistic elements that drive neural tracking of the input.
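
To make the speed-vector and coherence steps described above concrete, here is a minimal sketch in standard-library Python. It is not the authors' pipeline. The abstract does not say how coordinates were normalised or how coherence was estimated, so this assumes a frame-to-frame Euclidean speed and a magnitude-squared coherence averaged over non-overlapping Hann-windowed segments. All data are synthetic, and the names `speed_vector`, `coherence`, `hand` and `sensor` are illustrative only.

```python
import cmath
import math
import random


def speed_vector(coords, fs):
    """Speed of one tracked point from consecutive (x, y, z) samples taken at fs Hz."""
    return [math.dist(a, b) * fs for a, b in zip(coords, coords[1:])]


def dft(segment):
    """One-sided discrete Fourier transform (direct sum; fine for short segments)."""
    n = len(segment)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(segment))
        for k in range(n // 2 + 1)
    ]


def coherence(x, y, seg_len):
    """Magnitude-squared coherence averaged over non-overlapping Hann-windowed segments."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / seg_len) for t in range(seg_len)]
    n_bins = seg_len // 2 + 1
    sxx, syy, sxy = [0.0] * n_bins, [0.0] * n_bins, [0j] * n_bins
    for start in range(0, min(len(x), len(y)) - seg_len + 1, seg_len):
        spectra = []
        for signal in (x, y):
            seg = signal[start:start + seg_len]
            mean = sum(seg) / seg_len  # remove the segment mean before windowing
            spectra.append(dft([(v - mean) * w for v, w in zip(seg, window)]))
        fx, fy = spectra
        for k in range(n_bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    return [
        abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) if sxx[k] > 0 and syy[k] > 0 else 0.0
        for k in range(n_bins)
    ]


random.seed(0)
fs, seg_len = 32, 128  # assumed: 32 Hz sampling, 4 s segments (0.25 Hz resolution)
n_samples = fs * 80

# Synthetic "hand": a 0.5 Hz oscillation along x plus tracking jitter.
# Its speed repeats at 1 Hz, inside the delta band mentioned in the abstract.
hand = [
    (math.sin(2 * math.pi * 0.5 * t / fs) + random.gauss(0, 0.005),
     random.gauss(0, 0.005),
     random.gauss(0, 0.005))
    for t in range(n_samples)
]
speed = speed_vector(hand, fs)

# Synthetic "sensor": the speed signal delayed by 4 samples (125 ms) plus noise.
sensor = [speed[max(t - 4, 0)] + random.gauss(0, 1.0) for t in range(len(speed))]

coh = coherence(speed, sensor, seg_len)
freqs = [k * fs / seg_len for k in range(len(coh))]
print(f"coherence at {freqs[4]:.2f} Hz: {coh[4]:.2f}")
print(f"coherence at {freqs[18]:.2f} Hz: {coh[18]:.2f}")
```

In this toy setting coherence should be high at the movement rhythm and low at a frequency where the two signals share no structure. A real analysis would apply the same idea per MEG sensor or source and per articulator, followed by the statistical tests described in the abstract.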

Convergent cortical network dynamics in word retrieval

Kathryn Snyder1, Kiefer Forseth1, Greg Hickok2, Nitin Tandon1; 1UTHSC, 2UC Irvine

Lexical access describes the process involved in the mapping between conceptual representations and phonology and is an integral component of speech production. Furthermore, prominent psycholinguistic models of language propose that lexical access is supported by a network of broadly distributed cortical regions with transient and interactive network dynamics. However, while various brain regions are hypothesized to support separable processes, the language network, its functional mapping to lexical access, and the associated network dynamics remain unclear. Here, we used ECoG to identify brain networks involved in naming using multimodal stimuli with convergent design. Data were obtained from epilepsy patients who underwent invasive electrophysiology (patients=48; electrodes=5,390). Recordings were acquired during cued-naming tasks using auditory (ACN) and orthographic (OCN) descriptions. We identified the lexical-semantic network using a mixed-effects multilevel analysis to estimate group-level broadband gamma power (BGA; 65-115Hz) time-locked to the offset of the last word in the description. In a subset of patients with coverage of these regions (n=5), we characterized the network dynamics using a multivariate autoregressive hidden Markov model (ARHMM). Both tasks engaged an identical lexical access network consisting of the posterior middle temporal gyrus (pMTG; ACN: -176ms, 43.69% BGA, p<0.01; OCN: -193ms, 15.95% BGA, p<0.01), the middle fusiform gyrus (mFus; ACN: -171ms, 16.79% BGA, p<0.01; OCN: 237ms, 22.0% BGA, p<0.01), the intraparietal sulcus (IPS; ACN: 117ms, 24.03% BGA, p<0.01; OCN: 282ms, 36.15% BGA, p<0.01), and pars triangularis (pTr; ACN: -2ms, 38.78% BGA, p<0.01; OCN: 281ms, 43.55% BGA, p<0.01), and all regions were active immediately following the onset of the last word in the description.
ARHMM analyses of ACN isolated 6 distinct cortical states and began with acoustic processing in the superior temporal gyrus (STG) followed by speech processing in the superior temporal sulcus (STS) and pMTG. ARHMM analyses of OCN isolated 7 distinct cortical states and began with orthographic feature processing in visual cortex followed by engagement of the lexical (STS, pMTG) and phonological (STG) routes of reading. Following these initial sensory processing states, ARHMM analyses isolated a conserved lexical semantic processing network consisting of pMTG, mFus, IPS, and pTr. This network state was active following the end of the description in both tasks and correlated with reaction time (p<0.01), which is consistent with lexical access. The final three network states in both tasks corresponded broadly to phonological encoding, articulation, and monitoring (subcentral gyrus and STG). These results reveal that pMTG, mFus, IPS, and pTr constitute a core heteromodal lexical access network. Juxtaposing the cortical network dynamics of word retrieval across different conceptual representations better informs our understanding of both specialized and shared cortical language networks, which provides empirical support for theoretical models of speech production. In the future, we believe that this work will contribute important insights that are critical to the development of improved treatment methods, such as neural prosthetics, for speech-related disorders.

Autistic Traits Modulate Discourse Construction: An fNIRS Hyperscanning Study of School-aged Children

Xuancu HONG1, Patrick C.M. WONG1, Xin ZHOU2; 1The Chinese University of Hong Kong, 2National Acoustic Laboratories

Autism spectrum disorder (ASD) as a neurodevelopmental disorder results in impaired discourse in social contexts (Schaeffer et al., 2023). Little is known about how children with ASD construct discourse with their peer interlocutors in school life. Interpersonal brain coherence (IBC) values between two individuals show how two brains are interconnected and have been shown to be the neural markers of discourse construction in social interaction (e.g., Nguyen et al., 2020). In the present study, we investigate whether and how autistic traits modulate the construction of discourse among school-aged children and the underpinning IBC. We designed three experiments using functional near-infrared spectroscopy (fNIRS) to simultaneously measure the brain activities of dyads of school-age children, some of whom showed elevated autistic traits as measured by the Autism Spectrum Quotient (AQ) questionnaire. We recorded their IBC values from four regions of interest that have been associated with discourse construction (Jacoby & Fedorenko, 2020; Mashashiro & Shinada, 2021), i.e., bilateral temporal-parietal and frontal cortical regions. Altogether, 46 9-to-11-year-old children (23 dyads) participated in all three experiments. In Experiment 1, they watched a video alone or together with a peer. In Experiment 2, dyads engaged in shared book reading either subvocally or reading aloud together. In Experiment 3, dyads participated in playing Jenga in which they either pretended to play or played for real in turns. To increase ecological validity, the tasks for each experiment were chosen from daily-school-life activities, and each experiment targeted one of three social interactive behaviours involved in constructing discourse: implicit interactive behaviour, aligning behaviour, and contingent behaviour, respectively. The level of interactivity between the two individuals increased incrementally across the experiments.
We found that in Experiment 1, the dyads' IBC values were significantly higher when watching together than when watching alone. In Experiment 2, IBC values were also significantly higher when reading aloud together than when reading subvocally. In Experiment 3, dyads had significantly higher IBC values when playing for real than when pretending to play. Throughout the three experiments, the tasks that involved stronger interactivity consistently showed an increase in IBC values compared to tasks that involved weaker interactivity, including in the temporal-parietal and frontal cortical regions. Importantly, we found that increases in IBC values were modulated by AQ scores across experiments. Specifically, increases in IBC value were negatively correlated with AQ scores in Experiment 2 (r = -0.63, p = .002), and positively correlated with AQ scores in Experiment 3 (r = 0.55, p = .007). These findings indicate that inter-brain coherence is enhanced during discourse construction with interactive behaviours such as co-presence, alignment, and contingency (turn-taking). Moreover, the inter-brain connection is subject to the different levels of interactivity and is moderated by autistic traits in discourse construction. In general, this study illustrates an ecological and quantitative approach to investigating autism-modulated neural mechanisms of discourse construction among school-aged children.

Speaker-listener neural coupling in a shared linguistic embedding space during natural conversations

Zaid Zada1, Ariel Goldstein2, Sebastian Michelmann1, Erez Simony3, Amy Price1, Liat Hasenfratz1, Emily Barham1, Asieh Zadbood1, Werner Doyle4, Daniel Friedman4, Patricia Dugan4, Lucia Melloni4, Sasha Devore4, Adeen Flinker4, Orrin Devinsky4, Samuel Nastase1, Uri Hasson1; 1Princeton University, 2Hebrew University of Jerusalem, 3Holon Institute of Technology, 4New York University

Language allows us to express our ideas and feelings to others. Successful communication, however, requires a shared agreement regarding the meaning of words in context, without which it would be impossible for strangers to understand one another. Until recently, we lacked a precise computational framework for modeling how humans use words in context to communicate with each other. To overcome this limitation, previous studies of the neural basis of communication have resorted to measuring direct coupling or alignment between brains by using the neural activity in the speaker’s brain to make model-free predictions of the listener’s brain activity. Although these analyses can quantify the strength of brain-to-brain coupling, they are content-agnostic and cannot model how we use words in context to convey our thoughts to others. In this study, we positioned large language model (LLM) contextual embeddings as an explicit model of the shared linguistic space by which a speaker communicates their thoughts (i.e., transmits their brain activity) to a listener in natural contexts. We recorded cortical activity using electrocorticography (ECoG) in five dyadic pairs of epilepsy patients during spontaneous, interactive, and unique conversations. Next, we extracted contextual embeddings for every word in the conversation from GPT-2, an established language model. Then, we trained encoding models to predict brain activity during speech production and comprehension in held-out segments of the conversations. To assess linguistic coupling across brains, we used the encoding model trained on the speaker’s brain activity to predict the listener’s brain activity (and vice versa). This novel intersubject encoding (ISE) analysis quantifies how well the model fit for speech production (or comprehension) generalizes to speech comprehension (or production) in left-out segments of each conversation.
We first demonstrate that the contextual embeddings can predict neural activity across the cortical language network during speech comprehension and speech production with high temporal specificity. Consistent with the flow of information during communication, linguistic encoding peaked in the speaker’s brain ~300ms prior to word onset (r=0.192), while linguistic encoding peaked in the listener’s brain ~375ms after each word was articulated (r=0.223). During speech production, we found maximal encoding performance in speech articulation areas along somatomotor (SM) cortex, in superior temporal (ST) cortex, and in higher-order language areas in the temporal pole (TP), inferior frontal gyrus, and supramarginal gyrus (SMG). During speech comprehension, the encoding model predicted neural responses in similar brain areas, particularly superior temporal cortex. We then use the intersubject speaker–listener encoding analysis to demonstrate that linguistic content in the speaker’s brain ~425ms before word articulation re-emerges, word-by-word, in the listener’s brain ~125ms after each word is spoken (r=0.144). For comparison, we used non-contextual word embeddings that resulted in significantly less coupling. Finally, we applied the same analysis across speaker-listener brain regions to identify connections including both higher-level (TP, TP) and lower-level (SM, ST) speaker-listener language areas. Our findings reveal how speaker and listener neural responses during natural conversations are coupled to a shared linguistic space, and suggest that LLMs provide a novel computational framework for studying how we transmit our thoughts to others.
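
The intersubject encoding idea described above can be illustrated with a small sketch in standard-library Python. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not name the regression method, so ridge regression is assumed here. Random vectors stand in for GPT-2 embeddings, neural activity is simulated, and each word's activity is assumed to have already been extracted at a fixed lag. All names (`ridge_fit`, `shared`, `r_ise`) are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ridge_fit(X, y, alpha):
    """Ridge weights w = (X'X + alpha * I)^-1 X'y for a single output channel."""
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)


def predict(X, w):
    """Linear prediction for each row of X."""
    return [sum(a * b for a, b in zip(row, w)) for row in X]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    return cov / math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))


random.seed(1)
n_words, dim, n_train = 400, 4, 300

# Stand-in for contextual embeddings: one random vector per word (assumption).
embeddings = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_words)]

# Hypothetical shared mapping from embedding space to activity in both brains.
shared = [1.0, -0.5, 0.8, 0.3]


def simulate():
    """Simulated per-word activity: shared linguistic signal plus independent noise."""
    return [sum(e * s for e, s in zip(row, shared)) + random.gauss(0, 1)
            for row in embeddings]


speaker, listener = simulate(), simulate()

# Fit on the speaker's activity, then test on the listener's held-out words.
weights = ridge_fit(embeddings[:n_train], speaker[:n_train], alpha=1.0)
r_ise = pearson(predict(embeddings[n_train:], weights), listener[n_train:])
print(f"intersubject encoding r = {r_ise:.2f}")
```

Because the model is fit on one brain and evaluated on the other, a positive held-out correlation can only arise from structure the two signals share through the embedding space. This is what separates the approach from model-free coupling measures. The correlations in this toy example are far higher than the reported values because the simulated noise level is arbitrary.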
