Modeling linguistic processes through experimental and naturalistic designs

Organizers: Shaonan Wang1, Jixing Li2; 1Institute of Automation, Chinese Academy of Sciences, 2City University of Hong Kong
Presenters: Nai Ding, Yohei Oseki, Jixing Li, Yanchao Bi, Ping Li, Alona Fyshe

Recent trends in the cognitive neuroscience of language have shifted towards the use of naturalistic stimuli, such as story reading or listening. This approach elicits more authentic linguistic processing than traditional controlled experiments. The relevance of naturalistic designs is further amplified in the context of large language models (LLMs), which are trained on naturalistic text or speech. At the same time, a growing body of research examines LLMs through controlled linguistic experiments to assess their linguistic competence. In this symposium, we bring together researchers at different career stages and from varied disciplines to weigh the advantages and challenges of controlled and naturalistic stimuli, and to discuss how computational models can advance our understanding of linguistic processes in the human brain by leveraging insights from both paradigms.


Probing linguistic processing using highly controlled experiment designs

Nai Ding1; 1Zhejiang University

Language comprehension is extremely complex and involves many highly correlated processes. Controlled experiments are often necessary to tease these processes apart, and they can be designed with high statistical power to test specific hypotheses about how the brain encodes language. Here, I will explain why two unintuitive experimental designs provide unique insights into how the brain and language models process language. The first paradigm studies the neural encoding of linguistic constituents using frequency tagging, and the second assesses behavioral sensitivity to linguistic constituents through a word-deletion task. I will discuss the advantages and potential clinical applications of these designs, as well as common misunderstandings that arise when generalizing conclusions from controlled experiments to natural speech comprehension.
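The core logic of frequency tagging can be sketched numerically: if words are presented at a fixed rate and every two words form a phrase, a response that tracks phrases shows a spectral peak at the phrase rate on top of the word-rate peak. The "neural" signal below is simulated, and the specific rates (4 Hz words, 2 Hz phrases) are illustrative of this style of design, not data from the talk:

```python
import numpy as np

# Simulate a response that tracks both the word rate (4 Hz) and the
# phrase rate (2 Hz); frequency tagging detects constituent-level
# tracking as a spectral peak at the constituent rate.
fs = 100                       # sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)   # 10 s of simulated signal
word_rate, phrase_rate = 4.0, 2.0
signal = np.sin(2 * np.pi * word_rate * t) + 0.6 * np.sin(2 * np.pi * phrase_rate * t)

# Amplitude spectrum; the two largest peaks should sit at 2 Hz and 4 Hz.
spectrum = np.abs(np.fft.rfft(signal)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1 / fs)
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(peaks))  # [2.0, 4.0]
```

In a real experiment the simulated sinusoids are replaced by recorded MEG/EEG responses, and the presence of a peak at the phrase or sentence rate is the evidence for constituent-level tracking.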

Targeted and psychometric evaluation of language models

Yohei Oseki1; 1University of Tokyo

Language models (LMs) have developed extremely rapidly and now outperform humans in various downstream tasks such as machine translation and question answering. However, despite this super-human performance, whether LMs process natural language like humans remains to be investigated. In this talk, we evaluate the syntactic processing of language models through both controlled and naturalistic designs. Specifically, various LMs, including syntactic language models (SLMs) and large language models (LLMs), are assessed via (i) targeted syntactic evaluation (i.e., modeling controlled acceptability judgments) and (ii) human psychometric evaluation (i.e., modeling naturalistic eye movements and brain activity). The results converge on the conclusion that SLMs appear to process natural language like humans, whereas LLMs are not always human-like, suggesting that controlled and naturalistic designs should be integrated to understand both humans and machines.
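Targeted syntactic evaluation typically scores a model on minimal pairs: the model is credited when it assigns higher probability to the grammatical member of each pair. A minimal sketch of that scoring logic follows; the `logprob` argument is a hypothetical stand-in for a real LM's sentence scorer, and the toy scores are invented for illustration:

```python
# Minimal-pair scoring as used in targeted syntactic evaluation:
# accuracy is the fraction of pairs where the grammatical sentence
# receives the higher log-probability.

def minimal_pair_accuracy(pairs, logprob):
    """pairs: list of (grammatical, ungrammatical) sentence pairs."""
    correct = sum(logprob(good) > logprob(bad) for good, bad in pairs)
    return correct / len(pairs)

# Toy illustration with invented scores, not real model output.
toy_scores = {
    "the keys are on the table": -12.0,
    "the keys is on the table": -15.5,
    "the author laughs": -8.1,
    "the author laugh": -9.4,
}
pairs = [
    ("the keys are on the table", "the keys is on the table"),
    ("the author laughs", "the author laugh"),
]
acc = minimal_pair_accuracy(pairs, toy_scores.get)
print(acc)  # 1.0
```

With a real LM, `logprob` would sum token log-probabilities over the sentence; the accuracy metric itself is unchanged.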

Isolated and contextualized meaning composition

Jixing Li1; 1City University of Hong Kong

Naturalistic paradigms using movies or audiobooks have become increasingly popular in cognitive neuroscience, but efforts to connect them to findings from controlled experiments remain rare. Here, we aim to bridge this gap in the context of semantic composition in language processing, which is typically examined using a "minimal" two-word paradigm. Using magnetoencephalography (MEG), we investigated whether the neural signatures of semantic composition observed in an auditory two-word paradigm extend to naturalistic story listening, and vice versa. Additionally, we subjected a large language model (LLM) to the same classification tests. Our results demonstrate consistent differentiation between phrases and single nouns in the left anterior and middle temporal lobe, as well as in the LLM, regardless of context. This consistency suggests a unified compositional process underlying both isolated and connected speech comprehension.

Interfacing language and visual perception: Evidence from computational and lesion models

Yanchao Bi1; 1Beijing Normal University

Language and visual perception are intricately related, but the underlying neural mechanisms remain elusive. We examine how language experience affects visual object processing by comparing the power of three computer vision models to explain neural activity patterns in the visual cortex across four fMRI datasets with diverse visual tasks and subjects. These models differ in whether language alignment is introduced during training: CLIP_ResNet50 (image-text alignment), ResNet50 (image-word labeling), and MoCo v3_ResNet50 (visual only). CLIP_ResNet50 consistently exhibited superior explanatory power in clusters of the visual cortex compared to the other models. Importantly, in patients with brain lesions, we observed a correlation between white-matter integrity across language areas and the advantage of CLIP_ResNet50, indicating that the advantage of such language-vision fusion models is indeed driven by contributions from language processes. These findings highlight the potential role of language in shaping neural computations during object processing in the visual cortex.
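The model-comparison logic here is a standard voxel-wise encoding analysis: fit a regularized regression from each model's stimulus embeddings to fMRI responses, then compare held-out prediction accuracy per voxel. The sketch below uses entirely synthetic data and closed-form ridge regression; in the actual study the feature matrices would come from CLIP_ResNet50, ResNet50, and MoCo v3_ResNet50, and the responses from the four fMRI datasets:

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 200, 50, 64, 30

def ridge_fit_predict(Xtr, Ytr, Xte, alpha=10.0):
    # Closed-form ridge: W = (X'X + alpha*I)^{-1} X'Y
    W = np.linalg.solve(Xtr.T @ Xtr + alpha * np.eye(Xtr.shape[1]), Xtr.T @ Ytr)
    return Xte @ W

def voxelwise_corr(Yhat, Y):
    # Pearson correlation between predicted and observed response, per voxel.
    Yh, Yc = Yhat - Yhat.mean(0), Y - Y.mean(0)
    return (Yh * Yc).sum(0) / (np.linalg.norm(Yh, axis=0) * np.linalg.norm(Yc, axis=0))

# Simulate voxels driven by "model A" features plus noise; "model B"
# features are unrelated to the response.
X_a = rng.standard_normal((n_train + n_test, n_feat))  # e.g. language-aligned model
X_b = rng.standard_normal((n_train + n_test, n_feat))  # e.g. vision-only model
W_true = rng.standard_normal((n_feat, n_vox))
Y = X_a @ W_true + 0.5 * rng.standard_normal((n_train + n_test, n_vox))

r_a = voxelwise_corr(ridge_fit_predict(X_a[:n_train], Y[:n_train], X_a[n_train:]), Y[n_train:])
r_b = voxelwise_corr(ridge_fit_predict(X_b[:n_train], Y[:n_train], X_b[n_train:]), Y[n_train:])
print(r_a.mean() > r_b.mean())  # model A better explains these simulated voxels
```

The per-voxel accuracy difference (here `r_a - r_b`) is the kind of quantity that can then be mapped across cortex or related to lesion measures.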

Model-brain alignment for discourse comprehension

Ping Li1; 1The Hong Kong Polytechnic University

Reading comprehension remains the main medium through which students gain scientific knowledge, despite the pervasive use of digital platforms. In this talk, I describe the “model-brain alignment” approach, which leverages large language models (LLMs) to study naturalistic reading comprehension in both native (L1) and non-native (L2) languages. By training LLM-based encoding models on brain responses to text reading, we can evaluate (a) which computational properties of the model are important for reflecting human brain mechanisms in language comprehension, and (b) which model variations best reflect human individual differences during reading comprehension. Our findings show, first, that to capture the difference between word-level processing and high-level discourse integration, current LLM-based models need to incorporate sentence-prediction mechanisms on top of word prediction, and second, that variations in model-brain alignment allow us to predict L1 and L2 readers’ sensitivity to text properties, their cognitive demand characteristics, and ultimately their reading performance.

Exploring temporal sensitivity in the brain using multi-timescale language models: An EEG decoding study

Alona Fyshe1; 1University of Alberta

During language understanding, the brain performs multiple computations at varying timescales, ranging from understanding individual words to grasping the narrative of a story. Recently, multi-timescale long short-term memory (MT-LSTM) models have been introduced, which use temporally tuned parameters to induce sensitivity to different timescales of language processing. Here, we used an EEG dataset recorded while participants listened to Chapter 1 of "Alice in Wonderland" and trained models to predict the temporally tuned MT-LSTM embeddings from EEG responses. Our analysis reveals that these models effectively predict MT-LSTM embeddings across various timescales and time windows, usually in concordance with the timescale each embedding is tuned for. However, we also observed that short-timescale information is processed not only in the vicinity of word onset but also at more distant time points. These observations underscore both the parallels and the discrepancies between computational models and the neural mechanisms of the brain.
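The decoding analysis can be sketched as a windowed regression: for each time window of EEG relative to word onset, fit a decoder from the EEG samples to the target embedding and ask which window predicts it best. Everything below is simulated (the window labels, channel counts, and mixing are invented for illustration); in the study the targets are MT-LSTM embeddings time-locked to words in the audiobook:

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, n_chan, n_dim = 300, 16, 8
windows = {"0-200 ms": 0, "200-400 ms": 1, "400-600 ms": 2}

# Simulate EEG in three post-onset windows; only the second window
# actually carries the embedding (via a random channel mixing).
emb = rng.standard_normal((n_words, n_dim))
eeg = [rng.standard_normal((n_words, n_chan)) for _ in windows]
mix = rng.standard_normal((n_dim, n_chan))
eeg[1] = emb @ mix + 0.3 * rng.standard_normal((n_words, n_chan))

def decode_score(X, Y, n_train=200):
    # Least-squares decoder fit on a train split, scored by the mean
    # correlation between predicted and true embedding dimensions on
    # the held-out words.
    W, *_ = np.linalg.lstsq(X[:n_train], Y[:n_train], rcond=None)
    pred, true = X[n_train:] @ W, Y[n_train:]
    pc, tc = pred - pred.mean(0), true - true.mean(0)
    r = (pc * tc).sum(0) / (np.linalg.norm(pc, axis=0) * np.linalg.norm(tc, axis=0))
    return r.mean()

scores = {name: decode_score(eeg[i], emb) for name, i in windows.items()}
best = max(scores, key=scores.get)
print(best)  # the window that carries the simulated signal
```

Repeating this across embeddings tuned to different timescales yields the window-by-timescale profile described above.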
