Poster Presentation


Replicating Aphasic Picture Naming in Large Language Models

Poster Session A, Friday, September 12, 11:00 am - 12:30 pm, Field House

Rutvik H. Desai1, Yang Yong, Xiang Guan, Zifei Zhong, Sophie Arheix-Parras, Srihari Nelakuditi; 1University of South Carolina

Large Language Models (LLMs) have demonstrated remarkable capabilities in processing and generating language, potentially offering insights into the neural mechanisms of language processing. We tested whether stroke aphasia can be computationally modeled by lesioning large language models, using picture naming, one of the most commonly used tasks for assessing the abilities of persons with aphasia (PWA). We investigated two fundamental questions: (1) Can systematic neural perturbations in a multimodal LLM replicate the different types of picture naming errors observed in stroke survivors? (2) Can parameter-specific modifications be calibrated to match individual PWA picture naming error profiles? We employed LLaVA 1.6, a multimodal language model capable of processing both images and text, to name pictures from the Philadelphia Naming Test (PNT). We systematically introduced controlled perturbations to the model's neural network using three key parameters: (1) Gaussian noise level (the magnitude of weight modifications); (2) modification percentage (the fraction of neurons in a layer to be modified, ranging from 10% to 100%); and (3) target layer (out of 40 layers). To analyze the resulting outputs, we developed an automated error classification system based on established frameworks from clinical practice. This system categorized naming responses into eight distinct categories: correct responses and seven types of errors (semantic, formal, mixed, nonword, neologism, unrelated, and no response). We systematically examined how response patterns changed across the three-dimensional parameter space created by the perturbation variables. Results showed systematic modulation of error types in different regions of the parameter space. Nonword errors (neologisms, nonwords) diverged from real-word errors near 50% unit modification and gave way to no-responses at higher perturbation strengths.
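The three-parameter lesioning scheme described above can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `perturb_layer` and its parameters are hypothetical, and a real run would apply the same operation to the weight matrix of the chosen target layer inside the LLaVA network.

```python
import numpy as np

def perturb_layer(weights, noise_sd, frac_modified, rng=None):
    """Add Gaussian noise to a random fraction of a layer's units.

    weights       : 2-D array of layer weights (one row per unit); not modified in place
    noise_sd      : standard deviation of the Gaussian noise (lesion magnitude)
    frac_modified : fraction of units whose weights are perturbed (0.1 to 1.0)
    """
    rng = np.random.default_rng() if rng is None else rng
    lesioned = weights.copy()
    n_units = weights.shape[0]
    n_hit = int(round(frac_modified * n_units))
    # Pick which units ("neurons") in this layer receive the lesion.
    hit = rng.choice(n_units, size=n_hit, replace=False)
    lesioned[hit] += rng.normal(0.0, noise_sd, size=(n_hit, weights.shape[1]))
    return lesioned
```

Sweeping `noise_sd`, `frac_modified`, and the index of the layer this is applied to would trace out the three-dimensional parameter space examined in the study.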
Layer-specific analysis revealed that lesions to lower layers (1–9) disrupted basic response generation, producing no-response errors; lesions to middle-to-upper layers (17–29) gave rise to semantic errors (taxonomic errors near the middle layers and thematic errors at upper layers); and neologisms emerged mainly from lesions in middle layers (9–21). We then applied this framework to error profiles from 81 stroke survivors who had taken the PNT. Individual response profiles could be matched on more than six response categories for 85% of survivors, and on all eight categories for 37%. An analysis by response type showed that each category could be matched in >95% of cases, with the exception of formal errors, which could be matched about 70% of the time. This is unsurprising, given that language models operate over tokens and have impoverished phonological representations. These results show promise for the use of multimodal language models as computational proxies for studying aphasic language impairments. The systematic relationship between perturbation parameters and error patterns provides preliminary computational evidence supporting distributed lexical access theories and suggests computational similarities between such multimodal LLMs and some aspects of human language processing.
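The calibration step, matching individual PWA error profiles to lesion settings, can be sketched as a search over simulated profiles. The abstract does not specify the matching criterion, so the L1 distance used here and the names `match_profile` and `CATEGORIES` are assumptions for illustration only.

```python
import numpy as np

# The eight response categories from the automated classification system.
CATEGORIES = ["correct", "semantic", "formal", "mixed",
              "nonword", "neologism", "unrelated", "no_response"]

def match_profile(patient_profile, simulated_profiles):
    """Find the lesion setting whose error distribution best matches a patient.

    patient_profile    : dict mapping category -> proportion of responses
    simulated_profiles : dict mapping (noise_sd, frac, layer) -> same-shaped dict
    Returns the best-matching parameter triple and its L1 distance (assumed metric).
    """
    target = np.array([patient_profile[c] for c in CATEGORIES])
    best_params, best_dist = None, np.inf
    for params, profile in simulated_profiles.items():
        sim = np.array([profile[c] for c in CATEGORIES])
        dist = np.abs(target - sim).sum()
        if dist < best_dist:
            best_params, best_dist = params, dist
    return best_params, best_dist
```

In this sketch, a patient's profile is compared against the full grid of simulated (noise, fraction, layer) settings, and the closest simulated distribution is taken as that patient's computational lesion.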

Topic Areas: Computational Approaches, Disorders: Acquired
