Using Conversational AI to Elicit Linguistic Variation in Written Language Production: A Useful Neurolinguistic Tool?
Poster Session A, Friday, September 12, 11:00 am - 12:30 pm, Field House
This poster is part of the Sandbox Series.
Paul DiStefano, Emily Herman, Berk Atil, Feiwen Xiao, Janet van Hell; Pennsylvania State University
Linguistic variation, the ways language use shifts across speakers, contexts, and structures, offers insight into how humans produce and comprehend language. Studying variation reveals the cognitive, social, and grammatical processes underlying communication, yet capturing naturalistic variation within an experimentally controlled framework remains a longstanding challenge. Traditional approaches rely either on controlled elicitation tasks or on naturalistic language corpora. While elicitation tasks offer experimental precision, they often sacrifice ecological validity, producing language that may not reflect everyday use. In contrast, corpus studies provide rich, authentic data but are time-intensive to collect and analyze. These methodological differences often result in inconsistent estimates of variation rates. We therefore introduce a novel paradigm that uses conversational AI to elicit linguistic variation in a naturalistic but structured setting. We present two parallel studies using this paradigm, one in English and one in Spanish. Each study includes approximately 50 monolingual speakers engaging in a written brainstorming dialogue with GPT-4o, a conversational large language model. Rather than instructing participants to use targeted linguistic forms, the AI is prompt-engineered to ask leading questions about the creative prompts to encourage variation without producing the target forms itself. From the participant’s perspective, the AI acts as a co-author; from the researcher’s perspective, it functions as a confederate encouraging natural language variation. We hypothesize that this method will elicit usage patterns more closely aligned with corpus distributions than traditional elicited production tasks do. Study 1 focuses on past participle variation in English. Participants respond to creative prompts (e.g., encountering a mysterious traveler) that encourage the use of modals with past participles.
We analyze whether participants produce standard forms (He could’ve written, he should’ve eaten) or non-standard forms (He could’ve wrote, he should’ve ate). Study 2 examines variable clitic placement (VCP) in Spanish. Spanish permits clitics before or after non-finite verbs (e.g., lo quiero pintar ‘(it) I want to paint’ / quiero pintarlo ‘I want to paint (it)’). The AI encourages repeated referents and target constructions known to condition clitic variation, allowing us to study spontaneous usage patterns. Data from both studies will be compared to corpus-based and elicited production data to assess ecological validity. We seek to establish conversational AI as a confederate-like elicitation tool that preserves the spontaneity and variability of natural language, offering a promising middle ground between elicitation tasks and corpora. Because morphosyntactic variation engages memory, attention, and syntactic planning, this paradigm offers a novel behavioral tool for studying language control, morphosyntactic retrieval, and monitoring processes in neurocognitive research. Additionally, since prior research shows that monolinguals and bilinguals differentially represent language in the brain, this approach could be extended to probe how language experience and proficiency shape grammatical processing. Overall, this approach could open up new methods using conversational AI for behavioral and neurocognitive research, or as a therapy tool for clinical patients.
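As an illustration only, the Study 1 coding step (labeling the verb form that follows a modal plus "have" as standard or non-standard) might be sketched along the following lines. The abstract does not describe the authors' coding procedure; the verb lookup, the regular expression, and the function name `code_participles` are all hypothetical and cover only a handful of irregular verbs.

```python
import re

# Hypothetical lookup (an assumption, not the authors' materials):
# simple past form -> standard past participle, for a few irregular
# verbs whose two forms differ.
PRETERITE_TO_PARTICIPLE = {
    "wrote": "written",
    "ate": "eaten",
    "went": "gone",
    "saw": "seen",
    "took": "taken",
}
PARTICIPLES = set(PRETERITE_TO_PARTICIPLE.values())

# Modal, then "have" (full or contracted, straight or curly apostrophe),
# then the next word, which is captured as the candidate verb form.
MODAL_HAVE = re.compile(
    r"\b(could|should|would|might|must)(?:\s+have|['’]ve)\s+(\w+)",
    re.IGNORECASE,
)


def code_participles(text):
    """Return a (modal, verb form, label) tuple for each modal + have + word match."""
    results = []
    for match in MODAL_HAVE.finditer(text):
        modal = match.group(1).lower()
        verb = match.group(2).lower()
        if verb in PARTICIPLES:
            label = "standard"
        elif verb in PRETERITE_TO_PARTICIPLE:
            label = "non-standard"
        else:
            label = "other"  # word not in the lookup; would need manual coding
        results.append((modal, verb, label))
    return results


if __name__ == "__main__":
    sample = "He could’ve wrote a letter, but he should have eaten first."
    for row in code_participles(sample):
        print(row)
```

A pattern-based pass like this would at most pre-sort candidate tokens; regular verbs, intervening adverbs, and negation all fall into the "other" category here and would still require manual annotation.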
Topic Areas: Methods, Language Production