Poster Presentation


Mapping Audiovisual Speech Comprehension in Naturalistic Contexts

Poster Session C, Saturday, September 13, 11:00 am - 12:30 pm, Field House

Hailey C. Smith1, Stephanie Noble2, Jonathan E. Peelle3; 1Northeastern University

Traditional paradigms in cognitive neuroscience, while valuable, often fail to capture the richness of real-world experiences. In many experimental designs, auditory, visual, and language components of stimuli are deliberately isolated, introducing what has been termed a "laboratory-style" bias (Zhang et al., 2021). While such controlled settings are effective for identifying consistent neural responses, they fail to reflect the complexity of everyday perception. To address this gap, researchers have increasingly adopted naturalistic paradigms such as movie watching, which offer rich, cross-sensory stimulation and engage a broader range of neural processes. This approach has been used to evaluate responses across a wide range of cognitive domains (Saarimäki, 2021; Maguire, 2012). Speech perception, for instance, is inherently multisensory, requiring the integration of auditory cues, lip movements, and facial expressions. Prior neuroimaging studies have suggested that the left posterior superior temporal sulcus (pSTS) plays a key role in this integration, particularly when auditory and visual inputs are congruent (Beauchamp et al., 2005). A patchy organization exists within this region: some patches respond selectively to auditory input, others to visual input, and some to both. This structure shows heightened activation when audiovisual cues align, suggesting a specialized role in binding multisensory information. However, many prior studies have relied on McGurk stimuli, which involve incongruent auditory and visual speech information. The degree to which the pSTS is involved in multisensory speech processing under naturalistic conditions therefore remains unclear. The current study assesses multisensory speech perception using the full-length film Back to the Future (1 hour 47 minutes in duration), leveraging fMRI data from the Naturalistic Neuroimaging Database (NNDb) (Aliko et al., 2020). We manually annotated the onset times of 8,634 speech events; these onsets were then convolved with a hemodynamic response function to create regressors for a general linear model (GLM) analysis. Based on prior literature, we hypothesized increased activation in the superior temporal gyrus (STG) for audio-only words, visual cortex responses for audiovisual speech, and robust pSTS activation during naturalistic viewing. Using a whole-brain corrected threshold, we consistently identified speech-related activation in bilateral STG across all six participants for both audio-only and audiovisual speech. Contrary to expectations, we did not observe visual cortex activation. However, we found enhanced activation in the left pSTS in all participants, supporting its role in integrating speech information in dynamic, naturalistic contexts. These findings underscore the value of movie-watching paradigms in capturing ecologically valid speech processing. By extending traditional experimental approaches, this method offers promising new insights into how the brain supports real-world communication.
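
To illustrate the onset-to-regressor step described above, the following is a minimal sketch (not the authors' pipeline) of convolving annotated speech-event onsets with a canonical double-gamma hemodynamic response function to produce a GLM design-matrix column. The TR, number of volumes, example onsets, and HRF parameters are illustrative assumptions, not values from the study.

```python
import numpy as np
from scipy.stats import gamma

TR = 1.0          # repetition time in seconds (assumed)
n_scans = 6400    # number of volumes (assumed, roughly a 107-minute film)
dt = 0.1          # fine time grid for convolution (seconds)

def canonical_hrf(t):
    """Double-gamma HRF with SPM-style default parameters."""
    peak = gamma.pdf(t, 6)           # positive response peaking around 5-6 s
    undershoot = gamma.pdf(t, 16)    # later negative undershoot
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()           # normalize to unit sum

# Hypothetical speech-event onsets in seconds (the study used 8,634 annotated events)
onsets = np.array([3.2, 7.9, 12.4, 18.0, 25.6])

# Stick (impulse) function on the fine grid, one impulse per event onset
n_fine = int(n_scans * TR / dt)
sticks = np.zeros(n_fine)
sticks[np.clip((onsets / dt).astype(int), 0, n_fine - 1)] = 1.0

# Convolve impulses with the HRF, then trim to the scan duration
hrf = canonical_hrf(np.arange(0, 32, dt))
regressor_fine = np.convolve(sticks, hrf)[:n_fine]

# Downsample to one value per TR for use as a GLM regressor
regressor = regressor_fine[:: int(TR / dt)][:n_scans]
```

In practice this kind of convolution and downsampling is typically handled by standard fMRI toolboxes; the sketch simply makes explicit what "convolved with a hemodynamic response function to create regressors" involves.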

Topic Areas: Speech Perception
