Poster Presentation


Modeling hemodynamic responses to the speech envelope in STG

Poster Session C, Saturday, September 13, 11:00 am - 12:30 pm, Field House
This poster is part of the Sandbox Series.

Arielle Moore1, Tyler Perrachione2, Emily Stephen3; 1Boston University

The amplitude envelope of speech provides important information about its underlying phonetic content, and behavioral evidence suggests that these amplitude variations are critical for speech recognition. Measuring neural tracking of the speech envelope is an increasingly popular method for investigating the brain bases of speech recognition and how these may vary across individuals and conditions. However, studies of neural tracking of the speech envelope come almost exclusively from electrophysiological methods like EEG, MEG, or ECoG (Oganian and Chang 2019; Keitel, Gross, and Kayser 2018), which have excellent temporal resolution to capture neural responses on the relevant timescales of the speech envelope (1-8 Hz). Inferences about functional neuroanatomy from these techniques are limited, whether by the limited spatial resolution of EEG and MEG or by the limited coverage of recording sites in ECoG. Consequently, little is known about which local brain regions differentially track the speech envelope and other rapid speech features. While fMRI provides much better spatial resolution, this technique has been thought to have limited temporal resolution because it measures slow hemodynamic responses that unfold on the order of 0.05 Hz and samples (at best) at 1-2 Hz. Recent evidence nonetheless suggests that the BOLD signal has surprising temporal precision for rapid neural events (Lewis et al. 2016). The only extant fMRI study of speech envelope tracking performs classic voxelwise modeling using a convolved standard hemodynamic response function (Hausfeld, Hamers, and Formisano 2024), which may not effectively capture the BOLD hemodynamics of rapid neural events (Polimeni and Lewis 2021). Here, we use data from a prior block-design fMRI study of participants listening to 18-s samples of dynamic, natural speech stimuli (obtained at 3T, 3-mm voxels, TR = 0.75 s) to examine three different approaches to modeling neural responses to the speech envelope.
Specifically, we compare a novel data-driven multivariate regression approach against two classic univariate fMRI analysis methods (FIR and HRF-convolution OLS models) with respect to their accuracy and precision in identifying latent patterns of neural activation in response to the speech envelope. The models will be evaluated using cross-validated model significance measures to determine how effectively each captures speech envelope tracking responses across the superior temporal gyrus (STG). Univariate analysis of fMRI data measures differences in response magnitudes at the level of individual voxels, but its assumption of independence between voxel activities contradicts what we know about the functional relationships among voxels within similar regions and tissue types (Weaverdyck, Lieberman, and Parkinson 2020). Voxels often share temporal response patterns that may be obscured by traditional univariate approaches. This is particularly relevant for features like the fast and dynamic speech envelope, which may elicit temporally coordinated responses across regions that univariate models are ill-equipped to capture. With advances in faster fMRI techniques and growing evidence that BOLD signals can reflect rapid changes in neural activity, data-driven multivariate models offer an exciting opportunity to uncover novel aspects of speech processing using fMRI that have previously been the purview only of other recording techniques like EEG or ECoG.
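For readers unfamiliar with the two univariate baselines named above, the following is a minimal sketch of how their regressors differ, not the authors' pipeline. It assumes the speech envelope has already been resampled to the TR grid. The function names, the double-gamma parameters (shapes 6 and 16, undershoot ratio 1/6), the 16-lag FIR window, and the toy envelope values are all illustrative assumptions that do not come from the abstract. Only the TR (0.75 s) and the 18-s block length are taken from the text.

```python
import math

TR = 0.75  # repetition time in seconds, as reported in the abstract


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(tr, duration=30.0):
    """Double-gamma HRF sampled every `tr` seconds, normalized to unit sum.

    Shapes 6 and 16 and the 1/6 undershoot ratio are common defaults,
    assumed here for illustration; they are not taken from the abstract.
    """
    n = int(round(duration / tr))
    h = [gamma_pdf(i * tr, 6.0) - gamma_pdf(i * tr, 16.0) / 6.0 for i in range(n)]
    total = sum(h)
    return [v / total for v in h]


def hrf_regressor(envelope, hrf):
    """Causal convolution of a TR-sampled envelope with the HRF.

    Yields a single regressor whose response shape is fixed in advance.
    """
    return [
        sum(hrf[j] * envelope[i - j] for j in range(min(i + 1, len(hrf))))
        for i in range(len(envelope))
    ]


def fir_design(envelope, n_lags):
    """Lagged copies of the envelope: one column (one free weight) per lag.

    Fitting these columns with OLS estimates the response shape from data.
    """
    return [
        [envelope[i - lag] if i - lag >= 0 else 0.0 for lag in range(n_lags)]
        for i in range(len(envelope))
    ]


# Toy envelope: 24 volumes = one 18-s block at TR = 0.75 s (values are made up).
envelope = [0.0] * 24
envelope[2] = 1.0

hrf = double_gamma_hrf(TR)
x_hrf = hrf_regressor(envelope, hrf)     # 24 values, fixed response shape
x_fir = fir_design(envelope, n_lags=16)  # 24 x 16, response shape left free
```

The HRF-convolution model commits to one canonical response shape and estimates a single amplitude per voxel. The FIR model spends one parameter per lag to let the shape vary. Both are fit independently at each voxel, which is the independence assumption the multivariate approach is meant to relax.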

Topic Areas: Computational Approaches
