Different Speech Styles Involve Distinct Auditory-Phonetic Planning Targets
Poster Session B, Friday, September 12, 4:30 - 6:00 pm, Field House
This poster is part of the Sandbox Series.
Will Chih-Chao Chang¹, Srikantan Nagarajan², John Houde², Connor Mayer¹, Gregory Hickok¹; ¹University of California, Irvine, ²University of California, San Francisco
Speech production is typically modeled as a process in which speakers use auditory targets to guide motor speech planning (Hickok, 2014; Houde & Nagarajan, 2011; Tourville & Guenther, 2013). But what are these targets: abstract, categorical phonological codes, or more specific representations tied to the phonetic details of an utterance? For example, previous research has shown that formant values of the same vowels differ systematically between casual and clear speech styles (Smiljanic & Bradlow, 2005; Leung et al., 2016). Are there distinct auditory-phonetic targets within a sound category for casual and clear speech, or does a single prototypical target serve both styles of articulation? Here we use the centering effect to address this question. Centering refers to the tendency to correct ongoing utterances that are initially articulated off target, e.g., “centering” an initially off-target vowel toward a speaker’s prototypical formant values (Niziolek et al., 2013). This is argued to reflect error correction via feedback control, in which discrepancies between the prototypical target and actual speech are detected and corrected over time. The centering effect therefore provides a means of identifying targets in speech planning. In this study, we asked whether speakers center to different auditory-phonetic targets during casual vs. clear speech production. Participants produced the English monosyllabic vowel-initial words app (/æ/), up (/ʌ/), and ebb (/ɛ/) in two speech styles: casual (as in everyday conversation) and clear (as if speaking to a hearing-impaired listener). Data collection is ongoing. For each vowel production, we calculated the average first and second formant values (F1 and F2, in mels) over two time windows: an early window (first 50 ms) and a middle window (middle 50%).
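The two-window averaging described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' analysis code: the function name `window_means`, the assumption that sample times are measured in ms from vowel onset, and the reading of "middle 50%" as the central half of the vowel's duration are all assumptions introduced here.

```python
from statistics import mean


def window_means(times_ms, values, early_ms=50.0, middle_fraction=0.5):
    """Return (early_mean, middle_mean) for one formant track of one vowel.

    times_ms: ascending sample times in ms, measured from vowel onset
              (assumed convention, not stated in the abstract).
    values:   formant values (e.g. F1 in mels) at those times.
    """
    if len(times_ms) != len(values) or not values:
        raise ValueError("times_ms and values must be non-empty and equal length")

    duration = times_ms[-1]

    # Early window: samples within the first `early_ms` milliseconds.
    early = [v for t, v in zip(times_ms, values) if t < early_ms]

    # Middle window: the central `middle_fraction` of the vowel duration
    # (assumed interpretation of "middle 50%").
    lo = duration * (1 - middle_fraction) / 2
    hi = duration * (1 + middle_fraction) / 2
    middle = [v for t, v in zip(times_ms, values) if lo <= t <= hi]

    return mean(early), mean(middle)
```

Calling the function once for the F1 track and once for the F2 track of a trial would give that trial's early and middle points in F1-F2 space.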
For each speaker, we computed median F1 and F2 values during the early time window for each vowel and speech style to represent distinct phonetic targets for casual and clear speech. Using the Euclidean distance from each trial to the relevant medians in the early time window, we classified trials as “center” (on-target; closest third) or “periphery” (off-target; farthest third) productions (Niziolek et al., 2015). We then focused on periphery productions, comparing their Euclidean distances to the casual vs. clear vowel medians during the middle time window to assess whether formant trajectories in casual and clear speech shift toward distinct phonetic targets. A linear mixed-effects model on the pilot data revealed a significant interaction between Speech Style (Casual vs. Clear) and Phonetic Target (Casual vs. Clear vowel median) on Euclidean distance (β = -0.554, p < .01). This result suggests that periphery vowel productions shifted in different directions in the acoustic-phonetic space over time: casual speech moved closer to the casual vowel medians, while clear speech moved closer to the clear vowel medians. Our findings provide evidence that speakers have distinct auditory-phonetic targets depending on speech style, indicating that planning processes operate within sound categories to account for systematic phonetic variation in speech production. Moving forward, our goal is to understand the neural mechanisms underlying auditory feedback control in this within-category planning process through speech-induced suppression of auditory-evoked neural responses.
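The center/periphery classification and the distance comparison described above might look something like the sketch below. It is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `classify_trials` and `distance_to_targets` are hypothetical, each point is assumed to be an (F1, F2) pair in mels, and `classify_trials` is assumed to be called separately for each speaker, vowel, and speech style. The mixed-effects model itself is not shown.

```python
import math
from statistics import median


def classify_trials(early_points):
    """Label trials by distance from the median of the early window.

    early_points: list of (F1, F2) pairs for one speaker, vowel, and style.
    Returns (target, labels), where target is the (median F1, median F2)
    and labels[i] is 'center', 'periphery', or 'middle' for trial i.
    """
    target = (
        median(p[0] for p in early_points),
        median(p[1] for p in early_points),
    )
    dists = [math.dist(p, target) for p in early_points]

    # Rank trials from closest to farthest from the median target.
    order = sorted(range(len(dists)), key=dists.__getitem__)
    third = len(order) // 3

    labels = ["middle"] * len(order)
    for i in order[:third]:                    # closest third
        labels[i] = "center"
    for i in order[len(order) - third:]:       # farthest third
        labels[i] = "periphery"
    return target, labels


def distance_to_targets(mid_point, casual_target, clear_target):
    """Distances from a middle-window (F1, F2) point to both style targets."""
    return math.dist(mid_point, casual_target), math.dist(mid_point, clear_target)
```

For a periphery trial, the pair returned by `distance_to_targets` corresponds to the two Phonetic Target conditions (casual vs. clear vowel median) whose distances are compared across speech styles.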
Topic Areas: Speech Motor Control, Multisensory or Sensorimotor Integration