Effects of emotional prosody on sentence intelligibility and emotion recognition in vocoded speech
Poster Session B, Friday, September 12, 4:30 - 6:00 pm, Field House
Jessica Alexander1, Fernando Llanos1; 1The University of Texas at Austin
Effective human communication requires a listener to decode not only the content of speech but also the emotional state of the speaker. Importantly, human communication rarely takes place in optimal listening environments. Listeners typically face acoustic challenges, such as background noise, that can exert distinct effects on the intelligibility and recognizability of different vocal emotions. Furthermore, a speaker's emotional intonation influences the listener's cognitive and neural function, with consequences for attention, lexical access, and memory. Thus, some vocal emotions may require greater decoding effort in certain contexts. This may be particularly true for individuals with cochlear implants (CIs), who perceive speech with substantially reduced spectral detail. Although CI users are generally less accurate than normal-hearing listeners in identifying both the content of speech and its emotional inflection, little prior research has investigated whether such disadvantages are emotion-dependent or, conversely, hold for all vocal emotions. Here, we investigated the effects of emotional prosody on sentence intelligibility and emotional valence recognition across two levels of noise vocoding: 4-channel and 8-channel. Participants (N=35) transcribed semantically neutral sentences produced with neutral, angry, and happy prosody that were spectrally degraded with a noise vocoder to retain the target number of spectral bands. Participants then categorized the emotional valence (neutral, negative, positive) of each sentence at the same two levels of spectral detail, as well as in the full-spectrum version of each stimulus. Sentences produced with happy prosody showed reduced intelligibility in vocoded speech (linear mixed-effects models: ps<.001). Although emotion recognition accuracy was high for happy prosody in full-spectrum speech, recognizability of happy prosody dropped as spectral detail decreased (logistic mixed-effects models: ps<.001). Multidimensional scaling of categorization behavior revealed that extreme degradation of spectral information (4-channel vocoding) led to a contraction of the perceptual dimension supporting the separation of happy and angry prosodies (linear mixed-effects models: ps<.004), and this contraction was associated with greater uncertainty when categorizing happy speech (logistic mixed-effects models: ps<.023). These findings suggest that disruption of acoustic features crucial to speech intelligibility and emotion recognition, such as natural F0 contours, has a more deleterious impact on speech produced with happy prosody than on speech produced with neutral or angry prosody. Given that speech intelligibility and emotion recognition are shaped by challenges in the listening context, and that these challenges do not affect all vocal emotions equally, we are extending these behavioral findings to investigate the neural correlates of emotional speech perception in vocoded speech. Participants (anticipated N=15) transcribe 498 unique sentences, equally distributed across the same three emotional prosodies studied behaviorally and presented either full-spectrum or with 4-channel vocoding. The strength of cortical encoding of the auditory signal is measured by speech-brain coherence, and possible attention effects are examined via alpha power dynamics.
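For illustration of the stimulus manipulation, the sketch below implements a generic channel (noise) vocoder of the kind described above: the signal is split into a small number of frequency bands, each band's amplitude envelope is extracted, and the envelopes are used to modulate band-limited noise, discarding spectral fine structure such as the natural F0 contour. The band edges, filter order, and envelope cutoff here are illustrative assumptions, not the parameters used in this study.

```python
# Minimal noise-vocoder sketch (illustrative; not the study's exact pipeline).
# Assumes a mono waveform `signal` sampled at `fs` Hz.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=4, f_lo=80.0, f_hi=6000.0):
    """Replace spectral detail with envelope-modulated noise in n_channels bands."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    carrier = np.random.randn(len(signal))             # broadband noise carrier
    env_sos = butter(4, 50.0, btype="lowpass", fs=fs, output="sos")
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        env = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed envelope
        out += env * sosfiltfilt(sos, carrier)             # modulate band-limited noise
    # Match overall RMS to the input signal
    out *= np.sqrt(np.mean(signal**2) / np.mean(out**2))
    return out
```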
Findings may lead to an enhanced understanding of how cognitive systems process vocal emotions in challenging listening conditions and elucidate factors that help explain why speech intelligibility and emotion recognition suffer more for some prosodies than for others. Data collection is currently underway.
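As a rough illustration of the planned EEG measures, the sketch below computes speech-brain coherence (magnitude-squared coherence between an EEG channel and the stimulus amplitude envelope in a low-frequency band) and alpha-band power using generic SciPy routines. The variable names, frequency bands, and windowing choices are assumptions for illustration and do not represent the authors' analysis pipeline.

```python
# Illustrative sketch of speech-brain coherence and alpha power.
# Assumes `eeg` (one channel) and `speech_env` (stimulus amplitude envelope)
# are 1-D arrays aligned in time and sampled at `fs` Hz.
import numpy as np
from scipy.signal import coherence, welch

def speech_brain_coherence(eeg, speech_env, fs, fmin=1.0, fmax=8.0):
    """Mean EEG-envelope coherence in a low-frequency band that tracks
    the syllabic/prosodic rate of speech."""
    f, cxy = coherence(eeg, speech_env, fs=fs, nperseg=int(2 * fs))
    band = (f >= fmin) & (f <= fmax)
    return cxy[band].mean()

def alpha_power(eeg, fs, fmin=8.0, fmax=12.0):
    """Mean power spectral density in the alpha band, a common index of
    attentional engagement during listening."""
    f, pxx = welch(eeg, fs=fs, nperseg=int(2 * fs))
    band = (f >= fmin) & (f <= fmax)
    return pxx[band].mean()
```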
Topic Areas: Speech Perception, Prosody