Poster Presentation

Transformer-based LLM (WhisperX) vs Clinician Performance for Transcribing Speech Errors in People with Aphasia

Poster Session C, Saturday, September 13, 11:00 am - 12:30 pm, Field House

Shreya Parchure1, Harris Drachman1, Leslie Vnenchak1, Denise Harvey1, Olufunsho Faseyitan1, Roy Hamilton1, H. Branch Coslett1; 1University of Pennsylvania, Philadelphia, PA

Introduction: Speech samples from people with post-stroke aphasia (PWA) offer rich markers of cognitive and linguistic status that are essential for diagnosis and treatment. However, use of these data has been limited by the need for time- and labor-intensive manual transcription. Recent advances in artificial intelligence (AI)-based speech transcription, particularly transformer-based large language models (LLMs) such as OpenAI's Whisper, hold potential for automating these processes. Despite their utility for healthy naturalistic speech, their ability to recognize impaired speech is understudied.

Aim: Here, we examine the ability of LLMs to transcribe speech samples from the Philadelphia Naming Test (PNT) administered to PWA, compared with systematic IPA transcription by experienced speech-language pathologists (SLPs).

Methodology: We transcribed 3660 trials of PNT speech samples from 21 PWA using WhisperX in Python (model type: small.en, batch size: 32, learning rate: 5e-5, maximum sequence length: 512 tokens), a transformer-based pipeline built on OpenAI's Whisper. Each trial-level transcription was compared to that of an experienced SLP who had previously scored the PNT and was marked as a human-AI match or mismatch. Since ~56% of all PWA responses were off target, with either semantic or phonemic errors, we also classified whether the AI accurately retained these different error types in transcription. Lastly, we benchmarked AI performance at transcribing each PWA's speech by aphasia severity and type as measured by the Western Aphasia Battery.

Results: SLP transcriptions took 2-4 hours on average for each PWA (~175 trials), whereas the automated pipeline required 20-25 min per subject (2-3 min of AI transcription plus ~20 min for a human annotator to align the transcript with each trial). Compared to SLP transcription with >80% inter-rater match, the AI-assisted transcript achieved a 73.63% overall match rate (75.43% with leniency for homophones, spelling, and pluralization errors). The model correctly transcribed 96.36% of semantic errors but only 20.17% of phonemic errors. WhisperX commonly erred by introducing phonemic errors (38.65% of mismatches, i.e., substituting a phonologically similar word, e.g., "base" for "vase"); by lexicalizing erroneous utterances (29.95% of mismatches, i.e., substituting the target word even when the PWA made a phonemic error); or by skipping the trial (23.62% of mismatches).

Discussion: WhisperX reliably transcribes semantic paraphasias but is less reliable at detecting phonemic errors, likely because lexicalizing impaired speech mirrors the behavior encouraged by its training, in which the model learns to auto-correct or interpolate lower-quality recordings of healthy speech. Future models should be explicitly trained on impaired speech samples to overcome this issue. Overall, our work is the first to benchmark and implement transformer-based models for automating aphasic speech transcription, which may increase clinical and research efficiency.

Conclusion: We provide a benchmark of LLM-based real-time transcription of speech in PWA. Future algorithmic refinement for phonemic intricacies using impaired speech as input data is necessary. WhisperX offers a scalable alternative to manual transcription, potentially streamlining clinical workflows and research assessments.
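
As a rough illustration of the automated pipeline described in the Methodology, the sketch below shows how a PNT session recording might be transcribed with the open-source whisperx Python package (small.en model, batch size 32) and how a lenient trial-level match could be scored. This is a minimal sketch assuming whisperx's documented interface; the file name, normalization rules, and compare_trial helper are hypothetical and are not the authors' actual code.

    # Hypothetical sketch, not the authors' implementation.
    import re
    import whisperx

    DEVICE = "cuda"  # use "cpu" (with compute_type="int8") if no GPU is available

    # 1. Transcribe a PNT session recording with the small.en Whisper checkpoint.
    model = whisperx.load_model("small.en", DEVICE, compute_type="float16")
    audio = whisperx.load_audio("pnt_session.wav")       # illustrative file name
    result = model.transcribe(audio, batch_size=32)

    # 2. WhisperX's forced-alignment step adds word-level timestamps, which can
    #    help a human annotator map segments onto individual PNT trials.
    align_model, metadata = whisperx.load_align_model(language_code="en", device=DEVICE)
    aligned = whisperx.align(result["segments"], align_model, metadata, audio, DEVICE)

    # 3. Lenient trial-level comparison against the SLP transcription: count a
    #    match when the strings agree after lowercasing, stripping non-letters,
    #    and dropping a trailing "s" (a crude stand-in for the homophone /
    #    spelling / pluralization leniency reported in the Results).
    def normalize(word: str) -> str:
        w = re.sub(r"[^a-z]", "", word.lower())
        return w[:-1] if w.endswith("s") else w

    def compare_trial(ai_response: str, slp_response: str) -> bool:
        return normalize(ai_response) == normalize(slp_response)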

Topic Areas: Development of Resources, Software, Educational Materials, etc., Computational Approaches
