Treffer: Evaluating Data-Efficient LLMs on a Benchmark of Disfluency Minimal Pairs

Title:
Evaluating Data-Efficient LLMs on a Benchmark of Disfluency Minimal Pairs
Contributors:
Prévot, Laurent
Source:
12th edition of the Disfluency in Spontaneous Speech Workshop (DiSS 2025), pp. 97–101
Publisher Information:
ISCA, 2025.
Publication Year:
2025
Document Type:
Journal article / Conference object
File Description:
application/pdf
DOI:
10.21437/diss.2025-20
Rights:
CC BY
Accession Number:
edsair.doi.dedup.....c8b5cecdca207bf56235e3be711af77c
Database:
OpenAIRE

Further Information

Zero-shot benchmarks based on minimal pairs have become an essential part of the toolkit for evaluating large language models' linguistic capacities. Most of these tasks focus on syntactic, semantic, and morphological phenomena and are built from expert-crafted or semi-automatically generated sentences. Motivated by the crucial role of spontaneous speech in language processing, we experimented with creating a benchmark that leverages spontaneous speech corpora in three languages (English, French, and Mandarin). Crucially, the benchmark tests LLMs on disfluencies, a ubiquitous and essential feature of spontaneous speech. Our findings show that models pretrained on conversational data exhibit a clear advantage in handling disfluencies compared to those trained on written encyclopedic text. Furthermore, cross-linguistic LLMs trained on much larger datasets did not exhibit strong advantages in our proposed benchmark, highlighting the potential of disfluency-based tasks as a challenging problem for language models.
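The zero-shot minimal-pair protocol described above can be sketched as follows: the model assigns a log-probability to each member of a pair, and it is credited when it prefers the attested form. The unigram scorer below is a hypothetical stand-in (an assumption for illustration, not the paper's method); a real evaluation would use an LLM's sentence log-likelihood.

```python
import math
from collections import Counter

def make_unigram_scorer(corpus):
    """Build a toy unigram log-probability scorer from a training corpus.
    Stand-in for an LLM's sentence log-likelihood (illustrative only)."""
    counts = Counter(word for sent in corpus for word in sent.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    def score(sentence):
        # Add-one smoothing so unseen words get finite log-probability.
        return sum(math.log((counts[w] + 1) / (total + vocab))
                   for w in sentence.split())
    return score

def minimal_pair_accuracy(scorer, pairs):
    """Fraction of pairs where the attested (first) member scores higher."""
    correct = sum(scorer(good) > scorer(bad) for good, bad in pairs)
    return correct / len(pairs)

# Tiny conversational "pretraining" corpus containing disfluencies.
corpus = [
    "i uh i mean we went there",
    "so um yeah that was it",
    "we uh we saw it",
    "it was um kind of nice",
]
scorer = make_unigram_scorer(corpus)

# Each pair: (attested disfluent utterance, perturbed variant).
pairs = [
    ("we uh we saw it", "we zebra we saw it"),
    ("so um yeah", "so quantum yeah"),
]
print(minimal_pair_accuracy(scorer, pairs))  # → 1.0
```

Even this crude scorer prefers the attested disfluent forms, since fillers like "uh" and "um" are frequent in conversational data; the benchmark applies the same forced-choice logic with LLM likelihoods.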