In automatic speech recognition, a stochastic language model (LM) predicts the probability of the next word on the basis of previously recognized words. For the recognition of dictated speech, this method works reasonably well, since sentences are typically well-formed and the probabilities can be estimated reliably from large amounts of written text. For spontaneous speech, however, the situation is quite different: disfluencies distort the normal flow of sentences, and written transcripts of spontaneous speech are too scarce to train good stochastic LMs. Both factors contribute to the poor performance of automatic speech recognizers on spontaneous input. In this paper we investigate how one specific approach to handling disfluencies in language modeling for spontaneous speech influences recognition performance.

Computational Linguistics in the Netherlands 2002

Selected Papers from the Thirteenth CLIN Meeting