This paper focuses on spoken corpora – corpora of naturally occurring speech data compiled for use by linguists and discourse analysts – as opposed to speech corpora, which are commonly used in speech technology applications and contain various forms of elicited data. I discuss some of the practical and theoretical issues involved in compiling and analysing such data, especially the problems of prosodic annotation and the automatic analysis of the speech signal. I argue that the primary data, i.e. the sound files, are of crucial importance: sounds are not just an additional resource for the study of prosody but an integral part of the message.