Edited by Ezra Black, Roger Garside and Geoffrey Leech

This book is about building computer programs that parse (analyze, or diagram) sentences of real-world English. The English we are concerned with might be a corpus of everyday, naturally-occurring prose, such as the entire text of this morning's newspaper.
Most programs that now exist for this purpose are not very successful at finding the correct analysis for everyday sentences. In contrast, the programs described here use a more successful, statistically-driven approach.
Our book is, first, a record of a five-year research collaboration between IBM and Lancaster University. Large numbers of real-world sentences were fed into the memory of a program for grammatical analysis (including a detailed grammar of English) and processed by statistical methods. The idea is to single out the correct parse, among all those offered by the grammar, on the basis of probabilities. Second, this is a how-to book, showing how to build and implement a statistically-driven broad-coverage grammar of English. We even supply our own grammar, with the necessary statistical algorithms, and with the knowledge needed to prepare a very large set (or corpus) of sentences so that it can be used to guide the statistical processing of the grammar's rules.

A Changing World of Words

Studies in English Historical Lexicography, Lexicology and Semantics


Edited by Javier E. Díaz Vera

Advances in Corpus Linguistics

Papers from the 23rd International Conference on English Language Research on Computerized Corpora (ICAME 23) Göteborg 22-26 May 2002


Edited by Karin Aijmer and Bengt Altenberg

This book provides an up-to-date survey of current issues and approaches in corpus linguistics in the form of twenty-two recent research articles. The articles cover a wide range of topics illustrating the diversity of research that is characteristic of corpus linguistics today. Central themes are the relationship between theory, intuition and corpus data and the role of corpora in linguistic research. The majority of the articles are empirical studies of specific aspects of English, ranging from lexis and grammar to discourse and pragmatics. Other areas explored are language variation, language change and development, language learning, cross-linguistic comparisons of English and other languages, and the development of linguistic software tools. The contributors to the volume include some of the leading figures in the field, such as M.A.K. Halliday, John Sinclair, Geoffrey Leech and Michael Hoey. The theoretical and methodological issues addressed in the volume clearly demonstrate the steady advance of an expanding discipline inspired by an empirical, usage-based approach to the study of language. The volume is essential reading for researchers and students interested in the use of computer corpora in linguistic research.

Greek στωμύλος ‘chatty’

An anomalous ō-grade (and some anomalous o-grades)

Brent Vine

Michaël Peyrot

Dark Matter

The Root *√k̑u̯el ‘Dark, Black’

Stefan Höfler

Stefan Dollinger

Don Ringe

Eugen Hill

