The Coll Corpus: towards a corpus of web-based college student newspapers

in New Frontiers of Corpus Research

Abstract

Unlike major English-language corpora hitherto released, on-line college student newspapers provide an unexplored record from much younger writers. In these newspapers, 20-year-olds address their peers in a situation that largely parallels standard newspaper writing as regards formal correctness and time pressure. Nearly unconstrained by outside intervention or house style sheets, they deal with a range of university student interests, including creative writing. This preliminary version of the Coll Corpus consists of one issue each of nearly all 300-plus college and university newspapers available on the Web as of spring 1999, with a total of 3.88 million words. Although American English (AmE) dominates, the resultant geographical distribution is relatively well matched to actual population ratios. In its present form, the corpus already allows exploration of numerous lexical and semantic features along temporal and geographic dimensions. Given its on-line accessibility, future versions should be easily expandable by several orders of magnitude.

New Frontiers of Corpus Research

Papers from the Twenty First International Conference on English Language Research on Computerized Corpora, Sydney 2000
