Abstract

Corpus linguists increasingly use the World Wide Web as a corpus for conducting linguistic analyses. The Web, however, is a very different kind of corpus: we do not know, for instance, precisely how large it is or what kinds of texts it contains. In this chapter, we evaluate the Web as a linguistic corpus, providing estimates of its size and composition. In addition, we conduct a series of sample analyses of the Web. These demonstrate that commonly available search engines, despite their definite limitations, can in a matter of seconds retrieve extremely large volumes of data that are highly relevant to a corpus analysis. They also provide frequency information that may not be entirely accurate but is suggestive of how frequently particular words and grammatical constructions occur.

Corpus Analysis

Language Structure and Language Use
