Using currency annotated part of speech tag profiles for the study of linguistic variation – a data exploration of the International Corpus of English

In: Recent Advances in Corpus Linguistics
Author:
Marco Schilk

Abstract

The International Corpus of English (ICE) has been widely used for the description of regional linguistic variation for the past two decades. However, the balanced corpus design of the ICE, which includes a large number of spoken texts and a wide variety of text-types and genres, also seems to be an ideal basis for the description of text-type differences in World English. In contrast to the tradition of focusing solely on variety-specific trends, this paper proposes using ICE as a basis to map out similarities and differences between regional varieties and different text-types. Data-driven in nature, it provides an exploration of nine different CLAWS7-tagged ICE subcorpora. Currency annotated part-of-speech tag profiles are created for the different subcorpora and for the text-types and genres included therein; these profiles are then used to identify homogeneous text-type groups. A comparison of the different groups makes it possible to isolate typical features of specific text-types but also points to some problematic issues concerning the design of the ICE.
