Search Results

In: Defining New Idioms and Alternative Forms of Expression
In: Us / Them
Author: Christian Mair

Abstract

The past two decades have seen considerable advances in the corpus-based “real-time” investigation of linguistic change in English, both in older stages of the language and in progress now. Inevitably, given our present resources, most claims about changes in the language as a whole have been based on written data. Against this backdrop, the present paper seeks to define the potential and limitations of the corpus-based “real-time” study of change in the spoken language, where even for a well documented language such as English the major problem is the paucity of corpus data.

In the absence of recordings of suitable quality, the study of real speech in real time will never be pushed back further than the early 20th century, but as I will make clear with the example of the WW I Phonographische Kommission recordings, a number of interesting resources may well deserve more corpus-linguistic attention than they have received so far. Considerable progress is also likely in the study of the history of the spoken language “by proxy”, i.e. through speech-based genres, of which vast amounts have recently been made available for corpus-linguistic study (Old Bailey, Literature Online, Google N-grams). Particularly with regard to grammar, though, more attention needs to be paid to the question of what is really speech-like in supposedly speech-based genres and which features of spoken syntax are likely to be edited out of the written rendering. Cleft constructions, present both in written and spoken English, but structurally and statistically more richly represented in the latter, will serve as illustration of this point.

In: English Corpus Linguistics: Variation in Time, Space and Genre
Author: Christian Mair

Abstract

Working styles in corpus-linguistic research are changing fast. One traditional constellation, close(d) communities of researchers forming around a specific corpus or set of corpora (the “Brown / LOB community”, “the BNC community”), is becoming increasingly problematical – particularly in the study of ongoing linguistic change and recent and current usage. The present contribution argues that whenever the possibilities of closed corpora are exhausted, it is advisable to turn to the digitised texts which – at least for a language such as English – are supplied in practically unlimited quantity on the world wide web. Web material is most suitable for studies for which large quantities of text and/or very recent texts are required. Specialised chat-rooms and discussion forums may additionally provide an unexpected wealth of material on highly specific registers or varieties not previously documented in corpora to a sufficient extent. On the basis of selected study examples it will be shown that, contrary to widespread scepticism in the field, web texts are appropriate data for variationist studies of medium degrees of delicacy – provided that a few cautionary procedures are followed in the interpretation of the results.

In: Corpus Linguistics and the Web
Volume Editor: Christian Mair
The complex politics of English as a world language provides the backdrop both for linguistic studies of varieties of English around the world and for postcolonial literary criticism. The present volume offers contributions from linguists and literary scholars that explore this common ground in a spirit of open interdisciplinary dialogue.
Leading authorities assess the state of the art to suggest directions for further research, with substantial case studies ranging over a wide variety of topics - from the legitimacy of language norms of lingua franca communication to the recognition of newer post-colonial varieties of English in the online OED. Four regional sections treat the Caribbean (including the diaspora), Africa, the Indian subcontinent, and Australasia and the Pacific Rim.
Each section maintains a careful balance between linguistics and literature, and external and indigenous perspectives on issues. The book is the most balanced, complete and up-to-date treatment of the topic to date.
Author: Christian Mair

Abstract

The paper is a plea for closer cooperation between two traditions in corpus linguistics which have tended to develop in mutual isolation and, occasionally, in some hostility, namely (1) a “small-and-tidy” approach which emphasises detailed philological analysis of clean corpora, and (2) a “big-and-messy” one which stresses the advantages to be gained from the computer-assisted analysis of vast quantities of dirty data. Taking the familiar study example of the get-passive as a starting point, I argue that there are aspects of this well-studied and fairly common construction which cannot be investigated even in a very large closed corpus such as the BNC. Subsequently, I discuss cautionary procedures which need to be followed when mining for data on the Web. In spite of its obvious shortcomings as a corpus, the Web is an indispensable source of data for the study of infrequent and recent linguistic phenomena and, in addition, often provides high-quality data on badly documented “New Englishes”.

In: The Changing Face of Corpus Linguistics
Author: Christian Mair

Abstract

This paper takes as its theoretical framework an approach to corpus-aided discovery learning in which the central role of corpora is seen as that of providing rich sources of autonomous learning activities of a serendipitous kind. Here the suggestion is put forward that availability of different corpora and software tools and the ability to combine these in different ways depending on the purpose of the activity may help learners develop an understanding of the patterned quality of language (probability, strength of co-occurrence restrictions, levels of contextual appropriateness), and be conducive to more appropriate use, as learners are guided not just to observe patterns, but also to develop hypotheses as to their variability. A learning experience is described, in which learners are introduced to a number of corpus tools (larger and smaller, general and specific, monolingual and bilingual corpora; two different software programmes for corpus analysis), and guided to progress from more convergent activities to autonomous browsing. Positive and negative sides of the approach are discussed, also in the light of learners' comments, and suggestions for improving the methodology and the tools currently available to learners are put forward.

In: Teaching and Learning by Doing Corpus Analysis