Sonia Zyngier

Since its beginning in the late 19th century, literary education has lacked theories that systematize teaching and methodologies that validate practice. Consequently, much work in the area has relied on argument rather than on real data. What literary education needs are ways for scholars to develop descriptions of methods that will help them arrive at evidence-based conclusions. However, this is easier said than done. Faced with hypotheses, statistics and numbers in general, Humanities students tend to see the experience as both frightening and fascinating. To find out what difficulties students of literature encounter when learning to do empirical research, a questionnaire was distributed to 14 participants from different countries who attended the IGEL Summer Institute in 2004. Participants were asked how they became interested in empirical studies, what their literary biography was, what they considered the main problems of empirical work to be, and how they thought it related to literary education. Respondents agreed that there is a need to teach students how to deal with real, palpable knowledge by means of well-structured and objective data. This article presents the main problems in empirical work that participants raised.


Hans Martin Lehmann and Gerold Schneider


In this paper we present a corpus-driven approach to the detection of syntax-lexis interactions. Our approach is based on the output of a syntactic parser. We have parsed the British National Corpus and constructed a database of lexical dependencies. Such a large-scale approach allows for a detailed investigation of patterns and constructions associated with individual lexical items found in argument positions.

We then address the methodological problems of such an approach: precision errors (unwanted instances) and recall errors (missed instances) and offer a detailed evaluation. We investigate the interaction between syntax and lexis in verb-subject and verb-object structures as well as the active-passive alternation. We show that our approach provides relatively clean data and allows for a corpus-driven investigation of rare collocations.
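The two error types named above can be illustrated with a minimal sketch. It assumes a hand-annotated gold standard is available for comparison; the function name and the toy instance labels are hypothetical and are not taken from the paper's evaluation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved instances.

    retrieved: instances returned by the parser-based query (assumed input)
    relevant:  instances marked correct in a gold standard (assumed input)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = retrieved & relevant
    # Precision falls when unwanted instances are retrieved.
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    # Recall falls when wanted instances are missed.
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Toy example: four retrieved instances, three gold instances, two in common.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
print(p, r)
```

In this toy case two of the four retrieved instances are correct and two of the three gold instances are found.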


Tania Shepherd, Sonia Zyngier and Vander Viana

This study focuses on the frequency and nature of the lexical choices in two corpora made up of 195 creative writing texts produced by Brazilian public school pupils, living in two markedly different places: a violent inner-city area, and a semi-rural setting near a small market town. To this end, the research adopts a frequency and distribution approach for the extraction and comparison of sequences of words in each of the corpora.

Initially, the discussion focuses on the methodological difficulties encountered when dealing with texts produced by language users with orthographic and punctuation problems. Subsequently, the concept of ‘lexical bundles’ is applied to the data in question, i.e., the most frequent sequences of words are extracted, and classified according to Biber, Conrad & Cortes’ (2004) framework. Finally, results are presented which highlight the large number of lexical patterns in the texts from the semi-rural group, in contrast with a large degree of lexical variability in the texts written by the inner-city pupils. It is suggested that these differences may be attributed to the sociological profile of each individual group.
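A frequency and distribution approach to extracting word sequences can be sketched roughly as follows. This is only an illustration under stated assumptions: whitespace tokenization, four-word sequences, and the threshold parameters `min_freq` and `min_texts` are all hypothetical choices, not the settings used in the study.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences that recur often enough and in enough texts.

    texts:     list of strings, one per pupil text (assumed input format)
    n:         length of the word sequence (assumed value)
    min_freq:  minimum total occurrences (assumed threshold)
    min_texts: minimum number of different texts (assumed threshold)
    """
    freq = Counter()
    spread = defaultdict(set)  # sequence -> ids of texts containing it
    for text_id, text in enumerate(texts):
        tokens = text.lower().split()  # naive tokenization, for illustration
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


sample = [
    "once upon a time there was",
    "once upon a time a dog",
    "a dog ran away",
]
print(extract_bundles(sample))
```

Requiring a sequence to appear in several texts, not only to be frequent overall, keeps one pupil's repeated phrase from counting as a pattern of the whole group.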


Georgie Columbus


Invariant tags, such as huh and innit, are discourse markers that often occur at the end of an utterance to provide attitudinal and/or evidential information above that of the proposition. Many previous studies examined the meaning or usage of these tags in single varieties or dialects of English. Few of these studies, however, have examined variation in invariant tag use. Some studies have investigated sociolinguistic divisions within a dialect, but none have compared usage between varieties. Furthermore, differences in research methodology and aims prevent comparison of the prior results. This study investigates the meaning/functions of four invariant tags—eh, yeah, no, and na—in New Zealand, Indian, and British English. The four most frequent meanings are described in detail. The results show differences in the meanings available as well as in their usage frequencies across both items and varieties. This suggests that varietal differences at the level above propositional understanding could cause problems for intercultural and global communication. This has implications for pedagogy and materials for English for Speakers of Other Languages (ESOL) and English for Specific/Business Purposes, in that global communication in English requires an awareness of these subtle differences at the varietal level.

Corpus-linguistic applications

Current studies, new directions


Edited by Stefan Th. Gries, Stefanie Wulff and Mark Davies

This volume provides an overview of four currently booming areas in the discipline of corpus linguistics. The first section is concerned with studies of the history and development of morphological and syntactic phenomena in English, Spanish, and Mandarin Chinese. The second section contains case studies investigating the functions and contexts of use of different morphological and syntactic forms in English, Spanish, Russian, and Mandarin Chinese. The third section contains studies in the field of genre and register from settings as diverse as health, call center, academic, and legal discourse. The final section features papers refining existing, and exploring new, corpus-linguistic methods: dispersions, text mining, corpus similarity, as well as the development of extraction patterns and the evaluation of tagging methods.

Corpora: Pragmatics and Discourse

Papers from the 29th International Conference on English Language Research on Computerized Corpora (ICAME 29). Ascona, Switzerland, 14-18 May 2008


Edited by Andreas H. Jucker, Daniel Schreier and Marianne Hundt

This volume presents state-of-the-art discussions in corpus-based linguistic research on the English language. The papers deal with Present-day English, worldwide varieties of English and the history of the English language. A special focus of the volume is on studies in the broad field of corpus pragmatics and corpus-based discourse analysis. It includes corpus-based studies of speech acts, conversational routines, referential expressions and thought styles, as well as studies on the lexis, grammar and semantics of English. It also includes several studies on technical aspects of corpus compilation, fieldwork and parsing.


Belinda Crawford Camiciottoli


Moisés Almela and Pascual Cantos

1 Introduction

In corpus linguistics, collocation is one of the primary sources of information for lexical semantic analysis. The search for distributional correlates of semantic properties is widely established as a fundamental methodological strategy in the discipline. At the same time, there is a


Edited by Wolfgang Herrlitz, Sigmund Ongstad and Piet-Hein van de Ven

A pioneer in the comparison of standard language teaching in Europe, the International Mother tongue Education Network (IMEN) has, over the last twenty-five years, stimulated experts from more than fifteen European countries to participate in a range of research projects in this field of qualitative educational analysis. The volume “Research on mother tongue education in a comparative international perspective – Theoretical and methodological issues” documents the theoretical principles and methodological developments that have shaped IMEN research over recent decades and that may significantly broaden the foundations of comparative qualitative research in language education. The topics of this volume include:
• IMEN’s aims, points of departure, history and methodology;
• research on the professional practical knowledge of MTE-teachers;
• innovation, key incident analysis and international triangulation;
• positioning in theory and practice.
Also included: the IMEN bibliography 1984-2004, which supplies a complete picture of IMEN research activities from the beginning.