From Data to Evidence in English Language Research

From Data to Evidence in English Language Research draws on diverse digital data sources alongside more traditional linguistic corpora to offer new insights into the ways in which they can be used to extend and re-evaluate research questions in English linguistics. This is achieved, for example, by increasing data size, adding multi-layered contextual analyses, applying methods from adjacent fields, and adapting existing data sets to new uses. Making innovative contributions to digital linguistics, the chapters in the volume apply a combination of methods to the increasing amount of digital data available to researchers to show how this data – both established and newly available – can be utilized, enriched and rethought to provide new evidence for developments in the English language.

Hardback List price

EUR €132.00 / USD $159.00

Biographical Note

Carla Suhr, Ph.D., University of Helsinki, is a Senior Lecturer in English Philology at that university. She is a co-compiler of the Corpus of Early English Medical Writing and has published on corpus linguistics and historical pragmatics.

Terttu Nevalainen, Ph.D., University of Helsinki, is Professor of English Philology, the Director of the VARIENG Research Unit, and a co-compiler of the historical Helsinki Corpus and the Corpus of Early English Correspondence, with well over 100 related publications.

Irma Taavitsainen, Ph.D., University of Helsinki, Professor Emerita of English Philology, Deputy Director of VARIENG, and a co-compiler of the Helsinki Corpus and the Corpus of Early English Medical Writing, has published extensively on corpus linguistics and historical pragmatics.

Contributors are: Lieselotte Anderwald, Helen Baker, David Brett, Mark Davies, Stefania Degaetano-Ortlieb, Turo Hiltunen, Mark Kaunisto, Hannah Kermes, Ashraf Khamis, Thomas Kohnen, Mikko Laitinen, Alexander Lakaw, Daniela Landert, Magnus Levin, Tony McEnery, Terttu Nevalainen, Antonio Pinna, Antoinette Renouf, Juhani Rudanko, Tanja Rütten, Gerold Schneider, Carla Suhr, Irma Taavitsainen, Elke Teich, Jukka Tyrkkö.

Table of contents

Preface
Editors
Notes on Contributors

1 Corpus Linguistics as Digital Scholarship: Big Data, Rich Data and Uncharted Data
Terttu Nevalainen, Carla Suhr and Irma Taavitsainen

Part 1: Evidence from “Big Data”


2 Big Data: Opportunities and Challenges for English Corpus Linguistics
Antoinette Renouf
3 Corpus-based Studies of Lexical and Semantic Variation: The Importance of Both Corpus Size and Corpus Design
Mark Davies
4 Empirically Charting the Success of Prescriptivism: Some Case Studies of Nineteenth-century English
Lieselotte Anderwald
5 Warn Against -ing: Exceptions to Bach’s Generalization in Four Varieties of English
Mark Kaunisto and Juhani Rudanko

Part 2: Evidence from “Rich Data”


6 Commonplace Books: Charting and Enriching Complex Data
Thomas Kohnen
7 Mining Big Data: A Philologist’s Perspective
Tanja Rütten
8 Function-to-form Mapping in Corpora: Historical Corpus Pragmatics and the Study of Stance Expressions
Daniela Landert
9 Scholastic Argumentation in Early English Medical Writing and Its Afterlife: New Corpus Evidence
Irma Taavitsainen and Gerold Schneider

Part 3: Evidence from Uncharted Data and Rethinking Old Data


10 Language Surrounding Poverty in Early Modern England: A Corpus-based Investigation of How People Living in the Seventeenth Century Perceived the Criminalised Poor
Tony McEnery and Helen Baker
11 An Information-Theoretic Approach to Modeling Diachronic Change in Scientific English
Stefania Degaetano-Ortlieb, Hannah Kermes, Ashraf Khamis and Elke Teich
12 Academic Vocabulary in Wikipedia Articles: Frequency and Dispersion in Uneven Datasets
Turo Hiltunen and Jukka Tyrkkö
13 Words (don’t come easy): The Automatic Retrieval and Analysis of Popular Song Lyrics
David Brett and Antonio Pinna
14 Charting New Sources of ELF Data: A Multi-genre Corpus Approach
Mikko Laitinen, Magnus Levin and Alexander Lakaw
Index

Readership

Experts and learners interested in English corpus linguistics and related digital data sources, and in how their recent developments can be applied to old and new linguistic research questions.