Browse results

Restricted Access

Atong Texts

Glossed, Translated and Annotated

Seino van Breugel

Basque and Romance

Aligning Grammars

Edited by Ane Berro, Beatriz Fernández and Jon Ortiz de Urbina

Aligning Grammars: Basque and Romance is a collection of articles describing and analyzing several of the most important morphosyntactic features for which the formal comparison between Basque and its surrounding Romance languages is relevant, such as word order, inflection, case, argument structure and causatives. In the context of a language virtually all of whose speakers are bilingual in either Spanish or French, the theoretically informed in-depth description offered in this volume focuses on the fine grain of linguistic structures from languages typologically quite apart but coexisting and probably interacting in the minds of speakers. It therefore aims at shedding some light on the types of interactions between different systems and on the systems themselves.

Secondary Content

The Semantics and Pragmatics of Side Issues

Edited by Daniel Gutzmann and Katharina Turgay

In addition to expressing some main content, utterances often convey secondary content, which is content that is not their “main point”, but which rather provides side or background information, is less prominent than the main content, and shows distinctive behavior with respect to its role in discourse structure and which discourse moves it licenses. This volume collects original research papers on the semantics and pragmatics of secondary content. By covering a broad variety of linguistic phenomena that convey secondary content – including expressives, various particles, adverbials, pronouns, quotations, and dogwhistle language – the contributions show that secondary content is pervasive throughout different aspects of natural language and provide new insight into the nature of secondary content through new semantic and pragmatic analyses.

Edited by Manuela E. B. Giolfo and Kees Versteegh

This volume contains sixteen contributions from the fourth conference on the Foundations of Arabic Linguistics (Genova, 2016), all dealing with the development of linguistic theory in the Arabic grammatical tradition, from Sībawayhi's Kitāb (end of the 8th century C.E.) through its continuing evolution in the work of later grammarians up to the 14th century C.E. The scope of this volume includes the links between grammar and other disciplines, such as lexicography and logic, and the reception of Arabic grammar in the Persian and Malay linguistic traditions.

Edited by Carla Suhr, Terttu Nevalainen and Irma Taavitsainen

From Data to Evidence in English Language Research draws on diverse digital data sources alongside more traditional linguistic corpora to offer new insights into the ways in which they can be used to extend and re-evaluate research questions in English linguistics. This is achieved, for example, by increasing data size, adding multi-layered contextual analyses, applying methods from adjacent fields, and adapting existing data sets to new uses. Making innovative contributions to digital linguistics, the chapters in the volume apply a combination of methods to the increasing amount of digital data available to researchers to show how this data – both established and newly available – can be utilized, enriched and rethought to provide new evidence for developments in the English language.

Sound and Grammar

A Neo-Sapirian Theory of Language

Susan F. Schmerling

Sound and Grammar: A Neo-Sapirian Theory of Language by Susan F. Schmerling offers an original overall linguistic theory based on the work of the early American linguist Edward Sapir, supplemented with ideas from the philosopher-logicians Kazimierz Ajdukiewicz and Richard Montague and the linguist Elisabeth Selkirk. The theory yields an improved understanding of interactions among different aspects of linguistic structure, resolving notorious issues directly inherited by current theory from (post-) Bloomfieldian linguistics. In the theory presented here, syntax is a filter on a phonological algebra, not a linguistic level; linguistic expressions are phonological structures, and syntax is semantically relevant relations among phonological structures. The book shows how Neo-Sapirian Grammar sheds new light on syntax-phonology interactions in English, German, French, and Spanish.

Srinagar Burushaski

A Descriptive and Comparative Account with Analyzed Texts

Sadaf Munshi

In Srinagar Burushaski: A Descriptive and Comparative Account with Analyzed Texts, Sadaf Munshi offers a structural description of a lesser-known regional variety of Burushaski spoken in Srinagar, the summer capital of the Indian-administered state of Jammu & Kashmir. The description includes a comprehensive and comparative account of the structural features of Srinagar Burushaski in terms of phonology, morphology, lexicon and syntax. The grammar is supported by an extensive digital corpus housed at the University of North Texas Digital Library. Using contemporary spoken language samples from the Srinagar, Nagar, Hunza and Yasin varieties of Burushaski as well as data from the available literature, Munshi provides a thorough understanding of the historical development of Srinagar Burushaski, complementing the existing studies on Burushaski dialectology.

Turo Hiltunen and Jukka Tyrkkö

Abstract

Despite its popularity, the status of Wikipedia in higher education settings remains somewhat controversial, and the linguistic characteristics of the genre have not been exhaustively described. This exploratory paper takes a data-driven approach to assessing the use of academic vocabulary in Wikipedia articles. Our analysis is based on Coxhead’s Academic Word List, and the data comes from the Westbury Lab Wikipedia Corpus. We employ methods of statistical data analysis to classify Wikipedia articles according to the frequencies of academic words, and apply the same procedure to a comparable set of texts representing another genre, published research articles (RAs). The unsupervised classification procedure groups the articles according to academic content regardless of topic, which allows us to measure genre-specific similarities. The findings of the study show that academic words are common in both genres in focus and, more interestingly, that in terms of aggregate frequencies of academic words, Wikipedia articles are not markedly different from RAs within the same discipline. That said, we can observe disciplinary differences in the distribution of academic words in Wikipedia, such that Economics writing contains more academic words than the other two disciplines in focus. Disciplinary differences can likewise be observed in the distribution of individual academic words.
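
As a rough illustration of the kind of measurement the abstract describes, the sketch below computes per-word and aggregate academic-word rates (per 1,000 tokens) for a text. It is not the authors' procedure: the five-word `ACADEMIC_WORDS` set is a hypothetical stand-in rather than Coxhead's actual list, the function names are invented for this example, and matching is on exact word forms only, whereas the Academic Word List is organized into word families.

```python
import re
from collections import Counter

# Hypothetical stand-in for an academic word list; NOT Coxhead's actual list.
ACADEMIC_WORDS = {"analysis", "data", "method", "research", "theory"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def academic_profile(text, wordlist=ACADEMIC_WORDS):
    """Return each listed word's frequency per 1,000 tokens of the text."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter(t for t in tokens if t in wordlist)
    if total == 0:
        # An empty text has a rate of zero for every listed word.
        return {w: 0.0 for w in sorted(wordlist)}
    return {w: 1000 * counts[w] / total for w in sorted(wordlist)}


def aggregate_rate(profile):
    """Sum the per-word rates into one aggregate academic-word rate."""
    return sum(profile.values())


if __name__ == "__main__":
    sample = "The data support the theory and the method"
    profile = academic_profile(sample)
    print(profile)
    print(aggregate_rate(profile))
```

Profiles of this kind, one per article, could then serve as input vectors to whatever unsupervised grouping method is chosen; the abstract does not specify one, so none is assumed here.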

Antoinette Renouf

Abstract

This paper draws on our personal experience of working with a large diachronic corpus, namely 1.3 billion words of Guardian and Independent news text, from 1984–2013 and ongoing. Big data is thus, for us, both quantitative and temporal. The data exist as raw text and as analysed databases, created by AVIATOR (1990–3), APRIL (1997–2000), WebCorpLSE (2000–) and other tools. We also refer to the COCA corpus (Davies 2008).

Our research focus is on lexis, and such big data is thus desirable (Sinclair 1991; Lindquist 2009). The lexicon comprises a few high-frequency words, but many more medium–low frequency words, and a majority of hapax legomena. Big data increases scope and enhances granularity of study, allowing rare and intuitively inaccessible features to be glimpsed (Renouf 1987c). Thirty-plus years of diachronic text bring the corpus linguist an evolving understanding of language innovation and change (Renouf 2013; Renouf & Kehoe 2013).

On the other hand, big data presents challenges for the corpus linguist. High and even medium-frequency search words and affixes begin to retrieve too much data; hapax legomena, since they are mainly studied for the patterns they show with particular sub-word elements, constitute enormous numbers of tokens for analysis, supplemented by typographical and tagging errors in the corpus “sump” (Clear 1986). Moreover, whilst it undoubtedly allows microscopic analysis, a very large corpus reveals details of language use which complicate descriptions, and can entice the linguist down time-consuming paths of enquiry which prove fruitless or excessive. At this point in corpus linguistic history, large-scale language corpora are available in advance of the necessary tools for automated analysis.

Through small case studies, the paper will illustrate some of the opportunities and challenges of big data experienced recently, in our work in corpus-based lexicology and in two allied fields: socio-pragmatics and lexical morphology.
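
The frequency profile described in the abstract (a few frequent words, many rarer ones, and a large share of hapax legomena) can be inspected with a few lines of code. The following is a minimal sketch, not one of the tools named above; the function name `lexical_profile` and the simple alphabetic tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter


def lexical_profile(text):
    """Count tokens, types and hapax legomena (types occurring exactly once)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    freq = Counter(tokens)
    hapaxes = sorted(word for word, n in freq.items() if n == 1)
    return {
        "tokens": len(tokens),
        "types": len(freq),
        "hapaxes": hapaxes,
        # Share of the vocabulary (types) made up of hapax legomena.
        "hapax_share": len(hapaxes) / len(freq) if freq else 0.0,
    }


if __name__ == "__main__":
    sample = "the cat saw the dog and the bird saw nothing"
    print(lexical_profile(sample))
```

On a corpus of the size discussed here, the hapax list would also include the typographical and tagging errors the abstract mentions, so some filtering step would be needed before analysis.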

Mikko Laitinen, Magnus Levin and Alexander Lakaw

Abstract

The article discusses research that charts new lingua franca English data and broadens the scope of written ELF corpora. We illustrate that, apart from the academic domain, there exist various written genres in non-native contexts in which English is used as a second language resource alongside native languages. These uncharted data can provide us with new ways of approaching the ongoing globalization of English. The new approach incorporates a broader perspective on ELF than previously, seeing it as one stage in the long diachronic continuum of Englishes rather than as an entity emerging in interaction. The first part details a corpus project that produces written multi-genre corpora suitable for real-time studies of how ongoing variability is reflected in lingua franca use. It is followed by three case studies investigating quantitative patterns of ongoing change in ELF. The conclusions suggest that a diachronically informed angle on lingua franca use offers a new vantage point not only on ELF but also on ongoing grammatical variability. It shows that the traditional and canonized way of seeing non-native speakers/writers is not sufficient, nor is the simplified view of norm dependency of non-native individuals.