the morphological by Mike Pacey’s morphological chart parser. The 2015 University of Helsinki conference on “Big Data, Rich Data, Uncharted Data” focussed on the complexity and overlap involved. This paper is a contribution to the debate, with particular reference to big data, and the opportunities

In: From Data to Evidence in English Language Research

For several years now, the media, the business world, and information technology (IT) professionals have often used ‘Big Data’ to describe a new society-wide dynamic. It is characterised not only by the production of massive amounts of data, but also – and especially – by the huge potential benefits that new statistical data-analysis tools would confer. The proliferation of data is so extensive that data capture and analysis are increasingly presented as exceeding human reach, thus necessitating the use of tools and IT methods for interpretation. Data extraction and analysis are defined as ‘data mining,’ without presenting data production, access, and analysis as socially constructed (e.g., with ideological, political, and economic dimensions). Instead, they constitute a means for deriving ‘natural’ information (the Real). In this light, Big Data may be seen as a technique that dispenses with symbolic mediation and thus lies outside the field of politico-ideological debate. Our presentation deals with the potential consequences of models based on Big Data. Big Data modelling relies, in part, on the production, diffusion, and use of individual-specific data (e.g., recommender systems, or data-collection-based profiling). It is primarily used in contexts where future actions must be forecast from present behaviours. Yet insofar as this kind of modelling infers social representations from claims of direct access to the Real (i.e., the non-ideological and non-political), one must wonder whether it is not helping create a governance dynamic that links individuals not to a set of views (e.g., a political choice) but rather – because necessary and inevitable – to the Real itself. In other words, Big Data could be a way of (re)producing the social by preventing the sudden emergence of the political – that is, a means for assuring control by arranging for social representations of the Real.

In: A Digital Janus: Looking Forward, Looking Back

effectiveness of certain traditional approaches that govern the use of data in health research is, however, decreasing in the era of Big Data. It has been indicated that a strict ‘consent or anonymise approach’ neither sufficiently allows for progress in data-intensive health research, nor adequately protects

In: European Journal of Health Law

Author: Tanja Rütten

1 Introduction In historical corpus linguistics, the concept of what constitutes big data has drastically changed with the availability of databases such as COHA, CLMET 3 or EEBO-TCP. For linguists used to handling, by comparison, rather “small data” in corpora of the Helsinki Family of

In: From Data to Evidence in English Language Research

Author: Marcel Lepper

’s compromise-tested philologies, it is nonetheless important to ask what it conceals. That is especially necessary when the rhetoric of revolution joins the ranks of discourses that extend far beyond the domain of philology: “Big data” is the buzzword used to refer to the inevitability of taking a position on

In: Philological Encounters

Big data, data science, machine learning: terms that point to data and its value for a wide range of applications. Who benefits from the data? Who may use it? How can scientific progress, success in economic competition, and protection of privacy be achieved at the same time? This edited volume presents the papers given at a symposium of the Nordrhein-Westfälische Akademie der Wissenschaften und der Künste, with contributions from computer science, statistics, medicine, the engineering disciplines, law, and economics. The Academy thereby takes part in the urgently needed discussion of how to meet the challenges posed by today’s possibilities for collecting and using data.

. Defining their respective boundaries falls outside the scope of this chapter. Suffice it to say that in areas such as big-data applications their interests converge, and both technical and subject know-how, automated processing of language data and human validation of the results, are needed (for some

In: From Data to Evidence in English Language Research

The series aims to publish the latest research at the intersection of Digital Humanities and Biblical Studies, Ancient Judaism, and Early Christianity in order to demonstrate the transformation of research, teaching, cognition and the economy of knowledge in digital culture. In particular, DBS investigates and evaluates the practices and methodologies of Digital Humanities as applied to texts, inscriptions, archaeological data, and scholarship related to these fields.

The primary areas of focus are the digital edition of ancient manuscripts, the evolution of research between big data and close reading, the visualization of data, and the epistemological transformation of ancient studies through digital culture. DBS will encompass collected essays as well as monographs, with a particular emphasis on cutting-edge research. Several ancient languages are in the scope of the series, including ancient Greek, Hebrew, Latin, Arabic, Coptic, and Syriac.

Authors: Wang Lu and Cai Rongxiao

We are now living in an era of big data. From the perspective of data analysis of classroom questioning, the paper chooses three districts in City B that have significant differences. These are educationally developed District D, less developed District F, and developing District M. The study uses the stratified sampling method to choose from three different groups of teachers, namely novice teachers, competent teachers, and experienced teachers, by way of video case analysis, the Item Response Theory (IRT) model method, and the inductive and deductive method, to analyze the characteristics of teachers’ classroom questions. It was found that: (1) In terms of open questioning, all three districts with their different levels in educational development need to improve open questioning levels among their experienced primary school teachers. In middle schools, novice teachers’ open questioning tendency is significantly lower than that of qualified and experienced teachers; (2) From the quantitative study of three kinds of tendencies, the lowest level of the three tendencies in the classroom questions is problem-solving; (3) In the three districts, the experienced and qualified primary school teachers in the developing district have a prominent advantage in raising critical and creative questions, while in the middle schools, novice teachers generally have a lower level than qualified and experienced teachers in raising critical and creative questions. The results of the big data analysis enable us to draw a conclusion about teachers’ value orientation regarding questioning in class. At present, they pay attention to the local value of classroom questioning, but ignore the overall value; pay attention to the instrumental value of classroom questioning, but ignore the objective value; and pay attention to the superficial value of classroom questions, but ignore the underlying value.

In: Frontiers of Education in China

The past decade has seen remarkable progress made in the field of environmental information disclosure in China. While the overall institutional changes and the motivation/willingness of the government to open up information are important conditions, China’s encounter with revolutionary Information and Communication Technology (ICT) advancement and rapidly emerging big data quickly changed China from an “information poor environment” to an “information complex environment.” While most previous studies centered on those drives/constraints that were recognized in established informational governance frameworks, recent advancement in ICTs and emerging big data posed new challenges, opportunities and research questions. As increasing information disclosure has become a new game changer in environmental governance, China has had to cope with risks and pitfalls in a new technology-empowered information environment as well. This article updated previous studies on legislation/regulations/policies regarding environmental information disclosure in China and their implementation effectiveness, and paid special attention to China’s recent informatization progress and emerging big data. Information disclosure was treated as a process that includes data/information generation/collection, disclosure, functional pathways of communication, and direct/indirect impacts. Changes in environmental information disclosure should be understood in a broader context of overall changing environmental governance and informatization in China. It is important to understand ICTs and information disclosure as a double-edged sword. Normative, substantive, and instrumental benefits of disclosure as well as collection and reporting costs, the issue of targeted transparency, and the risk of unintended use should be strategically considered. Principles and guidelines need to be developed to avoid pitfalls while maximizing benefits.

In: Frontiers of Law in China