Computer assisted spelling normalization of 18th century English

in New Frontiers of Corpus Research

Abstract

This paper describes the ongoing development of a software spelling normalization system named ZENSPELL. It is intended to assign normalized, present-day English spellings to 18th century spelling variants with minimal user intervention while keeping the source text intact and available for comparison. The article examines the possibility of adapting 18th century English newspaper texts in order to make them comply with 20th century spelling rules. The idea is to create a hybrid text: like glossed word-for-word ‘translations’ of Latin texts, the target text will contain 18th century sentences, but with 20th century orthographic words. Despite somewhat doubtful linguistic qualities, the resulting ‘artificial’ text will be useful for two purposes. First, lexical searches can be made using one normalized search term instead of having to guess possible spelling variations of the intended term. Second, the target text can be used as input for wordclass taggers such as ENGCG.
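The general idea of a hybrid text can be illustrated with a minimal sketch. It is not ZENSPELL's actual method, which the abstract does not specify: the lookup table, its four entries, and the function names `normalize`, `normalized_text`, and `search` are all assumptions made for illustration. Each token keeps its original spelling alongside a normalized one, so the source text remains recoverable and a single modern search term can find historical variants.

```python
import re

# Hypothetical variant-to-modern lookup; the entries are illustrative only
# and are not taken from ZENSPELL.
VARIANTS = {
    "publick": "public",
    "musick": "music",
    "compleat": "complete",
    "shew": "show",
}

# Split text into alternating runs of letters and non-letters,
# so that joining the tokens reproduces the input exactly.
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def normalize(text):
    """Return a list of (original, normalized) token pairs."""
    pairs = []
    for tok in TOKEN.findall(text):
        key = tok.lower()
        if key in VARIANTS:
            modern = VARIANTS[key]
            # Carry over an initial capital from the source token.
            if tok[0].isupper():
                modern = modern.capitalize()
        else:
            modern = tok  # unknown tokens pass through unchanged
        pairs.append((tok, modern))
    return pairs


def normalized_text(pairs):
    """Rebuild the hybrid text: original sentences, modern spellings."""
    return "".join(norm for _, norm in pairs)


def search(pairs, term):
    """Return the original spellings whose normalized form matches term."""
    return [orig for orig, norm in pairs if norm.lower() == term.lower()]


if __name__ == "__main__":
    pairs = normalize("The Publick will compleat the Musick.")
    print(normalized_text(pairs))
    print(search(pairs, "public"))
```

Storing pairs rather than overwriting tokens is what keeps the source text intact and available for comparison. A real system would need far more than a fixed word list, for example rules or user confirmation for unseen variants.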

New Frontiers of Corpus Research

Papers from the Twenty First International Conference on English Language Research on Computerized Corpora Sydney 2000
