
Identifying, Categorising and Exploring ‘Ælfrician’ Vocabulary Using the Dictionary of Old English, A Thesaurus of Old English and Evoke

In: Amsterdamer Beiträge zur älteren Germanistik
  • 1 Leiden University Centre for Linguistics, Leiden University, Leiden, Netherlands
Open Access

Abstract

Ælfric of Eynsham (c.955×957–c.1010) is one of the most prominent authors of the Anglo-Saxon period. Despite this prominence, there has not yet been an exhaustive study of his typical vocabulary. This article employs the Dictionary of Old English and prior scholarship in order to collect and categorise the lexis that is characteristic of his works. This vocabulary is then analysed using the web application Evoke together with A Thesaurus of Old English, which provides insights into the semantic domains that predominate in Ælfric’s vocabulary, as well as the degrees of ambiguity, synonymy and specificity of his typical lexis.

1 Introduction

Ælfric of Eynsham (c.955×957–c.1010) is arguably the best-known and most prolific writer of Anglo-Saxon England (Hill, 2009: 36–37). Ælfric’s significance for the history of the English language stretches beyond the Norman Conquest, since his works were copied until the early thirteenth century (Treharne, 2009: 400). Aside from his own works being copied in the centuries after the Conquest, Ælfric’s influence on new compositions made in this period is also occasionally cited. For instance, one study by Elaine Treharne, which focuses on the twelfth-century English translation of Ralph d’Escures’ homily on the Assumption of the Virgin Mary, notes that “[m]uch of the vocabulary [of the text] is Ælfrician in nature, so that relatively rare words like ‘wiðmeten,’ ‘bearneac[n]inde,’1 and ‘earplættigen’ appear to be based on a thorough knowledge of Ælfrician prose” (Treharne, 2006: 185, n. 26). The three lexemes that Treharne provides as examples of ‘Ælfrician’ vocabulary differ in some important aspects. According to the Dictionary of Old English (DOE), the adjective bearnēacniende ‘big with child’ and the verb ēarplætt(i)an ‘to strike on the ear’ are quite rare in the Dictionary of Old English Corpus (DOEC): bearnēacniende occurs only three times in the corpus, twice in the works of Ælfric and once in Ralph d’Escures’ homily (DOE, s.v. bearn-ēacniende), while ēarplætt(i)an occurs twice in the DOEC, once in the works of Ælfric and once in d’Escures’ homily (DOE, s.v. ēar-plættan, ēar-plættian).2 By contrast, a search in the DOEC for all instances of the verb wiðmetan ‘to compare’ reveals that 20 out of its 56 occurrences are found in the works of Ælfric, and that this lemma is found in more than 20 distinct texts in the corpus.

The discrepancies between these lemmata, which have all been termed ‘Ælfrician’ by Treharne, raise some important questions. First of all, if a lemma is found only in the works of Ælfric and one other text, but occurs rarely, can this lemma really be considered characteristic of Ælfric’s lexis? Similarly, can a lexical item that is found in more than twenty distinct texts also be labelled as ‘Ælfrician’? Indeed, how can ‘Ælfrician’ lexemes such as bearnēacniende and wiðmetan be compared to each other? Is it possible to devise a classification system that can differentiate between lexemes which are more or less typical of Ælfric’s vocabulary?

In order to answer these questions, a general overview of Ælfric’s characteristic vocabulary would be helpful. To my knowledge, such a large-scale study has not yet appeared, although there are some smaller studies in which some tendencies in his lexis are highlighted (e.g., Jost, 1927; 1950; Pope, 1967: 99–103; Ono, 1988). In order to fill this lacuna, this article demonstrates how this prior scholarship and the DOE can be used to collect and categorise vocabulary that has been identified as being characteristic of Ælfric’s writings. Next, the article shows how the web application Evoke (Stolk, 2018) may be used to further explore this Ælfrician vocabulary.

Section 2 will address the collection and the categorisation of the Ælfrician lemmata, as well as discuss a number of issues relating to the use of the DOE for studies of this type. Subsequently, section 3 will focus on using A Thesaurus of Old English (TOE) and Evoke to explore Ælfrician lexis. I will discuss the process of tagging the Ælfrician vocabulary in Evoke and the issues that were encountered during this process, the tendencies which characterise Ælfric’s typical lexis, and a number of categories in TOE in which Ælfric’s vocabulary is over- and underrepresented. In the conclusion, some further possible avenues of research into Ælfric’s vocabulary will be pointed out. A full overview of the Ælfrician vocabulary established on the basis of the DOE and prior scholarship is provided in Appendices A and B.

2 Identifying and Categorising Ælfrician Vocabulary

The label ‘Ælfrician’ is not one which was used in Ælfric’s own day. Rather, it is a term that will be employed in this article to refer to vocabulary which prior scholarship and the DOE have identified as being restricted to or predominantly found in Ælfric’s works, or lexical items that were preferred by him over synonymous lexemes. Even when the term ‘Ælfrician’ refers to vocabulary that is primarily found in or restricted to the works of Ælfric, it is quite likely that his contemporaries, whose works have simply not come down to us, used the same words. Since the corpus of Old English texts is incomplete and Ælfric’s works are overrepresented in this corpus, especially so in particular text genres, such as grammars, the label ‘Ælfrician’ is simply used in relation to the texts that we have left (see also section 2.4).

2.1 Sources: DOE and Prior Small-Scale Studies

The DOE is the most important source for any study dealing with Old English lexis. In addition to listing senses of lemmata, the DOE also provides citations for these senses, and occasionally provides information on the usage of particular lemmata, for instance, when they are found frequently in the works of Ælfric. For this reason, the DOE was consulted first in order to find lemmata which have been labelled by this dictionary as Ælfrician. These words can be identified in the DOE by the information that the entries provide following the number of occurrences of a lemma. For instance, the entry for the lemma antimber ‘material, substance’ mentions the following about the occurrence of the lexeme in the DOEC: “ca. 45 occ. (freq. in Ælfric)” (DOE, s.v. an-timber). The DOE uses a number of different labels for Ælfrician vocabulary. For instance, the lemma anmōdlīce ‘resolutely’ has six occurrences “in Ælfric”, meaning that it is wholly restricted to the works of Ælfric (DOE, s.v. an-mōdlīce). Two other labels which are often encountered are “mainly in Ælfric” (see, e.g., DOE, s.v. ǣfnung) and “freq. in Ælfric” (see, e.g., DOE, s.v. ed-wist), which are applied to lemmata that also occur outside of Ælfric’s works.3 Searching for ‘ælfric’ in the “Occurrence” field identifies all of the lemmata which have been labelled as Ælfrician in the DOE. In addition, searches were performed for the short titles of Ælfrician works, such as ‘ÆCHom’, ‘ÆLS’, and ‘ÆGram’ in the same field, since lexemes that primarily or exclusively occur in these works can also be seen as part of Ælfric’s vocabulary as a whole.4 These lexemes employ labels similar to those mentioned above, such as “in ÆGram” (see, e.g., DOE, s.v. āxiendlic) or “mainly in ÆGram” (see, e.g., DOE, s.v. dǣdlic).
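The labels quoted above follow a regular pattern: an optional qualifier, the word “in”, and then either Ælfric’s name or the short title of one of his works. As an illustration only, the following Python sketch sorts such label strings into a qualifier and a scope. The pattern is modelled solely on the examples cited in this section; the actual structure of the DOE’s “Occurrence” field may differ, and the value ‘only’ for unqualified labels is a name of my own, not a DOE term.

```python
import re

# Pattern modelled on the labels quoted in this section, e.g. "freq. in Ælfric"
# or "mainly in ÆGram". This is an assumption about the label wording, not a
# description of how the DOE stores its data.
LABEL_RE = re.compile(
    r"^(?:(?P<qualifier>disproportionately freq\.|mainly|freq\.)\s+)?"
    r"in\s+(?P<scope>Æ\w+)$"
)


def parse_label(label):
    """Split a label such as 'mainly in ÆGram' into (qualifier, scope)."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        return None
    # A bare "in Ælfric" carries no qualifier; it is read here as meaning that
    # the lemma is wholly restricted to the named author or work ("only").
    return (match.group("qualifier") or "only", match.group("scope"))


def label_from_occurrence_note(note):
    """Extract the parenthesised label from a note like 'ca. 45 occ. (freq. in Ælfric)'."""
    inner = re.search(r"\(([^)]*)\)", note)
    return parse_label(inner.group(1)) if inner else None
```

Applied to the note quoted for antimber, `label_from_occurrence_note("ca. 45 occ. (freq. in Ælfric)")` yields the qualifier ‘freq.’ and the scope ‘Ælfric’, whereas a note without any Ælfrician label yields `None`.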

The current edition of the DOE only goes up to the letter I. In order to complement the data from the DOE with information about lemmata beyond the letter I, a literature review was also conducted, which aimed to include as many sources as possible that mention lexemes seen as characteristic of Ælfric.5 Through a number of small-scale studies, previous scholarship has established that Ælfric exhibits a consistent lexical usage which is characteristic of his works. The first to note Ælfric’s preferred usage of certain lexical items over synonymous lemmata was Dietrich (1855: 544–545, fn. 140). Since Dietrich’s article, there have been many studies which have mentioned similar preferences, as well as the restriction of particular lemmata to the works of Ælfric; prominent studies include those by Jost (1927; 1950), Pope (1967: 99–103), Godden (1980), and Ono (1988).6 Another important facet of research into Ælfric’s vocabulary relates to his usage of the ‘Winchester vocabulary’ – a particular lexical usage associated with the school of Ælfric’s teacher Æthelwold – of which “Ælfric is considered [the] most prominent and most consistent proponent” (Gretsch, 2009: 125).7

An attempt has been made to include as many studies as seemed relevant, but sources that have been shown to be problematic in later literature have been avoided.8 In consulting the sources, the focus was solely on Ælfric’s lexical usage, which includes lexemes that are restricted to his works, preferred lexemes, and the use of meanings which are particular to Ælfric. In other words, data such as Ælfric’s preferential use of the verb bedǣlan with an object in the genitive, rather than the dative (Jost, 1950: 122), have not been included. Features of Ælfric’s vocabulary that either I or the relevant source deem of questionable relevance, such as Ælfric’s preference of swā swā over a single swā (Pope, 1967: 102–103), have also not been cited. It should be stressed that there is a vast amount of literature on the peculiarities and tendencies in the vocabulary of Ælfric of Eynsham. Although my study is not exhaustive, I believe that I have gathered the most important sources on Ælfric’s vocabulary.

2.2 Categorisation

Although the DOE employs a number of different labels for Ælfrician vocabulary, it is not immediately clear how labels such as “mainly in Ælfric” and “freq. in Ælfric” differ from each other, nor how these labels differ from other, less frequently used labels, such as “disproportionately freq. in Ælfric” (see, e.g., DOE, s.v. cyre). For this reason, it was deemed necessary to devise a categorisation that distinguishes the lexemes which are more strongly associated with Ælfric from the lexemes which may be less characteristic of his works. In this categorisation system, an Ælfrician lemma is assigned to one of four categories, A–D, based on the number of non-Ælfrician texts in which the lemma occurs. The reasoning behind this system is that a higher number of non-Ælfrician texts implies that a lemma is less exclusive to the works of Ælfric and, for this reason, may be less characteristic of his lexis. The four categories are given below:

  • Category A contains lexemes which exclusively occur in the works of Ælfric, e.g., bedūfan ‘to sink’ (DOE, s.v. be-dūfan).

  • Category B contains lexemes which occur in the works of Ælfric and one other text, e.g., hremman ‘to hinder’ (DOE, s.v. hremman).

  • Category C contains lexemes which occur in the works of Ælfric and between two and four other texts, e.g., flǣsclicnes ‘incarnate condition; incarnation (of Christ)’ (DOE, s.v. flǣsclicnes).

  • Category D contains lexemes which are frequently found in the works of Ælfric and occur in five or more other texts, e.g., æþelboren ‘of noble birth’ (DOE, s.v. æþel-boren).

In order to make a more detailed distinction between more and less characteristically Ælfrician vocabulary, categories A–C each have two subcategories, which relate to a lemma’s total number of occurrences in the DOEC:9

  • Category 1 contains lexemes which occur five or more times in the DOEC.

  • Category 2 contains lexemes which occur fewer than five times in the DOEC.10

If a lemma is rare even in the works of Ælfric, it may be argued that this lemma is less characteristic of his vocabulary, and of limited relevance for the identification of typically Ælfrician lexis. The four categories listed above, in combination with the subcategories used for categories A–C, facilitate the use of a convenient shorthand. A lemma such as dydrung ‘delusion’ may be referred to as a ‘B1 lemma’, which indicates that it occurs in the works of Ælfric and only one other text, and has at least five occurrences in total in the DOEC (DOE, s.v. dydrung).
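The categories and subcategories described above amount to a simple decision procedure, which may be sketched in Python as follows. The function name and its two parameters are illustrative choices of my own. The sketch assumes that the lemma has already been identified as Ælfrician, so the additional requirement for category D (that the lemma is frequent in Ælfric’s works) is not checked.

```python
def classify(non_aelfrician_texts, total_occurrences):
    """Return the shorthand label (e.g. 'B1') for an Ælfrician lemma.

    non_aelfrician_texts: number of non-Ælfrician texts in which the lemma occurs
    total_occurrences:    total number of occurrences of the lemma in the DOEC
    """
    if non_aelfrician_texts < 0 or total_occurrences < 1:
        raise ValueError("counts must describe an attested lemma")

    if non_aelfrician_texts == 0:
        category = "A"   # exclusively in the works of Ælfric
    elif non_aelfrician_texts == 1:
        category = "B"   # Ælfric and one other text
    elif non_aelfrician_texts <= 4:
        category = "C"   # Ælfric and between two and four other texts
    else:
        return "D"       # five or more other texts; no subcategories

    # Subcategory 1: five or more occurrences; subcategory 2: fewer than five.
    subcategory = "1" if total_occurrences >= 5 else "2"
    return category + subcategory
```

Under this scheme, a lemma such as anmōdlīce, whose six occurrences are all found in Ælfric, comes out as `classify(0, 6)`, i.e., ‘A1’, while any lemma found in five or more non-Ælfrician texts is simply labelled ‘D’.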

In order to categorise the lemmata retrieved from the DOE using the categorisation given above, it is necessary to be able to identify Ælfrician and non-Ælfrician texts. The works of Ælfric are indicated as such in the DOE by either their Cameron number (B1) or otherwise the prefix ‘Æ’ (e.g., ‘ÆLS’, which refers to Ælfric’s Lives of Saints).11 In addition, I have relied on the work of Aaron J. Kleist in order to determine which parts of the Heptateuch were translated by Ælfric, and to identify any other texts which are believed to have been written by Ælfric, but which have not been categorised as such by the DOE (Kleist, 2019: 66–206).12 Counting the number of non-Ælfrician texts in which a lemma occurs was carried out based on the texts cited in the entry of a lemma in the DOE. Whenever these texts numbered fewer than five and not all occurrences of the lemma were given in the entry, I also consulted the DOEC in order to check for any other texts, whenever this was reasonably possible. Determining whether similar texts, such as manuscript variants, should be considered different texts is always a complicated task. Whenever possible, the DOE entries have been followed. For instance, if different texts are given in a single quote, these texts are usually counted as one single text. However, other texts have been counted separately despite their similarities; charters, for instance, may use similar formulas, but are nevertheless different texts. Composite homilies which make use of Ælfrician material presented a difficult case. Sometimes the Ælfrician text in the composite homily may be virtually identical to the edition of the Ælfrician base text in the DOEC; at other times, the composite homily may differ from the base text in terms of word order, omissions, etc. In order to be consistent, all composite homilies containing Ælfrician material have been counted as non-Ælfrician texts.

Works that have been identified by the DOE as having been written by the same author have been counted as a single text, such as the combined works of Wulfstan. These texts are arguably all examples of the same, idiosyncratic lexical usage of their author. Other texts that have been taken to constitute a single unit include the various versions of the glosses to Aldhelm’s De laude virginitatis, such as ‘AldV 1 (Goossens) C31.1’ and ‘AldV 13.1 (Nap) C31.13.1’, due to their similarity and the fact that they gloss the same text, and three versions of the Benedictine Rule in ‘BenR B10.3.1.1’, ‘BenRW B10.3.4’ and ‘BenRWells B10.3.3’, for the same reason. However, individual glosses to the psalter and canticles have been counted separately, since the gloss to the Vespasian Psalter is obviously not the same as the one to the Royal Psalter. If a lexeme occurs in a psalter gloss and a canticle gloss in the same manuscript, both glosses have been counted as one text, even though the two have been assigned different Cameron numbers. With respect to this policy, I believe my results would not be significantly different if I had made different choices.
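The counting policy described above can be pictured as a lookup from the short title of a text to the unit under which it is counted. The following Python sketch illustrates the idea with the short titles quoted in this section; the unit names (‘Aldhelm glosses’, ‘Benedictine Rule’) and the function name are hypothetical labels of my own, and the table is deliberately incomplete (the works of Wulfstan, for instance, would need entries of their own).

```python
# Hypothetical lookup table: short titles quoted in this section, mapped to the
# unit under which they are counted. Titles that are absent from the table are
# counted as texts in their own right.
COUNTING_UNIT = {
    "AldV 1 (Goossens) C31.1": "Aldhelm glosses",
    "AldV 13.1 (Nap) C31.13.1": "Aldhelm glosses",
    "BenR B10.3.1.1": "Benedictine Rule",
    "BenRW B10.3.4": "Benedictine Rule",
    "BenRWells B10.3.3": "Benedictine Rule",
}


def count_text_units(short_titles):
    """Count the distinct units represented by a list of short titles."""
    units = {COUNTING_UNIT.get(title, title) for title in short_titles}
    return len(units)
```

A lemma attested in two versions of the Benedictine Rule and in one of the Aldhelm glosses would thus be counted as occurring in two non-Ælfrician texts rather than three, whereas two different psalter glosses, which have no shared entry in the table, would still count as two.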

The categorisation of the lemmata in prior scholarship is based, for the most part, on the secondary sources themselves; the claims made in the sources have not been checked against the DOE or DOEC. Nevertheless, the DOE and DOEC have been used for the categorisation of a number of lexemes about which very little information, e.g., in terms of their frequency, was given in the sources.13 I limited myself to those lemmata which are found in the DOE. Lemmata that do not begin with the letters A–I, which are not found in the DOE, have been placed in a separate category. Lastly, whenever sources have indicated that a specific lemma or specific lexical usage is part of the Winchester vocabulary, this has been indicated in the relevant footnotes in Appendix B.

In contrast to the other lexemes found in prior scholarship, the words found in previous research on Ælfric’s Grammar have been checked in the DOEC as far as possible.14 It seemed preferable to categorise only those grammatical terms which had a significant number of occurrences in the works of Ælfric, and reject such lexemes mentioned in the literature as nama ‘noun’ and word ‘verb’ (Chapman, 2010: 423), which are arguably quite general. My rule of thumb is as follows: if a lexeme has been determined to belong to category D and fewer than 50% of its occurrences are found in the works of Ælfric (not necessarily Ælfric’s Grammar), then this word is not categorised. If, however, a lexeme has been determined to belong to categories A, B or C, it is always categorised, even if, for instance, only one of nine occurrences of this word is found in the works of Ælfric.15 A number of words which were quite difficult to check in the DOEC (because their forms were similar to other lemmata and these forms could not easily be distinguished from each other) have been discounted.
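The rule of thumb for grammatical terms can be stated compactly. The Python sketch below is only an illustration of that rule; the function and parameter names are my own, and the category label is assumed to have been determined beforehand.

```python
def keep_grammatical_term(category, aelfric_occurrences, total_occurrences):
    """Decide whether a grammatical term is categorised as Ælfrician.

    category: shorthand label such as 'A1', 'B2', 'C1' or 'D'
    """
    if not category or category[0] not in "ABCD":
        raise ValueError("category must start with A, B, C or D")
    if not 0 <= aelfric_occurrences <= total_occurrences or total_occurrences < 1:
        raise ValueError("occurrence counts are inconsistent")

    # Categories A-C are always categorised, whatever Ælfric's share.
    if category[0] in "ABC":
        return True
    # Category D is rejected when fewer than 50% of the occurrences are Ælfric's.
    return 2 * aelfric_occurrences >= total_occurrences
```

For example, a category C lemma with one Ælfrician occurrence out of nine is retained, whereas a category D lemma with four Ælfrician occurrences out of ten is not.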

One guiding principle of the categorisation is that the DOE is followed whenever this is possible. This principle has led to some inconsistencies in the categorisation of the words in Ælfric’s Grammar. For instance, if a word mentioned in a secondary source is a present participle such as fæstnigende ‘affirmative’ (Chapman, 2010: 441), and this word can only be found as part of the DOE entry for the whole verb (fæstnian), which is not an Ælfric word according to the rule above, then it is not categorised. However, for words beyond the letter I, i.e., those which could not be checked in the DOE, present participles are taken as separate from their main verbs if these present participles are specifically mentioned in prior scholarship, e.g., ofcumende ‘derivative’ (Chapman, 2010: 443), since the DOE is not always consistent in categorising present participles or lemmata derived from present participles.16

The secondary sources that I consulted often featured various types of information about Ælfric’s lexical usage, which could not easily be compared to each other. Information such as Ælfric’s preference of one lemma over another for the expression of a certain concept seemed relevant to record, but could not be categorised in categories A–D due to the lack of information pertaining to the number of non-Ælfrician texts in which these preferred lemmata occurred, and their total number of occurrences in the DOEC. In order to ensure the accurate categorisation of the Ælfrician vocabulary identified by prior scholarship, it was necessary to add an additional four categories to the categorisation outlined above. The following four categories were created:

  • Category E contains particular lexemes that Ælfric prefers over synonymous lexemes. This category is further subdivided into categories E1 and E2, which relate to whether these preferences are constrained by semantic, contextual or other factors:

    • Category E1 features preferences which are, generally, unrelated to specific semantic or contextual usages.

    • Category E2 features preferences which are, generally, related to specific semantic or contextual usages, or certain other factors.

    One example of an entry in category E1 is Ælfric’s preferred usage of gefrēdan, rather than fēlan, to express the verb ‘to feel’.17 This preference is independent of contextual or semantic factors. Conversely, an example of an entry in category E2 is Ælfric’s preference of the verb (ge)rihtlǣcan over (ge)rihtan when expressing the verb ‘to correct’ in figurative senses (and, conversely, the verb (ge)rihtan over (ge)rihtlǣcan in literal senses) (Hofstetter, 1987: 51).18

  • Category F contains particular morphological forms of lexemes that Ælfric prefers over other morphological forms. The root is the same for both preferred and dispreferred equivalents; the synonyms merely differ in terms of the other morphemes that they may contain, such as prefixes. For instance, Ælfric prefers the form bebod ‘command’, with the prefix be-, over gebod, with the prefix ge- (Sato, 2011: 308).19

  • Category G contains widely used lemmata that Ælfric uses in particular contexts or with specific meanings. For example, the sense ‘to bury’ for the verb bestandan is primarily attested in the works of Ælfric (Jost, 1950: 144).20

  • Category H contains lemmata that do not fit in the preceding categories. This is where claims have been placed such as ‘most instances of þwȳrlic can be found in Ælfric’ (Jost, 1950: 130). Since there is no information pertaining to the number of occurrences of this lemma in non-Ælfrician texts, it is not possible to place it in categories A–D. At the same time, it is impossible to place þwȳrlic in categories E–G, since the source does not mention if Ælfric prefers this lemma over an equivalent lemma, or if he uses it in a specific sense.

Taken together, categories A–H allow for the creation of an overall characterisation of Ælfric’s lexical usage, featuring lemmata that are primarily or exclusively restricted to his works, preferences of particular lemmata over others, and lemmata that have semantic or contextual usages which are specifically Ælfrician.

2.3 Results

The results of the categorisation are presented in Tables 1 and 2.21

Table 1

Results of the categorisation of Ælfrician vocabulary in the DOE and prior scholarship for categories A–D

Citation: Amsterdamer Beiträge zur älteren Germanistik 81, 3-4 (2021) ; 10.1163/18756719-12340237

Table 2

Results of the categorisation of Ælfrician vocabulary in prior scholarship for categories E–H


Out of the eight categories A–H, which contain a total of 465 items, the two largest categories are those which, respectively, contain words which are the most (category A) and the least (category D) restricted to the works of Ælfric. The vast majority of the 152 lexical items in category A can be found in category A2, which features 120 lemmata (78.95%). These Ælfrician words are quite rare, occurring between one and four times in the works of Ælfric. For categories B and C, this tendency is reversed: category B2 contains fewer items than B1, and there are no lexical items at all in category C2. This outcome is not surprising, since a higher number of non-Ælfrician texts in which a word occurs directly correlates with a higher overall frequency of that lemma. In categories E–H, there are 55 items, of which the majority can be found in category E: 34 items (61.82%).22 Within category E, the best represented category is category E1, which contains 26 items (76.47%). This result implies that Ælfric’s preferences for particular lemmata over other, synonymous lemmata that are unrelated to specific semantic or contextual constraints have received the most attention in prior scholarship.
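The percentages given above follow directly from the counts, as the short Python check below shows. The sizes of categories A1 and E2 are not stated explicitly in the text; they are derived here by subtraction, on the assumption that categories A and E have no members outside their two subcategories.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100 * part / whole, 2)


# Counts stated in the text.
CATEGORY_A, CATEGORY_A2 = 152, 120
CATEGORIES_E_TO_H, CATEGORY_E, CATEGORY_E1 = 55, 34, 26

share_a2 = share(CATEGORY_A2, CATEGORY_A)        # A2 within category A
share_e = share(CATEGORY_E, CATEGORIES_E_TO_H)   # E within categories E-H
share_e1 = share(CATEGORY_E1, CATEGORY_E)        # E1 within category E

# Derived by subtraction (assumption: A = A1 + A2 and E = E1 + E2).
category_a1 = CATEGORY_A - CATEGORY_A2
category_e2 = CATEGORY_E - CATEGORY_E1
```

The three shares evaluate to 78.95, 61.82 and 76.47, matching the figures cited, and the subtraction leaves 32 lemmata for category A1 and 8 entries for category E2.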

Note that there is an important difference in the way the items in categories A–D and those in categories E–H have been counted. Whereas in categories A–D each lemma is counted individually, this is not the case for categories E–H; in the latter categories, the entire entry, regardless of the fact that it may contain more than one Ælfrician lemma, is counted as a single unit. For instance, the entry “ǣlc/gehwā/gehwilc ‘every’ preferred to ǣghwilc”23 in category E1 is counted as a single unit, despite the fact that there are three preferred lemmata. One reason for counting in this way is that the logic of having ‘preferred’ lemmata versus ‘dispreferred’ lemmata breaks down when counting entries in category E2. Recall the aforementioned example of (ge)rihtlǣcan being preferred over (ge)rihtan when expressing the verb ‘to correct’ in figurative senses: this fact does not entail that (ge)rihtan is a dispreferred lemma, since Ælfric, conversely, prefers to use (ge)rihtan over (ge)rihtlǣcan in literal senses. In other words, the preference goes both ways. Furthermore, all lemmata in an entry, whether preferred or dispreferred, have identical or strongly related senses, which also implies that it is sensible to count them as a single unit.

2.4 Reflection on the Use of the DOE for the Collection of Ælfrician Vocabulary

The collection of data on the lexis of a particular author from a dictionary such as the DOE is perhaps somewhat unorthodox when compared to such methods as consulting secondary sources or analysing a corpus of the author’s works. In this section, I will briefly reflect on some of the issues that were encountered during this study.

The choices made by the DOE with respect to lemmatisation directly influence which lemmata are considered to be Ælfrician in this study. Some of these lemmata are of questionable relevance. For instance, due to the policy of the DOE to create two separate entries for lemmata with and without the prefix ge-, the lemmata edcennan and ge·edcennan are lemmatised separately in the DOE. Although the A1 lemma edcennan occurs six times, only in the works of Ælfric (DOE, s.v. ed-cennan), the longer form ge·edcennan occurs twice, both times in non-Ælfrician texts (DOE, s.v. ge·ed-cennan). Lastly, the past participle ge·edcenned also receives an entry of its own, because it cannot be determined if this past participle belongs to edcennan or ge·edcennan. Although the past participle, which has 21 occurrences in total, does appear in Ælfrician texts, it is also found in at least six non-Ælfrician texts (DOE, s.v. ge·ed-cenned). It is very likely that, if all these forms had been subsumed under a single entry, e.g., (ge·)edcennan, this entry would not have been labelled as Ælfrician in the DOE. In addition, the lemma edcennan is, according to its label in the DOE, primarily found in a late twelfth-century manuscript (DOE, s.v. ed-cennan).24 It may be argued that a lemma which is mainly restricted to a copy written almost two centuries after Ælfric’s lifetime cannot be seen as characteristic of his lexical usage. Both of these factors – the lemmatisation policies of the DOE and the restriction of certain lemmata to late copies of Ælfric’s works – affect the way the DOE might be used as a source for Ælfrician vocabulary.25

Manuscript-specific readings such as edcennan can be problematic in other ways. For instance, if two authoritative copies use different lemmata, it may be difficult to determine the ‘true’ Ælfrician reading. A relevant example is the lemma flocc ‘flock’, which has been placed in category C1, based on the fact that it occurs in four non-Ælfrician texts.26 However, one of its occurrences, ‘floccum’, occurs in manuscript P of Ælfric’s translation of the book of Judges, while manuscript Z, which is the base manuscript used by the DOE, employs a form that is based on the lemma folc, namely, ‘folcum’.27 If ‘folcum’ is the original Ælfrician reading, the occurrence of ‘floccum’ in manuscript P should be counted as non-Ælfrician, which brings the total number of non-Ælfrician texts to five, and requires this lemma to be placed in category D. Although both readings make sense in the context, a case can be made for ‘floccum’ being the original Ælfrician reading, due to the fact that it is the more plausible variant: according to the DOE, the two instances of folc with the sense “band of men, company, division of an army” (one of which occurs in the quotation found in Judges, and the other in the D version of the Anglo-Saxon Chronicle) were “perhaps intended for flocc q.v.” (DOE, s.v. folc, sense 13).28 The issue of counting an instance of a particular lemma as Ælfrician if it only occurs in one or two manuscript copies is especially relevant to the works of Ælfric, which often exist in multiple manuscripts.29 This factor may, therefore, also influence the use of DOE data in studies on Ælfrician vocabulary.

Lastly, there were a number of more general issues with this study. As has been mentioned above, Ælfric was a prolific writer whose works have been well-preserved. This fact is borne out by his presence in the DOEC, in which his works may be said to be overrepresented. The works identified as having been written by Ælfric constitute 22.66% of the prose corpus (B) and 15.91% of the entire DOEC – these percentages would be even higher if the word counts of the Ælfrician parts of the Heptateuch were included.30 In other words, there may be said to be a higher-than-average chance of a lexeme being found exclusively in the works of Ælfric.

An issue that is related to this overrepresentation is that some lemmata which, according to the DOEC, are found exclusively in the works of Ælfric were possibly used by other writers as well. For instance, the A1 adverb cēnlīce ‘boldly’, which has eight occurrences (DOE, s.v. cēnlīce), is derived from the more common adjective cēne ‘bold’, which occurs around fifty times in a number of different, mainly poetic, Old English texts (DOE, s.v. cēne). The higher frequency of cēne, coupled with the transparent derivation of cēnlīce, makes it plausible that this adverb was also used by other authors, whose texts have now simply been lost to us.31 A similar example is provided by the A1 adjective hārwenge ‘grey-haired’, which occurs six times in Ælfrician texts (DOE, s.v. hār-wenge). The existence of a derived noun hārwengnes ‘greybeardedness’, which occurs only once in a non-Ælfrician glossary (DOE, s.v. hārwengnes), suggests that the adjective was more common than the corpus shows. If exclusively Ælfrician lemmata, such as cēnlīce and hārwenge, have strongly related lemmata which are not restricted to the works of Ælfric, then this factor may reduce the significance of these Ælfrician lemmata for studies into Ælfric’s vocabulary.

One final point is that the DOE has not consistently labelled words that primarily or exclusively occur in the works of Ælfric. The noun alēfednes ‘infirmity’, for instance, has only one occurrence in the corpus, in Ælfric’s works, but it does not receive a specific label in the DOE. This lack of labelling implies that there are still more Ælfrician lemmata to be found in the DOE, which may perhaps be labelled in future editions of the dictionary.32

3 Analysing Ælfrician Vocabulary in Evoke

3.1 Methodology

In order to discover the characteristics of the Ælfrician vocabulary that was categorised in the previous section, the lemmata were tagged in Evoke. Since Evoke uses a Linguistic Linked Data version of TOE,33 which lemmatises differently from the DOE, a number of choices had to be made in order to tag the words found in the DOE and prior scholarship in Evoke. These choices will be outlined in this section. More specific information about the tagging of individual lemmata can be found in the footnotes in the appendices. For the purpose of tagging the Ælfrician vocabulary in Evoke, only categories A–D have been taken into account, since the lemmata in these categories form a cohesive unit in that they are either restricted to or occur frequently in the works of Ælfric. The lemmata in categories E–H are more difficult to quantify in this sense. For instance, Ælfric may prefer the verb gefrēdan ‘to feel’ to its synonym fēlan (see section 2.2), but this fact does not imply that the verb gefrēdan is in some way restricted to the works of Ælfric; this entry in category E1 simply indicates a preference.

Each lexical entry (i.e., not the individual lexical senses) for an Ælfrician lemma in Evoke receives three tags:

  • #Ælfrician: All Ælfrician lemmata receive this tag, which allows for the immediate selection of all Ælfrician vocabulary in Evoke.

  • #Ælfrician_A/#Ælfrician_B/#Ælfrician_C/#Ælfrician_D: These tags indicate the category (A–D) to which a lemma belongs.

  • #freq5plus/#freq1to4: These tags indicate the subcategory (1 or 2) to which a lemma belongs. Subcategory 1 is tagged as #freq5plus (since the lemma which receives this tag has five or more occurrences in the DOEC) and subcategory 2 is tagged as #freq1to4 (the lemma has between one and four occurrences in the DOEC).

In addition, a number of lemmata also receive the tag #comment, which is accompanied by a brief explanation outlining the discrepancy between the ways in which these lemmata are treated in the DOE and TOE (see below).
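The tagging scheme described above can be sketched in a few lines of Python. This is only an illustration of the rule stated in the list (one general tag, one category tag, one frequency tag); the function name `aelfrician_tags` and its parameters are hypothetical and are not part of Evoke.

```python
def aelfrician_tags(category: str, doec_occurrences: int) -> list[str]:
    """Return the three tags for one lexical entry (hypothetical helper).

    category: the letter A, B, C or D assigned to the lemma.
    doec_occurrences: the number of occurrences of the lemma in the DOEC.
    """
    if category not in {"A", "B", "C", "D"}:
        raise ValueError("only categories A-D are tagged")
    if doec_occurrences < 1:
        raise ValueError("a lemma needs at least one occurrence")
    # Subcategory 1 = five or more occurrences, subcategory 2 = one to four.
    freq_tag = "#freq5plus" if doec_occurrences >= 5 else "#freq1to4"
    return ["#Ælfrician", f"#Ælfrician_{category}", freq_tag]


# cēnlīce is an A1 lemma with eight occurrences in the DOEC.
print(aelfrician_tags("A", 8))
```

The second call in the test below uses an invented category and frequency purely to exercise the other frequency tag.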

There were a number of issues with tagging the Ælfrician vocabulary in Evoke. One issue is that some DOE lemmata do not have equivalent lemmata in Evoke, which means that they could not be tagged.34 Other issues relate to the different lemmatisation choices made by the DOE and TOE. For instance, the DOE treats words which occur with and without the prefix ge-, such as the verbs gehūslian and hūslian ‘to administer the Eucharist’, as separate lemmata (DOE, s.vv. ge·hūslian, hūslian). Since the past participle gehūslod could theoretically belong to either of these verbs, it too receives its own entry (DOE, s.v. ge·hūslod). In TOE, however, these three entries correspond to a single entry, (ge)hūslian, which creates problems for categorisation, since the three DOE entries each have their own category (gehūslian is A1, hūslian is B1, and gehūslod is not in the appendices). Conversely, a single entry in the DOE may correspond to two or more entries in TOE. For instance, a search for the verb bedydrian ‘to delude’ in Evoke gives two results: bedydrian and bedydrian … wiþ. The second of these entries, with the sense ‘to conceal’, is listed as sense 2 in the DOE (s.v. be-dydrian).

In order to deal with these discrepancies consistently, the main principle was once again that the DOE is followed whenever possible (see section 2.2), since the categorisation of the Ælfrician vocabulary is primarily based on the DOE. This principle led to the following solutions to the problems mentioned above. When the DOE has multiple lemmata which correspond to a single lemma in TOE, the labels for these lemmata are consolidated. In other words, the lemmata gehūslian (A1), hūslian (B1) and gehūslod (not part of the appendices) are taken as a single lemma, and their occurrences in non-Ælfrician texts are combined. Therefore, the equivalent lemma (ge)hūslian has been tagged in Evoke as C1. For the purposes of clarity, comments have been added to Evoke entries which subsume multiple DOE lemmata; e.g., in the case of (ge)hūslian: “#comment Conflation of three DOE entries: gehūslian (A1), hūslian (B1) and gehūslod (not in appendices).” Conversely, when a single DOE entry corresponds to multiple TOE entries, all relevant TOE entries in Evoke are assigned the same category as the DOE entry; the DOE lemma has not been ‘split up’ into TOE lemmata which are then recategorised. In other words, both bedydrian and bedydrian … wiþ are tagged as C1 in Evoke; the fact that bedydrian … wiþ with the sense ‘to conceal’ only occurs once in total, in the works of Ælfric (DOE, s.v. be-dydrian, sense 2), has not been taken into account.
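The two mapping strategies can be illustrated with a minimal Python sketch. It covers only the bookkeeping: composing the #comment for a many-to-one conflation, and copying a DOE label to every corresponding TOE entry in the one-to-many case. The consolidated category itself (C1 for (ge)hūslian) depends on the combined corpus counts and is not computed here. The names `DoeEntry`, `conflation_comment` and `propagate_label` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DoeEntry:
    lemma: str
    label: Optional[str]  # e.g. "A1"; None if the entry is not in the appendices


def conflation_comment(entries: list[DoeEntry]) -> str:
    """Compose the #comment for a TOE entry that subsumes several DOE entries."""
    if len(entries) < 2:
        raise ValueError("a conflation involves at least two DOE entries")
    parts = [f"{e.lemma} ({e.label or 'not in appendices'})" for e in entries]
    listed = ", ".join(parts[:-1]) + " and " + parts[-1]
    count = {2: "two", 3: "three", 4: "four"}.get(len(parts), str(len(parts)))
    return f"#comment Conflation of {count} DOE entries: {listed}."


def propagate_label(doe_label: str, toe_entries: list[str]) -> dict[str, str]:
    """Give every TOE entry the label of the single DOE entry it derives from."""
    return {toe: doe_label for toe in toe_entries}


print(conflation_comment([
    DoeEntry("gehūslian", "A1"),
    DoeEntry("hūslian", "B1"),
    DoeEntry("gehūslod", None),
]))
print(propagate_label("C1", ["bedydrian", "bedydrian … wiþ"]))
```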

Note that for entries in TOE in which a preposition is part of the lemma for a verb (such as bedydrian … wiþ), only those entries have been tagged which have demonstrably been used by Ælfric, i.e., there is an Ælfrician quote for this particular verb + preposition combination in the equivalent DOE entry. This decision was also made in order to reduce the number of errors with respect to tagging senses of Ælfrician lemmata which do not actually occur in Ælfric’s texts. For instance, with respect to the D lemma abūgan, one of the four results in TOE, ābūgan fram, has not been tagged, since the sense that is attested for it in TOE, ‘to move from’, seems to correspond to sense 2.b in the DOE (s.v. a-būgan), which does not list any Ælfrician quotes. The same principle has been applied to other lemmata: if a TOE entry is solely associated with senses which are not found in the works of Ælfric for that particular lemma, then this entry is not tagged in Evoke.35
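The exclusion rule in this paragraph amounts to a simple membership test, sketched below in Python. The function name `should_tag` is hypothetical, and the sense identifier "1" in the example is invented for illustration; only sense 2.b of a-būgan is taken from the discussion above.

```python
def should_tag(toe_entry_doe_senses: set[str], senses_with_aelfric_quotes: set[str]) -> bool:
    """Tag a TOE entry only if at least one DOE sense it corresponds to
    is attested with an Ælfrician quote (hypothetical helper)."""
    return any(sense in senses_with_aelfric_quotes for sense in toe_entry_doe_senses)


# ābūgan fram corresponds to DOE sense 2.b, which lists no Ælfrician quotes.
# The set of attested senses below is a placeholder, not DOE data.
attested_in_aelfric = {"1"}
print(should_tag({"2.b"}, attested_in_aelfric))
```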

3.2 Results

The Ælfrician vocabulary which has been tagged in Evoke can be subjected to a number of statistical analyses, which highlight the similarities and differences between Ælfric’s vocabulary and all words in TOE as a whole. Therefore, these analyses provide insights into the characteristics of Ælfric’s vocabulary. Due to the discrepancies between the DOE and TOE (see section 3.1), the number of tagged entries per category in Evoke differs from the number of entries which have been categorised based on the DOE and prior scholarship, as found in Appendices A and B (see Table 3 below and cf. Table 1 above). For reasons of space and since this is an exploratory study, Ælfric’s lexis will be analysed as a whole in this section, without taking into account the differences between categories A–D.

Table 3

Results of the tagging of Ælfrician vocabulary in Evoke

Citation: Amsterdamer Beiträge zur älteren Germanistik 81, 3-4 (2021) ; 10.1163/18756719-12340237

First of all, Evoke can be used to determine the degree of ambiguity of Ælfric’s lexis. The degree of ambiguity is related to the number of senses that a lemma may have. For instance, if Ælfrician words generally have a low number of different possible senses, this result would imply that Ælfric’s lexical usage can be characterised as unambiguous, and could mean that he is particularly concerned about writing as precisely as possible. A high degree of ambiguity would indicate the opposite: a lack of a particular concern for precision in lexical usage, and perhaps a deliberate effort to allow for multiple interpretations of his words.

As Figure 1 shows, Ælfric’s vocabulary is somewhat more ambiguous than the vocabulary in TOE. Around two-thirds – 65.95% – of the lexical entries tagged as Ælfrician have only one sense associated with them, as opposed to 78.40% of the entries in TOE as a whole. Conversely, Ælfric’s vocabulary contains relatively more lemmata with two, three, four or five senses than TOE does. According to Evoke, an Ælfrician lemma has, on average, 1.66 senses associated with it, while a lemma in TOE has 1.45 senses. This difference is perhaps not significant, and it is certainly not great enough to allow for the conclusion that Ælfric was deliberately ambiguous or unconcerned with lexical precision in his works.
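The two measures reported here, the share of entries per number of senses and the mean number of senses per entry, can be sketched as follows. This is a minimal illustration on invented data and does not reproduce Evoke's own implementation; the function name `ambiguity_profile` and the toy entries are hypothetical. As a consistency check, the totals reported later in this section (693 senses for 417 Ælfrician lemmata) give the same average of 1.66.

```python
from collections import Counter


def ambiguity_profile(senses_per_entry: dict[str, int]) -> tuple[dict[int, float], float]:
    """Return (share of entries per sense count, mean senses per entry)."""
    counts = list(senses_per_entry.values())
    total = len(counts)
    distribution = Counter(counts)
    shares = {n: distribution[n] / total for n in sorted(distribution)}
    mean = sum(counts) / total
    return shares, mean


# Invented toy data: four entries with 1, 1, 2 and 4 senses.
shares, mean = ambiguity_profile({"w1": 1, "w2": 1, "w3": 2, "w4": 4})
print(shares, mean)

# 693 senses and 417 lemmata are the totals given in this section.
print(round(693 / 417, 2))
```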

Figure 1

The degree of ambiguity of Ælfric’s vocabulary (orange) and TOE (blue)


Evoke can also determine the degree of synonymy of Ælfric’s vocabulary. One crucial difference between this analysis and the previous one is that the degree of synonymy relates to lexical senses, rather than lexical entries. Since only lexical entries have been tagged in Evoke, not lexical senses, the analysis in Evoke takes into account all 693 senses that are associated with the 417 Ælfrician lemmata, including those senses which have not been attested in the works of Ælfric.36 The degree of synonymy is related to the number of synonyms that are available for a lexical sense. If Ælfric mainly uses lexical senses with a high number of synonyms, the implication would be that Ælfric, in choosing one particular synonym over other available equivalents, often made deliberate lexical choices in his writings. On the other hand, if the works of Ælfric generally feature lexical senses with few synonyms, this fact would make it more difficult to argue that Ælfric frequently made particular conscious lexical choices.

The graph in Figure 2 shows that a quarter of the senses associated with Ælfric’s vocabulary (25.69%) have zero synonyms available for them (i.e., there is only one lemma associated with these senses). This percentage is almost the same for the senses in TOE: 24.63% of all senses in TOE have zero synonyms available for them. The graphs for the senses of the Ælfrician lemmata and the senses of the TOE lemmata are roughly equivalent. This similarity is confirmed by Evoke’s statistical analysis, which shows that, on average, a sense of a given Ælfrician lemma has 4.88 synonyms available for it, while a sense of a lemma in TOE has 4.92 synonyms associated with it. Since roughly three-quarters of the senses associated with Ælfric’s vocabulary have at least one synonym available, it is very likely that Ælfric made deliberate lexical choices in his writings, a conclusion which is also supported by categories E and F, which feature lemmata that Ælfric prefers over their synonyms. Nevertheless, Ælfric probably did not make particular lexical choices to a greater extent than was normal in Old English.
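The synonymy measure can be sketched in the same way, on the assumption (taken from the parenthesis above) that the number of synonyms of a sense is the number of other lemmata associated with the same sense. The function names and the toy data are hypothetical, and the sketch is not a description of how Evoke computes its figures.

```python
from collections import defaultdict


def synonym_counts(sense_pairs: list[tuple[str, str]]) -> dict[tuple[str, str], int]:
    """Map each (lemma, concept) sense to the number of other lemmata
    that share its concept."""
    lemmata_by_concept: dict[str, set[str]] = defaultdict(set)
    for lemma, concept in sense_pairs:
        lemmata_by_concept[concept].add(lemma)
    return {
        (lemma, concept): len(lemmata_by_concept[concept]) - 1
        for lemma, concept in sense_pairs
    }


def summarise(counts: dict[tuple[str, str], int]) -> tuple[float, float]:
    """Return (mean synonyms per sense, share of senses with zero synonyms)."""
    values = list(counts.values())
    mean = sum(values) / len(values)
    share_zero = sum(1 for v in values if v == 0) / len(values)
    return mean, share_zero


# Invented toy data: three lemmata share one concept, one lemma stands alone.
pairs = [
    ("lemma_a", "bold"),
    ("lemma_b", "bold"),
    ("lemma_c", "bold"),
    ("lemma_d", "grey-haired"),
]
print(summarise(synonym_counts(pairs)))
```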

Figure 2