In Ancient Greek, as well as in other languages, whenever agreement is triggered by two or more coordinated phrases, two different constructions are allowed: either the agreement can be controlled by the coordinated phrase as a whole, or it can be triggered by just one of the coordinated words. In spite of the amount of information that can be read on this topic in grammars of Ancient Greek, much is still to be known even at a general descriptive level. More importantly, the data still lack a convincing explanation. In this paper, we focus on a special domain of agreement (subject and verb agreement) and on one morphological feature that is expected to covary (number). We discuss the agreement in number for conjoined phrases, by revising some of the modern hypotheses with the support of the empirical evidence that can be collected from the available syntactically annotated corpora of Ancient Greek (treebanks). Results are interpreted according to syntactic features, cognitive factors and semantic properties of the coordinated phrases.
In a recent work, Johnson (2013) has drawn attention to a syntactic phenomenon concerning agreement in Latin and Ancient Greek (AG) that, although well known to general linguists and discussed by the grammars of the two languages, has been rather overlooked.1 In AG, as well as in Latin and in many unrelated languages, whenever the agreement is triggered by two or more coordinated phrases, two different constructions are allowed: either the agreement can be controlled by the coordinated phrase as a whole, through a synthetic resolution of the conjoined phrase, or it can be triggered by just one of the coordinated words. Thus, e.g., two coordinated singular nouns can license either plural or singular agreement over the syntactic elements that they control.
The lack of interest in the subject among scholars of AG contrasts starkly with the attention paid by linguists to the same phenomenon in other modern languages. In spite of the amount of information that can be read in grammars of AG (see especially Kühner and Gerth 1898: 77–82), much remains to be established even at a general descriptive level. What is the distribution of the different patterns? Can one of them be considered as the rule, with a series of possible exceptions? Or is the situation different and more fluid?
More importantly, however, the data still lack a convincing explanation. What is the actual influence of the different factors that are known to impact on the choice of the agreement with multiple antecedents? Do the AG data support any of the hypotheses that have been proposed by modern linguists? Can AG contribute to the current debate in syntactic theory?
The aim of this study is to provide a detailed discussion of agreement with conjoined phrases and to revise some of the modern hypotheses with the help of the evidence that can be collected from the available syntactically annotated corpora of AG. We concentrate our attention on a special domain of agreement (subject and verb agreement) and on one morphological feature that is expected to covary (number); we also limit our investigations to cases of coordinated noun phrases: clauses and verbs (like, e.g., infinitives) that play the role of coordinated subjects are excluded from the present study.
The work is structured as follows. In section 1.1, we will provide the definitions and the terminology that we will use; sections 1.3 and 1.4 are dedicated to the corpus that will be employed for the investigation and the methodology of our experiments. We will then present full quantitative data for the observations reported by grammars. The hypothesis that partial agreement is only operative in the domain of the clause will be discussed in sec. 2.2. Finally we will concentrate on factors that seem to play the biggest role in influencing the choice of the agreement pattern, namely constituent order (2.3), cognitive aspects (2.4), and the animacy value of the subjects involved in coordination (2.5).
1.1. Agreement and Coordinated Subjects: Some Definitions
Following Steele (1978: 610), Corbett (2006) defines agreement as a systematic and asymmetric covariance that is triggered by the formal or semantic features of a controller and influences the form of a target. This covariance can be effective over different domains and is manifested by some (morphological) features that are expected to covary.
In AG, the syntactic subject(s) acts as trigger in relation to the verb; the agreement is manifested primarily in the morphological feature of the number: the plurality of the subject triggers the use of the plural form for the targeted verb.2
Whenever two or more subjects are joined in a coordinated phrase, the two patterns that we mentioned above are possible. Two or more coordinated phrases can license either plural/dual, as in example 1, or singular agreement, as in ex. 2.3 The two patterns can even be used simultaneously, as in ex. 3.
In AG, as in other historical languages, the pervasive nature of this double strategy has produced variants in the manuscript transmission of the texts; in the passage of the Iliad reported in ex. 4 both the plural and the singular verb forms were already attested in Antiquity.4 Similar cases are documented also in Old Spanish (England 1976: 814, e.g.) and Latin (Johnson 2013: 77, on Suet. Aug. 73.1).
Although the existence and the diffusion of a similar double pattern is well known to linguists, the terminology remains rather fuzzy. The pattern of ex. 2 is often called asymmetric, analytic or (depending on the rank of the controller in the sequence of the subjects) either first- or second-conjunct agreement. Following a more neutral terminology, we will speak of “Resolved Agreement” (RA) for the agreement triggered by the coordinated phrase as a whole, and of “Partial Agreement” (PA) where only one subject in the coordinated phrase acts as controller.
1.2. Resolved and Partial Agreement in Current Linguistics
The blurring in the regularity of the agreement strategies that is determined by the presence of multiple coordinated subjects has attracted considerable attention. Many studies have documented the distribution of the agreement patterns throughout different languages or attempted to provide an explanation for the attested phenomena. As can be expected, the approaches and the solutions proposed vary considerably.
It is generally known that a number of factors play a role in the distribution of the constructions across the languages where both RA and PA are allowed. One of the most prominent is the semantic difference between the conjunctions that link the coordinated subjects. “Or-coordinates”, which are coordinated with disjunctive conjunctions that correspond to English “or”, tend to trigger PA more often than “and-coordinates”, for the meaning of the or-coordinates logically implies disjunction rather than conjunction. This distinction is known to play a role in AG as well (Smyth 1920: § 963–972), and it is attested even in a language that allows only a very limited space to PA such as English, where constructions like that of ex. 5 are the norm (Quirk et al. 1985: 759–765).
Constituent order is also known to play a crucial role in the agreement of conjoined phrases in many languages. In Moroccan and Lebanese Arabic, PA is restricted to conjoined postverbal subjects (Aoun et al. 1994). Likewise, PA is favored with preposed predicates in Old Norse and Serbo-Croatian (Johannessen 1996: 667), as well as in Portuguese (Munn 1993: 92–94) and Old Spanish (England 1976). In English too, PA is confined to a single structure with postverbal subjects, the there construction, as in the sentence: There is Paul and John (Quirk et al. 1985: 759–765; Munn 1993: 94–95). The distribution of RA and PA in Ancient Greek with respect to the order of constituents will be discussed in section 2.3.
Finally, animacy has been pointed out for its role in agreement across a significant number of languages. In Latin and in Modern Greek, it is mostly the animacy of the coordinated nouns that influences the choice between neuter (inanimate nouns) or masculine/feminine (animates) in gender agreement.5 Data from Old Spanish texts confirm that animacy is the most important factor in the distribution of RA and PA with coordinated subjects, along with the aforementioned order of the constituents (England 1976). Corbett (1979: 207) noted that conjoined animate subjects tend to trigger preferably “semantic agreement” (i.e. RA). In English too, ambiguity may arise whenever the conjoined subjects are two or more abstract nouns. In cases like ex. 6, both RA and PA are allowed (Quirk et al. 1985: 761).
The role of animacy in AG will be discussed in section 2.5.
At a more interpretative level, one approach that has been attempted to explain PA is to treat it as a peculiar type of clausal coordination where the verb forms are elided in the surface realization of the sentence. According to this hypothesis, which has been argued for most notably by Aoun et al. (1994), PA’s domain does not involve a triggering conjoined phrase and a target verb: what should be reconstructed instead is a set of coordinated VP’s, with the verb(s) elided in all but one of them.
This explanation (which will be discussed in light of the AG data in section 2.2) is defended for Modern Greek by Spyropoulos (2011), and is sometimes taken for granted in the interpretation of PA in Ancient Greek too (e.g., Devine and Stephens 2000: 158–159), but has been vigorously debated. In explicit contrast to the clausal explanation, most current accounts (Munn 2000; van Koppen 2007; Bošković 2009) assume that PA arises in the context of phrasal coordination. According to Principles and Parameters theory (Chomsky and Lasnik 1993), where coordinate structures are conjunction phrases with the first conjunct acting as the specifier, the conjunction as the head, and the second conjunct as the complement, such accounts posit a prominence asymmetry among conjuncts and make PA dependent on head government. Hence, PA is assumed to be a result of the fact that one of the conjuncts is higher or more prominent than the other. The more prominent conjunct is generally the first in head-initial languages and the second/last in head-final ones.
It is interesting to note that the oscillation between RA and PA is only one, albeit notable and relatively frequent, of the possible cases of mismatch between trigger and target that can occur in natural languages. Semantic factors are well known to impact on the syntactic phenomenon of subject-verb agreement, like, e.g., in the cases of the so-called “notional concord” of English (Quirk et al. 1985: 758–759).6
Along this line, research in comparative Indo-European grammar has questioned the existence of a systematic agreement (in Corbett’s terms) in reconstructed Proto-Indo-European. According to Meillet and Vendryes (1953: 598–600), the semantic value of the number was expressed by the morphology of subjects and verbs independently. Gradually, a more restrictive and systematic covariance emerged, imposing strict syntactic rules of subject-verb agreement. Yet, the pre-historical situation is witnessed by the many types of so-called agreement ad sensum that survive in the ancient texts.7 In this perspective, the verb number may still retain a semantic force, independently from its syntactic trigger: speakers can use a plural verb form to stress the plurality of the agents that join their efforts, while the singular can result from the fact that two coordinated concepts are in fact conceived as two facets of one single notion (see, e.g., ex. 6 above, and the discussion of sec. 2.5).
Other studies, on the other hand, have stressed the role played by cognitive factors. With stress on Latin data, Johnson (2013) has argued for an interpretation of PA as a form of “avoidance strategy” that is triggered whenever speakers are confronted with cognitive difficulties. “Speakers, when faced with the task of assigning gender to a mixed group of controllers, can instead choose to avoid the problem altogether by agreeing with only one antecedent (and usually the more local one)” (Johnson 2013: 82).
1.3. The Corpus: Treebanks of Ancient Greek
In what follows, we will use evidence from two treebanks of Ancient Greek in order to extract quantitative data on RA and PA and to discuss some of the hypotheses that were summarized above.
In current linguistics, a treebank is a corpus of sentences enriched with word-by-word morphological and syntactic analysis. For many languages, treebanks are widely used for corpus-based research. Two such treebanks that include some of the Greek texts that have survived from Antiquity are also available to the public.8
The Ancient Greek Dependency Treebank (AGDT), created in 2009, is the first syntactically annotated corpus of Greek literary texts of the Archaic and Classical Age (Bamman et al. 2009). In the release that we used (1.7), the AGDT amounted to a total of 354,529 annotated tokens, which are publicly available for download.9 The composition of our corpus is summarized in Table 1.10
PROIEL is a project from the University of Oslo that aims to provide aligned treebanks for translations of the New Testament in a broad spectrum of Indo-European languages (Haug and Jøhndal 2008), along with a selection of other prose texts for each language. For Greek, the morphological and syntactic annotation of Herodotus is ongoing. Table 2 summarizes the composition of the Greek sample of the PROIEL corpus that we have retained for the present work.11
In their extension, the two corpora are quite substantial. As can be seen, they include large samples (or in some cases the complete extant opera) of poets and prose authors of the Archaic and Classical period. For poetry, two major genres and cultural phenomena like epic poetry and tragedy are extensively represented in the AGDT. On the other hand, the texts that we used from PROIEL provide important evidence on prose genres, which are still scarcely represented in the other treebank. The sample of Herodotus is composed of significant excerpts from books 1, 4, 5, 6 and 7 of the Histories. The Greek New Testament is the only work that exceeds the chronological and geographical boundaries set by the other texts. We decided to retain it in our survey as a significant foil that the Archaic and Classical literary texts can be compared with.
Even after the addition of the PROIEL data, the corpus that we are investigating is not entirely balanced: the two Homeric poems, with their specific linguistic features, account for a section of the total that is not matched in size by any single other author or genre.12
Rather than being a limitation, this distribution of the data can be turned into an opportunity for research. As we will see, clear tendencies seem to emerge from the corpora and the two treebanks are still ongoing projects that are destined to be enriched with many more additions. In the future, it will be possible to test our conclusions on evidence from other genres and authors, as soon as they are made available. The fact that, for some authors, and for the Homeric poems in particular, the AGDT includes the complete data can also allow for what can be considered as definitive conclusions on stylistic trends in given texts. Some notable examples of this situation will be encountered during the discussion.
1.4. Methodology and Samples
All the texts in the AGDT and PROIEL corpus are annotated manually, with lemmatization and information on part of speech and morphological features; both treebanks note syntactic relations according to the principles of dependency grammar (Tesnière 1959). Although the rules for annotation and the tag set used by the two collections differ appreciably in some details, the differences can be considered negligible for our purposes, for the main phenomena under consideration (verb-subject relation and coordination) are treated in a very similar way. We decided however to automatically convert the data from PROIEL to the AGDT format; both corpora were subsequently queried using the software TrEd.13
With the help of interrogation software like TrEd, it is extremely easy to identify the sentences where coordinated subjects are attested and count the distribution of plural, dual and singular number of the governing verbs. The results that can be obtained by such a query are reported in Table 3.14 Although interesting for highlighting general trends, this evidence is not quite conclusive. We have mentioned the fact that logical disjunction reflected by or-coordinates favors PA: the evidence from the two treebanks reported in the table provides a confirmation to this and, in any case, the distribution of and- and or-coordination is quantitatively clearly unbalanced in favor of the former. The case of the and-coordinates appears therefore more interesting to study.
Furthermore, not all the coordinated phrases can be considered equally informative: those that are formed by plural subjects only do not provide decisive evidence, since there is no way to ascertain whether the plural number of the verb is the result of RA or it is triggered by just one of the (plural) controllers; the cases that need retain our attention are those in which real ambiguity can arise, namely the constructions where at least one of the subjects is singular.
These two observations provide some criteria to identify a more interesting sample of sentences within the two collections. We have therefore proceeded to isolate a narrow selection of sentences such that:
they feature at least two coordinated NP’s performing the function of subjects;
the NP’s are coordinated by conjunctions that introduce and-coordinates (kaí, ēdé, idé, te and commas);
the governing verb (that can be either the main predicate or the head of a subordinate clause) is expressed and displays the number feature overtly in its morphology.15
Finally, a few individual passages in the collections that still match these criteria but are too controversial or are not genuine examples of the phenomenon under consideration were excluded from the subset.16
In total, the restricted sample (that will be called Sample B) includes 666 sentences, 464 from the AGDT, 202 from PROIEL.
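The selection criteria for Sample B can be sketched as a simple filter. The following is only an illustration under stated assumptions: the record type `CoordinatedSubjectVerb`, its field names and the number labels `"sg"`, `"du"`, `"pl"` are hypothetical and do not reproduce the AGDT or PROIEL annotation format or the TrEd queries actually used; the subjects are assumed to be noun phrases already, and the manual exclusion of controversial passages is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

# Conjunctions listed in the text as introducing and-coordinates.
AND_COORDINATORS = {"kaí", "ēdé", "idé", "te", ","}


@dataclass
class CoordinatedSubjectVerb:
    """Hypothetical record: one verb with its coordinated NP subjects."""
    subject_numbers: List[str]      # number of each subject: "sg", "du" or "pl"
    coordinators: List[str]         # conjunctions linking the subjects
    verb_number: Optional[str]      # None if the verb is elided or unmarked


def in_sample_b(rec: CoordinatedSubjectVerb) -> bool:
    """Return True if the record meets the criteria listed for Sample B."""
    has_two_subjects = len(rec.subject_numbers) >= 2
    only_and_coordination = bool(rec.coordinators) and all(
        c in AND_COORDINATORS for c in rec.coordinators
    )
    # Only constructions with at least one singular subject are ambiguous.
    has_singular_subject = "sg" in rec.subject_numbers
    verb_shows_number = rec.verb_number in {"sg", "du", "pl"}
    return (
        has_two_subjects
        and only_and_coordination
        and has_singular_subject
        and verb_shows_number
    )
```

A record with two singular subjects linked by kaí and a singular verb passes the filter, whereas one whose subjects are all plural does not, since it cannot distinguish RA from PA.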
Once the subcorpus is defined, there remains the difficult methodological problem of what to count. Even if the number feature of the verb can be easily tested, a plural or dual verb does not necessarily indicate one construction or the other: a plural/dual verb (as in ex. 7) may be triggered by the closest plural/dual subject, and be therefore an example of PA.
The contrary, however, always holds true: singular agreement necessarily excludes RA and can therefore be considered a good measure of the distribution of the two constructions. In the following tests, we will report the number of cases of singular, plural or dual agreement with the verb, but we will also provide the percentage of singular agreement on the total of the matches as an index of RA vs PA. Although the number of cases of singular agreement (as opposed to non-singular agreement, i.e. the sum of both plural and dual agreement) underestimates PA, since some cases of authentic PA with plural/dual verb such as ex. 7 are left out, we consider it a good approximation nevertheless.
2. Resolved and Partial Agreement in Ancient Greek
Partial agreement appears to be a very pervasive phenomenon in AG. Singular agreement occurs more often (in circa 56 % of the cases) in the AGDT, while it is attested in 46 % of the cases in PROIEL. Looking at these general frequencies in the two corpora reported above (tab. 3), one has quite an opposite impression from what is often implied by the grammars, where the singular agreement is considered as an allowed variation from the standard rule.17
With subjects coordinated with “and-coordinate” conjunctions, the ratio of non-resolved agreements seems in line with the general trend; or-coordinates, although much less attested than and-coordinates, show a more pronounced tendency for singular agreement. The general data of the two treebanks, reported in tab. 4, attest that there is a correlation between type of coordination (and- or or-coordination) and the choice of construction. A chi-square test for independence shows that this correlation is very significant (χ² = 21.90, df = 1, p < 0.001), but the effect size (φ = 0.142) is rather weak.18 Moreover, a reading of the contribution of the different cells of tab. 4 to the chi-square test (Gries 2009: 175) shows that the two combinations that are contributing the most to the highly significant outcome are the or-coordinates with singular agreement (10.24) and especially with plural agreement (10.67).19 This result may therefore be too influenced by the scarce number of observations for or-coordinates to be conclusive. More work on the or-coordinates is therefore needed to confirm the results of our scrutiny of the AGDT and PROIEL.
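For readers who wish to see how the three quantities mentioned here relate to one another, the following minimal sketch computes a Pearson chi-square statistic, the effect size φ = √(χ²/N), and the per-cell contributions (O − E)²/E for a 2 × 2 table. It assumes a test without continuity correction, which may differ from the procedure actually applied; the function name `chi_square_2x2` is hypothetical, and the counts in the usage line are invented for illustration and are not the values of tab. 4.

```python
from math import sqrt


def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table of observed counts.

    Returns (chi2, phi, contributions), where contributions[i][j] is the
    share of cell (i, j) in the statistic, (O - E)^2 / E.
    No continuity correction is applied (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i in range(2):
        row = []
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            row.append((table[i][j] - expected) ** 2 / expected)
        contributions.append(row)
    chi2 = sum(sum(row) for row in contributions)
    phi = sqrt(chi2 / n)  # effect size for a 2x2 table
    return chi2, phi, contributions


# Invented counts (rows: and-/or-coordination; columns: singular/non-singular).
chi2, phi, cells = chi_square_2x2([[10, 20], [20, 10]])
print(f"chi2 = {chi2:.2f}, phi = {phi:.3f}")
```

Because φ divides the statistic by the sample size, a large sample can yield a highly significant χ² together with a weak effect, which is the situation described above.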
Even if they are sufficient to cast doubts about the interpretation of PA as an exception, the general data do not provide a clear indication on how PA and RA are distributed in the corpora. For one thing, as we saw in sec. 1.4, the general sample is limited by the presence of both and- and or-coordinates, and by the fact that all-plural subjects may yield false negative results. On the other hand, the distributions of singular and non-singular (i.e. plural + dual) agreement for the and-coordinate reported in tab. 3 may at first sight suggest the hypothesis that the two constructions are evenly distributed, with each of them having a 50 % probability of occurring.
Data from our Sample B, which as we saw is limited to a specific set of and-coordinates with at least one singular subject, do offer a different picture, where the trend is more decidedly in favor of PA. This sample is best suited to be tested for the hypothesis of an equal distribution of the two constructions.
Table 5 reports the results from Sample B. If we consider the totals, the observed distribution of singular agreement (436 cases) versus non-singular (plural + dual: 232 cases) in our sample deviates very significantly from the hypothesis of an even distribution.20 Singular agreement is encountered significantly more often than it is to be expected from the hypothesis of a 50 % probability. This observed distribution, a fortiori, provides a further argument against the view of some normative grammars, namely that singular agreement is the exception to a standardized agreement rule.
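The comparison with an even distribution can be illustrated with the totals given above. This sketch assumes a chi-square goodness-of-fit test against equal expected frequencies, which is not necessarily the test reported in the note; the function name `even_split_test` is hypothetical.

```python
def even_split_test(singular, non_singular):
    """Chi-square goodness of fit against a 50/50 split (df = 1).

    Returns the statistic and the share of singular agreement.
    """
    total = singular + non_singular
    expected = total / 2  # same expected count for both outcomes
    chi2 = (
        (singular - expected) ** 2 + (non_singular - expected) ** 2
    ) / expected
    return chi2, singular / total


# Totals of singular vs non-singular agreement reported for Sample B.
chi2, singular_share = even_split_test(436, 232)

# 10.828 is the critical chi-square value for p = 0.001 with df = 1.
print(f"chi2 = {chi2:.2f}, singular share = {singular_share:.1%}")
print("exceeds the p = 0.001 threshold:", chi2 > 10.828)
```

With these totals the statistic lies well above the critical value, in line with the statement that singular agreement is more frequent than an even split would predict.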
As is visible from the variations in the rows of tab. 5, the distribution of singular and non-singular agreement is indeed variable between the authors. In general, PA accounts almost always for more than 60 % of the cases, with an average of 65 %; Herodotus (with 50 %) and Hesiod (56 %) mark two exceptions to this picture. The hypothesis that the distribution is influenced by genre and individual style is entirely plausible. The question, however, would require the creation of an ad-hoc representative corpus of the different genres and authors to be tested, and will be therefore left for future work.
On the other hand, our data confirm the assumption of Johnson (2013): cases where actual ambiguity in the agreement of the verb can arise from the presence of coordinated phrases that meet the conditions listed for Sample B are a rare phenomenon. As can be observed in the sixth column of Tab. 5, verbs that can be tested for our experiments cover a very limited portion of the text of each author, which consistently amounts to less than 1 % of the total number of tokens, with an average of 0.12 %. The tragedians Aeschylus and Sophocles stand out for both the very low incidence of the phenomenon and the extremely high rate of PA.
In both the general corpus and in Sample B, close-conjunct agreement (i.e. the situation where it is the subject which lies closest to the verb that triggers agreement) is dominant. In our inspection of the datasets we were able to find only two cases where PA is triggered by the furthest subject. They are reported below in exx. 8 (a case of or-coordinate) and 9.21
2.2. Clausal Interpretation?
As we have seen in sec. 1.2, Aoun et al. (1994) have proposed, on the basis of data from Moroccan and Lebanese dialects of Arabic, to account for RA and PA in terms of the difference between phrasal and clausal coordination.
It is tempting to adopt this interpretation for the PA patterns of our data too, especially when the order of constituents follows the pattern: subject-verb-second subject (Johnson 2013; and see next section). We may observe that clausal coordination is indeed implied in some of the sentences that display that order, as in ex. 10.
In this case, both the verb (the copula “was”) and the nominal predicate (“best”) agree with the singular subject (“he”). We are, therefore, required to mentally supply not only a verb, but also a second nominal predicate. The entire sentence should be interpreted as: hò phértatos êen, híppoi-te [phérteroi êsan].
The main argument that supports the hypothesis of Aoun et al. (1994) is the fact that some lexical items that imply semantic plurality (e.g., the equivalent of the English verb “meet”, or of the adverb “together”) seem to be incompatible with PA in Moroccan and Lebanese Arabic. This fact, however, does not hold for all the languages where PA is admissible, as the examples adduced by Johannessen (1996) prove.22
Evidence that contradicts the analysis of Aoun et al. (1994) is also found in the AG treebanks. We may consider the example of the verbs composed with the preverb sýn (“with”, “jointly”). When they are not used with a dative complement, these verbs imply the cooperation of two or more different agents in the same action. Accordingly, the meaning of the preverb should require semantic plurality and be incompatible with a clausal interpretation.
In the 4 cases attested in Homer, sýn-compounds are never used with PA; plural agreement is regular, and dual is often employed in case of two coordinated subjects. Ex. 11 is a famous case of a similar pattern.23
However, example 12 shows that PA is admissible even in such cases. In this sentence, the clausal interpretation must be ruled out: the city and the justice operate together.24
Sentence adverbials of similar meaning yield the same results. The adverb háma (“together”, “at the same time”) is prima facie hardly compatible with a clausal interpretation, like the Arabic counterpart studied by Aoun et al. (1994).25 Again, in Homer, we observe that in the two cases where two warriors converge together to the same place, it is the plural or dual which is preferred (Il. 9.170 and 10.196–197). Yet, exx. 13 and 14 show that partial agreement is quite admissible in such cases.
All these examples show that clausal coordination alone does not account for all the variance in the agreement patterns of AG. The evidence offered by the samples considered, though, is too scanty to conclude with certainty whether a tendency, like the avoidance of PA with the sýn-compounds that seems to be recognizable in Homer, is indeed operative in our corpus.
2.3. Agreement and the Order of Constituents
As for the distributions of the coordinated subjects, four different constructions are possible and attested in the treebanks of AG. The subjects (indexed 1 to n: Sb₁,ₙ) can precede the verb (V), follow it, or one single subject (Sbₐ) can be isolated either to the left or right of V.
As we reported above, many languages tend to favor or even restrict PA to postverbal subjects. Although by no means confined to this position, PA tends to be more frequent when the conjoined subjects follow the verb in AG too. The constituent orders for the two agreement patterns in Sample B are summarized in Table 6; Table 7 reports a synthetic view of the distributions by author and text.26
Leaving the cases of right and left dislocation of a single subject aside, the first two rows of tab. 6 can be used to test the hypothesis that the choice of one construction over the other is influenced by the position of the coordinated subjects before or after the verb. A chi-square test for independence confirms the hypothesis of a correlation between pre- or postverbal order of the subjects and the agreement pattern: according to the test, this correlation is highly significant.27 Postverbal subjects show a strong tendency to avoid plural agreement and to be associated with singular agreement. The opposite is true for preverbal subjects.
The configuration with a single left-dislocated subject, where a subject is isolated from the rest of the coordinated phrase in preverbal position as in ex. 15, is the one where PA figures most prominently in tab. 6 and 7. This is hardly surprising: the isolated subject that acts as trigger seems to be placed in an emphatic position; the agreement can be seen as stressing the prominence of the single noun in the relation with the verb.28
In this group, left-dislocated singular subjects followed by plural verbs (and the rest of the conjoined phrase) have drawn the greatest attention since Antiquity.29 The ancient grammarians named this pattern schema Alcmanicum, and explicitly associated it with the work of the poet Alcman (7th century BCE).30 One of the best-known examples of that construction is ex. 11 quoted above. As the name itself makes clear, the Ancients regarded it as a strained, poetical constituent order, and linked it to archaic literary style.
More recently Devine and Stephens (2000: 158–159) have argued that the conjoined phrases with a single left-dislocated subject should be explained as a type of appositive structure that is frequently observed in nonconfigurational languages. As for the schema Alcmanicum, they observe that the construction is not limited to Greek, but attested also in a few other archaic Indo-European poetic texts.31 They propose to read it as a construction with null pronominal argument and left- and right-dislocated adjuncts. In their words, “more than any other piece of evidence, the schema Alcmanicum requires us to take seriously the idea that in its prehistory Greek was not only a nonconfigurational language but one that made at least some use of pronominal arguments” (Devine and Stephens 2000: 159).
The hypothesis is fascinating, especially since the case for the survival of traces of nonconfigurational syntax in AG is strong (see also Luraghi 2010). The archaic nature of the construction should thus explain its relative rarity.
2.4. Cognitive Factors and Oral Composition: The Case of Homeric Enjambment
Many of the explanations of the AG data that we have met so far (including that of Meillet and Vendryes 1953 and Devine and Stephens 2000) stress the notion of syntactic archaisms that perpetuate old Indo-European constructions in historical times. This trait might very well suit such rare cases as the schema Alcmanicum; however, the tendency towards postverbal PA is so marked, both comparatively within unrelated languages and in the different texts of our corpus, that it inevitably calls for other explanations.
Cognitive factors can certainly play an important role. As we reported, Johnson (2013) has stressed the fact that assigning the required morphological feature to the target in the (comparatively rare) case of conjoined subjects represents a cognitive challenge to the speakers. Preverbal position in AG is a place of special prominence, generally reserved to pragmatically marked constituents, such as shifting or contrastive topics or narrow foci (Dik 1995; Matić 2003). It is reasonable to assume that resolution, i.e. the operation that computes the members of a conjoined phrase and assigns the value of plurality to their sum, is favored when the multiple triggers are produced before the target in the actual utterance; in addition to that, preverbal coordinated subjects are also more salient in the information structure and are therefore more present to the attention of the speakers, as it is shown by the fact that they are given a pragmatically preeminent position.
A special feature of the Homeric texts seems also to confirm that RA and PA imply a difference in the way the information is organized in the sentence. Bakker (1990), among others, has persuasively shown how the appositive syntax of the Homeric poems is connected with its composition as oral poetry. In particular, he focuses on the peculiar nature of the Homeric enjambment (i.e. the non-coincidence of syntactic and metrical boundary) to show that metrical pauses correspond to well delimited pieces of linguistic information that are produced and cognitively processed as a unity.
If we apply Bakker’s hypothesis on the enjambment to the case of the agreement pattern, our Homeric data offer a very precise picture. Among the sentences of the Iliad and the Odyssey included in our Sample B, we were able to find only one case where the subjects that trigger plural agreement are separated by verse end (ex. 16; line end is marked with “/”).
Conversely, we counted 29 cases in the Homeric sentences of our sample where subjects in PA are separated by verse end, as in exx. 17 and 18.32
Moreover, in 72 cases, enjambment separates the verb and the subjects: RA is triggered in 50 of them (as in ex. 19); conversely, PA is attested in 22 (see ex. 20).
In PA, thus, the cognitive and linguistic bond between the subject-trigger and the verb seems tighter: when subjects are realized on the same line and the verb occurs in the following, plural agreement is triggered more often than the singular. What emerges clearly, however, is the stronger bond between the subjects that form the conjoined phrase in the cases of RA. Whereas, as we saw, enjambment between them is extremely rare, verse end can separate the verb from the conjoined subjects rather often (see, e.g., ex. 1, quoted above).
The different behavior of RA and PA constructions in relation to enjambment is a strong indication that the syntactic phenomenon of agreement between verb and conjoined noun phrases is also subject to the constraints of oral communication and performance. In speech, syntactic coherence tends to work on a smaller scale than a complete sentence, and to be confined to smaller information units (Bakker 1990). RA and PA in Homer differ precisely on the boundaries of these units: the subjects that trigger RA seem to be bound together more tightly and tend to be realized as a discrete bit of information; in any case, line end between them is clearly avoided. Whereas in PA the bond links the trigger-subject and the verb, in RA it is the conjoined subjects that form the tighter group.
As mentioned, previous works have stressed the role of the semantic property of the subjects known as animacy in the choice of the agreement pattern. In particular, it has been observed that in a number of languages coordinated abstract notions (see above, ex. 6) and animate subjects tend to favor PA and RA respectively.
To measure what role (if any) animacy plays in our Sample B, we decided to cross our data with the “Animacy Greek Lexicon” created from the PROIEL corpus (Haug and Jøhndal 2008). We have annotated manually those lemmata that were not included in the lexicon according to the same taxonomy and guidelines (Zaenen et al. 2004).33 As a following step, we have assigned an animacy label, which resulted from the combination of the animacy value of the single subjects, to the conjoined phrase as a whole.34 Table 8 summarizes the results for each of the categories of the animacy lexicon.
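The combination step can be pictured with a minimal sketch. It rests on an assumption of ours rather than on a description of the actual annotation procedure: a conjoined phrase is taken to inherit the label shared by all of its subjects, and to receive the label “mixed” as soon as the subjects carry different animacy values. The function name `combine_animacy` is hypothetical.

```python
def combine_animacy(labels):
    """Combine the animacy labels of the conjuncts into one phrase-level label.

    Assumption (ours): identical labels are passed on unchanged;
    any disagreement between the conjuncts yields "mixed".
    """
    unique = set(labels)
    if not unique:
        raise ValueError("a conjoined phrase needs at least one subject")
    # One distinct value: the phrase keeps it; otherwise the phrase is "mixed".
    return unique.pop() if len(unique) == 1 else "mixed"


# Two human conjuncts keep the label "human".
print(combine_animacy(["human", "human"]))
# Conjuncts with different values are labelled "mixed".
print(combine_animacy(["concrete", "nonconcrete", "human"]))
```

Under this reading, only phrases whose conjuncts all share one value enter the per-category counts, while every heterogeneous phrase falls under “mixed”.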
These values can be further grouped using two different sets of oppositions. The “human” label is used for persons and other animate entities, such as the gods and heroes of Greek mythology or personifications of concepts, like Aidos (respect) and Nemesis (righteous anger) in ex. 21. The subjects labelled as “human(s)” (326 cases) can then be taken to represent the category of “animated” and contrasted with the other categories taken together (271 cases, with the exception of “mixed”). On the other hand, the “non-concrete” group (146 cases) can be opposed to all the others (again with the exception of the “mixed” values) to represent the class of abstract subjects (but see the discussion below).
The score of PA in conjoined phrases formed by “non-concrete” nouns is indeed very high, close to 100 %. It must be observed, however, that the category is very general, and includes all nouns that are clearly inanimate. Nouns for genuine abstract notions (e.g., díkē, aidṓs, pístis: “justice”, “reverence”, “trust”), sounds (sígē, oimogḗ, féngos: “silence”, “wailing”, “voice”), passions and sensations (chármē, pénthos, hímeros: “joy of battle”, “grief”, “desire”), or for the mind and other inner activities (thymós, frḗn, nóēma: “soul”, “mind”, “thought”) are all grouped together under this label. Further study is needed to see how these different classes of nouns behave in agreement patterns.
By contrast, the trend in favor of RA when human subjects are involved is unmistakable. Within the groups of animacy values, conjoined phrases with all human subjects are the only class where RA outnumbers PA. The correlation between the animacy of subjects and agreement structure is confirmed by a chi-square test for independence, which shows it to be highly significant: non-singular agreement is strongly favored with animate subjects and equally strongly disfavored with inanimate coordinated nouns.35
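The test can be retraced from the counts given in the note on Table 8: 146 singular vs. 180 non-singular agreements for human subjects, and 240 vs. 31 for inanimate subjects. The sketch below is illustrative only. It assumes a Pearson chi-square without continuity correction and an effect size computed as φ = √(χ²/N); the function name `chi_square_2x2` is ours.

```python
from math import sqrt


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and phi for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of animacy and agreement.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    phi = sqrt(chi2 / n)  # effect size for a 2x2 table
    return chi2, phi


# Rows: human vs. inanimate subjects; columns: singular vs. non-singular agreement.
counts = [[146, 180], [240, 31]]
chi2, phi = chi_square_2x2(counts)
print(f"chi2 = {chi2:.4f}, df = 1, phi = {phi:.3f}")
```

With these counts the sketch returns a χ² of about 124.10 and a φ of about 0.456, far above the critical value for p < 0.001 with one degree of freedom.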
Human subjects also display another peculiarity, namely the abnormally high number of dual agreements, a feature that is absent from all the other categories, with the exception of a single sentence (quoted above: ex. 11) where two rivers are the subjects.36
Both the trend toward RA and the association of the dual with human subjects are clearly confirmed by the distribution of the animacy patterns in the single authors and genres of our sample. In the text of Herodotus, the strong preference for RA with human coordinated subjects is even more marked. Data from tragedy, with a high value of PA (67 %) even with human subjects, might seem to mark an exception. It must be noted, however, that the only five cases of dual agreement with conjoined subjects in Aeschylus and Sophocles are indeed triggered by human subjects.
This distribution of the dual can hardly be a coincidence. Whenever two persons play the role of syntactic subjects, the plural is often used in place of the dual. But whenever the dual is in fact employed, the idea of duality, of mutual involvement of the two parties, is stressed (Chantraine 1953: 25–26). Ex. 21 from Hesiod’s Works and Days illustrates the point very clearly: the two moral concepts of aidṓs (“respect”, “shame”) and némesis (“righteous anger”) are personified and portrayed as they fly away from mortal men; the dual, along with vivid descriptive details such as the veiled heads of the two (lines 197–198), stresses the personification.
The fact that plural and dual agreement alternate only with a particular class of subjects seems to point to one conclusion. Human subjects are conceived as agents who can be involved in a common action to a special degree. RA, particularly in the specially marked resolution represented by a dual verb agreeing with two singular conjoined subjects, is another way to represent this form of mutual involvement.
By gathering evidence from the different texts, genres and periods that make up our corpus, this survey has confirmed the importance of the double agreement strategy in the case of conjoined subjects. The distribution of RA and PA remains fluid and variable across all our data. At the same time, our Sample B proves that PA, rather than being an exception, is a construction that is firmly established in AG. The impressive quantitative relevance of PA suggests that it is more than a mere deviation from a rigid syntactic behavior. On the contrary, semantic and discursive factors influence the choice of one or the other construction.
In the previous paragraphs we have presented and discussed some of the best-known factors that bear on the choice between RA and PA in light of the AG material. In particular, we hope that we have succeeded in highlighting two of the semantic and communicative constraints. Firstly, whenever the bond between one subject and the predicate is stronger, while the other subject(s) is (are) added as adjunct(s) in the context of a discursive flow with a loosened syntax, PA is favored. This is clear in the Homeric poems, where the verse end (which typically marks a pause in the flow of thoughts) is generally avoided between two subjects agreeing with a plural verb.
Secondly, whenever the mutual involvement of the agents is stressed, as in the case of human subjects in all the texts of our corpus, RA (plural and especially dual, at least in the poetical texts where the dual is still used) is the favored choice; in any case, PA is considerably reduced by the concurrent rise of the resolved pattern.
Abeillé, Anne (ed.). 2003. Treebanks. Building and Using Parsed Corpora. Dordrecht and Boston: Kluwer Academic Publishers.
Aoun, Joseph, Elabbas Benmamoun and Dominique Sportiche. 1994. Agreement, word order and conjunction in some varieties of Arabic. Linguistic Inquiry 25(2):195–220.
Aoun, Joseph, Elabbas Benmamoun and Dominique Sportiche. 1999. Further remarks on first conjunct agreement. Linguistic Inquiry 30:669–681.
Badecker, William. 2007. A feature principle for partial agreement. Lingua 117(9):1541–1565.
Bakker, Egbert J. 1990. Homeric discourse and enjambement: A cognitive approach. Transactions of the American Philological Association 120:1–21.
Bamman, David, Francesco Mambrini and Gregory Crane. 2009. An Ownership Model of Annotation: The Ancient Greek Dependency Treebank. In: Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories, Milan, 5–15. Milan: EDUCatt.
Bošković, Željko. 2009. Unifying first and last conjunct agreement. Natural Language & Linguistic Theory 27(3):455–496.
Burkert, Walter. 1985. Greek Religion: Archaic and Classical. Oxford: Blackwell.
Calame, Claude. 1983. Alcman. Fragmenta edidit, veterum testimonia collegit C. Calame. Romae: in Aedibus Athenaei.
Chantraine, Pierre. 1953. Grammaire homérique, vol. 2: Syntaxe. Paris: Klincksieck.
Chila-Markopoulou, Despina. 2003. Γένος και συμφωνία στη Νέα Ελληνική [Gender and agreement in Modern Greek]. In: Anna Anastasiadi-Simeonidi, Angeliki Ralli and Despina Chila-Markopoulou (eds.), Το Γένος [Gender], 132–167. Athens: Patakis.
Chomsky, Noam and Howard Lasnik. 1993. The theory of principles and parameters. In: J. Jacobs, A. von Stechow, W. Sternefeld and Th. Vennemann (eds.), Syntax: An International Handbook of Contemporary Research, vol. 1, 506–569. Berlin: Walter de Gruyter.
Corbett, Greville. 1979. The agreement hierarchy. Journal of Linguistics 15:203–224.
Corbett, Greville G. 2006. Agreement. Cambridge: Cambridge University Press.
Dawe, Roger. 1967. The end of Seven against Thebes. Classical Quarterly 61:16–28.
Devine, Andrew M. and Laurence D. Stephens. 2000. Discontinuous Syntax: Hyperbaton in Greek. Oxford: Oxford University Press.
Dik, Helma. 1995. Word Order in Ancient Greek: A pragmatic account of word order variation in Herodotus. Amsterdam: J.C. Gieben.
Dik, Helma. 2007. Word Order in Greek Tragic Dialogue. Oxford: Oxford University Press.
England, John. 1976. “Dixo Rachel e Vidas”: Subject-verb agreement in Old Spanish. The Modern Language Review 71(4):812–826.
Evelyn-White, Hugh G. 1913. Hesiodea. Classical Quarterly 7:217–219.
Gries, Stefan. 2009. Statistics for Linguistics with R: A Practical Introduction. Berlin and New York: Mouton De Gruyter.
Hajdú, Kerstin. 1998. Ps. Herodian, De Figuris: Überlieferungsgeschichte und kritische Ausgabe. Berlin and New York: de Gruyter.
Haug, Dag Trygve Truslew and Marius Larsen Jøhndal. 2008. Creating a parallel treebank of the old Indo-European Bible translations. In: Proceedings of the Second Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2008), 27–34. Marrakech, Morocco: European Language Resources Association (ELRA).
Humbert, Jean. 1960. Syntaxe grecque. Paris: Klincksieck, 3 ed.
Jebb, Richard C. 1892. Sophocles. The Plays and Fragments. Part V. The Trachiniae. Cambridge: Cambridge University Press.
Johannessen, Janne B. 1996. Partial agreement and coordination. Linguistic Inquiry 27(4):661–676.
Johnson, Cynthia A. 2013. Multiple antecedent agreement: A comparative study of Greek and Latin. In: Stephanie W. Jamison, H. Craig Melchert and Brent Vine (eds.), Proceedings of the 24th Annual UCLA Indo-European Conference, 67–86. Bremen: Hempen.
Kühner, Raphael and Bernard Gerth. 1898. Ausführliche Grammatik der griechischen Sprache. Zweiter Teil: Satzlehre. Erster Band. Hahnsche Buchhandlung.
Lattimore, Richmond. 1951. The Iliad of Homer. Translated by R. Lattimore. Chicago: University Of Chicago Press.
Leaf, Walter. 1902. The Iliad. Vol. 2: Books XIII–XXIV. London: Macmillan, 2 ed.
Liddell, Henry George and Robert Scott. 1996. A Greek-English Lexicon. With a Revised Supplement. Oxford: Oxford University Press, 9 ed.
Lloyd-Jones, Hugh. 1994. Sophocles, Volume II. Antigone. The Women of Trachis. Philoctetes. Oedipus at Colonus. Loeb Classical Library. Cambridge: Harvard University Press.
Luraghi, Silvia. 2010. The Rise (and Possible Downfall) of Configurationality. In: Silvia Luraghi and Vit Bubenik (eds.), Continuum Companion to Historical Linguistics, 212–229. London and New York: Continuum.
Luraghi, Silvia. 2011. The origin of the Proto-Indo-European gender system: Typological considerations. Folia Linguistica 45:435–463.
Maehler, Herwig. 1967. Griechische literarische Papyri. Museum Helveticum 24:61–78.
Matić, Dejan. 2003. Topic, focus and discourse structure: Ancient Greek word order. Studies in Language 27(3):573–633.
Meillet, Antoine and Joseph Vendryes. 1953. Traité de grammaire comparée des langues classiques. Paris: Honoré Champion, 2 ed.
Munn, Alan. 1993. Topics in the Syntax and Semantics of Coordinate Structures. PhD thesis, University of Maryland.
Munn, Alan. 1999. First conjunct agreement: Against a clausal analysis. Linguistic Inquiry 30(4):643–668.
Munn, Alan. 2000. Three types of coordination asymmetries. In: Kerstin Schwabe and Nung Zhang (eds.), Ellipsis in conjunction, 1–22. Tübingen: Max Niemeyer.
Passarotti, Marco. 2009. Theory and practice of corpus annotation in the “Index Thomisticus Treebank”. Lexis 27:5–24.
Piotrowski, Michael. 2012. Natural Language Processing for Historical Texts. Morgan and Claypool Publishers.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik. 1985. A comprehensive grammar of the English language. Harlow: Longman.
Smyth, Herbert Weir. 1920. A Greek grammar for colleges. New York: American Book Company.
Sommerstein, Alan H. 2008. Aeschylus, Vol. I. Persians, Seven against Thebes, Suppliants, Prometheus Bound. Loeb Classical Library. Cambridge: Harvard University Press.
Spyropoulos, Vassilios. 2011. Some remarks on conjunction and agreement in Greek: Implications for the theory of agreement. In: Proceedings of the 7th International Conference of Greek Linguistics. University of York: Available online at: http://22.214.171.124/icgl7/Spyropoulos.pdf.
Steele, Susan. 1978. Word order variation: a typological study. In: Joseph H. Greenberg, Charles A. Ferguson and Edith A. Moravcsik (eds.), Universals of human language. Vol. 4: Syntax, 585–623. Stanford: Stanford University Press.
Tesnière, Lucien. 1959. Éléments de syntaxe structurale. Paris: Klincksieck.
van Koppen, M. 2007. Agreement with coordinated subjects. A comparative perspective. Linguistic Variation Yearbook 7(1):121–161.
West, Martin L. 2011. Old Avestan Syntax and Stylistics: With an Edition of the Texts. Berlin and New York: de Gruyter.
Zaenen, Annie, Jean Carletta, Gregory Garretson, Joan Bresnan, Andrew Koontz-Garboden, Tatiana Nikitina, M. Catherine O’Connor and Tom Wasow. 2004. Animacy Encoding in English: why and how. In: Proceedings of the 2004 ACL Workshop on Discourse Annotation, 118–125. Association for Computational Linguistics.
In the examples, the ancient texts are quoted with the abbreviations used in Liddell and Scott (1996). For the translations, we have generally adopted the versions reproduced in the Perseus Project; however, for specific authors and texts we have also used (occasionally, with minor modifications): Lattimore (1951), Lloyd-Jones (1994), Sommerstein (2008); for the Odyssey, we adopted James Huddleston’s translation published in the Chicago Homer: http://digital.library.northwestern.edu/homer/.
The gender feature is also involved, e.g., whenever a copular verb is construed with a nominal predicate (see example 10 below) or in the case of periphrastic verb forms. Though common, e.g., in Latin, periphrastic forms are not very frequent in our AG corpus: in this work, we have therefore decided to concentrate our attention only on the number feature.
It may be noted that in 2, although the subjects are formally both plural, singular agreement is licensed by the closest conjunct hórkia (neuter plural), which triggers singular agreement, as neuter plurals can do in AG, on account of a morpho-syntactic archaism in the language.
The plural sáōsan is transmitted by the manuscripts of the poem. The singular aorist optative saṓsai was the reading adopted by the Alexandrian philologist Aristarchos (3rd–2nd century BCE). The optative is generally considered more expressive by the editors (cf. Leaf 1902: ad 21.609 and 611) and therefore Aristarchos’ reading is preferred in many modern editions.
On Latin, see Johnson (2013), with bibliographical references to the standard grammars. For the difference between human and non-human nouns in the resolution of gender agreement in Modern Greek, see Chila-Markopoulou (2003) and Spyropoulos (2011).
Badecker (2007) has drawn attention to the cases of mismatch between grammatical vs semantic gender/number. In French, e.g., mannequin (model) is morphologically masculine and normally triggers masculine agreement even when it is used with a feminine referent. In coordinated phrases, however, the evidence shows that semantic agreement is favored: Le mannequin et sa maquilleuse sont assises (*assis) dans le coin (Badecker 2007: 1543). Corbett (1979) has proposed an agreement hierarchy to account for the influence of semantic and syntactic factors in the different types of agreement.
For the exceptions to subject-verb agreement in AG, cf. the lengthy discussion in Kühner and Gerth (1898: 52–88). In rather romantic terms, Kühner and Gerth (1898: 52–53) stated that the frequent usage of ad sensum constructions is a peculiarity of the Greeks, “whose free spirit considered the dead form of the word as less important than the living content of the form”. The rise of gender agreement discussed by Luraghi (2011) is somewhat comparable to what Meillet and Vendryes have hypothesized for number agreement.
Abeillé (2003) provides an excellent introduction to treebanks, with many examples for different languages. See also Passarotti (2009) for a treebank of Medieval Latin, with useful theoretical discussion. Piotrowski (2012) provides a detailed discussion of Natural Language Processing for historical languages: see in particular 85–100 for a discussion of part-of-speech tagging, lemmatization and syntactic parsing; 114–115 for an overview of corpora of Greek and Latin.
Dates refer to the century BCE. Brackets are used for works of contested authorship. A new prose text, Book 12 of the Deipnosophistae by Athenaeus of Naucratis (3rd Century CE; philosophical prose, 19,961 tokens), was added to the AGDT 1.7 too late for us to include it in the present work.
The corpus can now be downloaded at: http://proiel.github.io/. The version we used was retrieved on February the 8th, 2014 from: http://foni.uio.no:3000/.
The fact that a significant part of our evidence consists of poetic works also raises the important methodological question of the metrical constraints on the language of the texts. This issue, however, is not specific to our treebank-based approach, but is shared by all linguistic studies that deal with the stages and strata of Ancient Greek where poetical texts constitute indispensable (and very often unique) evidence. To this it may be added that metrical factors, although arguably crucial in some examples (e.g., ex. 3), cannot be invoked to explain other cases, where both singular and plural verb forms are either possible (e.g., ex. 14) or even attested (as in ex. 4). Sensible arguments about the problem can be read in Dik (2007).
The software, originally developed for annotating and querying the Prague Dependency Treebank of Czech, can be downloaded at: http://ufal.mff.cuni.cz/tred/. In order to load our data into TrEd we used a conversion stylesheet developed by the Alpheios project that can be obtained at: http://sourceforge.net/p/alpheios/code/HEAD/tree/xml_ctl_files/xslt/trunk/aldt2pml.xsl.
“And-coordinates” are introduced by the following lemmas: kaí, ēdé, idé, te or, in six cases in our corpus, by a comma. “Or-coordinates” are governed by the conjunction ē (also spelled ēé in epic poetry). The line “total” includes the comprehensive counts of all the sentences where coordinated subjects are governed by a conjunction annotated as COORD (i.e. head of coordination) in our treebanks. This may include other lemmas (like the adversative allá) and the total is therefore higher than the sum of and- and or-coordinates.
Although a few cases of genitive absolute (4 in the AGDT, 12 in PROIEL) would respect these constraints, we have decided not to include them: the peculiar nature of the absolute construction of the genitive participle sets them aside from the rest and these cases are best left to be studied separately.
The sentence corresponding to Hes. Op. 169b in Evelyn-White’s Loeb edition (on which the AGDT treebank of Hesiod is based) is an interesting case that had to be ruled out. This sentence comes from the problematic lines transmitted by two papyri that are numbered as 173a–e in the more recent editions. The conjoined subjects are introduced by a conjectural reconstruction of line 173c in PGenav 94 proposed by Evelyn-White (1913: 218–219). However, the alternative reconstruction of the line offered by Maehler (1967: 63, 66–69), which doesn’t have coordinated subjects, is nowadays generally adopted, especially because it is more compatible with the evidence from the other papyrus (PBerol 21107). This sentence proves that philological scrutiny of every case is crucial when working with treebank data of AG. Other cases that we have ruled out involve coordinated words that refer to the same person or concept, as in, e.g., Matth. 7.26: kaì pâs ho akoúōn mou toùs lógous toútous kaì mḕ poiôn autoùs homoiōthḗsetai andrì mōrôi (“but everyone who hears these words of mine and does not put them into practice is like a foolish man”), where the participles akoúōn and poiôn were treated as subjects by the annotators.
See, e.g., Smyth (1920: 265); even the detailed and informed discussion in Kühner and Gerth (1898: 77) is placed right after a long section dedicated to the exceptions to the rule of agreement. On the other hand, Humbert (1960: 73–76) presents the different constructions in a more neutral way.
We have calculated the effect size φ using the formula discussed by Gries (2009: 173–174; see 165–180 for a detailed introduction to the chi-square test for independence); a φ score ranges between 0 and 1, so that a value of 0.142 can be read as rather low.
Both are close to the χ2 value for p < 0.001 with df = 1 (10.87); the contribution of the cells for and-coordinate (0.48 and 0.50, for singular and non singular agreement respectively) is considerably lower.
A two-tailed chi-square goodness-of-fit test (df = 1) yields a χ2 value of 62.30, with p < 0.001. On the chi-square goodness-of-fit test see Gries (2009: 151–158).
The agreement in S. Tr. 883–884 is thus explained by Jebb (1892): “The words ḕ tínes nósoi are really parenthetical,—suggesting that the excited mind (thymós) may have been also deranged; hence the verb can agree with thymós, on which the chief stress falls”.
Note that the validity of clausal interpretation was contested with serious arguments by Munn (1999) even for Moroccan and Lebanese Arabic. See Aoun et al. (1999) for a reply to these remarks.
For the Schema Alcmanicum, for which this sentence is especially known, see sec. 2.3.
This line comes from a section of the play that is rightly regarded as interpolated by the majority of critics. However, since this passage appears to be “competent fifth-century tragic idiom” (Dawe 1967: 17), it still counts as evidence for our linguistic study.
As a matter of fact, it might be observed that a clausal interpretation (along the lines of, e.g., joy at the same time took her and pain [at the same time took her]) is possible. For exx. 13 and 14, this reading is in our opinion ruled out by the word order.
Note that the two cases with single right-dislocated subjects, a strained poetical construction found only in the Homeric poems, were disregarded in Table 7 and in the following discussion.
The chi-square test of independence yields a χ2 = 20.40, df = 1, p < 0.001; the φ measure of effect size = 0.235.
Note also, in the ex. 15, the use of the adverb homoû, “at the same time”. Again, this sentence offers further evidence against a clausal interpretation of PA: see above, sec. 2.2.
Cf. in our sample: Od. 5.295–296, 10.513–514 (ex. 16 below) and 14.216–217.
On the schema Alcmanicum see especially pseudo-Herodian, Fig. 54 (Hajdú 1998: 132, with a list of other ancient sources); see also Calame (1983: 308) on his fr. 2 of Alcman.
To the example they quote we may add West (2011: 9).
Note that the oath formula quoted in ex. 18 is sometimes realized without enjambment, with all the subjects, or at least two of them, placed on a single line (cf. Od. 5.184, 14.158). The metrical caesura between the two subjects may also play a role that should be the object of further investigation. In any case, a verse like Il. 19.258 (cf. also Od. 19.303–304) suggests that, even when the conjoined subjects do not stretch over two different verses, the verb and the first subject form one unit.
2081 lemmas of Sample B were found in PROIEL’s lexicon; 554 were annotated by us. Note that we have eliminated the category “oanim”, generally used for verbs, pronouns and adjectives. Substantivized participles and adjectives were annotated as “human” or “nonconcrete” according to their use, whether they were referred to persons or abstract concepts (as in tò agathón, “the good”). We wish to thank Dag Haug for sharing the PROIEL lexicon with us.
In some cases the coordinated subjects have different animacy values: see, e.g., Soph. Ant. 599–602, where the subjects that bring ruin to the royal family of Thebes are “dust” (concrete), “folly in speech” (nonconcrete) and the “Erinyes of the mind” (human). These conjoined phrases were marked with the label “mixed”.
Tab. 8 reports 146 cases of singular agreement vs. 180 of non-singular agreement for animate subjects (labelled as “human”: see discussion above); inanimate subjects (i.e. all the other labels with the exclusion of the “mixed” values) account for 240 cases of singular vs. 31 of non-singular agreement. The chi-square test of independence yields a χ2 = 124.0955, df = 1, p < 0.001; the φ measure of effect size = 0.456.
There may be something to say about this distinction. Rivers are the object of a personal cult and are often personified in Greek religion (Burkert 1985: 174–175). In the same poem, we don’t have to go further than a few books from Il. 5.774 to see one of the rivers quoted in the passage, the Scamander, speak and fight in anger against Achilles (Il. 20.211–327).