Ancient ‘Solmisation’ and the Meaning of Notes

In: Greek and Roman Musical Studies
Stefan Hagel Austrian Academy of Sciences Vienna Austria
University of Vienna Vienna Austria

Open Access


When contextualising the ancient Greek solmisation system, known from Aristides Quintilianus and one of Bellermann’s Anonymi, within its musical and linguistic environment, it emerges that it hardly predates the Roman Imperial period, an important part of whose musical schooling it appears to have formed. The system seems based on a combination of the various vowels’ intrinsic F2 pitch and intensity and reflects the harmonic hierarchies of contemporary music, shedding a much more favourable light on the music-psychological relevance of Aristides’ gendered musical notes than is conventionally assumed.



1 Introduction

In a previous issue of this journal, I have discussed the so-called ancient ‘solmisation’ in the context of its employment in what I have argued formed a musical schoolbook, transmitted as part of a collection of texts known as Bellermann’s Anonymi.1 There I have also made some remarks concerning the only other witness of that practice, Aristides Quintilianus. However, the topic of that contribution did not allow me to go into further detail regarding the system as such; here I will therefore develop some thoughts along more general lines.

Both Aristides and the Anonymi associate pitches with vowels of the Greek alphabet. Strung together as the nuclei of a sequence of syllables, these were obviously used to indicate melodies. Phrasing was expressed by different onset consonants, or, within musical microphrases, the immediate succession of vowels. These are detailed in the fourth anonymous section in the compilation first edited by Friedrich Bellermann, where they are invariably combined with the onset consonant ‘t-’ to what we may regard as the standard form of ‘quoting’ a note (BA4, §77), mentioned also by Aristides. The same system is used (but not described) in the closely related first and fifth sections. The latter expresses certain melodic progressions in terms of consecutive vowels or intervening consonant clusters in the shape of ‘-nn-’ or ‘-nt-’ (BA5, §86; §91–93); the former retains only the most complex combinations (BA1, §9–10). Here we will focus exclusively on the vowels.

But let us start by putting the idea of ‘solmisation’ in context. The eponymous sol-fa familiar to Western readers was famously devised by Guido of Arezzo. In its original form it covered a sixth (ut – la) and was later extended to include the seventh step in the octave (si). When employed beyond an octave, the syllables would repeat, in line with the conception of notes an octave apart being functionally identical; the description of modulation by shifting the ‘hexachord’ to a different position need not concern us here. In any case, the syllables are not conceived as imitating the sound of the notes in any way; being derived from the (half-)line starts of a pre-existing poetic text, they are entirely random as far as their phonetic shape is concerned. The same is true, for instance, for the traditional Indic note syllables.

In contrast, various traditions are documented on several continents that express certain qualities of notes (or even pairs of notes) directly by phonetic means.2 Being based on the immediate experience of musicians, these tend not to form strictly canonised systems. While clear tendencies can be observed, there is normally no stringent association between individual phonemes and specific musical characteristics such as pitch, duration or metric value. Therefore, writing down the linguistic expression of a melody falls short of resulting in an unequivocal notation from which melody, rhythm, or both might be retrieved. Such systems have their typical place in the oral transmission of musical, most typically instrumental, skills.

Various phonetic parameters have been exploited in such contexts, with considerable constancy across historically unrelated traditions. Consonants may imitate plucking, striking and dampening of strings, as well as lip and tongue action in wind instruments.3 For vowels, two main factors have been observed, which may at least in part be mutually exclusive: intrinsic pitch and intrinsic intensity (the latter possibly coupled with intrinsic duration). While the concept of intrinsic intensity is quite straightforward – some vowels typically sound louder than others, other factors being equal4 – the idea of intrinsic pitch is less simple. On the one hand, in speech the basic frequency (F₀) of different vowels under similar circumstances tends to vary: sounds like /i/ and /u/, for instance, tend to be uttered at a higher pitch than /a/ or /o/.5 This fact may be exploited for indicating melodic or non-melodic pitch relations in speech, i.e. without at the same time actually realising the expressed pitch contours as such. In this respect, some vowels actually are higher than others, and these tiny differences may substitute for the much larger intervals of an actual melody.

On the other hand, vowels are characterised not only by their relative basic frequency, but also by a typical configuration of harmonics, effected by shaping the cavities of mouth and throat so that they resonate with different frequencies. As a result, different vowels feel in some way differently pitched, even if they are actually uttered with an identical basic frequency. Here again, /i/ is typically highest, but /u/ sits rather low in the spectrum, while /a/ normally occupies the centre. David W. Hughes has argued convincingly that it is the so-called second formant (F2) that determines this perceived intrinsic pitch, and consequently dubbed systems where it is used for communicating musical pitch ‘F2 systems’. In different cultures, the differences between vowels may respectively indicate direction in a melodic movement, pitch regions, a combination of these, or even fixed scale degrees. Just as in the previously discussed version of intrinsic pitch, they also may be used in utterances that do not actually convey the pitches of the melody themselves. On the other hand, in a tradition where the notes of the melody are actually sung together with the syllables that designate them, only the F2 approach makes any sense, since the realisation of the correct musical pitch necessarily overrides the automatic relative adjustments of basic frequency which would establish those ‘actual’ pitch differences between vowels discussed above.

As these two types of ‘intrinsic pitch’ constitute very diverse phenomena, it is confusing to designate them by a common name. I will here reserve the term ‘intrinsic pitch’ for the former, which involves an actual pitch difference in the normal sense of the word, while I will refer to the second one, which creates the perception of pitch differences even for identical basic frequencies, as ‘intrinsic F2 pitch’.

Where does ancient Greek ‘solmisation’ fit in that overall picture? Its vowels (in Anon. Bell. these generally appear within syllables starting with ‘t-’) are set out in Figure 1. The two sources that transmit it disagree only about the realisation of the lowest note – a small but lucky disagreement, because it adds a further blow to the idea that the Bellermann source might depend on Aristides, instead of representing an independent witness to a widespread musical practice.

Figure 1

The ancient ‘solmisation’ vowels as described in Aristid. Quint. 2.14 and Anon. Bell. §77

Citation: Greek and Roman Musical Studies 10, 1 (2022) ; 10.1163/22129758-bja10042

The vowels, as is generally accepted, must have been selected for their own sake, i.e. they do not, for instance, abbreviate note names, nor can they in any other way be related to the well-known naming and notation conventions of ancient music. It has been suggested that the system implements intrinsic pitch, originally assisting the production of enharmonic microtones within the tetrachord.6 This hypothesis conflicts with the evidence on several points. Firstly, the system is attested only more than half a millennium after the glory days of the enharmonic; secondly, we find it associated exclusively with the diatonic genus that indeed dominated Roman-period music; thirdly, once its sounds had established hard-wired associations with pitches or pitch relations in musicians’ minds, using the same sounds for different relations would contradict the very principle of the system. Moreover, the enharmonic hypothesis is not borne out by the specific attested vowels, because the sequence of ‘a-ē-ō’ cannot possibly be understood as an ascending triad.7 Equally importantly, the employment of intrinsic pitch would imply that the system was typically used to convey musical information in spoken, not in sung form. But this is hardly conceivable. Since each vowel except ‘e’ recurs five or six times across the double octave of the Unmodulating Perfect System, any spoken sequence would have needed additional means of expression whenever describing a melody that exceeded three or four notes – and even these would have needed to be agreed beforehand. This might just be possible if the system originated with an instrument that was, in some way, recognisably organised along the tetrachords of theory – but this is not true for any of the auloi or lyres we know, and highly improbable for harps.8 Within the transmitted system, the only segments with intrinsic pitch order are the sequences of ‘ō-a-ē’, which span a minor third in the diatonic and cut across tetrachord boundaries. The addition of any adjacent note would already create a duplicate vowel.
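The counts just mentioned can be checked with a short sketch. It assumes the standard layout of the Unmodulating Perfect System (the fifteen notes of the double octave plus the three further notes of the conjunct tetrachord) and takes the vowel assignments from Anon. Bell. §77 and from the discussion in this paper; the ASCII note labels, the function names and the data layout are merely illustrative choices, not anything found in the sources.

```python
from collections import Counter

# Vowel per note function (Anon. Bell. §77; paranete = 'ō' as for likhanos).
# The ASCII labels are illustrative simplifications of the Greek note names.
VOWEL = {"hypate": "a", "parypate": "ē", "likhanos": "ō", "mese": "e",
         "paramese": "a", "trite": "ē", "paranete": "ō", "nete": "a"}

# Assumed layout: the double octave in ascending order ...
DOUBLE_OCTAVE = [
    "proslambanomenos",
    "hypate", "parypate", "likhanos",   # tetrachord hypaton
    "hypate", "parypate", "likhanos",   # tetrachord meson
    "mese", "paramese",
    "trite", "paranete", "nete",        # tetrachord diezeugmenon
    "trite", "paranete", "nete",        # tetrachord hyperbolaion
]
# ... plus the three extra notes of the conjunct tetrachord (synemmenon).
CONJUNCT = ["trite", "paranete", "nete"]


def vowel_counts(lowest_vowel):
    """Count vowels over the whole system; the two sources differ only
    in the vowel of the lowest note ('ō' in Anon. Bell., 'e' in Aristides)."""
    vowels = dict(VOWEL, proslambanomenos=lowest_vowel)
    return Counter(vowels[note] for note in DOUBLE_OCTAVE + CONJUNCT)


def triad_starts(lowest_vowel):
    """Indices in the double octave where three successive notes read ō-a-ē."""
    vowels = dict(VOWEL, proslambanomenos=lowest_vowel)
    seq = [vowels[note] for note in DOUBLE_OCTAVE]
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == ["ō", "a", "ē"]]


print(vowel_counts("ō"))   # Anon. Bell. version
print(vowel_counts("e"))   # Aristides' version
print(triad_starts("ō"))
```

Under these assumptions every vowel other than ‘e’ occurs five or six times, and each ‘ō-a-ē’ segment is flanked by ‘ē’ below or ‘ō’ above, so that a fourth note would indeed repeat a vowel.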

Indeed the few preserved applications of the system in the Anonymus presume that it was used within larger ranges and also for intervallic jumps from one domain into another. On one occasion, a series of two-note figures is extended across the entire lower octave: τωα ταη τηω τωα ταη τηω τωε = A-B B-c c-d d-e e-f f-g g-a (§86). In another instance, the interval of a fourth is exemplified by τωω = d-g. All this makes no sense as spoken language: the preserved examples are obviously meant to be sung (even if they were applied to an instrument afterwards).

If intrinsic pitch was involved at all, we would therefore have to consider F2 pitch. Its application would however appear peculiar. The repetition of vowels throughout the double octave shows that they clearly do not convey the idea of absolute pitch range. The pitches would therefore have to be relative; relative, however, not in the sense of implying direction in the unfolding of a melody, which is incompatible with their association with specific scale degrees. Consequently they might indicate pitches relative to each other only within a small range. As we have seen, this range would typically be encapsulated within the sequence ‘ō-a-ē’, the only segments in which three successive pitches are expressed by vowels in F2 order. Such a melodic microenvironment, as noted above, would not conform with the tetrachord as a unit of theoretical analysis, since tetrachords are bounded by the ‘fixed’ notes which stand at the centre of our F2 triad, being associated with the intermediately-pitched ‘a’. In any case, we would not expect an F2-based system to be devised for theoretical analysis. In a practical context, on the other hand, the observed grouping might make very good sense. Assuming that the ‘fixed’ notes were of harmonic importance, the ‘movable’ notes would most usefully be perceived in relation to the closest ‘fixed’ note. In this way, for instance, ‘ō’ would serve as the lower leading note to hypátē ‘a’ as a typical final, while ‘ē’ would provide the upper leading note.9 While in such constricted contexts the vowels would correctly imitate pitch relations, the extension of such a system to the double octave presupposes a focus on the framework of ‘fixed’ notes, with a special sensitivity to mésē (and, in Aristides Quintilianus’ flavour, proslambanómenos, its counterpart an octave below).

2 Literary Testimonies

So far we have gained a plausible interpretation from analogies with other musical cultures; but do the ideas we have derived in this way resonate with the perception of somebody actually trained in ancient music? Let us examine the arguments of Aristides Quintilianus himself, both a native speaker of Greek and a member of the elite who was probably familiarised with the musical association of the vowels already in elementary school.10 Indeed Aristides’ gendered interpretation of musical matters shares the common ancient association of low pitch with male and high with female characteristics – a view obviously derived from human voices and upheld in spite of intrinsic contradictions with its physical basis since Aristotle’s times. (Ancient physics associates high pitch with speed, vigour and tautness, which ‘ought’ to be male qualities.) Aristides employs this stereotype when classifying the various scales, resorting to salvaging low-pitched masculinity by focussing on air quantity instead of velocity, in the old Peripatetic way.

Ἔτι τῶν συστημάτων τὰ μὲν βαρύτερα τῷ τε ἄρρενι κατὰ φύσιν καὶ ἤθει κατὰ τὴν παίδευσιν πρόσφορα, τῇ πολλῇ καὶ σφοδρᾷ κάτωθεν ἀναγωγῇ τοῦ πνεύματος τραχυνόμενα καὶ πλείονος ἀέρος πληγῇ διὰ τὴν τῶν πόρων εὐρύτητα τό τε γοργὸν δηλοῦντα καὶ ἐμβριθές, τὰ δ᾿ ὀξέα τῷ θήλει, τῇ τοῦ περὶ τὰ χείλη καὶ ἐπιπολῆς ἀέρος πληγῇ διὰ λεπτότητα γοερά τε ὄντα καὶ ἐκβοητικά.

Aristid. Quint. 2.14, 81.7–13 Winnington-Ingram

Furthermore, the lower scales are suitable for the education of what is male in nature or character, because they acquire roughness by the plentiful and vehement upwards movement of the breath from below, and because they manifest vigour and dignity by having a greater amount of air striking through wide ducts, but the high ones for what is female, because they are plaintive and screamlike due to the tenuity caused by the air striking around the lips and superficially.

However, when explaining the qualities of the individual vowels, Aristides by no means refers to them by pitch or remarks that some of them sound intrinsically higher or lower than others. Instead, his focus is solely on sound quality, which he (rightly) explains with the various associated shapes of the mouth cavity. In Aristides’ vision, the character of the various sounds can ultimately be understood directly from this geometry, male and female becoming associated with the two dimensions that lie perpendicular to the breath’s ultimate flowing direction. Although this is not expressed, horizontality and verticality thus reflect the old gendered dichotomy of earth and sky at least unconsciously – note also that the Greek term for the palate is οὐρανός:11

Τῆς δὲ μελῳδίας ἔν τε ταῖς ᾠδαῖς κἀν τοῖς κώλοις ἐκ τῆς ὁμοιότητος τῆς πρὸς τοὺς ὀργανικοὺς ἤχους λαμβανομένης τὰ τῶν στοιχείων ἁρμόττοντα πρὸς τὴν τῶν μελῶν ἐκφώνησιν ἐπελεξάμεθα. ἑπτὰ γοῦν τῶν φωνηέντων ὄντων ἔν τε τοῖς μακροῖς καὶ τοῖς βραχέσι τὰς προειρημένας διαφορὰς ὁρῶμεν. καθόλου γὰρ τὰ μὲν ἐς μῆκος αἴροντα τὸ στόμα σεμνοτέρους τε τοὺς ἤχους καὶ ἀρρενοπρεπεῖς, τὰ δ᾿ ἐς πλάτος διαιροῦντα καὶ τὰς ἐκφωνήσεις ἥττους τε καὶ θηλυτέρας ἔχει. πάλιν δὲ εἰδικῶς ἐν μὲν τοῖς μακροῖς ἄρρην μὲν ὁ τοῦ ω φθόγγος, στρογγύλος τε ὢν καὶ συνεστραμμένος, θῆλυς δὲ ὁ τοῦ η· διαχεῖται γάρ πως ἐν αὐτῷ τὸ πνεῦμα καὶ διηθεῖται. ἐν μέντοι τοῖς βραχέσι τὸ μὲν ο τὸν ἄρρενα δηλοῖ, τό τε φωνητικὸν συνίλλον ὄργανον καὶ τὸν φθόγγον πρὶν ἐκφωνηθῆναι συναρπάζον, τεθήλυνται δὲ τὸ ε, κεχηνέναι πως ἀναγκάζον κατὰ τὴν ἀπαγγελίαν. τῶν δὲ διχρόνων ἐς μελῳδίαν τὸ α κράτιστον· εὐφυὲς γὰρ διὰ πλάτος τῆς ἠχήσεως ἐς μακρότητα· τὰ δὲ λοιπὰ διὰ λεπτότητα οὐχ οὕτως ἔχει.

Aristid. Quint. 2.13, 77.30–78.16 Winnington-Ingram

Since melody, both in songs and instrumental lines, is understood from its similarity to instrumental12 sound, we have selected those of the letters for the pronunciation (ekphṓnēsis) of melodies which are fitting for it. Of the seven vowels which there are, after all, we can discern their above-mentioned differences both in the long and in the short ones: generally, those that lift the mouth longitudinally have more dignified and manly sounds, those that extend it laterally, also lesser and more female pronunciations. Specifically, among the long ones the sound of the ‘ō’ is masculine, being spherical and compact. In contrast, that of the ‘ē’ is feminine, as the breath somehow disperses in it and percolates. Among the short ones, on the other hand, the ‘ŏ’ indicates the masculine, compressing the vocal system and clutching the sound together before it is pronounced. In contrast, the ‘ĕ’ is made female, somehow causing the vocal system to gape during its utterance. Of those that permit both measurements, ‘a’ is optimal for melodic use, being well fitted for length, thanks to the breadth of its emission, while this is not true for the rest [‘i’ and ‘y’], due to their tenuity.

In the context of distinguishing neutral ‘a’ and feminine ‘ē’, Aristides introduces a novel line of reasoning, associating the sound of Greek dialects with the character of the respective tribes, stereotypically contrasting the stern Dorians with the refined Ionians – both rife with musical associations thanks not least to Plato’s Republic.

ἔστι δέ τινα κἀν τούτοις ἰδεῖν μεσότητα· τὸ μὲν γὰρ α κοινωνίαν τε ἔχον καὶ ἀντιπάθειαν πρὸς τὸ η, ᾗ μὲν ἐς ἀντίστροφον χρείαν ἐκείνου παραλαμβάνεται πέφυκεν ἄρρεν, ᾗ δὲ τὴν ὁμοίαν ποιεῖται σημασίαν τεθήλυνται. δηλοῦσι δὲ τοῦτο καὶ αἱ τῶν διαλέκτων ἀλλήλαις ἀντιπεπονθυῖαι τῇ τῶν ἐθνῶν ἀναλόγως ἐναντιοτροπίᾳ, ἡ Δωρίς τε καὶ ᾿Ιάς· ἡ μὲν γὰρ Δωρὶς τὴν θηλύτητα φεύγουσα τοῦ η τρέπειν αὐτοῦ τὴν χρῆσιν ὡς ἐς ἄρρεν τὸ α νενόμικεν, ἡ δὲ ᾿Ιὰς τὸ στερεὸν ὑποστελλομένη τοῦ α καταφέρεται πρὸς τὸ η.

Aristid. Quint. 2.13, 78.16–25 Winnington-Ingram

In these matters, as well, one can discern the intermediate. The ‘a’ has both community with and opposition to the ‘ē’: where it assumes the opposite usage of the latter, it is male, but where it creates the same meaning, female. This also becomes clear from those of the dialects that are in opposition to each other due to the analogously contrary characters of the tribes, namely the Dorian and the Ionian [dialect]. The Dorian, eschewing the femininity of the ‘ē’, has the custom of switching its usage to ‘a’, as this is the direction towards the masculine, but the Ionian, shrinking from the hard nature of the ‘a’, tends towards the ‘ē’.

This argument, it will be noted, appears to draw on traditional stereotypes more than on personal experience. After all, Aristides’ and his contemporaries’ impression of Ionian would have been shaped first of all by the Homeric epics, which would hardly serve as typical examples of Ionian indulgence. In contrast, Aristides’ final argument, which starts from typical patterns of Greek inflection, quite plausibly illustrates the native speaker’s deeply ingrained associations:13

τὸ δὲ ε θῆλυ μέν ἐστι κατὰ τὸ πλεῖστον, ὡς προείρηται, τῷ δὲ τὸν ὅμοιον ἦχον ἐπιφαίνειν, εἰ ἐκταθείη, τῇ αι διφθόγγῳ γραφομένῃ διὰ τοῦ α ἐπ᾿ ἐλάχιστον ἠρρένωται. ἀλλὰ καὶ τῶν ἄρθρων καὶ τῶν καταλήξεων τὰ καθ᾿ ἁπάσας τὰς πτώσεις εἴ γε ἐξετάζειν ἐθέλοις, σαφῶς εὑρήσεις ὡς τῶν μὲν ἀρρενικῶν ὀνομάτων ἀρρενικὰ στοιχεῖα καθηγεῖται καὶ ἐπὶ τελευτῆς τίθεται, τῶν δὲ θηλυκῶν τὰ ὅμοια καὶ ὅμοιοι φθόγγοι, τῶν δ᾿ οὐδετέρων τὰ μεταξύ.

Aristid. Quint. 2.13, 78.25–79.2 Winnington-Ingram

The ‘ĕ’, as said before, is mostly feminine, but insofar as it denotes, when being extended, the same sound as the diphthong ‘ai’, which is written with an ‘a’, it becomes ever so slightly more masculine. Similarly, if you wanted to examine the articles and endings in all cases, you would clearly find that masculine nouns are preceded as well as terminated by masculine letters, and feminine ones similarly by respective sounds, but neuters by intermediate ones.

The fact that Aristides adduces such a plethora of concepts in support of what he first states as immediate personal perceptions of vowel qualities makes it all the more conspicuous that the idea of intrinsic pitch does not turn up in any way. That it was only secondary is also suggested by the absence of /i/ (ι) and /ü/ (υ) from the solmisation system. Had intrinsic pitch been its foremost motivation, one would hardly have excluded precisely the vowels with strongest associations of pitch. Aristides’ denunciation of the two (to which he only refers as “the rest”) as tenuous already hints at what he regards as the foremost criterion for the selection of solmisation vowels. When he finally justifies the selection of four specific vowels out of seven, however, he refers to yet another criterion:

Τέτταρα μὲν οὖν τῶν φωνηέντων τὰ εὐφυῆ πρὸς ἔκτασιν διὰ τῆς μελῳδικῆς φωνῆς [διαστήματα] πρὸς τοὺς φθόγγους ἐχρησίμευσεν

Aristid. Quint. 2.14, 79.3–5 Winnington-Ingram

Four of the vowels, therefore, those which are well shaped for being prolonged by the musical voice, have been found useful for being applied to the notes.

This explanation appears to account well for the invariantly long vowels ‘ē’ and ‘ō’, and also for ‘a’ which is often long. Aristides has also prepared the idea of extending the intrinsically short ‘ĕ’ by pointing to the current pronunciation of the previous diphthong ‘ai’, which he regards as a long version of ‘ĕ’.14 On the other hand, Greek also has long ‘i’, ‘u’, and ‘y’. The concept of prolongation, therefore, would merely exclude ‘ŏ’ – quite apart from raising the question of how intrinsically long ‘ē’ and ‘ō’ might be used with rhythmically short notes. Aristides and his original readers, being familiar with the practical application, of course did not need an explanation; we have to content ourselves with the conclusion that the formants of ‘ō’ and ‘ē’ were distinct enough as not to be confused with other phonemes even when used, contrary to their phonetic identity, as short vowels. Nonetheless it seems clear enough that Aristides’ explanation by extensibility is not very much more than a rhetorical trick.15 It can only stand if the concept of prolongation is implicitly complemented by that of a full sound, which he had introduced before by dismissing the feeble “remaining” vowels. The prolongation argument can only appear to work because these have not even been named and are thus excluded from the reader’s horizon, from which only short ‘ŏ’ now remains to be expelled. On balance, it appears that the most important concept in Aristides’ mind is neither pitch nor extensibility but a sort of intensity.

Before pursuing these questions further, it may be worthwhile to put Aristides’ evaluation of vowel sound in its historical context.16 Clearly, his remarks about written ‘ai’ being pronounced as long ‘ĕ’ would not have held true at earlier stages of Greek pronunciation, especially as we may believe that in elite circles monophthongisation appeared comparatively late. But the subtler shifts within the variants of ‘e’ also deserve attention. As a start, we may compare the stance of Dionysius of Halicarnassus, writing perhaps a couple of centuries earlier, who assesses vowel qualities in the context of spoken language. He also commends long vowels and establishes a hierarchy, albeit without Aristides’ gender categories.

αὐτῶν δὲ τῶν μακρῶν πάλιν εὐφωνότατον μὲν τὸ α, ὅταν ἐκτείνηται· λέγεται γὰρ ἀνοιγομένου τε τοῦ στόματος ἐπὶ πλεῖστον καὶ τοῦ πνεύματος ἄνω φερομένου πρὸς τὸν οὐρανόν. δεύτερον δὲ τὸ η, διότι κάτω τε περὶ τὴν βάσιν τῆς γλώττης ἐρείδει τὸν ἦχον ἀλλ᾿ οὐκ ἄνω, καὶ μετρίως ἀνοιγομένου τοῦ στόματος. τρίτον δὲ τὸ ω· στρογγυλίζεται γὰρ ἐν αὐτῷ τὸ στόμα καὶ περιστέλλεται τὰ χείλη τήν τε πληγὴν τὸ πνεῦμα περὶ τὸ ἀκροστόμιον ποιεῖται. ἔτι δ᾿ ἧττον τούτου τὸ υ· περὶ γὰρ αὐτὰ τὰ χείλη συστολῆς γινομένης ἀξιολόγου πνίγεται καὶ στενὸς ἐκπίπτει ὁ ἦχος. ἔσχατον δὲ πάντων τὸ ι· περὶ τοὺς ὀδόντας τε γὰρ ἡ κροῦσις τοῦ πνεύματος γίνεται μικρὸν ἀνοιγομένου τοῦ στόματος καὶ οὐκ ἐπιλαμπρυνόντων τῶν χειλῶν τὸν ἦχον. τῶν δὲ βραχέων οὐδέτερον μὲν εὔμορφον, ἧττον δὲ δυσειδὲς τοῦ ε τὸ ο· διίστησι γὰρ τὸ στόμα κρεῖττον θατέρου καὶ τὴν πληγὴν λαμβάνει περὶ τὴν ἀρτηρίαν μᾶλλον.

Dion. Hal. Comp. verb. 14

Of the long vowels, again, ‘a’ is the best sounding, when it is prolonged. The reason is that it is pronounced with the mouth most widely open and the breath moving upwards towards the palate. Second is ‘ē’, because it presses the sound (êkhos) downwards around the root of the tongue, not upwards, while the mouth is opened only moderately. Third is ‘ō’, because the mouth becomes spherical when pronouncing it, draws the lips together and the breath [generates the sound by] striking around the edge of the mouth. Even weaker is ‘y’: due to a considerable contraction just at the lips, its sound is stifled and emerges thin. The last of all is ‘i’, because the breath beats against the teeth, with the mouth opened only a little, while the lips contribute nothing to brightening the sound. On the other hand, none of the short vowels is shapely, but the ‘ŏ’ is less ugly than the ‘ĕ’, because it expands the mouth more than it and the generation of its sound takes place more around the windpipe.

In spite of some similarities – note the association of ‘ō’ with roundness and the ultimate position of ‘y’ and ‘i’, which owe their weakness to a restricted stream of air17 – there are also noticeable differences. Most conspicuously, Dionysius values long ‘ē’ above ‘ō’, which runs contrary to Aristides’ evaluation of the former as involving a process of dissipation in contrast to the compactness of ‘ō’. Dionysius’ assessment of this pair is remarkable also because it inverts that of the respective short vowels, where he places ‘ĕ’ clearly behind ‘ŏ’. No less noteworthy is the fact that the phoneme ‘ĕ’, which Dionysius holds in contempt, was selected for representing such an important note as mésē by the creators of the solmisation system.

At least some of these discrepancies may be explained diachronically: from classical Attic onward (and before), especially the long variants of ‘e’ had been subject to ongoing development. The old open variant, which integrated inherited ē with ǣ resulting from ā, gradually closed up until becoming, long after Aristides’ times, the [i] of modern Greek pronunciation.18 Dionysius appears to reflect a phase in this process when the opening of the mouth was significantly reduced in comparison with /a/, but still large enough to grant it the second place. In Aristides’ environment this may no longer have been the case, and the development towards [i] may have proceeded to a stage in which the causes of the old reservations against the latter were perceived to affect it as well.

3 Evidence for Early Solmisation?

All this might not be worth more than a footnote, were it not for its implications for dating the solmisation system. As noted above, researchers have tried to project it back to the Classical period. For the interpretation of an inscribed vase as a musical document, Annie Bélis even needed it to be in place (and used by otherwise often hardly literate painters) in the early fifth century BC.19 Her interpretation of the letter sequences ΤΟΤΟΤΕ and ΤΟΤΗ, which doubtless illustrate a trumpet call, as solmisation for notes of the harmonic series accessible to a natural trumpet translates these to three scale degrees a fourth and a major third apart (obviously assuming that the letter O stands in for the required ω even though the vase does differentiate between Ε and Η in the manner of the Ionian script):



However, I cannot see how such an interpretation can be reconciled with the transmitted solmisation system. While it is true that in the system, as we have it – i.e. in diatonic form – we always find a fourth between an ‘ĕ’ and the ‘ō’ in the disjunct tetrachord above it (e.g., A–d), no possible interpretation would establish the major third between ‘ō’ and ‘ē’ that Bélis transcribes. The interval between a diatonic likhanós/paranḗtē (‘ō’) that stands a fourth above mésē/proslambanómenos and the diatonic parypátē/trítē above it is always a minor third (d–f). In the chromatic genus, it might indeed form a major third (c♯–f), but at the inevitable expense of destroying the required fourth below, between ‘ĕ’ and ‘ō’ (which would now become a major third as well, A–c♯). The enharmonic, finally, would provide neither: here the upper interval would become a neutral third (c–e), and the lower only a minor third (A–c). As a result, there is no possible rendition of the letters in question, within the framework of the attested systems of ancient music, that would fit within the harmonic series of a trumpet in the playable range – the sequence of fourth plus minor third only appears between the 15th, 20th and 24th harmonic, and thus in an entirely unrealistic range, as Bélis herself emphasises.20 The vase, therefore, is certainly not a witness to an early existence of the attested solmisation system. Bélis may still be right in identifying a series of three adjacent partials, but the vowels designating these would probably have to be interpreted in order of intrinsic pitch, making ΤΟ the lowest, ΤΗ the middle and ΤΕ the highest note.21 This would salvage Bélis’ ending of the fanfare on the highest note when we restore the much more plausible reading of the line order as ΤΟΤΗ ΤΟΤΟΤΕ instead of ΤΟΤΟΤΕ ΤΟΤΗ. Possible renditions, therefore, might include22



or more probably, using harmonics 2, 3 and 4, and therefore ending on the tonic (i.e., on an octave counterpart of the ideal fundamental):23


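The arithmetic behind the harmonic-series argument of this section can be illustrated by a brief sketch. It assumes pure integer ratios for the intervals between partials (4:3 for the fourth, 5:4 for the major third, 6:5 for the minor third) and an arbitrary search limit of 32 partials; the function name and the limit are hypothetical conveniences, not part of Bélis’ or the present argument.

```python
from fractions import Fraction

# Pure interval ratios as they occur between partials of a natural trumpet.
FOURTH = Fraction(4, 3)
MAJOR_THIRD = Fraction(5, 4)
MINOR_THIRD = Fraction(6, 5)


def harmonic_triples(lower, upper, limit=32):
    """Return all partial numbers (p, q, r) with q/p == lower and
    r/q == upper, where r does not exceed the (assumed) search limit."""
    found = []
    for p in range(1, limit + 1):
        q = p * lower          # partial a `lower` interval above p
        r = q * upper          # partial an `upper` interval above q
        if q.denominator == 1 and r.denominator == 1 and r <= limit:
            found.append((p, int(q), int(r)))
    return found


# Fourth plus major third: available low in the series (partials 3, 4, 5).
print(harmonic_triples(FOURTH, MAJOR_THIRD))
# Fourth plus minor third: first available at partials 15, 20, 24.
print(harmonic_triples(FOURTH, MINOR_THIRD))
```

Within the assumed limit, the second search yields only the partials 15, 20 and 24, whereas the combination of a fourth and a major third is found already at partials 3, 4 and 5.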

4 Diachronic Phonology

While our examination of the vase inscription has undermined suspected evidence for the longevity of ancient solmisation, it has also increased our awareness of problems imposed by a diachronic view. In particular the sounds of the various forms of ‘e’ underwent substantial change, not only in comparison with the Classical period, but apparently also in the comparatively small time span between Dionysius and Aristides. Facing the linguistic development, we must wonder whether an ancient solmisation system would even have had the chance to remain functional throughout. One might object that the vowels used by music teachers when conveying melodies in a singing voice would not need to match those of ordinary spoken or sung language, and that a much older system might therefore have been preserved basically intact, only minimally influenced by the sound change of language. But Aristides was in no noticeable doubt that the solmisation letters were identical to those of everyday speech, and we can hardly assume that the Greek of his time had by chance just developed to a point of faithfully reflecting the sounds of a system conceived by an earlier generation and transmitted orally up to a point when the pronunciation of some letters had luckily become ready for noting it down.

Most strikingly, we find ‘ĕ’ assigned to the note of mésē, which the ancient accounts regard not only as central in position, but also as ‘leading’ (ἡγεμών) and of utmost melodic importance.24 As we have noted, musical usage must always have required a long variant of this vowel as well. In the Classical period, this would have resulted in the phoneme that was almost universally written as ει in standard Attic orthography after the adoption of the Milesian alphabet, and whose pronunciation had in the Hellenistic period shifted to long [i:] (at least when it did not follow a vowel). Aristides, however, distinguishes musically acceptable ε from all too weak ι; his equation of ‘long ε’ with the sound of αι proves that it was some form of rather open /e/. The linguistic development therefore appears to rule out an early origin of the attested solmisation system, at least in the Attic and Ionian environment. But Attic and Ionian formed the basis of koinē Greek, while no other dialect may easily be conceived to have contributed to a Roman-period schooling practice so widespread that its vestiges survived into medieval codices. Also, the locations which preserve explicit epigraphic references to musical schooling in the Hellenistic period, Teos and Magnesia, are Ionian cities writing Attic.25 Moreover, the other major dialects provide no feasible alternatives: without the Ionian separation of Ε and Η, one of the phonemic (or, at the very least, graphemic) distinctions on which the solmisation system is based breaks down. On balance, the terminus post quem of its invention must be sought in a period where the η of standard Greek had become sufficiently closed to provide a contrast to lengthened ε, which was perceived as less closed. This will not take us further back in time than to the Hellenistic period; but Dionysius’ statements ought to caution us that the Roman Imperial era is after all more likely.

5 Intrinsic F₂ Pitch and Intrinsic Intensity

Starting from the interpretation of the repeated vowel triad of ‘ō-a-ē’ within a framework of F2 pitch, we have concluded above that such a pattern would appear to be centred on the ‘fixed notes’ of the ancient system, towards which its upper and lower neighbours would be oriented. Aristides’ interpretation corroborates such a view, in making the fixed notes ‘intermediate’ between lower male ‘ō’ and higher female ‘ē’. Although he does not spell this out, his tripartite gendering of vowels in combination with the traditional association of gender and pitch, which he acknowledges in a different context, isolates precisely the postulated triads. Assuming a certain amount of structural focus on the fixed notes, we may thus be able to understand Aristides’ perception of tonality much better and appreciate that, as much as his elaboration may seem artificial and far-fetched to us, it may have made excellent sense to a reader whose primary musical training had established a firm link between vowel sounds and scale degrees. After all, the language preserved in Bellermann’s Anonymi plainly claims that the vowels are intrinsic to the notes, and this may be the way Roman-period children were first exposed to the idea of notes as individual entities:26

τῶν δεκαπέντε τρόπων οἱ προσλαμβανόμενοι λέγουσι τω‹, αἱ ὑπάται τα‹, αἱ παρυπάται τη‹, οἱ διάτονοι τω‹, αἱ μέσαι τε‹, αἱ παράμεσοι τα‹, αἱ τρίται τη‹, αἱ νῆται τα‹. (§77)

Of the fifteen keys [trópoi], the proslambanómenoi say ‘tō’, the hypátai ‘ta’, the parypátai ‘tē’, the diátonoi ‘tō’, the mésai ‘te’, the parámesoi ‘ta’, the trítai ‘tē’, and the nêtai ‘ta’.
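The assignment given in §77 amounts to a small lookup table from functional note classes to syllables. As a minimal sketch, it can be written out as follows; the ASCII spellings of the note names and the helper `vowel_classes` are illustrative choices for this sketch and not part of the transmitted text.

```python
# Syllables per functional note class, as listed in Bellermann's Anonymus §77.
# The ASCII keys are simplified transliterations (an assumption of this sketch).
SOLMISATION = {
    "proslambanomenos": "tō",
    "hypate": "ta",
    "parypate": "tē",
    "diatonos": "tō",
    "mese": "te",
    "paramese": "ta",
    "trite": "tē",
    "nete": "ta",
}


def vowel_classes(table):
    """Group note classes by the vowel of their syllable (the 't' onset is dropped)."""
    groups = {}
    for note, syllable in table.items():
        groups.setdefault(syllable[1:], []).append(note)
    return groups


if __name__ == "__main__":
    for vowel, notes in vowel_classes(SOLMISATION).items():
        print(vowel, notes)
```

Grouping by vowel in this way makes the four classes discussed below directly visible: the fixed notes other than mésē share ‘a’, the lower movable notes share ‘ē’, and mésē alone carries ‘e’.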

Apart from a very probable element of F2 pitch, which readily inserted itself into Aristides’ gendered musical universe, we have seen that he appears to conceptualise the differences as some kind of intrinsic vowel intensity. His evaluation chimes in with other known ‘mixed’ systems and general linguistic data:

In several such systems […] this highly regular relation between vowels and melodic direction based on Intrinsic Pitch (IP) is often disrupted due to the competing vowel acoustics of Intrinsic Duration (ID) and Intrinsic Intensity (II). Phoneticians have found that in the vast majority of languages the vowels closest to [i] and [u], those spoken with the mouth relatively closed, will take less time to articulate and will also register a lower volume on a VU meter than will more open vowels; by contrast, the “longest” and “loudest” vowel is [a], followed by [o] and [e]. This is why [i] and [u] are often favoured for short notes or those in weak metric positions in oral mnemonic systems, while [a] tends toward the opposite.

Hughes 2000, 105f.

The designation of rhythmical features such as weak or short positions is of course incompatible with a system that establishes a fixed association between pitches and vowels, as long as the various notes may appear in all sorts of rhythmical functions. So it is no wonder that ancient Greek solmisation found no use for the weakest vowels. Here, therefore, intensity may only be associated with harmonic hierarchies, the functional significance of notes in given musical modes.

6 Harmonic Hierarchy

With the question of harmonic hierarchy, we are finally on terrain whose exploration can proceed beyond the limited textual accounts. The relative importance of notes may be gleaned first of all from the remains of ancient melodies, but sometimes also from sufficiently preserved instruments.

From the Hellenistic period we possess only very few melodies that lend themselves to an evaluation.27 These appear to adhere to a modality that is typically centred on an axis between a focal mésē (functional a) and a final hypátē (e). The harmonic primacy of the former, as we have already mentioned, is amply acknowledged in the literary record as well. However, the melodies from the Roman period suggest that it had become somewhat sidelined, overshadowed by another harmonic domain that rests on functional D and G instead, which may have been enforced by a particular restraint concerning the use of the formerly dominant mésē.28 The solmisation system fittingly expresses the primary axis of a G-D mode in terms of ‘ō’. The older domain of E-A would appear as ‘a’ and ‘e’, instead, with the strongest vowel indicating the final, while mésē is associated only with the weaker ‘e’. This would appear surprising within what we know about the earlier, ‘Hellenistic’ paradigm, but it accords with the data from the Roman period. As we have already said, together with the linguistic history of the involved vowels, this may be another hint that the solmisation system is a comparatively late invention.

At any rate, it must have been a kind of invention, must have been designed or at least codified at some point, even though this may conceivably have happened on the basis of less systematic precursors. An assessment of this creative process must also depend on the question of whether the ancients, and particularly the musicians of the Roman period, possessed the concept of modal hierarchies between notes, beyond the appreciation of mésē or the harmonically fundamental old triad of hypátē (e), mésē (a) and nḗtē (e’).29 Conspicuously, such a concept is not invoked in a particularly difficult chapter from the pseudo-Aristotelian Problems (19.3), which attempts to explain the fact that parypátē (the next note above hypátē) is especially troublesome for singers. The text is clearly corrupt and the nature of the proposed answers is far from clear, but it does not seem that the idea of a weak harmonic status of the note in question plays a role.

In contrast, a clear awareness that different pitches held different musical significance transpires from Vitruvius’ account of the design of stone theatres, written in the first century BC, where he famously enumerates a variety of pitched resonators that were embedded at particular positions in the construction of the cavea. Of the twenty-one musical notes he mentions, some receive only a single resonator, others up to six, while some are entirely neglected. This reflects a clear idea of tonal hierarchies; and indeed the numbers match the musical needs of the extant melodies.30 Notably, the Problems’ problematic parypátē is not worth a single resonator.

To those who have studied the technicalities of ancient Greek music, the weak harmonic status of parypátē along with the other notes associated with solmisation by ‘ē’ will appear natural. On the other hand, anyone whose starting point is Western music may be perplexed: after all, these comprise the notes that are equivalent to the C of our natural scale, which has come to hold some sort of primacy in Western musical thinking.31 In ancient theory, in contrast, our C and F were classified as the lower ‘moving’ notes within the tetrachord, susceptible to variation in relative pitch in respect to the ‘fixed’ (A, B, E) notes and their upper ‘moving’ neighbours (D, G). In practice, this meant, for instance, that they would often not partake in the tuning framework of fifths and fourths that held the others together, normally including the upper diatonic ‘moving’ note. This already holds true in the earliest detailed account of possible tetrachord tunings by Aristoxenus in the fourth century BC, and especially pervades Roman-period music, as may be gleaned from the lyre tunings transmitted by Ptolemy, where a small septimal semitone below the note in question is the standard. This small semitone appears in all diatonic tetrachords on Ptolemy’s lýra, and in eight out of twelve tetrachords on his cithara, the other four being tuned diversely. This note, the tuning of which was shifted from the point of maximal resonance with the other strings, was therefore not equally enforced by sympathetic vibrations and consequently must have sounded duller. Also, there were fewer or no options to play such notes in concord with another, which would contribute to their harmonically subordinate status. 
These parypátai-unfriendly conventions of lyre tuning were perhaps inspired by the music of early auloi, in which the notes in question had to be obtained by partial covering of a finger hole, which made their pitches particularly unstable and compromised their sound.32 In this way, no musical modes that were shared between the instrument families would possibly have employed parypátai in strong harmonic functions.

It is therefore no wonder that theatre architects would not have provided resonators for such notes: on the one hand, their tuning was particularly variable, so that a resonator of fixed pitch would often have remained useless; on the other, their less brilliant sound had probably become a feature of the music that served to underline the harmonic relations.

This hypothesis appears to be corroborated by archaeological evidence from the best studied pair of doublepipes from Pompeii. On both these pipes, the fingerhole for the note is distinctly smaller than its neighbours (Figure 2), as if it still reflected the practice of half-covering by which the same note was obtained some centuries earlier. This note must be the parypátē Vitruvius had in mind: since he refers to note names without specifying the key (tónos/trópos), we must assume that he writes in terms of the natural Lydian key, where the functional note names coincide with the respective string names on the lyre. This is not unusual; other ancient writers similarly treat the Lydian double octave as the paradigmatic system. The solmisation diagram in Bellermann’s Anonymus is a perfect example: while the text insists that the same sequence of syllables is used for all fifteen trópoi, the diagram shows merely the notes of the Lydian. However, on a modulating instrument such as that from Pompeii, things get muddier, since each possible key would associate its functional notes and consequently the respective solmisation vowels with different fingerholes, as displayed in Figure 2. Would this explain why we find both a smaller and a larger hole for the note , which is associated with weak ‘ē’ in Hypolydian and Hyperiastian, but with stronger ‘ō’ in Lydian?

Figure 2

Solmisation vowels for different tónoi on Pompeii aulos 76892 (below) + 76893 (above). Data for the exit of 76893 and the second highest hole of 76892 are missing

Citation: Greek and Roman Musical Studies 10, 1 (2022) ; 10.1163/22129758-bja10042

At any rate, having identical fingerholes for functionally different notes in various keys thwarts a clear-cut representation of harmonic function by hole diameter. Nevertheless, the Pompeii pipes retain a measurable tendency to reflect the former in the latter. Without recourse to evidence from the solmisation vowels, this can be shown by plotting fingerhole diameters against the numbers of resonators (Figure 3). However, an immediate comparison (left) reveals hardly any correlation. A more significant correspondence is only established by grouping resonators that stand in an octave relationship together (right). This reflects a characteristic of the resonator system that also emerged from comparing it with the notes it would typically have amplified.33
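The octave grouping described here can be sketched in a few lines. The following only illustrates the procedure and is not the computation behind Figure 3: pitches are assumed to be encoded as semitone numbers, all function names are hypothetical, and no measured values are reproduced.

```python
import math


def fold_octaves(counts_by_semitone):
    """Sum counts of pitches that lie an octave (12 semitones) apart."""
    folded = {}
    for semitone, count in counts_by_semitone.items():
        pitch_class = semitone % 12
        folded[pitch_class] = folded.get(pitch_class, 0) + count
    return folded


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlate_holes_with_resonators(hole_diameters, resonators, octave_invariant):
    """hole_diameters: {semitone: diameter}; resonators: {semitone: count}.

    With octave_invariant=True, each hole is compared with the total number
    of resonators for its pitch class instead of its exact pitch.
    """
    if octave_invariant:
        folded = fold_octaves(resonators)
        counts = [folded.get(s % 12, 0) for s in hole_diameters]
    else:
        counts = [resonators.get(s, 0) for s in hole_diameters]
    return pearson(list(hole_diameters.values()), counts)
```

Calling the last function twice, once with `octave_invariant=False` and once with `True`, corresponds to the contrast between the left and right panels: the same diameters are compared first with pitch-exact and then with octave-grouped resonator counts.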

Figure 3

Correlation of fingerhole diameters in Pompeii aulos 76892+76893 and number of large-theatre resonators


Can we, however, assume that such observed tendencies reflect harmonic hierarchies beyond mere questions of note frequency? After all, one might argue that Lydian parypátē, not forming part of any other playable key on the Pompeii instrument, would be used less frequently than notes that were shared between several keys, and might receive a smaller fingerhole just for this reason. However, where there is no technical advantage in having smaller holes, there is certainly no a-priori reason not to make a large one. The structure of the instrument would certainly not be compromised by a larger hole at that particular position. On the other hand, smaller holes are to be placed a little bit higher on the tube than are larger holes playing the same pitch, which might arguably release some strain from the player’s hand. Still, if the same advantage was not exploited for other holes as well, one would need to support such an argument by the other notions, such as harmonic inferiority or at least infrequent use. Consequently it is certainly of interest to compare the hole sizes directly with the data for use frequency.

Figure 4 displays the number of occurrences in the extant ancient scores for each note for which a fingerhole of the Pompeii instrument exists in relation to the diameter of this hole, both for all involved keys (left) and only for the Lydian, to which the note in question belongs exclusively. In the second case, there is basically no correlation, and in the first case, it is weak and explains the data much less well than did a comparison with the Vitruvian resonator system. Once more, it is only by counting notes together that are separated by any number of octaves that the correlation between note frequency and finger hole diameters approaches the same strength as that between fingerholes and resonators (Figure 5). This may be a hint that the relation between song and aulos accompaniment often incorporated octave intervals.

Figure 4

Correlation of fingerhole diameters in Pompeii aulos 76892+76893 and occurrences of notes in the musical documents dating from the Roman Imperial period


Figure 5

Correlation of fingerhole diameters in Pompeii aulos 76892+76893 and octave-invariant occurrences of notes in the musical documents dating from the Roman Imperial period


Even so, Roman-period note frequency does not predict the fingerhole diameters very well. In contrast, as long as octave-invariance is assumed, note frequency in the Roman period is an excellent indicator for the number of Vitruvius’ resonators (Figure 6, right). Conversely, we can detect no correlation between resonators and Hellenistic music (Figure 6, left), even though the Lydian key was commonly used in both periods, and though Vitruvius’ account includes resonators for the chromatic notes, so that the discrepancy cannot be blamed on the earlier fragments’ preference for chromatic scales in contrast to the preponderance of diatonic in the later fragments. From this, admittedly limited, body of evidence, one might infer that the musical transformation that we observe between the Hellenistic and the Roman-period musical documents must have been more or less concluded in the first century BC.

Figure 6

Frequency of notes, projected to a single octave, vs. number of large-theatre resonators for each pitch, also projected to a single octave. Notes from DAGM that are not marked as doubtful


Even with a predictive value of 85%, it is instructive to examine the dots that are furthest from the regression line. Figure 7 displays the deviations for all attested notes within any octave (A thus represents the pitches of the Lydian proslambanómenos, mésē and nḗtē hyperbolaíōn etc.). Interestingly, C♯ gets almost two resonators more than it appears to deserve and makes us wonder whether this is due to some specific preservation bias in the musical documents. After all, this pitch class appears not only in the diatonic Hyperiastian and Iastian, where it is attested, but forms part of the chromatic upper tetrachord in the tuning Ptolemy calls ‘trópoi’, where it would be notated as . We should therefore expect that this note enjoyed some prominence, even if, so far, no melody in this mode has come to light. Anyway, questions of Roman-period chromaticism gone missing are irrelevant for the present investigation of a purely diatonic system of vocalisation.
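To make the notion of deviation from the regression line concrete, the following sketch fits an ordinary least-squares line and reports the residuals together with the coefficient of determination. That the ‘predictive value’ quoted above corresponds to R² is an assumption of this sketch, as are all function names; no data from Figures 6 or 7 are reproduced.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Observed minus expected value for each point (positive = above the line)."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def r_squared(xs, ys):
    """Share of the variance in ys accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum(r * r for r in residuals(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot
```

With note frequencies as `xs` and resonator counts as `ys`, a positive residual would mark a pitch class that receives more resonators than its frequency leads one to expect, and a negative one a pitch class that receives fewer.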

Figure 7

Deviation of reported resonator counts from non-rounded expectations based on Roman-period note frequency, in absolute values (left) and as a percentage (right). Data as in Figure 6


Secondly, we find B slightly overrepresented, which might point to its status as an important note in aulos construction: the low variant forms the bass note of the instrument from Pompeii and also serves as a conspicuous bass note in the documents.34 On the other hand, E, F and G might each be expected to receive one resonator more. This is only a loss of 20% for the otherwise well-endowed E and G, but knocks out F completely. Here we need to keep in mind that the symmetrical design of the cavea had only two positions for non-paired resonators, so that it would hardly be possible to arrange them in what the fragments make us suspect was the optimal manner. Even so, the utter disregard for F emerges unjustified even in view of its relative lack of frequency in the surviving scores. With some certainty, we can thus confirm that the designers of the scheme operated on the basis of a harmonic awareness that rested on more than the observable note frequency. It seems as if the sound of F was intentionally compromised, very much in line with Aristides Quintilianus’ perception.

Why then was C not equally spurned, one might ask, given that it shares the status of a lower movable note with F? Several reasons may have contributed. First of all, in the range of keys favoured by the Roman period, C is not unique to Lydian, but also appears in Hypolydian. Secondly, it may have been much better integrated within the framework of fifths and fourths even in Lydian. This is firstly suggested by its double role not only as lower moving note, trítē diezeugménōn , but also as the higher moving note of the modulating tetrachord, paranḗtē synēmménōn (cf. Figure 1, where the same pitch is once associated with ‘ō’, once with ‘ē’). Relatedly, in Ptolemy’s Lydian tuning ‘lýdia’ this same c is tuned a pure fourth above g (), forfeiting the otherwise regular small septimal semitone in order to gain greater modulating capability.35

6.1 Harmonic Function

So far we have focused on notes mainly as individual pitches, as was necessary when dealing with material fingerholes and once-material tuned resonators. The text of the Anonymus, however, prompts us to think more in terms of functional notes, defined not by their absolute pitch, but by the relation to the other notes in each key. In order to assess this aspect, we can rely on little more than statistics from the extant scores. In Figure 8, all functionally similar notes from the different keys are added to the grand picture of ancient functional note frequency, whatever it may be worth. Of course, notes in a central region are more frequent than bass and treble notes; the bell-shaped line in the diagram indicates the average frequency level in each particular region, representing the normal distribution that fits the evidence best. Harmonic function, as a determinant of mode, will of course greatly vary with the latter. For that reason, in this case it is particularly likely that, unless precaution is taken, the data would become biased by a few long fragments that are not really representative of the music of their period (the Mesomedes pieces, which were probably schoolbook songs, are a typical example).36 The values have therefore been weighted in a way that attributes an equal impact to every fragment (defining individual fragments following the DAGM edition).
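One plausible way of giving every fragment an equal impact is to scale each fragment's note counts to a total of one before adding them up. The sketch below illustrates this reading; the function name and the input format are assumptions, and the weighting actually used for Figure 8 may differ in detail.

```python
def equal_weight_frequencies(fragments):
    """Combine per-fragment note counts so that every fragment weighs the same.

    fragments: list of {note: count} dictionaries, one per fragment.
    Each fragment's counts are divided by the fragment's total before summing,
    so a long piece cannot dominate the combined picture.
    """
    totals = {}
    for counts in fragments:
        size = sum(counts.values())
        if size == 0:
            continue  # an empty fragment contributes nothing
        for note, count in counts.items():
            totals[note] = totals.get(note, 0.0) + count / size
    return totals
```

Under this scheme a fragment of a hundred notes and one of ten notes both contribute a total weight of one, so that only the proportions within each fragment matter.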

Figure 8

Relative note frequency in the ancient musical documents BC (left) and AD (right). Weights are normalised so that each document contributes equally. Red line: normal distribution indicating the contribution of a note’s range within the Greater Perfect System on its frequency


In Figure 9, finally, the values for all note classes with similar solmisation vowels are added and compared with the respective expectation values (i.e., the sum of the respective values indicated by the bell-shaped lines in Figure 8). Once more it becomes clear that the largely chromatic music from the Hellenistic period yields no meaningful correlation, while the evidence from the Roman period at least confirms a comparatively low presence of lower movable notes (η = ‘ē’), just as we had observed in the Lydian.

Figure 9

Deviation of values from expectations for notes associated with solmisation vowels; data as in Figure 8


However, on closer inspection it turns out that the difference is mainly due precisely to the Lydian, together with the Hypolydian, with which the Lydian shares half its tetrachordal structure (Figure 10). In contrast, functional notes associated with ‘ē’ are quite frequent in Iastian and even more so in Hyperiastian. Indeed, in Hyperiastian, functional C falls on the lyre’s ‘central’ string (mésē), which maintains its strong resonant relations with the outermost strings throughout Ptolemy’s tunings.37

Figure 10

As Figure 9, but for individual keys in the Roman period


As a consequence, it appears very likely that the perceived vowel qualities did not reflect the harmonic functions of all tunings/keys similarly. After all, when embedded in the materiality of actual instruments and the singers’ main vocal range, the keys no longer represented identical double-octave structures that were merely set to different absolute pitches, but assumed very specific characteristics. Even so, the Anonymus’ claim that the solmisation worked for all keys may well reflect ancient teaching practice: once a student had learnt to associate the vowels with intervals, these could be used at various pitch levels. But the original teaching process would have used the ‘natural’ Lydian scale, just as we find it in the text. Our evaluation suggests that this was also the key in which and for which the solmisation system was originally designed, and where its vowels therefore made the fullest harmonic sense.

On string instruments, as we have mentioned, this harmonic sense would have been corroborated, if not historically created, by the inclusion of each note within different frameworks of resonance with other strings. In the Lydian key, the tuning of the lyre immediately reflected the tetrachords of theory. Only here did the fixed notes of theory correspond with the four strings that were never retuned. On early lyres that spanned no more than an octave, only these four would also form part of concords that could be reinforced by an additional note that sounded the octave of one of the involved pitches. The typically diatonic notes (g and d) would appear secondary, being linked to the primary framework by concords, but not taking part in possible octave-reinforced pairings. The lower movable notes, as we have discussed above, would often fail to connect to the rest of the scale by concords, being potentially tuned to smaller semitone variants. Notably, even in a tuning in fifths and fourths throughout, the parypátē (f) would connect to trítē (c), but not to another note: functional F and B are the respective ends of the cycle that establishes a diatonic tuning.38 On the early seven-stringed lyres, finally, trítē (c) appears to have been typically missing,39 leaving parypátē (f) without any resonant support and a potential vocal trítē (c) without unison accompaniment. Such an instrumental configuration explains both the harmonic weakness of the lower movable notes and the attested early emphasis on mésē (a) and hypátē (e).

Later lyres, however, not only filled the gap at the upper part of the tuning but also acquired a new bass string, termed hyperypátē (D). In this way, a new resonant framework of D-g-a-d was established, rivalling the old e-a-b-e’ and perhaps paving the way for the harmonic modes centred on functional D and G that we observe in the Roman period. With such a double framework, we would expect that the higher diatonic movable notes would catch up with the fixed notes, as regards harmonic importance, leaving the lower movable notes even more isolated. This is what we observe in the melodic data from the Roman period, and what Aristides Quintilianus seems to describe.

Figure 11 displays the number of notes in the Lydian pieces from the Roman period, grouped by the associated solmisation vowels. On the left, these are ordered by plausible vowel intensity, artificially positing equal distances between the four. The right diagram, in contrast, reflects Aristides’ evaluation in terms of gender, where ‘ē’ is feminine, ‘ō’, masculine, ‘a’, intermediate, and ‘e’ not quite as feminine as ‘ē’. Only the latter predicts the note frequencies fairly well, while the ordering by intensity does not produce a meaningful result. It is interesting to see that a contemporary interpretation, which appears oriented towards F2 pitch, correctly accounts for harmonic significance as expressed in statistical prominence of pitches, associating high formants with weakness.

Figure 11

Roman-period Lydian note frequency vs. solmisation vowels


Since the other keys do not similarly associate weak solmisation vowels with low melodic prominence, one might wonder whether their notes might instead reproduce the corresponding prominence of the same pitches in the natural Lydian key. After all, projecting them onto the natural key in this way would expose their harmonic relation with the most natural harmonic framework of the lyre. This would indeed result in an excellent correlation between pitch count and plausible vowel intensity, though not F2 pitch (Figure 12). However, this correlation has no explanatory value, because it is largely a side effect of the retuning process and therefore trivial. The near-vanishing of ‘ē’ is thus mostly explained by the fact that (f) is not used in any of the keys in question, while (c) is absent from Hyperiastian and Iastian;40 in turn, the Lydian ‘a’ notes are favoured because they form the tuning framework that is never altered across the four keys, while one representative of Lydian diatonic ‘ō’ is lost in Iastian, where / ‘g’ is replaced by ‘g♯’.

Figure 12

Roman-period note frequency in non-Lydian keys vs. ‘absolute’ (Lydian) solmisation vowels


6.2 Rhythmical Prominence

Musical prominence, of course, need not be harmonic. Rhythm is at least as important a factor; after all, its arguably most basic function is to organise time in terms of marked and unmarked events. However, the disputes and uncertainties regarding questions of ancient metre often make the rhythm of ancient poetry difficult to assess. Ancient musical studies provide a lucky exception wherever ancient scores are furnished with rhythmical notation determining árseis by dots (‘stigmaí’) over note and rest signs. In possibly confusing contrast to the graphical representation, it is nowadays agreed that (as Bellermann’s Anonymus §3 || §85 affirms) these graphically marked árseis form the rhythmically unmarked (‘weak’) times, while the unmarked notes indicate rhythmically marked (‘strong’) théseis.

The respective documents thus provide a welcome source that enables comparison between original rhythmical indications and harmonic criteria of prominence. In Figure 13, all pieces are evaluated that make use of ársis dots at all. The bars indicate the percentage of ársis for the notes belonging to each solmisation class. Once more, the most conspicuous difference separates the lower movable notes (η, ‘ē’) from the rest. Obviously, harmonically weak notes were particularly likely to be placed in rhythmically weak positions. This finding not only substantiates our results, but also corroborates the mainstream interpretation of the ársis-thésis relation.
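The percentage underlying each bar can be computed as follows. This is a minimal sketch with a hypothetical input format, in which every note is reduced to its solmisation class and a flag for the presence of a stigmḗ; it does not reproduce the data of Figure 13.

```python
def arsis_share(notes):
    """Percentage of notes carrying a stigmḗ (ársis), per solmisation class.

    notes: iterable of (vowel_class, has_stigme) pairs.
    """
    totals, marked = {}, {}
    for vowel, has_stigme in notes:
        totals[vowel] = totals.get(vowel, 0) + 1
        marked[vowel] = marked.get(vowel, 0) + (1 if has_stigme else 0)
    return {vowel: 100.0 * marked[vowel] / totals[vowel] for vowel in totals}
```

A class whose percentage lies clearly above the others is one whose notes fall disproportionately on rhythmically weak positions.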

Figure 13

Stigmḗ on solmisation note classes in Roman-period Lydian pieces that contain stigmḗ (only cases not marked as doubtful in DAGM)


Interestingly, ‘e’ appears here most strongly associated with rhythmical prominence, which contrasts with its rather feminine status in Aristides, while it would meet our expectations for the vowel associated with the central note of the system. However, the differences between ‘e’, ‘a’ and ‘ō’ are too small to hold any statistical significance, given the limited sample size.

6.3 Vowels in Lyrics

If the picture of a musical education that established a deep-rooted association between certain vowels and certain classes of notes in the students’ minds is accepted, we must wonder whether this might not have influenced the creative process of composing new melodies. Would musicians have felt a certain tug, either wholly unconsciously or partly reflected upon, to place syllables of a certain sound on notes with a certain function, just because this function would have sounded ‘natural’ for that syllable? Might it even be possible that artists exploited such associations in order to distinguish compositions or musical sections where text and melody contrasted in that respect from others where it coincided? While the latter question is beyond the scope of this contribution and probably not answerable on the basis of the small number of surviving documents, we can at least attempt to investigate a possible general tendency.

I will start with anecdotal evidence. It has been noted above that the famous pieces by the second-century AD composer Mesomedes, which have been transmitted in the manuscript tradition, most plausibly derive their origin from an ancient elementary schoolbook. Their date and environment thus suggest that they were studied by the same people who also used the solmisation system to convey melodies and create an intimate acquaintance with harmonic structures. So I think it cannot be coincidental that in the starting half-verse of the Invocation of Calliope and Apollo the melody follows precisely a contour that the vowel qualities of the text would suggest. Indeed, if each of the vowels of the text is replaced by its closest relative in the solmisation tetrad, the melody follows directly from the starting note – which is the common starting note of three of the four pieces, and the only one associated with the required vowel ‘a’ – in combination with the principle always to select, of all possible notes associated with a given vowel, the one that is closest to its predecessor. In this way, a simple sine-wave-like melodic contour emerges, which returns to its starting point after exploring the two notes above and one below, remaining within the narrow range of a fourth (Figure 14).

Figure 14

Solmisation for the first half-verse in DAGM no. 25


The required substitution of vowels must have felt all but natural. The ‘ο’ in ‘Kalliópeia’ and ‘sophá’ is just the short variant of the solmisation ‘ō’, and Aristides assigns unadulterated maleness to both. The ‘i’ in the first syllable of ‘Kalliópeia’ can only stand in for ‘ē’, which was developing in the direction of /i/, and was already sufficiently different from the more open variants of the vowel represented by ε and its ‘long variant’, into which the diphthong ‘ai’ had monophthongised. The same is true for the written diphthong in the penultimate syllable of ‘Kalliópeia’, which was at that time almost certainly pronounced as a long ‘ī’. In its antevocalic position, it would typically have been pronounced as a long closed ‘ē’ a few centuries earlier, a sound that must have been almost identical with the solmisation ‘ē’ = η in Mesomedes’ or Aristides’ time.
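The derivation of the melody from the vowels can be spelled out as a short procedure: substitute each text vowel by its closest solmisation vowel, start on hypátē (e), and always move to the nearest note that carries the required vowel. The following sketch implements this under several simplifying assumptions: distances are measured in semitones of an idealised diatonic Lydian double octave, the synēmménōn tetrachord is left out, and the names `LYDIAN`, `SUBSTITUTE` and `derive_melody` are hypothetical.

```python
# Diatonic Lydian double octave: (note name, semitones above proslambanómenos A,
# solmisation vowel according to Anonymus §77). Equal-tempered semitone values
# are used for simplicity (an assumption of this sketch).
LYDIAN = [
    ("A", 0, "ō"), ("B", 2, "a"), ("c", 3, "ē"), ("d", 5, "ō"),
    ("e", 7, "a"), ("f", 8, "ē"), ("g", 10, "ō"), ("a", 12, "e"),
    ("b", 14, "a"), ("c'", 15, "ē"), ("d'", 17, "ō"), ("e'", 19, "a"),
    ("f'", 20, "ē"), ("g'", 22, "ō"), ("a'", 24, "a"),
]

# Text vowels mapped to their closest relative in the solmisation tetrad.
SUBSTITUTE = {"a": "a", "o": "ō", "i": "ē", "ei": "ē"}


def derive_melody(text_vowels, start="e"):
    """Return note names: the start note, then the nearest note for each vowel."""
    pitch = {name: semitones for name, semitones, _ in LYDIAN}
    current = start
    melody = [current]
    for vowel in text_vowels[1:]:  # the first syllable is sung on the start note
        target = SUBSTITUTE[vowel]
        candidates = [name for name, _, v in LYDIAN if v == target]
        current = min(candidates, key=lambda n: abs(pitch[n] - pitch[current]))
        melody.append(current)
    return melody


if __name__ == "__main__":
    # Kal-li-ó-pei-a so-phá
    print(derive_melody(["a", "i", "o", "ei", "a", "o", "a"]))
    # ['e', 'f', 'g', 'f', 'e', 'd', 'e']
```

Under these assumptions the result is an undulating line that rises two notes above hypátē, falls to one note below it and returns, staying within the fourth d–g, which is the shape of contour described above.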

The undulating half-verse may also be taken to substantiate the proposed interpretation of the solmisation vowels as intrinsic F2 pitch centred on a fixed note realised as ‘a’, towards which ‘ē’ above and ‘ō’ below are oriented. The fixed note appears here as hypátē (e), which undoubtedly formed the most typical instance of such a triad centre: its counterpart an octave above was seldom used, and hardly ever in conjunction with the note above it, which fell outside the regular tuning of the cithara, and the other ‘a’ notes stood above the structural disjunctions, therefore lacking a lower ‘o’-flavoured neighbour. The melodic-harmonic primacy of hypátē is clearly emphasised by its forming the starting note of the piece and its return to it at the caesura; later it will also serve as the final of the short song. In the excursions, hypátē is both start and end point of conjunct movements from and to above and below, firmly establishing the core of the solmisation triad. Notably, the movement also incorporates a conjunct falling run through a fourth – precisely one of the figures in the exercise tables in Bellermann’s Anonymus (§81–82, in instrumental notation).

A final observation may further reduce the chances that all this is just coincidence. As mentioned at the outset, the Anonymus distinguishes three kinds of syllabic onsets when expressing melodies through solmisation. The most distinctive uses the plosive ‘t’, also mentioned by Aristides, while more closely integrated movements either connect the vowels with double ‘n’ or allow them to succeed each other without any mediation.41 Mesomedes’ opening half-verse appears to allude to this technique. Indeed, we find plosive onsets reserved for the start of the metrical feet (‘k-’, ‘p-’, ‘ph-’), which clearly represent the most obvious domains of closer integration in the hexameter rhythm. In contrast, within the domain of the dactylic metron, syllables are connected by comparatively smooth consonants (‘-ll-’, ‘-s-’) or immediate succession of vowels. The text thus serves the music as much as the music serves the text. After all, the words also needed to be chosen carefully in order to implement the undulating melody with its very precise requirements regarding a downwards turn after the third and again an upwards turn after the sixth position, which calls for an oxytone word in the caesura as well as an accent on the third syllable. The opening of the three-line invocation thus appears as a skilfully designed cameo of solmisation, forming a link between the most elementary musical education in the style of the ancient grammar school and its first application in lyre-accompanied song, making the melodic and rhythmical elements of the ancient solmisation system equally fruitful.

Such a way of composing must of course remain an exceptional tour de force;42 under normal circumstances, it would add little to the appreciation of a musical work. Nonetheless it can be shown that related tendencies were operative in other pieces, as well. However, the inevitable statistical fluctuation in the available body of evidence – in the light of our previous results we must restrict our investigation to Lydian melodies from the Roman period – prevents a detailed analysis. At least we will be able to prove a correlation and obtain a general picture of its nature.

The relevant data are set out in Figure 15. Its upper half evaluates all securely readable notated syllables that contain those pure vowels which we may associate with one of the four solmisation vowels with reasonable certainty; this excludes not only the diphthongs but also υ and ου (all of which are in any case comparatively rare). However, taking all syllables together results in a methodologically problematic approach: while short vowels often do form long syllables, it is only in exceptional cases that a syllable with a long vowel may become rhythmically short. As a consequence, by lumping everything together we incur a significant risk of obtaining either a biased or a blurred picture. It is therefore advisable to exclude all metrically short syllables, even if this further restricts the data pool. Indeed this very limitation leads to clearer distinctions: while the melodic usage of the various vowels of the text shows no statistically significant variation for the common pool (Figure 15, upper half, leftmost diagram), once the short syllables are excluded, the observed differences are comfortably significant at a level of 5% (lower half).

Figure 15

Pure vowels and solmisation note classes in Roman-period Lydian vocal pieces (only cases not marked as doubtful in DAGM). Left: number of notes associated with each solmisation vowel, for six vowels in the lyrics. Centre: ‘ω-someness’ for vowels in the lyrics, obtained by weighting the occurrences according to intrinsic F₂ pitch of respective solmisation vowels of associated notes, with equal distances between the four solmisation vowels. Right: as centre, but with equal distances between ‘ō’, ‘a’ and ‘ē’, and ‘e’ halfway between the latter (cf. the abscissa in Figure 11)


In order to visualise the data beyond the confusing display of absolute values, we need to boil it down to a single meaningful measurement, in a process that will also level out part of the statistical fluctuation. This is conveniently done by assigning weights to the solmisation vowels according to intrinsic F2 pitch, in line with Aristides’ interpretation and our results. The precise relation of these weights, which would translate to Aristides’ concept of maleness, is necessarily artificial; however, this affects the general picture only minimally, as a comparison of the pair of possible choices in Figure 15 demonstrates. One of these (that in the centre) puts all four solmisation vowels at equal distances, while the other (the rightmost one) maintains equal distances between Aristides’ neutral ‘a’ and the extremities of male ‘ō’ and female ‘ē’, while inserting ‘e’ between ‘a’ and ‘ē’. The values are conveniently restricted to the range between 0 and 1, where 0 would stand for fully feminine status (i.e., all instances of a lyric vowel would be set to a lower movable note, expressed by ‘ē’) and 1, for pure maleness (all instances set to upper movable notes, associated with solmisation by ‘ō’). The resulting figure may therefore be regarded as measuring the degree of ‘ō-someness’ of each vowel of Greek poetry. In the diagrams, these vowels are also arranged by intrinsic F2 pitch, to facilitate an assessment of the inherent trends.
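The measure just described can be written down as a weighted average. The following Python sketch assumes the two weightings compared in Figure 15 take the numerical values given below (the source only states the spacing, so the exact numbers are an assumption), and the counts in the example are hypothetical, not data from the corpus. The function name `o_someness` is likewise introduced only for illustration.

```python
# Two candidate weightings of the solmisation vowels, ordered by intrinsic
# F2 pitch; 1 stands for pure 'ō' (male), 0 for pure 'ē' (female).
# The numerical values are assumptions derived from the stated spacing.
EQUAL_SPACING = {"ō": 1.0, "a": 2 / 3, "e": 1 / 3, "ē": 0.0}
E_HALFWAY = {"ō": 1.0, "a": 0.5, "e": 0.25, "ē": 0.0}


def o_someness(note_counts, weights):
    """Weighted average for one vowel of the lyrics.

    note_counts maps each solmisation vowel to the number of times the
    lyric vowel is set to a note of that class.
    """
    total = sum(note_counts.values())
    if total == 0:
        raise ValueError("no observations for this vowel")
    return sum(weights[v] * n for v, n in note_counts.items()) / total


# Hypothetical counts for a single lyric vowel (not taken from DAGM).
example = {"ō": 2, "a": 4, "e": 1, "ē": 3}
print(o_someness(example, EQUAL_SPACING))
print(o_someness(example, E_HALFWAY))
```

A vowel set exclusively to ‘ē’ notes scores 0 and one set exclusively to ‘ō’ notes scores 1 under either weighting; only the intermediate values shift between the two schemes.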

Again, the most conspicuous result is the tendency to set the ‘weakest’ vowels in the lyrics, η and ι, on notes that the solmisation system also associates with ‘weaker’ vowels. The pairing of η and ι in particular corroborates the prosodic proximity of the two, which, we have concluded, also underlies their apparent exchangeability in Mesomedes’ opening. Surprising, on the other hand, appears the weak status of short ο. However, as this is the least frequent of all the examined vowels, with only twelve examples in total, its outlier status need not be more than coincidental.

We may put somewhat greater confidence in the prominence of ε. After all, this is the vowel the creators of the solmisation selected for representing mésē, which the sources ubiquitously acclaim as the pivotal note of the scale. Our statistics might thus provide the required evidence that ε was not generally perceived as a weak sound. As a solmisation vowel, we have also found it occupying the rhythmically most prominent status. And much the same might be said for textual ε, which is particularly rare in notated ársis (11%, compared to an average of 21%).

It must be emphasised that these last observed correlations do not reflect any ‘rules’ that composers would have followed; they only show slight, most probably unconscious preferences to associate syllables that sounded weaker with weaker pitches. This is not necessarily a self-evident stylistic feature; on the contrary one might imagine a compositional style that strives for less contrast, balancing weak syllable sound by strong notes and vice versa. Most importantly, harmonic functions are not correlated with particular vowels, as in the solmisation system, but merely with a tendency towards one or the other direction in the F2 spectrum. Our findings are useless for the purposes of ‘predicting’ missing notes or ‘reconstructing’ lost music. Also, they apply only to Roman-period musical culture; a survey of note functions and the vowels of the lyrics in the Hellenistic fragments did not find a correlation.43

7 Intervals

So far we have envisaged vowels as markers of relative pitch, related to basic structural features of ancient scales, and as markers of relative harmonic prominence, related to the musical function of individual notes in these scales. But there might be another possible association: according to their variously perceived intrinsic pitch, vowels might correlate with different degrees of intervallic movement in a melody.44 None of our texts describes or implies such an idea. Still, composers might implement such an association without being aware of the principles they are unconsciously following. What this would entail is open to speculation; it is equally conceivable that intrinsic pitch influenced the melodic line in a positive correlation, low-pitched vowels being associated with smaller rising and larger falling intervals, while the opposite would be true for high-pitched vowels. Or conversely, intrinsic vowel pitch might be associated with a reverse melodic tendency, if the nature of the vowel contributed to the overall feeling of the melodic movement in a way that allowed it to carry part of the intended impression.

Testing such a hypothesis is no straightforward task. The distribution of vowels and diphthongs within words is strongly biased by morphology. As a consequence, vowels are not evenly distributed with regard to the accentual contours of speech, and since most of the extant melodies follow speech intonation to some degree,45 a correlation between some vowels and rising and especially falling tendencies is an unavoidable side effect. A meaningful investigation must therefore abstract from the effects of morphology and accentuation: how are individual vowels treated ceteris paribus?

This can be achieved by first comparing the various vowels within similar environments within the accentual domain, subsequently calculating a weighted average from the individual observed deviations. For all practical purposes, it suffices to distinguish fourteen syllable types within the possible accentual contours. These are final syllables (1) with acute, (2) grave or (3) circumflex, syllables immediately following (4) an acute or (5) a circumflex on the penultimate or (6) an acute on the antepenultimate, penultimate syllables with (7) an acute, (8) a circumflex or (9) following an acute on the antepenultimate, (10) antepenultimates with acute, syllables immediately preceding (11) an acute or grave or (12) a circumflex, and finally syllables preceding (13) an acute or grave or (14) a circumflex with greater distance. For all these categories the intervallic steps leading to each kind of vowel must be evaluated, so that the individual deviations from the mean can be determined for each category. For each vowel, all these deviations can then be combined into a single figure by adding up the products of the deviations for each category with the number of incidences of the vowel within that category, and then dividing the sum by the total incidences of the vowel. With a small dataset that nevertheless covers a wide range of measurements, we also need to be aware of the hazards posed by outliers. After all, some melodic jumps exceed an octave, and this may misrepresent the facts especially when the apparent melodic interval actually bridges a metrical and/or syntactical divide.
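The calculation can be sketched as follows. This is one possible reading of the procedure, not the dedicated software used for the study: the function name `vowel_deviations`, the data layout, and the optional `max_interval` cut-off (used here to exclude large leaps) are assumptions, and the observations in the example are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def vowel_deviations(observations, max_interval=None):
    """Combine per-category deviations into one figure per vowel.

    observations: iterable of (category, vowel, interval), where category
    is one of the accentual syllable types and interval is the melodic
    step reaching the syllable, in tones (positive = rising).
    max_interval: if given, steps larger than this are ignored.
    """
    by_category = defaultdict(list)
    by_cell = defaultdict(list)
    for category, vowel, interval in observations:
        if max_interval is not None and abs(interval) > max_interval:
            continue  # treat as an outlier
        by_category[category].append(interval)
        by_cell[(category, vowel)].append(interval)

    # Mean step for each accentual category, over all vowels.
    category_mean = {c: mean(steps) for c, steps in by_category.items()}

    weighted_sum = defaultdict(float)
    incidences = defaultdict(int)
    for (category, vowel), steps in by_cell.items():
        deviation = mean(steps) - category_mean[category]
        weighted_sum[vowel] += deviation * len(steps)
        incidences[vowel] += len(steps)

    return {v: weighted_sum[v] / incidences[v] for v in incidences}


# Hypothetical observations in two categories, plus one leap of seven tones.
data = [
    (1, "α", 1.0), (1, "α", 0.0), (1, "ω", -1.0),
    (4, "α", -1.0), (4, "ω", -2.0), (4, "ω", -3.0),
    (1, "α", 7.0),
]
print(vowel_deviations(data, max_interval=3.5))  # leaps beyond a fifth ignored
```

Because each deviation is measured against the mean of its own accentual category, a vowel that merely happens to occur often in falling positions does not register as ‘falling’ here; only its treatment relative to other vowels in the same position counts.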

Figure 16 shows the result for all reasonably frequent vowels and diphthongs. The complete data are presented at the left side, while the diagrams at the right side address the outlier problem by excluding all intervals larger than a fifth. For the total of the Roman-period documents, no general trend is discernible; with the exception of ω and υ, we seem but to observe statistical noise in the range of about plus or minus a tenth of a tone.

Figure 16

Roman-period syllabic nuclei and deviation from average melodic steps reaching the respective syllables in similar accentuation contexts (only cases not marked as doubtful in DAGM)


In the lower four diagrams of Figure 16, one clearly identifiable pool is separated from the rest of the data: the lower couple shows the data for the pieces from the manuscript tradition, which are mostly attributed to the stylus of Mesomedes. It comprises 398 cases, as opposed to 396 from all other documents together. These two sets now show partially conflicting tendencies. For the bulk of the documents, which stem not only from different hands but also from a range of several centuries, the differences are mostly levelled out as soon as outliers are excluded, showing that the evidence from these is very likely meaningless. The case is different for the pool that traces back to a single source and a named individual, where the tendencies mostly persist or become even more emphasised. It seems, therefore, that if there was any preference for different intervallic motion associated with different vowels in Roman-period music, it was certainly not a general phenomenon but may have been specific to individual styles.46

8 Conclusion

The preceding attempt to obtain a fuller understanding of the ancient solmisation system by embedding it in its literary and musical context has led to a multifaceted image of linguistic and harmonic relations, of their appreciation and philosophical integration by a well-informed contemporary mind, and of potential influences on poet-composers.

First and foremost, more than one line of reasoning suggested that the system was hardly much older than the Roman-Imperial period, that it was devised exclusively for diatonic scales, and that our two sources for it do not depend on each other. The system was used to convey melodies by singing them to the appropriate vowels, indicating rhythmic partitioning by the selective use of consonants. The vowels had been chosen according to the intrinsic pitch of their second formants, in combination with their intrinsic intensity. The pitch component mainly related the ‘moving’ diatonic notes to their respective adjacent fixed notes, so that pitch-neutral but intense ‘a’ was flanked by lower ‘ō’ and higher ‘ē’. The sound of the latter, which had developed to a relatively closed variant of /e/ and was steadily moving towards /i/, was both the highest and the feeblest, which warranted its association with the harmonically subordinate lower movable notes in the tetrachord, most importantly with parypátē. On the other hand, in the extant musical documents, the notes associated with the more intense vowels are often found in important modal roles, as finals or focal notes. However, many of these correlations are only borne out in the natural ‘Lydian’ key (which corresponded to lyre tunings in the Dorian octave), which doubtless formed the starting point in musical education. In Mesomedes’ Invocation of Calliope and Apollo, we have detected a probable case of a sophisticated application of the system in musical schooling, at a point when the pupils passed on from grammar-school-like exercises on intervals and scale fragments to performing real songs. However, we have also found traces of a much broader and probably largely unconscious correlation between textual vowels and their musical implementation, regarding both melodic/harmonic and rhythmic parameters. On the other hand, the system of theatre resonators as reported by Vitruvius as well as the design of an instrument from Pompeii testify to an awareness of harmonic hierarchies, which ancient engineers needed to take into account.

At any rate, the notions of intrinsic vowel pitch and intensity, whose harmonic repercussions were doubtless perceivable by a member of the educated elite, integrated smoothly within the stereotypes informing Aristides Quintilianus’ overarching gendered interpretation of ethical matters. As a consequence, his evaluation, when interpreted along the right lines, deserves less ridicule than was often assumed, and may stand as a unique and considerate testimony of Roman-Imperial musical perception.


This paper is based on research funded by the Austrian Science Fund (FWF) through grant P32816-G. In particular, all data that include filtering by certainty criteria (“not marked as doubtful in DAGM”) were prepared in the course of this project by Chrḗstos Terzḗs; the respective results are owed not least to his extraordinary diligence and precision. The respective data, which were evaluated by dedicated software designed by the author, will be made public in 2022.


  • Allen, W.S. (1987). Vox Graeca: The Pronunciation of Classical Greek. Cambridge: Cambridge University Press.

  • Barker, A. (1989). Greek Musical Writings, II: Harmonic and Acoustic Theory. Cambridge: Cambridge University Press.

  • Bélis, A. (1984). Un nouveau document musical. BCH 108, pp. 99–109.

  • Cosgrove, C.H., and Meyer, M.C. (2006). Melody and Word Accent Relationships in Ancient Greek Musical Documents: The Pitch Height Rule. JHS 126, pp. 66–81.

  • D’Angour, A. (2016). Vocables and Microtones in Ancient Greek Music. GRMS 4, pp. 273–285.

  • Devine, A.M., and Stephens, L.D. (1994). The Prosody of Greek Speech. Oxford/New York: Oxford University Press.

  • Hagel, S. (2009). Ancient Greek Music: A New Technical History. Cambridge: Cambridge University Press.

  • Hagel, S. (2013). Aulos and Harp: Questions of Pitch and Tonality. GRMS 1, pp. 151–171.

  • Hagel, S. (2016). “Leading notes” in Ancient Near Eastern and Greek Music and Their Relation to Instrument Design. In: R. Eichmann, L.-Ch. Koch, F. Jianjun, eds, Sound – Object – Culture – History, Rahden, Westf.: Marie Leidorf, pp. 135–152.

  • Hagel, S. (2018a). “Musics”, Bellermann’s Anonymi, and the Art of the Aulos. GRMS 6.1, pp. 128–176.

  • Hagel, S. (2018b). Adjusting Words to Music: Prolongating Syllables and the Example of “Dactylo-Epitrite”. JHS 138, pp. 227–248.

  • Hagel, S. (2020). The Birth of European Music from the Spirit of the Lyre. In: G. Kolltveit and R. Rainio, eds, The Archaeology of Sound, Acoustics and Music: Studies in Honour of Cajsa S. Lund. Publications of the ICTM Study Group for Music Archaeology 3, Berlin: Ekho Verlag, pp. 151–169.

  • Hagel, S., and Lynch, T. (2015). Musical Education in Greece and Rome. In: W.M. Bloomer, ed., A Companion to Ancient Education, Chichester: John Wiley & Sons, pp. 401–412.

  • Hirschfeld, G. (1875). Inschrift von Teos. Hermes 9, pp. 501–503.

  • Holmes, P. (2008). The Greek and Etruscan Salpinx. In: A.A. Both, R. Eichmann, E. Hickmann, L.-Ch. Koch, eds, Challenges and Objectives in Music Archaeology, Rahden, Westf.: Marie Leidorf, pp. 241–260.

  • Hughes, D.W. (1989). The Historical Uses of Nonsense: Vowel-Pitch Solfège from Scotland to Japan. In: M.L. Philipp, ed., Ethnomusicology and the Historical Dimension, Ludwigsburg: Philipp Verlag, pp. 3–18.

  • Hughes, D.W. (2000). No Nonsense: The Logic and Power of Acoustic-Iconic Mnemonic Systems. Br. J. Ethnomusicol. 9, pp. 93–120.

  • Lehiste, I. (1970). Suprasegmentals. Cambridge, Mass.: The MIT Press.

  • Pöhlmann, E., and West, M.L. (2001). Documents of Ancient Greek Music: The Extant Melodies and Fragments. Oxford: Clarendon Press (DAGM).

  • Rocconi, E. (2002). The Development of Vertical Direction in the Spatial Representation of Sounds. In: E. Hickmann, R. Eichmann, A.D. Kilmer, eds, The Archaeology of Sound: Origin and Organisation, Rahden, Westf.: Marie Leidorf, pp. 389–392.

  • Stanford, W.B. (1943). Greek Views on Euphony. Hermathena 61, pp. 3–20.

  • West, M.L. (1992). Ancient Greek Music. Oxford: Clarendon Press.


Hagel 2018a.


For the following cf. Hughes 1989; Hughes 2000.


Hughes 2000, 96f.


Cf. e.g. Lehiste 1970, 120–5.


Cf. e.g. Lehiste 1970, 68–71. For Greek vowels and one specific way of investigating their influence on melodic intervals, which yielded no conclusive result, cf. Devine and Stephens 1994, 173–6.


D’Angour 2016, 278f.; chromatic and enharmonic variants are also construed in Bélis 1984, 105.


D’Angour addresses this predicament with an argument that appears to undermine the fundamental assumptions of his theory: if the low-pitch vowel ō intends to flatten a diatonic likhanós (e.g., g) towards the proper pitch of an enharmonic one (~ f) – I do not understand why it might only be “a semitone higher than the preceding vocable” – one would have to assume that the origin of the system is to be sought at the same time in the diatonic (providing the ‘original pitches’) and in the enharmonic (providing the ‘target pitches’ underlying the choice of vowels), which appears to be a blatant contradiction, assigning the movable notes in question to two different pitches at the same time.


Cf. Hagel 2013: the geometry of a Classical trígōnon shows that it must have included modulating strings, which precludes a neat tetrachordal scheme.


On ancient leading notes and especially F F and R R (in the frequent Lydian key) as typical leading notes to final S S, cf. Hagel 2016.


Hagel 2018a, 142–4; 160.


Cf. Dion. Hal. Comp. verb. 14, quoted below.


Barker 1989 translates ὀργανικός in 1.4, 4.20 Winnington-Ingram, as “instrumental” (402), 1.5, 6.22 W.-I., “instrumental performance”, 1.11, 23.4 W.-I., “instrumental”, but here (479) “produced by our vocal organs”. I opt for consistency – indeed Aristides’ idea of solmisation emphasises the reproduction of instrumental sounds (cf. his explanation why solmisation syllables start with ‘t’ in 2.14, 79.5–14 W.-I., including μόνον τε ταῖς τῶν ὀργάνων χορδαῖς ἐμφερῶς ἠχεῖ “it is the only one to resemble the sound of strings on instruments”), and this accords with my interpretation of the solmisation system being intimately associated with instrument-backed elementary musical schooling (Hagel 2018a, 143f.).


These would of course go only as far as the facts support the model. Barker (1989, 140 n. 124) rightly points out that Greek neuters in fact terminate in some variant of ‘o’ in most cases; Aristides may have overstated the observation that a nominative/accusative plural ending in ‘a’ is typical for neuters.


For the importance of prolongation of notes (ἔκτασις) in the context of solmisation-based musical education cf. the instruction preceding the exercises in Anon. Bell.: τὰς ἀγωγὰς καὶ τὰς ἀναλύσεις δεῖ μελῳδεῖν ἐκτείνοντας μᾶλλον καὶ μὴ βραχύνοντας τοὺς φθόγγους, ἡ γὰρ ἔμμονος αὐτῶν καὶ ἐπιμηκεστέρα ἐκφώνησις ἀκριβεστέραν τῇ ἀκοῇ χαρίζεται τὴν κρίσιν (§78): “The upwards and downwards movements need to be executed by rather extending, not shortening the notes, because when they are produced steadily and for an extended time, the ear can more easily assess their correctness”.


One might entertain the hypothesis that Aristides (mis)applied a concept to vowels that was of recognised importance in another field of musical composition: at least in certain kinds of song, composers preferably implemented especially long notes only by readily extensible syllables, avoiding syllables consisting of a short nucleus followed by a non-sonorant consonant; cf. Hagel 2018b.


For a general overview on Greek letter euphonics see Stanford 1943.


The letter is already associated with tenuity in Pl. Crat. 426e.


Cf. Allen 1987, 73–5.


Eleusis inv. 907, Bélis 1984; cf. DAGM no. 1.


Bélis 1984, 107: “probablement entre les harmoniques 2 à 7 ou 8, à en juger par la longueur du tuyau”.


Importantly, at the time in question Ε would have represented not only the short vowel but also long closed ‘e’, which came to be spelled as ει once the diphthong /ei/ had been monophthongised. This long version, at least, would have had distinctly higher intrinsic pitch compared with long open ē, which the Ionic alphabet rendered as η.


Transcribing the final Ε as a long note is unproblematic because at the time the letter was not only used to designate the long variant (which later had come to be written as ει), but even called by this long variant (to be renamed ‘ἒ ψιλόν’ only in Byzantine times). The rhythm in my examples is only meant to show that we cannot derive the actual rhythm the painter had in mind from the ‘metre’ of the syllables. Note that my first example assumes a base note of C, as in Bélis’ transcription, but my second example, a base note E, keeping the top note identical.


These hypothetical renditions, like Bélis’, rely on the assumption that the trumpet type in question would have played an undistorted harmonic sequence. This is not necessarily true; as experiments with models have shown, much depends on the type of mouthpiece (Holmes 2008, 101f.), and many different sequences of intervals are possible, though few of them may have been acceptable as perfect fourths.


E.g. Aristot. Met. 1018b; [Aristot.] Pr. 19.20; Cleonid. 11, 202.3–5; cf. Hagel 2009, 117–22.


CIG 3088; Syll. 3.960; cf. Hirschfeld 1875; Hagel and Lynch 2015, 406f.


Cf. Hagel 2018a, 142–4.


For the following, cf. Hagel 2009, 219–50.


Hagel 2009, 245.


Cf. Pl. Resp. 443d; Plut. Quaest. conv. 744c; 745b; SEG 30.382; Frag. Cens. 12, 75.5–6.


Hagel 2009, 251–5.


For a hypothesis about the non-Mediterranean European origins of C-based music, cf. Hagel 2020.


Cf. West 1992, 235 n. 42; Hagel 2009, 135f.; 156–8; 410; 442–4.


Hagel 2009, 252–4.


Hagel 2009, 351–61.


Ptol. Harm. 1.16, 39.12–14, cf. Hagel 2009, 198–201.


Hagel and Lynch 2015, 405; Hagel 2018a, 136f.


Cf. e.g. West 1992, 171; Hagel 2009, 196 Diagram 50.


Note that B could maintain a prominent status because two instances of its single concordant companion E were present, one a fifth below and one a fourth above. The resulting structure of e-b-e’, when played simultaneously, conforms to the sequence of 2nd, 3rd and 4th harmonic and therefore blends into an especially smooth chord.


For the sources and modern views cf. West 1992, 176f.; Hagel 2009, 104f. n. 5.


However, it is also conspicuously rare in Hypolydian, which appears to demand an explanation. An especially weak status of this pitch might recall the old omission of trítē on the lyre, half a millennium earlier, perhaps even the designation of the respective Near Eastern lyre string, the third from the upper end, as qatnu, ‘thin’, in Nabnītu 32.1.3, almost two millennia earlier, or the fact that this string has always been the first to break on my lyres, generally in circumstances where replacement cicadas were hard to come by.


Aristid. Quint. 2.14, p. 79.5–8 W.-I.: ἐπεὶ δὲ ἔδει καὶ συμφώνου παραθέσεως, ὅπως μὴ διὰ μόνων τῶν φωνηέντων γινόμενος ὁ ἦχος κεχήνῃ, τῶν συμφώνων τὸ κάλλιστον παρατίθεται τὸ τ· “But since it turned out that a consonant must also be added, in order to avoid the gap in the sound that would result from concatenating only vowels, the most beautiful of the consonants is added: ‘t’”. This need not imply that Aristides would always insert a ‘t’. Successions of multiple vowels are a typical feature of Greek speech and especially poetic language; it is however avoided between words. Consequently, solmisation practice would naturally regard small musical units as ‘words’, between which ‘hiatus’ must be avoided.


I wonder whether it is of any significance that of the five preserved openings in the corpus of ancient melodies, four have the correct solmisation vowel in the first syllable. In addition to the discussed piece these are the Invocation of the Muse (DAGM no. 24: ‘á̱eide’ starts from hypátē S e), the Hymn to Nemesis (DAGM no. 28.1 and 16: ‘Né̱mesi’ twice starts from mésē I a), and the Seikilos stele (DAGM no. 23: ‘hó̱so̱n’ starts with a jump from one functional diátonos S to another Z). The only counterexample is the Hymn to the Sun (DAGM no. 27: hypátē repeated four times on the initial syllables of ‘khionoblephárou’). All pieces are Lydian except Seikilos; two (Seikilos and the Invocation of the Muse) famously involve an initial clash between melody and accent.


Even though we have no solmisation for the preponderantly chromatic music of that period, an investigation is possible by comparing harmonic functions with a weighted average of textual vowels ordered by intrinsic F2 pitch. The differences are however minimal, even for the most promising subgroups: for the differences between the syllable nuclei ω, ο, α, αι, ε, η, ει, ι on the one hand, and hypátē meson, parypátē meson and oxýpyknos meson in all keys, on the other, χ2 = 15.62 at 14 degrees of freedom, for that between hypátē meson and parypátē meson in the Lydian, χ2 = 6.81 at 7 degrees of freedom. The respective measures of textual ‘ω-someness’ (calculated analogously to Figure 15, but using the actual vowels of the songs instead of solmisation vowels, equally spaced in the order given above) range between 12.11% and 14.63% in the first set, and between 12.37% and 14.68% in the second. Grouping note functions analogously to the solmisation (similar ‘e’, ‘a’ and ‘ē’ groups, plus a group for the oxýpykna replacing the ‘ō’ group of diátonoi) does not help either; the measures for the resulting groups are practically identical (13.01%, 13.49%, 13.38%, 12.46%), with a χ2 = 12.24 for the data table at 21 degrees of freedom.


A related model is proposed by D’Angour (2016, 282–4), on a very selective basis: relinquishing his original idea of solmisation, D’Angour interprets the Orestes fragment (DAGM no. 3) in terms of relative F0 pitch where the intrinsic pitch relation between the vowels of subsequent syllables is only relevant when the second falls on a note that appears ‘microtonal’ in modern stave notation. In this way, the test might appear designed to suit the theory; stubbornly disagreeing evidence is subsequently rejected (while the small difference between ι and υ is welcomed as confirming the picture, the hardly smaller difference between ι and ε in a problematic instance is dismissed as “arguably neutral”). If the count is done more rigorously and expanded to all microtonal movements in a strict sense (e.g. including quartertone steps whose initial pitch is a mesópyknon), I find eight cases in agreement with D’Angour’s tenets and five in disagreement, in line with a random distribution (p = 0.29). The reader may be cautioned, when assessing D’Angour’s argument, not to be confused by finding the lower enharmonic pitches in question designated as e♯ and a♯ and the higher as f and b♭ respectively, contrary to the normal pitch relation of these notes in (non-tempered) music.


The principles have long been known and studied; cf. more recently West 1992, 197–200; Devine and Stephens 1994; Cosgrove and Meyer 2006.


One might of course search for other factors of which the tendencies that are observed for Mesomedes might but form corollaries. One example might be the musical expression of the preposition ὑπό (DAGM 27.9; 27.23; 28.7; 28.11; 28.12; cf. also the plunge within this word in DAGM 39.4) in accordance with a spatial concept of pitch that had been developing by that time (cf. Rocconi 2002). But this explains only a small part of the observed deviation.
