Linking Syriac Liturgies: Digitizing Card Collections and Handwritten Notes from the Archives of the Peshitta Institute

The Peshitta is the Syriac translation of the Bible from the second century AD. Its cultural and literary impact in the Middle East was enormous, no less than that of the Latin Vulgate in large parts of (Western) Europe. Up to the present, the Peshitta plays a crucial role in the various Syriac Christian communities in the Middle East and the Syriac Diaspora. A valuable resource for Peshitta studies is the collection of handwritten notes on cards and in binders from the first decades of the Peshitta Project (started 1959), which was digitized in the DANS “Klein Data Project” (KDP) Linking Syriac Liturgies (2018–2019). The notes contain a wealth of information about manuscripts, liturgical traditions and calendars, and various Syriac Bible translations. The KDP has made this material accessible and allows for computational research into the complex interactions between the textual and structured data included in this data set, in which more than 9,000 entries containing pericopes read in the liturgy are indexed according to more than 70 categories.


Introduction
One of the earliest traditions in Christianity is the Syriac tradition, a diverse tradition closely connected through the shared use of the Syriac language. Syriac belongs to the Aramaic branch of the West Semitic languages (cf. the Glottolog classification). The Aramaic languages cover a time period from the tenth century BC up to the present and allow for longitudinal linguistic analysis over three millennia. Syriac takes pride of place among the Aramaic languages because of its long literary tradition, including poems, ancient Bible translations, apocryphal stories, historiography, Bible commentaries, philosophical tractates, and scientific works. It is the language of a long-neglected branch of early Christianity, and one of the most important languages for studying the religious and cultural context in which Islam emerged. Syriac translations served as a bridge between Greeks and Arabs. Syriac is still the liturgical language of Oriental Christian Churches in the Middle East and is used increasingly by these Churches in the Syriac Diaspora (on the Syriac language, see further Butts, 2018; for a short introduction to Syriac Christianity, see Murre-van den Berg, 2010).
The Syriac tradition originated in the city of Edessa (nowadays Şanlıurfa in south-eastern Turkey). This city is located on an important trade route, in an area contested for centuries by major empires, and was therefore a place where different cultures met. The tradition spread along trade routes to East Asia, the Levant, and Egypt. After the Christian councils of the fifth century the Syriac tradition was identifiable with four main denominations: the Syriac Orthodox (also labelled Miaphysite), East Syriac (also "Church of the East", or Diophysite), Melkite, and Maronite communities.1
1 Simply put, "Miaphysite," "Diophysite," and "Melkite" are labels used to denote distinct doctrinal views on how to verbalize in what way Jesus Christ is to be seen as "God" and "human." These doctrines were articulated during the fifth/sixth-century Christological debates in the Christian church following the Councils of Ephesus (431) and Chalcedon (451). The labels designating those who adhere to one nature of Christ ("Miaphysites") and those who clearly distinguish two natures ("Diophysites") do no justice to the complexity of theological doctrine and are usually avoided by modern scholars. The label "Melkite" (from malkā, "[Byzantine] emperor") is used for Syriac adherents of the Byzantine Church who accepted the Christology proclaimed by the Council of Chalcedon. The Maronite denomination, closely connected with the West-Syriac, was named after a fourth-century religious leader. From the thirteenth century onwards it became gradually Latinized following contacts with the
The tradition's most important Bible translation is the Peshitta, which is a translation of both the Hebrew Old Testament (OT) and the Greek New Testament (NT). The translations of the two corpora are dated differently and have a distinct provenance. The Peshitta OT dates from the latter part of the second century and was presumably made in Edessa. Scholarly opinions differ as to whether it is of Jewish and/or Christian origin. The Peshitta NT is a fifth-century revision of one of the earlier NT translations. The most common assumption is that it is a revision of the third-century "Old Syriac" translation. The Peshitta OT and NT became the standard translation for Syriac-speaking Christians. The name "Peshitta" is probably a later designation (the earliest reference is attested in the ninth century) and is used to suggest that it is a "simple", "widespread", or even "vernacular" translation; in this respect, it has the same role as the Latin Vulgate in the West (cf. ter Haar Romeny & Morrison, 2018).

Materials
The archives of the project were combined with the CLARIAH research pilot LinkSyr: Linking Syriac Data, a project involving Syriac NLP tools and Linked Data. This combination will be instrumental for the website to be established and for further research in the field of NLP connected with "liturgical" words (see below, section Concluding remarks).
The VTS edition started in 1959. It is based on manuscript evidence that was acquired and/or photographed (stored on microfilm) in the 1960s and 1970s. In advance of the edition, all available manuscripts were catalogued and described (Peshitta Institute, 1961),2 and various handwritten collections were produced. The edition represents the earliest Peshitta text as closely as possible. Its main text is based on a sixth-/seventh-century manuscript, the earliest complete manuscript that has been preserved (earlier manuscripts, from the fifth century onwards, contain only one or two individual books of the Bible), to which variants from manuscripts up to the twelfth century are added in a critical apparatus. Only the base manuscript's obvious errors and inferior variants are corrected, on the basis of the most ancient manuscripts.
Two types of manuscripts were used for VTS, and these two types form the materials studied for this project: lectionary manuscripts and Bible manuscripts.
- Lectionary manuscripts are conceived for liturgical use, since they contain pericopes (i.e., parts of the biblical text) used for the liturgical service of a given ecclesiastical feast. One pericope, and sometimes several pericopes, clustered under an introductory phrase (e.g., "From Genesis") and a liturgical title (e.g., "For the second day of the week of Easter"), constitutes a reading, which is assigned to a certain ecclesiastical feast or service in commemoration of a saint. The ordering of readings follows the ecclesiastical calendar and is connected to a rite, specific to a denomination.
- Bible manuscripts are manuscripts containing a complete biblical book, either an individual book in one manuscript or clustered as a known series, for instance, the "Five Books of Moses" (i.e., the first five books of the Hebrew Bible).
The digitized materials are part of three collections. Each collection is centred around a particular type of manuscript: either a lectionary or a Bible manuscript.
2 Especially its fourth supplement contained several manuscripts which are part of public and private collections in the Middle East (Peshitta Institute, 1968).
Wim Baars meticulously studied the content of lectionary manuscripts. In the 1970s he produced two collections:
1. A card collection in which the pericopes are ordered by book, chapter, and verse, providing an overview of corresponding pericopes used in the various Syriac liturgies (see Figure 1).
2. A set of binders containing a continuous, complete description of the contents of each manuscript, presenting all available headings, colophons, and liturgical titles in Syriac, as well as the readings in order of their appearance in the described manuscript (see Figure 2).
Jenner (1993) studied the liturgical titles in Syriac Bible manuscripts and brought them together in several lists. From these, we derived the third collection for our digitized data.3
For this project, the materials of all pre-ninth-century Bible manuscripts containing liturgical titles, a total of 39 (cf. Jenner, 1993, p. 11), and of all lectionary manuscripts that were available to Baars at the time (also a total of 39 manuscripts) were digitized (for an example of a page of a lectionary manuscript, see Figure 3). The manuscripts differ in date: the Bible manuscripts present older materials than the lectionary manuscripts. The total number of folios on which the data was found is 8,982, but since a folio is written on both the recto and the verso side, this number may safely be multiplied by two in order to reflect the more common "book pages". The data set is quantified and placed into context in Tables 1-3. The data containing the readings is presented in 9,056 entries (for the precise contents of an "entry", see below, section Data set). Most of these entries are taken from the 39 lectionary manuscripts. These are dated later than the Bible manuscripts studied by Jenner (1993), which contain liturgical titles inserted in the main text to denote pericopes to be read. In the margins of several of these Bible manuscripts, annotations were added in later centuries, which are difficult to date with certainty (more than half of the 920 liturgical titles; see Table 2).
Readings contained in a lectionary are ordered according to an ecclesiastical calendar, which is connected to a rite specific to a denomination. These specifications make it possible to assign even the unprovenanced lectionary manuscripts which are part of the data set to a denomination. All analysed lectionary manuscripts can therefore safely be distributed among the main Syriac denominations. A summary of these manuscripts by denomination is found in Table 3. The Bible manuscripts prove to be more difficult to assign to a specific denomination. Moreover, only six of them contain a date of writing; the other manuscripts are dated on palaeographical grounds. All Bible manuscripts studied contain liturgical titles, which either precede the pericopes in the biblical text itself, are inserted in the margin near the intended pericope, or are found in an inventory of titles appended to the biblical text.

Table 1 Summary of the used Bible and lectionary manuscripts

                          Number of manuscripts   Number of folios   Century
Lectionary manuscripts    39                      5,412              9-16
Bible manuscripts         39                      3,570              5-8

Table 3 The lectionary manuscripts assigned to Syriac denominations

                   Number of manuscripts   Century
Syriac Orthodox    13                      9-16
Melkite            14                      9-13
East Syriac        11                      9-16
Maronite           1                       13

Since all of the manuscripts were collected in preparation for a text edition of the Peshitta OT, available lectionaries containing only pericopes from the NT were not considered. Nevertheless, 1,733 of the 8,136 entries from the lectionary manuscripts are pericopes taken from the NT. These NT pericopes are part of lectionaries which were studied in preparation of VTS because they consist mainly of OT pericopes.
All entries containing readings from the Bible manuscripts are taken from the Peshitta (920), but with the lectionary manuscripts the situation is slightly different. Most of the 8,136 entries are also from the Peshitta (7,607). Of the remaining 529 entries, 61 contain references to earlier pericopes in a lectionary manuscript (and do not give the actual text of the intended pericope), 79 pericopes are from an as yet unidentified translation (to be precise: 77 from the OT, including two references; and two from the NT, including one reference), 99 pericopes are from the Harclean, 289 are from the Syro-Hexapla, and one pericope is from the translation of Jacob of Edessa. The Philoxenian translation does not seem to be attested in the data set, nor does the second-century Gospel Harmony, the Diatessaron. The use of divergent translations is widespread: out of the 39 studied lectionary manuscripts, 30, covering all four denominations, contain pericopes taken from translations other than the Peshitta.
A divergent translation is sometimes indicated by the scribe in the introductory phrase (e.g., ak mašlmānutā dšabʿin "according to the Seventy", i.e., taken from the Syro-Hexapla), but in other cases it can only be established by comparing the text of the pericope with modern-day editions.
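The counts reported in this section can be checked against one another with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable and function names are illustrative and are not part of the data set.

```python
# Figures as reported in the text; all names are illustrative only.
folios = {"lectionary": 5412, "bible": 3570}
entries = {"lectionary": 8136, "bible": 920}

# Lectionary entries that are not taken from the Peshitta.
non_peshitta = {
    "reference to earlier pericope": 61,
    "unidentified translation": 79,
    "Harclean": 99,
    "Syro-Hexapla": 289,
    "Jacob of Edessa": 1,
}
peshitta_lectionary = 7607

total_folios = sum(folios.values())
total_entries = sum(entries.values())
remaining = entries["lectionary"] - peshitta_lectionary


def check_counts():
    """Return True when the reported totals agree with each other."""
    return (
        total_folios == 8982
        and total_entries == 9056
        and remaining == sum(non_peshitta.values()) == 529
    )


print(check_counts())
```

Doubling `total_folios` gives the approximate number of "book pages" mentioned above, since each folio has a recto and a verso side.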

Methods
The steps taken to arrive at the final presentation of the materials are explained in chronological order. The Latin script and the Arabic numerals contained in Baars' card collection were digitized in a text document (cf. Figure 1). The resulting data was cross-checked with the collection contained in the binders. Based on these binders the Syriac text was added (i.e., introductory phrases, liturgical titles, and colophons; cf. Figure 2). For cross-checking the Syriac text, the card collection could not be used, since it only contains details in Western script. The resulting data was imported into a spreadsheet program (Microsoft Excel). Uncertain passages were checked against the microfilms or digital images of the manuscripts, and hyperlinks were inserted in order to connect the data to the digitized manuscripts available online.
research data journal for the humanities and social sciences 5 (2020) 20-38
Subsequently, the data in Syriac script taken from Jenner (1993, pp. 46-63, 402-421) was digitized. The computational facilities of the ETCBC were used to extract the relevant data from files kindly made available by Jenner (the initial data was processed using Multi-Lingual Scholar 3),4 and the encoding of the Syriac font used in the files was decoded. The resulting list was cross-checked with the print edition of the dissertation, the resulting text file was imported into Excel, and hyperlinks were inserted.
Both data sets were consolidated into one workbook. Finally, the Syriac texts were translated in the spreadsheet file itself, so that the filtering and sorting tools of Excel could be used.
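The cross-checking step, in which one transcription is compared with another, could be sketched as follows. This is a hypothetical illustration and not the procedure actually used in the project: the record layout (a mapping from an entry identifier to a pericope string), the function name `cross_check`, and the toy identifiers and values are all assumptions.

```python
def cross_check(cards, binders):
    """Compare two transcriptions keyed by entry identifier.

    Returns the identifiers missing from either source and the
    identifiers whose values disagree between the two sources.
    """
    card_ids, binder_ids = set(cards), set(binders)
    shared = card_ids & binder_ids
    return {
        "only_in_cards": sorted(card_ids - binder_ids),
        "only_in_binders": sorted(binder_ids - card_ids),
        "mismatch": sorted(i for i in shared if cards[i] != binders[i]),
    }


# Invented toy data: "toy-0002" differs, "toy-0003" is missing from the binders.
cards = {"toy-0001": "Gn#01:01-05", "toy-0002": "Gn#02:01", "toy-0003": "Gn#03:01"}
binders = {"toy-0001": "Gn#01:01-05", "toy-0002": "Gn#02:02"}

report = cross_check(cards, binders)
print(report)
```

Entries listed under `mismatch` or under one of the `only_in_...` keys would be the ones to check by hand against the microfilms or digital images.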

Entries and Categories
The data of the pericopes is presented in 9,056 entries, and in total 76 categories are used to describe any entry.
An entry contains at least the following five main building blocks of a reading. We first explain these building blocks and the 32 categories used, and then give two examples (for the remaining categories, see Appendix A):
1. One or more pericopes. The chapter and verse numbering is based on a scholarly edition of the Syriac Bible translation, but when there is a discrepancy between the Syriac translation and the Greek or Hebrew text, the numbering of both is given. One category, the parts of which are divided up by plus-signs, round brackets, and square brackets (cf. Figure 4, column C), and an additional six categories for pinpointing the start and end of the pericope(s).
2. The introductory phrase, i.e., the words used to introduce the reading. Nine categories: three for the Syriac text, the translation in English, and additional remarks; and six to give the exact position of the phrase in the manuscript, referring to the folio, column, and lines of both the start and the end of the phrase (cf. Figure 4, columns D-F, I, K-O).
3. The liturgical title, which contains the data to connect the reading to the liturgical year. Eleven categories: five for the Syriac text (cf. Figure 4, column P, "Taksa"), its translation in English, additional Syriac remarks as found in the manuscript, a translation of these in English, and additional remarks; and six categories giving folio, column, and lines, thus denoting the start and end of the title.
4. The Syriac translation from which the pericope is taken. Four categories: the Bible translation used, the Syriac text used in the manuscript to denote the Bible translation, a translation of the Syriac text in English, and additional remarks.
5. The denomination in which the reading was used. One category.
Figure 4 Screenshot taken from "Pericopes.xlsx". This screenshot covers the first sixteen categories, describing entries 060-0686 to 060-0723. It shows the pericope, the introductory title, its translation, the place of the title in the actual manuscript, the liturgical title, and its translation in English.
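The five building blocks can be pictured as a small record type. The sketch below is one possible in-memory representation and not the layout of the published workbook: the class name `Entry`, the field names, and the omission of the position categories are assumptions. Only the category counts per block and the sample values (from entry 060-0688, discussed in this paper) are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Number of categories per building block, as listed in the text.
CATEGORIES_PER_BLOCK = {
    "pericope": 1 + 6,             # reference + start/end position
    "introductory_phrase": 3 + 6,  # text, translation, remarks + position
    "liturgical_title": 5 + 6,     # text-related fields + position
    "translation": 4,
    "denomination": 1,
}


@dataclass
class Entry:
    """Hypothetical, simplified view of one entry (positions omitted)."""

    entry_id: str
    pericope: str
    introductory_phrase: Optional[str] = None
    liturgical_title: Optional[str] = None
    translation: Optional[str] = None
    denomination: Optional[str] = None


example = Entry(
    entry_id="060-0688",
    pericope="Jb#23:01-12a.(sepwātēh)",
    introductory_phrase="From (the book of) Job the righteous",
    translation="Peshitta",
    denomination="Maronite",
)

total_categories = sum(CATEGORIES_PER_BLOCK.values())
print(total_categories)
```

The per-block counts add up to the 32 categories mentioned above; the remaining categories of the data set are not modelled here.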

4.2. Two Examples
The first example is taken from a lectionary manuscript.
Example 1: entry 060-0688, pericope "Jb#23:01-12a.(sepwātēh)" (cf. Figure 4):
- The pericope is taken from the book of Job (Jb), chapter 23, verses 1 to 11, and the first part of verse 12 up to the Syriac word which is to be translated as "his lips".
- The introductory phrase is "From (the book of) Job the righteous", and spans fol. 271a, column 2, lines 12-13.
- The liturgical title is "Furthermore, the order of the readings for the ascent of our Lord to heaven"; the title spans fol. 270b, col. 1, lines 5-7, and the complete pericope spans fol. 271a, col. 2, line 12 to fol. 271b, col. 1, line 17.
- The denomination is "Maronite". The pericope is found in manuscript 13l7, which is a Maronite lectionary (nr. 60 of the list of manuscripts).
- The pericope is taken from the Peshitta translation.
No further categories are used for describing this particular entry.
The second example is taken from a Bible manuscript.
Example 2: entry 036-0007, pericope "Ex#06:02":
- This pericope starts at Exodus 6:2; the end is not indicated as such in the manuscript.
- It concerns a reading for "The fifth day (i.e., Thursday) of the first week of Fast". The title is found on fol. 9a, column 3, lines 20-21.
- The title is scarcely readable, and this is indicated by Jenner (1993) using a series of abbreviations: 1; gr; bz>r; vr; h2+gr, meaning: the title is preceded by the word qeryānā ("reading") (1); the complete title is effectively erased, though some traces are legible (gr); it is written in two ink colours, a brownish-black ink and a red ink (bz>r); where the ink has accumulated, its edge contains traces of moisture (vr); the title, which spans two lines, is written on the left part of the first line and on a complete second line (h2+gr).
- The pericope is taken from manuscript 8a1 (nr. 36 on the list of manuscripts, see Figure 5). Since the manuscript is found online, a link to the relevant folio is given.
- The denomination for which the manuscript was intended is not clear; in later times it was certainly used in the East Syriac denomination.
No further categories are used for describing this particular entry.
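The two pericope strings in the examples can be taken apart mechanically. The following sketch handles only the two forms that occur in these examples ("Jb#23:01-12a.(sepwātēh)" and "Ex#06:02"); the fuller notation with plus-signs and square brackets is not covered. The function names and the regular expression are assumptions, as is the reading of the first part of an entry number as the manuscript's number on the list, which is inferred from the two examples (060 for nr. 60, 036 for nr. 36).

```python
import re

# book '#' chapter ':' start verse, optionally '-' end verse with a
# part letter, optionally '.(' final word ')'.
PERICOPE = re.compile(
    r"^(?P<book>[^#]+)#(?P<chapter>\d+):(?P<start>\d+)"
    r"(?:-(?P<end>\d+)(?P<part>[a-z])?)?"
    r"(?:\.\((?P<last_word>[^)]+)\))?$"
)


def parse_pericope(ref):
    """Split a simple pericope reference into its components."""
    m = PERICOPE.match(ref)
    if m is None:
        raise ValueError(f"unsupported pericope notation: {ref!r}")
    return {
        "book": m["book"],
        "chapter": int(m["chapter"]),
        "start": int(m["start"]),
        "end": int(m["end"]) if m["end"] else None,
        "part": m["part"],
        "last_word": m["last_word"],
    }


def manuscript_number(entry_id):
    """Return the number before the hyphen, e.g. 60 for '060-0688'."""
    return int(entry_id.split("-")[0])


job = parse_pericope("Jb#23:01-12a.(sepwātēh)")
exodus = parse_pericope("Ex#06:02")
print(job)
print(exodus)
```

For the Exodus entry the `end` field stays empty, which matches the observation that the end of that pericope is not indicated in the manuscript.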

4.3. Further Considerations
A reconstruction of the intended order following the ecclesiastical calendar can only properly be attempted when the data set is connected to the calendar of the ThALES database developed by Stökl Ben Ezra. This database provides a representation of early Jewish and Christian liturgies based on scholarly descriptions, and its website runs a complex reconstructed liturgical calendar. Once established, the link will connect the digitized data to this calendar and will advance future research of the data within the broader Jewish and Christian liturgical context. Two categories have been inserted to facilitate this future link to ThALES.
334 readings from lectionary manuscripts are evidently used for more than one occasion. In those cases, the second occurrence is often only a reference to the first occurrence within the manuscript itself. The Syriac words used for denoting a reference are (variants of) ḥzi ("see" or "look up"), ktib ("as is written"), simā ("as is placed"), hā ("now then" or "behold!"), or bʿai ("look for"), for instance, ktib bdukrānā dasṭpnws pāsoqā ḥrāyā "As is written in the Commemoration of Stephan, the last section". These references do not always point to a well-defined set of readings. Such issues concerning references may be solved when the ThALES liturgical calendar is inserted.
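Entries that merely refer back to an earlier reading could, in principle, be flagged by looking for the reference words listed above. The sketch below is a rough illustration under several assumptions: it works on the transliteration rather than on Syriac script, it tests only the first word of a phrase, and it matches the five listed forms exactly, so the variants mentioned in the text would be missed. The function names are hypothetical.

```python
import unicodedata

# Transliterated reference words as listed in the text.
REFERENCE_WORDS = ("ḥzi", "ktib", "simā", "hā", "bʿai")


def _norm(text):
    """Normalize Unicode composition and case before comparing."""
    return unicodedata.normalize("NFC", text).casefold()


def looks_like_reference(phrase):
    """True if the first word of the phrase is one of the reference words."""
    words = _norm(phrase).split()
    return bool(words) and words[0] in {_norm(w) for w in REFERENCE_WORDS}


flag = looks_like_reference("ktib bdukrānā dasṭpnws pāsoqā ḥrāyā")
print(flag)
```

A hit would only mark a candidate for manual inspection, since the references do not always point to a well-defined set of readings.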
Some pericopes are indicated only by their first and last words (a total of 160 entries, 130 of which are found in ninth-century manuscripts, mostly NT pericopes).
Since the data set is constructed on the basis of the manuscripts themselves, the various codicological features have been given an important place in it. The codicological data presented in Jenner (1993) is included in the data set, together with a register of the abbreviations used (cf. Example 2, above).

Figure 5 Screenshot taken from "Manuscripts.csv" exported to Excel-format. This screenshot shows the description of manuscripts no. 22-52, including Bible manuscripts described by Jenner (no. 7h2-8h4) and the earliest lectionary manuscripts described by Baars (9l1-11l5).
Some peculiarities of the manuscript materials deserve to be mentioned. Titles and some liturgical indications are written in red ink, unlike the text itself, which is written in a blackish or brownish ink most of the time. This change of colour could not be included, since in most cases it proved impossible to establish the colour on the basis of the monochromatic microfilms or photographs.
The data set combines liturgical data from the fifth century to the sixteenth century, including revisions and repairs which are sometimes dated even later. The data stems not only from a large span of time but also from the four main denominations of Syriac Christianity (cf. Tables 1-3).
Whereas lectionaries, as ecclesiastical constructs, are directly linked to a given denomination, it proved impossible, prima facie, to connect Bible manuscripts to a specific denomination, since most of the time colophons or other necessary information are lacking (cf. above, section Materials). The data set may serve as a means to study the liturgical use of these Bible manuscripts, but not as a simple means to establish the milieu for which a manuscript was written. What may be possible, however, is to examine the (inter-)denominational use of these Bible manuscripts by studying the 457 entries which are part of the biblical text separately from the 463 liturgical titles which were in all probability added later and placed in the margins of the manuscript.
The greater part of the liturgical terminology found in lectionary manuscripts is present in Baars' notes and is therefore included in the data set. Some liturgical terms, however, are found neither in Baars' notes nor in the data set. The data is slightly affected by the fact that the description of the lectionary manuscripts in the 1960s and 1970s served the preparation of the critical edition of the OT. Because of this focus, liturgical chants (i.e. a prokeimenon contained in Melkite lectionary manuscripts, or an ʿanāyutā in East Syriac lectionary manuscripts) and references to psalms (i.e. a šurāyā or a māzmorā as found in Melkite or East Syriac lectionary manuscripts) are not included. This also holds true for other liturgical instructions (e.g. the first words or the title of a commonly known prayer or a creed). Work has started on including these instructions in the data set, in order to get a fuller picture of the liturgical settings in which the biblical passages functioned.
Since more and more libraries and institutions are providing high resolution images of manuscripts in their collections, a hyperlink to the digital image of the relevant folio of the manuscript is added to 3,225 entries.
The resulting data set contains:
1. a workbook including the entries;
2. a detailed list of the manuscripts used, including links to digitized manuscript catalogues and/or other relevant secondary sources;
7. Codicology. The information about physical aspects of the manuscripts, and about their origins and usage, will contribute to the re-edition of the List (Peshitta Institute, 1961), which will meet the current standards of codicological information.
8. Scribal culture. A further study of the colophons could result in more knowledge about the environments in which the manuscripts were written. The place names and personal names mentioned in the colophons were only in part given by Baars and could therefore only in part be incorporated into the data set. Further study could establish links to the place names listed in Pelagios or Syriaca, continuing the work that has been done by the ETCBC thanks to a Pelagios Research Development Grant (2018; see van Peursen, 2019).
In due course, a website based on the deposited materials will be established, which could be a starting point to facilitate research into questions and opportunities such as those mentioned above (van Peursen & Veldman, 2018). The connection with the pilot project LinkSyr: Linking Syriac Data (see above, section Materials) may enable further research into the liturgical terms given in the glossary, the frequency of the names of saints in the lectionaries, or personal names in the colophons of manuscripts.
The data contained in the data set are not only an important historical source, covering many centuries, but also part of a living tradition up to the present. Therefore, it is valuable not only for liturgical and Syriac scholars but also for the Syriac communities who nowadays use these centuries-old liturgies.6