In this chapter, I wish to chart the state of digitization in the humanities worldwide by evaluating twenty repositories. Along the way, the reader will learn more about what exactly makes digital manuscripts distinctly digital, how they benefit and sometimes obstruct the user, and how digital manuscripts are to be acquired. I shall close with an eye towards the future, exploring the strengths, weaknesses, opportunities, and threats of digitization of manuscripts.
This ethnography draws from my own field, Islamic studies, since that is what I can speak about with the most certainty. However, many repositories discussed here also hold non-Islamic manuscripts, and those which do not will nonetheless shed light on the variety of digital surrogates you can encounter. Further, this ethnography was made in the summer of 2017. Some of these repositories have changed since then, which is only to be expected in a digital world. Even though important changes have been noted in footnotes, the assessment and the analysis have remained as they were, to be as fair as possible to all repositories. The repositories were selected for the different choices they made in digitizing, storing, and permitting access. Secondary attention was paid to the uniqueness of the libraries’ holdings and their geographical location. The following is therefore not an exhaustive list of the world’s best repositories, but rather a representative sample.
Five are from Europe: the State Library in Berlin, Leiden University Library, the British Library in London, the French National Library in Paris, and the Vatican Library.
Five are from North America: the University of California, Los Angeles Library, Beinecke Library at Yale University in New Haven, Princeton University Library, University of Michigan Library in Ann Arbor, and McGill Library in Montreal.
Five are major digital collections from the MENA region: Suleymaniye Library in Istanbul, Topkapi Palace Museum Library in Istanbul, Malek National Library in Tehran, National Library of Morocco in Rabat, and King Saud University in Riyadh.
The final five are smaller or from other regions: Najah University in Nablus, Jafet Library at the American University of Beirut, Malaya University in Malaysia, the Aboubacar Bin Said and Mamma Haidara Libraries in Timbuktu, and the Institute of Oriental Culture, University of Tokyo.
In the discussion on each repository I shall give a short description and explain how manuscripts are made available digitally. After individual descriptions, I compare the repositories under the ten notions introduced in the previous chapter. These notions can roughly be divided into two groups, one expressing the means of distribution, the other assessing the quality of the photos. To round out this discussion, I will finish this section with a comparison of the word qāla (‘he said’) as it appears in manuscripts from each collection. By keeping as many variables as possible constant, the appearance of this word should give us a visual overview of the range of quality currently on offer.
1 Old Collections in Europe
Libraries around Western Europe have been acquiring Islamic manuscripts for many centuries. An important characteristic of these libraries is that collectors often chose their purchases well, bringing some of the finest, rarest, and generally most interesting manuscripts back to Europe. Digital access to these collections is therefore crucially important.
1.1 Staatsbibliothek zu Berlin
The ‘Stabi’ in Berlin has to date made more than 1,600 manuscripts digitally available online. Some of them come from digitization projects. Others were digitized upon request, following the first-customer-pays principle, in which the person first interested in a manuscript pays for it to be digitized, after which it becomes freely accessible online. The digital surrogates are integrated into the library’s online catalog, which is therefore the principal point of entry, though a special page lets one restrict results to digitized texts only. The online catalog can behave erratically, giving different results for searches in Arabic, in simple transliteration, and in proper transliteration. It is therefore beneficial to also consult the printed catalog, which can be obtained in PDF format. From a catalog entry, one can click to open a viewer. This has most of the functionalities needed for efficiently viewing, browsing, and downloading the digitized manuscript.
1.2 Qatar Digital Library
The online digital holdings of the British Library relevant to Islamic studies are only a fraction of the actual collection, consisting at the time of writing of only a little more than 100 manuscripts. These are consolidated on a dedicated website, hosted in Qatar. Here one can find other types of sources as well, such as archival records, newspapers, and photographs. The digitization seems to have been done on a project basis, with a topical concentration on the Gulf region. The website is sleek and clean and provides several ways to find manuscripts. It is advisable to use both Arabic and Latinized key terms when searching for specific words, as some entries are unevenly cataloged. Clicking on a record immediately takes you to images of the object. From a technical point of view, every image is a separate item and all images of one manuscript are linked. This means in practice that moving from one folio to another is time-consuming. A redeeming feature of this repository is that the portal includes a page with ‘Articles from Our Experts.’ These are small case studies making use of the digitized materials, which are both informative and exemplary.
1.3 Leiden Universiteitsbibliotheek
The famous collection of Islamic manuscripts at Leiden is neither freely nor fully available. The university sold the digital distribution rights of the manuscripts to Brill publishers for commercial exploitation. A day pass is available for individuals. Brill decided to sell digitized manuscripts in batches, so far totaling fewer than 700. The portal seems to have received little thought. There is a long list of all manuscripts, without much description. Then there is a search function which did not perform well when I tested it. The best way to make use of this resource is to go through Jan Just Witkam’s catalog to find a manuscript of interest,1 and then use different search strategies to ascertain whether the manuscript is included in Brill’s modules. The viewer is decent, with options such as rotate, zoom, and download. A big drawback is that it is not indicated which folio you are looking at, or in fact which manuscript it is you are seeing. Given that the collection is behind a paywall, it is odd that the download function gives you inferior quality compared to zooming in within the browser.
1.4 Bibliothèque nationale de France
The BnF has been in the business of digitizing for a long time, through a dedicated portal called Gallica. Islamic manuscripts can be found here too. There are seemingly in excess of 4,000 relevant items. Unfortunately, a large number of the digital surrogates are surrogates of surrogates: they are digitized microfilms. Digitizing microfilms is relatively inexpensive and fast, making it easier to establish a sizable digital repository. However, being two degrees removed from the actual object impairs usability. If, on the other hand, your manuscript of choice is available in full color, you are in luck, as Gallica has a superb portal and an equally good viewer. The search function performs well, and the viewer is very flexible in how you want to browse, look at, and download the manuscript.
1.5 Biblioteca Apostolica Vaticana
On the website of the Vatican Library, about 350 Islamic manuscripts are available, a fair number of them being digitized microfilms. The library has only recently started a digitization project and intends to digitize its entire collection and make it freely available. Currently, you can only find what you are looking for if you know the exact manuscript you want. The website is arranged by collections and in each collection only manuscript numbers are given. Once you click on a number, you see the images of the manuscript. The viewer has most of the desired features, including a precise indication of which manuscript and folio you are looking at. Downloading can only be done one page at a time, and each downloaded page carries a watermark of the Vatican.
2 New Collections in North America
It is estimated that about 33,000 Islamic manuscripts are to be found in North America, virtually all of them at universities.2 Given the financial and technological means available in North America, it should be possible to have all manuscripts digitized, with a union catalog to hold it together. Most libraries are hard at work to make that a reality, and as such the libraries listed here are among the leaders in blazing a trail in digital preservation of Islamic manuscripts.
2.1 Caro Minasian Collection at the University of California, Los Angeles Library
I am including UCLA’s repository for its unwieldy approach and its reliance on outdated technology. Fewer than 600 manuscripts have been digitized, as a capita selecta from the Caro Minasian collection. They are trapped in a catalog with fuzzy metadata. For example, there are six categories to tag a work as ‘Philosophy’, each with different works tagged. It also includes duplicates, e.g. search for “
2.2 Beinecke Rare Book & Manuscript Library
The Beinecke Library, part of Yale University, has a state-of-the-art digitization studio. It has placed its Islamic manuscripts online within the wider digital collections that it curates. This makes it unclear exactly what it offers freely online, though currently the total seems to be fewer than 300 manuscripts, some of them only partially digitized. Searching can only be done using transliteration. The viewer is spartan and a bit of a hassle to work with, though it includes the option to download a page or the entire document.
2.3 Princeton University Library
At Princeton, two repositories have been created for digitized Islamic manuscripts, each with its own portal. One draws from Princeton’s own holdings, the other is thematic, focussed on manuscripts from Yemen. The former was largely done through internal funds. The latter was supported as a project by NEH and DFG grants. Together they offer more than 1,500 manuscripts, the bulk of which are digitized microfilms. The metadata is useful, but not curated or cleaned up, meaning that there are too many categories, which in addition are not listed alphabetically. This is offset by the search function, which is excellent. For example, searching for “20 lines” gives all manuscripts whose text is laid out on 20 lines per page. It is beneficial to search in both Arabic and Latin transliteration, as in rare cases the title is only mentioned in one or the other way. The viewer is simple but efficient, similar to the viewer of Archive.org (see below, under McGill University). A button for downloading the entire document is missing. In the case of the Yemen repository, this is remedied by having a table of contents as a sidebar when examining the manuscript in the viewer. This is a rather unique feature that greatly expedites and enhances the study of these manuscripts.
2.4 University of Michigan Library
The most exemplary digital repository of North America was created at the University of Michigan. More than 1,000 manuscripts were cataloged and digitized in a concerted effort, thanks to an NEH grant. The library decided to leave out a portal and simply let users interact with the catalog, which is integrated into the wider library catalog, and with the repository, which is hosted with HathiTrust. The library catalog is connected with the repository, which is useful as the search function works well in the catalog but is broken within HathiTrust. The viewer has a fair number of options, but lacks a clear indication of which manuscript and folio the user is seeing. Several download options are available. In the cataloging process the library experimented with crowd-sourcing, which did not yield a significant result.4
2.5 Islamic Studies Library, McGill University
The repository at McGill distinguishes itself by focussing on lithographs, including only a few manuscripts. Additionally, the books published by the Tehran branch of the Institute of Islamic Studies at McGill University have also been digitized and are available through the same portal, bringing the total to over 400. The digitization seems to have been done over the course of several years. The portal is hosted at the Internet Archive, a non-profit organization that seeks to archive a variety of things in digital format and make them freely available online.5 The most obvious benefit of this is that this repository has a greater assurance of a continued existence, with hardware and software upgrades along the way. The metadata seems to have been pulled directly from the library catalog. There is relatively rich metadata, yet much of it is non-uniform. For example, not all manuscripts are tagged as manuscripts and there is both an “Islamic philosophy” category and an “Islamic philosophy.” category, the difference being merely the period at the end, as well as nine categories that are all variations on the word “Philosophy.” This will surely leave a user missing relevant texts when looking through one of the categories. Besides finding texts by filtering categories, one can also search by key words. Unfortunately, this search function seems to be unreliable. For example, searching for
3 Major Collections in the MENA Region
In the Middle East and North Africa we can find some of the largest digitization efforts. Especially notable are the efforts of libraries in Istanbul and Tehran to conserve their heritage; almost everything in the centralized collections of these two cities has been digitized or can be digitized upon request. To that we may add the efforts at King Saud University in Riyadh, which has engaged in creating an online repository of unmatched dimensions, and the National Library of Morocco, which is on its way to establishing a sophisticated digital flank to its library system. Other cities with notable collections, such as Cairo, Damascus, Najaf, and Baghdad, are lagging behind. Notwithstanding, the efforts by the four aforementioned cities have been truly transformative for our conduct of research and they merit further attention.
3.1 Süleymaniye Kütüphanesi
Currently, the Suleymaniye Library sits on top of the largest pile of digitized Islamic manuscripts in the world, likely in the range of 100,000. Most collections in Istanbul were centralized in the Suleymaniye and from the early 2000s onwards the library started to digitize basically everything. The size of the collection made this a long process, and this, in turn, caused significant variance in quality across the corpus, as over the years different equipment and technologies were used to create digital surrogates. These photos are not free, nor available online, but can be accessed on computers at Suleymaniye itself, or can be sent upon request. It seems that at the time of writing the library is busy making the photos freely available online. The catalog is notorious for utilizing an inconsistent Turkified Latin transliteration. The library is actively curating its digital catalog and repository, so these shortcomings will hopefully be taken care of in the future.
3.2 Topkapı Sarayı Müzesi Kütüphanesi
I mention Topkapi separately since it operates independently of Suleymaniye and enforces a very strict policy towards digitization. In 2014, I had to go in person and get written permission from the Ministry of Culture and Tourism to even file a request. It turned out my manuscript had not been digitized, but the library was willing to do so, without an additional fee (the per-photo fee can quickly add up, though). A few weeks later I received the files. As such, I include this collection to show that sometimes digitized manuscripts cannot even be consulted on-site, but have to be obtained by expending a lot of time, energy, and money, without even being able to peruse an on-site catalog.
3.3 Ketābkhāna va mūza-ye melli-ye Malek
Malek Library, in Tehran, provides free, online access to digital surrogates of its manuscripts. To date, more than 5,000 manuscripts have been digitized, according to the library’s website. The online catalog is among the most extensive, with ample metadata for virtually every holding. The portal is in Persian only, and allows for detailed searches on every individual element of the metadata. The site can be slow and at times it times out. Not all manuscripts can be viewed, and for the ones that can, the viewer is basic. You get to see one photo and need to use buttons to move to the next or previous page. This much is available for free; anything beyond that requires a log-in.
3.4 al-Maktaba al-waṭaniyya li-l-mamlaka al-maghribiyya
The National Library in Morocco currently has about 100 manuscripts available online. The library has, supposedly, set out to digitize all of its manuscripts, 80,000 in total, to be recorded in a digital catalog. Manuscripts are not the sole focus, as the project includes a variety of other library holdings such as journals and even audio recordings such as audio books. As such, this is a multi-year endeavor, which looks to be a pillar of the library’s long-term strategy. Previously, the library had put large numbers of manuscripts on microfilm, requiring visitors to look at the microfilms rather than the original objects. As these objects are digitized, visitors are now allowed to see these digital surrogates.6 With only a hundred online, free dissemination does not seem to be a priority. Notably, the viewer is built with Microsoft Silverlight, a technology similar to Adobe Flash, meant to deny the possibility of downloading the images. One cannot zoom in very far, and browsing can only be done page by page.
3.5 Jāmiʿa Malik Saʿūd
The King Saud University is in possession of some 11,000 manuscripts. So far, about half of them have been made freely available on a dedicated website, with decent metadata which is also searchable. One does well to navigate the website in Arabic, as the English version seems less reliable. The portal, in general, is solid. The breadth of topics included is commendable, covering virtually all aspects of Islamic civilization. The viewer is too basic, offering no flexibility beyond looking at a page, clicking on it to see a larger version, and viewing the page as a PDF. Furthermore, every image has a big watermark. On certain parts of the internet, files can be found of entire manuscripts of this collection, though the quality of those files may be lower than what is currently found on the official website.
4 Notable Collections in Africa, the Levant, and Asia
Under this header I would like to shed light on the variety of digital repositories that exist beyond the beaten path of the famous collections described above. I chose to divide this up into three regions: Africa, the Levant, and Asia. South America, Oceania, and Antarctica remain completely absent from this chapter, as they are largely without Islamic manuscripts. The three chosen regions, on the other hand, all have a strong history with Islamic culture and therefore have a large number of manuscripts. Unfortunately, all three regions are sorely lacking in digital resources for these manuscripts.
For all three regions I specifically sought local digital repositories, that is, repositories conceptualized and executed at the particular place of a collection. Much more work could and should be done on the collections from these regions. A collaborative approach may be unavoidable, such as that taken by the British Endangered Archives Programme, the German Gerda Henkel Foundation, and the Hill Museum and Manuscript Library (see the Timbuktu repository). From the libraries selected in this chapter, it should become clear that the vast area in between Tehran and Tokyo is lagging behind in disseminating digitized manuscripts. My personal experience with Punjab University Library, in Lahore, Pakistan, speaks to this. It took me nearly half a year of e-mailing and eventually having someone go in person, and then paying a fee, to obtain photos of only mediocre quality. For India, it seems that the organization National Mission for Manuscripts has been industriously digitizing manuscripts. It does not, however, state how these digital surrogates can be obtained, and its website is offline at the time of writing. The Internet Archive makes snapshots of websites and its last record is from September 2016, when the NMM’s website claimed to have digitized close to 3.5 million manuscripts. We can only hope these files will be made public at some point. In the meantime, let us consider the following repositories.
4.1 Jāmiʿa al-najāḥ al-waṭaniyya
Admirably, the university in Nablus, Palestine, makes more than 700 of its manuscripts freely available online. Some basic errors make it hard to use the repository profitably, though. Most notably, no mention is made of a call number, making it difficult to refer to the manuscripts. A large part of the collection consists of digitized microfilms. The catalog includes only the title and author, and clicking on them immediately opens the viewer. Viewing can only be done one page at a time. Clicking on an image opens a much higher quality photo on which the watermark is relatively small.
4.2 Jafet Library
The digital repository at the American University of Beirut is nothing more than a page with titles and clickable covers. Clicking on them takes you to a page with no more information than the title and author—no MSS number is mentioned. The viewer is as simple as it can get, and only allows access to the first and last five pages of a manuscript. Since only 27 manuscripts are listed, it looks more like a pilot project than anything else.
4.3 Aboubacar Bin Said Library and Mamma Haidara Library on vHMML
The collections of Timbuktu made world news when they seemed to become the next target of cultural heritage destruction by rebels, in 2013. As is clear from the repository here under consideration, this did not happen, or at least not completely. SAVAMA-DCI,7 a Timbuktu-based NGO, collaborated with several organizations to transfer the manuscripts to safety and digitize them. The Hill Museum and Manuscript Library, at Saint John’s Abbey and University, has been the primary institution to take on the task of digitization and building a repository. Of the Timbuktu collections, about 1,500 manuscripts are available online, with the promise that possibly the entire collection of several hundred thousand manuscripts will be digitized. They are available through the portal called vHMML. The portal and viewer are modern and flexible from a technical point of view, but can be notably slow in use. Moreover, if you do not search for a specific title, titles are not shown in the list view. Instead, the user only sees the name of the collection, which is rather useless information. This makes casual browsing in hopes of a serendipitous encounter nearly impossible. On the same portal, other relevant collections are also available, most notably from Lebanon and Jerusalem.8 Combined, the digital collection comes to about 5,500 manuscripts. When you combine manuscripts tagged as Arabic, Kurdish, Persian, or Turkish, this number rises to almost 8,500. In short, it seems that vHMML is fast becoming one of the primary destinations for those interested in Islamic manuscripts. Manuscripts can be viewed online, after free registration. Downloading is behind a paywall. Who the actual copyright owner is, and who will receive the money paid, remains unclear. This is, I think, a precarious issue. The West has had a painful history of claiming cultural heritage in foreign countries and shipping it off without compensation. Such issues should be thought out in the digital sphere too.
I am not saying HMML is in the wrong here, but merely wish to note the lack of clarity in its documentation. It should also be noted that other organizations involved in the digitization of the Timbuktu manuscripts are more suspect in this regard. Aluka, a project of a US-based NGO, offers more than 300 manuscripts on JSTOR, behind a paywall.9 Another project, supported by the Gerda Henkel Stiftung and the University of Cape Town, supposedly hosts digital surrogates but restricts access to registered users, and has quietly turned off the registration form.
4.4 MyManuskrip Malaya University
I mentioned earlier that between Tehran and Tokyo, very few repositories are to be found. A fitting example for this claim is that the only free, online repository of Islamic manuscripts I could find is no longer available. The Malaya University in Malaysia used to run a website called MyManuskrip, on which it hosted a significant number of manuscripts, in a variety of languages, Arabic among them. It seems to have been put together reasonably well, that is to say, with some effort scholars could have made profitable use of it. The project was started with much enthusiasm in the late 2000s.10 The plug was pulled somewhere in 2014 or 2015, without warning or explanation. All that is left are archived snapshots of some of its pages. As such it is a sobering warning that on the Internet, what is there one day may be gone the next.
4.5 Daiber Collection Database at the Institute of Oriental Culture, University of Tokyo
Apparently the German scholar Hans Daiber was an avid collector of Islamic manuscripts. His collection ended up in Tokyo, and the institute did a good job providing digital access to more than 500 of them, likely the entire collection. The website which hosts it looks and feels barebones, but it makes up for this in versatility. For example, because of the way this website is put together, it is one of very few collections that allow users to search for specific characteristics, such as the number of lines of text a manuscript has. I also greatly appreciate the distinction between manuscript and text; the database basically treats every manuscript as a collection of texts (majmūʿa) and subsequently lists the different texts contained within, even if only one text fills the entire manuscript. This conceptually separates the codicological description of the object from the philological description of its contents.
5 A Grand Comparison of the Quality of Digital Surrogates
Now that we have come to know various repositories, let us compare them, so that we may better understand the state of the art of digitization of manuscripts worldwide. I shall do this along the ten notions discussed in the previous chapter, after which I will make some final remarks on the general result of our analysis. The notions are: size of the collection; online availability; ability to download; the portal; the viewer; indication of page numbers; image resolution; color balance; lighting; and how the image is cut. For each of them I prepared a pie chart, which is most easily read starting top-right and going clockwise. The slices represent the number of repositories in each grade, running from bad to good, under the specific category. In listing the repositories I have used shortened names which I hope are self-explanatory.
5.1 Size of Collection
Digitization is about scaling up. Digital photos can be stored and transmitted cheaply in large quantities. Maintenance, including migration and upgrading, can be done as easily for a large set of files as it is for a small set. Larger repositories can therefore benefit from a larger total pool of funds-per-file, ensuring a more professional and future-proof curation. In grading I looked in particular at how much has been digitized in comparison to the total collection of actual manuscripts held by the library.
It is good to see the top two categories make up close to half of all the repositories. Of the second category, which only offer a little, we know that the Vatican and the National Library of Morocco intend to make much more digitally available. The national libraries of France and Germany are likewise still expanding their digital holdings. As such, the future is looking bright. Notably, of the middle category we have less expansion to expect, since most of them digitized on a project basis which has already terminated.
5.2 Online
Digitization inevitably means online distribution. Given that classical Islamic studies remains a niche discipline, free distribution is essential. Allowing re-distribution, for example by placing the photos in the public domain, is most welcome too.
I divided the repositories into three groups. The first consists of those that are not online, meaning either that you have to order photos at the library itself, for example on a CD, or that the repository no longer exists, as in the case of Malaya University. The second consists of those for which you have to pay in order to see them online instantly. The third consists of those which are fully and freely accessible online. As becomes clear from the pie chart, we live in a world in which most repositories make their Islamic manuscripts freely available.
5.3 Downloadable
Are the digital surrogates downloadable? This is different from the previous category. For example, Leiden’s manuscripts are behind a paywall, but downloadable, while UCLA’s manuscripts are freely accessible online but not downloadable. For the first category downloading is impossible since the manuscripts are not made directly available online. Libraries which allow online viewing but put in place technology to prevent users from downloading are in the second category. One can still make a screenshot, of course, if all that is needed is a small part. Since this is too labor-intensive to do for even a small, relevant portion of a manuscript, these manuscripts can be said to be undownloadable. Next are those for which a watermark is included in the downloaded photos. A higher category allows users to download one photo at a time. The highest category, then, means that users can download the entire manuscript, and often there will be functionality to download only parts of it.
An important part of this aspect of being downloadable is the restrictions that libraries impose on using the digital surrogates, that is, their policy on copyright. In principle, anything on the web can be captured and independently distributed, and often this is much more easily done than the independent reproduction of paper copies of books and manuscripts. It is therefore good to apprise ourselves beforehand of the permitted use, to forestall any legal problems. It turns out there is a vast difference in how libraries are situating their repositories of digital surrogates of ancient manuscripts. In total, I distinguish six attitudes on redistribution.
The first is to assert that the photos are in the public domain, thereby allowing any type of usage and redistribution. The repositories of the University of Michigan and the Qatar Digital Library do this.
A tier below that is allowing non-commercial use, as long as the source is attributed to the library. The Staatsbibliothek Berlin, Bibliothèque nationale de France, and American University of Beirut assert this. Next is Tokyo, which only allows downloading and transformation for personal use.
Then there are those who under all circumstances require that permission be requested. These are the Vatican, the Leiden-Brill project, the Timbuktu manuscripts on vHMML, and the Suleymaniye and Topkapi libraries. Brill, it should be noted, gives some leeway to download and transform for personal and academic usage.
This last remark may sound vague, but that is nothing compared to the next category. This category comprises libraries that describe the rights and permissions in vague and opaque terms, seemingly on purpose. Since I do not know what these terms mean for us on a practical level, I will not summarize them. It seems that their statements are not written for users, but rather to safeguard themselves against any litigation. These are the repositories of Princeton, the Beinecke at Yale, and McGill.
Lastly, the repositories of Malaya University, Najah in Nablus, King Saud in Riyadh, and UCLA do not give any terms of use. I find it interesting that of the last three categories, it is the commercial enterprise, Brill, which is the most friendly to reuse and gives the best options for contacting them. I therefore mean to exclude Brill when I say that using repositories of the last three categories is, on paper, a risky undertaking. Legal action against perceived misuse of academic materials is becoming more common. The question ‘Will I be sued if I do something with these images?’ deserves a simple answer. To have institutions with deep pockets like Princeton and Beinecke answer with ‘Maybe. Maybe not.’ is discomforting to say the least. Taken as a whole, though, it seems well within virtually any country’s legal code, and within most libraries’ explicit policy, to keep private copies on personal computers for research purposes.11
5.4 Portal
Digitized manuscripts are usually accessed through what I call a portal and a viewer. A portal is a website featuring a catalog, which allows a user to find a particular manuscript. The viewer is a page which allows a user to view the photos of a manuscript, usually with buttons to navigate and zoom. The portal is judged on how well it discloses the digital holdings. Given its importance, I already gave it attention in the description of each repository.
In general, the portal seems to be an afterthought. An extreme example of this is the case of the University of Michigan repository. They first had a dedicated portal and later deleted it, obliging users to find their own way within the larger online library catalog. On the other end of the spectrum we may note portals such as those of the Daiber Collection Database and the Qatar Digital Library, which neatly define their purpose and bring all relevant information together. As I noted before, in the latter case there is even value added by including research articles within the portal to showcase its usability.
5.5 Viewer
The viewer is judged on its flexibility in viewing and navigating the manuscript. It too was highlighted above in the description of the repositories.
A satisfactory grade was given to indicate that a user can generally make use of a repository but will experience inconvenience and will likely have to put in extra effort. Beggars can’t be choosers so I gave it a mildly positive term, but in reality this is not a good grade. Using technology only makes sense insofar as it helps us. When it starts working against us, which is the case for the vast majority, we can only proceed with caution. Given this result, the viewer is in my estimation a point on which all repositories can greatly improve. They may take the Gallica Viewer of the Bibliothèque nationale de France as an example.
5.6 Page Numbers
Is the user able to quickly deduce which folio and which manuscript they are seeing? It turns out, many repositories forget to build in this functionality, while it is crucial for looking something up or citing it. When pencilled folio numbers are absent from the material manuscript, it is arduous and prone to error to count back from the first folio in a digital surrogate. Many repositories give false information, by providing a ‘page number’ which is actually an image number, starting from the very first image which is usually the cover or an index card of the catalog.
Please note that I drew up these answers by taking sample tests for each repository. I discovered in the case of the Bibliothèque nationale de France that page numbering can be correct for certain items but missing or incorrect for others. This may likewise be true for other repositories. I stand by my conclusion, though, that page numbering is among the biggest problems repositories currently have. Notably, none of the repositories include a feature which I think would be highly useful, which is to automatically generate a small stamp in a corner of the image indicating its origin. Such information would ideally consist of library name, manuscript number, and folio number. If this were done, the origin would be hardcoded into the image, making it easier for users to use individual photos or reorganize them on their computer. A similar system could be put in place for the file name when downloading images, which is again an opportunity missed by all repositories. Once origin details are robustly taken care of, scholars will find it much easier to use and refer to digital surrogates in their writings.
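To make the idea of origin details more concrete, the following is a minimal sketch in Python of how an image number could be turned into a folio reference and how a file name could carry library name, manuscript number, and folio number. It is an illustration only, not something any of the repositories offers. The function names and the naming pattern are hypothetical, and the sketch assumes one page per image, no skipped or duplicated images, and a foliation that starts at 1 on the image given as first_recto_image.

```python
import re


def folio_label(image_number, first_recto_image):
    """Turn an image number into a folio reference such as '3a'.

    Assumption: every image shows exactly one page, and the image
    numbered first_recto_image shows the recto of folio 1.
    """
    offset = image_number - first_recto_image
    if offset < 0:
        # Images before the first folio are covers, index cards, etc.
        raise ValueError("image precedes the first folio")
    folio = offset // 2 + 1
    side = "a" if offset % 2 == 0 else "b"
    return f"{folio}{side}"


def origin_file_name(library, shelfmark, folio, ext="jpg"):
    """Build a file name from library, manuscript number, and folio.

    Hypothetical naming pattern: anything that is not an ASCII letter
    or digit is collapsed into a hyphen, so names in other scripts
    would need to be transliterated first.
    """
    parts = [library, shelfmark, f"f{folio}"]
    cleaned = [re.sub(r"[^A-Za-z0-9]+", "-", p).strip("-") for p in parts]
    return "_".join(cleaned) + "." + ext


# Example: image 8 of a surrogate whose folio 1a is image 4.
label = folio_label(8, first_recto_image=4)
print(label)
print(origin_file_name("Leiden", "Or. 137", label))
```

The same three pieces of information could also be stamped into a corner of the image itself. That step would require an imaging library and is left out here.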
5.7 Resolution
Under this header I judge the level of detail of digital surrogates, based largely on the file size in combination with the dimensions of the picture (height and width in pixels). Both should be taken into account because there is invariably some form of compression, and hence loss of quality, involved in the process from making photos to putting them online. I have come to think of 500 KB per page as a desired minimum, meaning 1 MB for a two-page spread. A visual assessment is also informative, as different file formats make it hard to objectively compare all repositories. Later in this chapter we shall take a look at a visual comparison to give an impression of the quality.
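The rule of thumb given here can be written down as a small check. The sketch below, in Python, is only an illustration under stated assumptions. It takes a kilobyte to mean 1,000 bytes, the function names are hypothetical, and the pixel dimensions must be supplied by the user, for instance as reported by an image viewer.

```python
# Assumption: 500 KB is read as 500 * 1000 bytes.
MIN_BYTES_PER_PAGE = 500 * 1000


def meets_minimum(file_size_bytes, pages_in_image=1):
    """Check a file against the desired minimum of 500 KB per page."""
    return file_size_bytes >= MIN_BYTES_PER_PAGE * pages_in_image


def bytes_per_pixel(file_size_bytes, width, height):
    """Relate file size to pixel dimensions.

    A low value for a large picture suggests heavy compression.
    """
    return file_size_bytes / (width * height)


# Example: a two-page spread stored as a 900,000-byte file
# of 3000 by 2000 pixels.
print(meets_minimum(900_000, pages_in_image=2))
print(bytes_per_pixel(900_000, 3000, 2000))
```

Because file formats compress differently, such numbers only support a visual assessment and cannot replace it.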
In my opinion, the vast majority of digital surrogates are fine to be used for various scholarly purposes. Interestingly, this division seems to correspond to a geographical division. In the top category comes Europe. In the second category North-America, in the lower categories the rest of the world. Notable exceptions are Topkapi and Leiden. The former does an excellent job, as far as I have been able to assess. The latter is lagging behind, which is all the more surprising as it is the only for-profit repository.
5.8 Color Balance
In the graphic design industry, people sometimes ask if the blacks are black. They mean that if an object is in reality black, does it display on a computer screen as black or does it seem like dark grey? True-to-color is also an issue for us, as the color of the paper, ink, and binding are part of the materiality of the manuscript and we do well to preserve them truthfully in their digital surrogate. This is what I assess under this heading.
Most repositories do well with this. The difference between the excellent and good categories was made on consistency; the ‘good’ repositories show their manuscripts true to color but have an occasional, slight aberration, whereas the ‘excellent’ repositories are consistent. At this moment, color balance does not need to be a concern for us, for most purposes. One case when we should keep it in mind is when we use digital surrogates from different repositories that have a different color balance quality.
5.9 Lighting
Ideally, photos are made in a well-lit place so that the entire folio is evenly visible. When a single, direct light source is used, it often creates shadows and shines, which can make it difficult to read parts of the text and give an uneven, nervous quality when looking at the entire page.
As with color balance, the majority of repositories do well for lighting. However, in this case the bottom two categories are concerning. For these, six repositories in total, shadows and shines pose a real threat to the readability (and hence usability) of the manuscripts.
5.10 Cut
The borders of a photo crucially separate what remains digitally visible from what is out of sight. I have given a high score if at all times the entire edge of the manuscript is visible, in other words, if a bit of the supporting surface can also be seen. Lower scores are given when a photo is cut tighter. Sometimes the cut is too tight, namely when marginalia are cut off.
At the start of my investigation, the cut was a major concern to me. It is a relief to see that virtually all repositories get this right. In some cases, such as many manuscripts from Suleymaniye, the cut is very tight, but not too tight.
5.11 A Final Rating
We can combine the previous evaluations in one grade, to get a sense of the overall performance. For four categories, they were assigned values from 0 to 4. In case of three categories they received the values 0, 2, or 4, and where there were only two categories they were assigned 0 or 4. This means that repositories falling in a lower category are penalized considerably and this should be taken into account when evaluating the final grade. The scores were transformed to a grade on a 10-point scale and to a letter grade. We get the following list:
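The arithmetic behind such a combined grade can be sketched in Python as follows. This is not the calculation actually used for the list. It is a minimal illustration that assumes the categories of each criterion are spaced evenly between 0 and 4, which reproduces the stated values 0, 2, 4 for three categories and 0, 4 for two, and that assumes the 10-point grade is the total score as a share of the maximum. The function names are hypothetical, and since the conversion to letter grades is not specified it is left out.

```python
from fractions import Fraction


def category_value(rank, n_categories, top=4):
    """Value of the category at 0-based position rank (0 = lowest).

    Assumption: categories are spaced evenly between 0 and top.
    """
    if n_categories < 2:
        raise ValueError("a criterion needs at least two categories")
    if not 0 <= rank < n_categories:
        raise ValueError("rank out of range")
    return Fraction(top * rank, n_categories - 1)


def final_grade(ratings, top=4):
    """Combine (rank, n_categories) pairs into a grade from 0 to 10."""
    if not ratings:
        raise ValueError("no ratings given")
    total = sum(category_value(rank, n, top) for rank, n in ratings)
    maximum = top * len(ratings)
    return float(10 * total / maximum)


# Example with three criteria: the middle of three categories,
# the higher of two categories, and the lowest of three categories.
print(final_grade([(1, 3), (1, 2), (0, 3)]))
```

The example shows the penalty mentioned above: a single criterion in the lowest category costs a third of the attainable points when only three criteria are counted.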
What is immediately visible is that the letter grades make less sense in this case. This is especially so since in letter grade education systems grades are usually arrived at by starting at A and then deducting when necessary, whereas in this evaluation the digital materials have to slowly earn a higher grade by virtue of doing better in a certain regard. It is also important to bear in mind that a low grade does not mean the repository is unusable. It only indicates that it is lacking certain qualities that were included in this evaluation. Comparatively, however, it is highly instructive to notice how far ahead of the rest the Bibliothèque nationale de France and the Staatsbibliothek zu Berlin are. They perform well both in terms of delivery (online availability, portal, viewer, etc.) and in terms of quality (resolution, lighting, etc.). This is true too of the Qatar Digital Library. The other two in the top five, Beinecke and Michigan, earn their place more on their quality than their delivery. Beyond the top five, we find a large group sitting between the grades 7 and 6. Leiden’s repository is perhaps the most surprising. It is well outside the top ten, and barely receives a passing grade. In the bottom part of the list, we may notice that Suleymaniye and Topkapi are punished for the difficulty of accessing their digital surrogates.
6 A Visual Comparison
To support the judgments I made in evaluating the repositories on the quality of their images, I include here a sample that illustrates the difference one can encounter while looking for photos of manuscripts. To make this as objective as possible, I set out to find in each collection a manuscript of about 20 by 15 cm, with about 19–23 lines of text per page, written in a straight naskh. From those manuscripts I selected an image that included the word qāla (‘he said’) which I cut out in a square, and resized each square to be of the same size. I further did my best to obtain the highest quality color image in each case. For some repositories, I had to make concessions. For example, the American University in Beirut has so few manuscripts that I had to settle for a slightly less straight naskh without diacritics (iʿjām). In the case of Timbuktu, all manuscripts were copied in Maghribī script, being slightly more round and using only one dot to indicate a qāf. Further, the repositories of the BnF, Princeton, Nablus, Tokyo, and the Vatican use a lot of microfilms, resulting in black and white images, yet I chose a color image for the comparison. It should also be noted that for the Suleymaniye it is hard to let one image be a representative for the entire collection since there is considerable difference in quality.
Comparison of qāla from different online repositories around the world
7 Difference between Professional and Amateur Photos
Relatively simple consumer electronics can also be used to make photos. On the next page we see on the left a professional photo of MS Leiden Or. 137, f. 3a (Ibn Kammūna, Sharḥ al-Talwīḥāt), and on the right a photo I shot myself, using an iPad.12 At that time, I was not concerned to make precise shots for use in serious editorial work as I merely wanted to document the various manuscripts at Leiden that were of interest to me. The shots I made show the pages at an angle and with heavy shading. Additionally, I did not get all the edges of the page.
The professional photo is much better in many regards. However, when we look at the readability of the two photos, I cannot detect that much difference.
8 The Future of Digital Manuscript Repositories
Digitized manuscripts come in a great variety yet at the same time their digital existence also shows some common traits. The opportunities these traits offer can be described through a SWOT analysis, a tool from economics to measure the strengths, weaknesses, opportunities, and threats of a business. Strengths and weaknesses are positive and negative aspects inherent to a business; opportunities and threats are positive and negative aspects of a business brought about by the environment it is in. In this case I consider the business to be the digitization of Islamic manuscripts.
Among the strengths of digitization of Islamic manuscripts worldwide is the large amount already done. Part of this, perhaps, is because of the presence of large numbers of Islamic manuscripts in countries with loose copyright laws and the availability of cheap labor. Although the quality of these photos varies tremendously, as we have seen, the vast majority is usable. I wish to draw particular attention to the study of majmūʿas.13 Florilegiums are a particularly strong subset of Islamic manuscripts and they have largely been under-cataloged.14 With their digital surrogates more easily accessed, we can peruse, annotate, catalog, and study them much faster. What discoveries will come out of this, and the effect it will have on our methodology and general understanding of Islamic culture, remains to be seen, but I think it is among the most potent possibilities for a breakthrough, enabled by the digitization of Islamic manuscripts. A more obvious breakthrough, already gained, is that editions can now be based much more easily on manuscript witnesses from all corners of the world and it is fairly easy to check existing editions against actual manuscript evidence. I do think, however, that editions based on digitized manuscripts require a description of the quality of the digital surrogates as part of the codicological description.
Weaknesses of digitized Islamic manuscripts pertain, I believe, to the seeming carelessness about ensuring that these digital surrogates are future-proof. I see two aspects to this. Firstly, the quality of some digital photos is lacking, sometimes spurred by a temptation to digitize microfilms. This results in poor surrogates; in fact, when a microfilm is used, we end up with a surrogate of a surrogate. The result may still be usable right now, but we do not know what we will want to do with it in the future. For example, technology is being developed to automatically recognize text within manuscripts. For this to work properly, the quality of the photos is of paramount importance. Imagine if we have good optical character recognition technology, and are able to run this automatically over a large (huge, even) repository. We could engage in entirely new ways with the vast sea of literature that authors from the Islamic civilization produced! Since money is limited, and the number of manuscripts to be digitized is great, we realistically only have one opportunity to digitize a manuscript. By doing it poorly, we will likely hurt ourselves later on.15 A similar weakness inherent to Islamic manuscripts is the relatively poor state of digital availability of catalogs. What good is a digitized manuscript if we cannot find it?16 Further, having good, regularized metadata will open up possibilities of large-scale research. Yet, we noticed that some repositories currently offer an overlapping, incoherent categorization and others describe their holdings in an eclectic, unpredictable transliteration. There is a solution to this, namely digitizing and regularizing printed catalogs, and this is a necessary condition to move the digital work on manuscripts forward. How much of this work ought to be the responsibility of scholars is not clear to me.
Secondly, no matter the quality, most digitization is done on a project basis, not as a program. For long-term viability, the latter would be more desirable, since a program offers continuity in funding, curation, and migration to newer technology.17 When a project is finished, could it be merely waiting to go offline, like the repository at Malaya University, or become outdated and abandoned, like UCLA’s repository? As with the quality of the photos, storage and curation need to be thought of in terms of decades, for otherwise we are simply wasting our precious resources.18
What are some of the opportunities? With IT infrastructure and Internet connections becoming cheaper, the possibility is just around the corner for countries like Turkey, Iran, and India to deposit their digitized manuscripts online, for free. This could make Islamic studies the field with the largest number of digitized manuscripts, in comparison to similar fields such as medieval studies, classics, and sinology. Already, Islamic studies vies with other fields in the renaissance of philology within the humanities generally, as seen for example in the journal Philological Encounters. If in addition we can have more manuscripts digitally cataloged, perhaps connected through a union catalog, other possibilities will open up as well. For example, so far it has been nearly impossible to edit a text as a ‘digital documentary edition,’ in which the reader can turn on and off manuscript witnesses to create particular readings that work for them, instead of relying on the one and only reading an editor offers through a traditional Lachmannian approach.19 A documentary edition requires a broad set of manuscripts with strong metadata concerning their origin. As the methodology is being fine-tuned in other fields, we will soon be able to profit from this approach.
Secondly, if we find a stable way to store these digital surrogates, we could refer to manuscripts more easily in our studies. This is because this evidence would have a status of being semi-published, whereas before we could not expect our readers to have access to the manuscripts in order to fact-check our argumentation. This will require, again, more robust digital cataloging. One aspect much discussed in other fields is ‘interoperability’, meaning agreements and technical frameworks that allow for easy communication between different repositories. The International Image Interoperability Framework is currently the most promising initiative in this regard, as discussed in Chapter Five. Through this framework, one is able to pull up and manipulate images of manuscripts of different libraries, side by side, in real time. It will be a long shot for libraries in the MENA region to join this initiative, but perhaps something similar will be developed, which is hopefully flexible enough that the manuscripts of each library need not be reached in a library-specific way. Additionally, one could hope that this will encourage mirror-hosting agreements between institutions. A fine example of this principle is The Stanford Encyclopedia of Philosophy. This online encyclopedia is developed and hosted at Stanford University, but has agreements with the universities of Sydney (Australia) and Amsterdam (the Netherlands) to run identical copies. This diversification strategy, literally spreading the risk over different continents, would be much welcomed in the world of digitized Islamic manuscripts.
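What interoperability means in practice can be illustrated with the web addresses of the International Image Interoperability Framework. Its Image API prescribes one fixed pattern for requesting an image: server address, identifier, region, size, rotation, quality, and format. The Python sketch below assembles such addresses. It assumes version 2.1 or later of the Image API, in which the size value "max" is available, and the server addresses and identifiers in the example are invented placeholders, not existing repositories.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an image request following the IIIF Image API pattern:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # Slashes and other special characters in the identifier
    # must be percent-encoded.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


def region_box(x, y, width, height):
    """Describe a rectangular region in pixels, as 'x,y,w,h'."""
    return f"{x},{y},{width},{height}"


# Example: the same request sent to two hypothetical libraries,
# asking for a detail of 400 by 400 pixels, scaled to a width of 200.
for base, ident in [
    ("https://iiif.library-a.example/images", "ms-or-137/f3a"),
    ("https://iiif.library-b.example/iiif", "arabe-1234_f012"),
]:
    print(iiif_image_url(base, ident, region=region_box(100, 250, 400, 400),
                         size="200,"))
```

Because every participating library answers the same kind of request, a single viewer can place details from different collections side by side without library-specific code.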
Additionally, there is an opportunity for ameliorating digital surrogates by connecting them to large text corpora. This can be a two-way street. We could make connections to manuscripts from existing texts, and we could make connections from manuscripts to new entities within the text corpus. On a small scale, we are talking about a useful, flexible editing environment, but it could be done on a large scale in which case scholars could travel between texts by means of the digital text corpus and only dip into the manuscript evidence where necessary.
Lastly, let us consider the threats. I imagine these to be in the domain of continuity. As technology changes, our repositories risk being left behind, eventually becoming inaccessible. Additionally, political pressure should be taken into account. Islamic studies is a field with clear political relevance in today’s geopolitical discourse. As repositories seem to rely more often than not on state support, the state can weigh in on the course digitization takes. For example, a fair number of digitization projects in the US relied on grants from the National Endowment for the Humanities, an agency whose budget was threatened with elimination in 2017–2018. It could also be that more power is asserted over repositories. A worrying first sign is that digitized manuscripts requested from Istanbul now come with a watermark, a new practice since 2017.
Whatever happens, it does seem that digitization has become an unstoppable force and more and more students and scholars are seeking out digitized manuscripts, often preferring them over material manuscripts. At the same time libraries are noticeably making it more difficult or even impossible to see the material manuscript, arguing that the digital surrogate will do. A fair evaluation of the digital materiality of digitized manuscripts is therefore crucial, in order to make use of them in an appropriate manner.
In this and the previous chapters, we have sought a better understanding of digitized manuscripts, which are digital documents with a strong relationship with material manuscripts that interact in a complex way with print publications. With this conceptual framework in place, it is now time to see what the essential skills and tools are when using digitized manuscripts. In the next chapter, we begin with this by considering the choice between working in a team or working alone, and by learning how to redraw glyphs and symbols in a more natively digital format.
Witkam, J.J. Inventory of the Oriental Manuscripts in Leiden University Library. 28 vols. Leiden: Ter Lugt Press, 2007.
Some of them are in museums, such as The Walters and the Freer and Sackler Galleries.
On Mac a screenshot can be made with: Shift+Command+3, on Windows and Linux: PrintScreen.
See Kropf, E., Rodgers, J., “Collaboration in Cataloguing: Islamic Manuscripts at Michigan,” pp. 17–29 in MELA Notes 82 (2009).
They state on their website that: “The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format.” “About the Internet Archive”, archive.org.
Hendrickson, J., Adil, S., “A Guide to Arabic Manuscript Libraries in Morocco: Further Developments,” pp. 1–19 in MELA Notes 86 (2013), p. 5.
The full name is L’organisation Non-Gouvernmentale pour la Sauvegarde et la Valorisation des Manuscrits pour la Defense de la Culture Islamique.
HMML carefully avoids labelling Jerusalem as either Palestine or Israel.
Cf. Ryan, D., “Aluka: digitization from Maputo to Timbuktu,” pp. 29–38 in OCLC Systems & Services: International digital library perspectives vol. 26, iss. 1 (2010).
Zaynab, A.N., A. Abrizah, and M.R. Hilmi. “What a Digital Library of Malay Manuscripts Should Support: An Exploratory Needs Analysis.” pp. 275–289 in Libri 59 (2009); Zulkifli, Z. “A Collaborative E-Workspace for Digital Library of Malay Manuscripts.” pp. 368–372 in International Journal of Information and Education Technology 4, no. 4 (2014).
This is admittedly a grey area in which, it seems, the code of law itself is inadequately covering these issues, cf. Besek, J.M., et al., “Digital Preservation and Copyright: An International Study,” pp. 104–111 in The International Journal of Digital Curation, vol. 2, nr. 3 (2008).
I used a 4th generation model, from late 2012, which has a 5MP camera.
Nir Shafir also makes mention of this. Gratien, C., M. Polczyński, and N. Shafir. “Digital Frontiers of Ottoman Studies.” pp. 37–51 in Journal of the Ottoman and Turkish Studies Association 1, no. 1–2 (2014), pp. 40–41.
Some notable exceptions deserve to be mentioned, such as the catalogue dedicated specifically to the contents of florilegiums in Cairo’s Dar al-kutub; Halwaji, A.S. (ed.), Fihris al-makhṭūṭāt al-ʿarabīya bi-dār al-kutub al-miṣrīya: al-majāmīʿ, London: Muʾassasat al-furqān li-l-turāth al-islāmī, 2011.
I agree with Jeanneney in that “haste does a disservice to the initiative.” Jeanneney, J.-N. Google and the Myth of Universal Knowledge: A View from Europe. Translated by T.L. Fagan. Chicago: The University of Chicago Press, 2007, p. 55.
As Weiss states in his analysis of massive digital libraries: “Without metadata available to anchor a digital version to the original object, it could […] be ‘lost at sea.’” Weiss, A. Using Massive Digital Libraries. Chicago: ALA TechSource, 2014, p. 28.
Rieger, O.Y., “Enduring Access to Special Collections: Challenges and Opportunities for Large-Scale Digitization Initiatives,” pp. 11–22 in RBM vol. 11, iss. 1 (2010).
This point is also made in an excellent case study of digitization of African collections, see Krätli, G., “Between Quandary and Squander: A Brief and Biased Inquiry into the Preservation of West African Arabic Manuscripts: The State of the Discipline,” pp. 399–431 in Book History 19 (2016), in particular p. 419.
For a critique on Lachmann’s approach for Islamic texts, see Witkam, J.J., “Establishing the Stemma: Fact or Fiction?”, pp. 88–101 in Manuscripts of the Middle East 3 (1988).