1 Barriers to Overcome
The material culture of ancient Egypt constitutes one of the best preserved and most robust archaeological and linguistic corpora to survive from antiquity, as well as one of the most popular avenues for public engagement with the humanities. Unfortunately, the global scattering of Egyptian artifacts among disparate museums and other institutions limits public access to this important facet of our shared, global heritage. At the same time, the lack of easy access to far-flung collections also hinders academic study and research into the linguistic and archaeological remains of this vitally important ancient culture. These problems have been mitigated partially through the digitization of museum collections over the last few decades. However, none of these efforts have exploited the full potential for interconnectivity that the Internet has to offer. Persistent issues may be summarized as follows:
- Lack of connectivity. Virtually all digitized collections of Egyptian material (currently about 50) exist as isolated websites, with no ability to search across them simultaneously. Google and similar web-crawler search engines are capable of locating only a small proportion of the data that exists within the museums’ databases. Furthermore, most currently digitized museum collections remain unconnected to relevant external resources, such as the UCLA Encyclopedia of Egyptology,3 Giza Archives,4 and others.
- Insufficient metadata. Descriptions of material are frequently too brief or incorrect, leading to difficulties in locating specific and relevant objects within a given collection.
- Lack of standardization. Egyptological terminology standards, such as those provided by Thot thesauri (discussed below), have not yet been implemented widely.
- Data unavailable in common language(s). While search functions and metadata for some digital collections are available in multiple languages, many other museums describe their objects primarily or exclusively in their local language. Given the global distribution of Egyptian material, users might be forced to contend with metadata or even basic search functionality in French,5 German,6 Italian,7 Danish,8 Dutch,9 or other languages. Translated versions into a common language, e.g., English or Arabic, are often not provided or not employed throughout the site. Furthermore, the accuracy of widely available automated translators, such as Google Translate, remains insufficient for reliably rendering technical terminology, object descriptions, and metadata.
Several cultural aggregator sites have been created to address some of these issues, including Europeana,10 Artstor,11 and Google Arts and Culture.12 However, those services still pose difficulties for culturally specific disciplines such as Egyptology, including difficulty finding relevant objects in large datasets spanning multiple cultures; inability to execute full-text searches of the metadata in common languages; aggregator focus on collection highlights rather than complete collections (thus, e.g., Google); and a lack of global focus for some sites (thus, e.g., Europeana). The Global Egyptian Museum attempted to overcome some of these barriers.13 However, that site includes only a relatively small selection of objects (14,975) and has not been updated in over a decade. Furthermore, technological progress—particularly with regard to Artificial Intelligence and “smart,” image-based searching—offers new opportunities for more efficient research.
2 Possibilities of AI
AI developed for computer vision and image search has been proposed as a partial solution to the problems outlined above, specifically those surrounding metadata. However, existing, large-scale AI solutions for image search, such as Google, Microsoft Bing, or the TinEye reverse image search,14 do not work well for specialized disciplines such as Egyptology, because their algorithms for image recognition have not been trained specifically on ancient Egyptian corpora. Similarly, other promising experiments within the digital humanities, such as Replica15 and those of the Bibliothèque nationale de France,16 have not yet created definitive solutions focused on the specific problems and priorities of ancient Egyptian material.
3 The Cleo AI Egyptology Platform
Figure 23.1
Cleo search interface with “Advanced search” tab expanded
To address this deficit, in 2018, Wilbrink launched the beta version of Cleo, the Artificial Intelligence Egyptology platform.17 The platform was developed together with software company Goldmund, Wyldebeast & Wunderliebe. Cleo stands as the first and, to date, only “smart” museum collections aggregator, developed specifically to address the needs of scholars and the public seeking to engage with and explore the vast material culture of ancient Egypt. Cleo features intelligent image search and integrated metadata translation capabilities, with standardized terminology.
Figure 23.2
Sample search results for search term “stela” illustrating World Map and AI search functionality
Cleo was developed originally under the open source Apache License, Version 2.0, as a platform to connect four major international collections of Egyptian artifacts with over 45,000 objects from the online collections of the National Museum of Antiquities (The Netherlands),18 the Brooklyn Museum,19 The Metropolitan Museum of Art,20 and the Walters Art Museum (Baltimore).21 After registering an account, the user can search available objects from all collections, on any device (mobile, tablet, laptop, or desktop), via a multilingual interface, available currently in either English or Dutch (Figure 23.1).22
The researcher can start with a text search or upload an image of a specific object to find similar objects, initiating the AI search capability, discussed below. Results can be filtered, e.g., by “Material” or “Periods,” studied in detail (always with a link to the original collection website), or analyzed on a world map, where objects are plotted based on their original provenance (Figure 23.2). Every search can be expanded by selecting several objects of interest and performing an (additional) image search. In addition, a personal collection of objects can be created for later reference, or descriptions and images can be downloaded immediately (Figure 23.3).
Figure 23.3
Sample object metadata (excerpted here for length) and additional user options for image download and personal collections
The Cleo team standardized the Egyptological terminology using Thot thesauri, a “wide range of multilingual thesauri related to documentary and textual metadata” pertaining to Egyptology,23 in order to improve text search functionality. Most importantly, Cleo incorporates an innovative AI algorithm, created using Keras and TensorFlow and trained specifically on Egyptian corpora, which enables the platform to suggest objects relevant to the initial text or image search query that the user might otherwise have overlooked.24
The Cleo AI algorithm offers several innovative features that permit unparalleled levels of access to linked collections. The AI image search can be used in two ways: (1) Alongside traditional text-based searches, a user might also upload photos, from which the algorithm suggests a typological identification (e.g., “stela,” “scarab”), in conjunction with a selection of similar objects. This classification process is based solely on the AI analysis of the uploaded image. (2) Users can search for similar objects by selecting two or more objects indexed within the platform and clicking “AI search.”
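The second search mode can be pictured as a nearest-neighbour lookup over image feature vectors. The following is a minimal sketch under stated assumptions, not a description of Cleo's actual implementation, which the text does not detail: it assumes that every indexed object already has a numeric feature vector (for instance, one produced by a trained network), averages the vectors of the selected objects, and ranks the remaining objects by cosine similarity. The function names, object identifiers, and toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_objects(index, selected_ids, top_k=3):
    """Rank unselected objects by similarity to the mean vector of the selected ones.

    `index` maps a (hypothetical) object identifier to its feature vector.
    """
    if len(selected_ids) < 2:
        raise ValueError("select two or more objects")
    chosen = [index[obj_id] for obj_id in selected_ids]
    dim = len(chosen[0])
    # The query is the element-wise mean of the selected objects' vectors.
    query = [sum(vec[d] for vec in chosen) / len(chosen) for d in range(dim)]
    scored = [
        (cosine(query, vec), obj_id)
        for obj_id, vec in index.items()
        if obj_id not in selected_ids
    ]
    scored.sort(reverse=True)
    return [obj_id for _, obj_id in scored[:top_k]]


# Toy three-dimensional vectors, invented purely for illustration.
TOY_INDEX = {
    "stela_a": [0.9, 0.1, 0.0],
    "stela_b": [0.8, 0.2, 0.1],
    "stela_c": [0.85, 0.15, 0.05],
    "scarab_a": [0.1, 0.9, 0.2],
    "shabti_a": [0.0, 0.2, 0.9],
}
```

Selecting `stela_a` and `stela_b` in this toy index ranks `stela_c` first, because its vector lies closest to the mean of the two selected objects. Averaging is only one possible way to combine several selections; a production system might instead query with each object separately and merge the results.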
The results that the platform returns are based on both object metadata and all available images. At present, the algorithm recognizes 23 types or groups of objects, consisting of at least 400 objects per group. In addition to the AI image search, users might search entire texts using translations, available presently in English and Dutch. This process consists of translating Egyptological words and concepts by means of Thot thesauri, in conjunction with automatic translation of the remaining texts by means of the Google Translation API. These translations are contained within the Cleo platform itself, where they may be checked for accuracy and sense by native speakers and Subject Matter Experts (SMEs), improving significantly the quality of the results.
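The two-stage translation described here, controlled vocabulary first and machine translation for the remainder, might be sketched as follows. This is an illustration under assumptions rather than Cleo's code: the three Dutch-to-English entries are invented samples, not actual Thot data, and `machine_translate` is a stand-in stub, not a call to the Google API. The sketch also assumes that the placeholder tokens pass through the external translator unchanged, which a real service would not guarantee.

```python
import re

# Hypothetical Dutch-to-English sample entries; real Thot thesauri are far larger.
THESAURUS_NL_EN = {
    "scarabee": "scarab",
    "dodenboek": "Book of the Dead",
    "oesjabti": "shabti",
}


def machine_translate(text):
    """Stand-in for an external translation service (assumption: no real API call)."""
    return text


def translate_metadata(text, thesaurus=THESAURUS_NL_EN, fallback=machine_translate):
    """Translate controlled terms via the thesaurus; send everything else to the fallback."""
    if not thesaurus:
        return fallback(text)
    protected = {}

    def protect(match):
        # Swap each controlled term for a placeholder so the fallback leaves it alone.
        token = f"__TERM{len(protected)}__"
        protected[token] = thesaurus[match.group(0).lower()]
        return token

    # Longest terms first, so multi-word entries win over their substrings.
    terms = sorted(thesaurus, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    masked = pattern.sub(protect, text)
    translated = fallback(masked)
    # Restore the standardized terms in place of the placeholders.
    for token, term in protected.items():
        translated = translated.replace(token, term)
    return translated
```

Protecting the specialist terms before automatic translation keeps a generic translator from rendering them inconsistently, which is the motivation the text gives for combining the two sources.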
4 History
Wilbrink presented her initial ideas for an integrated AI Egyptology search platform at the International Congress of Egyptologists (ICE) in 2015, in Florence, Italy. She created a first prototype in collaboration with AI company SynerScope, Google Cloud, and the National Museum of Antiquities in the Netherlands. The platform’s prototype was unveiled the next year at the sixty-seventh annual meeting of the American Research Center in Egypt, held in Atlanta, Georgia, in 2016, followed by a press conference at the Netherlands’ National Museum of Antiquities, in 2017.25 Additional presentations of the Cleo prototype were held at the Digital Humanities Benelux conference in Utrecht, Netherlands (2017), and the Dutch-Flemish Egyptology Day in Nijmegen, Netherlands (2017). A design sprint was executed with a team of international Egyptologists and AI experts in Leiden (2017), facilitated by the University of Leiden, the National Museum of Antiquities, and Google. Wilbrink founded Aincient in order to create the Cleo platform, which was supported by grants and investments from the SIDN fund (Netherlands), Google, and the National Museum of the Netherlands.
The official launch of Cleo’s pilot phase was presented at the meeting of the International Committee for Egyptology and International Council of Museums (CIPEG-ICOM), in Swansea, UK (2018); followed by a poster presentation at the annual meeting of the American Schools of Oriental Research (ASOR), in Denver, CO (2018); a presentation at the conference of the Knowledge Institute for Culture and Digitization (DEN), in Rotterdam (2019); and presentations at the XIIth International Congress of Egyptologists (ICE) and at the CIPEG-ICOM meeting at the British Council, both in Cairo (2019).
Since the launch of Cleo’s beta version in September 2018, more than 4,800 individuals have visited the site, with more than 1,000 registered users. At least three universities in the US, Canada, and the Netherlands have so far incorporated Cleo into their curricula for Egyptology. The source code for Cleo’s first phase has been made freely available under the Apache License, Version 2.0, via GitHub.26
In March of 2019, Wilbrink presented the Cleo platform at the conference on “Ancient Egypt and New Technology,” held at Indiana University, Bloomington. During the Q&A session, representatives from numerous national and international museums, housing Egyptian collections of all sizes, expressed interest in making their data available through Cleo. The overwhelming consensus that emerged in discussions was that Wilbrink’s platform, with its intelligent and multivariate search capabilities, represented the best solution to integrate disparate museum collections. In these discussions, Wilbrink emphasized the urgent need for institutional partnership, as a means of sustaining the Cleo platform and increasing its footprint, through the integration of additional museum collections.
Also present at this discussion was Joshua Roberson (University of Memphis), who proposed a collaboration with the Department of Art, Institute of Egyptian Art & Archaeology, and
5 Expansion of Cleo
Expansion of Cleo’s existing dataset will be accomplished through the addition of new collections of Egyptian artifacts, their images, and metadata. These will be added using the methods developed and tested already on the 45,000+ objects incorporated into Cleo’s beta phase. This process may be summarized briefly as: downloading the images and metadata using either a museum-developed API or a data dump; standardizing the data using Thot thesauri; and automatically translating the metadata into a common language.
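The standardization step in this workflow can be illustrated with a short sketch. It assumes, hypothetically, that each museum delivers records under its own field names, and that a small controlled vocabulary stands in for the Thot thesauri. The field maps, vocabulary entries, and record contents are invented for illustration and do not reflect any museum's actual API or data dump. Downloading and translation are left out.

```python
from dataclasses import dataclass


@dataclass
class ObjectRecord:
    """Shared schema for an aggregated object (hypothetical field selection)."""
    source: str
    object_id: str
    title: str
    material: str
    image_url: str


# Hypothetical mapping from each museum's field names to the shared schema.
FIELD_MAPS = {
    "museum_a": {"id": "object_id", "name": "title", "medium": "material", "img": "image_url"},
    "museum_b": {
        "inventarisnummer": "object_id",
        "titel": "title",
        "materiaal": "material",
        "afbeelding": "image_url",
    },
}

# Hypothetical controlled vocabulary standing in for a thesaurus lookup.
MATERIAL_TERMS = {
    "kalksteen": "limestone",
    "lime stone": "limestone",
    "fayence": "faience",
}


def normalize(source, raw):
    """Map one raw museum record onto the shared schema and standardize its material."""
    mapping = FIELD_MAPS[source]
    fields = {
        target: str(raw.get(original, "")).strip()
        for original, target in mapping.items()
    }
    material = fields["material"].lower()
    # Unknown materials are kept (lower-cased) rather than dropped.
    fields["material"] = MATERIAL_TERMS.get(material, material)
    return ObjectRecord(source=source, **fields)
```

Keeping the per-museum field maps as data rather than code means that adding a new collection would, under these assumptions, require only a new mapping entry and any additions to the vocabulary.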
The present (beta) incarnation of Cleo stands already as the largest and most sophisticated aggregator of Egyptian artifacts currently online. For Cleo’s expansion, nine international museum collections and five online Egyptological platforms have expressed their intent to share data on Cleo, which would add at least 100,000 objects to the platform. The selection of new museum partners has been based mainly upon the collections data provided by the Egyptological museum search tool.27 The expansion, which will include many thousands of new exemplars for existing object classes as well as entirely new classes of objects, will permit the software engineers to train the AI image recognition algorithm to recognize objects more effectively. Furthermore, the introduction of new and more varied metadata will permit our Egyptologists and language specialists to refine the platform’s translation capabilities on a variety of collections, both large and small.
Alongside the collections of artifacts, the next phase of Cleo will also expand the sorts of information that Cleo might return for a given object query, to include more robust linguistic information in the case of inscribed artifacts, as well as secondary literature, where available, and other data as aggregated by relevant third-party platforms with an Egyptian focus, such as the UCLA Encyclopedia of Egyptology, Trismegistos,28 and others.
Improvement of the AI image search will be achieved on two separate fronts. Firstly, the existing algorithm will be extended by searching the input space of the model for better features, as well as searching the topology space of the neural network model itself to arrive at a more performant system. Those results can be quantified using standard measures (precision, recall, and F1 scores). The second improvement will involve extracting salient details from images and artifacts using machine vision techniques and training a separate, clustering-based model. The result will be a system capable of finding both direct and subtle connections between data points.
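For readers unfamiliar with the standard measures mentioned above, the following sketch computes precision, recall, and the F1 score for a single object class from lists of true and predicted labels. The labels are invented for illustration and are not Cleo evaluation data.

```python
def precision_recall_f1(true_labels, predicted_labels, target):
    """Return (precision, recall, F1) for one class, given parallel label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == target and p == target)
    false_pos = sum(1 for t, p in pairs if t != target and p == target)
    false_neg = sum(1 for t, p in pairs if t == target and p != target)
    # Precision: of the objects labelled as the target, how many really are.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of the real target objects, how many were found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example: three stelae and three scarabs, with one stela misclassified.
TRUE = ["stela", "stela", "stela", "scarab", "scarab", "scarab"]
PREDICTED = ["stela", "stela", "scarab", "scarab", "scarab", "scarab"]
```

In this invented example, the class "stela" has a precision of 1.0 (nothing was wrongly labelled a stela) and a recall of 2/3 (one stela was missed), giving an F1 score of 0.8.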
One benefit to this improvement in the AI will be significantly increased accuracy of search results from user-generated object photos, comparable to results from scanned or professional images. Of course, any changes to the system’s back end must carefully consider usability on the front end, ensuring that both common and novel processes readily suggest their presence and operational parameters. User experience (UX) testing will accomplish this through a series of use case time trials, efficiency comparisons, and affective analyses within a range of participants from likely user populations (researchers, educators, students, public).
Cleo’s translation abilities currently offer options for translation of metadata into English, the most widely used language for academic research and public humanities programming, and Dutch, the mother tongue of Cleo’s creator. The goal for Cleo’s translation capabilities is to include Modern Standard Arabic (MSA). This is a vital addition to the platform’s functionality, as Arabic is now the fourth most popular global language.29 In addition, and perhaps most importantly, the inclusion of Arabic will facilitate access to the artifacts by the Egyptian public, for whom these collections—scattered globally, far beyond their original home—represent a vital cultural inheritance. Warfare, colonialism, and unregulated archaeological excavation facilitated the industrial-scale export of ancient Egyptian material culture prior to the 1970 UNESCO Convention. As a result, many modern Egyptians have no means or opportunity to engage with artifacts held in foreign collections. By expanding Cleo’s interface to accommodate MSA, the platform will help to fulfill an ethical obligation that museums and other cultural institutions around the globe owe to the people whose past has been mined for foreign benefit.
6 Long-Term Benefits to Research, Education, and Public Programming
At the recent conference in Bloomington, numerous attendees—representing Egyptologists and digital humanists from leading international institutions and all levels of instruction—expressed their enthusiastic support for the Cleo platform as a tool for research, education, and teaching. This enthusiasm reflects several major, long-term benefits to academic research, education, and public programming in the humanities, which the large-scale implementation of Cleo promises to fulfill. Above all, there is currently no other solution for searching object classes and comparanda across multiple museum collections, with the purpose-built capability to recognize Egyptian artifacts. Thus, anyone seeking information concerning specific objects or more general object types would benefit directly from this innovative platform.
User testing during the beta phase has included Egyptologists and other academics, graduate students, museum professionals, and interested members of the public. User feedback during this pilot period has been overwhelmingly positive, with particular enthusiasm expressed for the use of Cleo in the classroom, common language searching, and location of objects via AI search. Our intention is that, as Cleo becomes more widely recognized during its implementation phase, users will expand to include teachers and students at every level of education, given the prominent place that Egypt holds in curricula nationally and internationally, at the primary, secondary, and college/university levels. To better reach this wider audience, the Cleo team will solicit users from among educators at each of these levels, to inform them of Cleo’s features and possibilities, so that they might incorporate it directly into their curricula and teaching. In fact, any program, from elementary education through graduate training, that focuses on Egypt, the Near East, or the ancient world in general stands to benefit from Cleo’s integrated search features and intelligent image recognition algorithms.
In addition, there is a clear long-term benefit to museum professionals of all sorts, working with collections that might include an Egyptian component. Cleo will allow researchers who might not have direct access to the narrow expertise of an Egyptological specialist to bridge the gap between their local institution and the vast collections and metadata of the largest museums in the world, alongside the no less important small collections, whose pieces might be overlooked in a conventional search of so-called “masterpieces.” Thus, the expansion of the Cleo dataset to include numerous Egyptian collections, diverse secondary literature aggregators, and similar resources will establish Cleo as the premier digital reference for image- and text-based research on Egyptian artifacts, comparable in scale and importance to such services as the UCLA Encyclopedia of Egyptology30 and the Online Egyptological Bibliography (Oxford).31
As a final note regarding the long-term benefits of Cleo, it is critical to look beyond the Egyptian material to the broader fields of language and literature, (art) history, religion, and similar areas of humanities research and public programming, which might incorporate image- and text-based object searches. In fact, any institution utilizing image- and object-based data for research or public programming stands to benefit tremendously from the intelligent search capabilities that Cleo offers. Because the platform’s code has been and will continue to be made available as open source via GitHub, other developers can build upon Cleo’s foundations to address their own collection-specific needs. This broader humanities significance is evident already in requests from numerous users to adapt the Cleo code for other, presently unlinked museum websites, as well as requests to extend the platform to include other ancient cultures (e.g., Greek, Roman, Indonesian).32 In fact, approximately 80 percent of Cleo’s code could be adapted with relatively little modification to accommodate the written and material culture of virtually any ancient civilization.
7 Closing Thoughts: Sustainability
Financial sustainability forms a major challenge for any online platform, in digital humanities or otherwise. At the March 2019 conference on Ancient Egypt and New Technology, several scholars raised the issue of financial sustainability for platforms like Cleo, highlighting lessons learned from other online initiatives, such as Trismegistos, Giza Archives, and the UCLA Encyclopedia of Egyptology. The consensus was that those projects weathered their initial years almost entirely through grants and other external subventions but struggled subsequently with the issue of self-sustainability. By contrast, one of the very few online projects within Egyptology to become self-sustaining is the Online Egyptological Bibliography (OEB), the survival of which depends entirely upon institutional and private annual subscriptions.
Learning from these examples, the Cleo team has opted to employ a “freemium” model, whereby some features or functionalities are available at no cost, with others available at a subscription rate. The challenges, as well as the clear benefits, of such a model are now well known. In brief, any premium feature must be sufficiently enticing, and clearly defined, to encourage subscription. At the same time, the Creative Commons licenses that govern the use of museums’ metadata typically require that all shared materials remain freely available to all users. In order to navigate between these two requirements, Cleo’s approach resembles, and will continue to resemble, the successful freemium model provided by Dropbox, whereby free accounts retain access to all features, subject to monthly data limits, while premium accounts are given access to more data per month.
Along similar lines, the Cleo solution will be to permit anyone to register a free account, granting full access to Cleo’s search functionality but with a limited number of searches per month. “Power users,” professional researchers, etc., will have the option to upgrade their free account to an individual subscription that provides additional searches. In addition, a higher, institutional subscription rate will permit unlimited searches for an unlimited number of users logging in from their institutional email. The income from this model will ultimately help to defray or eliminate the already modest cost of long-term hosting and maintenance, while the ability to retain a free account will fulfill the project’s philosophical commitment to open access for the public.
e.g., the Louvre:
e.g., Staatliche Museen zu Berlin:
e.g., Archaeological Museum of Bologna:
e.g., National Museum of Denmark:
e.g., National Museum of Antiquities, the Netherlands:
Credits for all object photos illustrated at figs. 1–3 may be found on the Cleo website.
For detailed discussion of the programming workflow, see
For a similar initiative being developed for Classical artifacts, compare Athena’s Repository, (