History and Contents of the Dutch Theatre Production Database

The Dutch Theatre Production Database is a historical database with data on theatre and dance productions in the Netherlands by professional Dutch and foreign producers and companies. The database contains data on practically all productions from 1940 up to now: over one hundred thousand productions that can be freely searched using either the Webopac (http://theatercollectie.uva.nl/search/advanced) or the website Theaterencyclopedie (https://theaterencyclopedie.nl). Since 2013 the database has been managed by the Department of Special Collections of the Library of the University of Amsterdam. It is the only source that offers an overview of the complete theatre programme in the Netherlands. The origins of its development lie in the card catalogue of the library of the Toneelmuseum of the 1960s. The database itself is now nearly three decades old and has grown much larger than the first compilers could ever have imagined. Until now a history of its development has been lacking. This paper offers a reconstruction of the history of the creation and development of the data and a description of the protocols and input system. It clarifies which sources have fed the database and which systems and productions have been fed by the database. With this paper, the verifiability and representativeness of the data can be assessed more effectively.


Introduction
The database 'Productions' of the Theater Instituut Nederland (hereafter TIN)[1] is a historical database with data on theatre and dance productions in the Netherlands by professional Dutch and foreign producers and companies. The database, which runs on Adlib software, contains data on practically all productions from 1940 up to now[2] and is updated daily. Since 2013 the database has been managed by the University of Amsterdam, at the Department of Special Collections of the University Library. Over one hundred thousand productions can be searched freely using either the Webopac or the website Theaterencyclopedie ('Theatre encyclopedia'). Adlib also offers access through its API. A sample of the data is offered in a repository alongside this paper. In this file, the records of all productions from 1981 until 2017 are presented in XML format.
From 1952, the TIN registered the data of all professional theatrical performances in the Netherlands as a service to the professional field. From 1982 these data were digitally processed, and since March 2004 the database has also been available online through the so-called Webopac (Jaarverslag, 2003). This Adlib database is open access and a unique source for both qualitative and quantitative research. The database serves both as a means of access to the heritage and multimedia centre of the TIN and as a source for other websites, like Theaterencyclopedie[3] and Theaterkrant. From 2007 the TIN described the database not only as an information product but also as a showpiece in itself (van Keulen, van Zadelhoff, & de Waal, 2010). Data helps researchers find trends in curricula of individual
[1] Since the discontinuation of the Theater Instituut Nederland (formerly the Nthi and Toneelmuseum) in 2013, the TIN collection is accommodated at the University of Amsterdam.
[2] The earliest theatre production dates from 1751 and is a double programme of the Stads-

History
The history of the development of the database starts in the 1960s with the library (later multimedia centre) of the Toneelmuseum at Herengracht 168. In this new documentation centre of the theatre field, press reports, information and brochures of current and past seasons were kept and made accessible. The annual issue of Theaterjaarboek,[5] a compiled and indexed overview of all productions on Dutch stages, became the main focus of data collection. Over the years a Theaterjaarboek form was introduced for producers to fill out; each month hundreds of letters were sent to producers asking for data and materials.
From 1981 these seasonal data were no longer added to the hand-written index cards of the library catalogue but entered into a database that was developed by David Computer Systemen: the Theater Databank.[6] Performance data of the departments Internationaal Theatre Institute, Theater Klank en Beeld, and Theateramusement were combined and made accessible through a combined reference system. This meant a huge increase in efficiency for producing the Theaterjaarboek; in 1986 the information could be delivered to the printer on floppy disks (Nederlands Theater Instituut, 1987). Furthermore, the data were made accessible for users of the multimedia centre through a printout on continuous form paper.[7] In 1991 these data were converted into a specifically modified version of the library catalogue programme Adlib. From 1987 data on productions by the later merger partners Nederlands Instituut voor de Dans and Nederlands Mime Centrum were recorded as well (Nederlands Theater Instituut, 1987).
research data journal for the humanities and social sciences 5 (2020) 13-26
The number of productions has increased tremendously over the years (see illustration 1), with nearly two thousand performances a year up to the peak year of 2009. The break in this trend after 2009 could be caused by the severe cuts in funding across the cultural field, by the change in data gathering around this period, or by both. Data on all genres of performing arts productions are entered in the database: theatre, dance, puppet shows, cabaret, music theatre, youth theatre, mime, opera, magic shows, circus, and funfairs. The conditions are that a production is made by, or with the cooperation of, professional producers and artists and is staged in the Netherlands.

Description of the Data
Of these productions, the following data are entered: a unique production code;[8] the (original) title; working titles and umbrella titles for the production; producer(s)/company(-ies); date, theatre or location of the premiere; credits (the persons and organisations that have been artistically involved in the realisation of the production, either visibly or invisibly); functions; roles; discipline; festival information, and available materials or links to database descriptions thereof and management data.[9] The names of persons come from a controlled list where pseudonyms and alternative spellings are taken into account. This list originates in the persons database from the Theatre Institute, which also functioned as its CRM. The same goes more or less for the venues list. Performance dates and venues after the premiere are not entered, but a reference indicates in which season the show has been performed.
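The structure of the production code, as quoted from Invoerhandleiding producties (2012) in the footnote to the production code, can be illustrated with a small parser. This is a minimal sketch, not the entry system's own validation: the function and field names are hypothetical, and it assumes that digits 5-9 hold the producer code left-padded with zeros, which the quoted instruction suggests but does not state outright.

```python
import re
from typing import NamedTuple


class ProductionCode(NamedTuple):
    season: str          # season of the premiere, e.g. "2013/2014"
    producer_field: str  # digits 5-9 exactly as written, e.g. "01263"
    producer_code: str   # producer field without leading zeros (assumption)
    sequence: int        # running number of the producer's productions in that season


# Assumed shape: four season digits, five producer digits, a dot, three sequence digits.
_CODE = re.compile(r"(\d{4})(\d{5})\.(\d{3})")


def parse_production_code(code: str) -> ProductionCode:
    """Split a production code such as '201301263.008' into its parts."""
    match = _CODE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a production code of the assumed shape: {code!r}")
    year, producer_field, sequence = match.groups()
    start = int(year)
    return ProductionCode(
        season=f"{start}/{start + 1}",
        producer_field=producer_field,
        producer_code=producer_field.lstrip("0") or "0",
        sequence=int(sequence),
    )
```

Applied to the example from the instruction, `parse_production_code("201301263.008")` yields the season 2013/2014, the producer field 01263 and the sequence number 8. Codes that deviate from the assumed shape raise an error rather than being guessed at.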
In the descriptions of the genre or the discipline of the production, the classification given by the makers themselves is followed. The opening-up (on subject, disciplines, references and genres) is limited to a few references, and the system works with fixed controlled references, locations and names. The criteria and definitions are not static and have been re-adjusted several times in the past decades, moving with the developments in the theatre field. For example, the funding structure has been decentralised, new educational institutes are established, and the concept of professionalism changes. At this moment the criteria for recording are:
- The production is issued by a company or producer that is subsidised by OCW or supported by the Fonds Podium Kunsten.
- Or: The majority of the performers has been professionally trained at and graduated from one of the state-recognised schools for theatre, musical, mime or dance.
- Or: Performing theatre or dance is the main profession of the majority of the performers.
- Or: The production will be played more than twenty times (playlist) as a publicly accessible performance in the next quarter.
- Or: The production will be played at more than two large subsidised festivals and/or the Theaterfestival.
- Or: The production is a final project produced by one of the state-recognised schools for theatre, musical, mime or dance.
[8] Invoerhandleiding producties (2012, p. 2) explains the structure of the production code: "Digits 1-4 are for the season in which the production premieres. For example, a show that premieres in 2013/2014 starts with 2013, e.g. 201301263.008. The digit following the season is always 0. Digits 5-9 are reserved for the producer code; most producer codes consist of two, three or four digits followed by a dot, e.g. 201301263.008. The dot is followed by three digits; these refer to the number of entered productions by one producer within one season, e.g. 201301263.008."
[9] Production code, title, producer, premiere date, day of premiere and discipline are compulsory entries. Producer, function, day of premiere, discipline, audience and personal names are coupled to fixed domains, as well as the references. Since 2014 the different roles staff members can fulfil within a production are taken from the thesauruses of the schools. Titles and roles are optional entry fields. All records get the same production record type, in order to distinguish them from other records in the database, like those of the library and heritage catalogue of the address database (Invoerhandleiding producties, 2012).
Productions that consist mainly of singing, with only a small part of the performance devoted to acting or dancing (e.g. concertante performances of opera, theatre concerts, flamenco shows, sing-along concerts, etc.), and performances in which no acting takes place (like narrative performances or recitations) are not included. Community arts projects and amateur performances, however, are recorded if there is a high number of reruns, if they are shown at leading festivals, if they are supported by government funding, or if the staff members are qualified volunteers.

Reference Field
The reference field is used for several purposes and contains heterogeneous information, which leads to semantic issues. References give information on the period of performance, the country of origin, or the genre. Data in this field cannot be easily shared, which has consequences for the whole database as a source.
The season in which a production was staged in the Netherlands is indicated with the reference "season-/-" (there can be more than one if a production has been prolonged). A production that premiered in November 1960, for instance, is tagged with "seizoen 1960/1961", and if it is reprised five years later the tag "seizoen 1965/1966" is added. This is indicated for all performances. Now and then a more specific description than just the discipline is entered, e.g. modern dance, experimental theatre or youth theatre. Disciplines that are recognised by the theatre field, like youth theatre or experimental theatre, are not a separate discipline in the database, but a reference. Dance, cabaret and theatre for youths can thus be classified. In order to prevent proliferation, the different disciplines and references have been standardised and limited by the TIN since 1990 (see illustration 2; these instructions and definitions are added to the repository).
Figure 2: List of references from the import instruction.
Hybrid performances and unlikely genres can collide with this fixed subdivision; for example, the reference musical belongs to the discipline "form of amusement" and not to "music theatre".
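The season reference described in this section can be derived from a performance date. The following is a minimal sketch with a hypothetical function name. It assumes that a season runs from 1 August to 31 July; the use of August 1 as a default premiere date points in that direction, but the boundary is not given here as a formal rule.

```python
from datetime import date

# Assumption: a theatre season starts on 1 August and ends on 31 July.
SEASON_START_MONTH = 8


def season_reference(performance_date: date) -> str:
    """Return a reference of the form 'seizoen 1960/1961' for a given date."""
    if performance_date.month >= SEASON_START_MONTH:
        start = performance_date.year
    else:
        # January to July still belong to the season that began the previous year.
        start = performance_date.year - 1
    return f"seizoen {start}/{start + 1}"
```

Under this assumption a premiere in November 1960 is tagged "seizoen 1960/1961", matching the example in the text, and a reprise in the spring of 1966 receives "seizoen 1965/1966".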

Description of Entry System and Control Protocols
Until 2008 data for current productions were obtained through the aforementioned Theaterjaarboek form. In 2008 the Jaarboek's editorial set-up changed, and a paper translation of the database was no longer used. The data were entered by a team on the basis of programmes, mailings, playlists, theatre schedules and, later on, websites.[10] When the online availability of data increased, productions were first identified through the websites of producers instead of the printed season brochures. As a result, the paper trail and the underlying material that used to accompany an entry are lacking for a large number of performances.
With the switch to the new standard version of Adlib (2008), entry instructions, directions and tables were further standardised and specified. The entire persons and theatres database was checked, and pseudonyms and different spellings were merged.
In 2012 the TIN was abolished because of a strict austerity policy by the national government, which hit arts and culture hard. There was not enough manpower available to process the data. After a test with online input by companies was cancelled in 2013, a stable cooperative partner was found in 2016 in the theatre review website Theaterkrant. This website submitted its premiere database, after which those data were checked by an information specialist at Special Collections.

Premiere or Production Database?
In the database only members of the first casts are entered, that is to say the members that perform at the premiere. Information on replacements, understudies, swings and second casts is either not entered or not kept up to date.
Although the TIN itself called it the premiere database, this moniker is not without problems. For instance, in the past some companies had premieres in The Hague, Rotterdam and Amsterdam, but only the first premiere was entered. Other producers, on the other hand, never hold a premiere, to indicate that their performance is a work in progress. In those cases productions are entered under their first known date or the first month of the season. As a result, August 1 proves to be a popular premiere date for Dutch performing arts, while in the theatre world this is a quiet summer's day.[11] And even then, there is quite a large number of productions in the database without a premiere date at all, especially in the era before 1940. The database primarily offers information on theatre productions as artistic titles; premiere dates and venues are attributes of those productions.
With opera and ballet the repertoire is long-lived. The National Ballet, for example, stages performances that were on the repertoire as early as 1961 (and on that of predecessors before that). Although sometimes there are completely new rehearsals, and the original choreographers and directors themselves make (at times drastic) changes, not every revised version gets a premiere. Flyers and other materials of those new performances are nevertheless added to the production folder of the original premiere, and in the database the reference "season-/-" is added so that subsequent productions can be found.

Theater Collectie Selectie
With the enormous increase of theatrical output, it became impossible to collect and keep materials of all productions in the production database. The result was that in the eighties and nineties videos, photographs, playbills, costumes, scale models and programmes were entered of the companies or designers that sent in the most material, which made the collection less representative. This is why since 2003, at the end of each season, a professional advisory committee selects around one hundred representative productions. This selection is called the Theater Collectie Selectie (TCS). A committee of ten to fifteen members, consisting of people who see many shows in their professional capacity and scholars in the field of Theatre and Performance Studies with a profound knowledge of the theatre, selects a proportional representation of productions of various disciplines from the entire programme of that season. If during one season there are relatively many dance productions, more of those will be selected. The selected productions do not necessarily have to be 'the best' performances; they can also be productions with an important social theme or a show with a longstanding repertoire history (e.g. Gijsbrecht van Aemstel). Each year the selection is completed with productions that have won important prizes.
[11] Until 2012, if a production premiered in the month before it was entered, it was entered in the current month with the correct premiere date added in an annotation. This had to do with the monthly mailing of the production forms (Invoerhandleiding producties, 2012).
Producers of selected productions are asked to supply materials. For productions that are not entered into the TCS there will be no more active acquiring of materials. Once every five years the Collectie Selectie is searched in order to select important styles, of which costumes, models and designs are collected. It was planned that once every ten years the total selection of 1000 productions would be studied and checked for blind spots or missed trends. However, the budget cuts and the subsequent abolition of the TIN prevented this revision from taking place.
This manner of collecting was considered best practice in the visitation of sector institutes of 2011.[12] Performances in the TCS can be found via the label TCS.

Range and Consistency
For the season 1960/1961 (the opening of the library of the Toneelmuseum) only a small number of productions are missing, and productions are fully described. From the period before 1960/1961 more productions and data are missing. In 2004 a special project added another 11,000 productions from the period 1945-1983, which made the description of this period rather extensive. The research project Theater in de Tweede Wereldoorlog[13] in 2003 paved the way for a wider search for all stage activities during the war years of 1940-1945, within and outside the collection. The criterion of professionality was abandoned for that period, to also be able to enter clandestine shows, performances in the camps, and illegal living-room performances (van der Veen, 2005). Entering data from before 1940 is done on the basis of the heritage collection, for example performance dates in programmes and on bills. The heritage collection, based on several private collections and donations, was managed out of artistic, aesthetic or historical interest. Those data are thus less representative and are incomplete for the entire spectrum. If possible, collection parts are linked to performances during digitisation, and for instance photos and scripts of a performance are linked by the field "materialen". However, not all parts of the collection are linked or digitised. In 2005 dates for 26,000 programmes from before 1945 were imported (Doorbouwen, 2010); the date of the document was entered as the date of the premiere. For all programmes new performances were entered, which means that especially long-running productions can occur more than once (see illustration 3), e.g. the early runs of the classic Dutch play Op hoop van zegen. In theory all co-producers were entered, but on some professional statements not all co-producers are mentioned. With festivals in particular this can lead to duplication.
Because the different records are based on different underlying material, not all records are combined (yet) (see Figure 4).[14] In 2019 Special Collections started a project to untangle these duplicate entries, which will lead to more accurate data.
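How candidate duplicates of this kind might be surfaced for manual review can be sketched as follows. This is an illustration only and not the procedure used by Special Collections: the record keys `code` and `title` are hypothetical, and grouping on a normalised title is an assumed heuristic. It only flags candidates, since performances with the same title can be genuinely distinct productions.

```python
import unicodedata
from collections import defaultdict


def normalise_title(title: str) -> str:
    """Lower-case a title and strip diacritics, punctuation and extra spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in without_marks.casefold())
    return " ".join(cleaned.split())


def candidate_duplicates(records):
    """Group record codes by normalised title; keep only groups with more than one record.

    `records` is an iterable of dicts with the hypothetical keys 'code' and 'title'.
    """
    groups = defaultdict(list)
    for record in records:
        groups[normalise_title(record["title"])].append(record["code"])
    return {title: codes for title, codes in groups.items() if len(codes) > 1}
```

A curator would still have to compare premiere dates, producers and casts within each group before merging, because revivals and productions by different companies share titles.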
For foreign productions a distinction is made between Flemish producers and others. Flemish producers get their own producer number and the reference "Vlaanderen". The other foreigners all get the producer code 998 and their country of origin as a reference.
Figure 4: Performances of the play "Elckerlyc" in two different places, with partly different actors and crew.
[14] The merging of these "doubles" is planned for 2019, according to curator Hans van Keulen (private communication, October 2018).
In the Dutch performing arts few institutes last forever, but in many cases a name change is in effect a continuation of the existing production company. Opera in Amsterdam was produced by Stichting De Nederlandse Opera, De Nederlandse Opera Stichting, De Nederlandse Opera, and Nationale Opera en Ballet. As these names conceal a continuous production company, they are treated as equivalents.

Verifiability and Representativeness
The Dutch Theatre Production Database has been in existence for nearly three decades and has grown more than the first developers could ever have imagined. The database has moved with the history of the theatre field in general, with that of the sector institute TIN, and with the technical possibilities.
Until now there was no history of its development. This paper offers a reconstruction of the creation and development of the data and a description of the protocols and entry system. It clarifies which sources have fed the database and which systems are fed by the database. Based on this history we can distinguish five periods:
- data on shows with a premiere date before 1952 (based on retrospective entries from the heritage collection of the TIN);
- data on shows with a premiere date between 1952-1960 (based on retrospective entries of the data collected for the Theaterjaarboek);
- data on shows with a premiere date between 1960-1981 (based on retrospective entry of current documentation);
- data on shows with a premiere date between 1981-2012 (based on direct entry in the database of current documentation);
- data on shows with a premiere date between 2012-present (based on direct entry in the database of digital sources).
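For users who want to weigh records by their provenance, the five periods can be turned into a simple lookup. This is a minimal sketch with hypothetical names. Because the listed periods share their boundary years, it assumes that a boundary year (1952, 1960, 1981, 2012) belongs to the later period; that choice is an assumption and should be adjusted if a different reading is intended.

```python
from bisect import bisect_right

# First premiere year of each later period (assumption: boundary years count as the later period).
_BOUNDARIES = [1952, 1960, 1981, 2012]

_SOURCES = [
    "retrospective entry from the heritage collection",
    "retrospective entry of data collected for the Theaterjaarboek",
    "retrospective entry of current documentation",
    "direct entry of current documentation",
    "direct entry of digital sources",
]


def provenance_period(premiere_year: int) -> str:
    """Return the basis on which records with this premiere year were entered."""
    return _SOURCES[bisect_right(_BOUNDARIES, premiere_year)]
```

Records without a premiere date, which the database also contains, cannot be classified this way and would need to be handled separately.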

Prospects
The prospect of long term sustainability of current and future data has significantly improved in the last year. The Ministry of Education, Culture and Science will invest in the Theatre Collection, thus putting it on a path to improved (continuous) funding, not only for preservation of the collection and its data but also for a wider variety of activities concerning research on and