Rescuing a Heritage Database: Some Lessons from London Concert Life in the Eighteenth Century

The paper outlines the genesis and subsequent transformation of the database Calendar of London Concerts 1750–1800, now available as a dataset at https://www.doi.org/10.17026/dans-znv-3c2j. Originally developed during the 1980s, the database was used as a primary research tool in the preparation of articles and a 1993 monograph: the first comprehensive study of London’s flourishing public concert life in the later eighteenth century, which culminated in Haydn’s London visits in 1791–5. The database itself, extending to over 4000 records, was derived from an exhaustive study of London newspapers. Following the obsolescence of the relational database in which the material was initially stored, it has recently been transferred to a spreadsheet in CSV format, publicly available with free open access. Issues arising out of the standardisation of concert data are explored, especially regarding the layout of complete concert programmes, and the strengths and limitations of the original design are analysed, within the context of the newly available version.

Rescuing a Heritage Database research data journal for the humanities and social sciences 5 (2020) 50-61

Introduction: The History of a Dataset
London has long been recognised for its role in pioneering the development of a public concert culture. From the first stirrings in the 1670s, London's concert life proliferated beyond recognition during the following century, to an extent unrivalled anywhere in the world. The commercial opportunities provided by an ever-expanding and increasingly wealthy metropolis were linked to an openness towards new music and a welcome towards international virtuosi that attracted top musicians from across Europe. Closely connected with the development of both music publishing and instrument-making, public concerts for a ticket-buying audience form a prime example of what cultural historians have identified as the commercialisation of leisure in eighteenth-century Britain. The increasing public prominence of concerts in the second half of the century, featuring new symphonic repertoire from the continent, placed the symphony concert alongside the Italian Opera at the heart of the fashionable calendar of the leisured classes. Since access was limited by ticket prices, subscriptions and other exclusory mechanisms, it would be a mistake to equate this efflorescence of public concerts purely with the rise of the middle classes: instead, the principal public concerts targeted the aristocracy and the wealthiest members of the bourgeoisie. Nevertheless, it can reasonably be asserted that the roots of nineteenth-century concert life, and thus of our own today, are already present in eighteenth-century London: the symphony concert series and oratorio culture, the virtuoso show and the more popular vernacular forms at the pleasure gardens. While other European cities did indeed develop their own forms of public concert culture in the eighteenth century, this was by no means universal (compare the quite different structures obtaining in Mozart's Vienna); while the volume of activity in London was absolutely unrivalled.
These broad assertions can only be made with confidence as the result of a very extensive programme of data collection, including my own complete database of London concerts during the second half of the eighteenth century. In truth, this favourable outcome only arose by a roundabout route which requires some explanation, since the comprehensive study of urban concert life was in its infancy when I began this work in the 1970s. The database began life in a paper-based format deriving from a page-by-page examination of London newspapers from 1750 to 1784, to underpin my doctoral dissertation on violinists in London during this period (McVeigh, 1980). Subsequently, I extended the data collection as far as 1800, following the same method but using microfilms then available in the British Library, with particular emphasis on the […]
All of this data collection constituted an attempt to place the activities of composers, performers and concert promoters within broad cultural contexts. It focussed attention on the social environment in which musicians developed their professional careers, and on the critical reception that informed their choice of national repertoires, genres and programming patterns in general. While parts of this story (particularly as it relates to Handel, J. C. Bach and Haydn) were already well known, much of the material was completely new, and no overarching investigation had yet been attempted. The compilation and analysis of the data thus formed an essential component in developing the entirely new perspective on London concert life alluded to above.
Advertisements and reviews in newspapers provided the majority of the data. In part, this reflected the emphasis of the project on the public nature of London's concert life, but it also had a more practical origin. No substantial archives survive, not even in relation to the two most famous series, those of Bach and Abel (1765-81) and of Salomon and Haydn in the 1790s. Some records of concerts at the theatres and pleasure gardens are extant, and occasional handbills and printed wordbooks survive, as well as relevant bank accounts and legal documents. But in the absence of specialist music journals, it is to newspapers that scholars must mainly turn.
This paper is designed not only to explore the potential of such a project, but also to probe the limitations resulting from the source material itself and, wittingly or unwittingly, from the methods selected at the time. The variability of concert data in the newspapers (in extent, in presentation, in accuracy) raises major challenges for database design and interpretation, starting from the basic question as to whether the newspaper text or the event itself constitutes the level of record. A particular challenge is provided by the complexity and variety of individual concert items, which may detail composer, genre, title, performer(s), instruments/voices, and a host of other ancillary material. The different ways in which these issues have been addressed will be considered in the conclusion. First, however, it will be helpful to describe the working methods involved in assembling the eighteenth-century data over many years and the issues involved in structuring the original database.

Methods and Coverage
During the second half of the eighteenth century, most formal concerts in London were funnelled into the spring season lasting from January to June. They fall broadly into two types: (1) subscription series or benefit concerts offering mixed programmes of 'vocal and instrumental music', typically alternating opera arias or English songs with symphonies, concertos or chamber music (Figure 1); and (2) three-act oratorios by Handel and his successors during Lent, although these were often replaced later in the century by long programmes of vocal selections. During the summer months, there were nightly concerts at the pleasure gardens, while during the autumn some private music societies occasionally advertised their meetings. In practice, therefore, very little detailed information is recorded during the second half of the year.
An early decision was taken to restrict the database to those concerts advertised in the daily press. One daily newspaper title was checked for the entire period: up to 1785, the General/Public Advertiser. For most of the period a second title was checked in addition, and for the later years many more titles were also consulted (see McVeigh, 1996). The majority of newspapers cited are now widely available online via the Seventeenth and Eighteenth Century Burney Newspapers Collection, while The Times (originally the Daily Universal Register) is accessible through The Times Digital Archive 1785-2013. A few newspapers are still only to be found in hard copy in the British Library, London; the Bodleian Library, Oxford; or the Beinecke Rare Book Library, Yale University.
The dataset includes over 4000 records of advertised concerts within the geographical boundary of present-day Greater London. While further searches may reveal further concerts, they are very unlikely to do so in any significant numbers. It should be noted that there are inevitable ambiguities at the margins, concerning the very definition of a concert. Excluded here are all staged presentations, concerts with plays gratis (a common ruse to circumvent licensing restrictions), concerts at minor gardens, and all events where music formed a lesser part of the proceedings, from charity church-services to equestrian displays. Because of the emphasis on public performance, unadvertised music society concerts are also omitted, even though information on these is occasionally available from other archival sources, printed wordbooks, or newspaper reviews.

The Dataset
- Calendar of London Concerts 1750-1800 deposited at DANS
- DOI: https://www.doi.org/10.17026/dans-znv-3c2j
- Temporal coverage: 1750-1800
The data was originally collected into a simple DOS-based relational database, using the budget-priced programme TAS-Plus launched in 1986. Data were entered first into the calendar file, and computer programmes were compiled within the software to create related files of places, names and abbreviations. A report form allowed for a range of simple or more sophisticated searches: for example, performances of Haydn string quartets at the Hanover Square Rooms between 1780 and 1790. The print-out automatically included the relevant calendar entries, together with full listings of the relevant places, names and abbreviations.
Subsequent obsolescence of the software meant that there was an imminent danger of the data becoming unreadable, and even unrecoverable. By translating it through dBase it was possible to turn the calendar file into a comma-delimited dataset which is currently stored and made freely available as a CSV spreadsheet, readable with Excel; the complete place, name and abbreviation files are now simply transcribed in an accompanying Word document. The advantages and disadvantages of this solution will be considered below.
Each concert constitutes a single record, with the following fields:
Frequently, individual concerts would receive a series of advertisements, ranging from an initial notice of the event through to a complete programme on the day, via possible changes of date, place, personnel and repertoire items. As far as possible, the latest advertisement is reflected in the record; earlier variants are also noted, together with additional information from reviews (for example, song titles).
The Programme field consists of free text, so as to reflect the very variable quantity and type of information preserved. In some cases, the amount of information is minimal ('under the Direction of Mr. Abel'); while often the advertisement merely lists prominent performers, perhaps with a major attraction ('Handel's Grand Coronation Anthem'). Only rarely are full programmes itemised before the 1770s, but during the next decade, it became normal for every item at major concerts to be listed in some detail, even to the extent of identifying individual works in the case of Haydn's symphonies. The following is the complete record for the concert on 18 March 1791 to which the advertisement reproduced in Figure 1 […]
A full listing and explanation of all the abbreviations are provided in a separate document. It will have become clear that the dataset is organized essentially as an index, manually compiled and restructured from the various sources cited, so that none of the material should be regarded as verbatim. The high level of editorial interpretation and intervention extended, in particular, to the programme information. Thus all names, genres, instruments and titles of major works are standardized in line with related authority files. Names are distinguished by editorial identification in lower case, and by the use of - for male and ~ for female musicians; thus:

Conclusion: Advantages and Disadvantages
It might be argued that, with most surviving London newspapers of the period now digitised and fully searchable, a database of this kind is redundant; and it is indeed possible that this will eventually become a reality, especially with the improving capabilities of 'fuzzy' searching. But at present, the dataset is a much more reliable and efficient tool for researchers than free-text searches of newspaper pages indexed through Optical Character Recognition. On the one hand, it overrides inconsistencies of orthography, spelling and terminology in the sources, while on the other it is immune to those variable standards of indexing and search mechanisms currently available. At the very least it provides a point of entry to the online digital versions.
The comprehensiveness of the dataset is a major strength. One could perhaps argue that the deliberate concentration on public advertised performance is a child of a single historical moment. The exclusion of non-advertised concerts has reified the public-private distinction, whereas more recent scholarship has preferred to embrace a wider spectrum of musical activity, not least because women tended to be more actively involved in the private sphere. Nevertheless, there is an integrity in the definition of public concert here, a concept certainly recognised at the time. As a result, the dataset has significant potential for statistical analysis (even if the value of this is somewhat diminished by the incomplete nature of the surviving programme data, especially the paucity of complete concert programmes before the last quarter of the century).
The consistency with which the data is entered represents another strength, offering the facility to search confidently across the entire dataset. It should be acknowledged, however, that this latter strength is also a weakness, since the high degree of editorial intervention is greatly reliant on one person's knowledge and interpretation. This, in turn, raises a more general issue.
Concert programmes provide a considerable challenge for database design (indeed digital humanities experts at the UK Arts and Humanities Data Service were of the opinion that such projects were the most demanding they had yet encountered). While the integrity of the 'event' is usually relatively clear, the variable structure of concert programmes sets them apart from theatrical performances: some concerts may have 20 items, others just one; a single item may have seven instruments and performers listed, or none; the same symphony or aria may be known under a variety of names in different languages, and so on. As a result, current practice varies considerably.
Two other projects relating to concert history will illustrate contrasting positions. The Register of Musical Data in London Newspapers 1660-1800, initiated by Rosamond McGuinness, is orientated towards the printed source, reproducing the full text of newspaper extracts in more-or-less diplomatic transcription, with an associated index (the history and development of this project are described in detail in Harbor, 2013). By contrast, the Concert Life in 19th-Century London database is organised by event, with a layered system wherein each advertisement or review is individually indexed while a separate meta-layer provides editorial synthesis and interpretation (Bashford, Cowgill, & McVeigh, 2002).
The present dataset is closer to the latter, but it relies on a still more extreme form of editorial intervention and lacks the sophistication of distinguishable layers from diverse sources. Not only is there no verbatim material from the original sources, but the data is not set out in a schematised way that relates directly to those sources. Thus neither the standardisation of names and genres nor the ordering of programme items can be readily reversed; and where concert programmes are collated from several sources, it is simply impossible to unpick the origin of any given piece of information.
Furthermore, no record was kept of the confidence with which names and works were identified. In truth, a remarkably high proportion of composers and concert performers are known to modern dictionaries, and my confidence in identifying them was usually high. Where there was real doubt, this has been expressed in the record, but it is still likely that individuals have occasionally been misidentified or conflated. For this reason, corrections have been invited and subsequently incorporated in later versions.
A more fundamental issue concerns the layout of the concert programmes themselves. As already suggested, the amount of information conveyed in this period is very variable, ranging from a few words to lengthy complete programmes, meticulously itemised. It was a pragmatic decision to incorporate all programme information within a single field, whatever the length and detail; but in retrospect, this decision appears problematic. The ordering protocols previously outlined go some way towards mitigating these reservations, but this mode of organisation does not accord with accepted modern principles. Current concert programme databases (as with the Concert Life in 19th-Century London example) typically isolate each item performed, ideally treating this as the level of record. Often there are separate fields for composer, work and performer, which can be more readily related to authority files. With a relational structure between the event, the single items, and the various components, it becomes possible to construct much more sophisticated analytical tools, especially once (as with nineteenth-century symphony or chamber concerts) standard concert programming designs emerged. But it would take a very substantial project, one requiring a great deal of manual intervention and expertise, to disentangle the information presented in my eighteenth-century calendar in such a way.
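The item-level organisation described above can be sketched briefly. The following Python fragment is a minimal illustration under stated assumptions, not the structure of any existing concert database: the class names, field names and sample values are all hypothetical placeholders, chosen only to show how treating each programme item as a record permits simple counts.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """One programme item; any component may be missing in the source."""
    genre: str
    composer: Optional[str] = None
    performers: List[str] = field(default_factory=list)


@dataclass
class Event:
    """One concert, holding its items in programme order."""
    date: str
    place: str
    items: List[Item] = field(default_factory=list)


def genre_counts(events, composer):
    """Count items by genre for one composer across all events."""
    return Counter(
        item.genre
        for event in events
        for item in event.items
        if item.composer == composer
    )


# Invented placeholder events (not taken from the dataset).
events = [
    Event("1790-01-01", "Room A", [
        Item("Symphony", "Haydn"),
        Item("Song", performers=["Singer X"]),
        Item("Quartet", "Haydn"),
    ]),
    Event("1790-01-08", "Room B", [
        Item("Symphony", "Haydn"),
        Item("Concerto", "Composer Y", ["Player Z"]),
    ]),
]

counts = genre_counts(events, "Haydn")
print(counts)
```

In a structure of this kind the composer and performer values could point to authority files rather than hold plain strings; that refinement is omitted here for brevity.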
Relatedly, the current mode of presentation is not ideal and has suffered from the transformation into CSV format. Even the relatively primitive database in which the data was originally entered allowed for multi-field searches, including strings of any length within any field. Reports could be constructed to present the data in a reasonably elegant way. The transfer to a single CSV file is, when read by Excel, much less visually attractive than users now expect. Searching is also considerably less user-friendly than it might be, being restricted to the ordering of columns or to simple string searches.
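Multi-field searching of the kind the original database offered can be approximated over a CSV file with a short script. The sketch below is in Python and rests on assumptions that may not hold for the deposited file: the column names (Date, Place, Programme) and the ISO date format are hypothetical, and the sample rows are invented placeholders rather than records from the calendar.

```python
import csv
import io

# Invented placeholder rows; column names and date format are assumptions.
SAMPLE = """Date,Place,Programme
1785-02-10,Hanover Square Rooms,Quartet Haydn; Song
1791-05-01,Hanover Square Rooms,Symphony Haydn; Concerto
1788-04-02,Pantheon,Quartet Haydn
"""


def search(rows, start=None, end=None, **contains):
    """Yield rows dated within [start, end] whose named fields contain
    the given substrings (case-insensitive)."""
    for row in rows:
        date = row["Date"]
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if start and date < start:
            continue
        if end and date > end:
            continue
        if all(needle.lower() in row[name].lower()
               for name, needle in contains.items()):
            yield row


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hits = list(search(rows, start="1780-01-01", end="1790-12-31",
                   Place="Hanover", Programme="Quartet Haydn"))
for hit in hits:
    print(hit["Date"], hit["Place"], hit["Programme"])
```

To run the same search against a downloaded file, the `io.StringIO(SAMPLE)` argument would be replaced by an opened file handle, with the field names adjusted to match the actual column headings.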
Despite these limitations, there can be no doubt that the calendar remains a central resource for musicologists and for scholars working on the commercialisation of leisure, as evidenced by the number of downloads at the time of writing (3322). The dataset is easily and freely accessible, and the principles behind its compilation are clear. Indeed the advice to keep the data in a simple CSV format has been welcomed both for its sustainability and because it offers the potential for users to develop their own systems or software to manipulate the data in whatever way they prefer. It is, however, my intention to devise a basic front end, which will not only be more user-friendly but will also offer improved search facilities. I am also consulting as to whether the system of abbreviations, though it has certain advantages, could be unpicked to make the dataset less forbidding to the casual browser.
More fundamentally, the data could assuredly be still more useful if translated into other formats (such as RDF). Trials have already been carried out as part of the InConcert project to explore the potential of linked data by engaging this database with its nineteenth-century successor. The intention of the project is to develop compatibilities that will allow for sophisticated interactions between concert databases across historical periods and countries, notwithstanding differences in database design. In this way, standard musicological questions (such as national dissemination or the development of canon) could be evidenced from statistically valid data analysis. Extending the inquiry across other domains of performance, culture and leisure should be still more revealing. But it is also to be expected that wholly new ways of interpreting large datasets will emerge, beyond the imagination of today's individual human researchers. It has already been a long journey from the crumbling newspapers and the paper-based notes I first made in the old North Library of the British Museum in 1975. But which of us can begin to imagine what breakthroughs are still to come?