We describe a vision of historical analysis at the world scale, in which historical sources are digitally assembled into a cloud-based database and machine-learning techniques are used to summarize that database into a time-integrated, actor-to-actor complex network. Using this time-integrated network as a template, we then apply the method of automatic narratives to discover key actors (‘who’), key events (‘what’), key periods (‘when’), key locations (‘where’), key motives (‘why’), and key actions (‘how’) that can be presented as hypotheses to world historians. We present two test cases to show how this method works. To accelerate the pace of knowledge discovery and verification, we describe how historians would interact with these automatic narratives through an online, map-based knowledge aggregator that learns how scholars filter information and eventually takes over this function, freeing historians for the more important tasks of verification and of stitching together coherent storylines. Ultimately, multiple coherent storylines that are not necessarily compatible with each other can be discovered through human-computer interactions mediated by the map-based knowledge aggregator.
This introduction is both a statement of a research problem and an account of the first research results towards its solution. As more historical databases come online and overlap in coverage, we need to discuss the two main issues that have so far prevented ‘big’ results from emerging. Firstly, computer scientists regard historical data as unstructured; that is, historical records cannot be easily decomposed into unambiguous fields, as population (birth and death records) and taxation data can. Secondly, machine-learning tools developed for structured data cannot be applied as they are to historical research. We propose a complex-network, narrative-driven approach to mining historical databases. In the time-integrated network obtained by overlaying records from historical databases, the nodes are actors and the links are actions. In the case study that we present (the world as seen from Venice, 1205-1533), the actors are governments, and the actions are limited to war, trade, and treaty to keep the case study tractable. We then identify key periods and key events, and hence key actors and key locations, through a time-resolved examination of the actions. This tool allows historians to deal with historical data issues (e.g., source provenance identification, event validation, and trade-conflict-diplomacy relationships). On a higher level, this automatic extraction of key narratives from a historical database allows historians to formulate hypotheses on the courses of history, and also allows them to test these hypotheses on other actions or on additional data sets. Our vision is that this narrative-driven analysis of historical data can lead to the development of multiscale agent-based models, which can be simulated on a computer to generate ensembles of counterfactual histories that would deepen our understanding of why our actual history developed the way it did.
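To make the idea of a time-integrated network concrete, the following is a minimal sketch, not the implementation used in the case study. It assumes that each historical record has already been reduced to a (year, actor, actor, action) tuple. The records, the actor labels (`gov_A`, `gov_B`, `gov_C`), and the function names are all hypothetical placeholders introduced only for illustration. Counting records per fixed-width time window is a deliberately simple stand-in for the time-resolved examination of actions.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (year, actor, actor, action).
# These are placeholders for illustration, NOT data from the Venice case study.
RECORDS = [
    (1205, "gov_A", "gov_B", "trade"),
    (1207, "gov_A", "gov_C", "war"),
    (1208, "gov_B", "gov_A", "trade"),
    (1231, "gov_A", "gov_C", "treaty"),
    (1232, "gov_B", "gov_C", "trade"),
    (1233, "gov_A", "gov_C", "war"),
    (1234, "gov_A", "gov_B", "war"),
]


def time_integrated_network(records):
    """Overlay all records into one undirected network.

    Nodes are actors; each link (an actor pair) carries a count of
    every action type recorded between the two actors over all years.
    """
    network = defaultdict(Counter)
    for _year, a, b, action in records:
        pair = tuple(sorted((a, b)))  # treat links as undirected
        network[pair][action] += 1
    return network


def actor_strength(network):
    """Total number of recorded actions each actor takes part in."""
    strength = Counter()
    for (a, b), actions in network.items():
        weight = sum(actions.values())
        strength[a] += weight
        strength[b] += weight
    return strength


def activity_by_period(records, width=10):
    """Count records per time window, keyed by the window's first year."""
    counts = Counter()
    for year, _a, _b, _action in records:
        counts[year - year % width] += 1
    return counts


if __name__ == "__main__":
    net = time_integrated_network(RECORDS)
    for pair, actions in sorted(net.items()):
        print(pair, dict(actions))
    print("candidate key actors:", actor_strength(net).most_common(2))
    print("candidate key periods:", activity_by_period(RECORDS).most_common(1))
```

In this sketch, the actors with the most recorded actions and the windows with the most records are only candidates for key actors and key periods. They are hypotheses to be put before a historian, not conclusions.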
The generation of such narratives, automatically and in a scalable way, will revolutionize the practice of history as a discipline, because historical knowledge, that is, the treasure of human experiences (i.e., the heritage of the world), will become something that machine-learning algorithms can inherit and use in smart cities to highlight and explain present-day ties and to illustrate potential future scenarios and visions.