Low-code and AI-augmented Code for an Archaeological Database: Arkas 2.0

This data paper presents Arkas 2.0, a national research database and research infrastructure containing data on all archaeological sites and monuments in Slovenia. The new database is a hybrid cloud microservice built on low-code platforms (Caspio and ArcGIS Experience Builder) and augmented by generative AI (ChatGPT-3.5). The data paper describes the Arkas 2.0 dataset and how it fits into the research context by discussing the challenges archaeologists face in setting up and curating datasets and the associated digital infrastructure. In response to these challenges, the data paper highlights the benefits of low-code platforms and AI-augmented code for archaeological research. It also describes the Arkas 2.0 development workflow, its new data structure


Introduction
Arkas 2.0 is a research database for archaeological sites and monuments of Slovenia. It comprises the data of all archaeological sites published in scientific publications. Except for urban topography, all sites up to the early Middle Ages have been systematically included. In addition to site descriptions, the database also contains metadata on archival material and bibliographic data.
The history of the Arkas 2.0 database goes back to the 1950s, when Slovenian archaeology started working on a map of archaeological sites (Pahič, 1962). Almost all Slovenian archaeologists were involved at that time, and the legal predecessor of the Institute of Archaeology of the Research Centre of the Slovenian Academy of Sciences and Arts (hereafter IoA) coordinated the effort. The outcome was the publication Archaeological Sites of Slovenia (Bolta, 1975).
The work continued in the form of the project Archaeological Topography of Slovenia, which yielded several published volumes (Dular, 1985; Šavel, 1991) and nearly a thousand field reports. The journal Protection of Monuments (e.g., Bratina, 2001) served as a second source of data on archaeological sites. All these records were compiled in the Register of Archaeological Sites of Slovenia, a physical archive maintained by the IoA.
In the early 1990s, the first phase of the digitisation of the Register of Archaeological Sites of Slovenia took place, and the database known as the Archaeological Cadastre of Slovenia (Slovenian: ARheološki KAtaster Slovenije), or ARKAS, was created. In 1992, ARKAS was built as a remote dial-up access application on the proprietary platform Trip, developed and maintained at the University of Ljubljana (Modrijan, 1994; Tecco Hvala, 1992).
Textual data was digitised, including long texts and metadata on reports, photographs, and blueprints of archaeological sites. Standardisation of site descriptions was achieved with controlled vocabularies for site location, site type, and site dating. At that time, geographic information systems (GIS) capable of performing this task were inaccessible to the IoA. Consequently, the location of sites in the landscape was governed by a rather intricate ad hoc system of territorial units (based on the 1954 administrative division) and references to the grid on printed maps (Modrijan, 1994).
The 1992 ARKAS remote access was based on the web protocol developed at CERN and released to other research institutions in 1991 (Berners-Lee, 1992). Since the web protocol and code were only made available licence-free in 1993 and the URL standard was not defined until 1994 (Berners-Lee et al., 1994), ARKAS was probably one of the earliest archaeological databases in the world that could be accessed remotely. Any user with a dial-up modem who was given a username and password could search the database.
In 2000, ARKAS was rebuilt in an SQL environment, deployed on an SQL server, and made available via the World Wide Web (WWW) and the URL protocol. The data was made available to registered users free of charge and remained available under this restricted access until 2023. In 2004, updates were made and a Web GIS application was added, which was not the norm in archaeology at the time. However, the GIS application was only used for searching, while data entry, including naming conventions and archiving workflows, continued to be based on the 1954 territorial unit system. In 2016, a custom GIS platform based on Google Maps was built.
Throughout this time, the data have been constantly curated, updated, and added to. The database is thus the product of three decades of thoughtful digital curation. As such, it is best described by the concept of Deep Data, which describes data that are not very big, but semantically very rich (Štular & Belak, 2022).
The 2004 platform, supported by the 2016 GIS, was operational until early 2023 and included data on 8,656 archaeological sites, 19,703 bibliographic references, 384 blueprints, 978 field reports, and 37,476 analogue photographs.
In 2023, the ARKAS database ceased operation due to technical issues and was rebuilt into Arkas 2.0. The data was restructured and streamlined to match current workflows and take advantage of modern hardware and software. The Arkas 2.0 open-access application can be described as a hybrid cloud microservice built on low-code platforms, with some of the injected code developed with the aid of AI.
The topic of this data paper is the description of the Arkas 2.0 dataset and its placement in a research context (see Figure 1). First, the research context and its challenges are described in the "Problem" section. In the "Methods" section, the approach used to tackle these challenges is described. The dataset is described in the "Data" section, and the data paper closes with "Concluding Remarks". Although this is a data paper, it was written by non-IT professionals for non-IT professionals, and thus the "Methods" section includes some definitions that might be redundant in a more traditional data paper.

Problem
The European (and global) landscape of digital archaeological data curation comprises 'haves' and 'have-nots'. There is a lack of equity in terms of access to a persistent and adequate archive or repository of digital data (Corns et al., 2015; Wright, 2018). This dichotomy of 'haves' and 'have-nots' extends to access to the digital infrastructure needed to create, curate, and make full use of research data (Richards et al., 2021). Without going into too much detail, it can be said that many archaeological teams produce relevant data, but struggle to obtain the funding to build and maintain bespoke databases with online access (e.g., Kreiter, 2021; Oniszczuk & Makowska, 2021; Štular, 2021).
In the specific case of ARKAS, there was no development of core functionality between 2004 and 2023. At least in the last ten years, the lack of development was almost entirely due to a lack of funds for dedicated personnel and/or external developers. This has led to reliability issues and an overall outdated appearance. More importantly, the team has not been able to adapt the database to the evolution of maintenance workflows and users' needs. One possible solution to such a predicament is the use of low-code database platforms that allow for internal development outside of formal IT departments, rapid prototyping, and cloud-based deployment. A very recent development that makes low-code platforms even more viable is AI-augmented code.
Our answer to the 'have-not' problem was therefore to build the Arkas 2.0 web application using low-code platforms and AI-augmented code. The core functionality of the database as a research tool was built entirely with no-code/low-code tools and with AI-augmented code. We used code written by human developers (provided by Caspio) only to improve the design and, in one case, to add an advanced feature (see Table 1).

Methods
Arkas 2.0 is a hybrid cloud microservice web application built on low-code platforms, with the injected code developed in part by human developers and in part AI-augmented. Both the backend and frontend were developed by developers outside of formal IT departments and are designed to be used, maintained, and curated in the same way. Below we unpack this dense information.
Hybrid cloud refers to a cloud computing infrastructure that combines on-premises IT, public cloud, and private cloud resources. This is achieved by creating a single, flexible, optimal cloud to run any workload. Arkas 2.0 is hosted in the clouds provided by Caspio and Esri for their respective platforms. However, both platforms are designed to load the required data onto the client each time it is used, and some of the data processing is client-side, i.e., the application uses on-premises IT.
The microservices architecture refers to a style of developing applications. It allows a large application to be divided into smaller independent parts, with each part having its own area of responsibility. To serve a single user request, a microservices-based application can call many internal microservices to assemble its response. The advantage is that each service can be developed without worrying about dependencies. Arkas 2.0 consists of seven independently deployable components ("Najdišča", "GIS", "Literatura", "Elaborati", and "Fototeka", as well as two applications running in the background to power interactive charts) running on two platforms (Caspio and ArcGIS Experience Builder).
A low-code platform is a software development platform that allows users to build applications with a minimum of manual coding. It simplifies the application development process by providing visual modelling tools, pre-built templates, and drag-and-drop interfaces. Users can thus develop software applications with minimal (low-code) or no (no-code) coding. Two key advantages of low-code platforms are the ability to rapidly prototype and deploy applications and the fact that they allow non-IT professionals to create applications, which addresses the shortage of skilled developers (available within the budget). Low-code platforms have been around for decades; Caspio, for example, has been available since 2000. However, with the development of web-reliant technologies, these platforms have come into their own in recent years. The 'low' in low-code is an increasingly realistic description of reality. It is therefore not surprising that the global market for low-code development technologies is estimated to grow by 20% in 2023, and that by 2026 at least 80% of users of low-code development tools will be developers outside formal IT departments (Gartner, 2022). Arkas 2.0 was built on two leading low-code platforms, Caspio and ArcGIS Experience Builder.
The described approach allowed us to do the vast majority of the development of the application in a small team of two non-IT professionals. This sped up the implementation tremendously, as we combined workflow analysis, workflow customisation, data and database structure development, and application development into a single iterative process. This allowed us to complete the entire process, from identifying viable low-code platforms to deploying the alpha application, in a fortnight, the beta version in four weeks, and the final version, including documentation (Lozić et al., 2023), in six weeks. In comparison, the IoA was in parallel restructuring the database and building a custom application for its Zbiva database (Belak et al., 2023; Štular, 2019), but using a traditional approach based on a custom-built application by external IT professionals. The entire process of developing Arkas 2.0 occurred during the beta testing phase of Zbiva and was about 90% shorter (adjusted for the difficulties caused by COVID-19). For Zbiva, the IoA team spent approximately 150 person-hours on meetings with developers and another 100 person-hours on other related tasks; the external developers spent around 300 developer hours over three years. The entire process of developing Arkas 2.0 took about 200 person-hours and 4 hours by professional developers at Caspio. After deployment, the costs for database hosting and maintenance for Zbiva and Arkas 2.0 are similar at around €3,000/year each.
The inevitable challenge in choosing any platform, commercial or otherwise, is platform lock-in. We have minimised the risks by choosing stable providers with a long tradition. In addition, the reduced complexity of the database structure also means less effort if the service needs to be rebuilt on a different platform.

Data
In terms of workflow, for example, the original database was designed as a national database and therefore had an elaborate user management system with separate tables for institutions, users, and user roles. In reality, however, ARKAS has only ever been managed by IoA staff and, in the last ten years, mostly by a single person. The user management system for Arkas 2.0 could therefore be greatly simplified. The same happened with several other fields and with the metadata on physical archives, since, for obvious reasons, no new records are being created.
The second reason for the complex structure of the original ARKAS was the amount of data. Since its inception in 1992, ARKAS has been designed to hold data on around ten thousand archaeological sites with all contextual data. By 2023, it had almost reached this capacity and contained 107 MB of data in accdb format (tables only). In 1992, 107 MB was about 10% of the capacity of the largest existing hard drive (Grochowski & Hoyt, 1996), but in 2023 this is equivalent to about 0.0001% of the largest commercially available hard drive (e.g., Athow, 2023) or about 0.01% of the capacity of a high-end smartphone (e.g., Palmer, 2023). In terms of the 2023 hardware, the corresponding size of ARKAS would therefore be about 10 TB. For this reason, the relational database three decades ago required an approach based on data efficiency that relied heavily on data normalisation with many lean lookup tables and one-to-many or many-to-many relationships. For example, the ARKAS database was structured into 72 tables with 83 relationships, of which 6 were many-to-many relationships, not counting the GIS database (see Figure 2). This is an efficient and still optimal data structure, but it can only be maintained and developed by data experts.
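The proportions in this comparison can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the reference capacities (about 1 GB for the largest drive in 1992, 100 TB for the largest drive and 1 TB for a high-end smartphone in 2023) are assumptions chosen to be consistent with the 10% share and the 10 TB equivalence stated above, not values taken from the cited sources.

```python
MB = 10**6
GB = 10**9
TB = 10**12

ARKAS_SIZE = 107 * MB  # size of the tables, as stated in the text

# Assumed reference capacities (illustrative, not taken from the cited sources)
LARGEST_DRIVE_1992 = 1 * GB
LARGEST_DRIVE_2023 = 100 * TB
SMARTPHONE_2023 = 1 * TB


def share_percent(size, capacity):
    """Return size as a percentage of capacity."""
    return 100 * size / capacity


share_1992 = share_percent(ARKAS_SIZE, LARGEST_DRIVE_1992)
share_2023_drive = share_percent(ARKAS_SIZE, LARGEST_DRIVE_2023)
share_2023_phone = share_percent(ARKAS_SIZE, SMARTPHONE_2023)

# Size that would occupy the same share of the largest 2023 drive
equivalent_2023 = ARKAS_SIZE / LARGEST_DRIVE_1992 * LARGEST_DRIVE_2023

print(f"1992 drive share: {share_1992:.1f}%")
print(f"2023 drive share: {share_2023_drive:.6f}%")
print(f"2023 phone share: {share_2023_phone:.4f}%")
print(f"2023 equivalent:  {equivalent_2023 / TB:.1f} TB")
```

Under these assumptions, the same dataset moves from roughly a tenth of the largest drive to a share that is negligible for any practical purpose, which is the point the comparison is making.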
Nowadays, Arkas 2.0 workflows are relatively simple and the amount of data is negligible. In addition, the modern digital transformation of science and society requires a constant evolution of the workflows and, consequently, a constant development of the database structure.
We therefore rebuilt the database structure from scratch to adapt it to the new circumstances. The guiding principle was not the efficiency of the data, but the low complexity of the structure. The new database consists of five tables interconnected with URL links, parameter passing, and triggered actions. For example, the many-to-many relationship between sites and literature (each site can be described in many articles and each article can refer to many sites) has been replaced by a URL link that passes on parameters. The bibliography displayed for each site is actually a separate data file displayed within an iframe. It can display the relevant records from the literature because the search criteria (the ID of the site in question) are passed within the URL link (e.g., https://arkas.caspio.com/dp/5...9dd?ID_Najdisce=010322.04 only displays articles that refer to the site ID 010322.04; the part of the link after the question mark is the query string that passes the parameter) (see Figure 3; Caspio 2024a; Caspio 2024b).
At the same time, we have replaced several fields, i.e., we have altered the structure of the data. In particular, we have streamlined the fields that are no longer part of workflows and are considered archivable. We have merged such data from independent tables into single text fields incorporated in the relevant table. Data that are not considered important for archaeological research, such as personal data, have not been integrated into the database, but remain available in the archive of the old database.
Arkas 2.0, like any national sites and monuments database, is an invaluable research tool not only for researchers interested in Slovenian data but also for regional or continent-wide analyses. Although the structure of the database predates all attempts at internationalisation, it is robust enough to be included in the ARIADNE portal. The ARIADNE portal is a central access point to the archaeological resources provided by partner institutions across Europe. Behind the portal are the ARIADNE registry and a range of services used to manage information about the datasets, collections, vocabularies, metadata schemas, and mappings (Štular et al., 2016; https://portal.ariadne-infrastructure.eu). In other words, the database structure of Arkas 2.0 is robust enough that it can be incorporated into any relevant database using modern tools.
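The parameter-passing mechanism described above can be illustrated with a short Python sketch. It is not the Caspio implementation: the endpoint `https://example.org/dp/literatura`, the sample records, and the function names are hypothetical; only the parameter name `ID_Najdisce` and the sample site ID come from the text.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint; the real Caspio DataPage URL is not reproduced here.
LITERATURE_PAGE = "https://example.org/dp/literatura"

# Illustrative literature records; each lists the site IDs it refers to.
LITERATURE = [
    {"title": "Article A", "site_ids": ["010322.04", "010322.05"]},
    {"title": "Article B", "site_ids": ["020101.01"]},
    {"title": "Article C", "site_ids": ["010322.04"]},
]


def literature_url(site_id):
    """Build the link embedded on a site page, passing the site ID as a parameter."""
    return f"{LITERATURE_PAGE}?{urlencode({'ID_Najdisce': site_id})}"


def handle_request(url):
    """Read the parameter from the query string and return the matching titles."""
    params = parse_qs(urlparse(url).query)
    site_id = params["ID_Najdisce"][0]
    return [rec["title"] for rec in LITERATURE if site_id in rec["site_ids"]]


url = literature_url("010322.04")
print(url)
print(handle_request(url))
```

The site table never stores a list of articles, and no junction table is needed: the link carries the search criterion, and the literature page filters its own records when it is loaded.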
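As an illustration of merging data from an independent table into a single text field, consider the following minimal Python sketch. The table names, field names, and values are hypothetical and do not reproduce the actual Arkas 2.0 structure.

```python
# Illustrative rows from a normalised structure (all names are hypothetical)
sites = [
    {"site_id": "S1", "name": "Site one"},
    {"site_id": "S2", "name": "Site two"},
]
archive_items = [
    {"site_id": "S1", "kind": "field report", "year": 1978},
    {"site_id": "S1", "kind": "blueprint", "year": 1981},
    {"site_id": "S2", "kind": "photograph", "year": 1965},
]


def flatten(sites, items):
    """Merge the related archive rows into one text field per site."""
    flat = []
    for site in sites:
        related = [i for i in items if i["site_id"] == site["site_id"]]
        note = "; ".join(f"{i['kind']} ({i['year']})" for i in related)
        flat.append({**site, "archive": note})
    return flat


flat_sites = flatten(sites, archive_items)
for row in flat_sites:
    print(row)
```

The information is preserved in human-readable form, but it is no longer queryable as structured data, which is why this treatment suits only fields that have dropped out of active workflows.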
A detailed description of the database structure and metadata is available in a white paper, which is deposited with the archival copy of the data (Lozić et al., 2023).
The lean data structure of Arkas 2.0 is also much better suited for archiving. The streamlined data structure significantly simplifies archiving and, at the same time, increases the FAIR (Wilkinson et al., 2016) value of the archive. Specifically, this means that with the new database structure, only 5 CSV files need to be archived, each of which can be queried directly with a variety of office or database software. This is a great advantage over the previous database because complex relational databases are notoriously unsuitable for archiving since the relationships cannot be embedded in the data tables.
Cloud-based storage is located in Ireland and is therefore subject to EU and Slovenian legislation. Archival copies of the database are regularly stored both on an on-premises archive drive and on Zenodo's cloud-based repository.
Arkas 2.0 is only available in Slovenian and there is no plan to translate it into other languages as the costs are prohibitive. However, we have tested the automatic translation function into English in the Chrome and Edge browsers with satisfactory results. The translation of the functional elements (search, clear, etc.) is correct and should allow productive use. Of course, the user must be aware that elements such as site names should not be translated, which unfortunately means that using Arkas 2.0 with the automatic translation function is not effortless.
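To show what querying an archived file directly can look like, here is a minimal Python sketch that uses only the standard library. The column names and rows are illustrative stand-ins, not the actual content of the archived files.

```python
import csv
import io

# Stand-in for one archived CSV file; columns and values are illustrative.
ARCHIVED_CSV = """ID_Najdisce,name,site_type,period
S1,Site one,settlement,Iron Age
S2,Site two,cemetery,Roman period
S3,Site three,settlement,Roman period
"""


def query(csv_text, **criteria):
    """Return the rows whose fields match all the given criteria exactly."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if all(row[field] == value for field, value in criteria.items())
    ]


roman_settlements = query(ARCHIVED_CSV, site_type="settlement", period="Roman period")
print([row["name"] for row in roman_settlements])
```

Because each file is self-contained, no database server or knowledge of table relationships is needed to answer such a question; the same filter could equally be applied in a spreadsheet.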

Concluding Remarks
ARKAS is a national archaeological research database, developed in 1992 as one of the first remote access archaeological databases. Its venerable roots made it necessary to completely rebuild the database in 2023. The result, Arkas 2.0, is a hybrid cloud microservice web application built on the low-code Caspio and ArcGIS Experience Builder platforms. It was developed by non-IT-professional developers (archaeologists) and augmented using generative AI. A very small amount of human coding by IT professionals was only used to enhance the design. This strategy is a response to a general trend in IT, where the changing technological environment, particularly the frameworks used, means that more and more effort is being put into creating completely new applications. Arkas 2.0 has been tailored to fit existing workflows while harnessing the power of cloud-based hardware and software. It was designed to allow continuous development of the workflows and database structure by a (suitably trained) archaeologist, i.e., by developers outside the formal IT departments. The design principle was thus to keep the complexity of the structure as low as possible because flat(ter) data structures make it easier to share data, as users need to worry less about combining separate data files. Furthermore, current tools such as R and Python make it increasingly easy to refactor such flat(ter) data files into more complex database structures if required. At the same time, care was taken not to lose any information that is pertinent for the modern user. Nevertheless, one piece of information has been lost: the page range within which an individual site is cited in the individual articles is no longer available.
This was a crucial requirement, as the IoA does not employ data scientists or other IT professionals, but three decades of experience have taught us that to fully exploit the scientific potential of a database, we need direct control over the maintenance, updating, and development of new features.
As the Arkas 2.0 dataset has not yet been utilised by scholars for citable contributions, a direct comparison with the preceding ARKAS dataset is not possible. However, it is feasible to conjecture potential applications. Firstly, Arkas 2.0 could serve cultural heritage professionals, allowing them to delineate sites of interest within specific parameters effortlessly. For instance,
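The remark above that tools such as Python can refactor flat data files into more complex database structures can be illustrated with a minimal sketch. It assumes a flat table in which the related literature IDs are stored as one semicolon-separated text field; the field names, IDs, and table names are hypothetical.

```python
import sqlite3

# Flat rows as they might be read from a CSV file (illustrative values)
flat_rows = [
    {"site_id": "S1", "name": "Site one", "literature": "L1; L2"},
    {"site_id": "S2", "name": "Site two", "literature": "L2"},
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (site_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE site_literature (
        site_id TEXT REFERENCES site(site_id),
        literature_id TEXT,
        PRIMARY KEY (site_id, literature_id)
    );
""")

for row in flat_rows:
    conn.execute("INSERT INTO site VALUES (?, ?)", (row["site_id"], row["name"]))
    # Split the multi-valued text field into one junction-table row per ID
    for lit_id in row["literature"].split(";"):
        conn.execute(
            "INSERT INTO site_literature VALUES (?, ?)",
            (row["site_id"], lit_id.strip()),
        )

sites_citing_l2 = [
    r[0]
    for r in conn.execute(
        "SELECT site_id FROM site_literature WHERE literature_id = ? ORDER BY site_id",
        ("L2",),
    )
]
print(sites_citing_l2)
```

The junction table restores a many-to-many relationship on demand, so a user who needs the relational form can rebuild it from the flat files without it having to be maintained in the live database.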

Table 1: Functionalities in Arkas 2.0, their type, and how they were developed.