Abstract
This article introduces a newly constructed database: the Memories 1921 Dataset. The database is a representative sample drawn from the Tafel v-bis Dataset and consists of complete inheritance taxation records for 2,321 individuals who died in 1921. To increase the amount of information about these individuals, the authors connected the Memories 1921 Dataset to the original Tafel v-bis Dataset. This added information about the name, birthplace, place of residence, age, marital status, and profession of the deceased to his or her wealth portfolio. The database is open access and available via the Social Sciences and Digital Humanities Archive (SODHA).
- Related data set “Memories 1921 Dataset” with DOI www.doi.org/10.34934/DVN/LXQRRX in repository “SODHA”
1. Introduction
This article introduces a new database comprising information on end-of-life wealth portfolios for a representative sample of the population of people who passed away in the Netherlands in 1921 and were subject to inheritance taxation. It serves as a continuation of the Tafel v-bis Dataset. Whereas the Tafel v-bis Dataset provided summary information on the deceased person’s net wealth, together with personal and socio-economic variables, the Memories 1921 Dataset provides a complete breakdown of the deceased’s wealth composition, including descriptions of every individual asset and liability as they occurred in the source.
This data article proceeds as follows. In section two we provide a detailed description of the dataset. Then we explain our methodology. In section four, we discuss the accuracy of the Memories 1921 Dataset compared to the Tafel v-bis Dataset. Finally, we provide some concluding remarks including avenues for further research for which the Memories 1921 Dataset might be used.
2. Description of the Dataset
- Memories 1921 Dataset deposited at Social Sciences and Digital Humanities Archive (SODHA) – DOI: www.doi.org/10.34934/DVN/LXQRRX
- Temporal coverage: 1921 (1 January–31 December)
The Memories 1921 Dataset formed the basis for the article by Gelderblom et al. (2023), which investigated the estate composition of the richest 30 percent of the Netherlands in 1921, with a particular focus on the financial institutions that people used for saving, borrowing, and lending. We chose the year 1921 because, by that time, the Netherlands had a highly developed and extensive banking sector, with the ratio of bank assets to GDP peaking that year at about 70 percent (Gelderblom et al., 2023, p. 3).
The dataset is published as a part of the replication package which comprises the R scripts for all regressions, graphs, and tables as well as code to load the data correctly. Aside from that, the replication package contains two datasets.
The first dataset is an updated version of the Tafel v-bis Dataset first described by de Vicq and Peeters (2020), which gives summary information about all people who died in 1921 in the Netherlands and were subject to inheritance taxation. The updates include four new variables and the removal of duplicates in the Ref_reg variable. The new variables are the corresponding Amsterdam Codes (acode) for the birthplace and the place of residence. Amsterdam Codes are unique identifiers for each historical municipality in the Netherlands. The Ref_reg variable was initially presumed to be a unique identifier for every individual in the dataset, provided by the original source. In practice, abbreviations during registration caused duplicates across different provinces.1 These duplicates were manually verified and corrected to ensure that Ref_reg identifiers were unique and referred to the intended person.
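A check for duplicated identifiers of this kind can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure (the duplicates were verified and corrected manually). The record layout and the `province_code` field are assumptions made for the example, and the identifier 10#1203 is invented; only 10#9954 comes from the note on abbreviated reference numbers.

```python
from collections import Counter


def find_duplicate_ids(records, key="Ref_reg"):
    """Return the identifiers that occur more than once, sorted."""
    counts = Counter(record[key] for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical records: two provinces ended up sharing the abbreviated
# identifier 10#9954; province_code is an assumed field, not a real column.
records = [
    {"Ref_reg": "10#9954", "province_code": "1"},
    {"Ref_reg": "10#9954", "province_code": "4"},
    {"Ref_reg": "10#1203", "province_code": "1"},
]

duplicates = find_duplicate_ids(records)
print(duplicates)
```

Identifiers flagged this way would still need to be checked by hand against the source before being corrected.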
The second dataset is the Memories 1921 Dataset itself. It contains, for 2,325 individuals, a full breakdown of all the assets and liabilities they had at the moment of death. In total, the database contains 77,123 entries of assets and liabilities. These assets and liabilities have been coded and grouped by asset type to allow for insight into the portfolio make-up.2 Table 1 summarizes the main categories of the extended codebook, which is available in the replication package and as Appendix B to Gelderblom et al. (2023). A full list of variables for both datasets is provided in the readme files accompanying the datasets.
Additional variables have been created, based on information extracted from the raw data entries. For all entries, we added the real value of the item (taking into account the percentage of ownership of the asset or liability). For entries concerning loans, we extracted the interest rate of the loan, the location of the counterparty, and the location of the notary verifying the loan. Distances between locations were calculated using a distance matrix between municipal centres for the year 1890 (Philips, 2020). All Dutch locations were assigned an acode according to the municipality they belonged to in the year 1890 (van der Meer & Boonstra, 2011). This greatly facilitates connecting this database to, for example, the HDNG database (Mourits et al., 2016).
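As a minimal sketch of the ownership adjustment, assuming the ownership share is recorded as a percentage between 0 and 100, the real value of an entry could be derived as follows. The function and argument names are hypothetical and do not correspond to variable names in the dataset.

```python
def real_value(listed_value, ownership_pct):
    """Scale the listed value of an asset or liability by the share owned.

    Assumes ownership_pct is a percentage between 0 and 100.
    """
    if not 0 <= ownership_pct <= 100:
        raise ValueError("ownership_pct must lie between 0 and 100")
    return listed_value * ownership_pct / 100


# Hypothetical entries: a house listed at 12,000 guilders of which the
# deceased owned half, and a savings account owned in full.
half_share_house = real_value(12000, 50)
full_savings = real_value(800, 100)
print(half_share_house, full_savings)
```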
To increase the amount of information about the owners of the end-of-life portfolios, we linked the Memories 1921 Dataset to the Tafel v-bis Dataset via the deduplicated Ref_reg variable. This added information about the name, birthplace, place of residence, age, marital status, and profession of the deceased to their wealth portfolio.
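The linkage amounts to a key-based join on Ref_reg. The sketch below shows one way to express it in plain Python; apart from Ref_reg, all field names and values are invented for illustration, and the replication package may implement the link differently.

```python
def link_portfolios(entries, persons, key="Ref_reg"):
    """Attach personal information to each asset or liability entry.

    Returns the linked entries and the entries without a matching person.
    """
    persons_by_id = {person[key]: person for person in persons}
    linked, unmatched = [], []
    for entry in entries:
        person = persons_by_id.get(entry[key])
        if person is None:
            unmatched.append(entry)
        else:
            # Entry fields take precedence if a field name occurs in both.
            linked.append({**person, **entry})
    return linked, unmatched


# Hypothetical rows; only the Ref_reg field is taken from the article.
persons = [{"Ref_reg": "A1", "profession": "baker", "age": 63}]
entries = [
    {"Ref_reg": "A1", "description": "savings account", "value": 800},
    {"Ref_reg": "Z9", "description": "mortgage", "value": 2500},
]

linked, unmatched = link_portfolios(entries, persons)
print(len(linked), len(unmatched))
```

Keeping the unmatched entries separate makes it easy to see whether any portfolio fails to link, which is why unique Ref_reg identifiers matter.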
3. Data Gathering and Methodology
The underlying source for the Memories 1921 Dataset is the Memorie van Successie (death duty), which was levied on estates with a net value larger than 1,000 guilders.3 The administrative process behind this source, including checks on declarations, has been described in detail in de Vicq and Peeters (2020) and Appendix A of Gelderblom et al. (2023). In short, death duties list all assets and liabilities a person had at the moment of death. The listing was made by representatives of the deceased (i.e. heirs, legatees, custodians) and verified by local officials via income and wealth taxation documents, the official price list of the Amsterdam stock exchange, and the cadaster.4
Needing much more detail than is available in the published aggregate data, we constructed a sample of original death duties, taking into account potential regional differences in both wealth levels and financial behavior. We did this as follows. In 1921, with a total population of 6.8 million, around 77,000 people died in the Netherlands (van der Bie & Smits, 2001). Subtracting stillborns and minors leaves us with about 61,000 adults. Using the Tafel v-bis, we identified 24,263 deceased persons for whom a death duty was submitted, just over one-third of the adults who died in 1921.
Following Piketty and Saez (2006) and Piketty et al. (2014) in their research on Parisian death duty forms, we designed a stratified sample for each of the eleven Dutch provinces, including everybody in the 100th percentile of the wealth distribution, half of the deceased with wealth between the 95th and 99th percentile, down to every sixteenth person in the bottom 70 percent of taxed decedents, plus one out of ten people whose assessment fell below the 1,000 guilder threshold. We oversampled the wealthiest to make sure we captured enough observations at the top end of the wealth distribution to make meaningful comparisons across wealth groups.
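Selecting "every sixteenth person" within a stratum corresponds to systematic sampling. The sketch below illustrates the idea under stated assumptions: the starting position, the ordering of decedents within a stratum, and the identifiers are all hypothetical, and only sampling rates mentioned in the text (everybody, and one in sixteen) are used.

```python
def systematic_sample(stratum, step, start=0):
    """Return every step-th record of a stratum, beginning at index start."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return stratum[start::step]


# Hypothetical stratum of 160 decedents, assumed to be in register order.
bottom_stratum = [f"person_{i:03d}" for i in range(160)]
every_sixteenth = systematic_sample(bottom_stratum, step=16)

# In the top percentile everybody is included, i.e. a step of 1.
top_stratum = ["rich_a", "rich_b", "rich_c"]
everyone = systematic_sample(top_stratum, step=1)

print(len(every_sixteenth), len(everyone))
```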
Our sampling of the data resulted in a total of 2,325 death duties listing over 77,000 assets and liabilities. As Table 2 shows, the sample obtained is smaller than the one we designed because we could not find 459 death duties referred to in the summary tables. These missing death duties are fairly randomly dispersed over the different wealth classes and provinces, except for the lowest wealth Class 1. We miss 208 decedents there, probably because their estate’s value ended up below the 1,000 guilder threshold. Even so, our sample does retain 592 death duties with a net value below 1,000 guilders, a tenth of which actually had a negative net worth. We have classified this latter group of 53 people who were effectively insolvent as a separate Class 7 (Gelderblom et al., 2023).
Once we had selected our sample, a team of eleven research assistants gathered the selected death duties from regional archives across the Netherlands. They photographed the necessary documents or obtained PDF/JPEG copies of already digitized death duties. These were stored to allow for checking in case of doubts or mistakes and are available upon request for other researchers to use.
The research assistants then transcribed from these photos all assets and liabilities into a very basic entry structure in Microsoft Excel. Because of the extensive listings of real estate as one asset, this information was usually summarized instead of literally transcribed. The information about the deceased, the executors of the estate or heirs, or the administrative history of the death duty was not transcribed. This is one way in which the existing database could be improved upon by future researchers. Table 3 summarizes what we noted for every asset or liability. A full list and description of every row in the datasets are provided in the readme file accompanying the datasets.
Apart from the basic entry form, there were three more sheets in the Excel file: (1) a locked one with the entire sample of that province for reference, (2) one with a logbook for comments, and (3) a tab with the division of work allocating which death duty should be entered by whom.
The entry forms were merged by the authors into one master file, which was then coded for analytical purposes. The Type, Description, and Major Category variables formed the basis for coding the assets and liabilities. We first broadly categorized the assets and corrected typos and obvious mistakes, including outliers. In a second round, we manually checked and verified all assigned codes in the database. Then we went through all the rows that were listed as dubious or unknown. We also double-checked the remaining outliers. By then, only a few dozen errors remained, most of which were corrected. All remaining unknown items are listed as Totalcode 999. There are 49 entries with Totalcode 999.
In the final stage, we checked and corrected all death duties where the wealth calculated from the Memories 1921 Dataset differed by more than 10 percent from the net wealth listed in the Tafel v-bis. This resulted in the correction of around 900 death duties and greatly improved the reliability of the dataset. The differences were mostly due to typos in the values or omissions of corrections and addenda to declarations. The Tafel v-bis value should reflect the net wealth as taxed, after all addenda. Where addenda were made, we almost always found them attached to the original death duty and took them into account. The few differences that remained could not, at first sight, be resolved by looking at the sources.
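The 10 percent check can be illustrated with a short sketch. It assumes that each entry carries a signed value (assets positive, liabilities negative), which is a simplification introduced here. The `signed_value` field, the identifiers, and the treatment of a listed net wealth of zero are likewise assumptions rather than a description of the authors' workflow.

```python
def flag_discrepancies(entries, tafel_net_wealth, threshold=0.10):
    """Return identifiers whose summed entries deviate from the listed
    net wealth by more than the given relative threshold."""
    totals = {}
    for entry in entries:
        ref = entry["Ref_reg"]
        totals[ref] = totals.get(ref, 0) + entry["signed_value"]

    flagged = []
    for ref, listed in tafel_net_wealth.items():
        computed = totals.get(ref, 0)
        if listed == 0:
            # Relative difference is undefined; flag any non-zero total.
            differs = computed != 0
        else:
            differs = abs(computed - listed) / abs(listed) > threshold
        if differs:
            flagged.append(ref)
    return sorted(flagged)


# Hypothetical death duties A, B and C.
entries = [
    {"Ref_reg": "A", "signed_value": 5000},
    {"Ref_reg": "A", "signed_value": -1000},
    {"Ref_reg": "B", "signed_value": 2000},
    {"Ref_reg": "C", "signed_value": 1050},
]
tafel_net_wealth = {"A": 4000, "B": 3000, "C": 1000}

flagged = flag_discrepancies(entries, tafel_net_wealth)
print(flagged)
```

In this example only B is flagged: its computed wealth is a third below the listed value, while C deviates by 5 percent and A matches exactly.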
4. Accuracy of the Dataset
Because we were mostly interested in the net wealth of individuals and the portfolio make-up across wealth classes and regions, we tested our sample on that point against the Tafel v-bis. Figure 1 shows, for each individual, the net wealth listed in the Tafel v-bis (x-axis) against the net wealth calculated from the Memories 1921 Dataset (y-axis). The red line indicates no difference. Points above the red line indicate that the net wealth is larger in the Memories 1921 Dataset than in the Tafel v-bis; points below the line indicate that it is smaller.
Figure 1. Net wealth recorded in the Memories 1921 Dataset and the Tafel v-bis Dataset (log-log scale)
Citation: Research Data Journal for the Humanities and Social Sciences 9, 1 (2024); 10.1163/24523666-bja10038
In most cases the net wealth calculated from the death duty is very close to the value listed in the Tafel v-bis: 1,459 cases (63 percent) had a nominal difference of less than 10 guilders, and 1,939 cases (83 percent) had a difference of less than 5 percent of their net wealth. Small differences were usually the result of rounding errors. For 312 individuals, the wealth calculated from the death duties differs by more than 10 percent from the wealth listed in the Tafel v-bis. These are all individuals with a low or zero net wealth listed in the Tafel v-bis, for whom even a small absolute discrepancy rapidly results in a large relative difference.
While not perfect, the dataset seems reliable in the wealth estimates it provides based on individual assets and liabilities. This suggests that the detailed wealth portfolios it records for Dutch wealth holders in 1921 are accurate as well. However, there are two caveats one needs to take into account. First, because the dataset is based on a stratified sample, it is important to correct for the oversampling when calculating wealth shares and relative proportions of asset distributions. This can easily be done by multiplying by the sampling factor.5 Second, we are uncertain whether the dataset can currently be used to re-create the wealth distribution of the living population. Typically, estimates for the living are made using the inverted mortality method (Lindgren, 2022). However, based on the wealth and age distribution from the Tafel v-bis Dataset, we know that the age-related mortality risk was lower for the wealthy than for the general population. This is why, unless wealth-corrected mortality tables are constructed, we advise against naively employing the inverted mortality method.
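The correction for oversampling can be sketched as follows. The sampling factors used are the inverses of the fractions stated in the sample design (everybody in the top percentile, half of the 95th to 99th percentile, one in sixteen in the bottom 70 percent). The strata in between are left out because their fractions are not given here, so the resulting shares are purely illustrative. Stratum labels and wealth values are hypothetical.

```python
# Sampling factor = 1 / sampling fraction, for the strata named in the text.
SAMPLING_FACTORS = {"top_1": 1, "p95_p99": 2, "bottom_70": 16}


def weighted_totals(sample_wealth_by_stratum, factors):
    """Scale the sampled wealth of each stratum up to the stratum total."""
    return {
        stratum: sum(values) * factors[stratum]
        for stratum, values in sample_wealth_by_stratum.items()
    }


def wealth_shares(sample_wealth_by_stratum, factors):
    """Share of each stratum in the weighted total wealth."""
    totals = weighted_totals(sample_wealth_by_stratum, factors)
    grand_total = sum(totals.values())
    return {stratum: total / grand_total for stratum, total in totals.items()}


# Hypothetical sampled net wealth per stratum, in guilders.
sample = {
    "top_1": [500000],
    "p95_p99": [100000, 150000],
    "bottom_70": [2000, 3000],
}

totals = weighted_totals(sample, SAMPLING_FACTORS)
shares = wealth_shares(sample, SAMPLING_FACTORS)
print(totals)
```

Without the weights, the bottom stratum would account for 5,000 guilders of sampled wealth; with them it represents 80,000 guilders, which shows how strongly unweighted shares would overstate the top of the distribution.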
5. Concluding Remarks and Future Avenues
This article describes the Memories 1921 Dataset, including how it was created and how accurate it is compared to its summary source. We believe this dataset will be of interest to other researchers for two reasons. First, there are few publicly available large datasets containing such detailed information.6 By making our dataset available in open access, we aim to contribute to the open science movement and allow others to build on our work. The limited availability of similar datasets for other places also makes the Memories 1921 Dataset of potential interest to international researchers. Second, the literal transcription of every individual asset and liability increases the transparency of the categorizations and extracted variables, and allows other researchers to impose their own categorizations. The level of detail makes the dataset versatile and therefore useful for researchers from diverse academic backgrounds. For instance, economic historians with sociological and demographic interests can use the database to explore the correlation between an individual’s gender, profession, and the wealth portfolio accumulated over a lifetime. Legal scholars can draw on these data and their underlying sources to gain a deeper understanding of notarial deeds and the system of inheritance. Consumption historians can look at the presence of types of goods in people’s homes. Finally, and perhaps most fittingly, financial historians can use these data to explore financial development or asset allocation and, for instance, examine why different groups of individuals held different types of financial assets.
Acknowledgements
The research for this article was funded with a grant from the Dutch Research Council (NWO) (Grant Nr. 277-53-007) and it received funding within the framework of the Odysseus programme from the Research Foundation – Flanders (FWO) (Nr. G0F0421N 114516).
The authors would like to thank Jerome Bekis, Jasper Bongers, Marlon Donck, Duco Heijs, Daan Hendrikx, Stefan Gaillard, Tom Gerritsen, Paul Schilder, Tom ten Berge, Constant van der Putten, and Tirreg Verburg for gathering and transcribing the selected death duties. We also thank Matthias Van Laer De Gezelle for excellent research assistance in controlling and correcting the death duties calculated from the Memories 1921 Dataset.
References
Bureau van de Statistiek der gemeente Amsterdam. (1919). De uitgaven van 114 ambtenaars- en arbeidersgezinnen.
de Vicq, A., & Peeters, R. (2020). Introduction to the Tafel v-bis Dataset: Death duty summary information for the Netherlands, 1921. Research Data Journal for the Humanities and Social Sciences, 5, 1–19. www.doi.org/10.1163/24523666-BJA10007.
Gelderblom, O., Jonker, J., Peeters, R., & de Vicq, A. (2023). Exploring modern bank penetration: Evidence from the early 20th-century Netherlands. Economic History Review, 76, 892–916.
Lindgren, H. (2022). ‘Over-indebtedness’ – or not? Household debt accumulation and risk exposure in nineteenth century Sweden. Scandinavian Economic History Review, 70(1), 33–56. www.doi.org/10.1080/03585522.2021.1879242.
Mourits, R. J., Boonstra, O., Knippenberg, H., Hofstee, E. W., & Zijdeman, R. L. (2016). Historische Database Nederlandse Gemeenten (Version 5) [dataset]. IISH Data Collection. https://hdl.handle.net/10622/RPBVK4.
Philips, R. C. M. (2020). Continuity or change? The evolution in the location of industry in the Netherlands and Belgium (1820–2010) [Utrecht University]. www.doi.org/10.33540/417.
Piketty, T., Postel-Vinay, G., & Rosenthal, J. L. (2014). Inherited vs self-made wealth: Theory & evidence from a rentier society (Paris 1872–1927). Explorations in Economic History, 51(1), 21–40.
Piketty, T., & Saez, E. (2006). The evolution of top incomes: A historical and international perspective. American Economic Review, 96(2), 200–205.
van der Bie, R. J., & Smits, J. P. (2001). Tweehonderd jaar statistiek in tijdreeksen 1800–1999. Centraal Bureau voor de Statistiek.
van der Meer, A., & Boonstra, O. (2011). Repertorium van Nederlandse gemeenten vanaf 1812 waaraan toegevoegd de Amsterdamse code [dataset]. DANS Data Station Social Sciences and Humanities. www.doi.org/10.17026/DANS-XDR-CS36.
1. For example: Ref_Reg 1#10#9954 and Ref_Reg 4#10#9954 both became abbreviated to 10#9954 during the administrative procedure. This, however, made them no longer unique reference numbers.
2. The coding was the result of a collaboration between Oscar Gelderblom, Joost Jonker, Ruben Peeters and Amaury de Vicq.
3. This was about half the yearly income of a skilled laborer in 1919 (Bureau van de Statistiek der gemeente Amsterdam, 1919).
4. Securities were valued according to the official price list (prijscourant) closest to the date of death (usually the day before death). Other assets such as real estate and movables were valued through an estimate of market value at time of death, but it is unclear who provided the estimate (professionals or heirs).
5. For example, we sampled half of the deceased with wealth between the 95th and 99th percentile, so the wealth of this group needs to be multiplied by two to obtain the total wealth of this group.
6. The most notable open access datasets are the Piketty, Postel-Vinay and Rosenthal dataset for Paris and France, available as a replication package: www.doi.org/10.3886/E116081V1; the Gloria L. Main sample of 18,509 New England probates between 1631 and 1776, available via eh.net: https://www.eh.net/database/new-england-1631-1776-sample-of-18509-probates; and for the Netherlands the Meertens Institute Boedelbank, available via https://boedelbank.meertens.knaw.nl.