Record linkage in the Cape of Good Hope Panel

Stellenbosch Working Paper Series No. WP06/2018
 
Publication date: May 2018
 
Author(s):
[protected email address] (Department of History, Utrecht University)
[protected email address] (Department of Economic History, Lund University)
[protected email address] (Department of Economics, Stellenbosch University)
 
Abstract:

In this paper we describe the record linkage procedure to create a panel from Cape Colony census returns, or opgaafrolle, for 1787–1828, a dataset of 42,354 household-level observations. Based on a subset of manually linked records, we first evaluate statistical models and deterministic algorithms to best identify and match households over time. By using household-level characteristics in the linking process and near-annual data, we are able to create high-quality links for 84 percent of the dataset. We compare basic analyses on the linked panel dataset to the original cross-sectional data, evaluate the feasibility of the strategy when linking to supplementary sources, and discuss the scalability of our approach to the full Cape panel.

 
JEL Classification:

N01, C81

Keywords:

census, machine learning, micro-data, record linkage, panel data, South Africa

