A machine learning approach to domain specific dictionary generation. An economic time series framework
Stellenbosch Working Paper Series No. WP06/2021
Publication date: March 2021
Author(s):
This paper offers an alternative to the labour-intensive manual construction of a domain-specific lexicon or dictionary by operationalising subjective information processing. It builds on the current empirical literature by (a) constructing a domain-specific dictionary for various economic confidence indices; (b) introducing a novel weighting scheme for text tokens that accounts for time dependence; and (c) operationalising the subjective processing of text data using machine learning. The results show that sentiment indices constructed from machine-generated dictionaries fit multiple indicators of economic activity better than the manually constructed dictionary of Loughran and McDonald (2011). The domain-specific dictionaries yield a lower RMSE over a five-year holdout sample from 2012 to 2017. The results also justify the time series weighting design used to overcome the p >> n problem (more predictors than observations), which is commonly encountered when working with economic time series and text data.
JEL Classification: C32, C45, C53, C55
Keywords: Sentometrics, Machine learning, Domain-specific dictionaries
Notes: Data download: Generated Dictionaries
Download: PDF (738 KB)