A Framework for Investigating Micro Data Quality, with Application to South African Labour Market Household Surveys

Working Paper



Southern Africa Labour and Development Research Unit


University of Cape Town


In this paper, the Total Survey Error (TSE) paradigm is combined with detailed data quality indicators to develop a framework for investigating micro data quality. The TSE framework is widely used in the survey methodology literature to identify the different components of error that arise in the survey process. It therefore gives researchers a useful typology for understanding which data quality issues are relevant in applied work based on these surveys. To demonstrate how the framework sheds light on micro data quality, two labour market household surveys conducted by Statistics South Africa are reviewed, spanning the period from 1995 to 2007. It is argued that efforts to improve data quality should involve a virtuous interaction between producers and consumers of micro data, and should be considered an evolving process. For producers of data, the preparation and publication of detailed data quality frameworks is recommended, and two examples of these frameworks are reviewed. For consumers of data, judicious analyses of the univariate, bivariate and multivariate relationships in public-use versions of the datasets can help shed light on different components of survey error, and the findings should be communicated back to survey organisations. Ultimately, improving data quality requires being more explicit about the limitations of data production at each stage of a process that does not stop at initial public release. This is a joint SALDRU and DataFirst working paper.