Testing for a common latent variable in a linear regression: Or how to "fix" a bad variable by adding multiple proxies for it
Working Paper
2015-05-28
Publisher
CSSR and SALDRU
Publisher
University of Cape Town
Abstract
We analyse models in which additional “controls” or proxies are included in a regression. This might occur intentionally if there is significant measurement error in a key regressor or if a key variable is not measured at all. We develop a test of the hypothesis that a subset of the regressors are all proxying for the same latent variable, and we show how an estimate of the structural coefficient might be obtained more efficiently than with the methods available in the current literature. We apply the procedure to the determinants of sleep among young South Africans. We show that the income variable in the time use survey is badly measured. Nevertheless, the measured impact of income on sleep is significant and amounts to a difference of 35 minutes per day between children with the median income and those in the topmost income bracket. Including a variety of asset proxies increases the estimated size of the coefficient enormously. The specification tests indicate, however, that some of the asset proxies have independent effects. Access to electricity, in particular, is not simply proxying for income. Instead, it seems to be capturing access to various forms of entertainment, such as television. Even when this independent effect is properly accounted for, the size of the income coefficient is still 40% to 100% larger than in the specifications without the proxies.
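The intuition that a badly measured regressor understates an effect, and that extra proxies can recover part of it, can be illustrated with a small simulation. The sketch below is not the paper's test or estimator. It assumes a textbook setup in which one latent variable drives the outcome with coefficient 1, and three observed measures (hypothetically named `income`, `asset1` and `asset2`) each equal the latent variable plus independent unit-variance noise. All names, sample sizes and parameter values are illustrative assumptions. Because every measure is assumed to have the same loading on the latent variable, the sum of their coefficients is used here as a simple combined estimate.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(y, columns):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate(n=20000, beta=1.0, seed=0):
    """Compare the slope on one noisy measure with the summed slopes on three."""
    rng = random.Random(seed)
    # Assumed data-generating process: a single latent variable drives the outcome.
    latent = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * s + rng.gauss(0, 1) for s in latent]
    # Each observed measure is the latent variable plus independent noise.
    income = [s + rng.gauss(0, 1) for s in latent]
    asset1 = [s + rng.gauss(0, 1) for s in latent]
    asset2 = [s + rng.gauss(0, 1) for s in latent]
    short = ols(y, [income])
    long = ols(y, [income, asset1, asset2])
    # Skip the intercept (index 0) in both regressions.
    return short[1], sum(long[1:])


if __name__ == "__main__":
    short_slope, combined_slope = simulate()
    print("slope on noisy income alone:", round(short_slope, 3))
    print("sum of slopes with proxies: ", round(combined_slope, 3))
```

Under these assumptions the population slope on the single noisy measure is 0.5 and the summed slopes across the three measures come to 0.75, so both understate the true coefficient of 1 but the version with proxies is closer. If one of the proxies also affected the outcome directly, as the abstract reports for access to electricity, this simple sum would no longer isolate the effect of the latent variable. That is the kind of situation the paper's specification test is designed to detect.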