A variance shift model for outlier detection and estimation in linear and linear mixed models

 

Show simple item record

dc.contributor.advisor Dunne, Tim en_ZA
dc.contributor.author Gumedze, Freedom Nkhululeko en_ZA
dc.date.accessioned 2014-07-30T17:44:00Z
dc.date.available 2014-07-30T17:44:00Z
dc.date.issued 2009 en_ZA
dc.identifier.citation Gumedze, F. 2009. A variance shift model for outlier detection and estimation in linear and linear mixed models. University of Cape Town. en_ZA
dc.identifier.uri http://hdl.handle.net/11427/4380
dc.description.abstract Outliers are data observations that fall outside the usual conditional ranges of the response data.They are common in experimental research data, for example, due to transcription errors or faulty experimental equipment. Often outliers are quickly identified and addressed, that is, corrected, removed from the data, or retained for subsequent analysis. However, in many cases they are completely anomalous and it is unclear how to treat them. Case deletion techniques are established methods in detecting outliers in linear fixed effects analysis. The extension of these methods to detecting outliers in linear mixed models has not been entirely successful, in the literature. This thesis focuses on a variance shift outlier model as an approach to detecting and assessing outliers in both linear fixed effects and linear mixed effects analysis. A variance shift outlier model assumes a variance shift parameter, !i, for the ith observation, where !i is unknown and estimated from the data. Estimated values of !i indicate observations with possibly inflated variances relative to the remainder of the observations in the data set and hence outliers. When outliers lurk within anomalous elements in the data set, a variance shift outlier model offers an opportunity to include anomalies in the analysis, but down-weighted using the variance shift estimate Ë!i. This down-weighting might be considered preferable to omitting data points (as in case-deletion methods). For very large values of !i a variance shift outlier model is approximately equivalent to the case deletion approach. We commence with a detailed review of parameter estimation and inferential procedures for the linear mixed model. The review is necessary for the development of the variance shift outlier model as a method for detecting outliers in linear fixed and linear mixed models. This review is followed by a discussion of the status of current research into linear mixed model diagnostics. Different types of residuals in the linear mixed model are defined. A decomposition of the leverage matrix for the linear mixed model leads to interpretable leverage measures. ii A detailed review of a variance shift outlier model in linear fixed effects analysis is given. The purpose of this review is firstly, to gain insight into the general case (the linear mixed model) and secondly, to develop the model further in linear fixed effects analysis. A variance shift outlier model can be formulated as a linear mixed model so that the calculations required to estimate the parameters of the model are those associated with fitting a linear mixed model, and hence the model can be fitted using standard software packages. Likelihood ratio and score test statistics are developed as objective measures for the variance shift estimates. The proposed test statistics initially assume balanced longitudinal data with a Gaussian distributed response variable. The dependence of the proposed test statistics on the second derivatives of the log-likelihood function is also examined. For the single-case outlier in linear fixed effects analysis, analytical expressions for the proposed test statistics are obtained. A resampling algorithm is proposed for assessing the significance of the proposed test statistics and for handling the problem of multiple testing. A variance shift outlier model is then adapted to detect a group of outliers in a fixed effects model. Properties and performance of the likelihood ratio and score test statistics are also investigated. A variance shift outlier model for detecting single-case outliers is also extended to linear mixed effects analysis under Gaussian assumptions for the random effects and the random errors. The variance parameters are estimated using the residual maximum likelihood method. Likelihood ratio and score tests are also constructed for this extended model. Two distinct computing algorithms which constrain the variance parameter estimates to be positive, are given. Properties of the resulting variance parameter estimates from each computing algorithm are also investigated. A variance shift outlier model for detecting single-case outliers in linear mixed effects analysis is extended to detect groups of outliers or subjects having outlying profiles with random intercepts and random slopes that are inconsistent with the corresponding model elements for the remaining subjects in the data set. The issue of influence on the fixed effects under a variance shift outlier model is also discussed. en_ZA
dc.language.iso eng
dc.subject.other Statistical Sciences en_ZA
dc.title A variance shift model for outlier detection and estimation in linear and linear mixed models en_ZA
dc.type Doctoral Thesis
uct.type.publication Research en_ZA
uct.type.resource Thesis en_ZA
dc.publisher.institution University of Cape Town
dc.publisher.faculty Faculty of Science en_ZA
dc.publisher.department Department of Statistical Sciences en_ZA
dc.type.qualificationlevel Doctoral
dc.type.qualificationname PhD en_ZA
uct.type.filetype Text
uct.type.filetype Image
dc.identifier.apacitation Gumedze, F. N. (2009). <i>A variance shift model for outlier detection and estimation in linear and linear mixed models</i>. (Thesis). University of Cape Town ,Faculty of Science ,Department of Statistical Sciences. Retrieved from http://hdl.handle.net/11427/4380 en_ZA
dc.identifier.chicagocitation Gumedze, Freedom Nkhululeko. <i>"A variance shift model for outlier detection and estimation in linear and linear mixed models."</i> Thesis., University of Cape Town ,Faculty of Science ,Department of Statistical Sciences, 2009. http://hdl.handle.net/11427/4380 en_ZA
dc.identifier.vancouvercitation Gumedze FN. A variance shift model for outlier detection and estimation in linear and linear mixed models. [Thesis]. University of Cape Town ,Faculty of Science ,Department of Statistical Sciences, 2009 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/4380 en_ZA
dc.identifier.ris TY - Thesis / Dissertation AU - Gumedze, Freedom Nkhululeko AB - Outliers are data observations that fall outside the usual conditional ranges of the response data.They are common in experimental research data, for example, due to transcription errors or faulty experimental equipment. Often outliers are quickly identified and addressed, that is, corrected, removed from the data, or retained for subsequent analysis. However, in many cases they are completely anomalous and it is unclear how to treat them. Case deletion techniques are established methods in detecting outliers in linear fixed effects analysis. The extension of these methods to detecting outliers in linear mixed models has not been entirely successful, in the literature. This thesis focuses on a variance shift outlier model as an approach to detecting and assessing outliers in both linear fixed effects and linear mixed effects analysis. A variance shift outlier model assumes a variance shift parameter, !i, for the ith observation, where !i is unknown and estimated from the data. Estimated values of !i indicate observations with possibly inflated variances relative to the remainder of the observations in the data set and hence outliers. When outliers lurk within anomalous elements in the data set, a variance shift outlier model offers an opportunity to include anomalies in the analysis, but down-weighted using the variance shift estimate Ë!i. This down-weighting might be considered preferable to omitting data points (as in case-deletion methods). For very large values of !i a variance shift outlier model is approximately equivalent to the case deletion approach. We commence with a detailed review of parameter estimation and inferential procedures for the linear mixed model. The review is necessary for the development of the variance shift outlier model as a method for detecting outliers in linear fixed and linear mixed models. This review is followed by a discussion of the status of current research into linear mixed model diagnostics. Different types of residuals in the linear mixed model are defined. A decomposition of the leverage matrix for the linear mixed model leads to interpretable leverage measures. ii A detailed review of a variance shift outlier model in linear fixed effects analysis is given. The purpose of this review is firstly, to gain insight into the general case (the linear mixed model) and secondly, to develop the model further in linear fixed effects analysis. A variance shift outlier model can be formulated as a linear mixed model so that the calculations required to estimate the parameters of the model are those associated with fitting a linear mixed model, and hence the model can be fitted using standard software packages. Likelihood ratio and score test statistics are developed as objective measures for the variance shift estimates. The proposed test statistics initially assume balanced longitudinal data with a Gaussian distributed response variable. The dependence of the proposed test statistics on the second derivatives of the log-likelihood function is also examined. For the single-case outlier in linear fixed effects analysis, analytical expressions for the proposed test statistics are obtained. A resampling algorithm is proposed for assessing the significance of the proposed test statistics and for handling the problem of multiple testing. A variance shift outlier model is then adapted to detect a group of outliers in a fixed effects model. Properties and performance of the likelihood ratio and score test statistics are also investigated. A variance shift outlier model for detecting single-case outliers is also extended to linear mixed effects analysis under Gaussian assumptions for the random effects and the random errors. The variance parameters are estimated using the residual maximum likelihood method. Likelihood ratio and score tests are also constructed for this extended model. Two distinct computing algorithms which constrain the variance parameter estimates to be positive, are given. Properties of the resulting variance parameter estimates from each computing algorithm are also investigated. A variance shift outlier model for detecting single-case outliers in linear mixed effects analysis is extended to detect groups of outliers or subjects having outlying profiles with random intercepts and random slopes that are inconsistent with the corresponding model elements for the remaining subjects in the data set. The issue of influence on the fixed effects under a variance shift outlier model is also discussed. DA - 2009 DB - OpenUCT DP - University of Cape Town LK - https://open.uct.ac.za PB - University of Cape Town PY - 2009 T1 - A variance shift model for outlier detection and estimation in linear and linear mixed models TI - A variance shift model for outlier detection and estimation in linear and linear mixed models UR - http://hdl.handle.net/11427/4380 ER - en_ZA


Files in this item

This item appears in the following Collection(s)

Show simple item record