Outliers and influence under arbitrary variance

Doctoral Thesis



University of Cape Town

Using a geometric approach to best linear unbiased estimation in the general linear model, the additional sum of squares principle, which is used to generate decompositions, can be generalized to allow an efficient treatment of augmented linear models. The notion of the admissibility of a new variable is useful when augmenting models. Best linear unbiased estimation and tests of hypotheses can be performed through transformations and reparametrizations of the general linear model. The theory of outliers and influential observations can be generalized to the general univariate linear model, where three types of outlier and influence may be distinguished. The adjusted models, adjusted parameter estimates, and test statistics corresponding to each type of outlier are obtained, and data adjustments can be effected. Relationships to missing data problems are exhibited. A unified approach to outliers in the general linear model is developed. The concept of recursive residuals admits generalization. The typification of outliers and influential observations in the general linear model can be extended to normal multivariate models. When the outliers in a multivariate regression model follow a nested pattern, maximum likelihood estimates of the parameters in the model adjusted for the different types of outlier, and the corresponding likelihood ratio test statistic, are obtained in closed form. For an arbitrary outlier pattern, and for the problem of outliers in the generalized multivariate regression model, three versions of the EM algorithm, corresponding to the three types of outlier, are used to obtain maximum likelihood estimates iteratively. A fundamental principle is the comparison of observations with a choice of distribution appropriate to the presumed type of outlier present. Applications are not necessarily restricted to multivariate normality.
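As a minimal illustration of two ideas named in the abstract, estimation through a transformation of the general linear model and the additional sum of squares for an augmented model, the sketch below tests a single observation under the standard mean-shift outlier model with a known covariance structure V. This is a textbook construction and is not taken from the thesis; it is not claimed to correspond to any of the three outlier types the thesis distinguishes. The function names `whiten`, `rss`, and `mean_shift_F` are hypothetical, and V is assumed to be known and positive definite.

```python
import numpy as np

def whiten(X, y, V):
    """Transform y = X b + e with Var(e) = sigma^2 V into a model with
    identity covariance, using the Cholesky factor V = L L^T.
    Assumes V is symmetric positive definite."""
    L = np.linalg.cholesky(V)
    return np.linalg.solve(L, X), np.linalg.solve(L, y)

def rss(X, y):
    """Residual sum of squares from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def mean_shift_F(X, y, V, i):
    """F statistic for a mean shift in observation i (illustrative only).

    The model is augmented with an indicator column for observation i.
    The additional sum of squares is the drop in the generalized residual
    sum of squares caused by adding that column."""
    n, p = X.shape
    d = np.zeros((n, 1))
    d[i] = 1.0
    Xw, yw = whiten(X, y, V)                     # original model
    Xaw, _ = whiten(np.hstack([X, d]), y, V)     # augmented model
    rss0, rss1 = rss(Xw, yw), rss(Xaw, yw)
    extra = rss0 - rss1                          # additional sum of squares
    F = extra / (rss1 / (n - p - 1))             # 1 and n - p - 1 d.f.
    return F, extra
```

When V is the identity, the additional sum of squares reduces to e_i^2 / (1 - h_ii), where e_i is the ordinary residual and h_ii the leverage of observation i, so F equals the squared externally studentized residual.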

