Variable selection in logistic regression, with special application to medical data

Master Thesis

1994

Publisher

University of Cape Town

Abstract
In this thesis, the methods of variable selection that have been proposed in the statistical, epidemiological and medical literature for prediction and estimation problems in logistic regression will be described. The procedures will be applied to medical data sets. On the basis of the literature review and of these applications, the strengths and weaknesses of the approaches will be identified. The procedures will be compared on the basis of the results obtained, their appropriateness for the specific aim of the analysis, and the intellectual and computational demands they place on the analyst and researcher. In particular, certain selection procedures using bootstrap samples, which have not been used before, will be investigated, and the partial Gauss discrepancy will be extended to the case of logistic regression. Recommendations will be made as to which approaches are the most suitable or most practical in different situations. Most statistical texts deal with issues regarding prediction, whereas the epidemiological literature focuses on estimation. It is therefore hoped that the thesis will be a useful reference for those, whether statistically or epidemiologically trained, who have to deal with variable selection in logistic regression.

When fitting models in general, and logistic regression models in particular, it is standard practice to assess the goodness of fit of the models and to ascertain whether outliers or influential observations are present in a data set. These aspects will not be discussed in this thesis, although they were considered when fitting the models.
Description

Bibliography: pages 121-126.
