Outcome selection in longitudinal analysis of immunological data

dc.contributor.advisor: Little, Francesca
dc.contributor.advisor: Nemes, Elisa
dc.contributor.author: Holcroft, Shannon
dc.date.accessioned: 2025-11-20T11:12:19Z
dc.date.available: 2025-11-20T11:12:19Z
dc.date.issued: 2025
dc.date.updated: 2025-11-20T11:06:22Z
dc.description.abstract: Immunological research often compares subgroups defined by exposure variables known (or hypothesised) to influence continuous immune responses. Many immune outcomes are measured over time, often in a small number of patients. Effective outcome selection ensures that research focuses on immune outcomes with the strongest signals for subgroup differences. This dissertation explores an outcome selection technique for longitudinal immunological data, addressing current methodological limitations and proposing improvements. The approach integrates statistical modelling with dimension reduction to identify immune outcomes with the most evidence for subgroup differences. By focusing on these subsets, fewer statistical hypotheses are tested simultaneously, preserving power when stricter significance thresholds are applied to reduce type-I error inflation. The dissertation examines the suitability of different longitudinal modelling frameworks. Generalised linear mixed-effects models are better suited to the characteristics of immunological data and research than linear mixed-effects models. Two dimension reduction techniques are compared: principal component analysis (PCA) and hierarchical cluster analysis (HCA) followed by PCA. PCA identifies the largest sources of variance across all outcomes, while HCA followed by PCA identifies variance within groups of similar outcomes. These techniques influence the definition of families of tests for false discovery rate (FDR) corrections. When outcomes are selected via PCA-only dimension reduction, more tests are performed simultaneously and require correction. It was hypothesised that HCA followed by PCA would yield more significant discoveries after FDR control. However, fewer simultaneous comparisons did not reliably correspond with more statistically significant discoveries.
The methodology was applied to a dataset from the South African Tuberculosis Vaccine Initiative (SATVI), focusing on 33 immune outcomes and three exposures: MVA85A priming, maternal Mycobacterium tuberculosis sensitisation (measured by a positive QuantiFERON-TB Gold test), and combinations of feeding practices and cotrimoxazole treatment. The analysis shows that different dimension reduction techniques lead to different outcome selections and families of tests, emphasising the need to align analysis objectives with outcome selection techniques. This dissertation contributes to outcome selection methodology in high-dimensional, longitudinal settings, with broader applications in biomedical research.
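The abstract's point that the size of a family of tests affects FDR-corrected discoveries can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR correction (the abstract does not name the procedure used), and the p-values and the function name `benjamini_hochberg` are hypothetical, chosen only for illustration. It shows the mechanism behind the hypothesis, not the dissertation's result, which reports that smaller families did not reliably yield more discoveries.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where a hypothesis is rejected at FDR level q.

    Assumed procedure: Benjamini-Hochberg step-up (not confirmed by the source).
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values: the same three small p-values tested in a small
# family of 3 and in a larger family of 10 (padded with weak signals).
small_family = [0.010, 0.020, 0.030]
large_family = small_family + [0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]

print(sum(benjamini_hochberg(small_family)))  # discoveries in the family of 3
print(sum(benjamini_hochberg(large_family)))  # discoveries in the family of 10
```

In this toy input the three small p-values are all rejected in the family of three but none survive in the family of ten, because each rank's threshold shrinks as the family grows. Real p-values depend on which outcomes each dimension reduction technique selects, so the effect need not hold in practice.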
dc.identifier.apacitation: Holcroft, S. (2025). Outcome selection in longitudinal analysis of immunological data. University of Cape Town, Faculty of Science, Department of Statistical Sciences. Retrieved from http://hdl.handle.net/11427/42279
dc.identifier.chicagocitation: Holcroft, Shannon. "Outcome selection in longitudinal analysis of immunological data." University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2025. http://hdl.handle.net/11427/42279
dc.identifier.citation: Holcroft, S. 2025. Outcome selection in longitudinal analysis of immunological data. University of Cape Town, Faculty of Science, Department of Statistical Sciences. http://hdl.handle.net/11427/42279
dc.identifier.ris:
TY - Thesis / Dissertation
AU - Holcroft, Shannon
DA - 2025
DB - OpenUCT
DP - University of Cape Town
KW - principal component analysis
KW - PCA
LK - https://open.uct.ac.za
PB - University of Cape Town
PY - 2025
T1 - Outcome selection in longitudinal analysis of immunological data
TI - Outcome selection in longitudinal analysis of immunological data
UR - http://hdl.handle.net/11427/42279
ER -
dc.identifier.uri: http://hdl.handle.net/11427/42279
dc.identifier.vancouvercitation: Holcroft S. Outcome selection in longitudinal analysis of immunological data. University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2025 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/42279
dc.language.rfc3066: eng
dc.publisher.department: Department of Statistical Sciences
dc.publisher.faculty: Faculty of Science
dc.publisher.institution: University of Cape Town
dc.subject: principal component analysis
dc.subject: PCA
dc.title: Outcome selection in longitudinal analysis of immunological data
dc.type: Thesis / Dissertation
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_sci_2025_holcroft shannon.pdf
Size: 8.99 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission