Latent class modelling of respiratory outcomes in a South African birth cohort

Thesis / Dissertation

2025

Publisher

University of Cape Town

Abstract
Background & Purpose: Early life is a key period determining long-term health, particularly in low-income and middle-income countries (LMICs), where adverse exposures are common. This thesis focuses on two key respiratory outcomes: childhood wheezing and lung function. It examines (1) the underlying heterogeneity of wheezing and lung function measurements through the derivation of latent phenotype classes and (2) the impact of early-life lower respiratory tract infections (LRTIs) and other predictors on wheezing and lung function.

Methods: Data from the Drakenstein Child Health Study (DCHS), which tracked 1143 children until school age, were used to critically evaluate statistical approaches for identifying latent structures and modelling longitudinal trajectories. To identify wheezing phenotypes, partitioning around medoids (PAM) clustering tailored for longitudinal data, latent class analysis (LCA), and LCA with random effects were applied and evaluated. Lung function measurements were adjusted for height (as a proxy for body size) using a multiplicative model. Before latent class mixed-effects models (LCMM) were used to identify latent classes in the longitudinal lung function data, generalised additive models for location, scale, and shape (GAMLSS) and interrupted time series (ITS) approaches were used to model the impact of LRTIs and other risk factors on longitudinal trajectories. Well-established distributions were chosen within the GAMLSS framework to minimise overfitting and ensure robustness.

Findings: PAM identified four clinically distinct wheezing phenotypes (never, early transient, late-onset, and recurrent) and outperformed LCA, which struggled with class separation. Random-effect LCA improved the modelling of recurrent wheezing trends but was limited by rare or sparse data patterns.
In the investigation of phenotype-specific risk factors, cross-validation proved valuable for checking the robustness and stability of variable selection and model estimates across different data subsets. For lung function development, integrating the multiplicative model, the GAMLSS framework, and the ITS methodology provided a comprehensive picture of growth patterns, capturing both population-level trends and individual variability while accounting for step changes over time. Using LMS (lambda-mu-sigma) z-scores, LCMM identified underlying features in the longitudinal data, with two latent classes capturing the heterogeneity: a normal group and a smaller group of children who deviated from it. The multivariate LCMM approach captured shared latent processes influencing multiple dimensions of lung function, improving the interpretability of class assignments. Class separation analysis highlighted the need for careful selection of the number of latent classes, balancing interpretability against the ability to capture subtle data structures.

Conclusions & implications: This thesis reveals how various factors shape respiratory health and the diverse pathways children follow in LMICs. The distinct wheezing patterns in LMICs (compared with a UK cohort) highlight the need for region-specific health policies and call for further study of long-term effects in diverse settings.
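
As an illustration of the PAM step named in the Methods, the following is a minimal sketch of medoid-based clustering on binary wheeze sequences. It is not the thesis's implementation: the abstract does not say how PAM was tailored for longitudinal data, so the Hamming distance, the six time points, the invented toy sequences, and the helper names (`hamming`, `total_cost`, `pam`) are all assumptions made for illustration. The swap phase also accepts the first improving swap, which is a simplification of the classical algorithm.

```python
# Toy sketch of partitioning around medoids (PAM) on binary wheeze sequences.
# Assumptions (not from the thesis): Hamming distance, 6 time points, invented data.

def hamming(a, b):
    """Number of time points at which two wheeze sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def total_cost(data, medoids):
    """Sum over children of the distance to their nearest medoid."""
    return sum(min(hamming(row, data[m]) for m in medoids) for row in data)


def pam(data, k):
    """Return (medoid indices, cluster label per row) for k clusters."""
    n = len(data)

    # BUILD phase: greedily add the row that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(
            min(candidates, key=lambda i: total_cost(data, medoids + [i]))
        )

    # SWAP phase (simplified): accept any medoid/non-medoid swap that strictly
    # lowers the total cost, and repeat until no swap helps.
    improved = True
    while improved:
        improved = False
        best = total_cost(data, medoids)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(data, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True

    # Assign every row to its nearest medoid.
    labels = [
        min(range(k), key=lambda j: hamming(row, data[medoids[j]]))
        for row in data
    ]
    return medoids, labels


# Invented prototype patterns (1 = wheeze reported at that visit), named after
# the four phenotypes listed in the abstract.
prototypes = {
    "never": [0, 0, 0, 0, 0, 0],
    "early transient": [1, 1, 1, 0, 0, 0],
    "late-onset": [0, 0, 0, 1, 1, 1],
    "recurrent": [1, 1, 1, 1, 1, 1],
}
counts = {"never": 4, "early transient": 2, "late-onset": 2, "recurrent": 2}
sequences = [
    list(prototypes[name]) for name in prototypes for _ in range(counts[name])
]

medoids, labels = pam(sequences, k=4)
for m in medoids:
    print("medoid pattern:", sequences[m])
print("cluster labels:", labels)
```

Because each medoid is an observed sequence, every cluster can be read directly as a wheeze pattern over time. Real cohort data are noisier and contain missing visits, so the choice of distance and of the number of clusters would need far more care than this toy example suggests.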