
Browsing by Author "Mashatola, Lebohang"

Now showing 1 - 2 of 2
  • Open Access
    Exploring topological data analysis in gene expression data: topology-driven biomarker discovery and clinical outcome prediction in oncology
    (2025) Nyase, Ndivhuwo; Mashatola, Lebohang; Muller, Julia; Sinkala, Musalula
    This thesis is grounded in the fundamental observation that biological data has shape and that this shape matters. Beneath the high-dimensional, often noisy landscape of gene expression profiles lie hidden topological structures (connected components, loops, and voids) that capture the complex relationships driving cancer development and progression. By embracing this perspective, we position Topological Data Analysis (TDA) and persistent homology at the core of a novel analytical framework designed to tackle two key challenges in cancer research: clinical outcome prediction and biomarker discovery. In this study, we employ Weighted Gene Topological Data Analysis (WGTDA) to extract topological features from gene expression data, which serve as prognostic biomarkers for cancer classification, staging, and treatment response. Moreover, by integrating these topological features with machine learning models, we aim to enhance the predictive accuracy for clinical outcomes. For clinical outcome prediction, we transformed gene expression profiles into topological fingerprints using multiple co-expression measures: Pearson Correlation, Distance Correlation, and Weighted Topological Overlap (wTO) computed with both Pearson and Distance-based adjacencies. These topological features were analyzed using Random Forests. In parallel, we compared the predictive performance of traditional machine learning models (SVM, Gradient Boosting Decision Trees, Random Forest, and Neural Networks) trained on raw gene expression data against models incorporating the topological fingerprints. This comparative analysis was conducted across three classification tasks: cancer type (using the TCGA-SARC, TCGA-PCPG, and TCGA-ESCA datasets), cancer staging (using TCGA-HNSC for stages I–IV), and treatment response (responders vs. non-responders).
    For biomarker identification, the same three tasks were applied using the best-performing co-expression measure to generate a global topological representation of the patient population. This provided a disease-level view, highlighting shared homological patterns to facilitate biomarker discovery. Additionally, a dedicated visualization tool has been developed to aid in interpreting these topological signatures and identifying critical biomarkers. The tool is available at https://nnyase.github.io/MSc-Thesis/. WGTDA significantly improved performance on phenotype prediction tasks by overcoming common pitfalls of traditional ML models on RNA-Seq data, such as overfitting and poor handling of class imbalance. TDA-derived features improved the generalizability of ML models in tasks such as cancer staging and treatment response prediction. Our findings strongly support the integration of TDA into clinical outcome prediction, demonstrating its value in capturing nuanced patterns that allow ML methods to learn more effectively. Moreover, WGTDA identified key gene signatures for cancer type, staging, and treatment response without relying on pre-existing biological assumptions, yielding biomarkers that are strongly supported by the existing literature. These results underscore the method's reliability and potential clinical utility in precision oncology.
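    As a rough illustration of the idea in the abstract above (co-expression measure, then persistent homology, then features), the following minimal sketch computes only the 0-dimensional part of persistent homology, that is, the scales at which connected components merge. It is not the thesis's WGTDA implementation: the function names, the distance 1 - |r|, and the toy profiles are all assumptions made for illustration, and loops and voids would require a dedicated persistent-homology library.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def h0_death_times(profiles):
    """Return the scales at which gene components merge (H0 death times).

    `profiles` maps a gene name to its expression values across samples.
    Genes are treated as points with distance 1 - |r| (an assumption);
    every component is born at scale 0, and one dies at each merge.
    """
    genes = list(profiles)
    # All pairwise distances, sorted so edges enter in filtration order.
    edges = sorted(
        (max(0.0, 1.0 - abs(pearson(profiles[g], profiles[h]))), i, j)
        for (i, g), (j, h) in combinations(enumerate(genes), 2)
    )
    parent = list(range(len(genes)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge joins two components
            parent[root_i] = root_j
            deaths.append(dist)
    return deaths


# Hypothetical toy data: four genes measured in four samples.
toy_profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [1.0, 3.0, 2.0, 4.0],
}
deaths = h0_death_times(toy_profiles)
```

    The resulting list of merge scales could then be summarized (for example, as sorted values or simple statistics) and passed to a classifier such as a Random Forest, which is the kind of downstream model the abstract mentions.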
  • Open Access
    Hypertension in African Populations: Review and Computational Insights
    (2021-04-06) Mabhida, Sihle E.; Mashatola, Lebohang; Kaur, Mandeep; Sharma, Jyoti R.; Apalata, Teke; Muhamed, Babu; Benjeddou, Mongi; Johnson, Rabia
    Hypertension (HTN) is a persistent public health problem affecting approximately 1.3 billion individuals globally. Treatment-resistant hypertension (TRH) is defined as high blood pressure (BP) in a hypertensive patient that remains above goal despite the use of ≥3 antihypertensive agents of different classes, including a diuretic. Despite the plethora of treatment options available, only 31.0% of individuals have their HTN controlled. Interindividual variability in drug response, arising from genetic polymorphisms, might explain this disappointing outcome. Additionally, poor knowledge of the pathophysiological mechanisms underlying hypertensive disease and the long-term interaction of antihypertensive drugs with blood pressure control mechanisms further aggravate the problem. Furthermore, in Africa, there is a paucity of pharmacogenomic data on the treatment of resistant hypertension. Therefore, the identification of genetic signals with the potential to predict the response to a drug for a given individual in an African population has been the subject of intensive investigation. In this review, we aim to systematically extract and discuss African evidence on genetic variation and pharmacogenomics in the treatment of HTN. Furthermore, in silico methods are used to elucidate biological processes that will aid in identifying novel drug targets for the treatment of resistant hypertension in an African population. To provide an expanded view of genetic variants associated with the development of HTN, this study was performed using publicly available databases such as PubMed, Scopus, Web of Science, African Journal Online, and PharmGKB, searching for relevant papers published between 1984 and 2020. A total of 2784 articles were reviewed, and only 42 studies met the inclusion criteria.
    Twenty studies reported associations between HTN and genes such as AGT (rs699), ACE (rs1799752), NOS3 (rs1799983), MTHFR (rs1801133), and AGTR1 (rs5186), while twenty-two studies did not show any association within the African population. Thereafter, an in silico predictive approach was used to identify several genes, including CLCNKB, CYPB11B2, SH2B2, STK9, and TBX5, which may act as potential drug targets because they are involved in pathways known to influence blood pressure. Next, co-expressed genes were identified, as they are controlled by the same transcriptional regulatory program and may potentially be more effective as multiple drug targets in the treatment regimens for HTN. Genes belonging to the co-expressed gene cluster (ACE, AGT, AGTR1, AGTR2, and NOS3, as well as CSK and ADRG1) showed enrichment of G-protein-coupled receptor activity; G-protein-coupled receptors are the classical targets of drug discovery and mediate cellular signaling processes. This enrichment is of importance, as the targeting of co-regulatory gene clusters will allow for the development of more effective HTN drug targets that could decrease the prevalence of both controlled and TRH.