Genomics of sickle cell disease and fetal hemoglobin in African populations
Thesis / Dissertation
2024
Publisher
University of Cape Town
Abstract
Background
More than 300,000 babies are born with sickle cell disease (SCD) each year. About 79% of these births occur in sub-Saharan Africa, where the sickle variant is known to have originated in the genetic background of the ancestors of Agriculturalist populations. Although the variant is highly lethal, the protection it confers against severe malaria in its heterozygous form has resulted in its persistence in sub-Saharan Africa, where malaria is endemic. Without intervention, 50–90% of affected children in many sub-Saharan African countries die before their fifth birthday. The search for a definitive cure or an effective disease-modifying therapy is therefore imperative. Fetal hemoglobin (HbF) has long been recognized to ameliorate SCD severity: patients harboring natural genetic variations that lead to the persistence of high HbF levels in their blood tend to live longer with fewer complications. The HbF quantitative trait is highly heritable (89%) and is therefore the focus of intensive research for therapeutic purposes. Genetic polymorphisms influencing HbF level have been identified in three major loci: BCL11A, HBS1L-MYB, and HBG2. However, these loci jointly explain less than 30% of HbF variability in African sickle cell anemia (SCA) patients, compared to ~50% in African Americans and non-anemic Europeans. Genome-wide association studies have replicated two of the three major loci (BCL11A and HBS1L-MYB) in African SCA patients and have identified new loci, including BACH2 in a meta-analysis of African Americans and Tanzanians; SLC28A3, TICRR, and PIEZO2 in Nigerians; and FRMPD4 in Tanzanians. While BCL11A and HBS1L-MYB have been replicated in Cameroonian SCD patients through candidate genotyping studies, there has yet to be a genome-wide investigation in this population.
Moreover, the long survival of SCA patients in Africa, despite higher disease severity and mortality in the region, hints at the enrichment of African genomes with 'protective' polymorphisms potentially impacting the level of HbF, the strongest modifier of SCD severity. A better understanding of the genetic architecture of HbF in African SCA patients is therefore needed to foster research into potentially novel SCD therapeutic targets.
Aims and Methods
This thesis project aimed to: 1) present an in-depth description of the evolutionary history of the sickle cell mutation and its implications for global genetic medicine through a synthesis of publicly available data. This included updating and summarising the global georeferenced databases of the sickle gene and the HBB gene cluster haplotypes; systematically and critically evaluating the age and place of origin of the sickle cell mutation by incorporating information about malaria, with which its evolution is closely intertwined; summarising bibliographic information on the genetic modifiers of SCD severity; and, importantly, examining other gene variants that co-occur with the sickle gene in sub-Saharan Africa and that might hint at possible co-evolution and an effect on SCD severity; 2) provide a comprehensive overview of SCD in sub-Saharan Africa, bringing out transferable strategies and recommendations for prevention and care, through bibliographic searches; and 3) investigate the missing heritability of HbF in SCA patients of African ancestry from Cameroon, Tanzania, and the USA using multiple imputation panels and a genome-wide association approach, with fine-mapping and functional analysis. Briefly, the effect of different reference haplotype panels on genotype imputation accuracy was assessed for African SCA populations from Cameroon and Tanzania.
Genome-wide association analyses were then performed for the two populations using all the imputation panels, followed by meta-analyses of the two populations, both alone and together with summary statistics from the USA-based cohorts. Statistical fine-mapping and extensive in silico functional analyses were next performed to determine the functional relevance of significant associations, while extensive haplotype structure analysis was performed to illuminate the reason for substantial heterogeneity in association signals across different populations of the same ancestry. Finally, gene-based and gene set enrichment analyses were undertaken to identify additional significant loci and significantly enriched pathways, while heritability analysis was performed to further interpret the observation of multiple significant signals and to better understand the genetic architecture of HbF in African SCA patients.
Results
Evolutionary history of sickle cell mutation
In this first part, we updated the global georeferenced databases of the sickle gene and the HBB gene cluster haplotypes. We showed that changes in population dynamics, driven by migration and gene flow, might be introducing some HBB haplotypes into regions where they were previously absent, reflecting changes in regional SCD severity. Through our systematic and critical evaluation of the origin of the sickle cell mutation, we identified limitations in the models that have been used to estimate the age of the mutation, and we determined that the mutation is likely older than the currently accepted estimate of 22,000 years. Using data on the emergence of malaria, we determined that the mutation most likely originated somewhere in West-Central Africa. Importantly, we showed that the distribution of the sickle gene overlaps with that of other gene variants that are under natural selection in sub-Saharan Africa.
Data suggest that some of these gene variants impact SCD severity, while others have known modifying effects on disease severity.
Overview of sickle cell disease in sub-Saharan Africa
We provided an overview of SCD in sub-Saharan Africa with transferable strategies for prevention and care as part of the Lancet Haematology's 2021 series on hematological care priorities in sub-Saharan Africa. We covered aspects of SCD such as epidemiology, burden, mortality and life expectancy, hematological parameters, diagnosis, pathophysiology, biological and genetic modifiers of severity, management, environmental determinants, and psychosocial effects. We also presented challenges and proposed recommendations for SCD research in sub-Saharan Africa.
Impact of reference haplotype panels on genotype imputation in African sickle cell anemia populations
To use the genomics of HbF to search for other, perhaps more effective, targets of gene editing for better management and perhaps cure of SCD, we first assessed the impact of imputing missing genotypes into genome-wide single nucleotide polymorphism (SNP) data using different reference haplotype panels. We used six different panels, including one that we created ourselves from whole-genome sequencing data of fifty Cameroonians. The key observations included: i) different imputation performance for different African populations; and ii) different imputation performance among different imputation panels within each African population, indicating that one variant can be imputed with vastly different accuracies across different panels, reflecting differences in haplotype structures across the panels resulting from different tagging schemes. This underscores the complementary use of the panels, with an expectation of different patterns of association (panel-specific significant signals).
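As a rough illustration of how per-variant imputation accuracy can be compared across reference panels, the sketch below computes the squared Pearson correlation between true genotypes and imputed dosages for one variant under two panels. This is an assumption made for illustration only: the abstract does not state which accuracy metric was used, and the function name, panel names, and all values here are hypothetical.

```python
from statistics import mean


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2 allele
    counts) and imputed dosages (0.0-2.0) for a single variant."""
    if len(true_genotypes) != len(imputed_dosages):
        raise ValueError("inputs must have the same length")
    mean_t = mean(true_genotypes)
    mean_d = mean(imputed_dosages)
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # No variation in one of the inputs: correlation is undefined.
        return float("nan")
    return cov * cov / (var_t * var_d)


# Hypothetical data: one variant, six samples, two made-up panels.
truth = [0, 1, 2, 1, 0, 2]
panels = {
    "panel_A": [0.1, 0.9, 1.8, 1.1, 0.0, 1.9],
    "panel_B": [0.6, 0.8, 1.2, 1.4, 0.5, 1.0],
}

r2_by_panel = {name: dosage_r2(truth, dosages)
               for name, dosages in panels.items()}
best_panel = max(r2_by_panel, key=r2_by_panel.get)
print(r2_by_panel, best_panel)
```

With these toy values the same variant is imputed far more accurately by one panel than by the other, which is the kind of panel-to-panel difference described above.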
Genome-wide association analysis, statistical and functional fine mapping
As expected from our assessment of imputation performance, we observed multiple panel-specific significant signals. We replicated the major known loci, including BCL11A and HBS1L-MYB, and uncovered fourteen novel loci. The most significant of the novel loci, FLT1, which was observed in Cameroonians and replicated in a meta-analysis with Tanzanians, has a known role in hematopoiesis. Fine-mapping and in silico functional analyses suggest an important role for this locus in HbF induction.
Gene-based, gene set enrichment, and heritability analyses
Gene-based analysis confirmed significant signals in the three major HbF-influencing loci: BCL11A, HBS1L-MYB, and HBG2. Gene set enrichment analysis revealed an overwhelming enrichment of hematopoiesis-related pathways, with hemoglobin as the major enriched biological component and blood traits as the most enriched phenotypes. Consistent with these findings, we estimated HbF heritability in a joint cohort of Cameroonian and Tanzanian SCA patients at 94%, suggesting an enrichment of these populations with HbF-influencing loci in a way that has probably been underappreciated and that might be better revealed through whole-genome sequencing.
Conclusion and perspectives
Our study presented data with overwhelming support for a single African origin of the sickle cell mutation, with opportunity for further research on determining the true age of the mutation. Tracking the movement of the mutation through the distribution of the HBB gene cluster haplotypes highlighted changing population dynamics that are important for public health. Calling attention to gene variants whose distribution in sub-Saharan Africa overlaps with that of the sickle gene is important because the co-occurrence could mean co-evolution, which might suggest an impact on SCD severity.
Therefore, further evolutionary genetic studies are warranted to understand the interactions between these gene variants and the sickle gene. We also showed that progress in implementing newborn screening and comprehensive care in some sub-Saharan African countries has been encouraging. Early diagnosis and family education on management can reduce morbidity, while immunisation and hydroxyurea therapy initiated in affected children as young as 9 months old can greatly improve quality and quantity of life. Research in sub-Saharan Africa is urgently needed to establish the exact prevalence, mortality, and morbidity of SCD, the environmental and genetic factors affecting clinical complications, and the life expectancy of patients, to facilitate the design of future risk models and to investigate novel routes for therapeutic options, with the ultimate aim of improving the clinical outcomes of patients with SCD in all parts of the world. Finally, our discovery of new genes associated with blood levels of HbF, which is the strongest modifier of SCD severity and the target of gene editing, means that our work has tremendous potential not only for the discovery of novel SCD therapeutic targets but also for improving our understanding of the genetic architecture of HbF in understudied African populations.
Reference:
Esoh, K.K. 2024. Genomics of sickle cell disease and fetal hemoglobin in African populations. University of Cape Town, Faculty of Health Sciences, Department of Pathology. http://hdl.handle.net/11427/40901