Browsing by Author "Chimusa, Emile"
- Item (Open Access): A Genome-wide Association Study of Schizophrenia in the South African Xhosa and Generalizability of Polygenic Risk Score across African populations (2021). Majara, Lerato Charlotte; Ramesar, Raj; Chimusa, Emile
African populations are vastly underrepresented in genetic studies despite having the most genetic variation globally and facing wide-ranging environmental exposures. Most of these studies have been conducted in populations of European (EUR) ancestry using GWAS arrays that represent the genetic variation in these populations. Thus, polygenic risk scores (PRS) derived from EUR ancestry populations predict less accurately in populations of non-European ancestry, and least accurately in African (AFR) ancestry populations. The extent to which PRS prediction accuracy varies within AFR ancestry populations has not, however, been previously investigated. This study had two aims: the first was to investigate the contribution of common variants to the risk of schizophrenia in the South African Xhosa (SAX) population through genome-wide association study (GWAS) analysis, and to determine if PRS derived from EUR and East Asian (EAS) ancestry populations from the Psychiatric Genomics Consortium (PGC) Schizophrenia Working Group were generalizable to SAX. The second aim was to assess the generalizability of PRS for non-psychiatric phenotypes that were derived from EUR ancestry individuals from the UK Biobank (UKB, n = ~350,000) in the Uganda General Population Cohort (GPC, n = 4,778) and the South African Drakenstein Child Health Study (DCHS, n = 638). To address the first aim, a GWAS was conducted in 2,086 Xhosa individuals from South Africa with and without schizophrenia (cases: n = 1,038; controls: n = 1,048) using a custom-designed Affymetrix GWAS array designed to capture variation in the Xhosa population.
The schizophrenia GWAS in SAX yielded one SNP (rs35172303; P = 4.74e-08, OR = 0.6004, 95% CI: [0.499, 0.721]) in ZFP3 that met genome-wide significance. The association of variants in ZFP3 from the schizophrenia GWAS is consistent with those from an earlier exome sequencing study in SAX undertaken by colleagues, but this gene has not previously been associated with schizophrenia in large-scale schizophrenia GWAS of predominantly EUR ancestry. After characterizing the genetic architecture of schizophrenia in SAX, it was found that the heritability was enriched across functional categories involved in the regulation of gene expression. Then, the accuracy of PRS derived from the PGC Schizophrenia Working Group from both EUR and EAS ancestries in predicting schizophrenia in SAX was quantified. There was low PRS prediction accuracy using PGC-derived summary statistics in SAX (PGC-EUR: max R² = 0.0057, P = 0.008; PGC-EAS: max R² = 0.0059, P = 0.007). These findings are consistent with previous findings that showed that PRS prediction accuracy is low when discovery and target cohorts come from different ancestral backgrounds. For the second aim, PRS prediction accuracy was quantified in simulations using data from the African Genome Variation Project (AGVP) to represent continental AFR diversity. Samples were categorised by geographical region into West, East and South Africa cohorts. Each cohort was divided into discovery and target datasets. The West and East African discovery data were used to predict the simulated phenotype in the three target cohorts. Using UKB EUR ancestry individuals, PRS prediction accuracy was assessed for 34 anthropometric and blood panel traits in the Uganda GPC. UKB was then meta-analysed with PAGE (Population Architecture using Genomics and Epidemiology, comprising about 50,000 Latino/Hispanic and African-American individuals) and BBJ (Biobank Japan, n = ~162,000) to assess how the inclusion of diverse samples impacts PRS prediction accuracy.
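As a rough consistency check on the odds ratio reported above for rs35172303, the standard error of the log odds ratio can be recovered from the 95% confidence interval and turned into a Wald z statistic and two-sided p-value. This is a minimal sketch under a normal approximation, not the analysis used in the thesis; the function name `wald_from_ci` is hypothetical, and small differences from the reported P value are expected because the published OR and interval are rounded and the original test may differ.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), the Wald z statistic and a two-sided p-value
    from an odds ratio and its 95% confidence interval."""
    log_or = math.log(odds_ratio)
    # A symmetric 95% CI on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the abstract for rs35172303.
se, z, p = wald_from_ci(0.6004, 0.499, 0.721)
print(f"SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

Under these assumptions the recovered p-value is of the same order of magnitude as the reported one.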
Simulations were limited by sample size but showed that PRS prediction accuracy was highest when the discovery and target cohorts were matched by African region, and for phenotypes with the sparsest genetic architecture. Using empirical data from UKB and the Uganda GPC, a low prediction accuracy was observed across all 34 quantitative traits in GPC when using GWAS data from UKB. There was differential prediction accuracy across AFR ancestry groups within UKB, i.e. the prediction accuracy was highest for the Ethiopian and admixed populations, and lowest for southern African populations. When comparing PRS prediction accuracy of East African individuals from the UKB to that of individuals from GPC, the prediction accuracy was lowest in the Ugandan GPC population, indicating that the difference in environments between the two groups may be contributing to the difference in PRS accuracy. Moreover, the cross-ancestry meta-analyses showed that the inclusion of diverse samples in large-scale studies improves PRS prediction accuracy, especially for phenotypes with population-enriched variants. It was demonstrated for the first time in this thesis that EUR ancestry-derived PRS prediction accuracy varied within continental AFR ancestry groups, and tracks with population history and the evolution of humans. The higher prediction accuracy observed in Ethiopians can be explained by their genetic proximity to Europeans as a result of the back-to-Africa migration, whereas the southern African populations (including SAX) are more proximal to the ancestral populations that never left the continent. It is therefore imperative to not only include more African samples in future large-scale studies, but to have samples that adequately represent the genetic and environmental diversity on the African continent.
- Item (Open Access): Genetic diversity and population structure within Botswana: association with HIV-1 infection (2021). Thami, Prisca Kerapetse; Chimusa, Emile; Gaseitsiwe, Simani; Novitsky, Vlad; Leteane, Melvin
Southern Africa is disproportionately affected by HIV-1, with Botswana being among the most affected countries. The interindividual heterogeneity in susceptibility or resistance to HIV-1 and progression upon infection is attributable to, among other factors, host genetic variation. Characterisation of human genetic variations can contribute towards understanding the genetic aetiology of HIV-1 and foster development of novel preventive and treatment strategies against HIV-1. Despite the high burden of HIV-1 in Botswana, the population of Botswana is significantly underrepresented in genomics studies of HIV-1. Furthermore, the bulk of previous genomics studies evaluated common human genetic variations; however, there is increasing evidence of the influence of rare variants in the outcome of diseases, which may be uncovered by comprehensive complete and deep genome sequencing. This research aimed to characterise human genomic variations of Batswana in order to elucidate mutation burden, assess population structure and evaluate the role of these genomic variations in susceptibility to HIV-1 and progression through bioinformatics analyses. Whole genome sequences (WGS) of 265 HIV-1 positive and 125 HIV-1 negative unrelated individuals from Botswana were computationally analysed. The sequences were mapped to the human reference genome GRCh38. Population joint variant calling was performed using Genome Analysis Tool Kit (GATK) and BCFTools. Variant characterisation was achieved by annotating the variants with a suite of databases in ANNOVAR. The genomic architecture of Botswana was assessed through principal component analysis, structure analysis and FST.
Cumulative effects of rare variant sets on susceptibility to HIV-1 and progression (CD4+ T-cell decline) were determined with optimized Sequence Kernel Association Test (SKAT-O). Functional analysis of the prioritized variants was performed through gene-set enrichment using databases in GeneMANIA and Enrichr. Variant characterization revealed 24 damaging variants with the most damaging variants being ACTRT2 rs3795263, HOXD12 rs200302685, ABCB5 rs111647033, ATP8B4 rs77004004 and ABCC12 rs113496237. There was admixture of Khoe-San, Niger-Congo and European ancestries observed in the population of Botswana; however, there was no evidence of overall substructure among the HIV-1 positive/negative individuals of Botswana, indicating similar genetic exposure among HIV-1 samples. No variant set was significantly associated with susceptibility to HIV-1, while sets of novel rare variants within the ANKRD39 (8.48 × 10⁻⁸), LOC105378523 (7.45 × 10⁻⁷) and GTF3C3 (1.36 × 10⁻⁶) genes were significantly associated with HIV-1 progression. Functional analysis revealed that the variants affected several pathways including chemokine signalling, glycolysis, glycosylation, HIV-1 and host receptor glycoprotein biosynthesis, intracellular transport of molecules and transcription pathways. These findings highlight the significance of whole genome sequencing in pinpointing rare variants of clinical relevance. This PhD thesis unravelled novel genes and novel rare variants that are putatively linked to HIV-1 progression. The thesis contributes towards a deeper understanding of the host genetics of HIV-1 and offers promise of population-specific interventions against HIV-1.
- Item (Open Access): Identifying genetic variants and pathways associated with extreme levels of fetal hemoglobin in sickle cell disease in Tanzania (2020-06-05). Nkya, Siana; Mwita, Liberata; Mgaya, Josephine; Kumburu, Happiness; van Zwetselaar, Marco; Menzel, Stephan; Mazandu, Gaston K; Sangeda, Raphael; Chimusa, Emile; Makani, Julie
Background: Sickle cell disease (SCD) is a blood disorder caused by a point mutation on the beta globin gene resulting in the synthesis of abnormal hemoglobin. Fetal hemoglobin (HbF) reduces disease severity, but the levels vary from one individual to another. Most research has focused on common genetic variants which differ across populations and hence do not fully account for HbF variation. Methods: We investigated rare and common genetic variants that influence HbF levels in 14 SCD patients to elucidate variants and pathways in SCD patients with extreme HbF levels (≥7.7% for high HbF and ≤2.5% for low HbF) in Tanzania. We performed targeted next generation sequencing (Illumina MiSeq) covering exonic and other significant fetal hemoglobin-associated loci, including BCL11A, MYB, HOXA9, HBB, HBG1, HBG2, CHD4, KLF1, MBD3, ZBTB7A and PGLYRP1. Results: The analysis revealed a range of genetic variants, including bi-allelic and multi-allelic SNPs, frameshift insertions and deletions, some of which have functional importance. Notably, there were significantly more deletions in individuals with high HbF levels (11% vs 0.9%). We identified frameshift deletions in individuals with high HbF levels and frameshift insertions in individuals with low HbF. CHD4 and MBD3 genes, interacting in the same sub-network, were identified to have a significant number of pathogenic or non-synonymous mutations in individuals with low HbF levels, suggesting an important role of epigenetic pathways in the regulation of HbF synthesis.
Conclusions: This study provides new insights into selecting essential variants and identifying potential biological pathways associated with extreme HbF levels in SCD by interrogating multiple genomic variants associated with HbF in SCD.
- Item (Open Access): Pharmacogenomics of sickle cell disease therapeutics: pain and drug metabolism associated gene variants and hydroxyurea-induced post-transcriptional expression of miRNAs (2020). Mnika, Khuthala; Wonkam, Ambroise; Dandara, Collet; Mazandu, Gaston; Mowla, Shaheen; Chimusa, Emile
Sickle cell disease (SCD) is a common blood disease caused by a single nucleotide substitution (c.20T>A, p.Glu6Val) in the beta globin gene on chromosome 11. The prevalence of the disease is high throughout large areas in sub-Saharan Africa, the Mediterranean basin, the Middle East, and India due to the level of protection that the sickle cell trait provides against severe malaria. Approximately 300,000 infants are born per year with sickle cell anemia, which is defined as homozygosity for the sickle hemoglobin (HbS). The majority (nearly 75%) of these births occur in sub-Saharan Africa, particularly in two countries, Nigeria and the Democratic Republic of the Congo, where healthcare systems are poorly resourced. Early diagnosis, penicillin prophylaxis, blood transfusions, hydroxyurea, and hematopoietic stem-cell transplantation can dramatically improve survival and quality of life for patients with SCD. However, our understanding of the role of genetic and clinical factors in explaining the complex phenotypic diversity of this disease is still limited. Early prediction of the severity of SCD, and of patients' responses to specific therapeutics, could lead to more precise treatment and management. Beyond well-known modifiers of disease severity, such as fetal hemoglobin (HbF) levels and α-thalassemia, other genetic variants might influence specific sub-phenotypes. New treatments and management strategies accounting for these genetic and non-genetic factors could substantially and rapidly improve the quality of life and reduce health care costs for patients with SCD.
Patients with SCD are subjected to long-term administration of drugs, and there are limited data on the pharmacogenomics of SCD therapeutics. Vaso-occlusive crises (VOC) are the main clinical events of SCD and are associated with recurrent and long-term use of antalgics/opioids and HU. This project aimed to investigate the clinical and genetic predictors of painful vaso-occlusive crisis (VOC) among Cameroonian SCD patients by exploring pharmacokinetic determinants of treatment responses as well as post-transcriptional signatures triggered by hydroxyurea treatment, particularly miRNA expression. SCD patients were recruited from Yaounde Central Hospital and Laquintinie Hospital in Douala (Wonkam et al., 2018; Mnika et al., 2019 (b)), and recent migrant SCD patients from the DRC were recruited at the Haematology Clinic, Groote Schuur Hospital in Cape Town, South Africa (Mnika et al., 2019 (a) and Mnika et al., 2019 (b)). Sociodemographic and clinical data were collected by means of a structured questionnaire. Patients' medical records were reviewed to extract their clinical features over the past 3 years. Specifically, the occurrences of VOC, hematological parameters, hospital outpatient visits, hospitalisation, overt strokes, blood transfusions, and administration of hydroxyurea were recorded. Height, weight, body mass index (BMI), systolic and diastolic blood pressures (SBP and DBP) were measured. Detailed descriptions of patients and sampling methods used in the Cameroonian patients have been reported previously (Wonkam et al., 2018; Mnika et al., 2019 (a) and Mnika et al., 2019 (b)). For the purpose of comparing frequencies of variants, ethnically matched Cameroonian controls were randomly recruited from apparently healthy blood donors in Yaounde for participation in the study. All blood samples were collected for genomic characterisation and analysis.
DNA was extracted from peripheral blood, following instructions on the available commercial kit [QIAamp DNA Blood Maxi Kit ® (Qiagen, United States)]. Genotyping (TaqMan and MassArray) was performed for 40 variants in 17 pain-related genes, three fetal haemoglobin (HbF)-promoting loci, two kidney dysfunction-related genes, and HBA1/HBA2 genes for 436 patients. A subset of these samples was also genotyped to analyse 32 core and 267 extended pharmacogenes using the commercially available PharmacoScan® platform for characterisation of pharmacokinetic determinants of response. We also compared the pharmacogene variants from these African groups to data extracted from the 1000 Genomes Project. Moreover, association studies were carried out on pharmacogene variants with SCD clinical variability. Additionally, protein-protein interaction (PPI) network and enriched biological processes and pathways were investigated. For association studies, statistical models using regression frameworks to analyse 40 variants were performed in R®. For miRNA expression, total RNA was isolated using the miRNeasy kit according to the manufacturer's protocol (QIAGEN, Hilden, Germany), and sequenced by the Genomic and RNA Profiling Core at Baylor College of Medicine, United States, using the NanoString Platform (NanoString Technologies, Inc., Seattle, WA, United States), according to the manufacturer's instructions. Genes with statistically significant changes in expression were analysed using the significance analyses of microarrays (SAM) tools. Female sex, body mass index, Hb/HbF, blood transfusions, leucocytosis and consultation or hospitalisation rates significantly correlated with VOC. Three pain-related gene variants correlated with VOC (CACNA2D3-rs6777055, P = 0·025; DRD2-rs4274224, P = 0·037; KCNS1-rs734784, P = 0·01).
Five pain-related gene variants correlated with hospitalization/consultation rates (COMT-rs6269, P = 0·027; FAAH-rs4141964, P = 0·003; OPRM1-rs1799971, P = 0·031; ADRB2-rs1042713, P < 0·001; UGT2B7-rs7438135, P = 0·037). The 3·7 kb HBA1/HBA2 deletion correlated with increased VOC (P = 0·002). HbF-promoting loci variants correlated with decreased hospitalisation (BCL11A-rs4671393, P = 0·026; HBS1L-MYB-rs28384513, P = 0·01). APOL1 G1/G2 correlated with increased hospitalisation (P = 0·048). A commercial genotyping array platform (PharmacoScan®) with 4627 markers located in 1191 genes was used to investigate 299 pharmacogenes (32 ADME core and 267 extended pharmacogenes). Based on the PharmacoScan analyses, no statistically significant differences in allele frequencies were detected between SCD cases and controls from Cameroon. A principal component analysis (PCA) revealed that the Cameroonian data clustered with those of other Africans, but this population is significantly distinct from American, European and Asian populations. Variant allele frequencies in 21/32 core pharmacogenes were significantly different between the two SCD groups (Cameroon vs. Congo). No correlation between clinical variability and variants in the core genes was detected for either population under study. An association study of the core and extended PharmacoScan variants with VOC identified statistically significant associations between two single nucleotide polymorphisms (SNPs) and VOC after correction for multiple testing. These two SNPs mapped to 50 genes, with two SNPs located in core pharmacogenes (SLCO4A1-rs118042746, p = 1.21e-07; UGT1A10, UGT1A8-rs10176426, p = 1.22e-07). Functional enrichment analyses revealed that these 50 genes are involved in three biological processes and four pathways relevant to SCD pathophysiology, including xenobiotic glucuronidation (GO:0052697, p = 2.3e-03), and drug metabolism - other enzymes (p = 2.1e-02).
Further analyses of the 50 genes identified key genes in human protein-protein networks: NTSR1, LRMDA, SMAD4 and CDH2. These four genes also interacted with three core pharmacogenes associated with VOC: UGT1A8, UGT1A10 and SLCO4A1. We found 22/798 miRNAs to be differentially expressed under HU treatment, with the majority (13/22) being functionally associated with HbF-regulatory genes, including BCL11A (miR-148b-3p, miR-32-5p, miR-340-5p, miR-29c-3p), MYB (miR-105-5p), KLF-3 (miR-106b-5), and SP1 (miR-29b-3p, miR-625-5p, miR-324-5p, miR-125a-5p, miR-99b-5p, miR-374b-5p, miR-145-5p). The present thesis started by highlighting the scarcity of studies investigating variable responses to pain in SCD patients and then proceeded to address this research gap. To our knowledge this is the first body of work from Africa to provide evidence supporting the possible development of a genetic risk model for pain in SCD. This is also the first body of work to report an association between these two SNPs and VOC in core and extended pharmacogenes. Our data reveal that the commercial pharmacogene arrays investigated might need additional evidence for appropriateness among Africans. The thesis therefore advocates the need to invest in research exploring population-specific arrays, drug design, targeting, and efficacy, for improved clinical management of patients of African descent. Previous studies have investigated various mechanisms to understand the genomic variations affecting responses to HU, but full understanding of the variable HU-mediated HbF production among individuals affected by SCD remains elusive. The present study showed that mechanisms of HbF production in response to HU could be mediated in particular through miRNA regulation. The data reveal some alternative perspectives and routes towards identifying new therapeutic targets and approaches for SCD. However, this study needs to be replicated in larger samples in multiple African populations.
- Item (Open Access): Whole exome sequencing to investigate genetic variants of non-syndromic hearing impairment in a population of African ancestry (2018). Manyisa, Noluthando; Wonkam, Ambroise; Dandara, Collet; Chimusa, Emile
Introduction: Hearing impairment occurs when a child has hearing loss greater than 30dB in their better hearing ear and an adult cannot detect sound lower than 40dB in the better hearing ear. It is a common sensory disorder affecting approximately 360 million people worldwide, with an incidence of 6 in 1000 live births in developing countries such as those in Sub-Saharan Africa. In developed countries, 50% of hearing impairment is due to genetic factors, with 70% of genetic hearing impairment being classified as non-syndromic hearing impairment, which occurs when the hearing impairment presents with no other clinical manifestations. Hearing impairment is associated with over 150 genes, of which two connexin genes, GJB2 and GJB6, are the most prevalent genes associated with hearing impairment in European and Asian populations and in North American populations of European ancestry. These genes have, however, been shown to be insignificant causes of hearing impairment in African populations. Aim: The aim of this study is to determine the rates of putative pathogenic variants in 172 hearing impairment associated genes, among Cameroonian patients affected by hearing impairment, and non-hearing-impaired controls. Methods: Patients and controls: Patients were recruited from various schools of the Deaf and Ear, Nose and Throat (ENT) clinics in Cameroon. The patients were examined by qualified medical geneticists and ophthalmologist, and detailed family history and medical history were obtained from the patients and their parents. 19 patients, who were negative for GJB2 and GJB6 mutations and presented with putative non-syndromic hearing impairment, were selected from a cohort of 582 patients for the present study.
The control population consisted of 130 ethnically matched groups without any personal or familial history of hearing impairment. The controls were recruited from Yaoundé Central Hospital and Laquintinie Hospital in Cameroon. Whole exome sequencing: DNA was extracted from whole blood using the salting out procedure and the Puregene Blood kit®. The DNA was subjected to spectrometry and gel electrophoresis to determine the quantity and quality of the DNA samples. The samples were then subjected to whole exome sequencing on the Illumina platform using the Nextera Rapid Capture Exome Kit at an average read depth of 30X, whereby only 18 patients were successfully sequenced. The exomes were then subjected to FastQC and SolexaQC++ for quality control measures and aligned to the hg19 reference genome using GATK and VariantMetaCaller. Bioinformatics analysis: Variant annotation was performed using ANNOVAR and the annotated variants were filtered based on rarity and pathogenicity. Tests for genetic differentiation and principal component analysis were performed on the combined patient exomes and combined control exomes. The first principal component analysis included data from African populations from the 1000 Genomes Phase 3 as well as six control samples from the Democratic Republic of Congo; and the second principal component analysis analysed only the Cameroonian patients and control population. Population structure analysis was followed by protein-protein interaction analysis using custom Python and R scripts and pathway enrichment analysis using Enrichr combined with a second custom R script. The proportion of derived and ancestral alleles was computed by downloading the SNP ancestral alleles from Ensembl and verifying the presence of the SNPs in the dbSNP database. The combined patient and control exomes were annotated using the VCFtools “fill-aa” script.
The proportion of ancestral alleles was computed by dividing the number of times the alternative allele matched the ancestral allele by the number of copies of all the alternative alleles across all samples at the particular position. The ancestral alleles were categorised into six bins, based on their minor allele frequency, in the patient and control populations and this was used to contrast their proportions of derived and ancestral alleles. Furthermore, the proportion of ancestral and derived alleles in hearing impairment associated genes was computed at SNP-based level for the Cameroonian population and contrasted with the population from the Democratic Republic of Congo. Variant validation by Sanger sequencing: Primers were designed to amplify the fragment surrounding the purported SNPs in MYO15A, MYO3A, and COL9A3 as well as the fragments surrounding the population-specific SNPs in VTN, RPL3L and DHRS4L2. Polymerase chain reaction was performed for the MYO15A and MYO3A fragments. This was followed by purification of the PCR products and direct cycle Sanger sequencing of the PCR products. The sequencing products were then purified through ethanol precipitation and the fragments were suspended in HiDi Formamide and run on the capillary electrophoresis. The variants in MYO3A, MYO15A and COL9A3 were also viewed in Integrative Genomics Viewer (IGV) using the BAM files. Results: Putative deleterious variants: Single nucleotide polymorphisms (SNPs) in MYO3A, MYO15A and COL9A3 were prioritised by filtering as putative causative mutations for three, four and two patients respectively. Direct Sanger sequencing and viewing the patients' BAM files did not confirm the presence of any of these putative pathogenic variants in the patients. Variations in USH2A, HSD17B4 and MYO1A were also prioritised by filtering, but these variants were not considered disease-causing after careful genotype-to-phenotype correlation.
Population genetic variant differentiation: At a population level, specific variations were identified in FOXD4L2, DHRS2L6, RPL3L and VTN. Significant genetic differentiation was shown to exist between the control population and the patient population with regard to specific variants in VTN and RPL3L; furthermore, it was shown that these variants in VTN and RPL3L interact with other hearing impairment associated proteins, with evidence that VTN is a hub protein for a hearing impairment associated pathway along with nine other genes. Conversely, this was not the case for variants described in FOXD4L2 and DHRS2L6. In known hearing impairment genes, the proportion of ancestral alleles was lowest for the patient population for variations with minor allele frequencies between 0.0 and 0.1. The proportion of derived and ancestral alleles was also shown to differ between the Cameroonian population and the population from the Democratic Republic of Congo, indicating possible regional differences in the aetiology of hearing impairment amongst multiple African populations. Discussion: Low rate of putative pathogenic variants in known hearing impairment genes among Africans: The low pick-up rate for putative pathogenic variants in our patients follows a similar trend observed in African American populations with hearing impairment, as well as data from targeted exome sequencing from South African and Nigerian populations. This result is also in agreement with other studies that interrogated hearing impairment in African populations utilising other means besides next generation sequencing. This result also highlights the importance of validating any results obtained from next generation sequencing through traditional approaches such as Sanger sequencing or viewing the BAM files in IGV, particularly in African populations, which are poorly represented in exome databases.
Bioinformatics analysis exhibited some specific variants among Cameroonians: Protein-protein interactions and enrichment analysis indicated that VTN and RPL3L, and their interacting proteins, are significantly associated with osteoclast differentiation, which is associated with hearing impairment in osteogenesis imperfecta. VTN was further shown to be a hub protein of a protein subnetwork, along with ATPB2. The presence of a second protein acting as a hub protein may account for why aberrations in VTN have not been associated with a disease, in that ATPB2 may ameliorate the pathogenic phenotype that ought to be observed in the presence of null mutations in VTN. Evolutionary adaptation of human hearing: The data indicate that the patient population carried a higher proportion of derived alleles in known hearing impairment genes at low minor allele frequencies, possibly indicating the interactive modifier capacities of multiple hearing impairment genes or, alternatively, the polygenic nature of hearing impairment in some patients. The proportion of ancestral and derived alleles was contrasted in the Cameroonian population and the population from the Democratic Republic of Congo, and it indicated that the variations that may result in hearing impairment in the one population may not be the same variations that result in hearing impairment in the other population. Because of this, it is necessary to determine the causative variants resulting in disease in each of these populations independently. Conclusion and perspectives: The results support a low pick-up rate of putative variants in 172 known genes in groups of Cameroonian patients with HI, underscoring that current targeted panel sequencing for HI may not be relevant for some African populations.
The results also support the need for confirmation of variants found in WES, as well as careful genotype-to-phenotype correlation, particularly among Africans, whose exome sequences are poorly represented in exome databases, which could lead to more false positive results. Population genetic analysis has provided novel insight into the genetic architecture of HI among this group of Africans; in particular, the differential frequencies of ancestral vs derived alleles in HI genes among patients vs controls underline the possibility of a multigenic influence on the phenotype of hearing impairment that has not been well investigated, and may also signal evolutionary enrichment of some variants of HI genes in the populations as the result of natural selection, which deserves further investigation. The results support the need for intensive familial studies in multiple African populations in order to unravel the novel genes and those variants that are relevant in clinical practice for people of African ancestry.
- Item (Open Access): Whole genome sequencing approach to identifying genetic risk factors underlying anterior cruciate ligament injuries in a twin family study (2022). Feldmann, Daneil; September, Alison V; Collins, Malcolm; Chimusa, Emile
Background: Predisposition to ACL rupture is multifactorial, resulting from a complex interplay of intrinsic and extrinsic risk factors. Variation in the genome is now considered a key intrinsic risk factor, but the majority of currently implicated loci have been identified through case-control genetic association studies, which are limited by a candidate gene approach and insufficient statistical power. The primary aim of this thesis was to use a whole genome sequencing (WGS) approach within the context of a twin family study to identify novel or previously implicated genetic loci contributing to ACL rupture predisposition (Chapter 2). Additionally, this research aimed to explore prioritised genetic polymorphisms previously associated with ACL rupture and functioning in key biological pathways implicated through the WGS analyses, independently and as a collective, with ACL rupture predisposition in a large combined ACL rupture dataset (Chapter 3 and 4). Methods: The complete genomes of all family members in two unrelated families, each with affected twins, were sequenced. Variants with potential loss of function effect were prioritised, and explored for probable biological function in the ACL rupture risk pathway. Furthermore, identity by descent analysis (IBD) was performed to identify potential disease-causing mutations on chromosomal regions shared between family members, and across families. Enriched biological pathway analyses were further explored to prioritise potential candidate genes. Two biological networks were prioritised which highlighted the angiogenesis and proteoglycan family of proteins.
Specific polymorphisms within previously investigated candidate genes were further explored in case-control genetic association studies conducted in a large collective data set, including participants from three independent cohorts (Sweden, Poland and Australia), combined with previously published South African and Polish data. The anterior cruciate ligament (ACL) rupture group included individuals with a clinical diagnosis of ACL rupture based on physical examination and confirmed by either magnetic resonance imaging or arthroscopy. Only ACL ruptures resulting from a non-contact mechanism of injury were included. The control group comprised individuals of similar age to cases, with no prior history of ACL injury or other ligament and tendon injuries, who participated in regular sporting activity similar to that of cases. Participant samples were genotyped for single nucleotide polymorphisms in the VEGFA (rs699947 C/A, rs1570360 G/A, rs2010963 G/C) and KDR (rs2071559 A/G, rs1870377 T/A) genes (Sweden CON: 116 ACL: 95; Poland CON: 149 ACL: 127 and Australia CON: 83 ACL: 342). Additionally, samples were genotyped for polymorphisms in the ACAN (rs2351491 C/T, rs1042631 T/C, rs1516797 T/G), DCN (rs516115 T/C) and BGN (rs1126499 C/T, rs1042103 G/A) genes (Sweden and Poland). Haplotype analyses were explored (VEGFA, KDR, ACAN and BGN) using the individual genotype data. In addition, inferred allele interactions were presented for VEGFA-KDR, ACAN-BGN, ACAN-DCN, BGN-DCN, and VEGFA-DCN as a proxy for gene-gene interactions within the discrete angiogenesis and proteoglycan gene families, and between genes as a proxy for pathway interactions. For association studies, frequencies were calculated for the genotypes, alleles, inferred haplotypes and allele interactions, and the distributions were compared between the control and ACL rupture participants. All analyses were performed using statistical programs in R, and a p value < 0.05 was considered significant.
Results: The WGS analyses highlighted six candidate genetic loci in three genes (COL12A1, CATSPER2, and KCNJ12) with predicted loss-of-function effects in all affected and unaffected family members within the two studied families. Of the three genes, polymorphisms within COL12A1 were previously associated with ACL rupture predisposition, while CATSPER2 and KCNJ12 are two novel genetic loci with no known previous association with predisposition to ACL rupture. The IBD analyses identified several regions shared in each independent family, of which a segment including the long intergenic non-protein coding RNA (lincRNA) gene LINC01250 in the telomeric region of chromosome 2p25.3 was shared between the affected twins in both families and an affected brother. Furthermore, several functional partners were highlighted. Genetic association analyses of the prioritised polymorphisms in a combined cohort identified an independent association of the VEGFA rs2010963 CC genotype and C allele with increased risk (genotype p = 0.0001, FDR p = 0.001, OR 2.16, 95% CI: 1.47-3.19; allele p = 0.0006, FDR p = 0.003, OR 1.29, 95% CI: 1.11-1.49). Furthermore, the VEGFA A-A-G and A-G-G inferred haplotypes (rs699947 A/C-rs1570360 G/A-rs2010963 G/C) were associated with reduced risk of ACL rupture (A-A-G: p = 0.010, haplo.score: -2.58, OR: 0.85, 95% CI: 0.69-1.05; A-G-G: p = 0.036, haplo.score: -2.09, OR: 0.81, 95% CI: 0.64-1.02). Moreover, a reduced interval (rs1570360 G/A-rs2010963 G/C) revealed an association of the VEGFA -G-G and -A-G inferred haplotypes with reduced risk (-G-G: p = 0.031, haplo.score: -2.15, OR: 1.00 and -A-G: p = 0.024, haplo.score: -2.25, OR: 0.98, 95% CI: 0.82-1.18) and of the -G-C inferred haplotype with increased risk (p = 0.012, haplo.score: 2.50, OR: 1.18, 95% CI: 0.99-1.40). The KDR genotype and haplotype analyses illustrated that it is highly unlikely that the investigated KDR polymorphisms are associated with modulating ACL rupture risk.
Inferred allele interaction analyses showed a significant association of the VEGFA (rs699947 A/C, rs2010963 G/C) - KDR (rs2071559 A/G, rs1870377 T/A) A-G-A-A (p = 0.005, OR: 0.51, 95% CI: 0.30-0.87) and A-G-G-A (p = 0.018, OR: 0.93, 95% CI: 0.54-1.60) combinations with reduced ACL rupture risk. Further, a significant association of the VEGFA (rs699947 C/A, rs1570360 G/A, rs2010963 G/C) - DCN (rs516115 T/C) A-G-G-T (p = 0.010, OR: 0.53, 95% CI: 0.30-0.91), A-A-G-C (p = 0.010, OR: 0.42, 95% CI: 0.21-0.81) and A-A-G-T (p = 0.046, OR: 0.77, CI: 0.49-1.2) allele combinations with reduced risk was noted for male participants in the collective cohort. No independent or haplotype associations with ACL rupture risk were noted for any of the investigated proteoglycan polymorphisms in the collective cohort. Conclusion: Collectively, this work has expanded current knowledge of the genetic regions contributing to ACL rupture predisposition, and further highlights the polygenic nature of multifactorial phenotypes. By employing whole genome sequencing in a twin family context, together with a pathway-based approach, novel and previously implicated genetic loci were identified, in line with the aims of the thesis. A catalogue of candidate in silico mutations and modifier genes that clustered in pathophysiological pathways important in ACL rupture, with implications for therapeutic intervention, was identified and needs to be interrogated. Of particular interest are the novel CATSPER2, KCNJ12 and LINC01250 genetic loci. Furthermore, additional evidence to support the implication of the VEGFA gene in modulating ACL rupture risk is provided, and the potential collaboration of members within the angiogenesis and proteoglycan gene families in modulating risk is highlighted. The studies in Chapters 3 and 4 suggest that genetic association studies in single populations are less informative, and that larger collective cohorts with increased statistical power should be employed instead.
Further to that, rather than investigating single polymorphisms, larger regions of the genome should be explored to determine the potential interacting components contributing to musculoskeletal injury risk. Going forward, characterisation of the functional biological effect of implicated loci may assist in unravelling the underlying mechanisms altering tissue homeostasis, and subsequently an individual's capacity for healing and adaptive response.
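The abstract above reports its association results as odds ratios (OR) with 95% confidence intervals and FDR-adjusted p values, computed in R. It does not say which functions or estimators were used, so the following is only a minimal Python sketch of two common choices: the Wald (Woolf) interval for an odds ratio from a 2x2 count table, and the Benjamini-Hochberg adjustment. The function names and any counts passed to them are hypothetical and are not taken from the thesis.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio with a Wald (Woolf) confidence interval.

    a, b: cases with / without the allele (hypothetical counts)
    c, d: controls with / without the allele (hypothetical counts)
    z:    normal quantile; 1.959964 gives a 95% interval
    All four counts must be non-zero for this estimator.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted p values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level under this approximation. Haplotype-based odds ratios, such as those reported with a haplo.score statistic, come from a different model and will not generally match a simple 2x2 calculation.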