
Browsing by Author "Chimusa, Emile R"

Now showing 1 - 18 of 18
  • Item
    Open Access
    A post-gene silencing bioinformatics protocol for plant-defence gene validation and underlying process identification: case study of the Arabidopsis thaliana NPR1
    (BioMed Central, 2017-11-23) Yocgo, Rosita E; Geza, Ephifania; Chimusa, Emile R; Mazandu, Gaston K
    Background: Advances in forward and reverse genetic techniques have enabled the discovery and identification of several plant defence genes based on quantifiable disease phenotypes in mutant populations. Existing models for testing the effect of gene inactivation or genes causing these phenotypes do not take into account eventual uncertainty of these datasets and potential noise inherent in the biological experiment used, which may mask downstream analysis and limit the use of these datasets. Moreover, elucidating biological mechanisms driving the induced disease resistance and influencing these observable disease phenotypes has never been systematically tackled, eliciting the need for an efficient model to characterize completely the gene target under consideration. Results: We developed a post-gene silencing bioinformatics (post-GSB) protocol which accounts for potential biases related to the disease phenotype datasets in assessing the contribution of the gene target to the plant defence response. The post-GSB protocol uses Gene Ontology semantic similarity and pathway dataset to generate enriched process regulatory network based on the functional degeneracy of the plant proteome to help understand the induced plant defence response. We applied this protocol to investigate the effect of the NPR1 gene silencing to changes in Arabidopsis thaliana plants following Pseudomonas syringae pathovar tomato strain DC3000 infection. Results indicated that the presence of a functionally active NPR1 reduced the plant’s susceptibility to the infection, with about 99% of variability in Pseudomonas spore growth between npr1 mutant and wild-type samples. Moreover, the post-GSB protocol has revealed the coordinate action of target-associated genes and pathways through an enriched process regulatory network, summarizing the potential target-based induced disease resistance mechanism. 
Conclusions: This protocol can improve the characterization of the gene target and, potentially, elucidate induced defence response by more effectively utilizing available phenotype information and plant proteome functional knowledge.
  • Item
    Open Access
    A post-gene silencing bioinformatics protocol for plant-defence gene validation and underlying process identification: case study of the Arabidopsis thaliana NPR1
    (2017) Yocgo, Rosita E; Geza, Ephifania; Chimusa, Emile R; Mazandu, Gaston K
    Advances in forward and reverse genetic techniques have enabled the discovery and identification of several plant defence genes based on quantifiable disease phenotypes in mutant populations. Existing models for testing the effect of gene inactivation or genes causing these phenotypes do not take into account eventual uncertainty of these datasets and potential noise inherent in the biological experiment used, which may mask downstream analysis and limit the use of these datasets. Moreover, elucidating biological mechanisms driving the induced disease resistance and influencing these observable disease phenotypes has never been systematically tackled, eliciting the need for an efficient model to characterize completely the gene target under consideration.
  • Item
    Restricted
    ancGWAS: a post genome-wide association study method for interaction, pathway and ancestry analysis in homogeneous and admixed populations
    (Oxford University Press, 27) Chimusa, Emile R; Mbiyavanga, Mamana; Mazandu, Gaston K; Mulder, Nicola J
    Despite numerous successful Genome-wide Association Studies (GWAS), detecting variants that have low disease risk still poses a challenge. GWAS may miss disease genes with weak genetic effects or strong epistatic effects due to the single-marker testing approach commonly used. GWAS may thus generate false negative or inconclusive results, suggesting the need for novel methods to combine effects of single nucleotide polymorphisms within a gene to increase the likelihood of fully characterizing the susceptibility gene. Results: We developed ancGWAS, an algebraic graph-based centrality measure that accounts for linkage disequilibrium in identifying significant disease sub-networks by integrating the association signal from GWAS data sets into the human protein–protein interaction (PPI) network. We validated ancGWAS using an association study result from a breast cancer data set and the simulation of interactive disease loci in the simulation of a complex admixed population, as well as pathway-based GWAS simulation. This new approach holds promise for deconvoluting the interactions between genes underlying the pathogenesis of complex diseases. Results obtained yield a novel central breast cancer sub-network of the human interactome implicated in the proteoglycan syndecan-mediated signaling events pathway which is known to play a major role in mesenchymal tumor cell proliferation, thus providing further insights into breast cancer pathogenesis.
  • Item
    Open Access
    Association of variants in APOL1, MYH9 and HMOX1 with micro-albuminuria among sickle cell disease patients from Cameroon
    (2016) Geard, Amy; Wonkam, Ambroise; Chimusa, Emile R
    Introduction: Sickle Cell Disease (SCD) is a monogenic, multi-organ hemoglobinopathy that is highly prevalent in Africa, with nearly 300 000 newborn cases per year. The underlying pathophysiological mechanism of the disease involves alteration of the normal soft, biconcave disc shape of erythrocytes to that of a rigid crescent. These abnormal red blood cells cause vaso-occlusion and intravascular hemolysis, resulting in a variety of clinical manifestations, including acute pain crises, anemia, and damage to various organs. Kidney disease is a clinical proxy of severity, developing only in a subset of patients, and is subject to modification by genetic variation. Indeed, reports have shown significant associations between proteinuria and specific genetic variants in MYH9 and APOL1, and between HMOX1 variants and both estimated Glomerular Filtration Rate (eGFR) and End Stage Kidney Disease (ESKD), among adult African Americans affected by SCD. However, the association between these variants and micro-albuminuria, a primary indicator of renal dysfunction, has not been investigated, nor has any study of these variants been performed among SCD patients in Africa. Aim: The aim of this study was to investigate the association of targeted single nucleotide polymorphisms (SNPs) in APOL1, MYH9 and HMOX1, as well as a 5' promoter dinucleotide repeat in HMOX1, with micro-albuminuria among SCD patients from Cameroon; and to compare the results to those from a cohort of non-SCD Cameroonian individuals affected by ESKD.
  • Item
    Open Access
    "Broadband" bioinformatics skills transfer with the Knowledge Transfer Programme (KTP): educational model for upliftment and sustainable development
    (Public Library of Science, 2015) Chimusa, Emile R; Mbiyavanga, Mamana; Masilela, Velaphi; Kumuthini, Judit
    A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.
  • Item
    Open Access
    Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method
    (Public Library of Science, 2013) Chimusa, Emile R; Daya, Michelle; Möller, Marlo; Ramesar, Raj; Henn, Brenna M; van Helden, Paul D; Mulder, Nicola J; Hoal, Eileen G
    Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes.
  • Item
    Open Access
    Dissecting the genetic bases of severe malaria resistance using genome-wide and post genomewide study approaches
    (2021) Mulisa, Delesa Damena; Chimusa, Emile R
    P. falciparum malaria remains one of the leading public health problems worldwide. The global tally of malaria in 2018 was estimated at 228 million cases and 405,000 deaths. African countries disproportionately carry the global burden of malaria, accounting for 93% and 94% of cases and deaths, respectively. Even though most infected children recover from P. falciparum malaria, a small subset (~1%) of cases progresses to severe disease and death. Over the last decade, several genome-wide association studies (GWASs) have been conducted in diverse malaria-endemic populations to understand the natural host protective immunity against severe malaria, which can provide clues for the development of new vaccines and therapeutics. However, beyond identifying associated variants, conventional GWAS approaches cannot reveal the underpinning biological functions. To bridge this gap, we applied various contemporary statistical genetic analytic approaches to malaria GWAS datasets of diverse malaria-endemic populations. First, we accessed malaria resistance GWAS datasets of three African populations (N = ~11,000), from Kenya, Gambia and Malawi, from the European Genome-phenome Archive (EGA) through the MalariaGEN consortium standard data access procedures. We explored the challenges of GWAS approaches in the genetically diverse African populations and examined how various advanced statistical genetic methods can be implemented to address these challenges. We investigated single nucleotide polymorphism (SNP) heritability (h²g) of malaria resistance in the Gambian populations and determined appropriate quality control (QC) thresholds to accurately estimate h²g in our dataset. Second, we estimated h²g in the three populations and partitioned h²g by chromosome, allele frequency and annotation using genetic relationship matrix restricted maximum likelihood approaches.
We further created an African-specific reference panel from African population datasets obtained from the 1000 Genomes Project and the African Genome Variation Project and computed linkage disequilibrium (LD). We used LD information obtained from these reference panels to compute cell-type-specific and non-cell-type-specific enrichments for GWAS summary statistics meta-analyzed across the three populations. Our results showed for the first time that malaria resistance is a polygenic trait with h²g of ~20% and that the causal variants are overrepresented around protein-coding regions of the genome. We further showed that h²g is disproportionately concentrated on three chromosomes (chr 5, 11 and 20), suggesting that targeting these chromosomes in future malaria genomic sequencing studies would be cost-effective. Third, we systematically predicted plausible candidate genes and pathways from functional analysis of severe malaria resistance GWAS summary statistics (N = 17,000) meta-analyzed across eleven populations in malaria-endemic regions in Africa, Asia and Oceania. We applied positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping and gene-based association analyses to identify candidate severe malaria resistance genes. We performed network and pathway analyses to investigate their shared biological functions. We further applied rare variant analysis to raw GWAS datasets of three malaria-endemic populations (Kenya, Malawi and Gambia) and performed population genetic structure analyses of the identified genes in the three endemic populations and 20 worldwide ethnic groups. Our functional mapping analysis identified 57 genes located in the known malaria genomic loci, while our gene-based GWAS analysis identified an additional 125 genes across the genome.
The identified genes were significantly enriched in malaria pathogenic pathways, including multiple overlapping pathways in erythrocyte-related functions, blood coagulation, ion channels, adhesion molecules, membrane signaling elements and neuronal systems. Furthermore, our population genetic analysis revealed that the minor allele frequencies (MAF) of the SNPs residing in the identified genes are generally higher in the three malaria-endemic populations than in global populations. Overall, our results suggest that the severe malaria resistance trait is attributable to multiple genes that are enriched in pathways linked to severe malaria pathogenesis. This highlights the possibility of developing new malaria therapeutics that can simultaneously target multiple malaria-protective host molecular pathways. In conclusion, this project showed that malaria resistance is mainly a polygenic trait that is influenced by genes and pathways linked to the blood-stage lifecycle of P. falciparum. These findings constitute the foundations for future experimental studies that can potentially lead to translational medicine, including the development of new vaccines and therapeutics. However, '-omics' studies, including those implemented in this study, are limited to single-datatype analysis, lack adequate power to explain the complexity of molecular processes, and usually identify correlations rather than causation. Thus, beyond single-locus analysis, the future direction of malaria resistance research requires a paradigm shift from single-omics to multi-stage and multi-dimensional integrative multi-omics studies that combine multiple data types from the human host, the parasite, and the environment. The current biotechnological and statistical advances may eventually lead to the feasibility of systems biology studies and revolutionize malaria research.
  • Item
    Open Access
    Functional Genome Wide Association Study in Susceptibility and Resistance of Malaria
    (2021) Kabongo, Etienne Ntumba; Chimusa, Emile R
    Background: For more than a century, malaria has been recognized as a deadly infectious disease, causing high morbidity and mortality worldwide. According to the World Health Organization (WHO), Africa bears the major share of the malaria burden, accounting for 95% (about 229 million) and 67% (about 274,000) of reported cases and deaths, respectively. One solution for reducing this threat is to find drugs or develop vaccines that are robust and adapted to the affected populations. Unfortunately, despite several efforts, malaria parasites are still developing resistance to the frontline antimalarials. Objectives: Our aim in this project is to conduct a systematic meta-analysis and various functional analyses across three study populations in Africa (Kenya, Malawi and Gambia). Methods and Materials: Our first analysis was a Genome Wide Association Study (GWAS) of the three study populations (Kenya, Malawi and Gambia) using the Emmax tool to identify the genetic variants associated with severe malaria. We then conducted a GWAS-based meta-analysis on the summary statistics from the three studies using Metasoft and Metal. Further, we implemented Functional GWAS (FGWAS) to re-weight the GWAS meta-analysis using functional genomic information (fgwas-tool software). Using results from fgwas-tool, we performed biological interpretation using the Functional Mapping (FUMA) tool. We mapped the significant SNPs to genes, and elucidated their functions and their associated cell types. We then performed pathway analysis and enrichment analysis of the genes using Genemania and Enrichr. Additionally, we computed a polygenic risk score (PRS) for individuals in each study population using PRSice, and evaluated the level of risk exposure for each individual based on the best predictive threshold.
Finally, we filtered the rare variants from each study, and performed SKAT analysis to aggregate the effect of the rare variants. Results: We identified 29 significant SNPs (14 replicated and 15 novel) reweighted by FGWAS based on the GWAS meta-analysis. The SNPs mapped to 15 genes (HBB, HBD, ATP2B4, ABO, CBLB, EYA2, HERPUD1, IQCJ, MPP7, NAV1, NUP210, SAMD5, TCERG1L, TMEM229B, C4orf19). Five of these genes (HBB, HBD, ATP2B4, ABO, CBLB) had been reported by different studies to be associated with malaria. In the PRS analysis, we obtained the best prediction for each population based on its best estimated threshold. The best-fit PRS for predicting the risk of an infectious disease like severe malaria is 0.00443458 at PT = 0.00165005 for Gambia, 8.4666e-158 at PT = 1 for Kenya and 1.5151e-55 at PT = 1 for Malawi. However, the prediction rate is very low and may fail to distinguish the cases from the controls. Conclusion: The functional analysis based on the fgwas results has shown that 5 genes (ATP2B4, ABO, HBD, HBB, CBLB) are highly associated with malaria across these 3 study populations (Gambia, Malawi and Kenya), and has identified 10 novel candidate genes, including a high number of mutations in the gene C4orf19, which will be a major subject of future studies. Although we obtained the best prediction for each population based on its best estimated threshold, the prediction rate is very low and may fail to distinguish the cases from the controls.
  • Item
    Open Access
    Genome-wide association studies of severe P. falciparum malaria susceptibility: progress, pitfalls and prospects
    (2019-08-14) Damena, Delesa; Denis, Awany; Golassa, Lemu; Chimusa, Emile R
    Background: P. falciparum malaria has been recognized as one of the prominent evolutionary selective forces on the human genome, leading to the emergence of multiple host-protective alleles. A comprehensive understanding of the genetic bases of severe malaria susceptibility and resistance can potentially pave the way to the development of new therapeutics and vaccines. Genome-wide association studies (GWASs) have recently been implemented in malaria-endemic areas and have identified a number of novel associated genetic variants. However, there are several open questions around heritability, epistatic interactions, genetic correlations and associated molecular pathways, among others. Here, we assess the progress and pitfalls of severe malaria susceptibility GWASs and discuss the biology of the novel variants. Results: We obtained all severe malaria susceptibility GWASs published thus far and accessed a GWAS dataset of Gambian populations from the European Genome-phenome Archive (EGA) through the MalariaGEN consortium standard data access protocols. We noticed that, while some of the well-known variants, including HbS and ABO blood group, were replicated across endemic populations, only a few novel variants were convincingly identified and their biological functions remain to be understood. We estimated the SNP-heritability of severe malaria at 20.1% in Gambian populations and showed how advanced statistical genetic analytic methods can potentially be implemented in malaria susceptibility studies to provide useful functional insights. Conclusions: The ultimate goal of malaria susceptibility studies is to discover novel causal biological pathways that provide protection against severe malaria, a fundamental step towards translational medicine such as the development of vaccines and new therapeutics.
Beyond single-locus analysis, the future direction of malaria susceptibility research requires a paradigm shift from single-omics to multi-stage and multi-dimensional integrative functional studies that combine multiple data types from the human host, the parasite, the mosquitoes and the environment. The current biotechnological and statistical advances may eventually lead to the feasibility of systems biology studies and revolutionize malaria research.
  • Item
    Open Access
    A Genomic Portrait of Haplotype Diversity and Signatures of Selection in Indigenous Southern African Populations
    (Public Library of Science, 2015) Chimusa, Emile R; Meintjies, Ayton; Tchanga, Milaine; Mulder, Nicola; Seoighe, Cathal; Soodyall, Himla; Ramesar, Rajkumar
    We report a study of genome-wide, dense SNP (∼900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking populations and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian ancestry and an unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed, predominantly between the admixed indigenous southern African populations and their ancestral Eurasian populations. Interestingly, many candidate genes identified within the genomic regions showing signals of selection were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.
  • Item
    Open Access
    Identifying Genes and Novel Variants Involved in Nonsyndromic Hearing Impairment, and Assessment of the Psychosocial Burden of Hearing Impairment in Cameroon
    (2021) Wonkam, Tingang Edmond; Chimusa, Emile R; Wonkam, Ambroise
    Background: Hearing impairment (HI) is the most common sensory disability and occurs in about 1 per 1000 live births in high-income countries, with a much higher incidence of up to 6 per 1000 live births in sub-Saharan Africa (SSA). HI can be due to environmental or genetic causes, and in many cases, it is not possible to establish a definite aetiology. Hereditary HI contributes to 30% to 50% of HI cases in SSA. Hereditary HI can be syndromic or non-syndromic, depending on whether or not it is associated with additional abnormalities in other organs. Non-syndromic HI (NSHI) accounts for 70% of hereditary hearing loss, and is genetically highly heterogeneous, with approximately 170 loci and 121 genes identified to date. Studies in European and Asian populations have identified pathogenic variants in the GJB2 (MIM: 121011) and GJB6 (MIM: 604418) genes as the major contributors to autosomal recessive NSHI (ARNSHI). The genetic aetiology of HI in Cameroon is unclear, as previous studies have found no contribution of the GJB2 and GJB6 genes to NSHI in Cameroon. However, patients included in those studies consisted of both familial and isolated cases; therefore, underlying environmental/multifactorial causes in some cases cannot be excluded (especially for the isolated cases). Six loci for X-linked HI have been described to date, including DFNX3 (Xp21.2), where DMD is located. Variants in DMD in humans are known to be responsible for Duchenne muscular dystrophy (DMD; MIM: 310200) and Becker muscular dystrophy (BMD; MIM: 300376), which are X-linked recessive disorders. Previous studies have demonstrated that mdx mice (an animal knockout model for DMD) have an increased hearing threshold compared to wild-type mice. However, the contribution of DMD to HI in humans has not been extensively studied. Besides, most of the previous studies on DMD were conducted in Caucasians, Asians, and Arabs; therefore, little is known about the features of this condition in Africans.
Parents of children with HI tend to face parenting challenges, especially in terms of communication and social interaction. In Africa, parents' perceived causes of deafness vary from environmental factors to mysterious (“evil forces”) or superstitious beliefs. Also, the attitude of society towards people with HI does not encourage their participation and involvement in the community, as they face overt discrimination. Aim and methods: The aim of this project was to examine the genetic aetiologies of HI in the Cameroonian population, and to uncover the challenges faced by persons with HI in Cameroon and their understanding of the causes of HI. This was addressed by 1) Establishing the current status of knowledge on HI in Africa (in terms of prevalence, aetiologies, and genetic aspects) with a particular focus on Cameroon, and assessing the contribution of connexin genes to HI in humans at a global level, through systematic literature reviews; 2) Revisiting the contribution of the GJB2 and GJB6 genes to NSHI in 29 multiplex Cameroonian families with NSHI and with strong evidence of non-environmental causes, through targeted gene sequencing and specific multiplex polymerase chain reaction (PCR); 3) Using the multiplex ligation-dependent probe amplification (MLPA) technique to investigate the most common variants associated with DMD in Cameroon and assess their possible implication in HI in humans; 4) Performing whole exome sequencing (WES) on 2 Cameroonian multiplex families with NSHI who tested negative for pathogenic variants in GJB2 and GJB6, to identify the underlying causative genes; 5) Performing in-depth interviews to gain an understanding of the challenges faced by people with HI in Cameroon, their understanding of the causes of HI, and how challenges could be remedied to improve the quality of life of persons with HI.
Results. Literature reviews: Our first systematic review showed that HI is a public health issue in Cameroon, especially in the older population, where the prevalence of HI is 14.8% in people aged 50 years and more. Environmental factors, including meningitis, impacted wax, and age-related disorders, are the leading aetiologies of HI in Cameroon, as in many other SSA countries, contributing 52.6% to 62.2% of HI cases. Hereditary HI comprises 0.8% to 14.8% of all cases in Cameroon, and in 32.6% to 37% of HI cases, the origin remains unknown. This contrasts with findings from high-income countries, where hereditary HI constitutes the main aetiology of HI, contributing to approximately 50% of cases. NSHI is the most frequent clinical entity and accounts for 86.1% to 92.5% of cases of hereditary HI in the Cameroonian population. No pathogenic variant was described in the GJB6 gene, and the prevalence of pathogenic variants in GJB2 ranged from 0% to 0.5%. The prevalence of pathogenic variants in other known NSHI genes was with type 2 Waardenburg syndrome, and three cases of type 2 Usher syndrome were identified in one family. By direct gene sequencing of the coding region of GJB2, no variants were found in any of the 29 families with NSHI. Additionally, through a specific multiplex PCR, the GJB6-D3S1830 deletion, which contributes to 9.7% of NSHI cases in Europeans, was not identified in any of the patients with HI. Subsequently, a total of 17 males with DMD from 14 families were recruited, aged 14 ± 5.1 (8–23) years. The mean age at onset of symptoms was 4.6 ± 1.5 years, and the mean age at diagnosis was 12.1 ± 5.2 years. Proximal muscle weakness was noted in all patients and calf hypertrophy in the large majority of them (88.2%; 15/17). Flexion contractures were particularly frequent at the ankle (85.7%; 12/14). Wasting of the shoulder girdle and thigh muscles was present in 50% (6/12) and 46.2% (6/13) of patients, respectively. No patient presented with HI.
The MLPA found that deletions of at least one exon in DMD occurred in 45.5% of patients (5/11), while duplications were observed in 27.3% (3/11). Both variant types were clustered between exons 45 and 50, and the proportion of de novo variants was estimated at 18.2% (2/11). Whole exome sequencing: We submitted DNA samples from five members of a multiplex non-consanguineous Cameroonian family segregating prelingual and progressive ARNSHI for WES. We identified novel bi-allelic compound heterozygous pathogenic variants in CLIC5 (MIM: 607293). The variants identified, i.e. the missense [NM_016929.5:c.224T>C; p.(Leu75Pro)] and the splicing (NM_016929.5:c.63+1G>A) variants, were validated using Sanger sequencing in all seven available family members and co-segregated with HI in the three family members with HI. The three affected individuals were compound heterozygous for both variants, and all unaffected individuals were heterozygous for one of the two variants. Both variants are classified as pathogenic under the American College of Medical Genetics (ACMG) guidelines for classification of variants and are absent from the genome aggregation database (gnomAD), UK10K, the Greater Middle East (GME) database, and the Single Nucleotide Polymorphism Database (dbSNP), as well as in 122 healthy controls from Cameroon. We also did not identify these pathogenic variants in 118 unrelated sporadic cases of NSHI from Cameroon. A second multiplex family was also screened using WES, followed by direct Sanger sequencing in additional patients and control participants. We identified a novel heterozygous missense variant [NM_001174116.2:c.918G>T; p.(Gln306His)] in DMXL2 (MIM: 612186), which was transmitted in an autosomal dominant manner and co-segregated with congenital/prelingual profound to total non-syndromic sensorineural HI in a family from Cameroon. The described family showed variable expressivity of the HI phenotype.
The p.(Gln306His) variant, which substitutes a highly conserved glutamine residue, is predicted to be deleterious by various bioinformatics tools and is absent from several genome databases, including the genome aggregation database (gnomAD) and the trans-omics for precision medicine (TOPMed) database. This variant was found neither in 121 healthy controls without personal or family history of HI, nor in 112 sporadic cases of NSHI from Cameroon. Our study identified novel variants in CLIC5 and DMXL2 in two Cameroonian families, and provided only the second report of variants in these genes worldwide, thus strengthening the case for these two genes as candidate genes for NSHI in humans. The psychosocial burden of HI: We performed in-depth interviews with 10 HI professionals (healthcare workers and educationists) and 10 persons affected by HI (persons with HI and caregivers). The results show that in this study population, HI is attributed to a variety of causes, including genetics, environmental factors, and a spiritual curse. There were reported cases of stigma and discrimination, with persons with HI in the Cameroonian population sometimes seen as having a “mental disorder”. Our participants also highlighted the difficulty that persons with HI have in accessing the necessary education and healthcare services, and suggested the need for policymakers and researchers to develop strategies to improve the social integration of persons with HI and their access to basic social services. These include 1) increased awareness amongst the general population, 2) the establishment of more special schools, and 3) building and equipping facilities for proper management of HI. Conclusions: Our project confirms that variants in GJB2 and GJB6 genes do not contribute significantly to NSHI in the Cameroonian population. 
Also, variants in DMD, which were shown to be associated with an increased hearing threshold in mice, do not seem to be implicated in HI in Cameroon, nor were they in previous human studies (although those studies did not objectively assess hearing using standardized testing methods). Despite the first symptoms of DMD occurring in infancy, the diagnosis is frequently made later, in adolescence, indicating an underestimation of the number of cases of DMD in Cameroon. Future screening of deletions and duplications in patients from Cameroon should focus on the distal part of the DMD gene. Subsequently, this study successfully identified the candidate genes in two Cameroonian multiplex families with NSHI through the use of WES, and thus highlights the efficacy of next-generation sequencing techniques in resolving HI cases in Cameroonians and in cases where no pathogenic variants are found in common HI genes. Additionally, our project, which confirms that the CLIC5 and DMXL2 genes are associated with HI in humans, advocates for the inclusion of these two genes in diagnostic gene panels for NSHI in clinical settings. Last, this study shows the difficult social interaction and access to proper management faced by persons with HI in Cameroon, and highlights the need to educate populations on the causes of HI for better acceptance of persons with HI in Cameroonian society.
  • Open Access
    Integration of multi-omic data and neuroimaging characteristics in studying brain related diseases
    (2020) Elsheikh, Samar Salah Mohamedahmed; Mulder, Nicola J; Crimi, Alessandro; Chimusa, Emile R
    Approaches to the identification of genetic variants associated with complex brain diseases have evolved in recent decades. This evolution was supported by advancements in medical imaging and genotyping technologies that result in rich data production in the field of imaging genetics and radiogenomics. Studies in these fields have taken different designs and directions, from genome-wide associations to studying the complex interplay between genetics and structural connectivity of a wide range of brain-related diseases. Nevertheless, such combinations of heterogeneous, high-dimensional and inter-related data have introduced new challenges which cannot be handled with traditional statistical methods. In this thesis, we proposed analysis pipelines and methodologies to study the causal relationship between neuroimaging features, including tumour characteristics and connectomics, genetics and clinical factors in brain-related diseases. In doing so, we adopted two longitudinal study designs and modelled the association between Alzheimer's disease progression and genetic factors, utilising local and global brain connectivity networks. In addition, we performed a multi-stage radiogenomic analysis in glioblastoma using non-parametric statistical methods. To address some limitations in the methods, we adopted the Structural Equation Model and developed a mathematical model to examine the inter-correlation between neuroimaging and multi-omic characteristics of brain-related diseases. Our analyses successfully identified risk genes that were previously reported in the literature on Alzheimer's disease and glioblastoma, and discovered potential risk variants which associate with disease progression. More specifically, we found some loci in the genes CDH18, ANTXR2 and IGF1, located on chromosomes 5, 4 and 12, to have an effect on brain connectivity over time in Alzheimer's disease. 
We also found that the expression of APP, HFE, PLAU and BLMH has significant effects on the structural connectivity of local areas in the brain: the left Heschl gyrus, right anterior cingulate gyrus, left fusiform gyrus and left Heschl gyrus, respectively. These potential association patterns could be useful for early disease diagnosis, treatment and neurodegeneration prediction. More importantly, we identified gaps in the imaging genetics methodologies, proposed a mathematical model accounting for these limitations, and evaluated the model, which produced promising results. Our proposed flexible model, BiGen, addresses the gaps in the existing tools by combining neuroimaging, genetic, environmental, and phenotype information in a single complex analysis, accounting for the heterogeneity, inter-correlation, and non-linearity of the variables. Moreover, BiGen adopts an important assumption which is rarely met in the imaging genetics literature: all four variables are assumed to be latent constructs, meaning that they cannot be observed directly from the data and are measured through observed indicators. This is an important assumption in neuroimaging, behavioural and genetic studies, and it is one of the reasons why BiGen is flexible and can easily be extended to include more indicators and latent constructs in the context of brain-related diseases.
  • Open Access
    Investigating local ancestry inference models in mixed ancestry individual genomes
    (2022) Geza, Ephifania; Mazandu, Gaston K; Chimusa, Emile R; Mulder, Nicola J
    Owing to historical events including the slave trade, agricultural interests, colonialism, and political and/or economic instability, most modern humans are a mosaic of segments originating from different populations. They result from the interbreeding of two or more previously isolated populations, leading to admixture. Known admixed populations include the mixed ancestry population of South Africa, Latin Americans and African Americans. Admixed individuals play important roles in understanding population history, disease aetiology, and personal genomics. Accordingly, efforts have been made to understand the genetic composition of such individuals, yielding several models that infer the ancestry of every chromosomal segment in admixed individuals (local ancestry). However, new research questions have emerged concerning the statistical and biological parameters of these models, as well as their performance across admixed datasets. This elicited the need to examine existing local ancestry inference models in order to identify and tackle their critical issues, which is the main goal of this thesis. We achieve this in four steps, constituting the main contributions of this PhD project: (1) Qualitative assessment of existing models through a systematic review; (2) Building a unified framework integrating existing models for inferring and assessing local ancestry estimates; (3) Quantitative assessment of existing methods within the same framework; and (4) Proposing a model extension to account for natural selection and the origin of modern humans to improve the accuracy of local ancestry estimates. Firstly, we assess models using published results on different datasets and performance measures, to orient modellers and software developers on the future trends in local ancestry inference. 
Secondly, to address the challenges identified in (1), including model complexity reflected in the distinct inputs each model requires and its output formats, we design a unified framework, referred to as FRANC, to manipulate tool-specific inputs, deconvolve ancestry and standardise outputs, to ease the inference process and pave the way for model assessment. Thirdly, using FRANC, we assess the performance of eight state-of-the-art models on simulated admixed population datasets involving three and five ancestral populations. LAMP-LD and LOTER performed better than the other six tested models on admixed populations involving five ancestral populations, while RFMIX, WINPOP, ELAI and LAMP-LD were comparable on admixed datasets involving three populations. Performance was evaluated using measures borrowed from the machine learning confusion matrix. Finally, we noted that it may be more practical to extend existing models to incorporate more realistic biological assumptions. Hence, we propose a nonparametric hidden Markov model that adjusts an existing model, mSPECTRUM, to account for natural selection and state persistence when deconvolving local ancestry, which should improve the accuracy of estimates. Like mSPECTRUM, this model acknowledges the two common hypotheses on the origin of modern humans, making it comparable to mSPECTRUM, which has been shown to be competitive with HAPMIX, a benchmark for two-way admixtures. Together, these four contributions advance the admixture analysis of populations.
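The confusion-matrix measures mentioned in the abstract above can be illustrated with a minimal sketch. It assumes local ancestry is compared marker by marker between simulated truth and a model's calls, and that each ancestry is scored one-vs-rest; the function name `ancestry_metrics`, the labels and the toy data are hypothetical and are not taken from FRANC or the thesis.

```python
from collections import Counter


def ancestry_metrics(true_labels, inferred_labels, ancestry):
    """One-vs-rest confusion-matrix measures for a single ancestry label."""
    if len(true_labels) != len(inferred_labels):
        raise ValueError("label sequences must have the same length")
    # Count (is_truly_ancestry, is_called_ancestry) pairs across markers.
    pairs = Counter(
        (t == ancestry, p == ancestry)
        for t, p in zip(true_labels, inferred_labels)
    )
    tp = pairs[(True, True)]
    fn = pairs[(True, False)]
    fp = pairs[(False, True)]
    tn = pairs[(False, False)]
    return {
        "tp": tp,
        "fn": fn,
        "fp": fp,
        "tn": tn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "accuracy": (tp + tn) / len(true_labels),
    }


# Hypothetical per-marker ancestry along one simulated haplotype.
truth = ["AFR", "AFR", "EUR", "EUR", "ASN", "AFR"]
inferred = ["AFR", "EUR", "EUR", "EUR", "ASN", "AFR"]
metrics = ancestry_metrics(truth, inferred, "AFR")
print(metrics)
```

Scoring each ancestry separately keeps the measures comparable between three-way and five-way admixture scenarios, at the cost of needing an averaging rule if a single summary figure is wanted.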
  • Open Access
    Large–scale data–driven network analysis of human–plasmodium falciparum interactome: extracting essential targets and processes for malaria drug discovery
    (2020) Agamah, Francis Edem; Chimusa, Emile R; Mazandu, Gaston
    Background: Plasmodium falciparum malaria is an infectious disease considered to have a great impact on public health due to its associated high mortality rates, especially in sub-Saharan Africa. The emergence of drug-resistant falciparum strains in Africa, notably to chloroquine and sulfadoxine-pyrimethamine, is traced mainly to Southeast Asia, where the artemisinin resistance rate is increasing. Although careful surveillance to monitor the emergence and spread of artemisinin-resistant parasite strains in Africa is ongoing, research into new drugs, particularly for African populations, is critical since there is no replacement yet for artemisinin combination therapies (ACTs). Objective: The overall objective of this study is to identify potential protein targets through host–pathogen protein–protein functional interaction network analysis, to understand the underlying mechanisms of drug failure, and to identify those essential targets that can help predict potential drug candidates specific to African populations through a protein-based approach to both host and Plasmodium falciparum genomic analysis. Methods: We leveraged malaria-specific genome-wide association study summary statistics data obtained from Gambia, Kenya and Malawi populations, Plasmodium falciparum selective pressure variants and functional datasets (protein sequences, interologs, host-pathogen intra-organism and host-pathogen inter-organism protein-protein interactions (PPIs)) from various sources (STRING, Reactome, HPID, UniProt, IntAct and literature) to construct overlapping functional networks for both host and pathogen. Algorithms developed for this study and a large-scale data-driven computational framework were used to analyze the datasets and the constructed networks to identify densely connected subnetworks or hubs essential for network stability and integrity. 
The host-pathogen network was analyzed to elucidate the influence of parasite candidate key proteins within the network and predict possible resistance pathways due to host-pathogen candidate key protein interactions. We performed biological and pathway enrichment analysis on the critical proteins identified to elucidate their functions. In order to leverage disease-target-drug relationships to identify potential repurposable, already approved drug candidates that could be used to treat malaria, pharmaceutical datasets from DrugBank were explored using a semantic similarity approach based on target-associated biological processes. Results: About 600,000 significant SNPs (p-value < 0.05) from the summary statistics data were mapped to their associated genes, and we identified 79 human-associated malaria genes. The assembled parasite network comprised 8 clusters containing 799 functional interactions between 155 reviewed proteins, of which 5 clusters contained 43 key proteins (selective variants) and 2 clusters contained 2 candidate key proteins (key proteins characterized by a high centrality measure), C6KTB7 and C6KTD2. The human network comprised 32 clusters containing 4,133,136 interactions between 20,329 unique reviewed proteins, of which 7 clusters contained 760 key proteins and 2 clusters contained 6 significant human malaria-associated candidate key proteins or genes: P22301 (IL10), P05362 (ICAM1), P01375 (TNF), P30480 (HLA-B), P16284 (PECAM1) and O00206 (TLR4). The generated host-pathogen network comprised 31,512 functional interactions between 8,023 host and pathogen proteins. We also explored the association of the pfk13 gene within the host-pathogen network. We observed that pfk13 clusters with host kelch-like proteins and other regulatory genes but has no direct association with our identified host candidate key malaria targets. 
We implemented a semantic similarity-based approach, complemented by Kappa and Jaccard statistical measures, to identify 115 malaria-similar diseases and 26 potential repurposable drug hits that can be appropriated experimentally for malaria treatment. Conclusion: In this study, we reviewed existing antimalarial drugs and resistance-associated variants contributing to the diminished sensitivity of antimalarials, especially chloroquine, sulfadoxine-pyrimethamine and artemisinin combination therapy, within the African population. We also described various computational techniques implemented in predicting drug targets and leads in drug research. In our data analysis, we showed that possible mechanisms of resistance to artemisinin in Africa may arise from the combinatorial effects of many genes conferring resistance to chloroquine and sulfadoxine-pyrimethamine. We investigated the role of pfk13 within the host-pathogen network. We predicted key targets that have been proposed to be essential for malaria drug and vaccine development through structural and functional analysis of host and pathogen functional networks. Based on our analysis, we propose these targets as essential co-targets for combinatorial malaria drug discovery.
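As an illustration of the Jaccard measure named in the abstract above, the following minimal sketch compares two diseases by the overlap of their associated biological-process terms. The abstract does not give the exact formulation used in the thesis, so the standard set-based definition is assumed here; the function name and the term sets are hypothetical.

```python
def jaccard(terms_a, terms_b):
    """Jaccard index: size of the intersection over size of the union."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    # Two empty term sets share nothing measurable; treat as zero overlap.
    return len(a & b) / len(union) if union else 0.0


# Hypothetical biological-process annotations for two diseases.
malaria_terms = {"immune response", "erythrocyte invasion", "inflammation"}
other_terms = {"immune response", "inflammation", "cell adhesion", "apoptosis"}

score = jaccard(malaria_terms, other_terms)
print(f"Jaccard similarity: {score:.2f}")
```

A threshold on such a score would be one way to shortlist "malaria-similar" diseases; the cut-off actually used is not stated in the abstract.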
  • Open Access
    Leveraging genotypes imputation and polygenic risk scores in malaria susceptibility
    (2020) Kimathi, Peter Opiyo; Chimusa, Emile R
    Background: Over the past few years, Genome Wide Association Studies (GWAS) have identified thousands of genetic variants that are associated with a wide range of complex traits, and have provided valuable insights into their genetic architectures. In malaria studies too, GWAS has been successful and a number of genetic variants have been identified. Despite this success, the complete aetiology of malaria, and of many complex traits in general, remains poorly understood. A key concern is that the missing heritability remains too large, with some of the variants identified in some populations failing to replicate in independent study populations. Indeed, comparable sources have revealed that the statistical power of association studies can be improved either via genotype imputation approaches or by treating the whole genome of an individual as a risk predictor using Polygenic Risk Scores (PRS). However, imputation performance remains modest in African populations, and few (if any) studies have evaluated the potential of imputation tools in these populations. On the other hand, although the utility of PRS has been shown in other studies, it has neither been assessed in African populations nor applied to an infectious disease like malaria. Methodology: We evaluated the performance of five popular genotype imputation methods (IMPUTE4, minimac4, IMPUTE2, minimac3 and BEAGLE4) using case-control datasets that mimic African populations, European populations and admixed populations, simulated with FractalSIM. We assessed imputation performance based on internal imputation quality metrics and genotype concordance. We applied the best imputation tool, based on the assessment results, to impute raw genotype data of severe malaria case-control studies from MalariaGEN for three African populations: Kenya, The Gambia and Malawi. Similarly, we obtained summary statistics of the same datasets, and imputed the summary statistics with ImpG. 
We performed an association analysis on the imputed raw genotypes, and compared the association results with those of ImpG-based imputation. Additionally, we performed meta-analysis with METASOFT, and compared the meta-analysis results from ImpG-based imputation with those from the imputed raw genotype associations. Finally, we assessed five PRS methods (PRSice, LDpred(p+t), PRSoS, PLINK and PRScS) in predicting genetic risk in African populations, and applied the best PRS method to predict the genetic risk of severe malaria. Results: IMPUTE2 recorded the best performance based on imputation accuracy and concordance for the African (accuracy=80.21% and concordance=99.2%) and the admixed samples (accuracy=69.46% and concordance=90.92%) for variants with MAF>0.05. The other tools recorded similar accuracy and concordance, although BEAGLE4 recorded the lowest concordance and accuracy across all the African and admixed datasets. For the real genotype data, no SNP attained the genome-wide significance threshold of 5.0 × 10⁻⁸ for the Malawi and Gambia datasets. However, for the Kenyan dataset, 9 SNPs on chromosome 11 were significantly associated with severe malaria. Three of these SNPs were located on the HBG2 gene and the remaining six had not been reviewed. No SNP attained the genome-wide threshold for the ImpG-imputed summary statistics in any of the populations. For the IMPUTE2-based meta-analysis, only one SNP, rs12295158, located in the HBB region, was significant across all the meta-analysis models (with P-values of 2.88 × 10⁻¹² for fixed effect (FE), 2.88 × 10⁻¹² for random effect (RE) and 9.64 × 10⁻¹² for binary effect (BE)). In the ImpG-based meta-analysis, on the other hand, two SNPs were significant across all the meta-analysis models (rs183731078, located on RFX3, with P-values of 8.40 × 10⁻⁹, 8.40 × 10⁻⁹ and 4.47 × 10⁻⁸ for FE, RE and BE respectively, and rs8096513, located on DLGAP1, with P-values of 1.43 × 10⁻⁹, 1.43 × 10⁻⁹ and 1.01 × 10⁻⁸ for FE, RE and BE respectively). 
Pathway enrichment and analysis of these genes revealed that both are associated with malaria. Finally, for the PRS, PRSoS recorded the best performance based on Nagelkerke's R² (0.01736) and area under the curve (AUC) (0.511). The other PRS methods recorded broadly similar results, with PLINK recording the lowest. The odds of having severe malaria were estimated as 2.869, and a unit change in PRS was associated with a -5.143 change in the odds of having severe malaria, with a P-value of 0.0193 at α = 0.05. However, the scores could only explain 1.28% of the phenotypic variance. Conclusion: Our results provide a foundation for future studies in genetics, especially in African populations, where the best-performing imputation tool has yet to be established. Moreover, our results have demonstrated the potential of applying PRS to infectious diseases.
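To make the idea of a polygenic risk score in the abstract above concrete, the sketch below computes the basic additive form: a weighted sum of risk-allele dosages using per-SNP effect sizes from GWAS summary statistics. This is the textbook definition only, not the specific algorithm of PRSoS or any other tool assessed in the thesis; the function name, dosages and effect sizes are hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive PRS: sum over SNPs of (risk-allele dosage x effect size)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("need one effect size per SNP dosage")
    return sum(d * beta for d, beta in zip(dosages, effect_sizes))


# Hypothetical individual: risk-allele counts (0, 1 or 2) at three SNPs.
dosages = [0, 1, 2]
# Hypothetical per-SNP log odds ratios from summary statistics.
effect_sizes = [0.10, -0.20, 0.05]

prs = polygenic_risk_score(dosages, effect_sizes)
print(f"PRS = {prs:.3f}")
```

In practice the tools differ mainly in which SNPs enter the sum and how the effect sizes are shrunk or thresholded, which is why their predictive performance can differ on the same data.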
  • Open Access
    Leveraging the microbiome in host genome wide association studies
    (2021) Awany, Denis; Chimusa, Emile R; Dandara, Collet
    Genome-wide association study (GWAS) has emerged as an effective method for detecting genetic polymorphisms associated with expressed phenotypes. Over the past decade, GWAS of human traits and diseases has revolutionized the field of complex disease genetics, identifying hundreds of genetic variants associated with several different phenotypes, ranging from metabolic diseases to cardiovascular and neuropsychiatric conditions. These associations have provided fundamental insights into the genetic architecture of disease susceptibility and led to initial forays into clinical applications, particularly in creation of genetic risk scores for improved disease risk prediction and identification of new drug targets for novel drug development. Despite this gratifying success, however, for almost all complex traits, the identified genetic loci explain only a small proportion, generally less than half, of the estimated heritability. A number of alternative explanations have been offered for this, including undetected genetic effects, unaccounted-for environmental factors, and gene–gene and gene–environment interaction effects. Although there is no consensus on these explanations, it is universally acknowledged that a substantial proportion of the trait heritability is attributable to existence of a large number of undetected genetic variants distributed across the entire allele frequency spectrum, each of which has very small to modest effect on the phenotype, and non host-DNA factors that contribute to phenotypic variation. In parallel to host GWAS, the advent of next-generation sequencing technologies (NGS) that enable culture-independent profiling of microbial communities has led to the rediscovery of the microbiome - the collective genome of the microorganisms that inhabit the body - and the emergence of microbiome-wide association studies. 
These studies have linked the gut microbiome to a variety of human conditions, ranging from neurological conditions, such as Parkinson's disease and autism, to metabolic diseases, such as obesity, diabetes, and cardiovascular disease. Given the critical importance of the microbiome in host phenotype, it is clear that in order to understand the basis of host phenotypic status more comprehensively, both the host's genotype and microbiome information have to be examined. This thesis explores the dissection of microbial taxa and host genetic polymorphisms associated with human complex traits and diseases, and the interaction of human host genetic polymorphisms with the microbiome. Then, a Bayesian statistical framework, based on the Dirichlet process random effects model, is proposed for identifying microbial species associated with host phenotype. The proposed method uses a weighted combination of phylogenetic and radial basis function kernels to model microbial taxa effects, and a non-parametrically defined latent variable to model latent heterogeneity among samples. Philosophically, the non-parametric specification amounts to the addition of an infinite amount of prior information about all fine details of the parameters being modelled; it thus represents an attractive strategy. The utility of the method is demonstrated through simulation experiments and application to real microbiome datasets for schizophrenia, HIV/AIDS, and atherosclerosis, where it is shown that the method is not only robust but also has high statistical power for association inference, resulting in a framework that can contribute to our understanding of the link between the microbiome and human diseases. Understanding the human genetic predisposing factors in concert with this link will help human GWAS fulfil its translational potential, from patient stratification and disease risk prediction to identification of new biology and drug discovery.
  • Open Access
    Leveraging Whole Genome Sequences to Compare Mutational Mechanism and Identify Medically Relevant Variation in African versus Non-African Descend Populations
    (2020) Alosaimi, Shatha Mobarak; Chimusa, Emile R
    Whole-Genome Sequencing (WGS) is ushering in a new era in healthcare and research by identifying genetic variation in all populations. However, African populations are still under-represented. Since African populations are the most genetically diverse, with a high rate of heterogeneity, we need to benchmark the WGS analysis pipeline to ensure reliable mutation detection. It is therefore essential to ensure that all steps of WGS downstream analysis are accurate, mainly the variant calling (VC). Current VC tools may produce false-positive/negative results; such results may lead to misleading conclusions in the prioritisation of mutations and in the clinical relevance and actionability of genes. With so many VC tools available, two questions arise. Firstly, which tool has high sensitivity and precision in low or high coverage African sequences, given their high genetic diversity and heterogeneity? Secondly, will improving the VC result advance the accuracy of detecting mutations and incidental findings (actionable genes) in African populations? In this project, a total of 100 DNA sequence samples were simulated (50 mimicking an African and 50 a European genetic background) at different coverage levels (high and low). The sensitivity of nine different VC tools in discovering polymorphisms was examined, and the tools were assessed in terms of false positive/negative call rates against the simulated gold-standard variants. Combining our results on sensitivity and positive predictive value (PPV), Lofreq performs best on African population data (sens=0.85, PPV=0.983, F-score=0.91) on high/low coverage data; as a result, we chose Lofreq to perform variant calling, followed by gene-based annotation to conduct in silico prediction of mutations on publicly available data (the African Genome Variation Project and the 1000 Genomes Project). 
In doing so, we have leveraged WGS to examine and validate four high-burden diseases in the African context: the communicable diseases HIV/AIDS, malaria and tuberculosis (TB), and the non-communicable sickle cell disease. These diseases have uniquely shaped ethnic-specific and continental genomic variation and therefore provide unprecedented opportunities to map disease genes across the African continent. Moreover, we examined the current actionable genes recommended by the American College of Medical Genetics and Genomics (ACMG) in the African population and provide an update on additional African-specific actionable genes. Our results suggest that African and African diaspora ethnic groups, particularly Bantu and Khoesan groups, have high gene diversity, a high proportion of derived alleles at low minor allele frequency (0.0 − 01) and the highest proportion of pathogenic variants within HIV, TB, malaria and sickle cell disease genes, while non-African ethnic groups, including Latin American, Afro-Asiatic and European-related ethnic groups, have a high proportion of pathogenic variants within the current actionable gene list. Overall, given the highest genetic diversity observed in African and African diaspora-related ethnic groups for these four African burden diseases and the associated current actionable genes, our results support (1) the use of personalised medicine as beneficial to both the African continent and the world; and (2) a recommendation for an African-specific list of actionable genes to further improve African and diaspora healthcare.
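The variant-calling figures quoted in the abstract above (sens=0.85, PPV=0.983, F-score=0.91) can be checked with a short worked example. It assumes the F-score is the usual harmonic mean of sensitivity and positive predictive value, which the abstract does not state explicitly; the function name is illustrative.

```python
def f_score(sensitivity, ppv):
    """F1 score: harmonic mean of sensitivity (recall) and PPV (precision)."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values reported for Lofreq on the simulated African data.
reported = f_score(0.85, 0.983)
print(round(reported, 2))
```

Under that assumption the reported sensitivity and PPV reproduce the stated F-score to two decimal places, so the three figures are mutually consistent.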
  • Open Access
    A panel of ancestry informative markers for the complex five-way admixed South African Coloured population
    (Public Library of Science, 2013) Daya, Michelle; Merwe, Lize van der; Galal, Ushma; Möller, Marlo; Salie, Muneeb; Chimusa, Emile R; Galanter, Joshua M; Helden, Paul D van; Henn, Brenna M; Gignoux, Chris R
    Admixture is a well-known confounder in genetic association studies. If genome-wide data is not available, as would be the case for candidate gene studies, ancestry informative markers (AIMs) are required in order to adjust for admixture. The predominant population group in the Western Cape, South Africa, is the admixed group known as the South African Coloured (SAC). A small set of AIMs that is optimized to distinguish between the five source populations of this population (African San, African non-San, European, South Asian, and East Asian) will enable researchers to cost-effectively reduce false-positive findings resulting from ignoring admixture in genetic association studies of the population. Using genome-wide data to find SNPs with large allele frequency differences between the source populations of the SAC, as quantified by Rosenberg et al.'s I_n statistic, we developed a panel of AIMs by experimenting with various selection strategies. Subsets of different sizes were evaluated by measuring the correlation between ancestry proportions estimated by each AIM subset and ancestry proportions estimated using genome-wide data. We show that a panel of 96 AIMs can be used to assess ancestry proportions and to adjust for the confounding effect of the complex five-way admixture that occurred in the South African Coloured population.