Browsing by Author "Mulder, Nicola"
- Item, Open Access: A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae (2023). Iranzadeh, Arash; Mulder, Nicola. Streptococcus pneumoniae (pneumococcus) is one of the leading causes of mortality in Africa. It asymptomatically colonizes the human nasopharynx. Invasive pneumococcal disease occurs when isolates spread to normally sterile sites such as the lungs, blood, and central nervous system. Colonization, though, does not necessarily lead to infection. Some isolates remain in the upper respiratory tract without causing any symptoms. This thesis hypothesized that invasive and non-invasive isolates differ genetically. We tested this hypothesis by applying a pan-genome approach using whole-genome sequencing short reads of 1477 samples from Malawi, including those obtained from the nasopharynx of carriers (825 samples) and from the blood and cerebrospinal fluid of patients (652 samples). In-silico serotyping identified 56 serotypes in the cohort, and statistical analysis showed that, despite vaccination, the prevalence of serotypes 1 and 12F increased amongst patients. Genomes were assembled, and a reference pan-genome for all strains was built. Short reads were aligned to the core genome, and core variants were called. The population structure was determined based on the distribution of variants in the pan-genome. Finally, genes with a significant presence in the invasive isolates were identified. Functional enrichment analysis of potential virulence genes was carried out to address how specific genes may contribute to pathogenesis. The findings highlighted the features of the pneumococcus pan-genome in Malawi. The core and accessory genomes were characterized based on the functional analysis of genes. The core components included ribosomal subunits; subunits of F-type ATP synthase; and enzymes that catalyze the attachment of amino acids to tRNA molecules, DNA replication, DNA repair, and homologous recombination. 10.13% of the core and soft-core genes were uncharacterized.
In the accessory genome, the study detected genes from Regions of Diversity (RDs), including subunits of V-type ATPases and a sodium/solute symporter from RD8a; enzymes from RD3 catalyzing capsule synthesis; subunits of the PsrP secY2A2 pathogenicity island from RD10; genes from RD6 and RD7 involved in transposing mobile genetic elements; genes from RD2, RD8b, and RD12 participating in communication and competition; and genes from RD4 that assemble pilins into pili and anchor pili to the cell wall. 53.58% of accessory genes were uncharacterized. Most serotypes showed a similar prevalence in the carriage and disease groups. However, the significant abundance of serotypes 1, 5, and 12F among patients compared to the carriage group suggested they are highly invasive with a short colonization period. These serotypes exhibited a remarkable genetic distinction from others, including the absence and presence of several genes in their genome structure. The lack of genes from a genomic island known as RD8a was the most pronounced difference between serotypes 1, 5, and 12F and the serotypes that were significantly prevalent in the nasopharynx. Genes in RD8a are involved in binding to epithelial cells and in aerobic respiration, which synthesizes ATP through oxidative phosphorylation. The absence of RD8a from serotypes 1, 5, and 12F may be associated with their short duration in the nasopharynx, where they need to bind to epithelial cells and access the free oxygen molecules required for aerobic respiration. Given this, the amount of ATP is likely to decline in serotypes 1, 5, and 12F, causing them to harbour more phosphotransferase systems to transport carbohydrates, since these transporters use phosphoenolpyruvate as the energy source instead of ATP.
In conclusion, serotypes 1, 5, and 12F, the most prevalent and invasive pneumococcal strains in Malawi, showed a considerable genetic distinction from other strains that may be associated with their short colonization period and quickness to infect the blood and cerebrospinal fluid.
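The abstract above reports that genes with a significant presence in invasive isolates were identified, but it does not name the statistical test. As a rough sketch only, assuming a one-sided Fisher's exact test on a gene presence/absence table (an assumption, not the thesis's stated method), the per-gene comparison could look like this. The group sizes (652 disease, 825 carriage) come from the abstract; the presence counts and the function name `fisher_exact_greater` are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment.

    a = invasive isolates carrying the gene, b = invasive isolates lacking it,
    c = carriage isolates carrying the gene, d = carriage isolates lacking it.
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    invasive_total = a + b
    gene_total = a + c
    n = a + b + c + d
    denominator = comb(n, gene_total)
    upper = min(invasive_total, gene_total)
    numerator = sum(
        comb(invasive_total, k) * comb(n - invasive_total, gene_total - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts for one accessory gene: present in 300 of 652 invasive
# isolates and in 250 of 825 carriage isolates.
p_value = fisher_exact_greater(a=300, b=352, c=250, d=575)
print(f"one-sided p-value: {p_value:.3g}")
```

In a real pan-genome association study the test would be repeated for every accessory gene, so the p-values would need multiple-testing correction and, typically, an adjustment for population structure.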
- Item, Open Access: Accumulation of splice variants and transcripts in response to PI3K inhibition in T cells (Public Library of Science, 2013). Riedel, Alice; Mofolo, Boitumelo; Avota, Elita; Schneider-Schaulies, Sibylle; Meintjes, Ayton; Mulder, Nicola; Kneitz, Susanne. BACKGROUND: Measles virus (MV) causes T cell suppression by interference with phosphatidylinositol-3-kinase (PI3K) activation. We previously found that this interference affected the activity of splice regulatory proteins and that a T cell inhibitory protein isoform was produced from an alternatively spliced pre-mRNA. HYPOTHESIS: Differentially regulated and alternatively spliced transcripts accumulating in response to PI3K abrogation in T cells potentially encode proteins involved in T cell silencing. METHODS: To test this hypothesis at the cellular level, we performed a Human Exon 1.0 ST Array analysis on RNAs isolated from T cells that were stimulated only or stimulated after PI3K inhibition. We developed a simple algorithm based on a splicing index to detect genes that undergo alternative splicing (AS) or are differentially regulated (RG) upon T cell suppression. RESULTS: Applying our algorithm to the data, 9% of the genes were assigned as AS, while only 3% were attributed to RG. Though there are overlaps, AS and RG genes differed with regard to functional regulation, and were found to be enriched in different functional groups. AS genes targeted extracellular matrix (ECM)-receptor interaction and focal adhesion pathways, while RG genes were mainly enriched in cytokine-receptor interaction and Jak-STAT. When combined, AS/RG dependent alterations targeted pathways essential for T cell receptor signaling, cytoskeletal dynamics and cell cycle entry. CONCLUSIONS: PI3K abrogation interferes with key T cell activation processes through both differential expression and alternative splicing, which together actively contribute to T cell suppression.
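The abstract above mentions an algorithm based on a splicing index without giving its details. As an illustration only, the commonly used exon-array definition normalises each exon's signal by its gene-level signal and takes the log2 ratio between conditions. The function name, argument names and intensities below are hypothetical and do not reproduce the authors' algorithm.

```python
from math import log2


def splicing_index(exon_treated, gene_treated, exon_control, gene_control):
    """Log2 ratio of gene-normalised exon intensities between two conditions.

    A value near 0 suggests the exon follows the overall gene-level change;
    a large absolute value suggests exon-specific (splicing) regulation.
    """
    normalised_treated = exon_treated / gene_treated
    normalised_control = exon_control / gene_control
    return log2(normalised_treated / normalised_control)


# Hypothetical intensities: gene-level signal is unchanged (400 in both
# conditions) while one exon doubles, which points to alternative splicing
# rather than differential regulation of the whole gene.
si = splicing_index(exon_treated=200.0, gene_treated=400.0,
                    exon_control=100.0, gene_control=400.0)
print(si)
```

A gene whose exons all shift together would instead show a gene-level fold change with splicing indices near zero, which is the intuition behind separating AS from RG genes.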
- Item, Open Access: African Genomic Medicine Portal: A Web Portal for Biomedical Applications (2022-02-11). Othman, Houcemeddine; Zass, Lyndon; da Rocha, Jorge E B; Radouani, Fouzia; Samtal, Chaimae; Benamri, Ichrak; Kumuthini, Judit; Fakim, Yasmina J; Hamdi, Yosr; Mezzi, Nessrine; Boujemaa, Maroua; Okeke, Chiamaka Jessica; Tendwa, Maureen B; Sanak, Kholoud; Chaouch, Melek; Panji, Sumir; Kefi, Rym; Sallam, Reem M; Ghoorah, Anisah W; Romdhane, Lilia; Kiran, Anmol; Meintjes, Ayton P; Maturure, Perceval; Jmel, Haifa; Ksouri, Ayoub; Azzouzi, Maryame; Farahat, Mohammed A; Ahmed, Samah; Sibira, Rania; Turkson, Michael E E; Ssekagiri, Alfred; Parker, Ziyaad; Fadlelmola, Faisal M; Ghedira, Kais; Mulder, Nicola; Kamal Kassim, Samar. Genomics data are currently being produced at unprecedented rates, resulting in increased knowledge discovery and submission to public data repositories. Despite these advances, genomic information on African-ancestry populations remains significantly low compared with European- and Asian-ancestry populations. This information is typically segmented across several different biomedical data repositories, which often lack sufficient fine-grained structure and annotation to account for the diversity of African populations, leading to many challenges related to the retrieval, representation and findability of such information. To overcome these challenges, we developed the African Genomic Medicine Portal (AGMP), a database that contains metadata on genomic medicine studies conducted on African-ancestry populations. The metadata is curated from two public databases related to genomic medicine, PharmGKB and DisGeNET. The metadata retrieved from these source databases were limited to genomic variants that were associated with disease aetiology or treatment in the context of African-ancestry populations. Over 2000 variants relevant to populations of African ancestry were retrieved.
Subsequently, domain experts curated and annotated additional information associated with the studies that reported the variants, including geographical origin, ethnolinguistic group, level of association significance and other relevant study information, such as study design and sample size, where available. The AGMP functions as a dedicated resource through which to access African-specific information on genomics as applied to health research, through querying variants, genes, diseases and drugs. The portal and its corresponding technical documentation, implementation code and content are publicly available.
- Item, Open Access: An African Genome Variation Database and its applications in human diversity and health (2021). Todt, Davis; Mulder, Nicola. African genomes exhibit the highest levels of sequence and haplotype diversity of all extant human populations. A combination of historical and geographical factors has contributed to the high level of genetic diversity in ancestral populations in Africa. Additionally, a series of concomitant migration events out of Africa, with founder populations harbouring only a subset of this genetic variation, has contributed to the relatively lower genetic diversity observed in non-Africans. Population genetic studies have refined our understanding of human evolutionary history, and clinical genomic studies have resulted in improved patient outcomes. However, despite the increased throughput and decreased cost afforded by next-generation sequencing (NGS) and despite the relatively higher genetic variation in Africans, relatively little of the genomic data currently available is representative of diverse African populations. This may result in adverse outcomes for minority populations with little representation in clinical databases. Given the under-representation of African genetic variation and the importance of highlighting and further characterizing it, the objectives of this project were to design, develop and deploy a proof-of-concept database and web application for the storage, analysis and visualization of African genetic variant data: the African Genome Variation Database (AGVD). The AGVD was developed according to software industry design standards. The project also explored available genomic tools and databases in order to leverage existing software solutions where suitable. Additionally, relevant data sets were identified for use during testing and validation of the pilot phase of the project.
To this end, the open access 1000 Genomes Project phase 3 dataset was selected and the genotypes for several chromosomes were loaded into the AGVD. The AGVD leverages the scalable, performant, and open source genomics engine OpenCGA for data storage and analysis. A custom front-end web application was developed by applying a novel approach to render and serve static Vue JS assets from the Python Flask microframework. The web application supports rich data search and filtering operations of loaded variants and allows end-users to visualize annotations of genomic loci and allele change, variant type, associated gene and transcript consequences, clinical significance, and allele frequency information for all annotated cohorts in a highly interactive manner. A bespoke REST API also supports future analytical functionality. The AGVD has demonstrated proof of concept in the secure and scalable storage and visualization of African genomic data, providing a viable solution for H3ABioNet to further extend in future iterations of the project and a valuable resource for researchers to explore African genetic variation.
- Item, Open Access: Analysis of within-host evolution of Plasmodium Falciparum during treatment (2018). Okendo, Javan Ochieng; Mulder, Nicola; Andagalu, Ben. Antimalarial drugs impose strong selective pressure on Plasmodium falciparum parasite genomes and leave signatures of selection. Understanding the evolutionary basis of drug-resistant malaria in endemic and epidemic settings remains a scientific priority with significant implications for treatment outcomes. To understand the evolutionary changes in P. falciparum during treatment with ACTs, we used various approaches to test neutral models of evolution using P. falciparum genomic data collected from Kombewa and Maseno in Kisumu, Kenya between 2013 and 2015. The non-synonymous to synonymous substitution (dN/dS) ratio was used to predict the effect of selection on protein-coding loci of the Pfk13 gene. A logistic regression model was used to test the association between IC50s and the SNPs. mCSM and SDM were used to detect the effects of mutations on the Pfk13 gene, while the PRIMO web server was used to locate the SNPs on the Kelch13 propeller domain. Modeller V9.1 was used to predict the structure of the Kelch 13 propeller domain and the Posview webserver was used to predict ACT/kelch 13 interactions. Population differentiation was assessed using Microsatellite analyzer to calculate FST, and customized R scripts with the relevant population genetics packages were used in the analysis. For samples collected in 2013, Tajima’s D genomic summary statistic was 4.53194, Fu and Li’s D* 2.13380, and Fu and Li’s F* 3.62142. However, in 2015 Tajima’s D was -2.42910, Fu and Li’s D* -5.2712, and Fu and Li’s F* -5.0045. The dN/dS in 2013 was 1.0299, while in 2015 dN/dS was 2.6884. Kenyan P. falciparum SNPs occur in the intra- or inter-blade regions of the PfK13 propeller domain. The FST analysis showed minimal population differentiation of the parasites during treatment.
There was no significant association between SNPs and IC50 values but SNPs at codon D547E showed association with Artesunate and D559E with AQ and MQ IC50 respectively. Even though there is an exponential increase in the number of non-synonymous point mutations in the Pfk13 gene, the Kenyan P. falciparum strains remain sensitive to ACT drugs. Further research needs to be done by deep sequencing this location of chromosome 13 as it will provide more power for finding novel SNPs for further validation.
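The abstract above reports Tajima's D shifting from positive (2013) to negative (2015). For readers unfamiliar with the statistic, here is a minimal sketch of the standard Tajima (1989) calculation on a small alignment. It is not the thesis's pipeline; the function name `tajimas_d` and the toy sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")
    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignments (hypothetical): only rare variants versus two balanced haplotypes.
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare_variants), tajimas_d(balanced))
```

An excess of rare variants pushes D below zero, which is commonly read as a sign of a recent sweep or population expansion, while an excess of intermediate-frequency variants pushes it above zero.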
- Item, Open Access: Applying, Evaluating and Refining Bioinformatics Core Competencies (An Update from the Curriculum Task Force of ISCB's Education Committee) (Public Library of Science, 2016). Welch, Lonnie; Brooksbank, Cath; Schwartz, Russell; Morgan, Sarah L; Gaeta, Bruno; Kilpatrick, Alastair M; Mietchen, Daniel; Moore, Benjamin L; Mulder, Nicola; Pauley, Mark; Pearson, William; Radivojac, Predrag; Rosenberg, Naomi; Rosenwald, Anne; Rustici, Gabriella; Warnow, Tandy
- Item, Open Access: A bioinformatic study on the feasibility of a cross-species proteomics analyses of mycobacteria (2013). Rajaonarifara, Elinambinina; Blackburn, Jonathan; Mulder, Nicola. Includes abstract. Includes bibliographical references.
- Item, Open Access: Characterisation of the metabolome of Mycobacterium tuberculosis to identify new pathways and pathway holes (2014). Wolfenden, Kristen Marie; Mulder, Nicola. Due to high incidence rates and the development of new drug-resistant or multidrug-resistant strains of TB, the development of new medicines and treatments for tuberculosis is a necessity. In order to develop these drugs, Mycobacterium tuberculosis (Mtb) needs to be studied more completely; this study performs a characterisation of the metabolome of Mtb and comparison across the phylogenetic profile to identify notable pathways.
- Item, Open Access: Creating and analysing an African pan-genome (2022). Bourn, Jessica Jean; Mulder, Nicola. The human reference genome is currently a core resource for understanding the role of genetics in human health, disease, and variation, and has been invaluable in the development of clinical and computational tools for these purposes. However, the limited number of individual genomes used to create the reference has resulted in an underrepresentation of the extensive genetic diversity present in different human populations. Since an important use of the reference genome is to identify genetic variants that may be implicated in disease, this lack of diversity could limit the scientific utility of the reference for ethnic groups that are poorly represented in it. As a result, adaptations to the reference genome structure have been proposed. One such proposal has been the use of multiple reference genomes, each of which represents a different human population. A logical and highly practical method of achieving this is through the use of a pan-genome, which is a curated collection of all the DNA sequences that are found within a population under study. Despite the fact that African populations exhibit the greatest genetic diversity and variation in the world, the many and sometimes ancient ethnolinguistic groups from Africa are among those least represented within the reference genome. Consequently, this study aimed to explore the feasibility of creating and analysing an African pan-genome, and to begin developing tools to achieve this. Several distinct African regional ancestral groups (namely east African Nilo-Saharan, east African Afro-Asiatic, far west Niger-Congo, central west Niger-Congo, Bantu-speaking Niger-Congo, central African rainforest hunter-gatherer, and the Khoe and San) have previously been identified, and this study included and analysed samples from each group in order to assemble a more inclusive and representative pan-genome.
A software pipeline developed by Duan et al. (2019), termed the HUman Pan-genome ANalysis (HUPAN) pipeline, was used here to assemble the African pan-genome. As the HUPAN pipeline was originally designed to analyse only single populations, the inclusion of multiple populations required modifications and improvements, which were implemented following the testing and analysis of the pipeline using a smaller dataset of whole genome sequences. Subsequently, a final dataset of 168 African high- and medium-coverage whole genome sequences representing the seven separate regional ancestral groups was submitted to the adapted HUPAN pipeline. For each group, nucleotide sequences that were absent from the human reference genome were assembled and extracted, which resulted in the identification of 43.37 Mbp of non-redundant non-reference genomic sequence and 31 novel predicted protein-coding genes from African individuals. Alignment to other pan-genome sequences, whole genomes from different human populations, and the complete telomere-to-telomere human genome validated a large portion of the sequences as nonreference and confirmed that the dataset contained sequences specific to African populations. However, the gene presence-absence variation analysis of the pan-genome within all 168 samples revealed patterns of gene presence and absence that were strongly correlated to the sample dataset of origin, rather than to the ancestral group of origin. This hindered the identification of genuine genetic variation specific to the groups analysed. Further, it appears that previous pan-genomic research has not investigated the degree to which the genetic variation identified is dataset-specific or truly population-specific. Consequently, the failure to acknowledge and account for the effects of spurious inter-dataset variation in previous pan-genomic research indicates that those analyses may be incomplete or ambiguous. 
This, therefore, calls into question the methods currently used for pangenomic research, and highlights that robust, standardised methods for human pan-genome research must be agreed on to ensure that comprehensive population-specific pan-genomes are produced in the future. Despite this inherent weakness of pan-genomic research, this study successfully enabled the creation and analysis of a comprehensive and inclusive African pan-genome. Unique sets of non-reference sequences specific to African regional ancestral groups were identified and obtained, enabling the assembly of a non-redundant set of pan-African non-reference sequences. Furthermore, certain complex but previously unconsidered aspects of pan-genome research were identified and explored, and these observations may play a role in the advancement of pan-genome research in future.
- Item, Open Access: DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures (BioMed Central Ltd, 2013). Mazandu, Gaston; Mulder, Nicola. BACKGROUND: The use of Gene Ontology (GO) data in protein analyses has contributed substantially to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. RESULTS: We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. CONCLUSIONS: The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches, to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
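To make "term information content" concrete: in the annotation-based family of measures, the IC of a term is the negative log of the fraction of annotations that fall on the term or its descendants, and Resnik similarity between two terms is the IC of their most informative common ancestor. The sketch below illustrates that single measure on a toy ontology. It is not DaGO-Fun's code, and the term names and annotation counts are hypothetical.

```python
from math import log


def information_content(annotation_counts, root):
    """Annotation-based IC: -ln(count(term) / count(root)).

    Counts are assumed to already include annotations to descendant terms.
    """
    total = annotation_counts[root]
    return {term: -log(count / total) for term, count in annotation_counts.items()}


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik_similarity(term_a, term_b, parents, ic):
    """IC of the most informative common ancestor of two terms."""
    common = ancestors(term_a, parents) & ancestors(term_b, parents)
    return max(ic[term] for term in common)


# Toy ontology (hypothetical term names and counts).
parents = {"binding": ["root"], "catalysis": ["root"],
           "dna_binding": ["binding"], "rna_binding": ["binding"]}
counts = {"root": 100, "binding": 20, "catalysis": 80,
          "dna_binding": 5, "rna_binding": 10}
ic = information_content(counts, "root")
print(resnik_similarity("dna_binding", "rna_binding", parents, ic))
```

Precomputing the `ic` dictionary once, as the abstract describes for DaGO-Fun, means each similarity query reduces to a set intersection and a lookup.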
- Item, Open Access: DAS Writeback: A Collaborative Annotation System (BioMed Central Ltd, 2011). Salazar, Gustavo; Jimenez, Rafael; Garcia, Alexander; Hermjakob, Henning; Mulder, Nicola; Blake, Edwin. BACKGROUND: Centralised resources such as GenBank and UniProt are perfect examples of the major international efforts that have been made to integrate and share biological information. However, additional data that adds value to these resources needs a simple and rapid route to public access. The Distributed Annotation System (DAS) provides an adequate environment to integrate genomic and proteomic information from multiple sources, making this information accessible to the community. DAS offers a way to distribute and access information, but it does not provide domain experts with the mechanisms to participate in the curation process of the available biological entities and their annotations. RESULTS: We designed and developed a Collaborative Annotation System for proteins called DAS Writeback. DAS Writeback is a protocol extension of DAS that provides the functionalities of adding, editing and deleting annotations. We implemented this new specification as extensions of both a DAS server and a DAS client. The architecture was designed with the involvement of the DAS community and was improved after performing usability experiments emulating a real annotation task. CONCLUSIONS: We demonstrate that DAS Writeback is effective and usable, and that it will provide the appropriate environment for the creation and evolution of community protein annotation.
- Item, Open Access: Data integration for the analysis of uncharacterized proteins in Mycobacterium tuberculosis (2010). Mazandu, Gaston Kuzamunu; Mulder, Nicola. Mycobacterium tuberculosis is a bacterial pathogen that causes tuberculosis, a leading cause of human death worldwide from infectious diseases, especially in Africa. Despite enormous advances achieved in recent years in controlling the disease, tuberculosis remains a public health challenge. The contribution of existing drugs is of immense value, but the deadly synergy of the disease with Human Immunodeficiency Virus (HIV) or Acquired Immunodeficiency Syndrome (AIDS) and the emergence of drug resistant strains are threatening to compromise gains in tuberculosis control. In fact, the development of active tuberculosis is the outcome of the delicate balance between bacterial virulence and host resistance, which constitute two distinct and independent components. Significant progress has been made in understanding the evolution of the bacterial pathogen and its interaction with the host. The end point of these efforts is the identification of virulence factors and drug targets within the bacterium in order to develop new drugs and vaccines for the eradication of the disease.
- Item, Open Access: Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics (BioMed Central, 2018-11-29). Baichoo, Shakuntala; Souilmi, Yassine; Panji, Sumir; Botha, Gerrit; Meintjes, Ayton; Hazelhurst, Scott; Bendou, Hocine; Beste, Eugene d; Mpangase, Phelelani T; Souiai, Oussema; Alghali, Mustafa; Yi, Long; O’Connor, Brian D; Crusoe, Michael; Armstrong, Don; Aron, Shaun; Joubert, Fourie; Ahmed, Azza E; Mbiyavanga, Mamana; Heusden, Peter v; Magosi, Lerato E; Zermeno, Jennie; Mainzer, Liudmila S; Fadlelmola, Faisal M; Jongeneel, C. V; Mulder, Nicola. Background: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging. Results: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation.
A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community. Conclusion: The H3ABioNet workflows have been implemented with a view to offering ease of use for the end user and high levels of reproducibility and portability, all while following modern state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.
- Item, Open Access: Development of computational methods for custom protein arrays analysis : a case study on a 100-protein ("CT100") cancer/testis antigen array (2010). Safari Serufuri, Jean-Michel; Blackburn, Jonathan; Mulder, Nicola; Kumuthini, Judit. Custom antigen arrays offer a platform to assay the serological response of cancer patients to a set of selected cancer/testis antigens in order to infer a diagnostic value or to assess patient responses to particular treatments. However, the acquisition of the array data is subject to bias and noise. Therefore, array data processing and analysis are required to remove bias, reduce noise and learn from the data. This study aims to address the issues of normalization and sample qualitative clustering for custom protein arrays.
- Item, Open Access: Disruption of maternal gut microbiota during gestation alters offspring microbiota and immunity (BioMed Central, 2018-07-07). Nyangahu, Donald D; Lennard, Katie S; Brown, Bryan P; Darby, Matthew G; Wendoh, Jerome M; Havyarimana, Enock; Smith, Peter; Butcher, James; Stintzi, Alain; Mulder, Nicola; Horsnell, William; Jaspan, Heather B. Background: Early life microbiota is an important determinant of immune and metabolic development and may have lasting consequences. The maternal gut microbiota during pregnancy or breastfeeding is important for defining infant gut microbiota. We hypothesized that maternal gut microbiota during pregnancy and breastfeeding is a critical determinant of infant immunity. To test this, pregnant BALB/c dams were fed vancomycin for 5 days prior to delivery (gestation; Mg), 14 days postpartum during nursing (Mn), or during gestation and nursing (Mgn), or no vancomycin (Mc). We analyzed adaptive immunity and gut microbiota in dams and pups at various times after delivery. Results: In addition to direct alterations to maternal gut microbial composition, pup gut microbiota displayed lower α-diversity and distinct community clusters according to timing of maternal vancomycin. Vancomycin was undetectable in maternal and offspring sera; therefore, the observed changes in the microbiota of stomach contents (as a proxy for breastmilk) and pup gut signify an indirect mechanism through which maternal intestinal microbiota influences extra-intestinal and neonatal commensal colonization. These effects on microbiota influenced both maternal and offspring immunity. Maternal immunity was altered, as demonstrated by significantly higher levels of both total IgG and IgM in Mgn and Mn breastmilk when compared to Mc. In pups, lymphocyte numbers in the spleens of Pg and Pn were significantly increased compared to Pc.
This increase in cellularity was in part attributable to elevated numbers of both CD4+ T cells and B cells, most notably follicular B cells. Conclusion: Our results indicate that perturbations to maternal gut microbiota dictate neonatal adaptive immunity.
- ItemOpen AccessGenetic characteristics of Plasmodium vivax from Northern Mali(2018) Djimde, Moussa; Mulder, Nicola; Djimde, Abdoulaye; Dara, AntoineIntroduction: The surprising presence of P. vivax in West Africa and their ability to infect a Duffy negative population is one more threat to public health. In order to contribute to malaria elimination efforts, there is a need to investigate the origin and characteristics of P. vivax population isolates in Northern Mali. Next Generation Sequence Analysis (NGSA) can help us understand parasite genetic characteristics although low parasite density is a challenge for whole genome sequencing (WGS). In the present work, we investigated if selective whole genome amplification (sWGA) can enrich P. vivax DNA extracted from Rapid Diagnostic Tests (RDTs) for Whole Genome Sequencing. We also investigated the origin and the susceptibility to antimalarial drugs of the strains isolated in Northern Mali. Methods: Parasite DNA was extracted from 267 RDTs using the QIAamp DNA mini kit, then nested PCR and 7 samples were positive for P. vivax. After sWGA, the whole genomes were sequenced using the Illumina platform. Next Generation Sequences Analysis was done followed by population differentiation analyses. Twenty-two additional P. vivax whole genomes from other parts of the World were downloaded from the European Nucleotide Archive for further Neighbour Joining analysis. Results: The sequences extracted from RDTs showed high contamination with human DNA (80%). From the parasite DNA, in total 69529 SNPs were found in the seven P. vivax strains of Northern Mali. The most significant p-values per SNP were carried by the chromosomes 2, 3, 4, 5, 12, 13 and 14. With regard to variant effects, the Transition/Transversion ratio was 1.1. The density of variants with a high effect was 1.62%. There was no mutation associated with antimalarial drugs resistance on pvcrt-o or pvmdr-1 genes. 
Pairwise differentiation suggests a high degree of relatedness between P. vivax strains isolated in Northern Mali. The Neighbour Joining analysis clearly shows that strains from Mali cluster together and are genetically distinct from those from Mauritania, which shares a border with Mali. The strains isolated in Northern Mali are genetically closer to those from Madagascar, India and Latin America. Conclusion: We did not identify mutations associated with resistance to antimalarial drugs in the pvcrt-o and pvmdr-1 genes. This study confirms that P. vivax strains genetically distinct from those of Mauritania are circulating in Mali. Finally, we conclude that sWGA is a feasible approach for P. vivax DNA enrichment for WGS despite the high proportion of human contamination.
- Item Open Access Genetic dating and pattern of admixture in modern human evolution (2017) Defo, Joel; Mulder, Nicola; Rugamika, Emile Chimusa Genetic variation is shaped by admixture between populations in an evolutionary process. The mixture dynamic between groups of populations results in a mosaic of chromosomal segments inherited from multiple ancestral populations. The distribution of ancestral chromosomal segments and the recombination breakpoints in an admixed genome provide information about the time of admixture. Studying populations with particular ancestries has become a major interest in population genetics because of the medical and evolutionary impacts of the patterns of single nucleotide polymorphisms. It provides a better understanding of the impact of population migrations and helps us uncover interactions between several populations. Most of the research on dating admixed populations has focused on a single interaction between two populations using various approaches. Some have extended this to the mixing of three populations, based on assumptions and approaches which differ from one tool to another. However, the inference of distinct ancestral proportions along the genome of an admixed individual, together with plausible dates of admixture, still remains a challenge in the case of multi-way admixed populations. This dissertation consists of three research initiatives. First, we provide a succinct review and comparison of current methods for dating admixture events. Second, we assess various admixture dating tools which estimate the time of admixture between two parental populations. We do so by performing various simulations assuming a particular number of generations and using these to evaluate the tools. Third, we apply the top three assessed methods to some admixed populations from the 1000 Genomes Project. 
Although MALDER shows improvement over other current methods and produces reasonable date estimates, the results from both simulation and real data suggest that dating ancient admixture events while accounting for the effect of other admixtures remains a challenge. Our results suggest the need for developing a new approach to date ancient and complex admixture events in multi-way admixed populations.
- Item Open Access GenGraph: a python module for the simple generation and manipulation of genome graphs (2019-10-25) Ambler, Jon M; Mulaudzi, Shandukani; Mulder, Nicola Abstract Background: As sequencing technology improves, the concept of a single reference genome is becoming increasingly restrictive. In the case of Mycobacterium tuberculosis, one must often choose between using a genome that is closely related to the isolate and one that is annotated in detail. One promising solution to this problem is the graph-based representation of collections of genomes as a single genome graph. Though there are currently a handful of tools that can create genome graphs and have demonstrated the advantages of this new paradigm, there still exists a need for flexible tools that can be used by researchers to overcome challenges in genomics studies. Results: We present GenGraph, a Python toolkit and accompanying modules that use existing multiple sequence alignment tools to create genome graphs. Python is one of the most popular coding languages for the biological sciences, and by providing these tools, GenGraph makes it easier to experiment and develop new tools that utilise genome graphs. The conceptual model used is highly intuitive, and as much as possible the graph structure represents the biological relationship between the genomes. This design means that users will quickly be able to start creating genome graphs and using them in their own projects. We outline the methods used in the generation of the graphs, and give some examples of how the created graphs may be used. GenGraph utilises existing file formats and methods in the generation of these graphs, allowing graphs to be visualised and imported with widely used applications, including Cytoscape, R, and JavaScript. 
Conclusions: GenGraph provides a set of tools for generating graph-based representations of sets of sequences with a simple conceptual model, written in the widely used coding language Python, and publicly available on GitHub.
- Item Open Access A Genomic Portrait of Haplotype Diversity and Signatures of Selection in Indigenous Southern African Populations (Public Library of Science, 2015) Chimusa, Emile R; Meintjies, Ayton; Tchanga, Milaine; Mulder, Nicola; Seoighe, Cathal; Soodyall, Himla; Ramesar, Rajkumar We report a study of genome-wide, dense SNP (∼900K) and copy number polymorphism data of indigenous southern Africans. We demonstrate the genetic contribution to southern and eastern African populations, which involved admixture between indigenous San, Niger-Congo-speaking populations and populations of Eurasian ancestry. This finding illustrates the need to account for stratification in genome-wide association studies, and that admixture mapping would likely be a successful approach in these populations. We developed a strategy to detect the signature of selection prior to and following putative admixture events. Several genomic regions show an unusual excess of Niger-Kordofanian ancestry and an unusual deficiency of both San and Eurasian ancestry, which were considered the footprints of selection after population admixture. Several SNPs with strong allele frequency differences were observed predominantly between the admixed indigenous southern African populations and their ancestral Eurasian populations. Interestingly, many candidate genes, which were identified within the genomic regions showing signals for selection, were associated with southern African-specific high-risk, mostly communicable diseases, such as malaria, influenza, tuberculosis, and human immunodeficiency virus/AIDS. This observation suggests a potentially important role that these genes might have played in adapting to the environment. Additionally, our analyses of haplotype structure, linkage disequilibrium, recombination, copy number variation and genome-wide admixture highlight and support the unique position of San relative to both African and non-African populations. 
This study contributes to a better understanding of population ancestry and selection in south-eastern African populations; and the data and results obtained will support research into the genetic contributions to infectious as well as non-communicable diseases in the region.
- Item Open Access Identification of the virulence gene of Mycobacterium tuberculosis (2007) Rabiu, Halimah Adenike; Mulder, Nicola The major thrust of this project is to identify and characterize potential virulence genes from M. tuberculosis. To this end, we have compiled and integrated information from various public databases to catalogue 246573 microbial genes from 84 organisms, including pathogens and non-pathogenic microbes. We determined the phylogenetic distributions by grouping the proteins into families based on sequence similarity with the aid of BLASTP and the NCBI BLASTClust program.