Browsing by Author "Martin, Darrin"
- Item (Open Access): Analysis of the impact on phylogenetic inference of non-reversible nucleotide substitution models (2023). Sianga, Rita; Martin, Darrin. Most phylogenetic trees are inferred using time-reversible evolutionary models that assume that the relative rates of substitution for any given pair of nucleotides are the same regardless of the direction of the substitutions. However, there is no reason to assume that the underlying biochemical mutational processes that cause substitutions are similarly symmetrical. Here, we evaluate how incorporating non-reversibility into models of nucleotide substitution affects phylogenetic inference in empirical viral and simulated data. I consider two non-reversible nucleotide substitution models: (1) a 6-rate non-reversible model (NREV6) that is applicable to analyzing mutational processes in double-stranded genomes, in that complementary substitutions occur at identical rates; and (2) a 12-rate non-reversible model (NREV12) that is applicable to analyzing mutational processes in single-stranded (ss) genomes, in that all substitution types are free to occur at different rates. Using likelihood ratio and Akaike Information Criterion-based model tests, we show that, surprisingly, NREV12 provided a significantly better fit than the General Time Reversible (GTR) and NREV6 models to 21/31 dsRNA and 20/30 dsDNA datasets. As expected, however, NREV12 provided a significantly better fit to 24/33 ssDNA and 40/47 ssRNA datasets. I tested how non-reversibility impacts the accuracy with which phylogenetic trees are inferred. As simulated degrees of non-reversibility (DNR) increased, the tree topology inferences using both NREV12 and GTR became more accurate, whereas inferred tree branch lengths became less accurate. I conclude that while non-reversible models should be helpful in the analysis of mutational processes in most virus species, there is no pressing need to use these models for routine phylogenetic inference. 
Finally, I introduce a web application, RpNRM, that roots phylogenetic trees using a non-reversible nucleotide substitution model. The tree is rooted on each branch in turn, the likelihood of each rooting is computed, and the rooting with the highest likelihood is identified as the most plausible. The rooting accuracy of RpNRM was compared to that of the outgroup rooting method, the midpoint rooting method and another non-reversible model-based rooting method implemented in the program IQTREE. I find that although the RpNRM and IQTREE non-reversible model-based methods are not as accurate on their own as outgroup or midpoint rooting methods, they nevertheless provide an independent means of verifying the root locations that are inferred by these other methods.
- Item (Open Access): Simulating recombinant sequence data to evaluate and improve computational methods of multiple sequence alignment and recombinant identification (2023). Swanepoel, Phillip; Martin, Darrin. Motivation. Recombination is a central evolutionary process that substantially changes the structure of genomes and shapes their evolutionary trajectory. Recombination detection is thus an important computational step in understanding the evolutionary history of nucleotide sequences, and the accurate identification of recombinant sequences is particularly important in the context of downstream phylogenetics-based sequence analyses. Evaluating recombination detection methods requires the simulation of sequence data, and the training of statistical learning models requires large, realistic datasets. The goal of this study was thus to (1) simulate large, realistic sequence datasets that have evolved in the presence of frequent recombination, and (2) use these datasets to improve one of the computational steps used in the analysis of recombination by the computer program Recombination Detection Program 5 (RDP5), specifically the identification of the recombinant from a recombinant/parent/parent triplet. Results. To improve the accuracy with which RDP5 identifies recombinant sequences, we simulated the evolution of recombining sequences to produce large datasets that could then be used to train a number of machine learning models to accurately differentiate recombinants from their parental sequences. The artificial intelligence systems created using these models showed a substantial improvement in recombinant identification accuracy over the method currently implemented in RDP5, with an increase in accuracy of up to 26 percentage points. Availability and implementation. Our simulation software is a forked version of SANTA-SIM developed in Java. All source code is released and is available at: https://github.com/phillipswanepoel/santa-sim/tree/Recomb_and_align.
- Item (Open Access): Testing for host adaptive evolution using the maize streak virus model (2022). Oyeniran, Kehinde Adewole; Martin, Darrin. Maize streak virus (MSV; Genus: Mastrevirus; Family: Geminiviridae) causes maize streak disease (MSD), a major biotic threat to maize farming, especially in sub-Saharan Africa and its neighbouring Indian and Atlantic Ocean islands, where its insect vectors in the genus Cicadulina thrive. Of the eleven known MSV strains (called A through K), only MSV-A is economically significant as it is the only one that causes severe disease in maize. MSV is a single-stranded DNA (ssDNA) virus which, like RNA viruses, has high mutation and recombination rates. Given that these processes can sometimes promote viral diversity and result in the rapid evolution of new, fitter MSV variants, continuous genomic surveillance of MSV is important. Based on analyses of full genome sequences, MSV-A has been classified into five subtypes (-A1, -A2, -A3, -A4, and -A6) and more than 20 recombinant lineages. Here, I showed using laboratory-based experiments that maize-infecting mastreviruses such as Maize streak Reunion virus (MSRV) and MSV-C, which have been found in maize plants displaying severe streak symptoms, do not in fact cause severe streak symptoms when used to infect maize on their own. Although a mixed infection involving MSRV and MSV-B resulted in slight changes in symptom phenotypes, it is unlikely that MSRV and MSV-C are responsible for emerging maize diseases. I carried out model-based phylogenetic and phylogeographic analyses of MSV-A movement dynamics in and out of Madagascar, Ethiopia and Rwanda using newly determined MSV-A genome sequences (Madagascar: n = 56; Ethiopia: n = 84) together with other sequences from GenBank. I showed that most movements of MSV-A into Madagascar have been from East Africa between the early 1990s and 2000s. 
My inferences show that MSV-A1 variants currently found in Ethiopia likely arrived there from Uganda or Kenya between 1985 and 1988. Similarly, the MSV-A1 variants found in Rwanda likely moved there from Ethiopia, Kenya or Uganda between 2007 and 2011. The time periods over which inferred movements of MSV-A1 into Madagascar, Rwanda and Ethiopia occurred all correspond with the period during which trade between these and other East African nations was being liberalized. Although these temporally-scaled phylogeographic analyses indicated that human activities are likely responsible for some of the long-range movements of MSV-A1 variants (such as movements from East Africa to Madagascar), leafhopper-mediated dissemination also likely played a major role in long- and short-distance movements of these variants both within Madagascar and between East African countries. Over the 90 years of evolution that yielded MSV-A-ZW-MatA_1994 in the MSV-A1 lineage, symptoms have either varied in less concerted ways or remained largely unchanged. Major harms (intensity of chlorosis, leaf deformation and stunting) have decreased, while the number of colonized cells (chlorotic areas), which determines onward transmission, has increased. These data suggest that MSV-A has evolved to optimize the number of cells it infects for effective onward transmission, while reducing excessive harm to its hosts. 
Altogether, these results suggest that (1) synergism potentially plays a role in some instances of severe streak disease; (2) the movement of MSV-A1 within the East African region and Madagascar emphasizes the importance of this MSV-A subtype as a major ongoing threat to maize production within these regions; and (3) over the last 90 years, the MSV-A1 subtype has evolved to produce greater chlorotic areas on the leaves of infected maize plants while at the same time either not increasing or reducing the degrees of chloroplast destruction, stunting and deformation caused by infections. These characteristics may have enhanced the transmissibility of this variant and therefore played an important role in the present rise to dominance of this subtype throughout East Africa and Madagascar.
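
The first abstract in this listing compares the GTR, NREV6 and NREV12 substitution models using likelihood ratio and Akaike Information Criterion (AIC) tests. The Python sketch below illustrates how such a comparison can be computed in general; it is not the procedure or software used in the thesis. The function names are hypothetical, and the log-likelihoods and free-parameter counts in the usage example are invented for illustration (the appropriate counts depend on how base frequencies and branch lengths are treated). The likelihood ratio test applies only to nested models, such as GTR within NREV12.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: complementary error function plus a finite sum.
    terms = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


def likelihood_ratio_test(lnl_null: float, k_null: int,
                          lnl_alt: float, k_alt: int) -> tuple[float, float]:
    """Likelihood ratio test for nested models; returns (statistic, p_value)."""
    df = k_alt - k_null  # extra free parameters in the richer model
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical values for illustration only, NOT results from the thesis.
    lnl_gtr, k_gtr = -12345.6, 8          # assumed parameter count
    lnl_nrev12, k_nrev12 = -12330.1, 11   # assumed parameter count
    stat, p = likelihood_ratio_test(lnl_gtr, k_gtr, lnl_nrev12, k_nrev12)
    delta_aic = aic(lnl_gtr, k_gtr) - aic(lnl_nrev12, k_nrev12)
    print(f"LRT statistic = {stat:.2f}, p = {p:.3g}, delta AIC = {delta_aic:.2f}")
```

A positive AIC difference in this sketch favours the richer model, and a small p-value indicates that its additional parameters significantly improve the fit. Non-nested pairs such as GTR and NREV6 would be compared by AIC alone.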