Browsing by Subject "Reproducibility"
- Item (Open Access): Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics (BioMed Central, 2018-11-29). Baichoo, Shakuntala; Souilmi, Yassine; Panji, Sumir; Botha, Gerrit; Meintjes, Ayton; Hazelhurst, Scott; Bendou, Hocine; Beste, Eugene d; Mpangase, Phelelani T; Souiai, Oussema; Alghali, Mustafa; Yi, Long; O’Connor, Brian D; Crusoe, Michael; Armstrong, Don; Aron, Shaun; Joubert, Fourie; Ahmed, Azza E; Mbiyavanga, Mamana; Heusden, Peter v; Magosi, Lerato E; Zermeno, Jennie; Mainzer, Liudmila S; Fadlelmola, Faisal M; Jongeneel, C. V; Mulder, Nicola. Abstract. Background: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Health and Heredity in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally-intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging. Results: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. 
A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community. Conclusion: The H3ABioNet workflows have been implemented with a view to offering ease of use for the end user and high levels of reproducibility and portability, all while following modern, state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.
- Item (Open Access): Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens (2020-05-12). Claassen-Weitz, Shantelle; Gardner-Lubbe, Sugnet; Mwaikono, Kilaza S; du Toit, Elloise; Zar, Heather J; Nicol, Mark P. Careful consideration of experimental artefacts is required in order to successfully apply high-throughput 16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental design, quality control and “denoising” approaches for sequencing low biomass specimens. Results: We found that bacterial biomass is a key driver of 16S rRNA gene sequencing profiles generated from bacterial mock communities and that the use of different deoxyribonucleic acid (DNA) extraction methods [DSP Virus/Pathogen Mini Kit® (Kit-QS) and ZymoBIOMICS DNA Miniprep Kit (Kit-ZB)] and storage buffers [PrimeStore® Molecular Transport medium (Primestore) and Skim-milk, Tryptone, Glucose and Glycerol (STGG)] further influence these profiles. Kit-QS better represented hard-to-lyse bacteria from bacterial mock communities compared to Kit-ZB. Primestore storage buffer yielded lower levels of background operational taxonomic units (OTUs) from low biomass bacterial mock community controls compared to STGG. In addition to bacterial mock community controls, we used technical repeats (nasopharyngeal and induced sputum processed in duplicate, triplicate or quadruplicate) to further evaluate the effect of specimen biomass and participant age at specimen collection on resultant sequencing profiles. We observed a positive correlation (r = 0.16) between specimen biomass and participant age at specimen collection: low biomass technical repeats (represented by < 500 16S rRNA gene copies/μl) were primarily collected at < 14 days of age. We found that low biomass technical repeats also produced higher alpha diversities (r = −0.28); 16S rRNA gene profiles similar to no template controls (Primestore); and reduced sequencing reproducibility. 
Finally, we show that the use of statistical tools for in silico contaminant identification, as implemented through the decontam package in R, provides better representations of indigenous bacteria following decontamination. Conclusions: We provide insight into experimental design, quality control steps and “denoising” approaches for 16S rRNA gene high-throughput sequencing of low biomass specimens. We highlight the need for careful assessment of DNA extraction methods and storage buffers; sequence quality and reproducibility; and in silico identification of contaminant profiles in order to avoid spurious results.
- Item (Open Access): Within-subject variability of interferon-g assay results for tuberculosis and boosting effect of tuberculin skin testing: a systematic review (Public Library of Science, 2009). Van Zyl-Smit, Richard N; Zwerling, Alice; Dheda, Keertan; Pai, Madhukar. BACKGROUND: Variability in interferon-gamma release assay (IGRA) results for tuberculosis has implications for interpretation of results close to the cut-point, and for defining thresholds for test conversion and reversion. However, little is known about the within-subject variability (reproducibility) of IGRAs. Several national guidelines recommend a two-step testing procedure (tuberculin skin test [TST] followed by IGRA) for the diagnosis of latent tuberculosis infection (LTBI). However, the effect of a preceding TST on subsequent IGRA results has been reported in studies with apparently conflicting results. METHODOLOGY/FINDINGS: We conducted a systematic review to synthesize evidence on within-subject variability of IGRA results and the potential boosting effect of TST. We searched several databases and reviewed citations of previous reviews on IGRAs. We included studies using commercial IGRAs, in addition to non-commercial versions of the ELISPOT assay. Four studies, fulfilling our predefined criteria, examined within-subject variability and 13 studies evaluated TST effects on subsequent IGRA responses. Meta-analysis was not considered appropriate because of heterogeneity in study methods, assays, and populations. Although the data are limited, within-subject variability was present in all studies, but the magnitude varied (16-80%) across studies. A TST-induced "boosting" of IGRA responses was demonstrated in several studies and, although more pronounced in IGRA-positive (i.e. sensitized) individuals, also occurred in a smaller but not insignificant proportion of IGRA-negative subjects. The TST appeared to affect IGRA responses only after 3 days, and the effect may apparently persist for several months, but evidence for this is weak. 
CONCLUSIONS/SIGNIFICANCE: Although reproducibility data are scarce, significant within-person IGRA variability has been reported. If confirmed in more studies, this has implications for the interpretation of results close to the cut-point and for the definition of conversions and reversions. Although the effect of TST on IGRA results is likely to be inconsequential in IGRA-positive subjects, in IGRA-negative subjects the interpretation of results may be confounded by a preceding TST if administered more than 3 days prior to an IGRA.