A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae

dc.contributor.advisor Mulder, Nicola
dc.contributor.author Iranzadeh, Arash
dc.date.accessioned 2023-09-11T07:02:40Z
dc.date.available 2023-09-11T07:02:40Z
dc.date.issued 2023
dc.date.updated 2023-09-11T06:59:08Z
dc.description.abstract Streptococcus pneumoniae (pneumococcus) is one of the leading causes of mortality in Africa. It asymptomatically colonizes the human nasopharynx. Invasive pneumococcal disease occurs when isolates spread to normally sterile sites such as the lungs, blood, and central nervous system. Colonization, though, does not necessarily lead to infection: some isolates remain in the upper respiratory tract without causing any symptoms. This thesis hypothesized that invasive and non-invasive isolates differ genetically. We tested this hypothesis by applying a pan-genome approach to whole-genome short-read sequencing data from 1477 samples from Malawi, including samples obtained from the nasopharynx of carriers (825 samples) and from the blood and cerebrospinal fluid of patients (652 samples). In-silico serotyping identified 56 serotypes in the cohort, and statistical analysis showed that, despite vaccination, the prevalence of serotypes 1 and 12F increased among patients. Genomes were assembled, and a reference pan-genome for all strains was built. Short reads were aligned to the core genome, and core variants were called. The population structure was determined based on the distribution of variants in the pan-genome. Finally, genes with a significant presence in the invasive isolates were identified. Functional enrichment analysis of potential virulence genes was carried out to address how specific genes may contribute to pathogenesis. The findings highlighted the features of the pneumococcus pan-genome in Malawi. The core and accessory genomes were characterized based on the functional analysis of genes. The core components included ribosomal subunits; subunits of F-type ATP synthase; and enzymes that catalyze the attachment of amino acids to tRNA molecules, DNA replication, DNA repair, and homologous recombination. Of the core and soft-core genes, 10.13% were uncharacterized.
In the accessory genome, the study detected genes from Regions of Diversity (RDs), including subunits of V-type ATPases and sodium/solute symporter from RD8a; enzymes from RD3 that catalyze capsule synthesis; subunits of the PsrP secY2A2 pathogenicity island from RD10; genes from RD6 and RD7 involved in transposing mobile genetic elements; genes from RD2, RD8b, and RD12 participating in communication and competition; and genes from RD4 that assemble pilins into pili and anchor pili to the cell wall. Of the accessory genes, 53.58% were uncharacterized. Most serotypes showed a similar prevalence in the carriage and disease groups. However, the significantly higher abundance of serotypes 1, 5, and 12F among patients than in the carriage group suggested that they are highly invasive with a short colonization period. These serotypes were genetically distinct from the others, differing in both the absence and the presence of several genes. The lack of genes from a genomic island known as RD8a was the most pronounced difference between serotypes 1, 5, and 12F and the serotypes significantly prevalent in the nasopharynx. Genes in RD8a are involved in binding to epithelial cells and in aerobic respiration, which synthesizes ATP through oxidative phosphorylation. The absence of RD8a from serotypes 1, 5, and 12F may be associated with their short duration in the nasopharynx, where they need to bind to epithelial cells and access the free oxygen molecules required for aerobic respiration. Given this, the amount of ATP is likely to decline in serotypes 1, 5, and 12F, causing them to harbour more phosphotransferase systems to transport carbohydrates, since these transporters use phosphoenolpyruvate rather than ATP as the energy source.
In conclusion, serotypes 1, 5, and 12F, the most prevalent and invasive pneumococcal strains in Malawi, showed a considerable genetic distinction from other strains that may be associated with their short colonization period and their rapid invasion of the blood and cerebrospinal fluid.
dc.identifier.apacitation Iranzadeh, A. (2023). A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae. Faculty of Health Sciences, Computational Biology Division. Retrieved from http://hdl.handle.net/11427/38498 en_ZA
dc.identifier.chicagocitation Iranzadeh, Arash. "A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae." Faculty of Health Sciences, Computational Biology Division, 2023. http://hdl.handle.net/11427/38498 en_ZA
dc.identifier.citation Iranzadeh, A. 2023. A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae. Faculty of Health Sciences, Computational Biology Division. http://hdl.handle.net/11427/38498 en_ZA
dc.identifier.ris TY - Doctoral Thesis
AU - Iranzadeh, Arash
DA - 2023_
DB - OpenUCT
DP - University of Cape Town
KW - Streptococcus Pneumoniae
LK - https://open.uct.ac.za
PY - 2023
T1 - A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae
TI - A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae
UR - http://hdl.handle.net/11427/38498
ER - en_ZA
dc.identifier.uri http://hdl.handle.net/11427/38498
dc.identifier.vancouvercitation Iranzadeh A. A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae. Faculty of Health Sciences, Computational Biology Division, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/38498 en_ZA
dc.language.rfc3066 eng
dc.publisher.department Computational Biology Division
dc.publisher.faculty Faculty of Health Sciences
dc.subject Streptococcus Pneumoniae
dc.title A pan-genome wide association study to identify genes associated with invasive Streptococcus pneumoniae
dc.type Doctoral Thesis
dc.type.qualificationlevel Doctoral
dc.type.qualificationlevel PhD
Files
Original bundle
Name: thesis_hsf_2023_iranzadeh arash.pdf
Size: 12.43 MB
Format: Adobe Portable Document Format