The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset

dc.contributor.advisor: Williamson, Carolyn
dc.contributor.advisor: Martin, Darren
dc.contributor.author: Mphahlele, Ruth
dc.date.accessioned: 2024-05-27T08:47:40Z
dc.date.available: 2024-05-27T08:47:40Z
dc.date.issued: 2023
dc.date.updated: 2024-05-22T08:34:57Z
dc.description.abstract: Human Immunodeficiency Virus (HIV) rapidly escapes the cytotoxic T lymphocyte (CTL) immune responses exerted by the host. Mutation patterns and HLA-associated footprints linked to viral escape have been identified, making it possible to predict escape from viral sequence data combined with host HLA allele information. Next-Generation Sequencing (NGS) approaches enable the generation of large sequence datasets and the detection of viral populations present at very low frequencies in an infected individual at any given time. These datasets allow changes in viral populations within a host to be studied over time and provide a means to understand the kinetics and pathway(s) of escape. While tools exist that predict escape in sequence data with few sequences per sampling timepoint, these tools are often limited when analysing large NGS datasets. In this project, we developed a workflow for identifying the kinetics of CTL escape in longitudinal HIV-1 next-generation sequencing datasets of gag sequences generated on an Illumina MiSeq platform over the duration of drug-naïve infection. The dataset was generated from 15 women sampled over a period of one to seven years and comprised 4583 short-read gag sequences (544 bp). We surveyed tools for identifying CTL escape in deep sequencing datasets and screened them against pre-defined criteria. The outputs were validated using a test dataset from a previous study that identified escape. We selected the Epitope Matcher tool as having the most potential to identify CTL epitopes and escape mutations. To further support evidence of escape and to identify additional putative escape mutations, we identified sites with high Shannon entropy (>=0.25) and sites evolving under positive selection using FUBAR as implemented in HyPhy. The sites were verified against the HLA association, CTL epitope variant and escape mutation lists, or against data generated by Epitope Matcher.
Using the Epitope Matcher tool, we identified seven HLA-B-restricted gag epitopes in six individuals, with putative escape identified in all seven epitopes, most commonly in the late chronic phase of infection. The most common epitope in the population was YL9 (Gag HXB2 coordinates 296 to 304; found in 60% of the participants), restricted by HLA B*15:03, B*15:10 and B*42:01. Toggling of amino acids within epitopes, potentially reflecting the fitness cost associated with a specific change, was observed in five of the seven epitopes. We further identified 35 high Shannon entropy sites, nine of which fell within epitopes identified by Epitope Matcher. Additionally, nine of the high Shannon entropy sites were evolving under positive selection. With this supporting evidence, we predict that the mutation T310S (found in the AW11 epitope, restricted by allele B*58:01) is likely to be associated with escape. This study provides a pipeline that enables semi-automated analysis of NGS data. Using this approach, we have provided a better understanding of the kinetics and frequency of CTL escape over the course of HIV infection. Additionally, we have identified sites that are frequently targeted across the Gag p24 region and across individuals. These findings are relevant to informing CTL-based vaccine, prevention and treatment strategies.
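The Shannon entropy screen mentioned in the abstract (sites with entropy >= 0.25) can be illustrated with a minimal sketch. This is not the thesis pipeline: the function names, the toy alignment, the use of the natural logarithm and the choice to ignore gap characters are all assumptions made for illustration, since the abstract states only the threshold.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy of one alignment column (natural log; gaps ignored).

    Both the log base and the gap handling are assumptions, not taken
    from the thesis.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def high_entropy_sites(alignment, threshold=0.25, start=1):
    """Return (position, entropy) pairs for columns with entropy >= threshold.

    `alignment` is a list of equal-length amino acid strings; `start` is the
    coordinate assigned to the first column.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    hits = []
    for offset, column in enumerate(zip(*alignment)):
        h = site_entropy(column)
        if h >= threshold:
            hits.append((start + offset, h))
    return hits


# Hypothetical toy alignment (not data from the study).
toy_alignment = ["TSTL", "TSTL", "SSTL", "SSTM"]
sites = high_entropy_sites(toy_alignment)
for position, entropy in sites:
    print(f"site {position}: H = {entropy:.3f}")
```

In a longitudinal setting, a screen like this would typically be run per participant, either per timepoint or over pooled timepoints. The abstract does not say which, so that choice is left open here.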
dc.identifier.apacitation: Mphahlele, R. (2023). The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset. Faculty of Health Sciences, Computational Biology Division. Retrieved from http://hdl.handle.net/11427/39727
dc.identifier.chicagocitation: Mphahlele, Ruth. "The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset." Faculty of Health Sciences, Computational Biology Division, 2023. http://hdl.handle.net/11427/39727
dc.identifier.citation: Mphahlele, R. 2023. The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset. Faculty of Health Sciences, Computational Biology Division. http://hdl.handle.net/11427/39727
dc.identifier.ris:
TY - Thesis / Dissertation
AU - Mphahlele, Ruth
DA - 2023
DB - OpenUCT
DP - University of Cape Town
KW - Bioinformatics
LK - https://open.uct.ac.za
PY - 2023
T1 - The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
TI - The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
UR - http://hdl.handle.net/11427/39727
ER -
dc.identifier.uri: http://hdl.handle.net/11427/39727
dc.identifier.vancouvercitation: Mphahlele R. The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset. Faculty of Health Sciences, Computational Biology Division, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/39727
dc.language.rfc3066: eng
dc.publisher.department: Computational Biology Division
dc.publisher.faculty: Faculty of Health Sciences
dc.subject: Bioinformatics
dc.title: The identification of cytotoxic T lymphocyte (CTL) escape in a large, longitudinal subtype C HIV-1 sequence dataset
dc.type: Thesis / Dissertation
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_hsf_2023_mphahlele ruth.pdf
Size: 4.98 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission