Browsing by Author "Kuttel, Michelle"
- Open Access: Accelerated cooperative co-evolution on multi-core architectures (2018). Moyo, Edmore; Kuttel, Michelle; Nitschke, Geoff Stuart
The Cooperative Co-Evolution (CC) model has been used in Evolutionary Computation (EC) to optimize the training of artificial neural networks (ANNs). This architecture has proven to be a useful extension to domains such as Neuro-Evolution (NE), which is the training of ANNs using concepts of natural evolution. However, the need for real-time systems and for the ability to solve more complex tasks has prompted a further need to optimize these CC methods. CC methods consist of a number of phases, of which the evaluation phase remains the most compute-intensive, taking as long as weeks to complete for some complex tasks. This study uses NE as a test case: we design a parallel CC processing framework and implement optimized serial and parallel versions in the Go programming language. Go's first-class concurrency constructs, channels and goroutines, make it well suited to parallel programming on multi-core machines. Our study focuses on Enforced Subpopulations (ESP) for single-agent systems and Multi-Agent ESP for multi-agent systems. We evaluate the parallel versions on two benchmark tasks of increasing complexity: double pole balancing for single-agent systems and prey-capture for multi-agent systems. We observe a maximum speed-up of 20x for the parallel Multi-Agent ESP implementation over our optimized single-core version in the prey-capture task, and a maximum speed-up of 16x for ESP in the harder version of the double pole balancing task. We also observe linear speed-ups for the difficult versions of the tasks over a certain range of cores, indicating that the Go implementations are efficient and that the parallel speed-ups are better for more complex tasks.
We find that in complex tasks, the Cooperative Co-Evolution Neuro-Evolution (CCNE) methods are amenable to multi-core acceleration, which provides a basis for the study of even more complex CC methods in a wider range of domains.
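The abstract above identifies the evaluation phase as the bottleneck and parallelises it with goroutines and channels in Go. As a rough illustration of that general pattern only, and not of the thesis's implementation, the sketch below fans independent fitness evaluations out to a worker pool and collects the results in population order. It is written in Python with a thread pool standing in for goroutines; `evaluate_fitness` and the list-of-weights candidate encoding are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_fitness(candidate):
    """Hypothetical stand-in for an expensive task simulation.

    Fitness here is simply the negated sum of squared weights.
    """
    return -sum(w * w for w in candidate)


def evaluate_population(population, workers=4):
    """Evaluate every candidate concurrently; results keep population order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evaluate_fitness, population))


# Toy population of three candidates (invented values).
population = [[0.5, -1.0], [0.0, 0.0], [2.0, 1.0]]
fitnesses = evaluate_population(population)
best = max(range(len(population)), key=fitnesses.__getitem__)
print(fitnesses, "best candidate index:", best)
```

In standard CPython, threads do not speed up CPU-bound pure-Python work because of the global interpreter lock, so this shows only the structure of the approach; real speed-ups would need a process pool or, as in the thesis, Go's goroutines.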
- Open Access: Addition of flexible linkers to GPU-accelerated coarse-grained simulations of protein-protein docking (2019). Pinska, Adrianna; Kuttel, Michelle; Gain, James; Best, Robert
Multiprotein complexes are responsible for many vital cellular functions, and understanding their formation has many applications in medical research. Computer simulation has become a valuable tool in the study of biochemical processes, but simulation of large molecular structures such as proteins on a useful scale is computationally expensive. A compromise must be made between the level of detail at which a simulation can be performed, the size of the structures which can be modelled and the time scale of the simulation. Techniques which can be used to reduce the cost of such simulations include the use of coarse-grained models and parallelisation of the code. Parallelisation has recently been made more accessible by the advent of Graphics Processing Units (GPUs), a consumer technology which has become an affordable alternative to more specialised parallel hardware. We extend an existing implementation of a Monte Carlo protein-protein docking simulation using the Kim and Hummer coarse-grained protein model [1] on a heterogeneous GPU-CPU architecture [2]. This implementation has achieved a significant speed-up over previous serial implementations as a result of the efficient parallelisation of its expensive non-bonded potential energy calculation on the GPU. Our contribution is the addition of the optional capability for modelling flexible linkers between rigid domains of a single protein. We implement additional Monte Carlo mutations to allow for movement of residues within linkers, and for movement of domains connected by a linker with respect to each other. We also add potential terms for pseudo-bonds, pseudo-angles and pseudo-torsions between residues to the potential calculation, and include additional residue pairs in the non-bonded potential sum.
Our flexible linker code has been tested, validated and benchmarked. We find that the implementation is correct, and that the addition of the linkers does not significantly impact the performance of the simulation. This modification may be used to enable fast simulation of the interaction between component proteins in a multiprotein complex, in configurations which are constrained to preserve particular linkages between the proteins. We demonstrate this utility with a series of simulations of diubiquitin chains, comparing the structure of chains formed through all known linkages between two ubiquitin monomers. We find reasonable agreement between our simulated structures and experimental data on the characteristics of diubiquitin chains in solution.
- Open Access: Algorithms for efficiently and effectively matching agents in microsimulations of sexually transmitted infections (2018). Geffen, Nathan; Kuttel, Michelle
Mathematical models of the HIV epidemic have been used to estimate incidence, prevalence and life-expectancy, as well as the benefits and costs of public health interventions, such as the provision of antiretroviral treatment. Models of sexually transmitted infection epidemics attempt to account for varying levels of risk across a population based on diverse (or heterogeneous) sexual behaviour. Microsimulations are a type of model that can account for fine-grained heterogeneous sexual behaviour. This requires pairing individuals, or agents, into sexual partnerships whose distribution matches that of the population being studied, to the extent this is known. But pair-matching is computationally expensive, so there is a need for computer algorithms that pair-match quickly. In this work we describe the role of modelling in responses to the South African HIV epidemic. We also chronicle a three-decade debate, greatly influenced since 2008 by a mathematical model, on the optimal time for people with HIV to start antiretroviral treatment. We then present and analyse several pair-matching algorithms, and compare them in a microsimulation of a fictitious STI. We find that there are algorithms, such as Cluster Shuffle Pair-Matching, that offer a good compromise between speed and approximating the distribution of sexual relationships of the study population. An interesting further finding is that infection incidence decreases as population increases, all other things being equal. Whether this is an artefact of our methodology or a natural world phenomenon is unclear and a topic for further research.
- Open Access: Comparative evaluation of two carbohydrate force fields for modelling polysaccharide conformation (2022). Lazar, Ryan; Kuttel, Michelle; Ravenscroft, Neil; Akher, Farideh
Modern carbohydrate simulation models have reached a level of maturity whereby their accuracy is often assumed. However, concerning differences have been reported when comparing the conformational predictions of rhamnose-rich polysaccharides between GLYCAM06 and other widely used carbohydrate force fields. This thesis investigates the scope and origin of these differences. We compare Molecular Dynamics simulations of strategically selected saccharide chains, with both the GLYCAM06 and CHARMM36 carbohydrate force fields. We find significant differences in the conformational predictions of the two force fields. More specifically, collapsed, globular conformations occur in the GLYCAM06 simulations, but are absent in the equivalent CHARMM36 results. The collapsing phenomenon is brought about by a gradual folding process, facilitated by instabilities in the GLYCAM06 α-L-Rha(1→X)-α-L-Rha glycosidic linkage that are stabilised by strong intramolecular interactions. The reduced consideration for repulsive Coulombic forces in GLYCAM06, originating from a collective lack of partial aliphatic hydrogen charges, is likely the principal factor behind these differences. This work suggests critical areas for refinement in GLYCAM06 that will be required for the force field to accurately model rhamnose-rich polysaccharides. The insights gained in this work have the potential to assist in the development of more accurate force fields for modelling carbohydrates.
- Open Access: Computational analysis of Escherichia coli O25 and O25b carbohydrate antigens using the CHARMM36 and GLYCAM06 force fields (2020). Fourie, Alexander Rees; Kuttel, Michelle
The emergence of ST131 extra-intestinal pathogenic Escherichia coli that are resistant to multiple antibiotics is a growing international health concern. Infections are common, treatment options for antibiotic-resistant bacteria are limited and there is no vaccine available. Polysaccharides serve key functions in immune response to bacterial infection. The O-polysaccharides present on the cell surface of Gram-negative bacteria are antigenic and are associated with specific bacterial serogroups. These are, therefore, a potentially effective target for vaccines. Most ST131 E. coli isolates express the O25b antigen and monoclonal antibodies that are specific to it have been isolated. The chemical structure of O25b has been characterized and differentiated from that of the previously known O25 (or O25a) variety. However, relatively little is known about the conformations of O25a and O25b and how they differ. As conformation is a factor in antigen-antibody binding, differences between the conformations of these two antigens may be relevant to further research into carbohydrate-targeted vaccines and diagnosis techniques for ST131:O25b bacteria. The conformations of polysaccharides are typically dynamic in solution and are difficult to determine empirically. Molecular dynamics simulation provides a means of estimating polysaccharide conformation but the results are critically dependent on the quality of the selected force field. Carbohydrate force fields have matured over the past few decades and CHARMM36 and GLYCAM06 are used extensively for the analysis of bacterial polysaccharides. Studies that compare results from these two widely used force fields are, however, still quite rare.
Here we use molecular dynamics simulations of unacetylated, 3 RU oligosaccharide extensions to compare the CHARMM36 and GLYCAM06 force fields and to present an initial analysis of the conformations of the O25a and O25b E. coli antigens. We then apply CHARMM36 molecular dynamics simulation to analogous O- and N-acetylated oligosaccharide extensions to gauge the effect of these groups on the conformations of the two antigens and to compare O25a and O25b. Despite some differences, our CHARMM36 and GLYCAM06 simulations are largely in agreement regarding the conformation of O25a trimers without O- or N-acetylation. Both force fields predict extended, linear antigen conformations. Differences between the two force fields are noted in our analogous study of O25b, however: GLYCAM06 favors a collapsed, globular oligosaccharide over a more extended molecule favored by CHARMM36; CHARMM36 and GLYCAM06 predict different preferred dihedral values for a conformationally important, main-chain α-L-Rhap-(1→3)-β-D-Glcp bond; GLYCAM06 favors an anti-Ψ, anti-ω orientation of a side-chain β-D-Glc-(1→6)-α-D-Glc bond over an anti-Ψ, syn-ω orientation favored by CHARMM36. These findings are in agreement with other studies that indicate the collapse of some oligosaccharides into metastable globular conformations during simulations with GLYCAM06. Our CHARMM36 simulations of O- and N-acetylated, 3 RU oligosaccharide extensions of O25a and O25b indicate large differences between the conformations of the two antigens. First, the O25b trimer favors either a compressed or extended helical conformation in solution whereas the O25a trimer favors a single, extended conformation. Second, O25a and O25b exhibit notably different dihedral values for conformationally important glycosidic bonds that correspond with the reported structural differences between the two antigens.
Third, O- and N-acetylation is found to facilitate rotation about a key α-D-Glcp-(1→3)-α-L-Rhap2Ac bond in O25b that, in turn, facilitates the formation of compressed, helical O25b conformations. These compressed conformations are stabilized by intramolecular hydrogen bonds that involve O- and N-acetyl groups. Finally, N-acetyl groups appear to be shielded on the inside of the compressed O25b helix whereas O-acetyl groups appear to be exposed on the outside of the molecule. We postulate that these large conformational differences provide a rationale for the clinically noted differences in cross-reactivity of monoclonal antibodies for the O25a and O25b antigens.
- Open Access: Design of a prototype mobile application interface for efficient accessing of electronic laboratory results by health clinicians (2018). Chigudu, Kumbirai; Kuttel, Michelle
... in order for clinicians to make informed medical decisions and prescribe the correct medication within a limited specified time. Since no further informed action can be taken on the patient until the laboratory report reaches the clinician, the delivery of the report to the clinician becomes a critical path in the value chain of the laboratory testing process. The National Health Laboratory Service (NHLS) currently delivers lab results in three ways: via a physical paper report; electronically through a web application; and, for short and high-priority test results such as human immunodeficiency virus (HIV) and tuberculosis (TB), via short message service (SMS) printers in remote rural clinics. Despite its inefficiencies, the paper report remains the most commonly used method. As meeting the specified turnaround-time targets for basic and critical laboratory tests remains a great challenge for the NHLS, there is a need to shift the method of final delivery from paper to a secure, paperless electronic result delivery system. Accordingly, the recently implemented centralised TrakCare Lab laboratory information system (LIS) makes provision for delivery of electronic results via a web application, ‘TrakCarewebview’. However, the uptake of TrakCarewebview has been very low because the application is cumbersome: it takes users through nine steps to obtain the results and is not designed for mobile devices. In addition, access to it in remote rural health care facilities is a great challenge because of a lack of supportive infrastructure.
There is therefore an obvious gap, and considerable potential, in diagnostic result delivery that calls for the immediate design and development of a less complex, cost-effective and usable mobile application for electronic delivery of laboratory results. The research was conducted after obtaining ethics clearance from the University’s Faculty of Science Research Ethics Committee. A survey of public sector clinicians across South Africa indicated that 98% have access to the internet through smartphones, and 93% of the clinicians indicated that they would use their mobile devices to access electronic laboratory results. A significant number of clinicians believe that the use of a mobile application in health facilities will improve patient care. This belief set a strong basis for designing and developing a mobile application for laboratory results. The study aims to design and develop a mobile application prototype that can demonstrate the capability of delivering electronic laboratory test results to clinicians on their smart devices, via a usable mobile application. The design of the mobile application prototype was driven by user-centred design (UCD) principles in order to develop an effective design. Core and critical to the process is the design step, which establishes the user requirements specifications that meet user expectations. The study substantiated the importance of the design aspect as the initial critical step in obtaining a good final product. The prototype was developed through an iterative process alternating prototype development and evaluation. The development iterations consisted of a single paper prototyping iteration followed by two further iterations using the interactive Justinmind prototyping tool. In the respective development iterations, cognitive walk-through and heuristic principles were used to evaluate the usability of the initial prototype.
The final prototype was then evaluated using the system usability scale (SUS) survey, a quantitative tool that determines the effectiveness and perceived usability of the application. The application achieved an average SUS score of 77, which is significantly above the average acceptable SUS score of 68. The standard SUS measurement deems 80 to be an excellent score, whereas a score below 68 is considered below average. The evaluation was conducted by the potential user group which was involved in the initial design process. The ability of the interactive prototyping tool (Justinmind) to mimic the final product offered end users a feel of the actual product, thus giving the outcome of the evaluation a strong basis for developing the actual product.
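For readers unfamiliar with how a SUS score such as the 77 reported above is obtained, the following Python sketch applies the standard SUS scoring rule: ten items are each rated 1 to 5, odd-numbered items contribute (rating minus 1), even-numbered items contribute (5 minus rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. The responses are invented for illustration and are not data from the study.

```python
def sus_score(ratings):
    """Return the SUS score (0-100) for one respondent's ten ratings (1-5)."""
    if len(ratings) != 10 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:   # odd items are positively worded
            total += rating - 1
        else:                      # even items are negatively worded
            total += 5 - rating
    return total * 2.5


# Hypothetical responses from three participants (not data from the study).
responses = [
    [4, 2, 4, 1, 5, 2, 4, 2, 4, 2],
    [5, 1, 4, 2, 4, 2, 5, 1, 4, 2],
    [4, 2, 3, 2, 4, 3, 4, 2, 4, 3],
]
scores = [sus_score(r) for r in responses]
mean_score = sum(scores) / len(scores)
print(scores, round(mean_score, 1))
```

An application's reported SUS score is the mean of the per-respondent scores, which is why individual participants can fall below 68 while the average remains above it.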
- Open Access: Designing a mobile application interface to support mid-career professionals in creating better financial futures (2020). Pentz, Audrey; Kuttel, Michelle
South Africans borrow more and save less than other nations (Discovery Bank, 2018). One reason is a lack of financial knowledge. If a mobile application could guide individuals to modify their financial habits slightly by spending less and saving more, they could dramatically improve their financial future. When designing visualisation systems such as a mobile application interface, users' qualitative design feedback and quantitative usability evaluation are both important and complementary. The benefit of usability feedback in software development is undisputed. The importance of qualitative design feedback from users, however, seems to be controversial in Science. Gathering users' qualitative design feedback, ahead of usability evaluation, can have a substantial impact on downstream development costs. The researcher used design as a tool for thinking (imagining new possibilities) and communicating (sharing ideas). The purpose was to clarify ways in which a mobile application interface could support users in making better financial decisions and creating better financial futures for themselves and consequently for society. A user-centred design (UCD) approach was followed, emphasising design before development, with a strong focus on user involvement in all three phases, namely requirements gathering, design and evaluation. A primary client archetype for mid-career professionals was developed and split into two personas, Alan and Zoe, based on personality and self-rated motivational attributes; these were used in an unconventional way to inspire two parallel, diverse designs.
In early design stages, before an idea is well formed, producing multiple contrasting designs in parallel and qualitative design feedback from users is beneficial to establishing utility (solving the right problem), tapping into users' domain knowledge, improving the quality of the design and reducing fixation on one idea. Once the concept has been socialised and evolved sufficiently with users' input, converging on one final design and testing usability (solving the problem in the right way) become more important. This research offers two refinements of the UCD process guidelines for the benefit of researchers and practitioners.
- Open Access: Designing an event display for the Transition Radiation Detector in ALICE (2021). Perumal, Sameshan; Dietel, Thomas; Kuttel, Michelle
We document here a successful design study for an event display focused on the Transition Radiation Detector (TRD) within A Large Ion Collider Experiment (ALICE) at the European Organisation for Nuclear Research (CERN). Reviews of the fields of particle physics and visualisation are presented to motivate formally designing this display for two different audiences. We formulate a methodology, based on successful design studies in similar fields, that involves experimental physicists in the design process as domain experts. An iterative approach incorporating in-person interviews is used to define a series of visual components applying best practices from literature. Interactive event display prototypes are evaluated with potential users, and refined using elicited feedback. The primary artefact is a portable, functional, effective, validated event display; a series of case studies evaluate its use by both scientists and the general public. We further document use cases for, and hindrances preventing, the adoption of event displays, and propose novel data visualisations of experimental particle physics data. We also define a flexible intermediate JSON data format suitable for web-based displays, and a generic task to convert historical data to this format. This collection of artefacts can guide the design of future event displays. Our work makes the case for a greater use of high quality data visualisation in particle physics, across a broad spectrum of possible users, and provides a framework for the ongoing development of web-based event displays of TRD data.
- Open Access: Effective visualisation of callgraphs for optimisation of parallel programs: a design study (2019). Mabakane, Mabule Samuel; Kuttel, Michelle
Parallel programs are increasingly used to perform scientific calculations on supercomputers. Optimising parallel applications to scale well, and ensuring maximum parallelisation, is a challenging task. The performance of parallel programs is affected by a range of factors, such as limited network bandwidth, parallel algorithms, memory latency and the speed of the processors. The term “performance bottlenecks” refers to obstacles that cause slow execution of the parallel programs. Visualisation tools are used to identify performance bottlenecks of parallel applications in an attempt to optimize the execution of the programs and fully utilise the available computational resources. TAU (Tuning and Analysis Utilities) callgraph visualisation is one such tool commonly used to analyse the performance of parallel programs. The callgraph visualisation shows the relationship between different parts (for example, routines, subroutines, modules and functions) of the parallel program executed during the run. TAU’s callgraph tool has limitations: it cannot effectively display the large volumes of performance data (metrics) generated during the execution of the parallel program, and the relationship between different parts of the program executed during the run can be hard to see. The aim of this work is to design an effective callgraph visualisation that enables users to efficiently identify performance bottlenecks incurred during the execution of a parallel program. This design study employs a user-centred iterative methodology to develop a new callgraph visualisation, involving expert users in the three developmental stages of the system: these design stages develop prototypes of increasing fidelity, from a paper prototype to high-fidelity interactive prototypes in the final design.
The paper-based prototype of a new callgraph visualisation was evaluated by a single expert from the University of Oregon’s Performance Research Lab, which developed the original callgraph visualisation tool. This expert is a computer scientist who holds a doctoral degree in computer and information science from the University of Oregon and is the head of the University of Oregon’s Performance Research Lab. The interactive prototype (first high-fidelity design) was evaluated against the original TAU callgraph system by a team of expert users, comprising doctoral graduates and undergraduate computer scientists from the University of Tennessee, United States of America (USA). The final complete prototype (second high-fidelity design) of the callgraph visualisation was developed with the D3.js JavaScript library and evaluated by users (doctoral graduates and undergraduate computer science students) from the University of Tennessee, USA. Most of these users have between 3 and 20 years of experience in High Performance Computing (HPC). The expert, in contrast, has more than 20 years of experience in the development of visualisation tools used to analyse the performance of parallel programs. The expert and users were chosen to test the new callgraphs against the original callgraphs because they have experience in analysing, debugging, parallelising, optimising and developing parallel programs. After evaluations, the final visualisation design of the callgraphs was found to be effective, interactive, informative and easy to use. It is anticipated that the final design of the callgraph visualisation will help parallel computing users to effectively identify performance bottlenecks within parallel programs, and enable full utilisation of computational resources within a supercomputer.
- Open Access: Force field comparison through computational analysis of capsular polysaccharides of Streptococcus pneumoniae serotypes 19A and F (2014). Gordon, Marc Brian; Kuttel, Michelle
Modern Molecular Dynamics force fields, such as the CHARMM36 and GLYCAM06 carbohydrate force fields, are parametrised to reproduce behaviours for specific molecules under specific conditions in order to be able to predict the behaviour of similar molecular systems, where there is often no experimental data. Coupled with the sheer number available, this makes choosing the appropriate force field a formidable task. For this reason it is important that modern force fields be regularly compared. Streptococcus pneumoniae is a cause of invasive pneumococcal disease (IPD) such as pneumonia and meningitis in children under five. While there are over 90 pneumococcal serotypes, only a handful of these are responsible for disease. Immunisation with the conjugate vaccine PCV7 has markedly decreased invasive pneumococcal disease. Following PCV7 immunisation, incidences of non-vaccine serotypes, especially serotype 19A, have increased.
- Open Access: GPU acceleration of the frequency domain acceleration search for binary pulsars (2021). Laidler, Christopher; Kuttel, Michelle
Graphics processing units (GPUs) have been used to accelerate computation in a broad range of fields; this work presents a GPU-accelerated search for pulsars. Pulsars are highly magnetised neutron stars with extremely stable rotational periods. These periods can be accurately measured, which makes them exceptionally powerful reference tools in the field of astrophysics. Pulsars have very weak emissions, making them difficult to find. Most pulsars are found in large-scale surveys, which generate a large amount of data, and require extensive data processing. This work describes a GPU-based solution, with implications for real-time processing of pulsar search data. Pulsar astronomy uses radio telescope observations with high spectral and temporal resolution, which produce very large data sets and require intensive Digital Signal Processing. Large-scale pulsar surveys using next-generation radio telescopes such as the Square Kilometre Array (SKA) will have to be performed in real time, as the volumes of raw data produced will be too large to be stored for an extended period. These computational requirements are compounded when searching for binary pulsars, as their orbital motion makes them difficult to detect using classic periodicity searches. However, these rare pulsars are of great interest to physicists, as they allow us to test general relativity. Acceleration searches are the most common technique for detecting signals from binary pulsars that may be missed by standard search techniques. One of these, the frequency domain acceleration search (FDAS), mitigates the effect of orbital acceleration by correlating a matched template with the spectrum of a signal. This method has been shown to be more efficient than the alternative time domain acceleration search (TDAS).
Even so, it is extremely computationally intensive to perform on a large scale. The existing implementation, Accelsearch, is run on a central processing unit (CPU), which limits its performance. We address this problem by creating a GPU port of the FDAS. An analysis of the fundamental calculations on which the FDAS is based informs the design of a fully asynchronous pipeline that exploits multiple levels of parallelism. This entails developing a novel technique for calculating Fresnel integrals, which increases the speed and numerical accuracy of the calculations, in both single- and double-precision. Furthermore, we develop a new estimate which improves the numerical accuracy of filter coefficients for accelerations close to zero. The GPU-accelerated pipeline achieves speeds 30 to 70 times faster than the existing serial CPU implementation. Our results clearly show that GPU acceleration is effective at reducing the cost of processing the FDAS component, to the point at which the SKA1-mid survey data could be searched in real time using 340 to 675 desktop GPUs from the Pascal generation.
- Open Access: Introduction to Python Programming, Part 1 (2012-04). Kuttel, Michelle
The first half of a first course on how to program in Python. This set of six files comprises the slides from the first 6 weeks of our 12-week first-year course on Python Programming. It is an introduction aimed at students who have not programmed before. The code examples referred to in the slides are included as a zipped archive.
- Open Access: Molecular modeling of bacterial polysaccharide antigens to inform future vaccine development (2021). Hlozek, Jason; Ravenscroft, Neil; Kuttel, Michelle
Polysaccharide conjugate vaccines have been pivotal in reducing the prevalence and severity of bacterial infectious diseases worldwide, preventing countless deaths. The effectiveness of a vaccine can be extended if the selected vaccine strains in a multivalent vaccine cross-protect against non-vaccine strains. Detailed knowledge of antigen structure and conformation is required for vaccine components to be rationally selected. However, experimental methods may not be able to ascertain the conformations of polysaccharide chains. To address this, molecular dynamics simulations can provide key theoretical insights on molecular conformation to rationalize cross-protection data and inform vaccine development. In this work, we use molecular dynamics to investigate the conformations of glycan antigens of Neisseria meningitidis and Shigella flexneri bacteria, causative agents of meningitis and diarrheal disease. For N. meningitidis, our modeling indicates that serogroup A is unlikely to cross-protect against serogroup X infection, justifying the inclusion of serogroup X in future multivalent meningococcal vaccines. We also find that a chemically stable carba-analogue of serogroup A has significant conformational differences to the native serogroup A chain, which does not support its use as a suitable serogroup A vaccine replacement. Our simulations of S. flexneri glycan antigens (serogroups Y, 2, 3, and 5) identify heuristics for the effects of substitution on backbone conformation and support a proposed vaccine containing serotypes 2a (with O-acetylation) and 3a that will provide broad cross-protection. These findings can guide the rational selection of vaccine components to result in next-generation vaccines with greater cost-effectiveness and improved disease coverage.
- Open Access: NMR characterisation of group b streptococcus capsular polysaccharide repeating units (2022). Keresztesi, Maximillian Ludwig; Ravenscroft, Neil; Kuttel, Michelle
Group B Streptococcus (Streptococcus agalactiae) is a Gram-positive β-haemolytic bacterium and the leading cause of neonatal mortality by sepsis, pneumonia and meningitis. To date, ten serotypes of Group B Streptococcus (GBS) have been recognised (Ia, Ib, II - IX), each identified and differentiated by their sialic acid-containing capsular polysaccharide. Capsular polysaccharides are the virulence factor for bacterial pathogens and the target for vaccine development, with multivalent polysaccharide-protein conjugate vaccines licenced against bacteria such as Neisseria meningitidis and Streptococcus pneumoniae. Nuclear magnetic resonance (NMR) spectroscopy has been established as an extremely useful and robust method for tracking the manufacturing process of carbohydrate vaccines from polysaccharide antigen through to conjugate vaccines. The 1D proton profiles of most of the GBS antigens have been published; however, the identity spectra were recorded at 298 K, resulting in broad peaks and overlap of the large water signal with diagnostic GBS signals in the anomeric region. This study attempts to aid the development of GBS glycoconjugate vaccines by fully characterising the repeating units of the six most common GBS serotypes (Ia, Ib, II - V) by NMR recorded at a higher temperature of 343 K, to serve as a database of reference GBS NMR spectra and chemical shift assignments. Full NMR characterisation of the repeating unit of each serotype was achieved by use of an array of 1D and 2D NMR experiments including proton, carbon, proton-proton scalar and dipolar correlation experiments and proton-carbon heteronuclear single-quantum and multiple bond correlation experiments. The assignments of all six serotypes largely agree with NMR data published for these serotypes.
The exception to this was GBS V, where data presented in this study shows that the assignments of the anomeric peaks of GlcNAc and the backbone β-Glucose are reversed relative to their assignments in the current literature. The 1D and 2D NMR spectra presented in this study can be used for identity, integrity and purity testing of polysaccharide batches. They allow identification of each serotype by its diagnostic anomeric peaks, can confirm the structural integrity of the polysaccharide both before and after conjugation and can detect the presence of impurities such as residuals. Ultimately, they represent a powerful reference resource for use in the development, preparation and control testing of future GBS glycoconjugate vaccines.