
Browsing by Author "Winberg, Simon"

Now showing 1 - 20 of 45
  • Open Access
    A GPU based X-Engine for the MeerKAT Radio Telescope
    (University of Cape Town, 2020) Callanan, Gareth Mitchell; Winberg, Simon
The correlator is a key component of the digital backend of a modern radio telescope array. The 64-antenna MeerKAT telescope has an FX-architecture correlator consisting of 64 F-Engines and 256 X-Engines. These F- and X-Engines are all hosted on 128 custom-designed FPGA processing boards, known as SKARABs. One SKARAB X-Engine board hosts four logical X-Engines; it ingests data at 27.2 Gbps over a 40 GbE connection and correlates this data in real time. GPU technology has improved significantly since the SKARAB was designed, and GPUs are now becoming viable alternatives to FPGAs in high-performance streaming applications. The objective of this dissertation is to investigate how to build a GPU drop-in replacement X-Engine for MeerKAT and to compare this implementation to a SKARAB X-Engine. This includes the construction and analysis of a prototype GPU X-Engine. The 40 GbE ingest, the GPU correlation algorithm and the software pipeline framework that links these two together were identified as the three main sub-systems to focus on in this dissertation. A number of different tools implementing these sub-systems were examined, with the most suitable ones chosen for the prototype. A prototype dual-socket system was built that could process the equivalent of two SKARABs' worth of X-Engine data. This prototype has two 40 GbE Mellanox NICs running the SPEAD2 library and a single Nvidia GeForce 1080 Ti GPU running the xGPU library. A custom pipeline framework built on top of the Intel Threading Building Blocks (TBB) library was designed to facilitate the flow of data between these sub-systems. The prototype system was compared to two SKARABs. For an equivalent amount of processing, the GPU X-Engine cost R143 000 while the two SKARABs cost R490 000. The power consumption of the GPU X-Engine was more than twice that of the SKARABs (400 W compared to 180 W), while only requiring half as much rack space.
GPUs as X-Engines were found to be more suitable than FPGAs when cost and density are the main priorities; when power consumption is the priority, FPGAs should be used. When running eight logical X-Engines, 85% of the prototype's CPU cores were used while only 75% of the GPU's compute capacity was utilised. The main bottleneck on the GPU X-Engine was on the CPU side of the server. This report suggests that the next iteration of the system should offload some CPU-side processing to the GPU and double the number of 40 GbE ports, which could potentially double the system throughput. When considering methods to improve this system, an FPGA/GPU hybrid X-Engine concept was developed that would combine the power-saving advantage of FPGAs with the low cost-to-compute ratio of GPUs.
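The core of the X-engine stage described above is a per-channel cross-multiply-accumulate over all antenna pairs, yielding one visibility matrix per frequency channel. The NumPy sketch below is a minimal illustration of that operation only; it is not the xGPU kernel and makes no attempt at the real-time rates quoted above.

```python
import numpy as np

def x_engine(voltages):
    """Cross-multiply-accumulate stage of an FX correlator.

    voltages: complex array of shape (time, channels, antennas),
    i.e. channelised F-engine output. Returns accumulated visibility
    matrices, one Hermitian (antennas x antennas) matrix per channel.
    """
    t, c, a = voltages.shape
    vis = np.zeros((c, a, a), dtype=np.complex128)
    for i in range(t):
        for ch in range(c):
            v = voltages[i, ch]
            vis[ch] += np.outer(v, v.conj())  # every antenna-pair product
    return vis

# Toy run: 8 time samples, 4 channels, 3 antennas
rng = np.random.default_rng(0)
v = rng.normal(size=(8, 4, 3)) + 1j * rng.normal(size=(8, 4, 3))
vis = x_engine(v)
print(vis.shape)  # (4, 3, 3)
```

The diagonal of each per-channel matrix holds the antenna autocorrelations (real, non-negative powers), and the off-diagonal entries are the cross-correlations a real correlator would accumulate over many more samples.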
  • Open Access
    Accelerator-based look-up table for coarse-grained molecular dynamics computations
    (2018) Gangopadhyay, Ananya; Naidoo, Kevin J.; Winberg, Simon
Molecular Dynamics (MD) is a simulation technique widely used by computational chemists and biologists to simulate and observe the physical properties of a system of particles or molecules. The method provides invaluable three-dimensional structural and transport property data for macromolecules that can be used in applications such as the study of protein folding and drug design. The most time-consuming and inefficient routines in MD packages, particularly for large systems, are the ones involving the computation of intermolecular energy and forces for each molecule. Many fully atomistic packages such as CHARMM and NAMD have been refined over the years to improve their efficiency, but simulating complex long-time events such as protein folding remains out of reach for atomistic simulations. The consensus view amongst computational chemists and biologists is that the development of a coarse-grained (CG) MD package will make the long timescales required for protein folding simulations possible. The shortcoming of this method remains an inability to produce accurate dynamics and results that are comparable with atomistic simulations. It is the objective of this dissertation to develop a coarse-grained method that is computationally faster than atomistic simulations, while being dynamically accurate enough to produce structural and transport property data comparable to results from the latter. Firstly, the accuracy of the Gay-Berne potential in modelling liquid benzene in comparison to fully atomistic simulations was investigated. Following this, the speed of a coarse-grained condensed phase benzene simulation employing a Gay-Berne potential was compared with that of a fully atomistic simulation. While coarse-graining algorithmically reduces the total number of particles in consideration, the execution time and efficiency scale poorly for large systems.
Both fully atomistic and coarse-grained developers have accelerated their packages using high-performance parallel computing platforms such as multi-core CPU clusters, Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs). GPUs have especially gained popularity in recent years due to their massively parallel architecture on a single chip, making them a cheaper alternative to a CPU cluster. Their relatively short development time also gives them an advantage over FPGAs. NAMD is perhaps the most popular MD package that makes efficient use of a single GPU or a multi-GPU cluster to conduct simulations. The Scientific Computing Research Unit's in-house generalised CG code, the Free Energy Force Induced (FEFI) coarse-grained MD package, was accelerated using a GPU to investigate the achievable speed-up in comparison to the CPU algorithm. To achieve this, a parallel version of the sequential force routine, i.e. the computation of the energy, force and torque per molecule, was developed and implemented on a GPU. The GPU-accelerated FEFI package was then used to simulate benzene, which is almost exclusively governed by van der Waals forces (i.e. dispersion effects), using the parameters for the Gay-Berne potential from a study by Golubkov and Ren in their work "Generalized coarse-grained model based on point multipole and Gay-Berne potentials". The coarse-grained condensed phase structural properties, such as the radial and orientational distribution functions, proved to be inaccurate. Further, transport properties such as diffusion were significantly less satisfactory than in a CHARMM simulation. From this it was concluded that the Gay-Berne potential is not able to model the subtle effects of dispersion observed in liquid benzene. In place of the analytic Gay-Berne potential, a more accurate approach would be to use a multidimensional free energy-based potential.
Using the Free Energy from Adaptive Reaction Coordinate Forces (FEARCF) method, a four-dimensional Free Energy Volume (FEV) for two interacting benzene molecules was computed for liquid benzene. The focal point of this dissertation was to use this FEV as the coarse-grained interaction potential in FEFI to conduct CG simulations of condensed phase liquid benzene. The FEV can act as a numerical potential or Look-Up Table (LUT) from which the interaction energy and the four partial derivatives required to compute the forces and torques can be obtained via numerical methods at each step of the CG MD simulation. A significant component of this dissertation was the development and implementation of four-dimensional LUT routines to use the FEV for accurate condensed phase coarse-grained simulations. To compute the energy and partial derivatives between the grid points of the surface, an interpolation algorithm was required. A four-dimensional cubic B-spline interpolation was developed because of the method's superior accuracy and resistance to oscillations compared with other polynomial interpolation methods. The algorithm's introduction into the FEFI CG MD package for CPUs exhausted the single-core CPU architecture with the large number of interpolations required for each MD step, making it impractical for the high-throughput interpolation required for MD simulations. The 4D cubic B-spline algorithm and the LUT routine were therefore developed and implemented on a GPU. Following evaluation, the LUT was integrated into the FEFI MD simulation package. A FEFI CG simulation of liquid benzene was run using the 4D FEV for a benzene molecular pair as the numerical potential. The resulting structural and transport properties outperformed the analytical Gay-Berne CG potential, more closely approximating the atomistically predicted properties.
The work done in this dissertation demonstrates the feasibility of a coarse-grained simulation using a free energy volume as a numerical potential to accurately simulate dispersion effects, a key feature needed for protein folding.
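The cubic B-spline LUT idea described above is easiest to see in one dimension; the 4D version applies the same basis weights along each axis as a tensor product. Below is a minimal sketch using the standard uniform cubic B-spline basis weights; it is a generic illustration, not the dissertation's GPU code.

```python
import numpy as np

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for fractional offset t in [0, 1)."""
    return np.array([
        (1 - t) ** 3 / 6.0,
        (3 * t**3 - 6 * t**2 + 4) / 6.0,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
        t**3 / 6.0,
    ])

def lut_interp(table, x):
    """Evaluate a 1D look-up table at continuous grid coordinate x.

    Uses the four surrounding grid points, so x must lie at least one
    point away from either end of the table.
    """
    i = int(np.floor(x))
    w = bspline_weights(x - i)
    return float(np.dot(w, table[i - 1 : i + 3]))

# Uniform cubic B-splines reproduce straight lines exactly, so a linear
# table f(i) = i interpolates back to x itself.
table = np.arange(10, dtype=float)
print(round(lut_interp(table, 4.25), 9))  # 4.25
```

The weights always sum to one (partition of unity), which is what gives the smoothing spline its resistance to the oscillations that plague high-order polynomial interpolation.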
  • Open Access
    Action Tracer with Android API
    (2024) Chiramba, Humphrey; Winberg, Simon
This dissertation presents the design and testing of a low-cost motion capture device with feedback capabilities. In addition, an API to communicate with this system is developed that allows devices to connect to the motion capture system and execute commands on it. This system, Action Tracer, would be useful in cases where a user needs to train or learn a repetitive motion found in an activity or sport. Such a task needs to be done under supervision and may need to be done away from a coach or therapist. In either situation, a system is needed which can accommodate both cases while providing insights that are not visible to the eye. In addition, the system would need to be configurable for the various sports that it may be used with, as well as be operated from an Android device. Most of the work presented in this report relates to the development of marker-based motion capture systems, with a particular interest in the use of Inertial Measurement Units (IMUs). The subject of biofeedback is also covered, as well as how it relates to motion capture and how it can be applied in this field.
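A common way to turn raw IMU readings of the kind mentioned above into a stable orientation estimate is a complementary filter, which blends the integrated gyroscope rate with the accelerometer-derived angle. The sketch below is a generic illustration of that technique, not necessarily Action Tracer's exact method; the blend weight `alpha` is an assumed, illustrative value.

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """One update of a complementary filter for IMU tilt estimation.

    Blends the integrated gyroscope rate (smooth but drifting) with the
    accelerometer-derived angle (noisy but drift-free). alpha is an
    illustrative weight, not a value from the dissertation.
    """
    return alpha * (angle + gyro_rate * dt) + (1 - alpha) * accel_angle

# With a stationary sensor (zero gyro rate) the estimate converges
# toward the accelerometer angle of 30 degrees.
angle = 0.0
for _ in range(200):
    angle = complementary_filter(angle, gyro_rate=0.0, accel_angle=30.0, dt=0.01)
print(29.0 < angle < 30.0)  # True
```

The high-pass/low-pass split is what lets a cheap IMU track fast limb motion (gyro) without accumulating drift over a long training session (accelerometer correction).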
  • Open Access
    Analysis and Development of an Online Knowledge Management Support System for a Community of Practice
    (2019) Mafereka, Moeketsi; Winberg, Simon
The purpose of this study was to investigate how particular business practices, focusing on those occurring in multi-site non-governmental organizations (NGOs), could be enhanced by use of a knowledge management system (KMS). The main objective of this KMS is to enhance business processes and save costs for a multi-site NGO through streamlining the organizational practices of knowledge creation, storage, sharing and application. The methodology uses a multiple-perspective approach, which covers exploration of the problem space and the solution space. Under exploration of the problem space, interviews with employees of the NGO were conducted to identify the core problems that the organization faced, and the organization's knowledge management maturity was assessed through an online questionnaire. The methodology then moved on to exploration of the solution space, during which the requirements gathering and definition process was carried out through a combination of interviews with company employees and a systematic literature review of best practices. The requirements were used to design the system architecture and use-case models. The prototype for a Community of Practice (CoP) support website was developed and investigated in test cases. The tests showed that the prototype system was able to facilitate asynchronous communication through the creation and management of events, creation and management of collaboration groups, creation of discussion topics and creation of basic pages. Furthermore, security capabilities were tested in terms of login functionality. Lastly, page load times were tested for eight different scenarios. The system performance was found to be satisfactory because the scenarios covering crucial system requirements had a response time of below 11 seconds. An exception was the landing page, which after login took 26 seconds to load.
It is believed that the creation of a platform that enables and records user interaction, eases online discussion, and manages groups, topics and events is a major contributor to a successful knowledge management approach.
  • Open Access
    Architectural Level Computational Hardware Abstraction: A New Programming Language for FPGA Projects
    (2022) Taylor, John-Philip; Winberg, Simon
Recent years have seen vast improvements to the capability of programmable processing platforms, especially field programmable gate arrays, or FPGAs. Modern software languages have been developed, adding features such as duck-typing, dynamic interpretation and built-in high-level data structures. Yet FPGA development still mostly uses traditional hardware description languages such as VHDL and Verilog, and the industry is resorting to third-party tools and scripting-based automation in order to increase developer efficiency. This dissertation presents ALCHA: a new object-oriented language aimed at low-level FPGA development. The main language objectives include increasing architectural abstraction capabilities, introducing structured programming to FPGA development, automating fixed-point related design, integrating design constraints and increasing generalisation capability. In short, the ALCHA language is designed to allow the user to increase abstraction and reduce maintenance effort. After ensuring that the language grammar is parsable, the resulting language design is evaluated by means of a radar-based case study. Language complexity measurement is based on the number of lines of code, and language power on the cost of maintenance. By these metrics, ALCHA is shown to support code that is about half as complex and twice as powerful as traditional HDL-based design. In future, ALCHA could evolve into a hardware description language in its own right, allowing developers to leverage the strengths of FPGAs.
  • Open Access
    Automated troubleshooting for RTWP in 3G/4G RAN nodes
    (2018) Mohammed, Hisham; Winberg, Simon
Nowadays, Mobile Network Operators are confronted with many challenges in operating and maintaining their networks. Subscribers expect stable and perpetual services; repeated interruptions of service result in the dissatisfaction of users and may lead to losing the end user. One of the major issues facing a Radio Access Network (RAN) mobile operator is coping with uplink interference in the RAN, such as the Receive Total Wideband Power (RTWP) in the Universal Mobile Telecommunications System (UMTS) band. A frequently occurring issue in such networks is the RTWP alarm. This alarm is reported in the Network Operation Centre (NOC) and contributes to poor quality in the network. Such an alarm may occur daily, thus impacting the network's Key Performance Indicators (KPIs). The mobile network operator always tries to resolve the RTWP issue quickly by means of several processes and strategies to diagnose and troubleshoot it, all within a target Service Level Agreement (SLA). There are many different causes that can lead to an RTWP alarm in a mobile 3G RAN, and each of these cases has different diagnosis and troubleshooting methods. The main idea of this project is to design a Graphical User Interface (GUI) tool to help the Front Office (FO) or Back Office (BO) engineer at a mobile network operator to check and troubleshoot the RTWP issue in the network in a timely manner. The tool is designed to check the configuration of the radio, based on the Huawei NodeB 3900 and statistical performance counters, and to provide the correct decision for the engineer to improve efficiency and minimize the time taken to troubleshoot the RTWP alarm in the network. The GUI tool is thus designed to support the engineers in Oman Telecommunication Company's NOC while dealing with the RTWP alarm on the Huawei NodeB 3900.
The major finding of this study is the design of the GUI tool to minimize the time taken to resolve the RTWP issue in the Huawei NodeB 3900 both in a single site and in multiple sites, to conduct consistency checks for the software parameters, and finally to identify the root cause of the RTWP alarm. The GUI tool shows an operation log, which can be used by the administrator for maintenance records, and it also contains a help guide that gives the user more information about the functionality of each button.
  • Open Access
    Automatic generation of a floor plan from a 3D scanned model: Making the Analogue World Digital
    (2018) Wilson, Bradlee Kenneth; Winberg, Simon; O'Hagan, Daniel
The processing of three-dimensional (3D) room models is an area of research undertaken by many academics and hobbyists due to the multiple uses derived from the information obtained, such as the generation of a floor plan; an example of bridging the real and digital world. A floor plan is required when an existing room, floor, or building requires alteration. Having the floor plan in the digital domain allows the user to alter the room via simulation and render the environment in a life-like manner to determine whether the alterations will suffice. This is done using Computer-Aided Design (CAD) software, which would also be used to design a new room or building. However, not all buildings' digital files are readily available or even exist, making the creation of a floor plan necessary. The floor plan can be drawn up by a person with pen and paper, or created using software tools and sensors. Commercial systems exist for this task, but there are no automated, open-source systems that can do the same. Current research tends to focus on the processing algorithms and not the sensors or methods for capturing the environment. This dissertation deals with testing and evaluating off-the-shelf (OTS) sensors and the processing of 3D modelled rooms captured with one of these sensors. The tests performed on the OTS sensors determine the overall accuracy of the sensors for 3D room modelling. The rationale for designing and conducting these tests is to provide the community with suggested practical tests to assist in selecting an OTS sensor for 3D room modelling. The 3D room models are captured using an open-source application and are imported into custom software. The 3D models undergo pre-processing algorithms producing 2D results, which are further processed to determine the walls of rooms. The dimension information about these features is used to create a 2D floor plan.
3D modelled environments are inherently noisy, requiring efficient pre-processing to remove the noise without hampering the processing performance of the 3D model. One of the largest contributors to noise and accuracy is the sensor. Selecting the appropriate sensor can mitigate the need for complex pre-processing algorithms and will improve overall processing time. The project was able to extract dimension information within an acceptable error. The tests that were designed and used for sensor testing were able to determine which sensor was the better choice for 3D room modelling; the optimal sensor was found to be Microsoft's Kinect. Tests were performed in which the Microsoft Kinect was required to map a room. The results show that dimensional information about the given scene could be successfully extracted with an average error of 4.60%.
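One simple way to reduce a 3D room scan to 2D wall candidates, in the spirit of the 3D-to-2D pipeline described above, is to project the point cloud onto the floor plane and threshold cell occupancy: cells hit by points at many heights are likely walls. This is a hedged NumPy sketch; the cell size and hit threshold are illustrative values, not the dissertation's parameters.

```python
import numpy as np

def occupancy_grid(points, cell=0.05, min_hits=10):
    """Project a 3D point cloud (N x 3, metres) onto the floor plane.

    Cells hit by many points across different heights are likely wall
    candidates. cell (grid resolution) and min_hits (wall threshold)
    are illustrative values, not the dissertation's parameters.
    """
    xy = points[:, :2]
    lo = xy.min(axis=0)
    idx = np.floor((xy - lo) / cell).astype(int)
    grid = np.zeros(idx.max(axis=0) + 1, dtype=int)
    np.add.at(grid, (idx[:, 0], idx[:, 1]), 1)  # unbuffered accumulation
    return grid >= min_hits  # boolean wall mask

# Toy scan: 2000 points on a vertical wall at x = 1.0, 2 m long, 2.5 m high
rng = np.random.default_rng(1)
wall = np.column_stack([
    np.full(2000, 1.0),            # x: wall plane
    rng.uniform(0.0, 2.0, 2000),   # y: along the wall
    rng.uniform(0.0, 2.5, 2000),   # z: height (collapsed by the projection)
])
mask = occupancy_grid(wall)
print(mask.any())  # True: wall cells detected
```

A line-fitting step (e.g. Hough transform or RANSAC) over the resulting mask would then recover the wall segments and their dimensions for the floor plan.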
  • Open Access
    Automation of ultrasonic-based product tracking & traceability in supply chains
    (2025) Mkombwe, Anorld; Winberg, Simon
In the sweeping tide of digital evolution, the Internet of Things (IoT) is emerging as a significant catalyst, spearheading a colossal upsurge in the deployment of sensors as the Industry 4.0 buzzword continues to dominate the platforms where industry captains converge. In modern-day Supply Chain Management (SCM), product tracking and traceability are paramount for ensuring efficiency, quality control, and regulatory compliance. At the heart of IoT are wireless sensors and various other sensors forming an ecosystem of interacting technologies. This dissertation explores optimizing ultrasonic-based systems as wireless sensors for tracking and traceability in SCM and logistics. While widely used, traditional radio-frequency identification (RFID) tags and barcode systems encounter limitations in environments with interference or when tracking through dense materials. Ultrasonic technology, with its ability to penetrate various media and provide high-resolution data, presents a promising alternative for environments where other tracking technologies underperform. It can thus be used with near-field communication (NFC) technology and real-time GPS tracking, traditionally reserved for tracking goods in transit. The research investigates the unique attributes of ultrasonic signals for product identification, focusing on frequency modulation, signal processing techniques, and integration with existing digital frameworks. The study utilizes simulation models and field testing to examine the reliability, accuracy, and cost-effectiveness of ultrasonic-based tracking across various supply chain stages. Innovative algorithms and device design modifications address key challenges, such as signal attenuation, environmental noise, and energy consumption, enhancing signal clarity and data retrieval efficiency. This project primarily focused on the costs and benefits of using a simple ultrasonic sensor for product tracing and tracking.
A cheap ultrasonic sensor was connected to an Arduino device along with an affordable buzzer and LED for audible and visual alerts to anyone nearby. These simple, standard, low-cost devices were programmed with open-source code and libraries downloadable in the Arduino IDE to achieve results comparable to those previously attainable only at great expense with technologies such as blockchain (BCT), the global positioning system (GPS), or RFID. The sensor was programmed to sense objects passing through at a certain distance and then increment a count that displayed real-time results locally and on a centralized cloud platform. This enabled the results to be monitored and queried from any part of the world with internet connectivity. Such methods in SCM have been quite expensive to set up and maintain, prompting the need for an IoT-based system with low-cost inputs but reliable performance. This project also aims to provide a solution for automatically tracking and tracing goods without human involvement before goods are packaged for transportation, where GPS tracking is ineffective. Results demonstrate that automated ultrasonic tracking can improve product traceability, particularly in complex industrial environments where traditional methods struggle. By incorporating ultrasonic technology, supply chains benefit from enhanced visibility, which supports real-time inventory management, reduces errors, and increases responsiveness to disruptions. This dissertation concludes with recommendations for implementing ultrasonic systems in conjunction with existing technologies such as blockchain, RFID scanning and tagging systems, and other IoT-based infrastructure, offering a heterogeneous approach that maximizes the strengths of each technology to create a robust, scalable solution for modern supply chains.
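The counting logic described above (sense an object within a certain distance, then increment a count) fits in a few lines. The sketch below is a Python illustration of the edge-triggered counting idea rather than the actual Arduino firmware, and the threshold distance is an assumed value.

```python
def count_objects(distances_cm, threshold_cm=20.0):
    """Count objects passing an ultrasonic sensor.

    An object is counted on the falling edge: when a reading drops below
    threshold_cm (something is in front of the sensor), the count is
    incremented once, and not again until the reading returns above the
    threshold. threshold_cm is illustrative, not the project's value.
    """
    count = 0
    blocked = False
    for d in distances_cm:
        if d < threshold_cm and not blocked:
            count += 1          # new object entered the beam
            blocked = True
        elif d >= threshold_cm:
            blocked = False     # beam clear again
    return count

# Simulated readings: two objects pass, each blocking the beam briefly
readings = [100, 100, 12, 11, 100, 100, 9, 100]
print(count_objects(readings))  # 2
```

On the Arduino, the same edge-triggered state machine would run over successive echo-ranging readings, with the buzzer and LED fired on each increment and the count pushed to the cloud platform.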
  • Open Access
    A binaural sound sources localisation application for smart phones
    (2015) Mugagga, Pius Kavuma Basajjabaka; Winberg, Simon
The ability to estimate the positions of sound sources gives animals a 360° awareness of their acoustic environment. This helps complement the visual scene, which is restricted to 180° in humans. Unfortunately, deaf people miss out on this ability. Smart phones are rapidly becoming a common tool amongst mobile users in developed and emerging markets. Their processing ability has more than doubled since their introduction to mass consumer markets by Apple in 2007. Top-end smart phones, such as the Samsung Galaxy series 3, 4, and 5 models, have two microphones with which one can acquire stereo recordings. The purpose of this research project was to establish a feasible sound source localisation algorithm for current top-end smart phones, and to recommend hardware improvements for future smart phones, to pave the way for the use of smart phones as advanced auditory sensory devices capable of acting as avatars for intelligent remote systems to learn about different acoustic scenes with the help of human users. The GCC-PHAT algorithm was chosen as the underlying core DOA algorithm due to its suitability for pair-wise localisation, as highlighted in the literature. A stochastic power accumulation algorithm was designed and implemented to improve estimation outcomes from GCC-PHAT. This algorithm was inspired by the W-disjoint orthogonality assumption in the literature and was extended to perform sound source counting and time-domain source separation. The system yielded satisfactory azimuth estimates of sound source directions in real time, with pin-point DOA estimation accuracy rates of 64%, rising to 90.67% when a tolerance of ±1 correlation sample is considered. An effort to resolve front-back ambiguity using phone orientation data from the MEMS sensors yielded unsatisfactory results, prompting a recommendation that an extra microphone would be needed to achieve 360° localisation in a more user-friendly way.
The dissertation concludes with plans for further work on the topic and provision of a further refined API and optimised libraries to facilitate development of customised solutions using this system.
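GCC-PHAT, the core DOA algorithm chosen above, whitens the cross-spectrum of a microphone pair so that only phase information remains, then picks the peak of the resulting generalised cross-correlation to get the inter-microphone time delay. The sketch below is the generic textbook form in NumPy, not the smartphone implementation.

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the time delay of sig relative to ref via GCC-PHAT.

    Returns the delay in seconds. Minimal sketch of the standard
    algorithm: whiten the cross-spectrum (keep phase only), then find
    the peak of the inverse transform.
    """
    n = len(sig) + len(ref)
    S = np.fft.rfft(sig, n=n)
    R = np.fft.rfft(ref, n=n)
    cross = S * np.conj(R)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

# A 5-sample delay between two noise signals at 16 kHz
rng = np.random.default_rng(2)
x = rng.normal(size=1024)
delayed = np.concatenate((np.zeros(5), x))[:1024]
print(gcc_phat(delayed, x, 16000) * 16000)  # ≈ 5 samples
```

On a two-microphone phone, the recovered delay maps to an azimuth angle via the known microphone spacing and the speed of sound, which is where the ±1 correlation-sample tolerance quoted above comes from.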
  • Open Access
    Comparative study of tool-flows for rapid prototyping of software-defined radio digital signal processing
    (2019) Setetemela, Khobatha; Winberg, Simon
This dissertation is a comparative study of tool-flows for rapid prototyping of SDR DSP operations on programmable hardware platforms. The study is divided into two parts, focusing on high-level tool-flows for implementing SDR DSP operations on FPGA and GPU platforms respectively. In this dissertation, the term ‘tool-flow' refers to a tool or a chain of tools that facilitate the mapping of an application description specified in a programming language onto one or more programmable hardware platforms. High-level tool-flows use techniques such as high-level synthesis to allow the designer to specify the application from a high level of abstraction and achieve improved productivity without significant degradation in the design's performance. SDR is an emerging communications technology that is driven by, among other factors, increasing demands for high-speed, interoperable and versatile communications systems. The key idea in SDR is to implement as many as possible of the radio functions that were traditionally defined in fixed hardware in software on programmable hardware processors instead. The most commonly used processors are based on complex parallel computing architectures in order to support the high-speed processing demands of SDR applications, and they include FPGAs, GPUs, and multicore general-purpose processors (GPPs) and DSPs. The architectural complexity of these processors results in correspondingly complex programming methodologies, which impedes their wider adoption in suitable application domains, including SDR DSP. In an effort to address this, a plethora of different high-level tool-flows have been developed. Several comparative studies of these tool-flows have been done to help designers, among other benefits, in choosing high-level tools to use. However, few studies focus on SDR DSP operations, and most existing comparative studies are not based on well-defined comparison criteria.
The approach implemented in this dissertation is to use a systems engineering design process: firstly, to define the qualitative comparison criteria in the form of a specification for an ideal high-level SDR DSP tool-flow and, secondly, to implement a FIR filter case study in each of the tool-flows to enable a quantitative comparison in terms of programming effort and performance. The study considers Migen- and MyHDL-based open-source tool-flows for FPGA targets, and CUDA and the Open Computing Language (OpenCL) for GPU targets. The ideal high-level SDR DSP tool-flow specification was defined and used to conduct a comparative study of the tools across three main design categories: high-level modelling, verification and implementation. For tool-flows targeting GPU platforms, the FIR case study was implemented using each of the tools; it was compiled, executed on a GPU server consisting of two GTX Titan X GPUs and an Intel Core i7 GPP, and lastly profiled. The tools were moreover compared in terms of programming effort, memory transfer cost and overall operation time. With regard to tool-flows with FPGA targets, the FIR case study was developed using each tool, then implemented on a Xilinx 7-series FPGA and compared in terms of programming effort, logic utilization and timing performance.
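The FIR filter used as the case study above computes y[n] = sum_k h[k] * x[n-k]. A plain NumPy reference model of that computation is sketched below; the tap values are illustrative (a 4-tap moving average), as the dissertation's actual coefficients are not given here.

```python
import numpy as np

def fir(x, taps):
    """Direct-form FIR filter: y[n] = sum_k taps[k] * x[n - k].

    A software reference model of the kind of FIR case study each
    tool-flow implements; taps here are illustrative, not the
    dissertation's coefficients.
    """
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k, h in enumerate(taps):
            if n - k >= 0:
                y[n] += h * x[n - k]
    return y

taps = np.full(4, 0.25)   # 4-tap moving average
x = np.ones(8)            # step input
y = fir(x, taps)
print(y)  # ramps 0.25, 0.5, 0.75, then settles at 1.0
```

Such a golden reference is useful in exactly the comparison described above: the CUDA/OpenCL kernels and the Migen/MyHDL hardware can all be verified against it before programming effort and performance are measured.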
  • Open Access
    Convolutional Neural Networks for Robust Fynbos Leaf Classification: Enabling Trustworthy Machine Learning in Botanical Science
    (2025) Govender, Jarushen; Winberg, Simon
The Fynbos Leaf Optical Recognition Application (FLORA) is a novel machine-learning tool created to aid conservation efforts in the Cape Floral Region, and for the species of plants known as Fynbos in particular. Known for their distinctive evolutionary features, these species maintain a revered position in the ecological heritage of South Africa. This version of FLORA makes use of a Convolutional Neural Network (CNN) trained on a dataset of collected leaf images to correctly classify species of Fynbos using natural images as training data. The thesis combats many of the pitfalls of using CNN technology, such as working with small datasets, and provides a novel approach for dealing with the image quality issues and over-fitting that arise from working with limited data. This project is also intended to be scalable, able to grow and become more generalised as more training data are added. The collected data involved manual sample collection using photography equipment and consists of 1,196 field images spread across 35 different species of plants. Part of the thesis involved the creation of a novel Image Quality Assessment tool to remove low-quality images that negatively influenced the predictive capability of the model. The model evaluation process makes use of SHapley Additive exPlanations (SHAP), a tool for visualising model predictions, to contribute to the explainability of the model and to develop trust and confidence in machine-learning algorithms, with the ultimate aim of providing a tool that merges the fields of ecology, botany and electrical engineering. Multiple models were trained and evaluated, and the selected model obtained a classification accuracy of 76% on the validation data and an F1-score of 74%. This was an extremely positive result, as the training data consisted exclusively of natural images and no feature engineering was performed.
The model was then tuned to specific hyper-parameter values, which yielded a small performance boost, and then tested on its ability to generalise. Five new classes were added to the training set and the model's performance remained consistent, demonstrating robust generalisation. This project contributes knowledge to the growing field of image recognition and provides a clear framework for model explainability which should benefit future research endeavours.
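The abstract describes the Image Quality Assessment tool only at a high level. One common heuristic for flagging low-quality (blurry) images, which is illustrative here and not necessarily the metric FLORA uses, is the variance of a discrete Laplacian: sharp images have strong local intensity changes and score high, while blurry or flat images score low.

```python
import numpy as np

def blur_score(img):
    """Variance of a 4-neighbour discrete Laplacian over a 2-D
    greyscale image. Low values indicate little high-frequency
    detail, i.e. a likely blurry image."""
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

rng = np.random.default_rng(1)
sharp = rng.random((64, 64))       # lots of high-frequency detail
blurry = np.full((64, 64), 0.5)    # flat image, no detail at all

print(blur_score(sharp) > blur_score(blurry))  # True
```

A filter of this kind would discard images whose score falls below a dataset-specific threshold; the threshold itself has to be tuned on labelled examples.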
  • Item
    Open Access
    The design and development of a pulsed radar block for the Rhino platform
    (2012) Raw, Bruce; Winberg, Simon; Inggs, Michael
The Reconfigurable Hardware Interface for computiNg and radiO (Rhino) platform is an FPGA-based computing platform designed at the University of Cape Town to provide an FPGA resource that is affordable and easy to learn and use in research and skills development in the areas of Software Defined Radio, Radio Astronomy and Cognitive Radio. A framework comprising reusable radar processing modules (referred to in this text as "Radar Blocks") has been implemented on the Rhino and allows users to control a simple pulse radar. The pulse radar application is implemented on the FPGA using the radar blocks framework, which allows each block to be configured from the ARM processor to adapt settings during experiments. This project developed blocks for the communications bus, Gigabit Ethernet and a simple pulse radar.
  • Item
    Open Access
    Design and Implementation of a Risc-V Based LoRa Module
    (2023) Njoroge, Mark; Winberg, Simon
The proliferation of the Internet of Things (IoT) in both scale and complexity, alongside advances in optimised edge and fog system architectures, is driving an increasing need for low-power end nodes with greater computational capabilities. These distributed higher-capacity nodes allow IoT infrastructures to minimise the power cost of data movement and increase real-time response through increased edge data analytics. This dissertation presents the design of a prototype softcore RISC-V based LoRa end node on a custom Printed Circuit Board (PCB). By combining the reconfigurability and optimisation potential of an FPGA and RISC-V based architecture with a LoRa interface, the design contributes a novel option for addressing these needs. The design utilises the open-source Python framework LiteX to generate an open, low-cost and flexible System on a Chip (SoC) that contains the necessary core and peripherals to facilitate integration with a LoRa transceiver. The SoC is implemented on an ultra-low-power FPGA (Lattice iCE40UP5K), providing access to both reconfigurable logic and a CPU for data analytics, and standard interfaces for third-party sensors, such as UART, I2C and SPI. The whole design is integrated on a custom PCB in a USB dongle form factor. The resulting prototype can therefore be used as a peripheral for existing systems that may require additional compute power and IoT connectivity. The performance of the prototype was evaluated in various applicable outdoor and indoor scenarios and observed to be comparable to industry-standard modules.
  • Item
    Open Access
    Design, Implementation and Assessment of BPSK and QPSK PAPR over OFDM signals using LimeSDR
    (2023) Kagande, Tinashe; Winberg, Simon
Modern-day communications required efficient and affordable wireless communication techniques. Traditional wireless communication systems involved a lot of hardware for each component within the system. Software Defined Radios (SDRs) became popular and were a major area of research, as they provided a highly adaptive software alternative to the traditional analog hardware solutions commonly implemented for communication systems. The benefits of using SDRs outweighed those of traditional analog hardware by a significant margin, hence so much research and development was pursued in this area. SDRs offered upgradability and adaptation often without needing to change the hardware, giving them the edge over traditional hardware approaches that needed replacement for upgrades. With Orthogonal Frequency Division Multiplexing (OFDM) being one of the most popular modulation techniques for Next Generation Networks (NGN), it was important to understand how best it could be delivered using low-cost SDRs. The biggest challenge of OFDM multiplexing was its high Peak-to-Average Power Ratio (PAPR), which could necessitate expensive circuitry for the ADC/DAC components of an SDR solution. OFDM signals were popularly modulated either by Binary Phase Shift Keying (BPSK) or Quadrature Phase Shift Keying (QPSK), which raised the question: "What was their contribution to PAPR in an OFDM system?". Consequently, this project compared BPSK and QPSK in terms of PAPR in OFDM signals. A personal computer was used to host GnuRadio-based prototypes, on which an OFDM system was developed using the OFDM blocks built into the software. LimeSDR hardware was used to sample radio waves. A LimeSDR block was implemented in GnuRadio to interconnect with the LimeSDR module. An OFDM transceiver was designed in GnuRadio, and the code developed for this project was also open source. 
GnuRadio was selected specifically for its open-source flexibility, which allowed adaptability and the opportunity to experiment with the code, and was expected to benefit future work. For this project, pre-selected data stored on the host PC was transmitted from the OFDM transmitter through the LimeSDR antennas, received by the LimeSDR antennas, then demodulated and saved in a different folder on the host PC. Once this was achieved, user-interface facilities were added to ease use and testing. Results from the testing demonstrated the compatibility of LimeSDR and GnuRadio and showed significant differences between a BPSK-modulated signal and a QPSK-modulated signal in terms of PAPR. This project aimed to contribute to the radio and wireless communication field as well as to support other ongoing projects in the UCT Electrical Engineering Department connected to pertinent considerations for 5G and IoT wireless remote sensing solutions.
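The PAPR comparison at the heart of this project can be reproduced in miniature: map random BPSK or QPSK symbols onto subcarriers, take the IFFT (the core OFDM step) and compare peak to mean power of the time-domain signal. This sketch uses NumPy rather than GnuRadio, and the parameters (64 subcarriers) are illustrative, not the project's actual configuration.

```python
import numpy as np

def papr_db(symbols, n_fft=64):
    """PAPR of one OFDM symbol: place constellation points on the
    subcarriers, IFFT to the time domain, then take the ratio of
    peak to mean instantaneous power, in dB."""
    x = np.fft.ifft(symbols, n_fft)
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

rng = np.random.default_rng(0)
n_fft = 64

# BPSK: +/-1 on each subcarrier
bpsk = rng.choice([-1.0, 1.0], n_fft)
# QPSK: one of four unit-magnitude points per subcarrier
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], n_fft) / np.sqrt(2)

print(f"BPSK OFDM PAPR: {papr_db(bpsk):.2f} dB")
print(f"QPSK OFDM PAPR: {papr_db(qpsk):.2f} dB")
```

In practice PAPR is characterised statistically (e.g. as a CCDF over many OFDM symbols) rather than from a single draw as above.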
  • Item
    Open Access
    Designing a dynamic spectrum sharing algorithm between DCS and LTE in the 1800 MHz band: a case study of a mobile telecommunication operator in Zimbabwe
    (2025) Magwa, Luckmore; Winberg, Simon
In recent years, there has been remarkable growth in the wireless devices and networks market, leading to the proliferation of numerous wireless services and applications. Consequently, regulatory agencies around the world have allocated licensed spectrum chunks to different wireless services to meet the increasing demand. Despite technological advancements such as Multiple Input Multiple Output (MIMO) communications, heterogeneous networks, and cooperative communications, spectrum scarcity continues to pose a challenge for regulatory agencies worldwide. Facing this challenge, Dynamic Spectrum Sharing (DSS) has emerged as a promising remedy. As a facet of frequency spectrum management, DSS aims to bolster spectrum utilization efficiency and elevate the end-user experience by introducing greater flexibility in the usage of spectrum resources. This dissertation has substantiated DSS as a viable solution to the challenge of spectrum scarcity. It assessed the effectiveness and suitability of dynamic spectrum sharing within a conventional mobile network setting, concentrating on intra-operator scenarios encompassing Digital Cellular System (DCS), also known as GSM1800, and LTE radio technologies. Monte Carlo-style system-level simulations were conducted using Atoll, utilizing raster traffic maps provided by the Mobile Network Operator (MNO). These simulations served two main purposes: firstly, to benchmark the simulator's performance with actual network performance data collected from the MNO, and secondly, to validate the impact of DSS on the network by contrasting it with the current fixed spectrum sharing method employed by the MNO in urban and suburban settings, thus offering a realistic analysis. 
Key Performance Indicators (KPIs) on LTE, such as Downlink Throughput, Physical Resource Block (PRB) Utilization, and Evolved Radio Access Bearer (ERAB) Establishment, were evaluated, along with consideration of the impact on 2G metrics like voice call drops, total carried traffic, and receiver (Rx) signal quality. The simulations revealed a substantial surge in LTE throughput, averaging 62% across both clusters, resulting in an overall increase in LTE traffic of 34%, thanks to DSS implementation. Remarkably, this enhancement in LTE performance was achieved while ensuring minimal adverse effects on DCS performance. Notably, DSS's impact on the DCS network was more pronounced in urban areas, with a 7% reduction in voice traffic attributed to heightened interference in shared spectrum zones that degraded SINR. As a result, there was a 6% drop in DL quality samples (DL Rq), resulting in a 0.16% increase in voice call drops post-DSS activation. In suburban regions, both DSS and Fixed Spectrum Allocation (FSA) exhibited nearly identical DCS performance, with negligible impact, as indicated by a slight 1.7% decline in received signal quality compared to the urban cluster. To further optimize DCS performance within the DSS framework, future strategies suggest reducing the transmit power of resource elements in shared spectrum zones to mitigate interference with DCS channels.
  • Item
    Open Access
    Designing and developing a robust automated log file analysis framework for debugging complex system failure
    (2022) Van Balla, Tyrone Jade; Winberg, Simon
As engineering and computer systems become larger and more complex, additional challenges around the development, management and maintenance of these systems materialize. While these systems afford greater flexibility and capability, debugging failures that occur during their operation has become more challenging. One such system is the MeerKAT Radio Telescope's Correlator Beamformer (CBF), the signal processing powerhouse of the radio telescope. The majority of software and hardware systems generate log files detailing system operation during runtime. These log files have long been the go-to source of information for engineers when debugging system failures. As these systems become increasingly complex, the log files generated have exploded in both volume and complexity as log messages are recorded for all interacting parts of a system. Manually using log files for debugging system failures is no longer feasible. Recent studies have explored data-driven, automated log file analysis techniques that aim to address this challenge and have focused on two major aspects: log parsing, in which unstructured, free-form text log files are transformed into a structured dataset by extracting a set of event templates that describe the various log messages; and log file analysis, in which data-driven techniques are applied to this structured dataset to model the system behaviour and identify failures. Previous work has yet to address the combination of these two aspects to realize an end-to-end framework for performing automated log file analysis. The objective of this dissertation is to design and develop a robust, end-to-end Automated Log File Analysis Framework capable of analysing log files generated by the MeerKAT CBF to assist in system debugging. The Data Miner, Inference Engine and the complete framework are the major subsystems developed in this dissertation. 
State-of-the-art, data-driven approaches to log parsing were considered and the best performing approaches were incorporated into the Data Miner. The Inference Engine implements an LSTM-based multi-class classifier that models the system behaviour and uses this to perform anomaly detection to identify failures from log files. The complete framework links these two components together in a software pipeline capable of ingesting unstructured log files and outputting assistive system debugging information. The performance and operation of the framework and its subcomponents are evaluated for correctness on a publicly available, labelled dataset consisting of log files from the Hadoop Distributed File System (HDFS). Given the absence of a labelled dataset, the applicability and usefulness of the framework in the context of the MeerKAT CBF is subjectively evaluated through a case study. The framework is able to correctly model system behaviour from log files, but anomaly detection performance is greatly impacted by the nature and quality of the log files available for tuning and training the framework. When analysing log files, the framework is able to identify anomalous events quickly, even when large log files are considered. While the design of the framework primarily considered the MeerKAT CBF, a robust and generalisable end-to-end framework for automated log file analysis was ultimately developed.
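As an illustration of the log-parsing step a Data Miner of this kind performs, the masking approach common to data-driven parsers can be sketched as follows: variable fields in each raw line are replaced with placeholder tokens so that lines produced by the same code path collapse to one event template. The regexes, tokens and sample HDFS-style lines below are illustrative, not those used in the dissertation.

```python
import re
from collections import Counter

def template(line):
    """Reduce a raw log line to an event template by masking the
    variable fields: IP addresses, block IDs, hex values, numbers."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", line)
    line = re.sub(r"blk_-?\d+", "blk_<ID>", line)
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)
    line = re.sub(r"\b\d+\b", "<NUM>", line)
    return line

logs = [
    "Received block blk_123 of size 67108864 from 10.0.0.1",
    "Received block blk_456 of size 67108864 from 10.0.0.2",
    "PacketResponder 1 for block blk_123 terminating",
]

# Lines that differ only in their variable fields share a template.
counts = Counter(template(l) for l in logs)
for tpl, n in counts.items():
    print(n, tpl)
```

A real parser such as those benchmarked in the literature learns the templates from the data instead of relying on hand-written regexes, but the output, a structured sequence of event IDs, is the same kind of input an LSTM anomaly detector consumes.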
  • Item
    Open Access
    Development and testing of the RHINO host streamed data acquisition framework
    (2017) Boleme, Mpati; Winberg, Simon; Mohapi, Lerato
This project focuses on developing a supporting framework for integrating the Reconfigurable Hardware INterface for computing and radiO (RHINO) with a Personal Computer (PC) host in order to facilitate the development of Software Defined Radio (SDR) applications built using a hybrid RHINO/multicore PC system. The supporting framework that is the focus of this dissertation is designed around two main parts: a) resources for integrating the GNU Radio framework with the RHINO platform to allow data streams to be sent from RHINO to be processed by GNU Radio, and b) a concise and highly efficient C code module with an accompanying Application Program Interface (API) that receives streamed data from RHINO and provides data marshalling facilities to gather and dispatch blocks of data for further processing using C/C++ routines. The methodology followed in this research project involves investigating real-time streaming techniques using User Datagram Protocol (UDP) packets and, furthermore, investigating how the GNU Radio high-level SDR development framework can be integrated into real-time data acquisition systems such as, in this project's case, RHINO. The literature on real-time processing requirements for the streamer framework was reviewed. Guidelines for implementing high-performance, low-latency, maximum-throughput streaming are consequently presented and the proposed design motivated. The results achieved demonstrate an efficient data streaming system. The objectives of implementing the RHINO data acquisition system through integration with standard C/C++ code and GNU Radio were satisfactorily met. The system was tested with real-time Radio Frequency (RF) demodulation. The system captures one In-phase/Quadrature (I/Q) sample pair at a time, which constitutes one packet. 
The results show that data can be streamed from the RHINO board to GNU Radio over GbE with a minimum capturing latency of 10.2 μs for a 2⁰ packet size (a single packet) and an average data capturing throughput of 0.54 Megabytes per second (MBps). The capturing latency, in this case, is the time taken from the time of the request to receiving the data. The FM receiver case study successfully demonstrated a demodulated FM signal from a 94.5 Megahertz (MHz) radio station. Further recommendations include making use of the 10GbE port on RHINO for data streaming purposes; the 10GbE port can be used together with GNU Radio to improve the speed of the RHINO streamer.
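As a minimal illustration of UDP-based I/Q streaming of the kind this framework performs, the following loopback sketch receives one datagram and unpacks it as interleaved signed 16-bit little-endian I/Q samples. The wire format and sample width here are assumptions for illustration, not the actual RHINO packet format.

```python
import socket
import struct

def recv_iq(sock, n_pairs=1):
    """Block on one UDP datagram and unpack interleaved 16-bit
    signed little-endian I/Q samples into a list of (I, Q) tuples."""
    data, _addr = sock.recvfrom(4 * n_pairs)
    vals = struct.unpack(f"<{2 * n_pairs}h", data[:4 * n_pairs])
    return list(zip(vals[0::2], vals[1::2]))

# Loopback demo: send one I/Q pair to ourselves over UDP.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))  # let the OS pick a free port
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(struct.pack("<2h", 1000, -2000), rx.getsockname())

pairs = recv_iq(rx)
print(pairs)  # [(1000, -2000)]
rx.close()
tx.close()
```

A production streamer would additionally handle sequence numbers and packet loss, since UDP gives no delivery or ordering guarantees.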
  • Item
    Open Access
    Digitized Radio broadcast seeker (DRBS)
    (2025) Sithole, Simbisai Mfunani; Winberg, Simon
The use of Software Defined Radio (SDR) has greatly increased the flexibility and programmability of radio systems due to the implementation of radio functions in software. The transfer of traditionally hardware-based signal processing to the software domain enables web SDR receivers to be hosted on the internet as data streaming services. Hosting SDR receivers online allows many users to tune in and listen to broadcast transmissions simultaneously. While significant advancements have been made in SDR hardware technology and in making data easily accessible to users, developer productivity has been lacking. The market lacks a simple software development kit that would enable researchers and developers to experiment and create innovative software applications using existing data and components. The aim of this project is to develop an application framework that provides a collection of pre-built modules, components, code libraries, tools and Application Programming Interfaces (APIs) that will allow developers to quickly create innovative applications using broadly available SDR data by leveraging pre-developed infrastructure. The Digitized Radio Broadcast Seeker (DRBS) application, as an implementation of the application framework, provides a platform for web streaming that aggregates web SDRs and helps developers search for broadcast transmissions of interest as well as build solutions around the streamed transmissions. The research project demonstrated that DRBS could be used to monitor Morse code signals to detect emergencies such as distress calls, weather alerts and other critical broadcasts. 
After conducting latency and average server response time performance tests, it was concluded that despite the additional infrastructure layer, the DRBS application did not add significant overhead to the signal processing path and could be considered for additional use cases such as identifying FM radio stations and analyzing spectrum usage across different geographical regions.
  • Item
    Open Access
    A domain specific language for facilitating automatic parallelization and placement of SDR patterns into heterogeneous computing architectures
    (2017) Mohapi, Lerato Jerfree; Winberg, Simon; Inggs, Michael R
This thesis presents a domain-specific language (DSL) for software defined radio (SDR) which is referred to as OptiSDR. The main objective of OptiSDR is to facilitate the development and deployment of SDR applications into heterogeneous computing architectures (HCAs). As HCAs are becoming mainstream in SDR applications such as radar, radio astronomy, and telecommunications, parallel programming and optimization processes are also becoming cumbersome, complex, and time-consuming for SDR experts. Therefore, the OptiSDR DSL and its compiler framework were developed to alleviate these parallelization and optimization processes, together with developing execution models for DSP and dataflow models of computation suitable for SDR-specific computations. The OptiSDR target HCAs are composed of graphics processing units (GPUs), multi-core central processing units (MCPUs), and field programmable gate arrays (FPGAs). The methodology used to implement the OptiSDR DSL involved an extensive review process of existing SDR tools and the extent to which they address the complexities associated with parallel programming and optimizing SDR applications for execution in HCAs. From this review process, it was discovered that, while HCAs are used to accelerate many SDR computations, there is a shortage of intuitive parallel programming frameworks that efficiently utilize the HCAs' computing resources for achieving adequate performance for SDR applications. There were, however, some very good general-purpose parallel programming frameworks identified in the literature review, including Python-based tools such as NumbaPro and Copperhead, as well as the prevailing Delite embedded DSL compiler framework for heterogeneous targets. 
The Delite embedded DSL compiler framework motivated and powered the OptiSDR compiler development in that it provides four main compiler development capabilities that are desired in OptiSDR: 1) Generic data parallel executable patterns; 2) Execution semantics for heterogeneous MCPU-GPU run-time; 3) Abstract syntax creation using intermediate representation (IR) nodes; and 4) Extensibility for defining new syntax for other domains. The OptiSDR DSL design processes using this Delite framework involved designing the new structured parallel patterns for DSP algorithms (e.g. FIR, FFT, convolution, correlation, etc.), dataflow models of computation (MoC), parallel loop optimizations (tiling and space splitting), and optimal memory access patterns. Advanced task and data parallel patterns were applied in the OptiSDR dataflow MoCs, which are especially suitable for SDR computations where FPGA-based realtime data acquisition systems feed data into multi-GPUs for implementation of parallel DSP algorithms. Furthermore, the research methodology involved an evaluation process that was used to determine the OptiSDR language's expressive power, efficiency, performance, accuracy, and ease of use in SDR applications, such as radar pulse compression and radio frequency sweeping algorithms. The results include measurements of performance and accuracy, productivity versus performance, and real-time processing speeds and accuracy. The performance of some of the regularly used modules, such as the FFT-based Hilbert transform and cross-correlation, was found to be very high, with computation speeds ranging from 70.0 GFLOPS to 72.6 GFLOPS, and speedups of up to 80× compared to sequential C/C++ programs and 50× for Matlab's parallel loops. Accuracy was favourable in most cases. For instance, OptiSDR Octave-like DSP instantiations were found to be accurate, with L2-norm forward-errors ranging from 10⁻¹³ to 10⁻¹⁶ for smaller and larger SDR programs respectively. 
It can therefore be concluded from the analysis in this thesis that the objectives, which include alleviating the complexities in parallel programming and optimizing SDR applications for execution in HCAs, were met. Moreover, the following hypothesis was validated, namely: "It is possible to design a DSL to facilitate the development of SDR applications and their deployment on HCAs without significant degradation of software performance, and with possible improvement in the automatically emitted low-level source code quality." It was validated by: 1) Defining the OptiSDR attributes such as parallel DSP patterns and dataflow MoCs; 2) Providing parameterizable SDR modules with automatic parallelization and optimization for performance and accuracy; and 3) Presenting a set of intuitive validation constructs for accuracy testing using root-mean-square error, and functional verification of DSP using two-dimensional graphics plotting for radar and real-time spectral analysis plots.
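One of the benchmarked kernels, FFT-based cross-correlation, rests on the standard correlation theorem that a DSL like OptiSDR lowers to parallel GPU code: correlate in O(n log n) by multiplying one spectrum with the conjugate of the other. An illustrative NumPy version (not the OptiSDR implementation, and sequential rather than GPU-parallel) is:

```python
import numpy as np

def xcorr_fft(a, b):
    """Full linear cross-correlation of two real sequences via
    zero-padded FFTs: IFFT(FFT(a) * conj(FFT(b))), with the lags
    rolled into the same order numpy.correlate uses."""
    n = len(a) + len(b) - 1
    A = np.fft.rfft(a, n)
    B = np.fft.rfft(b, n)
    c = np.fft.irfft(A * np.conj(B), n)
    return np.roll(c, len(b) - 1)  # negative lags first, like 'full' mode

a = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
b = np.array([1.0, 2.0, 1.0])

fast = xcorr_fft(a, b)
slow = np.correlate(a, b, mode="full")  # O(n^2) reference
print(np.allclose(fast, slow))  # True
```

For long sequences, such as radar pulse compression against a reference chirp, the FFT route is what makes the GFLOPS figures quoted above attainable; the direct O(n²) form quickly becomes the bottleneck.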
  • Item
    Metadata only
    EEE4084F Digital Systems
    (2013) Winberg, Simon
The objective of this course is for students to develop an understanding of the concepts involved in the design and development of high-performance and special-purpose digital computing systems. The course involves lectures in a standard lecture venue. Projects and pracs are done using computers and other hardware in a laboratory. Presentation slides and the assignments are available on the publicly accessible website for this course. Correspondence and assistance with assignments are provided by the lecturer, tutors and students via a Google Group. Some recorded lectures and tutorials are available on the course website as open access resources to assist in students' learning and completion of the pracs.