
Browsing by Author "Nicolls, Frederick"

  • Active object recognition for 2D and 3D applications (Open Access)
    (2015) Govender, Natasha; Nicolls, Frederick
    Active object recognition provides a mechanism for selecting informative viewpoints to complete recognition tasks as quickly and accurately as possible. One can manipulate the position of the camera or the object of interest to obtain more useful information. This approach can improve the computational efficiency of the recognition task by only processing viewpoints selected based on the amount of relevant information they contain. Active object recognition methods are based around how to select the next best viewpoint and the integration of the extracted information. Most active recognition methods do not use local interest points, which have been shown to work well in other recognition tasks, and are tested on images containing a single object with no occlusions or clutter. In this thesis we investigate using local interest points (SIFT) in probabilistic and non-probabilistic settings for active single and multiple object and viewpoint/pose recognition. The test images used contain objects that are occluded and occur in significant clutter. Visually similar objects are also included in our dataset. Initially we introduce a non-probabilistic 3D active object recognition system which consists of a mechanism for selecting the next best viewpoint and an integration strategy to provide feedback to the system. A novel approach to weighting the uniqueness of the extracted features is presented, using a vocabulary tree data structure. This process is then used to determine the next best viewpoint by selecting the one with the highest number of unique features. A Bayesian framework uses the modified statistics from the vocabulary structure to update the system's confidence in the identity of the object. New test images are only captured when the belief hypothesis is below a predefined threshold. This vocabulary tree method is tested against randomly selecting the next viewpoint and against a state-of-the-art active object recognition method by Kootstra et al. Our approach outperforms both methods by correctly recognizing more objects with less computational expense. The vocabulary tree method is extended for use in a probabilistic setting to improve the object recognition accuracy. We introduce Bayesian approaches for object recognition and for joint object and pose recognition. Three likelihood models are introduced which incorporate various parameters and levels of complexity. The occlusion model, which includes geometric information and variables that cater for the background distribution and occlusion, correctly recognizes all objects on our challenging database. This probabilistic approach is further extended for recognizing multiple objects and poses in a test image. We show through experiments that this model can recognize multiple objects which occur in close proximity to distractor objects. Our viewpoint selection strategy is also extended to the multiple object application and performs well when compared to randomly selecting the next viewpoint, the activation model and mutual information. We also study the impact of using active vision for shape recognition. Fourier descriptors are used as input to our shape recognition system, with mutual information as the active vision component. We build multinomial and Gaussian distributions using this information, which correctly recognize a sequence of objects. We demonstrate the effectiveness of active vision in object recognition systems, showing that even in different recognition applications using different low-level inputs, incorporating active vision improves the overall accuracy and decreases the computational expense of object recognition systems.
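To make the belief-update loop described in this abstract more concrete, here is a minimal, hedged sketch of Bayesian next-best-view selection: a posterior over object identities is updated from per-viewpoint likelihoods, and the next view is the one with the most unique features. The likelihood table, feature counts, objects, and threshold below are invented for illustration and are not taken from the thesis.

```python
import numpy as np

# Hypothetical per-viewpoint likelihoods P(observation | object) for 3 candidate
# objects and 4 viewpoints; in the thesis these would come from weighted SIFT
# matches scored through a vocabulary tree. Values here are made up.
likelihoods = np.array([
    [0.70, 0.20, 0.10],   # viewpoint 0
    [0.55, 0.30, 0.15],   # viewpoint 1
    [0.60, 0.25, 0.15],   # viewpoint 2
    [0.80, 0.15, 0.05],   # viewpoint 3
])
# Hypothetical count of "unique" features visible from each remaining viewpoint,
# used to pick the next best view (highest uniqueness first).
unique_feature_counts = {0: 12, 1: 30, 2: 18, 3: 25}

belief = np.full(3, 1.0 / 3.0)        # uniform prior over object identities
threshold = 0.95                       # stop when the maximum belief exceeds this
remaining = set(unique_feature_counts)

while remaining and belief.max() < threshold:
    # Next best view: the remaining viewpoint with the most unique features.
    v = max(remaining, key=unique_feature_counts.get)
    remaining.remove(v)
    # Bayesian update: posterior is proportional to likelihood times prior.
    belief = likelihoods[v] * belief
    belief /= belief.sum()
    print(f"after viewpoint {v}: belief = {np.round(belief, 3)}")

print("recognised object:", int(belief.argmax()))
```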
  • Classification of Fallow and Perennial Fields in High-Resolution Multispectral Aerial Images (Open Access)
    (2021) Akhoury, Sharat Saurabh; Nicolls, Frederick
    Increased cultivation of perennial fields hardens the water demand of the agricultural sector during drought events. It is therefore important to detect and track these fields to better plan for drought mitigation and response strategies. Remote sensing offers an effective means by which this can be accomplished. An interesting and challenging problem arises in cases where remotely sensed perennial fields are readily confused with ill-maintained, abandoned and weed-infested fallow fields. The spectral responses of such cases are highly correlated, hence conventional remote sensing indicators fail to discriminate between these two terrains. The work undertaken in this research attempts to address this problem by applying machine learning-based solutions for providing accurate and scalable valuations of perennial and fallow fields using high-resolution multi-spectral remote sensing data. The distinctive uniform grid-like appearance of perennial acreage motivated the use of a texture-based classification approach. Two different texture classification methods are developed, namely a pixel-based image analysis texture segmentation framework (TSF) approach and a statistical vocabulary learning-based approach referred to as the Varma-Zisserman classifier (VZC). In the first approach, the texture classification problem is reformulated as a texture segmentation problem in which each pixel in the image is individually labelled by training a classifier on the texture feature space. In the second approach, a set of images is used to generate a texton dictionary from which exemplar texture probabilistic models are learnt. Three transform-based techniques are applied for computing texture features. Experimental results validate that a texture-based machine learning approach is able to successfully discriminate between fallow and perennial land cover, with an error rate ranging between 6.6% (TSF) and 16.8% (VZC). The pixel-based image analysis approach is found to be more conducive to classifying homogeneous land cover types that have high interclass spectral reflectance overlap. A comprehensive multi-classifier experiment indicates that ensemble-based classifiers (such as random forests and AdaBoost) and instance-based classifiers (such as k-nearest neighbours) are better suited to identifying agricultural land covers with correlated spectral responses. These classifiers yield precision and recall scores ≥ 90% with error rates less than 10%. It is shown that a deterministic sampling technique such as striding can greatly reduce the learning time as well as the model size without compromising classification accuracy, precision and recall.
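For readers unfamiliar with the texton-dictionary idea behind the Varma-Zisserman classifier mentioned above, the following is a minimal, self-contained sketch (not the thesis code): filter responses are clustered into textons, each training image becomes a texton-frequency histogram, and a test image takes the label of the nearest class histogram. The toy filter bank, synthetic patches, and cluster count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def filter_responses(img):
    # Toy "filter bank": horizontal/vertical gradients plus raw intensity,
    # stacked per pixel into a 3-D response vector (a stand-in for the
    # transform-based texture features mentioned in the abstract).
    gy, gx = np.gradient(img.astype(float))
    return np.stack([gx, gy, img], axis=-1).reshape(-1, 3)

def kmeans(X, k, iters=20):
    # Plain k-means to build the texton dictionary.
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        centres = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centres[j] for j in range(k)])
    return centres

def texton_histogram(img, textons):
    R = filter_responses(img)
    labels = np.argmin(((R[:, None] - textons[None]) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(textons)) / len(labels)

# Placeholder "perennial" (grid-like) and "fallow" (noise) training patches.
perennial = [np.tile(np.kron(np.eye(2), np.ones((4, 4))), (4, 4)) +
             0.1 * rng.standard_normal((32, 32)) for _ in range(5)]
fallow = [rng.standard_normal((32, 32)) for _ in range(5)]

textons = kmeans(np.vstack([filter_responses(p) for p in perennial + fallow]), k=8)
models = {"perennial": np.mean([texton_histogram(p, textons) for p in perennial], 0),
          "fallow": np.mean([texton_histogram(p, textons) for p in fallow], 0)}

test = np.tile(np.kron(np.eye(2), np.ones((4, 4))), (4, 4))  # unseen grid-like patch
hist = texton_histogram(test, textons)
pred = min(models, key=lambda c: ((hist - models[c]) ** 2).sum())
print("predicted class:", pred)
```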
  • Constraints and invariance in target detection (Open Access)
    (2000) Nicolls, Frederick
    The concept of invariance in hypothesis testing is discussed for purposes of target detection. Invariant tests are proposed and analysed in two contexts. The first involves the use of cyclic permutation invariance as a solution to detecting targets with unknown location in noise. An invariance condition is used to eliminate the target location parameter, and a uniformly most powerful test is developed for the reduced data. The test is compared with conventional solutions and shown to be more powerful. The difference, however, is slight, justifying the simpler formulations. This conclusion continues to hold even when additional unknown noise parameters are introduced.
  • Development of scalable hybrid switching-based software-defined networking using reinforcement learning (Open Access)
    (2024) Blose, Max; Akinyemi, Lateef Adesola; Nicolls, Frederick
    As global internet traffic continues to grow exponentially, there is a growing need for cutting-edge switching technologies to manage this growth. One of the most recent innovations is Software Defined Networking (SDN), which refers to the separation of the infrastructure layer from the logically centralized control layer. SDN is a cutting-edge networking approach that provides network agility, programming flexibility, and enhanced network performance over traditional switching networks. Even though SDN has some great benefits, there is a need to address and manage scalability challenges to guarantee optimal, scalable, and rapid data traffic switching within Service Provider network infrastructure, including Data Centre environments. These scalability issues are inherent to SDN's logically centralized control layer. Whenever a packet belonging to a new flow has to be transported, the OpenFlow switch has to interact with the logically centralized SDN controller through the southbound OpenFlow Application Programming Interface. This results in an increase in communication overhead between the two instances. The control layer overhead traffic can impede scalability due to the controller's limited processing memory. There is therefore a strong incentive to enhance the scalability of SDN operations. We address the identified SDN scalability issues by creating a scalable hybrid switching solution using machine learning algorithms. We propose an SDN OpenFlow model switch which collaborates with the traditional switch to form a scalable framework of Hybrid Routing with Reinforcement Learning (sHRRL). We implement a reinforcement learning algorithm to randomly explore new routes and discover the most optimal path through Q-learning. This simple, model-free form of reinforcement learning uses the Markov Decision Process and the Bellman equation to iteratively update the Q-values in a Q-table for every transition in the network environment state, until the Q-function has converged to the best Q-values. A greedy strategy is employed to guide the reinforcement learning agent in selecting the most suitable Q-values from the Q-table. To ensure that the machine learning algorithm is able to discover a sufficient number of possible routes and has a sufficient understanding of the network environment, sufficient training and evaluation episodes should be conducted. The proposed hybrid switching methodology was benchmarked against the standard SDN OpenFlow switch in terms of network performance metrics, including average throughput, packet exchange transmission rates, CPU load, and delay, to compare the two switching approaches. When statistically comparing the test results, it was observed that the number of packets exchanged by the hybrid switch was more than sixty percent greater than for the OpenFlow switch, which saturated first. The average throughput results demonstrate that the hybrid switching routing scheme achieves high throughput. The first type of switch to reach saturation is the OpenFlow switch, as it does not explore all available paths. Consequently, the hybrid switch is more efficient than the OpenFlow switch when it comes to CPU load: the average CPU load for the OpenFlow switch is fifteen percent (15%) higher than for the hybrid switch. Our analysis of the simulation data suggests that the Q-learning-based reinforcement learning framework, sHRRL, enhances the performance of the hybrid switch when compared to the OpenFlow switch. We are therefore of the opinion that the proposed hybrid switching model, utilizing machine learning algorithms, can address the scalability issues in the design of SDN controller networks, particularly in data centre environments where high switching speeds are of paramount importance.
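As a toy illustration of the Q-learning component described in this abstract (not the sHRRL implementation), the sketch below learns a route on a small made-up network graph using the standard Bellman-style update and epsilon-greedy exploration; the topology, rewards, and hyperparameters are all assumptions.

```python
import random

# Hypothetical network: node -> reachable neighbour nodes.
links = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
SRC, DST = "A", "E"
alpha, gamma, eps = 0.5, 0.9, 0.2     # learning rate, discount, exploration rate

Q = {(u, v): 0.0 for u in links for v in links[u]}   # Q-table over (node, next hop)

for episode in range(500):            # training episodes
    node = SRC
    while node != DST:
        # Epsilon-greedy selection over the current node's next hops.
        if random.random() < eps:
            nxt = random.choice(links[node])
        else:
            nxt = max(links[node], key=lambda v: Q[(node, v)])
        reward = 100.0 if nxt == DST else -1.0      # goal bonus vs per-hop cost
        best_next = 0.0 if nxt == DST else max(Q[(nxt, w)] for w in links[nxt])
        # Bellman update of the Q-value for this (state, action) pair.
        Q[(node, nxt)] += alpha * (reward + gamma * best_next - Q[(node, nxt)])
        node = nxt

# Greedy rollout of the learned policy.
path, node = [SRC], SRC
while node != DST:
    node = max(links[node], key=lambda v: Q[(node, v)])
    path.append(node)
print("learned route:", " -> ".join(path))
```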
  • Discriminative training of hidden Markov Models for gesture recognition (Open Access)
    (2018) Combrink, Jan Hendrik; Nicolls, Frederick
    As homes and workplaces become increasingly automated, an efficient, inclusive and language-independent human-computer interaction mechanism will become more necessary. Isolated gesture recognition can be used to this end. Gesture recognition is a problem of modelling temporal data. Non-temporal models can be used for gesture recognition, but require that the signals be adapted to the models; for example, support-vector machine classification requires fixed-length inputs. Hidden Markov models are probabilistic graphical models that were designed to operate on time-series data and are invariant to sequence length. However, in traditional hidden Markov modelling, models are trained via the maximum likelihood criterion and cannot perform as well as a discriminative classifier. This study employs minimum classification error training to produce a discriminative HMM classifier. The classifier is then applied to an isolated gesture recognition problem, using skeletal features. The Montalbano gesture dataset is used to evaluate the system on the skeletal modality alone. This positions the problem as one of fine-grained dynamic gesture recognition, as the hand pose information contained in the other modalities is ignored. The method achieves a best accuracy of 87.3%, comparable to other results reported on the Montalbano dataset using discriminative non-temporal methods. The research shows that discriminative hidden Markov models can be used successfully as a solution to the problem of isolated gesture recognition.
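To make the minimum classification error (MCE) criterion concrete, here is a small hedged sketch (not the dissertation's training code): given per-class HMM log-likelihood scores for one training sample, it computes the usual soft misclassification measure and its sigmoid loss, the quantity that discriminative training minimises. The scores and smoothing constants are illustrative.

```python
import numpy as np

def mce_loss(scores, true_class, eta=2.0, gamma=1.0, theta=0.0):
    """Smoothed MCE loss for one training token.

    scores: discriminant values g_k(x), e.g. per-class HMM log-likelihoods;
    true_class: index of the correct class.
    """
    g_true = scores[true_class]
    rivals = np.delete(scores, true_class)
    # Soft maximum over competing classes (eta controls how closely it
    # approximates the single best rival).
    anti = (1.0 / eta) * np.log(np.mean(np.exp(eta * rivals)))
    d = -g_true + anti                                    # misclassification measure
    return 1.0 / (1.0 + np.exp(-gamma * (d + theta)))     # sigmoid loss in (0, 1)

# Illustrative per-class HMM log-likelihoods for one gesture sample.
scores = np.array([-120.3, -118.7, -131.0])   # class 1 scores highest here
print("loss if true class is 1:", round(mce_loss(scores, 1), 4))  # small loss
print("loss if true class is 2:", round(mce_loss(scores, 2), 4))  # loss near 1
```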
  • Enhancing cross-dataset performance in distracted driver detection using body part activity recognition (Open Access)
    (2024-05) Zandamela, Frank; Nicolls, Frederick; Stoltz, Gene
    Detecting distracted drivers is a crucial task, and the literature proposes various deep learning-based methods. Among these methods, convolutional neural networks dominate because they can extract and learn image features automatically. However, even though existing methods have reported remarkable results, the cross-dataset performance of these methods remains unknown. This is a problem because cross-dataset performance is often an indicator of a model's generalisation ability, and without knowing it, deployment in the real world could result in catastrophic events. This thesis investigates the generalisation ability of deep learning-based distracted driver detection methods. In addition, a robust distracted driver detection approach is proposed. The proposed approach is based on recognising distinctive activities of the human body parts involved when a driver is operating a vehicle. Representative state-of-the-art deep learning-based methods have been trained exclusively on three widely used image datasets and evaluated across the test sets of these datasets. Experimental results reveal that current deep learning-based methods for detecting distracted drivers do not generalise well to unknown datasets, particularly for convolutional neural network (CNN) models that use the entire image for prediction. In addition, the experiments indicated that although current distracted driver detection datasets are relatively large, they lack diversity. The proposed approach was implemented using a state-of-the-art object detection algorithm called YOLOv7. The cross-dataset performance of the implemented approach was evaluated on three benchmark datasets and a custom dataset. Experimental results demonstrate that the proposed approach improves cross-dataset performance: a cross-dataset accuracy improvement of 7.8% was observed and, most importantly, the overall balanced (F1-score) performance was improved by a factor of 2.68. The experimental results also revealed that although the proposed approach demonstrates commendable performance on a custom test set, all algorithms encountered challenges when dealing with the custom test set, mainly due to lower image quality and difficult lighting conditions. The thesis presents two main contributions. Firstly, it evaluates the performance of current deep learning-based distracted driver detection algorithms across different datasets. Secondly, it proposes a robust algorithm for detecting distracted drivers by identifying key human body parts involved in operating a vehicle.
  • Estimating phytoplankton size classes from their inherent optical properties (Open Access)
    (2019) Berliner, David Stephen; Nicolls, Frederick
    Phytoplankton plays a major role in the regulation of greenhouse gases, with different functional types affecting the carbon cycle differently. The most practical way of synoptically mapping the ocean's phytoplankton communities is through remote sensing with the aid of ocean-optics algorithms. This thesis is a study of the relationships between the Inherent Optical Properties (IOPs) of the ocean and the physical constituents within it, with a special focus on deriving phytoplankton size classes. Three separate models were developed, each focusing on a different relationship between absorption and phytoplankton size classes, before being combined into a final ensemble model. It was shown that all of the developed models performed better than the baseline model, which only estimates the mean values per size class, and that the final ensemble model is comparable to, and performs better than, most other published models on the NOMAD dataset.
  • Hartbeesthoek Radio Astronomy Observatory (HartRAO) Pioneering Advantage: Exploring Scientific Instrumentation Co-Location Benefits in Ensuring African Research Facility Sustainability (Open Access)
    (2023) Madlanga, Simphiwe; Nicolls, Frederick
    This study focuses on the prospect of developing radio astronomy facilities with instrument co-location capacity in the eight Square Kilometre Array (SKA) African Very Long Baseline Interferometry Network (AVN) partner countries. The premise of the research is that the Hartebeesthoek Radio Astronomy Observatory (HartRAO) has laid down a blueprint and is well suited as a proof of concept of what may be possible for the other countries. The technology gap between HartRAO in South Africa and the planned sites in the other countries is far smaller than between those countries and any others beyond the borders of the African continent. The fact that the countries of interest are all in sub-Saharan Africa fosters a level of mutual understanding where developmental challenges are concerned, but some of the unique challenges are not to be underestimated. The research is qualitative and has been framed through the perspective of subjectivism and interpretivism, in order to do justice to the regional contexts and the unique realities and circumstances in the development narrative. Research and data collection are pursued through the heterodox avenue of archival research: the collection and use of official organizational reports, archived material and digital resources. Afri lagging, and no nation has been able to achieve much alone. The research shows how a large scientific project to introduce research capacity and infrastructure can be leveraged to respond to and incorporate some of the SDGs, namely SDG 4 (quality education), SDG 9 (industry, innovation and infrastructure) and SDG 17 (partnerships for the goals). Given the fast-approaching 2030 deadline for the S
  • In-Flight control simulation of a proposed, future microsatellite for the African Resource Management Constellation (ARMC) (Open Access)
    (University of Cape Town, 2024) Maongera, Brendon; Nicolls, Frederick
    The African Resource Management Constellation (ARMC) is a group of satellites that provides vegetation monitoring over the African continent. It is operated by the following four countries: Kenya, South Africa, Nigeria and Algeria. The constellation allows the four partner countries to learn to control and build their own satellite systems. The University of Cape Town has been sponsored by a European consortium of academic and industry partners and has received a satellite testbench. The satellite testbench is a fully functional digital twin of the “Flying Laptop” satellite. The simulation testbench is commanded via commercial mission control software and includes a detailed simulation of the satellite and all subsystems. The University of Cape Town testbench can be the realistic nucleus of an ARMC mission. Starting from this setup, a satellite model with improved remote sensing technologies can be defined for South Africa, as well as for the ARMC, with vegetation monitoring as the mission. The constellation should operate at altitudes where there is high atmospheric drag, which will reduce the lifetime of the satellites. An electric propulsion system can be used to restore the satellite to the desired altitude when commanded. The current study aimed to perform a flight simulation of one of the constellation satellites, demonstrating vegetation monitoring over the African continent, and to model the Gecko Imager payload and an electric propulsion system on the testbench. Each model was simulated in the Simulation Third Generation (SimTG) environment, the flight and mission control software were enhanced, and a simulation of the models with the satellite was performed. The research focused only on simulating a South African developed camera product for the payload and an electric propulsion system. The propulsion system was not designed but rather extracted from a previous student's paper. The software was enhanced for both models. The simulated Gecko and electric propulsion system models were developed in SimTG. Each model went through unit-level testing to prove overall functionality. Each model was integrated with a satellite subsystem and that integration was tested. Other subsystem models were edited to accommodate the new models, and the flight software was enhanced for the new models. The mission control system was updated to create telecommands and telemetry packets for the models. The simulation of the models and their integration into the satellite subsystems was successful. The realistic flight simulation of the Gecko Imager was successful: images of South Africa and Kenya were captured during the simulation. The orbit raise manoeuvre, however, was not successful, as the thrust acceleration could not overcome atmospheric drag. All objectives were completed: the enhancement of the flight software and the creation of commanding and telemetry packets on the mission control system were both successful.
  • Kinematic Modeling and Dynamic Aspects of an Accelerating Quadruped (Open Access)
    (2024) Van Der Leek, Casey; Nicolls, Frederick
    Bio-inspired robotics engineers look to the natural world for clues to aspects of motion dynamics and morphologies that may be incorporated into the design of their robots. The mimicking and transfer of these aspects of a live subject to a modern-day robot are limited by the technologies available, such as computational resources, materials engineering, mathematical modeling constraints and efficient systems engineering. With this in mind, a reasonable strategy is to reproduce the functionality of a subject with current technology. A monocular camera and a deep learning algorithm allow non-invasive image pose extraction of an accelerating cheetah subject, which is represented as a mechanism of rigid links interconnected by joints, and this information forms the data basis of subsequent operations. In addition, a non-linear least squares optimiser is formulated and coded specifically for the quadruped robot; it produces estimates of the relative link angles, a base link length and the trajectory of a reference point, so that a three-dimensional configuration evolution of the system is rendered. A secondary consideration is the deployment of inverse kinematics to determine the end effector trajectory of the front leg, both in the real spatial and phase space domains, as well as the angular rates required for these target manifolds. The parameterised inverse kinematics models were also able to generate smooth task space trajectories to within acceptable tolerances of the target position, and for a single, full gait the corresponding joint space trajectories were deemed to be sufficiently closed.
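As a hedged illustration of the inverse-kinematics step mentioned in this abstract, the sketch below solves a generic two-link planar leg (not the actual quadruped model or link lengths from the thesis) for the joint angles that place the foot at a target point, using the standard law-of-cosines solution, and checks the answer with forward kinematics.

```python
import numpy as np

L1, L2 = 0.30, 0.28   # assumed upper/lower link lengths in metres

def inverse_kinematics(x, y, elbow_up=True):
    """Joint angles (q1, q2) of a planar 2-link chain reaching (x, y)."""
    r2 = x * x + y * y
    c2 = (r2 - L1 ** 2 - L2 ** 2) / (2 * L1 * L2)
    if abs(c2) > 1.0:
        raise ValueError("target outside the reachable workspace")
    q2 = np.arccos(c2) * (1 if elbow_up else -1)          # knee angle
    q1 = np.arctan2(y, x) - np.arctan2(L2 * np.sin(q2),   # hip angle
                                       L1 + L2 * np.cos(q2))
    return q1, q2

def forward_kinematics(q1, q2):
    # Position of the end effector (foot) given the two joint angles.
    x = L1 * np.cos(q1) + L2 * np.cos(q1 + q2)
    y = L1 * np.sin(q1) + L2 * np.sin(q1 + q2)
    return x, y

target = (0.25, -0.40)                 # foot position in the hip frame (assumed)
q1, q2 = inverse_kinematics(*target)
print("joint angles (rad):", round(q1, 3), round(q2, 3))
print("reconstructed foot position:", np.round(forward_kinematics(q1, q2), 3))
```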
  • Limited angle tomography (Open Access)
    (2004) De Villiers, Mattieu; De Jager, Gerhard; Nicolls, Frederick
    This thesis investigates the limited angle tomography problem where axial reconstructions are produced from few measured projection views covering a 100° angular range. Conventional full angle tomography requires at least a 180° range of projection views of the patient at a fine angular spacing. Inference techniques presented in the literature, such as Bayesian methods, perform inadequately on the information-starved problem of interest.
  • Methods for multi-spectral image fusion: identifying stable and repeatable information across the visible and infrared spectra (Open Access)
    (2016) Retief, Francois Jacques; Nicolls, Frederick
    Fusion of images captured from different viewpoints is a well-known challenge in computer vision with many established approaches and applications; however, if the observations are captured by sensors also separated by wavelength, this challenge is compounded significantly. This dissertation presents an investigation into the fusion of visible and thermal image information from two front-facing sensors mounted side-by-side. The primary focus of this work is the development of methods that enable us to map and overlay multi-spectral information; the goal is to establish a combined image in which each pixel contains both colour and thermal information. Pixel-level fusion of these distinct modalities is approached using computational stereo methods; the focus is on the viewpoint alignment and correspondence search/matching stages of processing. Frequency domain analysis is performed using a method called phase congruency. An extensive investigation of this method is carried out with two major objectives: to identify predictable relationships between the elements extracted from each modality, and to establish a stable representation of the common information captured by both sensors. Phase congruency is shown to be a stable edge detector and repeatable spatial similarity measure for multi-spectral information; this result forms the basis for the methods developed in the subsequent chapters of this work. The feasibility of automatic alignment with sparse feature-correspondence methods is investigated. It is found that conventional methods fail to match inter-spectrum correspondences, motivating the development of an edge orientation histogram (EOH) descriptor which incorporates elements of the phase congruency process. A cost function, which incorporates the outputs of the phase congruency process and the mutual information similarity measure, is developed for computational stereo correspondence matching. An evaluation of the proposed cost function shows it to be an effective similarity measure for multi-spectral information.
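Since the correspondence cost function above combines phase congruency with mutual information, the following sketch isolates the mutual-information part: a joint-histogram estimate of MI between two image patches. The bin count and synthetic patches are assumptions, and this is not the dissertation's implementation.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """MI between two equally sized image patches via a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                   # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(1)
visible = rng.random((64, 64))
# A "thermal" patch that is intensity-inverted but structurally related, vs. an unrelated one.
thermal_related = 1.0 - visible + 0.05 * rng.standard_normal((64, 64))
thermal_unrelated = rng.random((64, 64))

# A related patch shares far more information than an unrelated one even though the
# intensities are inverted, which is why MI suits multi-spectral matching.
print("MI with related patch:  ", round(mutual_information(visible, thermal_related), 3))
print("MI with unrelated patch:", round(mutual_information(visible, thermal_unrelated), 3))
```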
  • Partitioned particle filtering for target tracking in video sequences (Open Access)
    (2004) Louw, Markus Smuts; de Jager, G; Nicolls, Frederick
    [Pages 9-12, 17 and 18 are missing.] A partitioned particle filtering algorithm is developed to track moving targets exhibiting complex interaction in a static environment in a video sequence. The filter is augmented with an additional scan phase, a deterministic sequence formulated in terms of the recursive Bayesian paradigm, which yields superior results. One partition is allocated to each target object, and a joint hypothesis is made for the simultaneous location of all targets in world coordinates. The observation likelihood is calculated on a per-pixel basis, using Gaussian Mixture Models with sixteen centres trained on the available colour information for each target. Assumptions about the behaviour of each pixel allow the basic pixel classification to be improved under certain circumstances by smoothing with Hidden Markov Models, again on a per-pixel basis. The tracking algorithm produces very good results, both on a complex sequence using highly identifiable targets and on a simpler sequence with natural targets. In each of the scenes, all of the targets were correctly tracked for a very high percentage of the frames in which they were present, and each target loss was followed by a successful reacquisition. Two hundred basic particles were used per partition, with an additional one hundred augmented particles per partition for the scan phase. The algorithm does not run in real time, although with optimization this is a possibility.
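To picture one predict-weight-resample cycle of a bootstrap particle filter, here is a short hedged 1-D toy (not the partitioned, per-pixel-GMM filter described above); the motion model, Gaussian likelihood, and noise levels are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                                   # echoes the 200 basic particles per partition above
particles = rng.normal(0.0, 5.0, N)       # initial guesses of a 1-D target position
true_position = 2.0

for t in range(10):
    true_position += 1.0                                   # target drifts right
    observation = true_position + rng.normal(0.0, 0.5)     # noisy measurement
    particles += 1.0 + rng.normal(0.0, 0.3, N)             # predict: motion model + noise
    # Weight: observation likelihood of each particle (a Gaussian here; the thesis
    # evaluates colour GMMs per pixel instead).
    weights = np.exp(-0.5 * ((observation - particles) / 0.5) ** 2)
    weights /= weights.sum()
    # Resample particles in proportion to their weights (plain multinomial
    # resampling keeps the sketch short).
    particles = particles[rng.choice(N, N, p=weights)]
    print(f"t={t}: estimate={particles.mean():.2f}  truth={true_position:.2f}")
```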
  • Programmable Aperture Photography: An investigation into applications and methods (Open Access)
    (2018) Chiranjan, Ashill; Nicolls, Frederick; Kanjee, Ritesh
    The fields of digital image processing (DIP) and computational photography are ever growing with new focuses on coded aperture imaging and its real-world applications. Research has shown that coded apertures are far superior to traditional circular apertures for various tasks. A variety of coded aperture patterns have been proposed and developed over the years for use in various applications such as defocus deblurring, depth estimation and light field acquisition. Traditional coded aperture masks are constructed from static materials such as cardboard and cannot be altered once their shapes have been defined. These masks are then physically inserted into the aperture plane of a camera-lens system which makes swapping between different patterned masks difficult. This is undesirable as optimal aperture patterns differ depending on application, scene content or imaging conditions and thus would need to be changed quickly and frequently. This dissertation proposes the design and development of a programmable aperture photography camera. The camera makes use of a liquid crystal display (LCD) as a programmable aperture. This allows one to change the aperture shape at a relatively high frame rate. All the benefits and drawbacks of the camera are evaluated. Firstly the task of performing deblurring and depth estimation is tested using existing and optimised aperture patterns on the LCD. A light field is then captured and used to synthesise virtual photographs and perform stereo vision. Thereafter, exposure correction is performed on a scene based on various degrees of illumination. The aperture pattern optimised online based on scene content outperformed generic coded apertures for defocus deblurring. The programmable aperture also performed well for depth estimation using an optimised pattern and existing coded apertures. Using the captured light field, refocused photographs were constructed and stereo vision performed to accurately calculate depth. Finally, the aperture could adjust to the different levels of illumination in the room to provide the correct exposure for image capture. Thus the camera provided all the advantages of traditional coded aperture imaging systems but without the disadvantage of having a static aperture in the aperture plane.
  • Real-time measurement of biaxial tensions using digital image correlation methods (Open Access)
    (2022) Kyd, Haemish; Govender, Reuben Ashley; Nicolls, Frederick
    The mechanical properties of biological materials need to be measured for various applications. A means of inducing biaxial tensions in samples like these is with an inflation or bulge test. Normally the material under test would be measured with displacement gauges; however, under these conditions, where the specimen is soft and, further, where the measurement cycle cannot be reliably paused, a contactless real-time measurement system is necessary to obtain reliable deformation data. Digital Image Correlation (DIC) is one such method. Pioneered in the 1980s, the field has developed from basic 2D displacement measurements to very sophisticated full-field 3D displacement measurement systems. The question becomes whether the current state of the field, as well as the advances in modern technology, can be leveraged to create a usable 3D DIC measurement system that is:
    • usable in a real-time context;
    • portable enough to run these experiments wherever the experiment apparatus is located;
    • cost effective enough to reduce the barrier to entry that the current commercial options present.
    To this end, off-the-shelf components were acquired to form the technology base of the system. The open-source DICe framework, which enabled the necessary level of access to the underlying code base, was implemented on an NVIDIA Jetson Nano single-board computer. Synchronised stereo image acquisition was implemented via an Arducam 12 MP camera system. A stepper-motor-controlled linear drive was used to experimentally investigate the accuracy and speed of the DIC system, for both rigid body motion and deforming targets. A thorough review of the concepts involved in DIC is undertaken, followed by a detailed description of the design and build of the system. Ultimately a set of experiments is executed that shows that, within a set of important constraints, it is indeed possible to run 3D DIC in real time with off-the-shelf, cost-effective components.
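The core of any DIC pipeline is matching a speckle subset between images; below is a minimal hedged sketch of that step using zero-normalised cross-correlation over integer displacements. The real DICe pipeline adds sub-pixel optimisation and stereo triangulation, which this toy omits, and the subset size, search range, and synthetic speckle are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def zncc(a, b):
    """Zero-normalised cross-correlation between two equally sized subsets."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Synthetic speckle pattern and a copy shifted by a known whole-pixel amount.
reference = rng.random((120, 120))
true_shift = (3, -2)                                  # (rows, cols)
deformed = np.roll(reference, true_shift, axis=(0, 1))

# Track one 21x21 subset centred at (60, 60) over a +/-5 pixel search window.
half, search = 10, 5
ref_subset = reference[60 - half:60 + half + 1, 60 - half:60 + half + 1]
best = max(((dr, dc) for dr in range(-search, search + 1)
                     for dc in range(-search, search + 1)),
           key=lambda d: zncc(ref_subset,
                              deformed[60 + d[0] - half:60 + d[0] + half + 1,
                                       60 + d[1] - half:60 + d[1] + half + 1]))
print("recovered displacement (rows, cols):", best, " true:", true_shift)
```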
  • Towards A Single Electron Current On Superfluid Helium (Open Access)
    (2021) Funk, Oliver; Blumenthal, Mark; Nicolls, Frederick
    The aim of this dissertation was to investigate the application of a system of electrons floating above the surface of superfluid helium to the field of single electron transport. Previous work done by Dr Forrest Bradbury at Princeton University (now a collaborator in the group) demonstrated the highly efficient and precise control of packets of electrons floating on the surface of superfluid helium, localised to channels defined in a silicon substrate. Using similar devices and methodologies, the work done in this dissertation investigates whether this modality of electron transport can be effectively applied to deliver a current of single electrons. Single electron devices have numerous applications in the fields of metrology and quantum information processing. They allow for measurements to be made of fundamental quantities, such as the charge of an electron, and further demonstrate various quantum mechanical properties of nature. Presented in this dissertation is the work completed to date, which includes: the design and fabrication of the nanoscale device used to conduct the electrons-on-superfluid-helium experiments, the electronics needed to control the device, and the data acquisition system needed to read various signals off the device. The fabrication was done at Oak Ridge National Labs in the USA. Additionally, a hermetically sealed superfluid cell, designed in collaboration with Dr Jay Amrit from Université Paris-Sud, France, and used to house the device is presented, as well as the probe needed to insert this cell into the dilution fridge. The theory behind the functionality of the device and the way in which it would work is developed. A simulation of the working of the device is presented, as well as the expected measurement quantities. The outlook for continued work in this exciting and very novel physical system is also presented.
  • Viewpoint estimation in medical imaging (Open Access)
    (2024) Hounkanrin, Mahouclo Anicet; Nicolls, Frederick; Amayo, Paul
    In medical imaging, the appearance of a certain body part on a radiograph depends not only on the position but also on the orientation of the X-ray imaging system with respect to the patient. Given a 2D image of a 3D scene, the problem of viewpoint estimation aims to determine the position and the orientation of the imaging sensor that resulted in that view. We investigate methods to solve the viewpoint estimation problem for medical images, notably the determination of orientation parameters. Machine learning models, particularly convolutional neural networks (CNNs), are developed to predict a human subject's orientation in a radiograph. Since deep learning models require data for training, we first generate a dataset of digitally reconstructed radiographs (DRRs) from a set of computed tomography (CT) scans using Fourier volume rendering (FVR). The dataset of DRRs is then used to train CNN models for viewpoint regression and classification. A label-softening strategy is used to improve the performance of the classification models, while a geometric structure-aware cost function is used to account for the geometric continuity of the viewpoint space. Several 3D rotation representations, such as Euler angles, axis-angle, and quaternions, are investigated for viewpoint representation. The results demonstrate that viewpoint estimation in medical imaging can be effectively solved using CNN-based classification and regression models. The geometric structure-aware cost function proves to be essential to the success of classification models for viewpoint estimation. The regression-based models, on the other hand, appear to be sensitive to the type of parametrization used to represent the viewpoints. In particular, the unit quaternion representation of 3D rotations proves to be more effective than other representations for viewpoint regression with CNN models. Moreover, we extend the proposed method to perform viewpoint estimation for natural images. The performance on the PASCAL3D+ dataset indicates that the application of the methods presented is not restricted to medical imaging.
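For the unit-quaternion representation favoured in this abstract, a natural error measure between a predicted and a ground-truth viewpoint is the geodesic angle on SO(3). The hedged sketch below computes it while respecting the q/-q double cover; the sample quaternions are made up and this is not the thesis code.

```python
import numpy as np

def quat_geodesic_angle(q1, q2):
    """Angular distance in radians between two unit quaternions (w, x, y, z)."""
    q1 = np.asarray(q1, float) / np.linalg.norm(q1)
    q2 = np.asarray(q2, float) / np.linalg.norm(q2)
    # q and -q encode the same rotation, so take the absolute dot product.
    d = min(1.0, abs(float(np.dot(q1, q2))))
    return 2.0 * np.arccos(d)

def quat_from_axis_angle(axis, angle):
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    return np.concatenate([[np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis])

# Ground-truth viewpoint: 30 degrees about the z-axis; prediction: 40 degrees.
q_true = quat_from_axis_angle([0, 0, 1], np.radians(30))
q_pred = quat_from_axis_angle([0, 0, 1], np.radians(40))
print("viewpoint error (degrees):",
      round(np.degrees(quat_geodesic_angle(q_true, q_pred)), 2))
# The double cover: -q_pred represents the same rotation, so the error is unchanged.
print("error against -q_pred:    ",
      round(np.degrees(quat_geodesic_angle(q_true, -q_pred)), 2))
```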
  • Visual localisation of electricity pylons for power line inspection (Open Access)
    (2023) Ali, Emmanuel Yahi; Nicolls, Frederick
    Inspection of power infrastructure is a regular maintenance event. To date the inspection process has mostly been done manually, but there is growing interest in automating the process. The automation of the inspection process will require an accurate means of localising the power infrastructure components. In this research, we studied the visual localisation of a pylon. The pylon is the most prominent component of the power infrastructure and can provide a context for the inspection of the other components. Point-based descriptors tend to perform poorly on textureless objects such as pylons, therefore we explored localisation using convolutional neural networks and geometric constraints. The crossings of the pylon, or vertices, are salient points on the pylon. These vertices aid with recognition and pose estimation of the pylon. We successfully used a convolutional neural network for the detection of the vertices. A model-based technique, geometric hashing, was used to establish the correspondence between the stored pylon model and the scene object. We showed the effectiveness of the method as a voting technique to estimate the pose from a single image. In a localisation framework, the method serves as the initialisation of the tracking process. We were able to incorporate an extended Kalman filter for subsequent incremental tracking of the camera relative to the pylon. We also demonstrated an alternative tracking approach using heatmap details from the vertex detection. We successfully demonstrated the proposed algorithms and evaluated their effectiveness using a model pylon we built in the laboratory. Furthermore, we validated the results on a real-world outdoor electricity pylon. Our experiments illustrate that model-based techniques can be deployed as part of the navigation aspect of a robot.
  • Volumetric Medical Classification using Deep Learning: A comparative study on classifying Alzheimer's disease using Convolutional Neural Networks (Open Access)
    (2023) Masson, Richard; Nicolls, Frederick; Son, Jarryd
    This work sets about designing and implementing a number of deep-learning models capable of identifying Alzheimer's disease from MRI brain scans. A common problem with detecting the disease is the difficulty in doing so before outward mental symptoms have begun to show. Therefore, the models attempt to classify both mild and severe cases. The experimental process proves that a problem involving volumetric medical images benefits from the usage of 3D model architecture over traditional 2D architecture. In doing so, however, it is revealed that the 2D models do ultimately perform only slightly below the 3D model. Thus, the 2D approaches hold merit for potential usage, should a 2D planar approach be desired. The paper presents a total of three models. The first is a 3D CNN model, which performs the best in all regards, with a mean accuracy of 81.3%. It is treated as the optimal means of detecting Alzheimer's. The second is a 2D CNN model which uses separate 2D convolution layers to independently train and combine 2D slices across the depth axis. This approach produces a model that only slightly under-performs compared to the 3D model (80% accuracy). The third and final model is a novel design in which a set of models are each trained on a single unique 2D slice of the volume, across a carefully chosen range of slices deemed to contain the most favourable feature data. The model set is then used in unison to make predictions which are then aggregated using a weighted ensemble-voter to produce a final prediction score. This final design scored between the prior two models (80.6%), and establishes itself as a promising model capable of operating on a fraction of the data. Analysis of the models' activation gradients was conducted to confirm that 2D models are able to train well on isolated 2D slices, but struggle to process the space between these slices. Additionally, the work examines and rates the effectiveness of several optional variables in the overall CNN model design, specifically in the context of training on brain scans. A variety of pixel rescaling functions were found to have a noticeable positive impact on overall model performance. Regularization, as well as augmentation in the form of rotation / elastic deformation, also yielded similar improvements on such models, and are thus universally recommended as considerations for any works attempting to solve a similar classification problem. With all this in mind, a final conclusion is made that machine learning and deep learning are promising tools in the medical field for assessing and diagnosing using raw brain scans. For additional reference, the code repository for generating and processing the models is available for viewing. An alternate branch, containing the code used to produce the gradient activation maps, has also been included.
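The weighted ensemble-voting step of the third model can be illustrated with a small hedged sketch (invented per-slice probabilities and weights, not the thesis's trained models): each per-slice 2D CNN outputs a probability of Alzheimer's, and the final decision is a weighted average compared with a threshold.

```python
import numpy as np

# Hypothetical outputs of per-slice 2D CNNs: P(Alzheimer's) for one subject's
# MRI, one value per selected slice along the depth axis.
slice_probs = np.array([0.62, 0.71, 0.55, 0.80, 0.66])

# Hypothetical weights, e.g. reflecting each slice model's validation accuracy,
# normalised to sum to one.
weights = np.array([0.85, 0.90, 0.70, 0.95, 0.80])
weights = weights / weights.sum()

ensemble_score = float(np.dot(weights, slice_probs))   # weighted ensemble vote
prediction = "Alzheimer's" if ensemble_score >= 0.5 else "healthy"
print(f"ensemble score = {ensemble_score:.3f} -> predicted: {prediction}")
```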