Browsing by Author "Nicolls, Fred"
Now showing 1 - 20 of 31
- Item (Open Access): 3D model reconstruction using photoconsistency (2007). Joubert, Kirk Michael; Nicolls, Fred; De Jager, Gerhard. Model reconstruction using photoconsistency refers to a method that creates a photohull, an approximate computer model, using multiple calibrated camera views of an object. The term photoconsistency refers to the concept that is used to calculate the photohull from the camera views. A computer model surface is considered photoconsistent if the appearance of that surface agrees with the appearance of the surface of the real-world object from all camera viewpoints. This thesis presents the work done in implementing some concepts and approaches described in the literature.
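The photoconsistency test described above can be sketched in a few lines: a candidate surface point is accepted when the colours of its projections agree across all views. The variance measure and the threshold below are illustrative assumptions, not the criterion used in the thesis.

```python
import numpy as np

def is_photoconsistent(pixel_colours, threshold=20.0):
    """Declare a surface point photoconsistent if the colours of its
    projections into all camera views agree to within a threshold.

    pixel_colours: (n_views, 3) array of RGB samples, one per view.
    """
    colours = np.asarray(pixel_colours, dtype=float)
    # A simple consistency measure: mean per-channel standard deviation.
    spread = colours.std(axis=0).mean()
    return spread < threshold

# A point seen with nearly the same colour in three views is consistent...
print(is_photoconsistent([[200, 30, 30], [198, 32, 29], [201, 28, 31]]))   # True
# ...while wildly different colours suggest the point is not on the surface.
print(is_photoconsistent([[200, 30, 30], [20, 180, 40], [90, 90, 200]]))   # False
```

Carving away every inconsistent point from an initial volume is what leaves the photohull.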
- Item (Open Access): Aircraft state estimation using cameras and passive radar (2018). De Charmoy, Benjamin; Nicolls, Fred. Multiple target tracking (MTT) is a fundamental task in many application domains. It is a difficult problem to solve in general, so applications make use of domain-specific and problem-specific knowledge to approach the problem by solving subtasks separately. This work puts forward an MTT framework (MTTF) which is based on the Bayesian recursive estimator (BRE). The MTTF extends a particle filter (PF) to handle multiple targets and adds a probabilistic graphical model (PGM) data association stage to compute the mapping from detections to trackers. The MTTF was applied to the problem of passively monitoring airspace. Two applications were built: a passive radar MTT module and a comprehensive visual object tracking (VOT) system. Both applications require a solution to the MTT problem, for which the MTTF was utilized. The VOT system performed well on real data recorded at the University of Cape Town (UCT) as part of this investigation. The system was able to detect and track aircraft flying within the region of interest (ROI). The VOT system consisted of a single camera, an image processing module, the MTTF module and an evaluation module. The world coordinate frame target localization was within ±3.2 km, and these results are presented on Google Earth. The image plane target localization has an average reprojection error of ±17.3 pixels. The VOT system achieved an average area under the curve value of 0.77 for all receiver operating characteristic curves. These performance figures are typical over the ±1 hr of video recordings taken from the UCT site. The passive radar application was tested on simulated data. The MTTF module was designed to connect to an existing passive radar system developed by Peralex Electronics Pty Ltd.
The MTTF module estimated the number of targets in the scene and localized them within a 2D local world Cartesian coordinate system. The investigations encompass numerous areas of research as well as practical aspects of software engineering and systems design.
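The Bayesian recursive estimator underlying the MTTF can be sketched, for a single 1D target, as one predict-weight-resample cycle of a particle filter. The random-walk motion model, Gaussian measurement model and their noise parameters below are illustrative assumptions, not the thesis's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, detection,
                         motion_std=1.0, meas_std=2.0):
    """One predict-update-resample cycle of a Bayesian recursive
    estimator for a single 1D target position (illustrative only)."""
    # Predict: propagate each particle through a random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: reweight particles by the likelihood of the detection.
    likelihood = np.exp(-0.5 * ((detection - particles) / meas_std) ** 2)
    weights = weights * likelihood
    weights = weights / weights.sum()
    # Resample: draw a new particle set proportional to the weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

particles = rng.normal(0.0, 5.0, size=500)
weights = np.full(500, 1.0 / 500)
for detection in [10.0, 10.5, 11.0]:
    particles, weights = particle_filter_step(particles, weights, detection)
print(round(particles.mean(), 1))  # estimate has moved towards the detections
```

Handling multiple targets additionally requires the data association stage the abstract describes, to decide which detection updates which tracker.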
- Item (Open Access): Automated 3D reconstruction of Lodox Statscan images for forensic application (2011). Bolton, Frank; Nicolls, Fred. The main objectives of this project are to perform tomographic reconstruction with manually scanned projection data from a Lodox Statscan full-body digital radiography system, and to produce tools to allow automated generation of the information required to perform the tomographic reconstruction.
- Item (Open Access): Automatic 2D-to-3D conversion of single low depth-of-field images (2017). Reddy, Serendra; Nicolls, Fred. This research presents a novel approach to the automatic rendering of 3D stereoscopic disparity image pairs from single 2D low depth-of-field (LDOF) images. Initially a depth map is produced through the assignment of depth to every delineated object and region in the image. Subsequently the left and right disparity images are produced through depth image-based rendering (DIBR). The objects and regions in the image are initially assigned to one of six proposed groups or labels. Labelling is performed in two stages. The first involves the delineation of the dominant object-of-interest (OOI). The second involves the global object and region grouping of the non-OOI regions. The matting of the OOI is also performed in two stages. Initially the in-focus foreground or region-of-interest (ROI) is separated from the out-of-focus background. This is achieved through the correlation of edge, gradient and higher-order statistics (HOS) saliencies. Refinement of the ROI is performed using k-means segmentation and CIEDE2000 colour-difference matching. Subsequently the OOI is extracted from within the ROI through analysis of the dominant gradients and edge saliencies together with k-means segmentation. Depth is assigned to each of the six labels by correlating Gestalt-based principles with vanishing point estimation, gradient plane approximation and depth from defocus (DfD). To minimise some of the dis-occlusions that are generated through the 3D warping sub-process within the DIBR process, the depth map is pre-smoothed using an asymmetric bilateral filter. Hole-filling of the remaining dis-occlusions is performed through nearest-neighbour horizontal interpolation, which incorporates depth as well as direction of warp.
To minimise the effects of the lateral striations, specific directional Gaussian and circular averaging smoothing is applied independently to each view, with additional average filtering applied to the border transitions. Each stage of the proposed model is benchmarked against data from several significant publications. Novel contributions are made in the sub-speciality fields of ROI estimation, OOI matting, LDOF image classification, Gestalt-based region categorisation, vanishing point detection, relative depth assignment and hole-filling or inpainting. An important contribution is made towards the overall knowledge base of automatic 2D-to-3D conversion techniques, through the collation of existing information, expansion of existing methods and development of newer concepts.
- Item (Open Access): Automatic detection and segmentation of brain lesions from 3D MR and CT images (2014). Mokhomo, Molise; Nicolls, Fred; De Jager, Gerhard; Muller, N. The detection and segmentation of brain pathologies in medical images is a vital step which helps radiologists to diagnose a variety of brain abnormalities and set up a suitable treatment. A number of institutes such as iThemba LABS still rely on a manual identification of abnormalities. Manual identification is labour intensive and tedious due to the large amount of medical data to be processed and the presence of small lesions. This thesis discusses the possible methods that can be used to address the problem of brain abnormality segmentation in MR and CT images. The methods are general enough to segment different types of abnormalities. The first method is based on the symmetry of the brain while the second method is based on a brain atlas. The symmetry-based method assumes that healthy brain tissues are symmetrical in nature while abnormal tissues are asymmetric with respect to the symmetry plane dividing the brain into similar hemispheres. The three major steps involved in this approach are symmetry detection, tilt correction and asymmetry quantification. The method used to determine the brain symmetry automatically is discussed and its accuracy has been validated against the ground truth using mean angular error (MAE) and distance error (DE). Two asymmetry quantification methods are studied and validated on real and simulated patients' T1- and T2-weighted MR images with low- and high-grade gliomas using true positive volume fraction (TPVF), false positive volume fraction (FPVF) and false negative volume fraction (FNVF). The atlas-based method is also presented and relies on the assumption that abnormal brain tissues appear with intensity values different from those of the surrounding healthy tissues.
To detect and segment brain lesions, the test image is aligned onto the atlas space and voxel-by-voxel analysis is performed between the atlas and the registered image. This method is also evaluated on the simulated T1-weighted patient dataset with simulated low- and high-grade gliomas. The atlas, containing prior knowledge of normal brain tissues, is built from a set of healthy subjects.
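The voxel-by-voxel comparison against the atlas can be sketched on a toy 1D "volume": each voxel is flagged when its intensity deviates too far from the atlas of healthy anatomy. The z-score test and threshold here are illustrative choices, not the thesis's exact analysis.

```python
import numpy as np

def detect_lesions(registered_image, atlas_mean, atlas_std, z_thresh=3.0):
    """Flag voxels whose intensity deviates from the atlas of healthy
    tissue by more than z_thresh standard deviations (illustrative)."""
    z = np.abs(registered_image - atlas_mean) / np.maximum(atlas_std, 1e-6)
    return z > z_thresh

# Toy 1D "volume": the atlas says healthy tissue is ~100 +/- 5 everywhere.
atlas_mean = np.full(8, 100.0)
atlas_std = np.full(8, 5.0)
image = np.array([101., 99., 100., 140., 138., 100., 102., 98.])
print(detect_lesions(image, atlas_mean, atlas_std).astype(int))
# [0 0 0 1 1 0 0 0]: the two bright voxels are flagged as lesion.
```

In practice the test image must first be non-rigidly registered to the atlas space, which is the hard part of the pipeline.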
- Item (Open Access): Automation of region specific scanning for real time medical systems (2012). Wong, Denis Kow Son; Nicolls, Fred. X-rays have played a vital role in both the medical and security sectors. However, there is a limit to the amount of radiation a body can receive before it becomes a health risk. Modern low-dose x-ray devices operate using a c-arm which moves across the entire human body. This research reduces the radiation applied to the human body by isolating the region that needs exposure. The medical scanner that this work is based on is still under development, and therefore a prototype of the scanner is developed for running simulations. A camera is attached to the prototype and used to point out the regions that are required to be scanned. This is both faster and more accurate than the traditional method of manually specifying the areas.
- Item (Open Access): Calibration, recognition, and shape from silhouettes of stones (2007). Forbes, Keith; Nicolls, Fred; De Jager, Gerhard. Multi-view shape-from-silhouette systems are increasingly used for analysing stones. This thesis presents methods to estimate stone shape and to recognise individual stones from their silhouettes. Calibration of two image capture setups is investigated. First, a setup consisting of two mirrors and a camera is introduced. Pose and camera internal parameters are inferred from silhouettes alone. Second, the configuration and calibration of a high-throughput multi-camera setup is covered. Multiple silhouette sets of a stone are merged into a single set by inferring relative poses between sets. This is achieved by adjusting pose parameters to maximise geometrical consistency specified by the epipolar tangency constraint. Shape properties (such as volume, flatness, and elongation) are inferred more accurately from the merged silhouette sets than from the original silhouette sets. Merging is used to recognise individual stones from pairs of silhouette sets captured on different occasions. Merged sets with sufficient geometrical consistency are classified as matches (produced by the same stone), whereas inconsistent sets are classified as mismatches. Batch matching is the task of determining the one-to-one correspondence between two unordered batches of silhouette sets of the same batch of stones. A probabilistic framework is used to combine recognition by merging (which is slow, but accurate) with the efficiency of computing shape distribution-based dissimilarity values. Two unordered batches of 1200 six-view silhouette sets of uncut gemstones are correctly matched in approximately 68 seconds (using a 3.2 GHz Pentium 4 machine). An experiment that compares silhouette-based shape estimates with mechanical sieving demonstrates an application using the developed methods. A batch of 494 garnets is sieved 15 times.
After each sieving, silhouette sets are captured for sub-batches in each bin. Batch matching is used to determine the 15 sieve bins per stone. Better estimates of repeatability, and a better understanding of the variability of the sieving process, are obtained than if only histograms (the natural output of sieving) were considered. Silhouette-based sieve emulation is found to be more repeatable than mechanical sieving.
- Item (Open Access): The characterisation and automatic classification of transmission line faults (2014). Minnaar, Ulrich; Gaunt, C T; Nicolls, Fred. A country's ability to sustain and grow its industrial and commercial activities is highly dependent on a reliable electricity supply. Electrical faults on transmission lines are a cause of both interruptions to supply and voltage dips. These are the most common events impacting electricity users and also have the largest financial impact on them. This research focuses on understanding the causes of transmission line faults and developing methods to automatically identify these causes. Records of faults occurring on the South African power transmission system over a 16-year period have been collected and analysed to find statistical relationships between local climate, key design parameters of the overhead lines and the main causes of power system faults. The results characterize the performance of the South African transmission system on a probabilistic basis and illustrate differences in fault cause statistics for the summer and winter rainfall areas of South Africa and for different times of the year and day. This analysis lays a foundation for reliability analysis and fault pattern recognition taking environmental features such as local geography, climate and power system parameters into account. A key aspect of using pattern recognition techniques is selecting appropriate classifying features. Transmission line fault waveforms are characterised by instantaneous symmetrical component analysis to describe the transient and steady state fault conditions. The waveform and environmental features are used to develop single nearest neighbour classifiers to identify the underlying cause of transmission line faults. A classification accuracy of 86% is achieved using a single nearest neighbour classifier.
This classification performance is found to be superior to that of decision tree, artificial neural network and naïve Bayes classifiers. The results achieved demonstrate that transmission line faults can be automatically classified according to cause.
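A single nearest neighbour classifier of the kind used above is simple to sketch: an unseen fault is assigned the cause of its closest training example in feature space. The two features and the fault-cause labels below are hypothetical stand-ins, not the waveform and environmental features of the thesis.

```python
import numpy as np

def nearest_neighbour_classify(x, train_X, train_y):
    """Assign x the label of its single nearest training example
    (Euclidean distance), as in a 1-NN fault-cause classifier."""
    d = np.linalg.norm(train_X - x, axis=1)
    return train_y[int(np.argmin(d))]

# Hypothetical 2D features (e.g. fault duration, hour of day) and causes.
train_X = np.array([[0.1, 2.0], [0.2, 2.2], [5.0, 14.0], [5.5, 13.5]])
train_y = np.array(["lightning", "lightning", "fire", "fire"])
print(nearest_neighbour_classify(np.array([0.15, 2.1]), train_X, train_y))  # lightning
print(nearest_neighbour_classify(np.array([5.2, 14.2]), train_X, train_y))  # fire
```

In practice the features would be scaled so that no one feature dominates the distance computation.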
- Item (Open Access): Coded aperture and coded exposure photography: an investigation into applications and methods (2011). Wilson, Martin; Nicolls, Fred. This dissertation presents an introduction to the field of computational photography, and provides a survey of recent research. Specific attention is given to coded aperture and coded exposure theory and methods, as these form the basis for the experiments performed.
- Item (Open Access): Designing hypothesis tests for digital image matching (2000). Cox, Gregory Sean; Wohlberg, Brendt; Nicolls, Fred; De Jager, Gerhard. Image matching in its simplest form is a two-class decision problem. Based on the evidence in two sensed images, a matching procedure must decide whether they represent two views of the same scene, or views of two different scenes. Previous solutions to this problem were either based on an intuitive notion of image similarity, or were modelled on solutions to the superficially similar problem of target detection in images. This research, in contrast, uses a decision theoretic formulation of the problem, with the image pair as unit of observation and probability of error in the match/mismatch decision as performance criterion. A stochastic model is proposed for the image pair, and the optimal test of match and mismatch hypotheses for samples of this random process is derived. The test is written conveniently in terms of a statistic of the two images and a scalar decision threshold. The analytical advantages of a solution derived from first principles are illustrated with the derivation of hypothesis conditional probability distributions, optimal decision thresholds, and expressions for the probability of error in the decision.
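The decision rule described above reduces to comparing a scalar statistic of the image pair against a threshold. The sketch below uses normalised cross-correlation as a stand-in statistic; the thesis derives its optimal statistic from the stochastic image-pair model, so this choice and the threshold are illustrative only.

```python
import numpy as np

def match_decision(image_a, image_b, threshold):
    """Decide 'match' when a scalar statistic of the image pair exceeds
    a decision threshold. The statistic here is the normalised
    cross-correlation of the two images (an illustrative choice)."""
    a = image_a - image_a.mean()
    b = image_b - image_b.mean()
    statistic = (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())
    return statistic > threshold, statistic

rng = np.random.default_rng(1)
scene = rng.normal(size=(16, 16))
view = scene + rng.normal(scale=0.1, size=scene.shape)   # second view, same scene
other = rng.normal(size=(16, 16))                        # a different scene
print(match_decision(scene, view, 0.5)[0])   # True: same scene
print(match_decision(scene, other, 0.5)[0])  # False: different scenes
```

Choosing the threshold to minimise the probability of error in this decision is exactly the design problem the thesis formalises.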
- Item (Open Access): Digital video moving object segmentation using tensor voting: a non-causal, accurate approach (2009). Guest, Ian; Nicolls, Fred. Motion-based video segmentation is important in many video processing applications such as MPEG4. This thesis presents an exhaustive, non-causal method to estimate boundaries between moving objects in a video clip. It makes use of tensor voting principles. The tensor voting is adapted to allow image structure to manifest in the tangential plane of the saliency map. The technique allows direct estimation of motion vectors from second-order tensor analysis. The tensors make maximal and direct use of the available information by encoding it into the dimensionality of the tensor. The tensor voting methodology introduces a non-symmetrical voting kernel to allow a measure of voting skewness to be inferred. Skewness is found in the third-order tensor in the direction of the tangential first eigenvector. This new concept is introduced as the Tensor Skewness Map or TS map. The TS map gives further information about whether an object is occluding or disoccluding another object. The information can be used to infer the layering order of the moving objects in the video clip. Matched filtering and detection are applied to reduce the TS map into occluding and disoccluding detections. The technique is computationally exhaustive, but may find use in off-line video object segmentation processes. The use of commercial-off-the-shelf Graphics Processor Units is demonstrated to scale well to the tensor voting framework, providing the computational speed improvement required to make the framework realisable on a larger scale and to handle tensor dimensionalities higher than before.
- Item (Open Access): Evaluation of optimal control-based deformable registration model (2014). Matjelo, Naleli Jubert; Braae, Martin; Nicolls, Fred. Deformable image registration is central to many challenges in medical imaging applications. The basic idea of the deformable image registration problem is to find an approximation of a reasonable deformation which transforms one image to match another based on a chosen similarity measure. A reasonable deformation can be thought of as one that is physically realizable. A number of models guaranteeing reasonable deformations have been proposed and implemented with success under various similarity measures. One such model is based on the grid deformation method (GDM) and is the method of interest in this thesis. This work focuses on the evaluation of an optimal control-based model for solving the deformable image registration problem which is formulated using GDM. This model is compared with four other well-known variational deformable image registration models: elastic, fluid, diffusion and curvature models. Using similarity and deformation quality measures as performance indices, the non-dominated sorting genetic algorithm (NSGA-II) is applied to approximate the Pareto fronts for each model to facilitate proper evaluation. The Pareto fronts are also visualized using level diagrams analysis.
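The Pareto fronts used for this evaluation rest on the notion of dominance between objective vectors. A minimal sketch of extracting the non-dominated set (the first front that NSGA-II computes) follows; the (similarity error, deformation irregularity) pairs are hypothetical values, both treated as objectives to minimise.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective (minimisation)
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (similarity error, deformation irregularity) pairs.
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
print(sorted(pareto_front(points)))  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

Comparing models by their whole fronts, rather than by a single weighted score, is what makes the evaluation fair across the two competing performance indices.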
- Item (Open Access): Gesture recognition with application to human-robot interaction (2015). Mangera, Ra'eesah; Nicolls, Fred; Senekal, F. Gestures are a natural form of communication, often transcending language barriers. Recently, much research has been focused on achieving natural human-machine interaction using gestures. This dissertation presents the design of a gestural interface that can be used to control a robot. The system consists of two modes: far-mode and near-mode. In far-mode interaction, upper-body gestures are used to control the motion of a robot. Near-mode interaction uses static hand poses to control a graphical user interface. For upper-body gesture recognition, features are extracted from skeletal data. The extracted features consist of joint angles and relative joint positions and are extracted for each frame of the gesture sequence. A novel key-frame selection algorithm is used to align the gesture sequences temporally. A neural network and hidden Markov model are then used to classify the gestures. The framework was tested on three different datasets: the CMU Military dataset (3 users, 15 gestures, 10 repetitions per gesture), the VisApp2013 dataset (28 users, 8 gestures, 1 repetition per gesture) and a recorded dataset (15 users, 10 gestures, 3 repetitions per gesture). The system is shown to achieve a recognition rate of 100% across the three different datasets, using the key-frame selection and a neural network for gesture identification. Static hand-gesture recognition is achieved by first retrieving the 24-DOF hand model. The hand is segmented from the image using both depth and colour information. A novel calibration method is then used to automatically obtain the anthropometric measurements of the user's hand. The k-curvature algorithm, depth-based and parallel border-based methods are used to detect fingertips in the image. An average detection accuracy of 88% is achieved.
A neural network and k-means classifier are then used to classify the static hand gestures. The framework was tested on a dataset of 15 users, 12 gestures and 3 repetitions per gesture. A correct classification rate of 75% is achieved using the neural network. It is shown that the proposed system is robust to changes in skin colour and user hand size.
- Item (Open Access): Image and video segmentation using graph cuts (2010). Kulkarni, Mayuresh; Nicolls, Fred. Includes abstract. Includes bibliographical references (leaves 67-71).
- Item (Open Access): Image registration and its application to computer vision: mosaicing and independent motion detection (2005). Nkanza, Ntana; Nicolls, Fred; De Jager, Gerhard. Image registration enables the geometric alignment of two images and is widely used in various applications in the fields of remote sensing, medical imaging and computer vision. This thesis explores each of the stages of image registration, and looks at two of its applications: mosaicing and independent motion detection. Mosaicing is the aligning of several images into a single composition that represents part of a 3D scene. This is useful for many different applications, including virtual reality environments and movie special effects.
- Item (Open Access): Investigation into the use of the Microsoft Kinect and the Hough transform for mobile robotics (2014). O'Regan, Katherine; Verrinder, Robyn; Nicolls, Fred. The Microsoft Kinect sensor is a low-cost RGB-D sensor. In this dissertation, its calibration is fully investigated and the resulting parameters are compared to the parameters given by Microsoft and OpenNI. The parameters were found to differ from those given by Microsoft and OpenNI; therefore, every Kinect should be fully calibrated. The transformation from the raw data to a point cloud is also investigated. Then, the Hough transform is presented in its 2-dimensional form. The Hough transform is a line extraction algorithm which uses a voting system. It is then compared to the Split-and-Merge algorithm using laser range finder data. The Hough transform is found to compare well to the Split-and-Merge in 2 dimensions. Finally, the Hough transform is extended into 3 dimensions for use with the Kinect sensor. It was found that pre-processing of the Kinect data was necessary to reduce the number of points input into the Hough transform. Three edge detectors are used: the LoG, Canny and Sobel edge detectors. These were compared, and the Sobel detector was found to be the best. The final process was then used in multiple ways, first to determine its speed. Its accuracy was then investigated. It was found that the planes extracted were very inaccurate, and therefore not suitable for obstacle avoidance in mobile robotics. The suitability of the process for SLAM was also investigated. It was found to be unsuitable, as planar environments did not have distinct features which could be tracked, whilst the complex environment was not planar, and therefore the Hough transform would not work.
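The 2D Hough transform's voting scheme can be sketched as follows: each point votes for every (theta, rho) line that could pass through it, and peaks in the accumulator identify extracted lines. The grid resolutions and the toy point set are illustrative choices.

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=200, rho_max=100.0):
    """Accumulate votes in (theta, rho) space for each point, using the
    normal form rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    accumulator = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        # Every line through (x, y) corresponds to one rho per theta.
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        rho_idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        valid = (rho_idx >= 0) & (rho_idx < n_rho)
        accumulator[np.arange(n_theta)[valid], rho_idx[valid]] += 1
    return accumulator, thetas

# Ten collinear points on the vertical line x = 20.
points = [(20.0, float(y)) for y in range(10)]
acc, thetas = hough_lines(points)
t_idx, r_idx = np.unravel_index(acc.argmax(), acc.shape)
print(round(np.degrees(thetas[t_idx])))  # 0 degrees: the line's normal is the x-axis
```

Extending this to 3D for plane extraction, as the dissertation does, adds a second angle to the parameter space, which is why reducing the input point count mattered so much.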
- Item (Open Access): Locating facial features with active shape models (2007). Milborrow, Stephen; Nicolls, Fred. This dissertation focuses on the problem of locating features in frontal views of upright human faces. The dissertation starts with the Active Shape Model of Cootes et al. [19] and extends it with the following techniques: (1) selectively using two- instead of one-dimensional landmark profiles; (2) stacking two Active Shape Models in series; (3) extending the set of landmarks; (4) trimming covariance matrices by setting most entries to zero; (5) using other modifications such as adding noise to the training set. The resulting feature locator is shown to compare favorably with previously published methods.
- Item (Open Access): Proceedings of the Fifteenth Annual Symposium of the Pattern Recognition Association of South Africa (2004). Nicolls, Fred. This paper describes an attempt to reconstruct a 3D object from a set of 35 images captured using a scanning electron microscope. Point matching over overlapping triples of views is used to obtain an initial reconstruction, which is refined using bundle adjustment with the added knowledge that the sequence is closed. Intrinsic camera parameters are estimated via autocalibration under an affine assumption. Good results for the final metric reconstruction are obtained.
- Item (Open Access): Reducing the burden of aerial image labelling through human-in-the-loop machine learning methods (2021). Razzak, Muhammed T; Nicolls, Fred. This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, who require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when the new regions differ from the regions trained on. The methods developed then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications, where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing.
Active learning looks at reducing the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present analysis of the results illustrating how the various strategies work and intuition about when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended.
- Item (Open Access): Semi-supervised transfer learning for medical images as an alternative to ImageNet transfer learning (2022). Nkwentsha, Xolisani; Nicolls, Fred. One of the main disadvantages of supervised transfer learning is that it necessarily requires a large amount of expensive manually labelled training data. Consequently, even in medical imaging, transfer learning from natural image datasets (such as ImageNet) has become the norm. However, this approach has been shown to be ineffective due to the significant differences between medical images and natural images. Developing a large-scale medical imaging dataset for transfer learning would be too expensive, therefore the possibility of using large amounts of unlabelled data for feature learning is very attractive. In this work, we propose a semi-supervised transfer learning method for training deep learning models for medical imaging. The main idea behind the proposed method is to leverage unlabelled medical image datasets to improve accuracy for the target task by transferring feature maps learned from an unsupervised task to the supervised target task. We leverage unlabelled data by transferring weights/kernels and representations learned by an autoencoder (specifically the encoder part) during a reconstruction task to a classification task. We show the applicability of features learned by the autoencoder from the collection of unlabelled x-ray images to a pneumonia classification problem. Our proposed method improves the baseline performance by 4.167% in accuracy and the precision, recall and F1 score by 4%. We also demonstrate that increasing the size of the unlabelled dataset used to train the autoencoder improves the performance on the target task. This increase in the size of the dataset resulted in an overall 5.288% accuracy increase from the baseline. We also compare our method with ImageNet models on the target dataset.
For the standard ImageNet architectures, we evaluate ResNet50 and Inception-v3, which have both been used extensively in medical deep learning applications. Our proposed method outperforms both standard ImageNet models on the target task. These results demonstrate that learning features from unlabelled medical images for transfer learning for medical imaging tasks is more effective than transfer learning from natural images, at least for the problem of pneumonia detection.
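The weight-transfer idea behind the method can be sketched in plain NumPy: the encoder layers (standing in for a trained autoencoder's encoder, with random weights here purely for illustration) are reused as the feature extractor beneath a fresh, randomly initialised classification head, which alone would then be trained on the labelled target data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder weights, as if taken from a trained autoencoder.
encoder = {
    "W1": rng.normal(size=(64, 32)), "b1": np.zeros(32),
    "W2": rng.normal(size=(32, 16)), "b2": np.zeros(16),
}

def encode(x, enc):
    """Forward pass through the transferred encoder layers (ReLU)."""
    h = np.maximum(x @ enc["W1"] + enc["b1"], 0.0)
    return np.maximum(h @ enc["W2"] + enc["b2"], 0.0)

# Transfer: the classifier's feature extractor *is* the encoder; only a
# fresh classification head is added on top for the target task.
head_W = rng.normal(size=(16, 2)) * 0.01
head_b = np.zeros(2)

def classify(x):
    logits = encode(x, encoder) @ head_W + head_b
    return logits.argmax(axis=-1)

x = rng.normal(size=(4, 64))   # a batch of 4 flattened "images"
print(classify(x).shape)       # one normal/pneumonia label per image
```

In the actual method the encoder would come from training on unlabelled x-rays, and could either be frozen or fine-tuned along with the head.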