Browsing by Author "Marais, Patrick"

Now showing 1 - 20 of 34
  • Open Access
    3D Scan Campaign Classification with Representative Training Scan Selection
    (2019) Pocock, Christopher; Marais, Patrick
    Point cloud classification has been shown to effectively classify points in 3D scans, and can accelerate manual tasks like the removal of unwanted points from cultural heritage scans. However, a classifier's performance depends on which classifier and feature set are used, and choosing these is difficult since previous approaches may not generalise to new domains. Furthermore, when choosing training scans for campaign-based classification, it is important to identify a descriptive set of scans that represent the rest of the campaign. However, this task is increasingly onerous for large and diverse campaigns, and randomly selecting scans does not guarantee a descriptive training set. To address these challenges, a framework comprising three classifiers (Random Forest (RF), Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP)) and various point features and feature selection methods was developed. The framework also includes a proposed automatic representative scan selection method, which uses segmentation and clustering to identify balanced, similar or distinct training scans. The framework was evaluated on four labelled datasets, including two cultural heritage campaigns, to compare the speed and accuracy of the implemented classifiers and feature sets, and to determine whether the proposed selection method identifies scans that yield a more accurate classifier than random selection. It was found that the RF, paired with a complete multi-scale feature set including covariance, geometric and height-based features, consistently achieved the highest overall accuracy on the four datasets. However, the other classifiers and reduced sets of selected features achieved similar accuracy and, in some cases, greatly reduced training and prediction times. It was also found that the proposed training scan selection method can, on particularly diverse campaigns, yield a more accurate classifier than random selection. However, for homogeneous campaigns, where variations to the training set have limited impact, the method is less applicable. Furthermore, it depends on segmentation and clustering output, which require campaign-specific parameter tuning and may be imprecise.
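As a concrete illustration of the covariance-based point features the abstract mentions, the sketch below derives the standard eigenvalue descriptors (linearity, planarity, scattering) from a point neighbourhood. This is a minimal sketch of the general technique, not the thesis framework; all names are our own.

```python
import numpy as np

def covariance_features(neighbourhood):
    """Return (linearity, planarity, scattering) for an (N, 3) point array."""
    centred = neighbourhood - neighbourhood.mean(axis=0)
    cov = centred.T @ centred / len(neighbourhood)
    # eigvalsh returns eigenvalues in ascending order; unpack descending.
    l3, l2, l1 = np.linalg.eigvalsh(cov)
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    scattering = l3 / l1
    return linearity, planarity, scattering

# A flat, noisy patch should score high on planarity and low on scattering.
rng = np.random.default_rng(0)
patch = rng.uniform(0, 1, size=(200, 3))
patch[:, 2] *= 0.01          # squash z: an almost planar neighbourhood
lin, pla, sca = covariance_features(patch)
```

Computing these at several neighbourhood radii gives the "multi-scale" feature set the abstract refers to.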
  • Open Access
    A comparative study of recurrent neural networks and statistical techniques for forecasting the stock prices of JSE-listed securities
    (2022) Galant, Rushin; Marais, Patrick
    As machine learning has developed, the attention of stock price forecasters has slowly shifted from traditional statistical forecasting techniques towards machine learning techniques. This study investigated whether machine learning techniques, in particular recurrent neural networks, do indeed provide greater forecasting accuracy than traditional statistical techniques on the Johannesburg Securities Exchange's top forty stocks. The Johannesburg Securities Exchange is the largest and most developed stock exchange in Africa, yet limited research has been performed on the application of machine learning to forecasting stock prices on this exchange. Simple recurrent neural networks, Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) units were thoroughly evaluated, with a Convolutional Neural Network and a random forest used as machine learning benchmarks. Historical data was collected for the period 2 January 2019 to 29 May 2020, with the 2019 calendar year used as the training dataset. Both a train-once and a walk-forward configuration were used. The number of input observations was varied from four to fifteen, while making forecasts from one up to ten time steps into the future. The Mean Percentage Error was used to measure forecasting accuracy. Different configurations of the neural network models were assessed, including whether bidirectionality improved forecasting accuracy. The neural networks were run on two different datasets: the historical stock prices on their own, and the historical stock prices together with the market index (the JSE All Share Index), to determine whether including the market index improves forecasting accuracy. The study found that bidirectional neural networks provided more accurate forecasts than neural networks that did not incorporate bidirectionality. In particular, the Bidirectional LSTM provided the greatest forecasting accuracy for one-step forecasts, the Bidirectional GRU was more accurate two to eight time steps into the future, and the Bidirectional LSTM was again more accurate at nine and ten time steps. However, the classical statistical model, the theta method, significantly outperformed all machine learning models. This is likely the result of the unforeseen impact of the COVID-19 pandemic on financial markets, which would not have been factored into the training sets of the machine learning algorithms.
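The accuracy metric named in the abstract can be stated compactly. The sketch below shows the mean absolute percentage variant; the function name and sample values are our own, not taken from the dissertation.

```python
def mean_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual|, expressed as a percentage."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

# Three one-step forecasts against their realised prices:
mpe = mean_percentage_error([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
# errors are 10%, 5% and 0%, so the mean is 5%
```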
  • Open Access
    Accelerating point cloud cleaning
    (2017) Mulder, Rickert; Marais, Patrick
    Capturing the geometry of a large heritage site via laser scanning can produce thousands of high resolution range scans. These must be cleaned to remove unwanted artefacts. We identified three areas that can be improved upon in order to accelerate the cleaning process. Firstly, the speed at which a user can navigate to an area of interest has a direct impact on task duration. Secondly, design constraints in generalised point cloud editing software result in inefficient abstraction of layers, which may extend task duration due to memory pressure. Finally, existing semi-automated segmentation tools have difficulty targeting the diverse set of segmentation targets in heritage scans. We present a point cloud cleaning framework that attempts to improve each of these areas. First, we present a novel layering technique aimed at segmentation, rather than generic point cloud editing. This technique represents 'layers' of related points in a way that greatly reduces memory consumption and provides efficient set operations between layers. These set operations (union, difference, intersection) allow the creation of new layers which aid in the segmentation task. Next, we introduce roll-corrected 3D camera navigation that allows a user to look around freely while reducing disorientation. A user study shows that this camera mode significantly reduces a user's navigation time (by 29.8% to 57.8%) between locations in a large point cloud, thus reducing the overhead between point selection operations. Finally, we show how Random Forests can be trained interactively, per scan, to assist users in a point cloud cleaning task. We use a set of features selected for their discriminative power on a set of challenging heritage scans. Interactivity is achieved by down-sampling training data on the fly. A simple map data structure allows us to propagate labels in the down-sampled data back to the input point set. We show that training and classification on down-sampled point clouds can be performed in under 10 seconds with little effect on accuracy. A user study shows that a user's total segmentation time decreases by between 8.9% and 20.4% when our Random Forest classifier is used. Although this initial study did not indicate a significant difference in overall task performance when compared to manual segmentation, performance improvement is likely with multi-resolution features or the use of colour range images, which are now commonplace.
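The layer set operations described above can be sketched with plain index sets: each 'layer' stores indices into one shared master point array, so union, difference and intersection are cheap and no point data is duplicated. This is an illustration of the idea under that assumption, not the thesis's actual compact representation.

```python
# One master array of points; layers never copy point data.
master_points = [(x * 0.1, 0.0, 0.0) for x in range(10)]

# Layers are just index sets into master_points (names are invented).
wall   = frozenset({0, 1, 2, 3, 4, 5})
plants = frozenset({4, 5, 6, 7})

keep    = wall | plants      # union: everything selected so far
overlap = wall & plants      # intersection: ambiguous points
cleaned = wall - plants      # difference: wall with vegetation removed
```

Because each operation produces a new index set, derived layers (e.g. `cleaned`) cost memory proportional to the number of indices, not the number of stored coordinates.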
  • Open Access
    Active shape model segmentation of Brain structures in MR images of subjects with fetal alcohol spectrum disorder
    (2010) Eicher, Anton; Marais, Patrick; Meintjes, Ernesta
    Fetal Alcohol Spectrum Disorder (FASD) is the most common form of preventable mental retardation worldwide. This condition affects children whose mothers excessively consume alcohol whilst pregnant. FASD can be identified by physical and mental defects, such as stunted growth, facial deformities, cognitive impairment, and behavioural abnormalities. Magnetic Resonance Imaging provides a non-invasive means to study the neural correlates of FASD. One such approach aims to detect brain abnormalities through an assessment of volume and shape of sub-cortical structures on high-resolution MR images.
  • Open Access
    An analysis of extracting cross sections from heritage site point clouds using meshfree point-based techniques
    (2025) Aaron, Jerome; Marais, Patrick
    In the digital age, cultural heritage has evolved to incorporate 3D virtual models of heritage buildings and sites. It is common to use laser scanning to acquire the 3D point cloud data for these models, with this data usually being processed and “meshed” to form a surface mesh model. However, this process is computationally costly and generally creates new data (beyond the original, correct, surface samples). For heritage conservation purposes, accurate and original data is preferred. An important issue when using the points directly, rather than the 3D surface model, is the suitability of a direct point representation for rendering and image processing. One important task is the production of floorplan, elevation, and cross-section views from a 3D digital heritage model of a site. In this work, the viability of a Point Based Technique (PBT) approach for cross-section recovery from heritage point cloud models is evaluated. The solution developed involves slicing the 3D point cloud, applying binary morphology operations on this 2D point cloud slice to close gaps in the cross-section profile, and filling holes in the cross-section profile. Post-processing operations add minor but necessary improvements (such as filling gaps on the ground level due to slicing through vegetation, and removing noise). Finally, to produce the desired image, the cross-section is overlaid onto the rendering of the scene (removing points in front of the cross-section slicing plane). The results are assessed by registering the point-based cross-section against the mesh-based equivalent. The Intersection over Union (IoU) metric, a measure of similarity between two images, is calculated on the registered images. The tabulated IoU and IoU loss (dissimilarity between the registered cross-sections) is also depicted graphically. 
    The conclusion drawn on analysing the results is that the point-based cross-section is a close approximation to the mesh-based cross-section (even when considering possible errors from manual registration of the cross-sections), thus validating the point-based approach. The average IoU score across the 24 cross-sections recovered from four point cloud cultural heritage models is 94.6%. An argument may be made for the point-based cross-section being an improvement over the mesh-based cross-section, where the latter connects points incorrectly in the mesh reconstruction of the cross-section. A positive relationship is found between point density and the fidelity of the point-based cross-section: an examination of the data and results reveals that a higher point density yields a higher-fidelity cross-section. The cross-section profile in a higher-density point cloud is more likely to have no gaps in the profile after binary morphology closing (distances between points are smaller at higher point densities). Note that high point density is typical of current scanning technology (LIDAR, photogrammetry), but the output is typically downsampled for practical use (e.g. uploading to repositories with file size limitations, to make the data publicly available), relying instead on mesh reconstruction to model a surface. For the task of recovering cross-sections from 3D point clouds of cultural heritage sites, we conclude that point-based techniques offer similar accuracy to mesh-based techniques, provided that the model possesses a sufficiently high point density. The advantage of using points directly is the avoidance of (i) the additional effort and computational cost of reconstructing a mesh, and (ii) the possible creation of undesirable “new” points (in addition to the original points) in the mesh reconstruction process.
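The IoU comparison used above can be sketched on binary masks represented as sets of foreground pixel coordinates (a simplification of the registered cross-section images; names are ours).

```python
def iou(mask_a, mask_b):
    """IoU of two binary masks given as sets of (row, col) foreground pixels."""
    union = mask_a | mask_b
    if not union:
        return 1.0  # two empty masks are identical
    return len(mask_a & mask_b) / len(union)

# Two 2x2-ish blobs sharing a column: 2 shared pixels, 6 in the union.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = iou(a, b)
```

IoU loss, as used in the results, is simply `1 - score` on the same registered pair.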
  • Open Access
    An automatic marker for vector graphics drawing tasks
    (2016) Bunn, Tristan; Marais, Patrick
    In recent years, the SVG file format has grown increasingly popular, largely due to its widespread adoption as the standard image format for vector graphics on the World Wide Web. However, vector graphics predate the modern Web, having served an important role in graphic and computer-aided design for decades prior to SVG's adoption as a web standard. Vector graphics are just as - if not more - relevant than ever today. As a result, training in vector graphics software, particularly in graphic and other creative design fields, forms an important part of the skills development necessary to enter the industry. This study explored the feasibility of a web application that can automatically mark/assess drawing tasks completed in popular vector graphics editors such as Adobe Illustrator, CorelDRAW, and Inkscape. This prototype was developed using a collection of front-end and back-end web technologies, requiring only that users have a standards-compliant, modern web browser to submit tasks for assessment. Testing was carried out to assess how the application handled SVG markup produced by different users and vector graphics drawing software, and whether the assessment/scoring of submitted tasks was in line with that of a human marker. While some refinement is required, the application assessed six different tasks, submitted eleven times over by as many individuals, and for the greater part was successful in reporting scores in line with those of the researcher. As a prototype serving as a proof of concept, the project proved the automatic marker a feasible concept. Exactly how marks should be assigned, for which criteria, and how much instruction should be provided are aspects for further study, along with support for curved path segments and automatic task generation.
  • Open Access
    The design and evaluation of a Java image analysis tool for componentizing lines from digital architectural floor plans
    (2006) Ochwo, Jeniffer; Marais, Patrick
    This research set out to determine the feasibility of using Java to create a tool that could perform paper-to-electronic format conversion by vectorizing the lines in a raster image of an architectural floor plan. The tool aimed to apply a method previously used in another field (mechanical engineering) to architectural floor plans. The method used had to overcome the problems associated with raster drawings, which include noise and image distortions, in addition to being able to identify lines, line thickness and the junctions along the lines. The method used was the Global Line Vectorization Algorithm.
  • Open Access
    Distributed texture-based terrain synthesis
    (2011) Tasse, Flora Ponjou; Gain, James; Marais, Patrick
    Terrain synthesis is an important field of Computer Graphics that deals with the generation of 3D landscape models for use in virtual environments. The field has evolved to a stage where large and even infinite landscapes can be generated in realtime. However, user control of the generation process is still minimal, as well as the creation of virtual landscapes that mimic real terrain. This thesis investigates the use of texture synthesis techniques on real landscapes to improve realism and the use of sketch-based interfaces to enable intuitive user control.
  • Open Access
    Enhancing colour-coded poll sheets using computer vision as a viable Audience Response System (ARS) in Africa
    (2018) Muchaneta, Irikidzai Zorodzai; Gain, James; Marais, Patrick
    Audience Response Systems (ARS) give a facilitator accurate feedback on a question posed to the listeners. The most common form of ARS is the clicker, a handheld response gadget that acts as a medium of communication between the students and the facilitator. Clickers are prohibitively expensive, creating a need for low-cost alternatives with high accuracy. This study builds on earlier research by Gain (2013), which aims to show that computer vision and coloured poll sheets can be an alternative to clicker-based ARS. This thesis examines a proposal to create an alternative to clickers applicable to the African context, where the main deterrent is cost. It studies the computer vision techniques of feature detection, extraction and recognition. In this research project, an experimental study was conducted in various lecture theatres with class sizes ranging from 50 to 150 students. Python and OpenCV tools were used to analyze the photographs and document the performance, as well as to observe the different conditions under which to acquire results. The research achieved an average detection rate of 75%, which points to a promising alternative audience response system as measured by time, cost and error rate. Further work on the capture of the poll sheet would significantly improve this result. With regard to cost, the computer vision coloured poll sheet alternative is significantly cheaper than clickers.
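The colour-classification step at the heart of such a poll-sheet system can be sketched as nearest-reference-colour matching on detected sheet patches. The palette, answer labels and patch values below are invented for illustration; the actual system works on OpenCV-detected regions.

```python
# Hypothetical reference palette mapping answer options to RGB colours.
PALETTE = {
    "A": (220, 40, 40),    # red
    "B": (40, 200, 40),    # green
    "C": (40, 60, 220),    # blue
    "D": (230, 220, 50),   # yellow
}

def classify_patch(rgb):
    """Assign a patch to the nearest palette colour by squared RGB distance."""
    def dist2(ref):
        return sum((p - q) ** 2 for p, q in zip(rgb, ref))
    return min(PALETTE, key=lambda k: dist2(PALETTE[k]))

def tally(patches):
    """Count one vote per detected sheet patch."""
    votes = {k: 0 for k in PALETTE}
    for rgb in patches:
        votes[classify_patch(rgb)] += 1
    return votes

counts = tally([(200, 50, 60), (30, 180, 50), (35, 70, 200), (210, 60, 45)])
```

In practice the distance would be computed in a hue-based colour space to tolerate lighting variation across the lecture theatre.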
  • Open Access
    Fast galactic structure finding using graphics processing units
    (2014) Wood, Daniel; Marais, Patrick; Faltenbacher, Andreas
    Cosmological simulations are used by astronomers to investigate large scale structure formation and galaxy evolution. Structure finding, that is, the discovery of gravitationally-bound objects such as dark matter halos, is a crucial step in many such simulations. During recent years, advancing computational capacity has led to halo-finders needing to manage increasingly large simulations. As a result, many multi-core solutions have arisen in an attempt to process these simulations more efficiently. However, a many-core approach to the problem using graphics processing units (GPUs) appears largely unexplored. Since these simulations are inherently n-body problems, they contain a high degree of parallelism, which makes them very well suited to a GPU architecture. Therefore, it makes sense to determine the potential for further research in halo-finding algorithms on a GPU.
  • Open Access
    Fast presenter tracking for 4K lecture videos using computationally inexpensive algorithms
    (2023) Fitzhenry, Charles; Marais, Patrick; Marquard, Stephen
    Lecture recording has become an essential tool for educational institutions to enhance the student learning experience and offer online courses for remote learning programs. High-resolution 4K cameras have gained popularity in these systems due to their affordability and the clarity of written content on boards/screens. Unfortunately, at 4K resolution, a typical 45-minute lecture video easily exceeds 2GB. Many video files of this size place a financial burden on institutions and students, especially in developing countries where financial resources are limited. Institutions require costly high-end equipment to capture, store and distribute this ever-increasing collection of videos. Students require a fast internet connection with a large data quota for off-campus viewing, which can be too expensive for many, especially if they use mobile data. This project designs and implements a low-cost presenter and writing detection front-end that can integrate with an external Virtual Cinematographer (VC). Gesture detection was also explored; however, the frame differencing approach used for presenter detection was not sufficiently robust for gesture detection. Our front-end is carefully designed to run on commodity computers without requiring expensive Graphics Processing Units (GPU) or servers. An external VC can use our contextual information to segment a smaller cropping window from the 4K frame, only containing the presenter and relevant boards, drastically reducing the file size of the resultant videos while preserving writing clarity. The software developed as part of this project will be available as open source. Our results show that the front-end module is fit for purpose and sufficiently robust across several challenging lecture venue types. On average, a 2-minute video clip is processed by the front-end in under 60 seconds (or approximately half of the input video duration). The majority (89%) of this time is used for reading and decoding frames from storage. Additionally, our low-cost presenter detection achieves an overall F1-Score of 0.76, while our writing detection achieves an overall F1-Score of 0.55. We also demonstrate a mean reduction of 81.3% in file size from the original 4K video to a cropped 720p video when using our front-end in a full pipeline with an external VC.
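The frame-differencing idea behind the presenter detector can be sketched as thresholding the absolute difference of two greyscale frames and taking the bounding box of the changed pixels. This is a pure-Python toy on tiny list-of-lists "frames", not the project's implementation.

```python
def motion_bbox(prev, curr, threshold=25):
    """Return (top, left, bottom, right) of changed pixels, or None if static."""
    changed = [(r, c)
               for r, row in enumerate(curr)
               for c, v in enumerate(row)
               if abs(v - prev[r][c]) > threshold]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return min(rows), min(cols), max(rows), max(cols)

# A blank frame, then a bright "presenter" region appearing in rows 2-4, cols 3-5.
prev = [[0] * 8 for _ in range(6)]
curr = [row[:] for row in prev]
for r in range(2, 5):
    for c in range(3, 6):
        curr[r][c] = 200

box = motion_bbox(prev, curr)
```

A VC consuming this output would smooth such boxes over time before positioning its cropping window, to avoid jitter.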
  • Open Access
    Field D* pathfinding in weighted simplicial complexes
    (2013) Perkins, Simon; Marais, Patrick; Gain, James
    The development of algorithms to efficiently determine an optimal path through a complex environment is a continuing area of research within Computer Science. When such environments can be represented as a graph, established graph search algorithms, such as Dijkstra’s shortest path and A*, can be used. However, many environments are constructed from a set of regions that do not conform to a discrete graph. The Weighted Region Problem was proposed to address the problem of finding the shortest path through a set of such regions, weighted with values representing the cost of traversing the region. Robust solutions to this problem are computationally expensive since finding shortest paths across a region requires expensive minimisation. Sampling approaches construct graphs by introducing extra points on region edges and connecting them with edges criss-crossing the region. Dijkstra or A* are then applied to compute shortest paths. The connectivity of these graphs is high and such techniques are thus not particularly well suited to environments where the weights and representation frequently change. The Field D* algorithm, by contrast, computes the shortest path across a grid of weighted square cells and has replanning capabilites that cater for environmental changes. However, representing an environment as a weighted grid (an image) is not space-efficient since high resolution is required to produce accurate paths through areas containing features sensitive to noise. In this work, we extend Field D* to weighted simplicial complexes – specifically – triangulations in 2D and tetrahedral meshes in 3D.
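The sampling baseline described above reduces to running Dijkstra's algorithm over a graph of extra points introduced on region edges. The tiny graph and weights below are invented to show the idea; only the algorithm itself is standard.

```python
import heapq

def dijkstra(graph, start):
    """graph: {node: [(neighbour, cost), ...]}. Returns shortest-cost map."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

# Two routes from 'a' to 'goal': crossing a high-weight region directly
# (cost 5.0) versus skirting it via sampled edge points e1 and e2.
graph = {
    "a":  [("goal", 5.0), ("e1", 1.0)],
    "e1": [("e2", 1.5)],
    "e2": [("goal", 1.0)],
}
dist = dijkstra(graph, "a")
```

Field D*, by contrast, interpolates path costs along cell (or, in this work, simplex) edges rather than restricting paths to pre-placed sample points, and supports replanning when weights change.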
  • Open Access
    GPU-based acceleration of radio interferometry point source visibility simulations in the MeqTrees framework
    (2013) Baxter, Richard Jonathan; Marais, Patrick; Kuttel, Michelle Mary
    Modern radio interferometer arrays are powerful tools for obtaining high resolution images of low frequency electromagnetic radiation signals in deep space. While single dish radio telescopes convert the electromagnetic radiation directly into an image of the sky (or sky intensity map), interferometers convert the interference patterns between dishes in the array into samples of the Fourier plane (UV-data or visibilities). A subsequent Fourier transform of the visibilities yields the image of the sky. Conversely, a sky intensity map comprising a collection of point sources can be subjected to an inverse Fourier transform to simulate the corresponding Point Source Visibilities (PSV). Such simulated visibilities are important for testing models of external factors that affect the accuracy of observed data, such as radio frequency interference and interaction with the ionosphere. MeqTrees is a widely used radio interferometry calibration and simulation software package that contains a Point Source Visibility module. Unfortunately, calculation of visibilities is computationally intensive: it requires application of the same Fourier equation to many point sources across multiple frequency bands and time slots. There is great potential for this module to be accelerated by the highly parallel Single-Instruction-Multiple-Data (SIMD) architectures in modern commodity Graphics Processing Units (GPU). With many traditional high performance computing techniques requiring high entry and maintenance costs, GPUs have proven to be a cost effective and high performance parallelisation tool for SIMD problems such as PSV simulations. This thesis presents a GPU/CUDA implementation of the Point Source Visibility calculation within the existing MeqTrees framework. For a large number of sources, this implementation achieves an 18x speed-up over the existing CPU module. With modifications to the MeqTrees memory management system to reduce overheads by incorporating GPU memory operations, speed-ups of 25x are theoretically achievable. Ignoring all serial overheads, and considering only the parallelisable sections of code, speed-ups reach up to 120x.
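The per-source Fourier relation being accelerated is, in its simplest narrow-field form, a sum of complex exponentials: each visibility sample is Σₖ Sₖ·exp(-2πi(u·lₖ + v·mₖ)) over sources with flux Sₖ at direction cosines (lₖ, mₖ). The sketch below states that equation in plain Python; it is not the MeqTrees or CUDA code, and the names are ours.

```python
import cmath

def visibility(u, v, sources):
    """Point-source visibility at one (u, v) sample.

    sources: iterable of (flux, l, m) triples, with (l, m) direction cosines.
    """
    return sum(flux * cmath.exp(-2j * cmath.pi * (u * l + v * m))
               for flux, l, m in sources)

# A single source at the phase centre (l = m = 0) gives a constant
# visibility equal to its flux at every (u, v) sample.
vis = visibility(120.0, -45.0, [(2.5, 0.0, 0.0)])
```

The GPU win comes from evaluating this same expression independently for every (u, v, frequency, time) sample, a textbook SIMD workload.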
  • Open Access
    A GPU-Based level of detail system for the real-time simulation and rendering of large-scale granular terrain
    (2014) Leach, Craig; Marais, Patrick
    Real-time computer games and simulations often contain large virtual outdoor environments. Terrain forms an important part of these environments. This terrain may consist of various granular materials, such as sand, rubble and rocks. Previous approaches to rendering such terrains rely on simple textured geometry, with little to no support for dynamic interactions. Recently, particle-based granular terrain simulations have emerged as an alternative method for simulating and rendering granular terrain. These systems simulate granular materials by using particles to represent the individual granules, and exhibit realistic, physically correct interactions with dynamic objects. However, they are extremely computationally expensive, and thus may only feasibly be used to simulate small areas of terrain. In order to overcome this limitation, this thesis builds upon a previously created particle-based granular terrain simulation, by integrating it with a heightfield-based terrain system. In this way, we create a level of detail system for simulating large-scale granular terrain. The particle-based terrain system is used to represent areas of terrain around dynamic objects, whereas the heightfield-based terrain is used elsewhere. This allows large-scale granular terrain to be simulated in real-time, with physically correct dynamic interactions. This is made possible by a novel system, which allows for terrain to be converted from one representation to the other in real-time, while maintaining changes made to the particle-based system in the heightfield-based system. We show that the system is capable of simulating and rendering multiple particle-based simulations across a large-scale terrain, whilst maintaining real-time performance. In one scenario, 10 high-fidelity simulations were run at the same time, whilst maintaining 30 frames per second. However, the number of particles used, and thus the number of particle-based simulations which may be used, is limited by the computational resources of the GPU. Additionally, the particle sizes don't allow for sand to be realistically simulated, as was our original goal. However, other granular materials may still be simulated.
  • Open Access
    High fidelity compression of irregularly sampled height fields
    (2007) Marais, Patrick; Gain, James
    This paper presents a method to compress irregularly sampled height-fields based on a multi-resolution framework. Unlike many other height-field compression techniques, no resampling is required so the original height-field data is recovered (less quantization error). The method decomposes the compression task into two complementary phases: an in-plane compression scheme for (x, y) coordinate positions, and a separate multi-resolution z compression step. This decoupling allows subsequent improvements in either phase to be seamlessly integrated and also allows for independent control of bit-rates in the decoupled dimensions, should this be desired. Results are presented for a number of height-field sample sets quantized to 12 bits for each of x and y, and 10 bits for z. Total lossless encoded data sizes range from 11 to 24 bits per point, with z bit-rates lying in the range 2.9 to 8.1 bits per z coordinate. Lossy z bit-rates (we do not lossily encode x and y) lie in the range 0.7 to 5.9 bits per z coordinate, with a worst-case root-mean-squared (RMS) error of less than 1.7% of the z range. Even with aggressive lossy encoding, at least 40% of the point samples are perfectly reconstructed.
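The quantization referred to in the results can be sketched as uniform scalar quantization: heights are mapped to a 10-bit integer grid and back, so the worst-case reconstruction error is half a quantization step. The code is illustrative, not the paper's coder; only the bit depth comes from the abstract.

```python
def quantize(z, z_min, z_max, bits=10):
    """Map a height to a `bits`-bit code and its reconstructed value."""
    levels = (1 << bits) - 1          # 1023 distinct steps for 10 bits
    step = (z_max - z_min) / levels
    code = round((z - z_min) / step)
    return code, code * step + z_min  # (integer code, reconstructed height)

z_min, z_max = 0.0, 1000.0
code, z_hat = quantize(123.456, z_min, z_max)
error = abs(z_hat - 123.456)
max_error = (z_max - z_min) / ((1 << 10) - 1) / 2   # half a quantization step
```

The paper's "lossless" figures then entropy-code these integer codes exactly, while the lossy z rates trade further bits against the quoted RMS error bound.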
  • Open Access
    High-level control of agent-based crowds by means of general constraints
    (2009) Jacka, David; Gain, James; Marais, Patrick
    The use of computer-generated crowds in visual effects has grown tremendously since the warring armies of virtual orcs and elves were seen in The Lord of the Rings. These crowds are generated by agent-based simulations, where each agent has the ability to reason and act for itself. This autonomy is effective at automatically producing realistic, complex group behaviour but leads to problems in controlling the crowds. Due to interaction between crowd members, the link between the behaviour of the individual and that of the whole crowd is not obvious. The control of a crowd's behaviour is, therefore, time-consuming and frustrating, as manually editing the behaviour of individuals is often the only control approach available. This problem of control has not been widely addressed in crowd simulation research.
  • Open Access
    A highly accessible application for detection and classification of maize foliar diseases from leaf images
    (2017) Khethisa, Joang Adolf; Marais, Patrick
    Crop diseases are a major impediment to food security in the developing world. The development of cheap and accurate crop diagnosis software would thus be of great benefit to the farming community. A number of previous studies, utilizing computer vision and machine-learning algorithms, have successfully developed applications that can diagnose crop diseases. However, these studies have primarily focussed either on large-scale remote sensing applications better suited to large-scale farming, or on desktop/laptop applications, with a few targeting high-end smartphones. Unfortunately, the attendant hardware requirements and expenses make them inaccessible to the majority of subsistence farmers, especially those in sub-Saharan Africa, where ownership of both smartphones and personal computers is minimal. The primary objective of our research was to establish the feasibility of utilizing computer vision and machine learning techniques to develop a crop disease diagnosis application that is accessible not only through personal computers and smartphones but also through any internet-enabled feature phone. Leveraging methods established in previous papers, we successfully developed a prototype crop disease diagnosis application capable of diagnosing two maize foliar diseases, Common Rust and Grey Leaf Spot. The application is accessible through personal computers and high-end smartphones, as well as through any internet-enabled feature phone. The solution is a responsive web-based application built with open-source libraries; its diagnosis engine uses an SVM classifier that can be trained on either SIFT or SURF features. The solution was evaluated for classification accuracy, page load times on different networks, and cross-browser support. The system achieved a 73.3% overall accuracy rate when tested with images identical to those end users would upload. Page load times were considerably long on GPRS and 2G networks; however, they were comparable to the average page load times users would experience when accessing Google search pages from similar networks. Cross-browser tests indicated that the system is fully compatible with all popular mobile and desktop browsers. Based on the evaluation results, we concluded that it is feasible to develop a crop disease diagnosis application that, in addition to being accessible through personal computers and smartphones, can also be accessed through any internet-enabled feature phone.
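    The abstract describes a diagnosis engine built on local image features and an SVM classifier. As a hedged, standard-library-only sketch of that pipeline's shape, the toy below substitutes grey-level histogram features for SIFT/SURF descriptors and a nearest-centroid classifier for the SVM (both substitutions are for illustration; the actual system's code is not reproduced here):

```python
# Illustrative stand-in for the diagnosis pipeline described above.
# Real system: SIFT/SURF descriptors + SVM; here, intensity histograms
# + nearest-centroid, so the sketch runs with the standard library only.

def histogram_features(pixels, bins=4):
    """Normalised intensity histogram of an image given as a flat list of 0-255 values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def train_centroids(samples):
    """samples: {label: [feature_vector, ...]} -> {label: centroid vector}."""
    centroids = {}
    for label, vecs in samples.items():
        n = len(vecs)
        centroids[label] = [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]
    return centroids

def classify(features, centroids):
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(features, centroids[label]))

# Toy "images": diseased leaf patches darker than healthy ones.
healthy = [histogram_features([200] * 50 + [180] * 50)]
rust = [histogram_features([60] * 50 + [90] * 50)]
centroids = train_centroids({"healthy": healthy, "common_rust": rust})
print(classify(histogram_features([70] * 100), centroids))  # -> common_rust
```

    A real deployment would extract descriptors server-side, which is what keeps the client requirements down to any browser-capable feature phone.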
  • Item
    Open Access
    Identification and reconstruction of bullets from multiple x-rays
    (2004) Perkins, Simon; Marais, Patrick
    The 3D shape and position of objects inside the human body are commonly detected using Computed Tomography (CT) scanning. CT is an expensive diagnostic option in economically disadvantaged areas and the radiation dose experienced by the patient is significant. In this dissertation, we present a technique for reconstructing the 3D shape and position of bullets from multiple X-rays. This technique makes use of ubiquitous X-ray equipment and a small number of X-rays to reduce the radiation dose. Our work relies on Image Segmentation and Volume Reconstruction techniques.
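    The dissertation's own segmentation and reconstruction algorithms are not reproduced in the abstract. As a hedged sketch of the general idea behind multi-view shape recovery, the toy below carves a voxel grid against two binary silhouettes from assumed orthogonal parallel projections; the actual method and imaging geometry in the dissertation may differ:

```python
# Minimal space-carving sketch: a voxel survives only if its projection
# falls inside the segmented bullet silhouette in every X-ray view.
# Two orthogonal parallel projections assumed, purely for illustration.

N = 4  # voxel grid resolution per axis

# Binary silhouettes (1 = inside the bullet): front view (x-y) and side view (z-y).
front = [[1 if 1 <= x <= 2 and 1 <= y <= 2 else 0 for x in range(N)] for y in range(N)]
side = [[1 if z == 1 and 1 <= y <= 2 else 0 for z in range(N)] for y in range(N)]

def carve(front, side, n):
    """Return the set of (x, y, z) voxels consistent with both silhouettes."""
    return {(x, y, z)
            for x in range(n) for y in range(n) for z in range(n)
            if front[y][x] and side[y][z]}

volume = carve(front, side, N)
print(len(volume))  # 2x2 front silhouette intersected with depth z == 1 -> 4 voxels
```

    With only a small number of views, such reconstructions bound the true shape rather than recover it exactly, which is the trade-off accepted in exchange for the reduced radiation dose.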
  • Item
    Open Access
    Investigating the virtual directing strategies of a virtual cinematographer in an automatic lecture video post-processing system
    (2023) Khatieb, Mohamed Tanweer; Marais, Patrick; Marquard, Stephen
    As recording technology improves and becomes more affordable, many learning institutions are using lecture recording to make lessons more persistent and accessible. Statically mounted 4K cameras are now cheaper than PTZ cameras, which makes them a desirable alternative for lecture recordings. Unfortunately, 4K resolution videos are very large, posing a problem for storage and streaming: the file size for a 45- to 60-minute lecture video in 4K can exceed 2GB. Many students cannot afford the bandwidth required to stream such large files. Furthermore, since static 4K cameras do not move, they require a wide-angle view of the venue in order to capture as much of the front of the venue as possible. This view is much too zoomed out for viewers to see the details, such as writing on the boards and the presenter's facial expressions, that the 4K resolution captures. This dissertation investigates an approach to post-processing these 4K lecture videos to reduce the file size and emphasise lecture details such as lecturer motion and board/screen usage. This is done using scene tracking data (generated via a third-party front-end), which a Virtual Cinematographer (VC) uses to decide which areas to crop from each 4K frame in the original video. The VC then positions and sizes the cropping windows in such a way that the resultant cropped video resembles one recorded by a human camera operator, using cinematographic heuristics to inform its decision-making. The VC uses scene analysis algorithms to determine how the environment changes as time progresses in the video. By dividing the video into "chunks" (equivalent to "scenes" in traditional cinematography) based on context, the VC is able to maintain stable shots with consistent framing and avoid jittery, disorienting footage. These contextual chunks are determined by comparing the trajectory of the presenter with the manner in which the features on the board regions change over time. After the chunks are established, the VC creates transitions between them while avoiding any changes to the framing inside each chunk. The final output is a JSON file containing the cropping coordinates for each frame in the video, for a third-party video cropping application to use when producing the final video. We performed a user evaluation of the VC to measure user satisfaction with the resulting output videos and how successfully it followed its heuristics. The VC succeeded in following the major heuristics: viewers were satisfied with the framing of the presenter and the content on the boards, the stability and smoothness of transitions, and the transition frequency, with the VC changing shots only when necessary.
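    The abstract specifies only that the output is per-frame cropping coordinates in JSON. As an illustrative sketch of a chunk-based export in that spirit (all field names and the windowing rule below are hypothetical, not the system's actual schema or heuristics):

```python
import json

# Hypothetical sketch: each chunk keeps one fixed crop window (stable
# framing within a chunk), centred horizontally on the presenter and
# clamped to the 4K frame. The vertical offset is fixed for simplicity.

def crop_windows(chunks, frame_w=3840, frame_h=2160, out_w=1280, out_h=720):
    """chunks: list of (start_frame, end_frame, presenter_x) -> JSON of per-frame crops."""
    frames = []
    for start, end, presenter_x in chunks:
        # Centre the window on the presenter, clamped inside the frame.
        x = max(0, min(frame_w - out_w, presenter_x - out_w // 2))
        for f in range(start, end + 1):
            frames.append({"frame": f, "x": x, "y": frame_h - out_h,
                           "w": out_w, "h": out_h})
    return json.dumps({"frames": frames})

# Two chunks: presenter near the left edge, then near the right edge.
print(crop_windows([(0, 1, 500), (2, 3, 3500)]))
```

    Keeping one window per chunk is what prevents the jittery, per-frame tracking motion the dissertation's heuristics are designed to avoid.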
  • Item
    Open Access
    A linear framework for character skinning
    (2007) Merry, Bruce; Marais, Patrick; Gain, James
    Character animation is the process of modelling and rendering a mobile character in a virtual world. It has numerous applications both off-line, such as virtual actors in films, and real-time, such as in games and other virtual environments. There are a number of algorithms for determining the appearance of an animated character, with different trade-offs between quality, ease of control, and computational cost. We introduce a new method, animation space, which provides a good balance between the ease-of-use of very simple schemes and the quality of more complex schemes, together with excellent performance. It can also be integrated into a range of existing computer graphics algorithms. Animation space is described by a simple and elegant linear equation. Apart from making it fast and easy to implement, linearity facilitates mathematical analysis. We derive two metrics on the space of vertices (the "animation space"), which indicate the mean and maximum distances between two points on an animated character. We demonstrate the value of these metrics by applying them to the problems of parametrisation, level-of-detail (LOD) and frustum culling. These metrics provide information about the entire range of poses of an animated character, so they are able to produce better results than considering only a single pose of the character, as is commonly done.
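    The thesis's linear equation itself is not given in the abstract. As a hedged sketch of the classic linear scheme that animation space generalises, the toy below implements linear blend skinning in 2D, where a skinned vertex is a weighted sum of bone transforms applied to a rest position, v' = Σᵢ wᵢ (Mᵢ v + tᵢ); the thesis's actual formulation differs:

```python
# Linear blend skinning sketch (2D): v' = sum_i w_i * (M_i @ v + t_i).
# This is the baseline linear scheme; animation space is a related but
# more general linear formulation described in the thesis.

def skin_vertex(rest, bones, weights):
    """rest: (x, y); bones: list of (2x2 matrix, translation); weights sum to 1."""
    x = y = 0.0
    for w, (m, t) in zip(weights, bones):
        bx = m[0][0] * rest[0] + m[0][1] * rest[1] + t[0]
        by = m[1][0] * rest[0] + m[1][1] * rest[1] + t[1]
        x += w * bx
        y += w * by
    return (x, y)

identity = ([[1.0, 0.0], [0.0, 1.0]], (0.0, 0.0))
shifted = ([[1.0, 0.0], [0.0, 1.0]], (2.0, 0.0))  # same bone, translated +2 in x
print(skin_vertex((1.0, 1.0), [identity, shifted], [0.5, 0.5]))  # -> (2.0, 1.0)
```

    Because the map from rest position to skinned position is linear, distances between skinned vertices can be bounded across all poses, which is what makes pose-independent metrics like those in the thesis tractable.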