Browsing by Author "Berman, Sonia"
- ItemOpen AccessA mobile application promoting good contact lens practices(2022) Naidoo, Terushka; Berman, SoniaContact lens complications pose an ongoing problem for both optometrists and contact lens wearers. Most of these complications are due to noncompliance with good care practices. Education is the first step to ensuring compliance. If good habits are created on commencement of wear, patients are more likely to continue with these habits or practices. The key, however, is maintenance and building on this education, as we cannot expect patients to remember all the information given to them initially. Telemedicine is rapidly becoming a wide-reaching and convenient way to provide services and support to patients. The aim of this study was to create a mobile application to provide contact lens wearers with knowledge and assistance to empower them to take good care of their eyes and lenses. A mobile application was built for the study with three main features: a lens change reminder, an information feature, and a diagnosis facility to aid contact lens wearers when they encounter any problems. A PDF version of the application was also created with the latter two features; a secondary aim was to compare its success with that of the mobile application. After receiving ethical clearance for the study, lens wearers who signed the Informed Consent form were surveyed about their symptoms, knowledge and habits in relation to contact lenses and their eyes. After being divided into two groups, they were either given the mobile application or the PDF document to use. They were subsequently given a second survey to determine if there were any changes to symptoms, habits and knowledge. They were also questioned about the value and effectiveness of the application and the PDF. Although the results of habit changes were inconclusive, there was a decrease in symptoms after using both the app and the PDF. Both were well received and the majority of participants reported that they would recommend them to others. The mobile application was used more frequently than the PDF, led to a slightly better improvement in knowledge, and scored slightly better in its user evaluation, compared to the PDF.
- ItemOpen AccessAn architecture for secure searchable cloud storage(2012) Koletka, Robert; Hutchison, Andrew; Berman, SoniaCloud Computing is a relatively new and appealing concept; however, users may not fully trust Cloud Providers with their data and can be reluctant to store their files on Cloud Storage Services. The problem is that Cloud Providers allow users to store their information on the provider's infrastructure in compliance with their terms and conditions; however, all security is handled by the provider and generally the details of how this is done are not disclosed. This thesis describes a solution that allows users to securely store data on a public cloud, while also providing a mechanism to allow for searchability through their encrypted data. Users are able to submit encrypted keyword queries and, through a symmetric searchable encryption scheme, the system retrieves a list of files with such keywords contained within the cloud storage medium.
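The abstract does not reproduce the scheme itself, but a minimal sketch of one common symmetric searchable-encryption idea may help illustrate what "encrypted keyword queries" means: the client derives deterministic keyword tokens with an HMAC, the server stores only a token-to-file index, and searches match tokens without revealing the keywords. The function and variable names below are illustrative assumptions, not the architecture described in the thesis.

```python
import hashlib
import hmac
import os
from typing import Dict, List, Set

KEY = os.urandom(32)  # secret key held by the client, never sent to the cloud

def trapdoor(keyword: str, key: bytes = KEY) -> str:
    """Deterministic keyword token; the server only ever sees this digest."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).hexdigest()

def build_index(files: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each keyword token to the file ids that contain that keyword."""
    index: Dict[str, Set[str]] = {}
    for file_id, keywords in files.items():
        for kw in keywords:
            index.setdefault(trapdoor(kw), set()).add(file_id)
    return index

def search(index: Dict[str, Set[str]], keyword: str) -> Set[str]:
    """Client submits only trapdoor(keyword); the server matches it blindly."""
    return index.get(trapdoor(keyword), set())

# Example: the cloud stores encrypted files plus this token index.
index = build_index({"report.docx": ["budget", "q3"], "notes.txt": ["budget"]})
print(search(index, "budget"))   # {'report.docx', 'notes.txt'}
```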
- ItemOpen AccessAutomated feature synthesis on big data using cloud computing resources(University of Cape Town, 2020) Saker, Vanessa; Berman, SoniaThe data analytics process has many time-consuming steps. Combining data that sits in a relational database warehouse into a single relation while aggregating important information in a meaningful way and preserving relationships across relations is complex and time-consuming. This step is exceptionally important as many machine learning algorithms require a single file format as an input (e.g. supervised and unsupervised learning, feature representation and feature learning, etc.). An analyst is required to manually combine relations while generating new, more impactful information points from data during the feature synthesis phase of the feature engineering process that precedes machine learning. Furthermore, the entire process is complicated by Big Data factors such as processing power and distributed data storage. There is an open-source package, Featuretools, that uses an innovative algorithm called Deep Feature Synthesis to accelerate the feature engineering step. However, when working with Big Data, there are two major limitations. The first is the curse of modularity - Featuretools stores data in-memory to process it and thus, if data is large, it requires a processing unit with a large memory. Secondly, the package is dependent on data stored in a Pandas DataFrame. This makes the use of Featuretools with Big Data tools such as Apache Spark a challenge. This dissertation aims to examine the viability and effectiveness of using Featuretools for feature synthesis with Big Data on the cloud computing platform, AWS. Exploring the impact of generated features is a critical first step in solving any data analytics problem. If this can be automated in a distributed Big Data environment with a reasonable investment of time and funds, data analytics exercises will benefit considerably. In this dissertation, a framework for automated feature synthesis with Big Data is proposed and an experiment conducted to examine its viability. Using this framework, an infrastructure was built to support the process of feature synthesis on AWS that made use of S3 storage buckets, Elastic Compute Cloud (EC2) services, and an Elastic MapReduce cluster. A dataset of 95 million customers, 34 thousand fraud cases and 5.5 million transactions across three different relations was then loaded into the distributed relational database on the platform. The infrastructure was used to show how the dataset could be prepared to represent a business problem, and Featuretools used to generate a single feature matrix suitable for inclusion in a machine learning pipeline. The results show that the approach was viable. The feature matrix produced 75 features from 12 input variables and was time efficient with a total end-to-end run time of 3.5 hours and a cost of approximately R 814 (approximately $52). The framework can be applied to a different set of data and allows analysts to experiment on a small section of the data until a final feature set is decided. They are able to easily scale the feature matrix to the full dataset. This ability to automate feature synthesis, iterate and scale up, will save time in the analytics process while providing a richer feature set for better machine learning results.
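For readers unfamiliar with Featuretools, the following is a small sketch of how Deep Feature Synthesis combines a parent and child relation into a single feature matrix. The toy dataframes and primitive choices are assumptions for illustration; the entity-set method names follow the Featuretools 1.x API (older releases use entity_from_dataframe and target_entity), and this plain-pandas sketch does not address the Spark/EMR distribution issues the dissertation tackles.

```python
import pandas as pd
import featuretools as ft  # pip install featuretools

# Tiny stand-ins for the customer and transaction relations used in the study.
customers = pd.DataFrame({"customer_id": [1, 2], "region": ["A", "B"]})
transactions = pd.DataFrame({
    "transaction_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
    "amount": [120.0, 35.5, 900.0],
    "timestamp": pd.to_datetime(["2020-01-01", "2020-01-03", "2020-01-02"]),
})

es = ft.EntitySet(id="fraud_demo")
es.add_dataframe(dataframe_name="customers", dataframe=customers, index="customer_id")
es.add_dataframe(dataframe_name="transactions", dataframe=transactions,
                 index="transaction_id", time_index="timestamp")
es.add_relationship("customers", "customer_id", "transactions", "customer_id")

# Deep Feature Synthesis: stack aggregation primitives across the relationship
# to produce one row of engineered features per customer.
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name="customers",
    agg_primitives=["sum", "mean", "count"],
    max_depth=2,
)
print(feature_matrix.head())
```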
- ItemOpen AccessCommunication tools for distance learning students(2021) Cossa, Adele; Berman, SoniaIn distance learning, ICT tools are used to bridge the instructional gap caused by physical distance between the lecturer and the student. Therefore, more effective communication tools can help to enhance the success of a distance learning curriculum. Communication barriers such as disconnectedness, conceptual confusion and lack of social pressure to perform can negatively affect the success of distance learning. Careful design and implementation of contextually appropriate communication tools is vital in a distance learning curriculum. The University of Cape Town (UCT) Conversion Masters in Information Technology (MIT) originally used a tool called Vula for communication between staff and students, as well as student-to-student communication. Vula is UCT's implementation of the Sakai learning management system. Between 2016 and 2018, a major shift was observed in the adoption and use of communication tools within the programme. There was a noticeable decrease in dialogue between students and lecturers on Vula, and an increase in student-to-student communication using WhatsApp. In 2018, the Slack communication tool was introduced to the MIT degree with the objective of increasing communication and collaboration between students and lecturers. This study investigates the adoption and use of the three communication tools (Vula, WhatsApp and Slack) within the context of the University of Cape Town MIT programme. The research aims to provide an understanding of communication needs and practice that can inform the design of distance learning programmes and enable them to harness the potential of social communication tool features. The study describes the nature of communication within the UCT MIT degree. The research also explores the functional features of the tools and how they are used, and the frequency of interaction on the various communication platforms within the MIT programme. This is complemented by a survey of current MIT students and their perceptions. The research analysed 2605 communication messages in the Vula, Slack and WhatsApp communication tools over the three-year transition period 2016-2018. Feedback from a student survey, in which 11 respondents completed a questionnaire after an interview, is also presented. Based on questionnaire responses from MIT students, Vula is viewed as the best tool for administrative matters, WhatsApp is preferred for sharing information and checking on peers, and Slack is perceived as best for communication with all types of participants - students, lecturers and tutors. Most respondents rated WhatsApp as accessible, convenient and providing a good experience, while far fewer did so for Vula and Slack. WhatsApp was also seen to be the tool students used to reinforce or follow up on communications posted on the other tools.
- ItemOpen AccessDecision tree classifiers for incident call data sets(2017) Igboamalu, Frank Nonso; Berman, SoniaInformation technology (IT) has become one of the key technologies for economic and social development in any organization. Therefore the management of IT incidents, and in particular resolving problems quickly, is of concern to IT managers. Delays can result when incorrect subjects are assigned to IT incident calls, because the person sent to remedy the problem has the wrong expertise or has not brought with them the software or hardware they need to help that user. In the case study used for this work, there are no management checks in place to verify the assigning of incident description subjects. This research aims to develop a method that will tackle the problem of wrongly assigned subjects for incident descriptions. In particular, this study explores the IT incident calls database of an oil and gas company as a case study. The approach was to explore the IT incident descriptions and their assigned subjects; thereafter the correctly-assigned records were used to train decision tree classification algorithms using the Waikato Environment for Knowledge Analysis (WEKA) software. Finally, the records incorrectly assigned a subject by human operators were used for testing. The J48 algorithm gave the best performance and accuracy, and was able to correctly assign subjects to 81% of the records wrongly classified by human operators.
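The study used WEKA's J48 (a C4.5 implementation) in Java; as a rough analogue only, the sketch below shows the same train-on-correctly-labelled, predict-subject-from-description idea using scikit-learn, with TF-IDF features and an entropy-based decision tree. The toy incident texts and labels are invented for illustration and are not the company's data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy incident descriptions and their (correctly assigned) subjects.
descriptions = [
    "outlook not sending email", "cannot print to network printer",
    "vpn connection drops", "printer out of toner", "mailbox over quota",
]
subjects = ["email", "printing", "network", "printing", "email"]

# Train on correctly labelled records, then predict subjects for new calls,
# mirroring the train-on-correct / test-on-misassigned setup described above.
model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("tree", DecisionTreeClassifier(criterion="entropy", random_state=0)),
])
model.fit(descriptions, subjects)
print(model.predict(["user cannot send email from outlook"]))  # e.g. ['email']
```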
- ItemOpen AccessEpistemology as the basis for a corporate memory model(2001) Azbel, Ilan; Berman, SoniaEpistemology is the philosophical study of knowledge; it attempts to understand which beliefs a person can claim to know. This thesis discusses how theories found in epistemology can be used to understand what knowledge an organisation possesses, and shows how epistemic theories can be used as a basis for a corporate memory model. Work in artificial intelligence knowledge representations and corporate memory knowledge representations is compared with ideas found in epistemology. A corporate memory model based on epistemology is proposed, and it is shown how this model can be used as a basis for development of corporate memory systems.
- ItemOpen AccessExpert system adjudication of hospital data in HIV disease management(2012) Joseph, Asma; Berman, SoniaHIV Disease Management Programs (DMPs) are comprehensive programs that are designed to manage the HIV-infected patient's treatment in an integrated manner.
- ItemOpen AccessA framework for building spatiotemporal applications in Java(2001) Voges, Erik; Berman, SoniaA relatively new area of study that first received attention during the past decade explores how spatiotemporal data can be efficiently used in applications and stored in databases. Spatial data includes the locations or positions, and possibly also the size and orientation of physical objects. Temporal data is time-stamped, i.e. every piece of temporal data is associated with at least one point or interval in time. Spatiotemporal data is both spatial and temporal. It would denote, for example, where a particular object was at a given time, or when an object had a certain size. There is a need for many spatial applications to have temporal functionality added. Often such an approach is fraught with problems, since the existing designs were specifically tailored for spatial applications, and changing those designs invariably leads to poorer performance or some other compromises. Because it is not easy to change an existing spatial or temporal system to a spatiotemporal system, it makes sense for developers to start building new systems that are optimised for handling spatiotemporal data. Building an application from the ground up is a daunting task, but there are powerful technologies in existence that can expedite the process. One such technology is found in persistent programming languages, which relieves the application builder of the task of storing and retrieving data (e.g. on a database or file). Persistence technology was created with the aim of making complex applications such as Geographic Information Systems (GIS), Computer Aided Design (CAD) systems and Computer Aided Software Engineering (CASE) software easier to create and maintain. When using a persistent language, the same data types and operations are used for both transient and database data - there is no need for code to translate between store and memory formats or to transfer objects between memory and disk. In contrast, programmers using conventional databases need to be able to program in two languages: the database language (like SQL) and a programming language (like C++ or Java). To date there has been little work done to actually apply persistence technology to the GIS domain [MIA96, SVZ98]. One problem with a persistent programming approach in this context is that one has to spend a great deal of time to work out how spatiotemporal data is going to be dealt with in the application, as noted in trying to build a PJama (Persistent Java) system for the local Sea Fisheries Research Institute in our department [SVZ98]. How will spatiotemporal data be represented, what methods are necessary for manipulating the data, and where would spatiotemporal data and methods be positioned in the overall structure of the program? This thesis aims at developing a framework that can be used to make persistent spatiotemporal systems a viable alternative to conventional GIS applications. The approach adopted involves finding a suitable data model, implementing this as a PJama (Persistent Java) class library, extending this with structures to improve performance, and evaluating the result.
- ItemOpen AccessGarbage collection of the PLAVA object store(2002) Schulz, Michael F; Berman, SoniaThis research investigates the implementation of suitable garbage collection schemes for a persistent store. The store that formed the basis for these experiments was the one developed for a Java Virtual Machine known as PLaVa, designed by Stephan Tjasink at the University of Cape Town [Tja99]. Two garbage collection schemes were implemented: semispace copying and a partitioned collection scheme due to Maheshwari and Liskov [Mah97, ML97]. Semispace copying was implemented to gain background knowledge of how the PLaVa store operates and how a simple garbage collection scheme could be developed for it. This work is then extended by implementing an incremental garbage collection scheme. Both implementations required modifications to the PLaVa store, in order to support semispace and partitioning algorithms. The partitioned collection scheme in particular required almost a complete re-implementation of the store. To evaluate the implemented garbage collection schemes, a synthetic application was developed, which allowed the fine tuning of specific parameters. This facilitated the evaluation of specific features and mechanisms used in both schemes, so that bottlenecks within the implementation could be determined. The evaluation results show that both garbage collection schemes are suitable for small to medium stores with reasonable amounts of idle time. Semispace copying overhead is linearly proportional to the amount of live data that exists during copying (with store size having far less of an impact), but requires stores to be twice as large. Incremental garbage collection using the partitioned store method performs well under most store configurations, but becomes an ever-increasing bottleneck as the number of inter-partition references increases. It is thus sensitive to the object placement scheme being used as well as the partition selection policy.
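As a point of reference, the semispace (Cheney-style) copying idea can be sketched in a few lines of Python. This is only an illustration of the general algorithm the thesis implemented; it works over in-memory dictionaries rather than the PLaVa persistent store, and all class and method names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Obj:
    """A heap object: a payload plus references to other object ids."""
    payload: str
    refs: List[int] = field(default_factory=list)

class SemispaceHeap:
    """Toy Cheney-style semispace copying collector over dict-based spaces."""

    def __init__(self) -> None:
        self.space: Dict[int, Obj] = {}   # current from-space
        self.next_id = 0

    def alloc(self, payload: str, refs: List[int] = ()) -> int:
        oid, self.next_id = self.next_id, self.next_id + 1
        self.space[oid] = Obj(payload, list(refs))
        return oid

    def collect(self, roots: List[int]) -> List[int]:
        """Copy everything reachable from the roots into a fresh to-space."""
        to_space: Dict[int, Obj] = {}
        forward: Dict[int, int] = {}      # from-space id -> to-space id

        def copy(oid: int) -> int:
            if oid not in forward:        # first visit: reserve a to-space slot
                forward[oid] = len(to_space)
                src = self.space[oid]
                to_space[forward[oid]] = Obj(src.payload, list(src.refs))
            return forward[oid]

        new_roots = [copy(r) for r in roots]
        scan = 0                          # Cheney scan pointer over to-space
        while scan < len(to_space):
            obj = to_space[scan]
            obj.refs = [copy(r) for r in obj.refs]   # copying may grow to-space
            scan += 1
        self.space, self.next_id = to_space, len(to_space)
        return new_roots

# Example: the first object becomes unreachable garbage and is not copied.
heap = SemispaceHeap()
garbage = heap.alloc("garbage")
b = heap.alloc("b")
a = heap.alloc("a", refs=[b])
roots = heap.collect([a])
print(len(heap.space), roots)   # 2 live objects survive
```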
- ItemOpen AccessJoining and aggregating datasets using CouchDB(2018) Smith, Zach; Berman, SoniaData mining typically requires implementing operations that involve cross-cutting entity boundaries and are awkward to implement in document-oriented databases. CouchDB, for example, models entities as documents with highly isolated entity boundaries, on which joins cannot be directly performed. This project shows how join and aggregation can be achieved across entity boundaries in such systems, as encountered for example in the pre-processing and exploration stages of educational data mining. A software stack is presented as a means by which this can be achieved; first, datasets are processed via ETL operations, then MapReduce is used to create indices of ordered and aggregated data. Finally, a CouchDB list function is used to iterate through these indices and perform joins, and to compute aggregated values on joined datasets such as variance and correlations. In terms of the case study, it is shown that the proposed approach to implementing cross-document joins and aggregation is effective and scalable. In addition, it was discovered that for the 2014 - 2016 UCT cohorts, NBT scores correlate better with final grades for the CSC1015F course than do Grade 12 results for English, Science and Mathematics.
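A minimal sketch of the view-building step may help: the design document below indexes grades by course and aggregates them with CouchDB's built-in _stats reduce; a _list function (not shown) can then iterate such ordered indices to perform the joins described above. The server URL, database name and document fields are assumptions for illustration, not the project's actual schema.

```python
import requests  # CouchDB is accessed over plain HTTP

COUCH = "http://admin:secret@localhost:5984"   # assumed local instance
DB = "grades"

# Design document: a map function indexes final grades by course code, and the
# built-in _stats reduce aggregates count/sum/min/max/sumsqr per key.
design = {
    "views": {
        "grade_by_course": {
            "map": "function (doc) {"
                   "  if (doc.type === 'grade') { emit(doc.course, doc.final_grade); }"
                   "}",
            "reduce": "_stats",
        }
    }
}
requests.put(f"{COUCH}/{DB}/_design/stats", json=design).raise_for_status()

# Query the view grouped by course; a _list function could then walk this
# ordered index and join it against another view's output.
rows = requests.get(
    f"{COUCH}/{DB}/_design/stats/_view/grade_by_course",
    params={"group": "true"},
).json()["rows"]
for row in rows:
    stats = row["value"]
    print(row["key"], stats["count"], stats["sum"] / stats["count"])
```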
- ItemOpen AccessP-Pascal : a data-oriented persistent programming language(1991) Berman, Sonia; MacGregor, KenPersistence is measured by the length of time an object is retained and is usable in a system. Persistent languages extend general purpose languages by providing the full range of persistence for data of any type. Moreover, data which remains on disk after program termination, is manipulated in the same way as transient data. As these languages are based on general purpose programming languages, they tend to be program-centred rather than data-centred. This thesis investigates the inclusion of data-oriented features in a persistent programming language. P-Pascal, a Persistent Pascal, has been designed and implemented to develop techniques for data clustering, metadata maintenance, security enforcement and bulk data management. It introduces type completeness to Pascal and in particular shows how a type-complete set constructor can be provided. This type is shown to be a practical and versatile mechanism for handling bulk data collections in a persistent environment. Relational algebra operators are provided and the automatic optimisation of set expressions is performed by the compiler and the runtime system. The P-Pascal Abstract Machine incorporates two complementary data placement strategies, automatic updating of type information, and metadata query facilities. The protection of data types, primary (named) objects and their individual components is supported. The challenges and opportunities presented by the persistent store organisation are discussed, and techniques for efficiently exploiting these properties are proposed. We also describe the effects on a data-oriented system of treating persistent and transient data alike, so that they cannot be distinguished statically. We conclude that object clustering, metadata maintenance and security enforcement can and should be incorporated in persistent programming languages. The provision of a built-in, type-complete bulk data constructor and its non-procedural operators is demonstrated. We argue that this approach is preferable to engineering such objects on top of a language, because of greater ease of use and considerable opportunity for automatic optimisation. The existence of such a type does not preclude programmers from constructing their own bulk objects using other types - this is but one advantage of a persistent language over a database system.
- ItemOpen AccessPeer-to-peer systems for simple and flexible information sharing(2009) Pather, Suhendran; Berman, SoniaPeer to peer computing (P2P) is an architecture that enables applications to access shared resources, with peers having similar capabilities and responsibilities. The ubiquity of P2P computing and its increasing adoption for a decentralized data sharing mechanism have fueled my research interests. P2P networks are useful for sharing content files containing audio, video, and data. This research aims to address the problem of simple and flexible access to data from a variety of data sources across peers with different operating systems, databases and hardware. The proposed architecture makes use of SQL queries, web services, heterogeneous database servers and XML data transformation for the peer to peer data sharing prototype. SQL queries and web services provide a data sharing mechanism that allows both simple and flexible data access.
- ItemOpen AccessPredicting grade progression within the Limpopo Education System(2018) Ramphele, Frans; Berman, SoniaOne way to improve education in South Africa is to ensure that additional support and resourcing are provided to schools and learners that are most in need of help. To this end, education officials need to understand the factors affecting learning and the schools most in need of appropriate interventions. Several theories, models and methods have been developed to attempt to address the challenges faced in the education sector. Educational Data Mining (EDM) is one which has gained prominence in addressing these challenges. EDM is a field of data mining using mathematical and machine learning models to improve learners' performance, education administration, and policy formulation. This study explored the literature and related methodologies used within the EDM context and constructed a solution to improve learner support and planning in the Limpopo primary and secondary schools education system. The data utilized included socio-economic environment, demographic information as well as learners' performance sourced from the Education Management Information Systems database of the Limpopo Department of Education (LDoE). Feature selection methods (Information Gain, Correlation and Asymmetrical Uncertainty) were combined to determine factors that affect learning. Three machine learning classifiers, AdaboostM1 (Decision Stump), HoeffdingTree and NaïveBayes, were used to predict learners' grade progression. These were compared using several evaluation metrics and HoeffdingTree outperformed AdaboostM1 (Decision Stump) and NaïveBayes. When the final HoeffdingTree model was applied to the test datasets, the performance was exceptionally good. It is hoped that the implementation of this model will assist the LDoE in its role of supporting learning and planning of resource allocation.
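The classifiers were compared in WEKA; as a hedged analogue, the snippet below compares boosted stumps and Naive Bayes under the same cross-validated metric using scikit-learn (which has no Hoeffding tree, so an ordinary decision tree stands in for it here). The synthetic data and metric choice are assumptions, not the LDoE dataset or the study's exact evaluation protocol.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the (confidential) learner records.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    # AdaBoost's default base estimator is a depth-1 tree, i.e. a decision stump.
    "AdaBoostM1 (stump)": AdaBoostClassifier(n_estimators=50, random_state=0),
    "NaiveBayes": GaussianNB(),
    "Tree (Hoeffding stand-in)": DecisionTreeClassifier(random_state=0),
}

# Compare the candidates with the same cross-validated metric, mirroring the
# model-selection step described above.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```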
- ItemOpen AccessPredicting household poverty with machine learning methods: the case of Malawi(2022) Chinyama, Francis; Berman, SoniaPoverty alleviation continues to be paramount for developing countries. This necessitates poverty tracking tools to monitor progress towards this goal and effect timely interventions. One major way poverty has been tracked in Malawi is by carrying out integrated household surveys every five years to quantify poverty at local and national levels. However, such surveys have been documented as very expensive, tedious, and sparsely administered by many low- and middle-income countries. Therefore, this study looked at whether machine learning models can be used on existing survey data to predict poor and non-poor households, and whether these models can predict poverty using a smaller number of features than those collected in integrated household surveys. This was achieved by comparing the performance of three off-the-shelf, open-source machine learning classification algorithms, namely Logistic Regression, Extra Gradient Boosting Machine and Light Gradient Boosting Machine, in correctly predicting poor and non-poor households from Malawi survey data. These supervised learning algorithms were trained using 10-fold cross-validation. The experiments were carried out on the full panel of features which represent all the questions asked in a household survey, as well as on smaller feature subsets. The Filter method and SHapley Additive exPlanations (SHAP) method were used to rank the importance of the features, and smaller data subsets were selected based on these rankings. The highest prediction accuracy achieved for the full panel data set of 486 features was 87%. When the Filter method rankings were used, the models' prediction accuracy dropped to 63% for the top 50 features subset. However, using the SHAP method rankings, the maximum prediction accuracy level was maintained and only dropped slightly to 86% with the top 50 feature subset; to 84% with the top 20 features; and 73% for the top 10 features. Area under the Curve, Receiver Operating Characteristic curve, recall, precision, F1 score, Matthews Correlation Coefficient and Cohen's Kappa scores confirmed the classification models' reliability. The study, therefore, established that poverty can be predicted by open-source machine learning algorithms using a substantially reduced number of features with accuracy comparable to using the full feature set. A policy recommendation is to employ only the top explanatory features in surveys. This will enable shorter, lower-cost surveys that can be administered more frequently. The aim is to assist policymakers and aid organisations to make more timely interventions with better targeting of the poorest.
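A condensed sketch of the SHAP-based feature-reduction loop described above, using LightGBM and the shap package on synthetic data: rank features by mean absolute SHAP value, keep the top 20, and compare cross-validated accuracy against the full feature set. Column names, data and parameters are placeholders, not the Malawi survey variables, and SHAP's output shape varies slightly across library versions.

```python
import numpy as np
import pandas as pd
import shap                      # pip install shap lightgbm
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the household survey features (the real data has 486).
X, y = make_classification(n_samples=3000, n_features=100, n_informative=15,
                           random_state=0)
X = pd.DataFrame(X, columns=[f"q{i}" for i in range(X.shape[1])])

# Fit the full model, then rank features by mean absolute SHAP value.
model = LGBMClassifier(random_state=0).fit(X, y)
sv = shap.TreeExplainer(model).shap_values(X)
sv = sv[1] if isinstance(sv, list) else sv       # older SHAP returns one array per class
importance = np.abs(sv).mean(axis=0)
if importance.ndim > 1:                           # newer SHAP may keep a class axis
    importance = importance.mean(axis=-1)
top20 = X.columns[np.argsort(importance)[::-1][:20]]

# Retrain on the reduced questionnaire and check how much accuracy is retained,
# mirroring the top-50 / top-20 / top-10 experiments described above.
full = cross_val_score(LGBMClassifier(random_state=0), X, y,
                       cv=10, scoring="accuracy").mean()
reduced = cross_val_score(LGBMClassifier(random_state=0), X[top20], y,
                          cv=10, scoring="accuracy").mean()
print(f"full: {full:.3f}  top-20: {reduced:.3f}")
```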
- ItemOpen AccessSchema matching in a peer-to-peer database system(2006) Rouse, Colin; Berman, SoniaPeer-to-peer or P2P systems are applications that allow a network of peers to share resources in a scalable and efficient manner. My research is concerned with the use of P2P systems for sharing databases. To allow data mediation between peers' databases, schema mappings need to exist, which are mappings between semantically equivalent attributes in different peers' schemas. Mappings can either be defined manually or found semi-automatically using a technique called schema matching. However, schema matching has not been used much in dynamic environments, such as P2P networks. Therefore, this thesis investigates how to enable effective semi-automated schema matching within a P2P network.
- ItemOpen AccessThe semantic database model as a basis for an automated database design tool(1983) Berman, Sonia; MacGregor, KenThe automatic database design system is a design aid for network database creation. It obtains a requirements specification from a user and generates a prototype database. This database is compatible with the Data Definition Language of DMS 1100, the database system on the Univac 1108 at the University of Cape Town. The user interface has been constructed in such a way that a computer-naive user can submit a description of his organisation to the system. Thus it constitutes a powerful database design tool, which should greatly alleviate the designer's tasks of communicating with users, and of creating an initial database definition. The requirements are formulated using the semantic database model, and semantic information in this model is incorporated into the database as integrity constraints. A relation scheme is also generated from the specification. As a result of this research, insight has been gained into the advantages and shortcomings of the semantic database model, and some principles for 'good' data models and database design methodologies have emerged.
- ItemOpen AccessSemi-automatic matching of semi-structured data updates(2014) Forshaw, Gareth William; Berman, SoniaData matching, also referred to as data linkage or field matching, is a technique used to combine multiple data sources into one data set. Data matching is used for data integration in a number of sectors and industries, from politics and health care to scientific applications. The motivation for this study was the observation of the day-to-day struggles of a large non-governmental organisation (NGO) in managing their membership database. With a membership base of close to 2.4 million, the challenges they face with regard to the capturing and processing of the semi-structured membership updates are monumental. Updates arrive from the field in a multitude of formats, often incomplete and unstructured, and expert knowledge is geographically localised. These issues are compounded by an extremely complex organisational hierarchy and a general lack of data validation processes. An online system was proposed for pre-processing input and then matching it against the membership database. Termed the Data Pre-Processing and Matching System (DPPMS), it allows for single or bulk updates. Based on the success of the DPPMS with the NGO's membership database, it was subsequently used for pre-processing and data matching of semi-structured patient and financial customer data. Using the semi-automated DPPMS rather than a clerical data matching system, true positive matches increased by 21% while false negative matches decreased by 20%. The Recall, Precision and F-Measure values all improved and the risk of false positives diminished. The DPPMS was unable to match approximately 8% of provided records; this was largely due to human error during initial data capture. While the DPPMS greatly diminished the reliance on experts, their role remained pivotal during the final stage of the process.
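The DPPMS itself is not reproduced here, but a toy sketch of the general idea, normalising fields, scoring candidate records by string similarity, and routing results to auto-accept or manual review by threshold, may clarify what semi-automatic matching involves. Names, fields and thresholds are illustrative assumptions, not the NGO's data or the system's actual matching rules.

```python
from difflib import SequenceMatcher

members = [
    {"id": 1, "name": "Thandi Nkosi", "branch": "Gauteng North"},
    {"id": 2, "name": "T. Nkosi", "branch": "Limpopo West"},
    {"id": 3, "name": "Peter van Wyk", "branch": "Gauteng North"},
]

def normalise(value: str) -> str:
    """Cheap pre-processing: lowercase and collapse whitespace."""
    return " ".join(value.lower().split())

def similarity(update: dict, member: dict) -> float:
    """Average string similarity over the fields present in the update."""
    fields = [f for f in ("name", "branch") if update.get(f)]
    return sum(
        SequenceMatcher(None, normalise(update[f]), normalise(member[f])).ratio()
        for f in fields
    ) / len(fields)

def match(update: dict, accept: float = 0.85, review: float = 0.6):
    """Auto-accept confident matches; queue borderline ones for an expert."""
    best = max(members, key=lambda m: similarity(update, m))
    score = similarity(update, best)
    if score >= accept:
        return "accept", best, score
    if score >= review:
        return "manual-review", best, score
    return "no-match", None, score

print(match({"name": "Thandi  NKOSI", "branch": "gauteng north"}))
```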
- ItemOpen AccessSSDE : structured software development environment(1990) Norman, Michael John; Berman, SoniaSoftware engineers have identified many problem areas regarding the development of software. There is a need for improving system and program quality at design level, ensuring that design costs remain within the budget, and increasing the productivity of designers. Structured Software Development Environment (SSDE) provides the system designer with an interactive menu-driven environment, and a framework within which he can conveniently express and manipulate his proposed solution. This representation is in terms of both a conceptual model and a detailed software logic definition. Thus SSDE provides tools for both high-level (or logical) and low-level (or physical) design. It allows a user to follow his own preferred methodology rather than restricting him to one specific strategy. SSDE builds and maintains databases that record all design decisions. It provides the system designer with a mechanism whereby systems can easily be modified and new systems can evolve from similar existing systems. There are several auxiliary facilities as productivity aids. SSDE generates PASCAL code for low-level design constructs, full documentation of both the high- and low-level designs for inclusion in the project file, as well as a skeleton manual. The system was evaluated by a number of independent users. This exercise clearly demonstrated its success as an aid in expressing, understanding, manipulating and solving software development problems.
- ItemOpen AccessUbiquitous intelligence for smart cities: a public safety approach(2017) Isafiade, Omowunmi Elizabeth; Berman, Sonia; Bagula, Bigomokero AntoineCitizen-centered safety enhancement is an integral component of public safety and a top priority for decision makers in a smart city development. However, public safety agencies are constantly faced with the challenge of deterring crime. While most smart city initiatives have placed emphasis on the use of modern technology for fighting crime, this may not be sufficient to achieve a sustainable safe and smart city in a resource constrained environment, such as in Africa. In particular, crime series, which is a set of crimes considered to have been committed by the same offender, is currently less explored in developing nations and has great potential in helping to fight against crime and promoting safety in smart cities. This research focuses on detecting the situation of crime through data mining approaches that can be used to promote citizens' safety, and assist security agencies in knowledge-driven decision support, such as crime series identification. While much research has been conducted on crime hotspots, not enough has been done in the area of identifying crime series. This thesis presents a novel crime clustering model, CriClust, for crime series pattern (CSP) detection and mapping to derive useful knowledge from a crime dataset, drawing on sound scientific and mathematical principles, as well as assumptions from theories of environmental criminology. The analysis is augmented using a dual-threshold model, and pattern prevalence information is encoded in similarity graphs. Clusters are identified by finding highly-connected subgraphs using adaptive graph size and Monte-Carlo heuristics in the Karger-Stein mincut algorithm. We introduce two new interest measures: (i) Proportion Difference Evaluation (PDE), which reveals the propagation effect of a series and dominant series; and (ii) Pattern Space Enumeration (PSE), which reveals underlying strong correlations and defining features for a series. Our findings on an experimental quasi-real data set, generated based on expert knowledge recommendation, reveal that identifying CSP and statistically interpretable patterns could contribute significantly to strengthening public safety service delivery in a smart city development. Evaluation was conducted to investigate: (i) the reliability of the model in identifying all inherent series in a crime dataset; (ii) the scalability of the model with varying crime records volume; and (iii) unique features of the model compared to competing baseline algorithms and related research. It was found that the Monte Carlo technique and adaptive graph size mechanism for crime similarity clustering yield substantial improvement. The study also found that proportion estimation (PDE) and PSE of series clusters can provide valuable insight into crime deterrence strategies. Furthermore, visual enhancement of clusters using graphical approaches to organising information and presenting a unified viable view promotes a prompt identification of important areas demanding attention. Our model particularly attempts to preserve desirable and robust statistical properties.
This research presents considerable empirical evidence that the proposed crime cluster (CriClust) model is promising and can assist in deriving useful crime pattern knowledge, contributing knowledge services for public safety authorities and intelligence gathering organisations in developing nations, thereby promoting a sustainable "safe and smart" city.
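For orientation, Karger-style randomised contraction (the basis of the Karger-Stein mincut step mentioned above) can be sketched as follows: repeatedly contract random edges of a similarity graph until two supernodes remain, and keep the smallest cut over many Monte-Carlo trials. This toy version omits CriClust's dual thresholds, adaptive graph size and interest measures; the edge data and function names are assumptions for illustration only.

```python
import random
from collections import defaultdict

def karger_min_cut(edges, trials=100, seed=0):
    """Repeat random edge contraction; keep the smallest cut found."""
    rng = random.Random(seed)
    best_cut, best_groups = float("inf"), None
    for _ in range(trials):
        label = {v: v for e in edges for v in e}     # supernode label per vertex
        remaining = list(edges)                      # edges crossing supernodes
        while len(set(label.values())) > 2:
            u, v = rng.choice(remaining)
            lu, lv = label[u], label[v]
            if lu == lv:
                continue
            for w in label:                          # contract lv into lu
                if label[w] == lv:
                    label[w] = lu
            remaining = [e for e in remaining if label[e[0]] != label[e[1]]]
        cut = sum(1 for a, b in edges if label[a] != label[b])
        if cut < best_cut:
            groups = defaultdict(list)
            for w, l in label.items():
                groups[l].append(w)
            best_cut, best_groups = cut, list(groups.values())
    return best_cut, best_groups

# Two tight crime clusters joined by a single weak similarity edge.
edges = [("c1", "c2"), ("c2", "c3"), ("c1", "c3"),
         ("c4", "c5"), ("c5", "c6"), ("c4", "c6"),
         ("c3", "c4")]                 # the bridge the mincut should sever
print(karger_min_cut(edges))           # e.g. (1, [['c1', 'c2', 'c3'], ['c4', 'c5', 'c6']])
```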
- ItemOpen AccessUser perception of gaming element effectiveness in a corporate learning application(2017) Arnold, Henry; Berman, SoniaThis Conversion Masters in Information Technology thesis gathered users' perceptions about eight gaming elements to determine their effectiveness on aspects of playability, enjoyment and intrinsic motivation needed in a gamified corporate learning application. The study focused on user opinions about a Progress Bar, Individual Leaderboard, Departmental Leaderboard, Timer, In-Game Currency, Badges, Storyline/Theme and Avatar. A gamification application containing these gaming elements was designed and developed to make the evaluation. The application entailed users learning four Information Technology Infrastructure Library (ITIL) processes needed to manage an information technology department in a telecommunications company. The application design process considered the business goals, rules, target behaviours, time limits, rewards, feedback, levels, storytelling, interest, aesthetics, replay or do-overs, user types, activity cycles, fun mechanisms and development tools needed to create a coherent, addictive, engaging and fun user experience. Player types were determined using the Brainhex online survey. Federoff's Game Playability Heuristics model was used to measure the users' perceptions about the playability of the application. Sweetser and Wyeth's Gameflow model was used to measure perceptions about the gaming elements' contribution toward creating an enjoyable experience. Malone and Lepper's Taxonomy of Intrinsic Motivation for Learning was used to measure the gaming elements' ability to promote an intrinsically motivating learning environment. Masterminds, Achievers, Conquerors and Seekers were the most prominent player types found in the Brainhex online survey for which the gamification application design then catered. The staff in the department play-tested the application to evaluate the gaming elements. Overall the Storyline/Theme, suited to Seekers and Masterminds, ranked as the most effective gaming element in this study. The users perceived artwork as an essential component of a gamified learning application. The Individual Leaderboard, suited to Conquerors, ranked very closely as the second most effective gaming element. The Storyline/Theme and Individual Leaderboard both performed the strongest against the criteria measuring the playability. The Storyline/Theme was by far the strongest from a gameflow perspective and the Individual Leaderboard from a motivation perspective. The Avatars ranked the worst across all the measurement criteria. Based on quiz results, 86 percent of the staff in the department had learned the material from the gamified training prototype developed in this work. The findings from this study will therefore serve as input for developing a full-scale gamification learning application.