Browsing by Subject "Big Data"
Now showing 1 - 4 of 4
- [Open Access] Automated feature synthesis on big data using cloud computing resources (University of Cape Town, 2020). Saker, Vanessa; Berman, Sonia.
  The data analytics process has many time-consuming steps. Combining data that sits in a relational database warehouse into a single relation, while aggregating important information in a meaningful way and preserving relationships across relations, is complex and time-consuming. This step is especially important because many machine learning algorithms require a single file as input (e.g. for supervised and unsupervised learning, feature representation and feature learning). During the feature synthesis phase of the feature engineering process that precedes machine learning, an analyst must manually combine relations while generating new, more informative data points. The entire process is further complicated by Big Data factors such as limited processing power and distributed data storage. An open-source package, Featuretools, uses an innovative algorithm called Deep Feature Synthesis to accelerate the feature engineering step. When working with Big Data, however, it has two major limitations. The first is the curse of modularity: Featuretools processes data in memory, so large datasets demand a processing unit with correspondingly large memory. Secondly, the package depends on data stored in a Pandas DataFrame, which makes using Featuretools with Big Data tools such as Apache Spark a challenge. This dissertation examines the viability and effectiveness of using Featuretools for feature synthesis with Big Data on the AWS cloud computing platform. Exploring the impact of generated features is a critical first step in solving any data analytics problem; if this can be automated in a distributed Big Data environment with a reasonable investment of time and funds, data analytics exercises will benefit considerably.
  In this dissertation, a framework for automated feature synthesis with Big Data is proposed and an experiment conducted to examine its viability. Using this framework, an infrastructure was built on AWS to support feature synthesis, making use of S3 storage buckets, Elastic Compute Cloud (EC2) instances, and an Elastic MapReduce (EMR) cluster. A dataset of 95 million customers, 34 thousand fraud cases and 5.5 million transactions across three relations was then loaded into the platform's distributed relational database. The infrastructure was used to show how the dataset could be prepared to represent a business problem, and how Featuretools could generate a single feature matrix suitable for inclusion in a machine learning pipeline. The results show that the approach was viable: the pipeline produced 75 features from 12 input variables, with a total end-to-end run time of 3.5 hours at a cost of approximately R814 (about $52). The framework can be applied to other datasets and allows analysts to experiment on a small subset of the data until a final feature set is decided, then easily scale the feature matrix to the full dataset. This ability to automate feature synthesis, iterate, and scale up will save time in the analytics process while providing a richer feature set for better machine learning results.
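The core operation the abstract describes (aggregating a child relation and joining the result onto its parent to produce a single feature matrix) can be sketched in plain Python. This is an illustration of what feature synthesis automates, not the dissertation's actual schema or Featuretools itself; the table names, fields and aggregate features below are hypothetical:

```python
from collections import defaultdict
from statistics import mean

# Toy stand-ins for a parent (customers) and child (transactions) relation.
customers = [
    {"customer_id": 1, "region": "WC"},
    {"customer_id": 2, "region": "GP"},
]
transactions = [
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 1, "amount": 50.0},
    {"customer_id": 2, "amount": 75.0},
]

def synthesise_features(customers, transactions):
    """Aggregate the child relation per customer and join the results
    onto the parent relation, yielding one feature row per customer."""
    grouped = defaultdict(list)
    for t in transactions:
        grouped[t["customer_id"]].append(t["amount"])
    matrix = []
    for c in customers:
        amounts = grouped.get(c["customer_id"], [])
        matrix.append({
            **c,  # keep the parent relation's own attributes
            "COUNT(transactions)": len(amounts),
            "SUM(transactions.amount)": sum(amounts),
            "MEAN(transactions.amount)": mean(amounts) if amounts else 0.0,
        })
    return matrix

feature_matrix = synthesise_features(customers, transactions)
```

Deep Feature Synthesis generalises this idea by stacking such aggregation and transformation primitives across every relationship in an entity set, which is how 12 input variables can fan out into 75 generated features.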
- [Open Access] Explaining the Big Data adoption decision in Small and Medium Sized Enterprises: Cape Town case studies (2022). Matross, Lonwabo; Seymour, Lisa.
  Problem statement: Small and Medium-Sized Enterprises (SMEs) play an integral role in the economies of developed and developing countries. SMEs constantly search for innovative technologies that will not only reduce their overhead costs but also improve product development, customer relations and profitability. The literature reveals that some SMEs around the world have adopted a relatively new technology, Big Data, to achieve higher levels of operational efficiency. It is therefore worth examining why some organizations in developing countries such as South Africa are not adopting this technology at the rate seen in developed countries. A large portion of the available literature reveals a general lack of in-depth information on, and understanding of, Big Data amongst SMEs in developing countries such as South Africa. The main objective of this study is to explain the factors that SMEs consider during the Big Data adoption decision process. Purpose of the study: This study aimed to identify the factors that South African SMEs consider important in their decision-making around the adoption of Big Data. The researcher used the conceptual framework proposed by Frambach and Schillewaert to derive an updated and adapted conceptual framework explaining the factors SMEs consider when adopting Big Data. Research methodology: SMEs located in the Western Cape province of South Africa were chosen as the case studies. The interpretive research philosophy formed the basis of this research, and the nature of the phenomenon under investigation made a qualitative research method and design appropriate. Due to constraints such as limited time and financial resources, this was a cross-sectional study.
  The research strategy was multiple in-depth case studies. The researcher used two methods to collect data: a primary research method, which yielded rich data for answering the primary research questions, and a secondary research method, which drew on documents to supplement the primary data. Data was analyzed using NVivo software provided by the University of Cape Town. Key findings: The findings suggest that the process influencing SMEs' decision to adopt Big Data follows three steps: (1) awareness, (2) consideration, and (3) intention. This indicates that for Big Data to be adopted by SMEs there must be organizational readiness to go through the process. The study identified that SMEs' main intention in adopting Big Data is to ensure operational stability, with improved operational efficiency as a supporting sub-theme. The study raises awareness of the process, opportunities and challenges that SMEs, academic researchers, IT practitioners and government need to emphasise in order to improve the adoption of Big Data by SMEs. Value of the study: The study adds value to both academia and industry by providing more insight into the factors SMEs consider in the Big Data adoption decision.
- [Open Access] Mobile phone technology as an aid to contemporary transport questions in walkability, in the context of developing countries (2019). Chege, Wilberforce Wanjau; Zuidgeest, Marcus.
  The emerging global middle class, which is expected to double by 2050, desires more walkable, liveable neighbourhoods, and as distances between work and other amenities increase, cities are becoming less monocentric and more polycentric. African cities could be described as walking cities, given the number of people who walk to their destinations rather than using other means of mobility, yet they are often not walkable. Walking is by far the most popular form of transportation in Africa's rapidly urbanising cities, though often out of necessity rather than choice. Facilitating this primary mode, while curbing the growth of less sustainable mobility, requires special attention to the safety and convenience of walking in a Global South context. To further promote walking as a sustainable mobility option, there is a need to assess the current state of its supporting infrastructure and begin giving it higher priority, focus and emphasis. Mobile phones have emerged as a useful alternative tool for collecting this data and auditing the state of walkability in cities. They reduce the inaccuracies and inefficiencies of recall-based surveys, because smartphone sensors such as GPS provide positional accuracy within about 5 m, offering superior accuracy and precision compared with traditional methods. The data is also spatial in nature, allowing for a range of applications and use cases. Traditional inventory approaches to walkability often revealed the perceived walkability and accessibility of only a subset of journeys. Crowdsourcing the perceived walkability and accessibility of points of interest in African cities could address this, although aspects such as ease of use and road safety must also be considered.
  A tool that crowdsources individual pedestrian experiences, and the availability and state of pedestrian infrastructure and amenities, using state-of-the-art smartphone technology would over time also yield complete surveys of the walking environment, provided such a tool is popular and safe. This research illustrates how mobile phone applications currently on the market can be improved to offer more functionality, factoring in multiple sensory modalities for enhanced visual appeal, ease of use, and aesthetics. The overarching aim of this research is therefore to develop a framework for, and test, a pilot-version mobile phone-based data collection tool that incorporates emerging technologies for collecting data on walkability. The project assesses the effectiveness of the mobile application and tests the technical capabilities of the system to establish how it operates within existing infrastructure. It further investigates the use of mobile phone technology in collecting user perceptions of walkability, and the limitations of current transportation-based mobile applications, with the aim of developing an application that improves on current market offerings. The prototype application will be tested and later piloted in different locations around the globe. Past studies focused primarily on developing transport-based mobile phone applications with basic features and limited functionality. Although limited progress has been made in integrating emerging technologies such as Augmented Reality (AR), Machine Learning (ML) and Big Data analytics into mobile phone applications, what is missing from these past examples is a comprehensive and structured application in the transportation sphere.
  In turn, the full research offers a broader understanding of the information gathered from these smart devices, and of how that large volume of varied data can be interpreted better and faster to discover trends and patterns and to aid decision-making and planning. This project attempts to fill this gap and bring new insights, thereby advancing the field of transportation data collection audits, with particular emphasis on walkability audits. In this regard, the research seeks to provide insights into how such a tool could be applied in assessing and promoting walkability as a sustainable and equitable mobility option. To get policy-makers, analysts, and practitioners in urban transport planning and provision to pay closer attention to making better, more walkable places, appealing to them from an efficiency and business perspective is vital. This crowdsourced data is of great interest to industry practitioners, local governments and research communities as Big Data, and to urban communities and civil society as an input into their advocacy activities. The findings show clear evidence that transport-based mobile phone applications currently on the market are increasingly outdated and are not keeping up with new and emerging technologies and innovations. It is also evident from the results that smartphones have revolutionised the collection of transport-related information, hence the need for new initiatives to take advantage of this emerging opportunity. The implication of these findings is that more attention needs to be paid to this niche going forward.
  This research project recommends further studies, particularly on which technologies and functionalities can realistically be incorporated into mobile phone applications in the near future, and on improving the hardware specifications of mobile devices to facilitate and support these emerging technologies while keeping device costs as low as possible.
- [Open Access] The democratisation of decision-makers in data-driven decision-making in a Big Data environment: The case of a financial services organisation in South Africa (University of Cape Town, 2020). Hassa, Ishmael; Tanner, Maureen; Brown, Irwin.
  Big Data refers to large unstructured datasets from multiple dissimilar sources. Using Big Data Analytics (BDA), insights can be gained that cannot be obtained by other means, allowing better decision-making. Big Data is disruptive, and because it is vast and complex, it is difficult to manage from technological, regulatory, and social perspectives. Big Data can provide decision-makers (knowledge workers) with bottom-up access to information for decision-making, offering potential benefits through the democratisation of decision-makers in data-driven decision-making (DDD). The workforce is enabled to make better decisions, thereby improving participation and productivity. Enterprises that enable DDD are more successful than firms that depend solely on management's perception and intuition. Understanding the links between the key concepts (Big Data, democratisation, and DDD) and decision-makers is important, because the use of Big Data is growing, the workforce is continually evolving, and effective decision-making based on Big Data insights is critical to a firm's competitiveness. This research investigates the influence of Big Data on the democratisation of decision-makers in data-driven decision-making. A Grounded Theory Method (GTM) was adopted due to the scarcity of literature on the interrelationships between the key concepts. An empirical study was undertaken, based on a case study of a large, leading financial services organisation in South Africa. The case study participants were diverse and represented three different departments. GTM facilitates the emergence of novel theory grounded in empirical data.
  Theoretical elaboration of new concepts against existing literature permits comparison of the emergent or substantive theory for similarities, differences, and uniqueness. By applying the GTM principles of constant comparison, theoretical sampling and emergence, decision-makers (people, knowledge workers) became the focal point of the study, rather than organisational decision-making processes or decision support systems. The focus of the thesis is therefore on the democratisation of decision-makers in a Big Data environment. The findings suggest that the influence of Big Data on the democratisation of the decision-maker in relation to DDD depends on the completeness and quality of the Information Systems (IS) artefact. The IS artefact results from, and is comprised of, information extracted from Big Data through Big Data Analytics (BDA) and decision-making indicators (DMI). DMI are contributions of valuable decision-making parameters by actors that include Big Data, people, the organisation, and organisational structures. DMI is an aspect of knowledge management, as it contains both the story behind a decision and the knowledge used to decide. The IS artefact is intended to provide a better and more complete picture of the decision-making landscape, which adds to the confidence of decision-makers and promotes participation in DDD, which in turn exemplifies democratisation of the decision-maker. The main theoretical contribution is therefore that the democratisation of the decision-maker in DDD is based on the completeness of the IS artefact, which is assessed at the democratisation inflection point (DIP): the point at which the decision-maker evaluates the IS artefact. When the IS artefact is complete, meaning that all the parameters pertinent to a decision are available, democratisation of the decision-maker is realised.
  When the IS artefact is incomplete, meaning that some of the parameters pertinent to a decision are unavailable, democratisation of the decision-maker breaks down. The research contributes new knowledge, in the form of a substantive theory grounded in empirical findings, to the academic field of IS. The IS artefact constitutes a contribution to practice: it highlights the importance of the interrelationships and contributions of DMI by actors within an organisation, based on information extracted through BDA, which promote decision-maker confidence and participation in DDD. DMI within the IS artefact are critical to decision-making, and their absence has implications for the democratisation of the decision-maker in DDD. The study has uncovered the need for further investigation into the extent of each actor's contribution (agency) to DMI, the implications of generational characteristics for the adoption and use of Big Data, and an in-depth understanding of the relationships between individual differences, Big Data and decision-making. Research is also recommended to better explain democratisation as it relates to data-driven decision-making processes.