Designing and developing a robust automated log file analysis framework for debugging complex system failure

dc.contributor.advisor: Winberg, Simon
dc.contributor.author: Van Balla, Tyrone Jade
dc.date.accessioned: 2022-06-29T11:19:09Z
dc.date.available: 2022-06-29T11:19:09Z
dc.date.issued: 2022
dc.date.updated: 2022-06-29T11:18:40Z
dc.description.abstract: As engineering and computer systems become larger and more complex, additional challenges around the development, management and maintenance of these systems materialize. While these systems afford greater flexibility and capability, debugging failures that occur during their operation has become more challenging. One such system is the MeerKAT Radio Telescope's Correlator Beamformer (CBF), the signal processing powerhouse of the radio telescope. The majority of software and hardware systems generate log files detailing system operation during runtime. These log files have long been the go-to source of information for engineers when debugging system failures. As these systems become increasingly complex, the log files generated have exploded in both volume and complexity as log messages are recorded for all interacting parts of a system. Manually using log files for debugging system failures is no longer feasible. Recent studies have explored data-driven, automated log file analysis techniques that aim to address this challenge and have focused on two major aspects: log parsing, in which unstructured, free-form text log files are transformed into a structured dataset by extracting a set of event templates that describe the various log messages; and log file analysis, in which data-driven techniques are applied to this structured dataset to model the system behaviour and identify failures. Previous work has yet to address the combination of these two aspects to realize an end-to-end framework for performing automated log file analysis. The objective of this dissertation is to design and develop a robust, end-to-end Automated Log File Analysis Framework capable of analysing log files generated by the MeerKAT CBF to assist in system debugging. The Data Miner, Inference Engine and the complete framework are the major subsystems developed in this dissertation.
State-of-the-art, data-driven approaches to log parsing were considered and the best performing approaches were incorporated into the Data Miner. The Inference Engine implements an LSTM-based multi-class classifier that models the system behaviour and uses this model to perform anomaly detection, identifying failures from log files. The complete framework links these two components together in a software pipeline capable of ingesting unstructured log files and outputting assistive system debugging information. The performance and operation of the framework and its subcomponents are evaluated for correctness on a publicly available, labelled dataset consisting of log files from the Hadoop Distributed File System (HDFS). Because no labelled dataset exists for the MeerKAT CBF, the applicability and usefulness of the framework in that context are evaluated subjectively through a case study. The framework is able to correctly model system behaviour from log files, but anomaly detection performance is greatly affected by the nature and quality of the log files available for tuning and training the framework. When analysing log files, the framework is able to identify anomalous events quickly, even when large log files are considered. While the design of the framework primarily considered the MeerKAT CBF, a robust and generalisable end-to-end framework for automated log file analysis was ultimately developed.
dc.identifier.apacitation: Van Balla, T. J. (2022). Designing and developing a robust automated log file analysis framework for debugging complex system failure. Faculty of Engineering and the Built Environment, Department of Electrical Engineering. Retrieved from http://hdl.handle.net/11427/36572
dc.identifier.chicagocitation: Van Balla, Tyrone Jade. "Designing and developing a robust automated log file analysis framework for debugging complex system failure." Faculty of Engineering and the Built Environment, Department of Electrical Engineering, 2022. http://hdl.handle.net/11427/36572
dc.identifier.citation: Van Balla, T.J. 2022. Designing and developing a robust automated log file analysis framework for debugging complex system failure. Faculty of Engineering and the Built Environment, Department of Electrical Engineering. http://hdl.handle.net/11427/36572
dc.identifier.ris:
TY - Master Thesis
AU - Van Balla, Tyrone Jade
DA - 2022
DB - OpenUCT
DP - University of Cape Town
KW - Engineering
LK - https://open.uct.ac.za
PY - 2022
T1 - Designing and developing a robust automated log file analysis framework for debugging complex system failure
TI - Designing and developing a robust automated log file analysis framework for debugging complex system failure
UR - http://hdl.handle.net/11427/36572
ER -
dc.identifier.uri: http://hdl.handle.net/11427/36572
dc.identifier.vancouvercitation: Van Balla TJ. Designing and developing a robust automated log file analysis framework for debugging complex system failure. Faculty of Engineering and the Built Environment, Department of Electrical Engineering, 2022 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/36572
dc.language.rfc3066: eng
dc.publisher.department: Department of Electrical Engineering
dc.publisher.faculty: Faculty of Engineering and the Built Environment
dc.subject: Engineering
dc.title: Designing and developing a robust automated log file analysis framework for debugging complex system failure
dc.type: Master Thesis
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_ebe_2022_van balla tyrone jade.pdf
Size: 7.73 MB
Format: Adobe Portable Document Format