The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression

dc.contributor.advisorNitschke, Geoff Stuart
dc.contributor.authorOmbura, Martin
dc.date.accessioned2021-02-19T12:49:21Z
dc.date.available2021-02-19T12:49:21Z
dc.date.issued2020
dc.date.updated2021-02-19T12:28:00Z
dc.description.abstractComputer bugs and tests are antagonistic elements of the software development process, with the former attempting to corrupt a program and the latter aiming to identify and fix the introduced faults. The automation of bug identification and repair schemes through automated software testing is an area of research that has only seen success in niche areas of software development but has failed to progress into general areas of computing due to the complexity and diversity of programming languages, codebases and developer coding practices. Unlike traditional engineering fields such as mechanical or civil where project specifications are carefully outlined and built towards, software engineering suffers from a lack of global standardization required to “build from a spec”. In this study we investigate a coevolutionary spec-based approach to dynamically damage and repair programs mathematical programs (functions). We opt for mathematical functions instead of software due to their functional similarities and simpler syntax and semantics. We utilize symbolic regression (SR) as a framework to analyze the error maximized by bugs and minimized by test. We adopt a hybrid evolutionary algorithm (EA) that implements the tree based phenotypic structure of genetic programming (GP) and the list-based chromosome of genetic algorithm (GA) that permits embedding of mathematical tree manipulation (MTM) strategies, as well as adequate selection mechanisms for search. Bugs utilize the MTM strategies in their chromosome to manipulate the input program (IP) with the aim of maximizing the error while tests adopt a set of their own MTM strategies to repair the damaged program using a spec generated from the IP to guide the repair process. Both adversarial agents are investigated in four common coevolutionary topologies, Hall of Fame (HoF), K-Random Tournaments (KRT), Round Robin (RR) and Single Elimination Tournament (SET). We ran 1556 simulations each generating a random polynomial that the bugs and tests would have to contend over in all 4 topologies. We observed that KRT with a low k value of 5 performs best from a computational and fitness standpoint for all bugs and tests. Bugs were dominant in nearly all topologies for all polynomial complexities, whereas tests struggled in the HoF, RR and SET topologies as the input programs became more complex. The competitive landscape however was quite chaotic with the best individuals lasting a maximum of 14 generations out of 300, with the average top individuals lasting only 1 generation. This made predictions on when the best individuals would be born nearly impossible as the coevolutionary landscape changed quite rapidly and non-deterministically. The kinds of MTM strategies selected by both bugs and tests depended on the level of complexity of the input programs. For input programs that had negative polynomials, the best bugs opted to delete the program entirely and build a completely new tree, whereas the best tests were unable to select viable specialized strategies to repair such programs. For programs that had large polynomial degrees, bugs opted for strategies that added nodes their underlying GP tree, in the hopes of damaging the input program more. Tests on the other hand implemented strategies to carefully reduce the complexity of the polynomial. Tests however, frequently overcompensated when attempting to fix the fit bugs, leading to mediocre solutions.
dc.identifier.apacitationOmbura, M. (2020). <i>The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression</i>. (). ,Faculty of Science ,Department of Computer Science. Retrieved from http://hdl.handle.net/11427/32904en_ZA
dc.identifier.chicagocitationOmbura, Martin. <i>"The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression."</i> ., ,Faculty of Science ,Department of Computer Science, 2020. http://hdl.handle.net/11427/32904en_ZA
dc.identifier.citationOmbura, M. 2020. The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression. . ,Faculty of Science ,Department of Computer Science. http://hdl.handle.net/11427/32904en_ZA
dc.identifier.ris TY - Master Thesis AU - Ombura, Martin AB - Computer bugs and tests are antagonistic elements of the software development process, with the former attempting to corrupt a program and the latter aiming to identify and fix the introduced faults. The automation of bug identification and repair schemes through automated software testing is an area of research that has only seen success in niche areas of software development but has failed to progress into general areas of computing due to the complexity and diversity of programming languages, codebases and developer coding practices. Unlike traditional engineering fields such as mechanical or civil where project specifications are carefully outlined and built towards, software engineering suffers from a lack of global standardization required to “build from a spec”. In this study we investigate a coevolutionary spec-based approach to dynamically damage and repair programs mathematical programs (functions). We opt for mathematical functions instead of software due to their functional similarities and simpler syntax and semantics. We utilize symbolic regression (SR) as a framework to analyze the error maximized by bugs and minimized by test. We adopt a hybrid evolutionary algorithm (EA) that implements the tree based phenotypic structure of genetic programming (GP) and the list-based chromosome of genetic algorithm (GA) that permits embedding of mathematical tree manipulation (MTM) strategies, as well as adequate selection mechanisms for search. Bugs utilize the MTM strategies in their chromosome to manipulate the input program (IP) with the aim of maximizing the error while tests adopt a set of their own MTM strategies to repair the damaged program using a spec generated from the IP to guide the repair process. Both adversarial agents are investigated in four common coevolutionary topologies, Hall of Fame (HoF), K-Random Tournaments (KRT), Round Robin (RR) and Single Elimination Tournament (SET). We ran 1556 simulations each generating a random polynomial that the bugs and tests would have to contend over in all 4 topologies. We observed that KRT with a low k value of 5 performs best from a computational and fitness standpoint for all bugs and tests. Bugs were dominant in nearly all topologies for all polynomial complexities, whereas tests struggled in the HoF, RR and SET topologies as the input programs became more complex. The competitive landscape however was quite chaotic with the best individuals lasting a maximum of 14 generations out of 300, with the average top individuals lasting only 1 generation. This made predictions on when the best individuals would be born nearly impossible as the coevolutionary landscape changed quite rapidly and non-deterministically. The kinds of MTM strategies selected by both bugs and tests depended on the level of complexity of the input programs. For input programs that had negative polynomials, the best bugs opted to delete the program entirely and build a completely new tree, whereas the best tests were unable to select viable specialized strategies to repair such programs. For programs that had large polynomial degrees, bugs opted for strategies that added nodes their underlying GP tree, in the hopes of damaging the input program more. Tests on the other hand implemented strategies to carefully reduce the complexity of the polynomial. Tests however, frequently overcompensated when attempting to fix the fit bugs, leading to mediocre solutions. DA - 2020 DB - OpenUCT DP - University of Cape Town KW - computer science LK - https://open.uct.ac.za PY - 2020 T1 - The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression TI - The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression UR - http://hdl.handle.net/11427/32904 ER - en_ZA
dc.identifier.urihttp://hdl.handle.net/11427/32904
dc.identifier.vancouvercitationOmbura M. The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression. []. ,Faculty of Science ,Department of Computer Science, 2020 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/32904en_ZA
dc.language.rfc3066eng
dc.publisher.departmentDepartment of Computer Science
dc.publisher.facultyFaculty of Science
dc.subjectcomputer science
dc.titleThe performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
dc.typeMaster Thesis
dc.type.qualificationlevelMasters
dc.type.qualificationlevelMSc
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
thesis_sci_2020_ombura martin.pdf
Size:
6.1 MB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
0 B
Format:
Item-specific license agreed upon to submission
Description:
Collections