Application of CNN-gcForestCS to cassava leaf image classification

dc.contributor.advisor: Britz, Stefan
dc.contributor.author: Carew, Liam
dc.date.accessioned: 2024-04-04T08:11:46Z
dc.date.available: 2024-04-04T08:11:46Z
dc.date.issued: 2023
dc.date.updated: 2024-04-04T07:24:32Z
dc.description.abstract: Cassava is one of the most consumed carbohydrates in the world, providing a reliable source of income and nutrition to inhabitants of Latin America, Africa and Asia. However, its production is greatly affected by pathogenic infection, with cassava mosaic disease (CMD) posing the greatest threat to cassava farmers in Africa and Asia. Given that developing nations are estimated to be hit hardest by climate change and projected to have the largest population increases in coming decades, optimisation of cassava yield in these areas is imperative to ensure food security. Traditionally, crop health is determined by manual inspection, which can be laborious and error-prone and requires technical expertise. This creates a costly barrier to entry for smallholder farmers, who account for the majority of global cassava production. Automated disease detection systems that use convolutional neural networks (CNNs) and can be deployed on mobile phones have been shown to be a cost-efficient and effective method for cassava monitoring, mainly owing to the advanced feature extraction capabilities of CNNs. However, CNNs require complex hyperparameter tuning and can be computationally intensive to train. GcForestCS (multi-grained cascade forest with confidence screening) presents an alternative statistical learning method that can be trained on a CPU and requires less complex hyperparameter tuning than deep learning, while producing competitive performance for lower-dimensionality datasets. Taking advantage of the feature extraction capabilities of CNNs and the competitive performance of gcForestCS for lower-dimensionality datasets, the central aim of this dissertation was to investigate CNN-gcForestCS as an alternative to deep learning for cassava leaf disease detection.
The performance of CNN-gcForestCS was compared with that of gcForestCS and deep learning, and the effects of class balance, CNN feature extraction, CNN feature extractor fine-tuning, pooling after multi-grained scanning, and training set curation were assessed. The results showed that the best DenseNet201-gcForestCS model (86.79%) performed marginally worse than the best DenseNet201 model (87.43%), while the best MobileNetV2-gcForestCS model (83.66%) performed marginally better than the best MobileNetV2 model (82.87%). Overall, the results are inconclusive as to whether CNN-gcForestCS is a viable alternative to deep learning for cassava leaf disease detection, especially when considering the high computational cost associated with the CNN-gcForestCS methodology.
dc.identifier.apacitation: Carew, L. (2023). Application of CNN-gcForestCS to cassava leaf image classification. University of Cape Town, Faculty of Science, Department of Statistical Sciences. Retrieved from http://hdl.handle.net/11427/39293 [en_ZA]
dc.identifier.chicagocitation: Carew, Liam. "Application of CNN-gcForestCS to cassava leaf image classification." University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2023. http://hdl.handle.net/11427/39293 [en_ZA]
dc.identifier.citation: Carew, L. 2023. Application of CNN-gcForestCS to cassava leaf image classification. University of Cape Town, Faculty of Science, Department of Statistical Sciences. http://hdl.handle.net/11427/39293 [en_ZA]
dc.identifier.ris:
TY - Thesis / Dissertation
AU - Carew, Liam
DA - 2023
DB - OpenUCT
DP - University of Cape Town
KW - Statistical Science
LK - https://open.uct.ac.za
PY - 2023
T1 - ETD: Application of CNN-gcForestCS to cassava leaf image classification
TI - ETD: Application of CNN-gcForestCS to cassava leaf image classification
UR - http://hdl.handle.net/11427/39293
ER - [en_ZA]
dc.identifier.uri: http://hdl.handle.net/11427/39293
dc.identifier.vancouvercitation: Carew L. Application of CNN-gcForestCS to cassava leaf image classification. University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/39293 [en_ZA]
dc.language.rfc3066: eng
dc.publisher.department: Department of Statistical Sciences
dc.publisher.faculty: Faculty of Science
dc.publisher.institution: University of Cape Town
dc.subject: Statistical Science
dc.title: Application of CNN-gcForestCS to cassava leaf image classification
dc.type: Thesis / Dissertation
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_sci_2023_carew liam.pdf
Size: 10.26 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission