Pattern recognition to detect fetal alcohol syndrome using stereo facial images

Master Thesis



University of Cape Town

Fetal alcohol syndrome (FAS) is a condition caused by excessive maternal alcohol consumption during pregnancy. A FAS diagnosis depends on the presence of growth retardation, central nervous system and neurodevelopmental abnormalities, and facial malformations. The facial features that best distinguish children with FAS from those without are a smooth philtrum, a thin upper lip and short palpebral fissures. The facial phenotype associated with FAS can be assessed using methods such as direct facial anthropometry and photogrammetry. The project described here used information obtained from stereo facial images and applied facial shape analysis and pattern recognition to distinguish between children with FAS and control children. Other researchers have reported on identifying FAS through the classification of 2D landmark coordinates and of 3D landmark information in the form of Procrustes residuals. This project built on that work by using 3D information combined with texture as features for facial classification.
Stereo facial images of children were used to obtain the 3D coordinates of those facial landmarks that play a role in defining the FAS facial phenotype. Two datasets were used: the first consisted of facial images of 34 children whose facial shapes had previously been analysed with respect to FAS; the second consisted of a new set of images from 40 subjects. Elastic bunch graph matching was applied to the frontal facial images of the study population to obtain texture information, in the form of jets, around selected landmarks. The 2D coordinates of these landmarks were also extracted during the process. Faces were classified using k-nearest neighbor (kNN), linear discriminant analysis (LDA) and support vector machine (SVM) classifiers. Principal component analysis was used for dimensionality reduction, and classification accuracy was assessed using leave-one-out cross-validation.
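The evaluation scheme named above (principal component analysis for dimensionality reduction, a kNN classifier, and leave-one-out cross-validation) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the data are synthetic, the function names are hypothetical, and the choices of 5 components, k = 3, and refitting the PCA on each training fold are assumptions made only for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training samples (Euclidean)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()

def loocv_accuracy(X, y, n_components=5, k=3):
    """Leave-one-out accuracy; PCA is refit on each training fold."""
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i          # hold out subject i
        mean, axes = pca_fit(X[train], n_components)
        Z_train = (X[train] - mean) @ axes.T
        z_test = (X[i] - mean) @ axes.T
        correct += int(knn_predict(Z_train, y[train], z_test, k) == y[i])
    return correct / n

# Synthetic stand-in data (NOT the thesis data): 34 subjects, 20 features.
rng = np.random.default_rng(0)
y = np.array([0] * 17 + [1] * 17)
X = rng.normal(size=(34, 20))
X[y == 1] += 3.0  # shift one class so the two groups are separable
accuracy = loocv_accuracy(X, y)
print(f"LOOCV accuracy: {accuracy:.3f}")
```

Refitting the projection inside the loop keeps the held-out subject from influencing the features used to classify it, which matters most with samples as small as those described here.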
For dataset 1, using 2D coordinates together with texture information as features during classification produced a best classification accuracy of 72.7% with kNN, 75.8% with LDA and 78.8% with SVM. When the 2D coordinates were replaced by Procrustes residuals (which encode 3D facial shape information), the best classification accuracies were 69.7% with kNN, 81.8% with LDA and 78.6% with SVM. LDA produced the most consistent classification results. The classification accuracies for dataset 2 were lower than for dataset 1. The different conditions during data collection and the possible differences in the ethnic composition of the datasets were identified as likely causes for this decrease in classification accuracy.
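To show what Procrustes residuals are, the sketch below superimposes several 3D landmark configurations by removing translation, scale and rotation, then takes each shape's deviation from the mean shape. It is a simplified generalized Procrustes procedure under stated assumptions: the landmark data are random placeholders, the function names are hypothetical, and the abstract does not specify the superimposition variant used in the thesis.

```python
import numpy as np

def normalize(shape):
    """Center landmarks at the origin and scale to unit centroid size."""
    centered = shape - shape.mean(axis=0)
    return centered / np.linalg.norm(centered)

def align(shape, reference):
    """Rotate a normalized shape onto a normalized reference (no reflection)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    u[:, -1] *= d
    return shape @ (u @ vt)

def procrustes_residuals(shapes, n_iter=10):
    """Align all shapes to an iteratively updated mean; return residuals."""
    shapes = np.array([normalize(s) for s in shapes])
    mean = shapes[0]
    for _ in range(n_iter):
        shapes = np.array([align(s, mean) for s in shapes])
        mean = normalize(shapes.mean(axis=0))
    return shapes - mean, mean

# Placeholder data: one random 6-landmark configuration and two copies that
# differ from it only by rotation, scale and translation.
rng = np.random.default_rng(1)
base = rng.normal(size=(6, 3))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
copies = [base, 2.0 * base @ rot + 5.0, 0.5 * base @ rot.T - 1.0]
residuals, mean_shape = procrustes_residuals(copies)
print(np.abs(residuals).max())  # close to zero: the copies share one shape
```

Because the copies differ only in position, size and orientation, their residuals vanish up to rounding error. With real faces, the residuals that remain after superimposition carry the shape differences that a classifier can use as features.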