Simultaneous clustering with mixtures of factor analysers
Permanent link to this Item
Link to Journal
University of Cape Town
This work details the method of Simultaneous Model-based Clustering. It also presents an extension to this method by reformulating it as a model with a mixture of factor analysers. This allows for the technique, known as Simultaneous Model-Based Clustering with a Mixture of Factor Analysers, to be able to cluster high dimensional gene-expression data. A new table of allowable and non-allowable models is formulated, along with a parameter estimation scheme for one such allowable model. Several numerical procedures are tested and various datasets, both real and generated, are clustered. The results of clustering the Iris data find a 3 component VEV model to have the lowest misclassification rate with comparable BIC values to the best scoring model. The clustering of Genetic data was less successful, where the 2-component model could successfully uncover the healthy tissue, but partitioned the cancerous tissue in half.
O'Donnell, W. 2013. Simultaneous clustering with mixtures of factor analysers. University of Cape Town.