Investigating the effect of paralogs on microarray gene-set analysis

 


dc.contributor.advisor Mulder, Nicola en_ZA
dc.contributor.advisor Seoighe, Cathal en_ZA
dc.contributor.author Faure, André en_ZA
dc.date.accessioned 2014-07-30T17:37:33Z
dc.date.available 2014-07-30T17:37:33Z
dc.date.issued 2008 en_ZA
dc.identifier.citation Faure, A. 2008. Investigating the effect of paralogs on microarray gene-set analysis. University of Cape Town. en_ZA
dc.identifier.uri http://hdl.handle.net/11427/4260
dc.description Includes abstract.
dc.description Includes bibliographical references.
dc.description.abstract In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge from databases such as the Gene Ontology (GO) or KEGG to group genes into sets based on their annotations. They aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. The objective is that this approach reveals sets of genes with subtle but coordinated behaviour implicating specific biological processes or pathways in the response under study. Several GSA methods have been proposed and debates have ensued on the statistical foundations of the different approaches and the various hypothesis tests used. In particular, criticism has been directed at methods that rely on a strict cut-off to determine significant genes and those that assume genes are expressed independently. We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. This, together with the fact that the calculation of gene-set significance by all GSA methods is influenced by the number of genes in the gene set, means that sets with high numbers of paralogs are ranked in a biased manner that reflects more the redundant and dependent nature of paralogs than any biological phenomenon. en_ZA
dc.language.iso eng en_ZA
dc.subject.other Cell Biology en_ZA
dc.title Investigating the effect of paralogs on microarray gene-set analysis en_ZA
dc.type Master Thesis
uct.type.publication Research en_ZA
uct.type.resource Thesis en_ZA
dc.publisher.institution University of Cape Town
dc.publisher.faculty Faculty of Science en_ZA
dc.publisher.department Department of Molecular and Cell Biology en_ZA
dc.type.qualificationlevel Masters
dc.type.qualificationname MSc en_ZA
uct.type.filetype Text
uct.type.filetype Image
dc.identifier.apacitation Faure, A. (2008). Investigating the effect of paralogs on microarray gene-set analysis. (Thesis). University of Cape Town, Faculty of Science, Department of Molecular and Cell Biology. Retrieved from http://hdl.handle.net/11427/4260 en_ZA
dc.identifier.chicagocitation Faure, André. "Investigating the effect of paralogs on microarray gene-set analysis." Thesis, University of Cape Town, Faculty of Science, Department of Molecular and Cell Biology, 2008. http://hdl.handle.net/11427/4260 en_ZA
dc.identifier.vancouvercitation Faure A. Investigating the effect of paralogs on microarray gene-set analysis. [Thesis]. University of Cape Town, Faculty of Science, Department of Molecular and Cell Biology, 2008 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/4260 en_ZA
dc.identifier.ris TY - Thesis / Dissertation AU - Faure, André AB - In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge from databases such as the Gene Ontology (GO) or KEGG to group genes into sets based on their annotations. They aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. The objective is that this approach reveals sets of genes with subtle but coordinated behaviour implicating specific biological processes or pathways in the response under study. Several GSA methods have been proposed and debates have ensued on the statistical foundations of the different approaches and the various hypothesis tests used. In particular, criticism has been directed at methods that rely on a strict cut-off to determine significant genes and those that assume genes are expressed independently. We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. This, together with the fact that the calculation of gene-set significance by all GSA methods is influenced by the number of genes in the gene set, means that sets with high numbers of paralogs are ranked in a biased manner that reflects more the redundant and dependent nature of paralogs than any biological phenomenon. DA - 2008 DB - OpenUCT DP - University of Cape Town LK - https://open.uct.ac.za PB - University of Cape Town PY - 2008 T1 - Investigating the effect of paralogs on microarray gene-set analysis TI - Investigating the effect of paralogs on microarray gene-set analysis UR - http://hdl.handle.net/11427/4260 ER - en_ZA

