Investigating the effect of paralogs on microarray gene-set analysis

dc.contributor.author: Faure, Andre [en_ZA]
dc.contributor.author: Seoighe, Cathal [en_ZA]
dc.contributor.author: Mulder, Nicola [en_ZA]
dc.date.accessioned: 2015-11-11T11:57:26Z
dc.date.available: 2015-11-11T11:57:26Z
dc.date.issued: 2011 [en_ZA]
dc.description.abstract: BACKGROUND: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. RESULTS: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. CONCLUSIONS: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies. [en_ZA]
dc.identifier.apacitation: Faure, A., Seoighe, C., & Mulder, N. (2011). Investigating the effect of paralogs on microarray gene-set analysis. BMC Bioinformatics, http://hdl.handle.net/11427/14875 [en_ZA]
dc.identifier.chicagocitation: Faure, Andre, Cathal Seoighe, and Nicola Mulder. "Investigating the effect of paralogs on microarray gene-set analysis." BMC Bioinformatics (2011) http://hdl.handle.net/11427/14875 [en_ZA]
dc.identifier.citation: Faure, A. J., Seoighe, C., & Mulder, N. J. (2011). Investigating the effect of paralogs on microarray gene-set analysis. BMC Bioinformatics, 12(1), 29. [en_ZA]
dc.identifier.ris: [en_ZA]
  TY - Journal Article
  AU - Faure, Andre
  AU - Seoighe, Cathal
  AU - Mulder, Nicola
  AB - BACKGROUND: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. RESULTS: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. CONCLUSIONS: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
  DA - 2011
  DB - OpenUCT
  DO - 10.1186/1471-2105-12-29
  DP - University of Cape Town
  J1 - BMC Bioinformatics
  LK - https://open.uct.ac.za
  PB - University of Cape Town
  PY - 2011
  T1 - Investigating the effect of paralogs on microarray gene-set analysis
  TI - Investigating the effect of paralogs on microarray gene-set analysis
  UR - http://hdl.handle.net/11427/14875
  ER -
dc.identifier.uri: http://hdl.handle.net/11427/14875
dc.identifier.uri: http://dx.doi.org/10.1186/1471-2105-12-29
dc.identifier.vancouvercitation: Faure A, Seoighe C, Mulder N. Investigating the effect of paralogs on microarray gene-set analysis. BMC Bioinformatics. 2011; http://hdl.handle.net/11427/14875. [en_ZA]
dc.language.iso: eng [en_ZA]
dc.publisher: BioMed Central Ltd [en_ZA]
dc.publisher.department: Institute of Infectious Disease and Molecular Medicine [en_ZA]
dc.publisher.faculty: Faculty of Health Sciences [en_ZA]
dc.publisher.institution: University of Cape Town
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution License [en_ZA]
dc.rights.holder: 2011 Faure et al; licensee BioMed Central Ltd. [en_ZA]
dc.rights.uri: http://creativecommons.org/licenses/by/2.0 [en_ZA]
dc.source: BMC Bioinformatics [en_ZA]
dc.source.uri: http://www.biomedcentral.com/bmcbioinformatics/ [en_ZA]
dc.subject.other: Gene Set Statistic [en_ZA]
dc.subject.other: NCI-60 Cancer Cell Line [en_ZA]
dc.subject.other: MSigDB Gene Set [en_ZA]
dc.subject.other: GSA Method [en_ZA]
dc.subject.other: Bidirectional Gene [en_ZA]
dc.subject.other: Leukaemia Dataset [en_ZA]
dc.title: Investigating the effect of paralogs on microarray gene-set analysis [en_ZA]
dc.type: Journal Article [en_ZA]
uct.type.filetype: Text
uct.type.filetype: Image
uct.type.publication: Research [en_ZA]
uct.type.resource: Article [en_ZA]
Files
Original bundle
Name: Faure_Investigating_effect_of_paralogs_2011.pdf
Size: 783.42 KB
Format: Adobe Portable Document Format