[BioC] GOstats different goDag size just changing the
James W. MacDonald
jmacdon at uw.edu
Wed Oct 17 16:30:41 CEST 2012
Hi Cristobal,
On 10/17/2012 9:27 AM, Cristobal Fresno Rodríguez wrote:
> Hi Jim,
>
> Thanks for your answer. Although I understand that the test is
> mutually exclusive (you cannot be over and under represented at the
> same time), my pre-assumption was that the same terms ought to be
> tested under the two hypothesis tests. Hence, the same goDag structure
> with different p-values should come out of the analysis, which is not
> the case. The over test only considers terms with at least one geneId,
> whereas the under test also considers terms with no geneId but with
> universeGeneIds.
Exactly. When you are testing for over-representation, you are looking
at only the terms that have been selected (i.e., those with at least one
geneId), and seeing if there are more genes for that term than would be
expected by chance. By definition you cannot have over-representation
for terms that have no geneIds in your set of significant genes.
However, when testing for under-representation, you look at all terms
(even those for which you have no geneIds), and then see if there are
fewer genes for that term than would be expected by chance. In this
situation you most certainly can (and will) have under-representation
for terms that aren't represented in your set of geneIds.
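The difference can be illustrated with the hypergeometric tail
probabilities directly. The following is a minimal sketch in base R, not
the GOstats internals: the universe and selected-set sizes (156 and 80)
are taken from the output quoted below, while the term with 10 annotated
universe genes and the helper hyper_p are hypothetical, chosen only for
illustration.

```r
# Sizes reported in the hyperGTest output quoted below
universe_size <- 156   # genes in the universe with BP annotation
selected_size <- 80    # selected genes with BP annotation

# Hypothetical GO term (assumption for illustration only):
# 10 universe genes carry the term, none of them is in the selected set
term_in_universe <- 10
term_in_selected <- 0

# Hypergeometric p-value for one term.
# k = selected genes with the term, K = universe genes with the term,
# N = universe size, n = selected-set size.
hyper_p <- function(k, K, N, n, direction = c("over", "under")) {
  direction <- match.arg(direction)
  if (direction == "over") {
    # P(X >= k): at least as many selected genes as observed
    phyper(k - 1, K, N - K, n, lower.tail = FALSE)
  } else {
    # P(X <= k): at most as many selected genes as observed
    phyper(k, K, N - K, n, lower.tail = TRUE)
  }
}

p_over  <- hyper_p(term_in_selected, term_in_universe,
                   universe_size, selected_size, "over")
p_under <- hyper_p(term_in_selected, term_in_universe,
                   universe_size, selected_size, "under")

print(c(over = p_over, under = p_under))
```

For a term with no selected genes, the over-representation p-value is 1
whatever the term size, so leaving such terms out of the "over" test
loses nothing. The same term can still reach a small p-value in the
"under" test, which is why those terms have to be included there.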
Best,
Jim
>
> Thanks to all
>
> Cristobal
>
>
>
> 2012/10/16 James W. MacDonald <jmacdon at uw.edu>
>
> Hi Cristobal,
>
>
> On 10/16/2012 4:44 PM, Cristobal Fresno Rodríguez wrote:
>
> Dear list,
>
> I am trying to use GOstats with both "under" and "over" testDirection,
> but hyperGTest builds two different goDags. Shouldn't they be of the
> same size?
>
>
> No. The testDirection refers to over-represented and
> under-represented GO terms, so they should be mutually exclusive
> given the same data set.
>
> Best,
>
> Jim
>
>
>
> Thanks,
>
> Cristobal
>
> library(GOstats)
> load(file="Genes.RData"); genesModel <- out; rm(out)
> univer <- unique(as.character(genesModel$GeneID))
> paramsOver <- new("GOHyperGParams",
>                   geneIds=univer[1:100],
>                   universeGeneIds=univer[1:200],
>                   annotation="org.Mm.eg.db",
>                   ontology="BP",
>                   pvalueCutoff=0.01,
>                   conditional=FALSE,
>                   testDirection="over")
> Loading required package: org.Mm.eg.db
>
> paramsUnder <- new("GOHyperGParams",
>                    geneIds=univer[1:100],
>                    universeGeneIds=univer[1:200],
>                    annotation="org.Mm.eg.db",
>                    ontology="BP",
>                    pvalueCutoff=0.01,
>                    conditional=FALSE,
>                    testDirection="under")
>
> over <- hyperGTest(paramsOver)
> under <- hyperGTest(paramsUnder)
> over
>
> Gene to GO BP test for over-representation
> 1146 GO BP ids tested (0 have p< 0.01)
> Selected gene set size: 80
> Gene universe size: 156
> Annotation package: org.Mm.eg
>
> under
>
> Gene to GO BP test for under-representation
> 1776 GO BP ids tested (0 have p< 0.01)
> Selected gene set size: 80
> Gene universe size: 156
> Annotation package: org.Mm.eg
>
> length(pvalues(over))
>
> [1] 1146
>
> length(pvalues(under))
>
> [1] 1776
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099