[BioC] Genes annotated to GO:0031281 using org.Hs.eg.db
James W. MacDonald
jmacdon at uw.edu
Wed Mar 19 22:44:03 CET 2014
Hi Tim,
See ?probeSetSummary.
As an example, after running the example for that function, you end up
with a GOHyperGResults object called 'hyp':
> head(summary(hyp))
      GOBPID       Pvalue OddsRatio  ExpCount Count Size
1 GO:0031399 5.316815e-15  3.623932  28.94921    72  397
2 GO:0036211 2.482766e-14  2.949456  56.00249   107  768
3 GO:0023052 1.756878e-13  2.816284 110.47367   164 1515
The first GO term has 72 significant genes out of the 397 annotated to it. The
'ps' object is a list created by the probeSetSummary function, and each
list item gives the Entrez Gene ID, the Probeset ID, and an indicator
showing whether the probeset gave rise to the significant result (not really
necessary - you could just use unique() on the EntrezID column if you
wanted to):
> head(ps[["GO:0031399"]])
  EntrezID ProbeSetID selected
1    10114     102_at        1
2    10114   41501_at        0
3    10114   41502_at        0
4    10188    1134_at        1
And
> sum(ps[["GO:0031399"]]$selected)
[1] 72
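As a rough sketch of the unique() route mentioned above: the data frame 'ps_item' below is a hypothetical stand-in that holds only the four rows printed earlier, not the real list item, and it assumes (as that output suggests) that every gene in the item has exactly one probeset flagged as selected.

```r
# Hypothetical stand-in for ps[["GO:0031399"]]: only the four rows shown above.
ps_item <- data.frame(
  EntrezID   = c("10114", "10114", "10114", "10188"),
  ProbeSetID = c("102_at", "41501_at", "41502_at", "1134_at"),
  selected   = c(1, 0, 0, 1),
  stringsAsFactors = FALSE
)

# One entry per gene, regardless of how many probesets map to it
genes <- unique(ps_item$EntrezID)

# Same genes, recovered through the indicator column instead
genes_via_flag <- ps_item$EntrezID[ps_item$selected == 1]

# Number of selected probesets in these four rows
n_selected <- sum(ps_item$selected)
```

On the real list item the same unique() call should give the genes counted in the Count column, as long as the assumption above holds.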
Note that you either have to use a _named_ vector of geneIds when you
set up your GOHyperGParams object
> head(prbs)
  1000_at   1001_at 1002_f_at 1003_s_at   1005_at   1006_at
   "5595"    "7075"    "1557"     "643"    "1843"    "4319"
as is done in that example, or you can pass in a set of IDs using the
'sigProbesets' argument to probeSetSummary(). Or you can just subset to
the unique Entrez Gene IDs in whichever list item you are interested in.
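If it helps, here is a minimal sketch of what such a named vector looks like and how subsetting by name keeps the probeset IDs attached. The mapping is copied from the head(prbs) output above; 'sig_probes' is a made-up pair of probesets for illustration, not anything from a real analysis.

```r
# Probeset-to-Entrez mapping copied from the head(prbs) output above
prbs <- c("1000_at" = "5595", "1001_at" = "7075", "1002_f_at" = "1557",
          "1003_s_at" = "643", "1005_at" = "1843", "1006_at" = "4319")

# Hypothetical significant probesets (illustration only)
sig_probes <- c("1001_at", "1005_at")

# Subsetting by name returns Entrez Gene IDs that still carry their probeset IDs
sig_ids <- prbs[sig_probes]

# Dropping the names leaves a plain vector with no link back to the probesets
plain_ids <- unname(sig_ids)
```

A vector like 'sig_ids' is the kind of thing you would supply as geneIds, whereas 'plain_ids' has lost the link back to the probesets.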
Best,
Jim
On 3/19/2014 5:17 PM, Tim Smith wrote:
> Hi James,
>
> Thanks for the reply! I had 371 genes (gene universe ~7k genes) for
> which I was checking enrichment, and I got this term as one of the
> significant terms. The details are:
>
>     GOBPID      Pvalue OddsRatio   ExpCount Count Size
> GO:0031281 0.021301601 10.113271 0.22913929     2   15
>                                    Term
> positive regulation of cyclase activity
>
> I used GOstats for this analysis. If Count = 2, then shouldn't there
> be two genes that are directly annotated to this term?
>
>
>
>
> On Wednesday, March 19, 2014 10:17 AM, James W. MacDonald
> <jmacdon at uw.edu> wrote:
> Hi Tim,
>
> There may not be any IDs mapped to that term directly. You can use
> GO2ALLEGS, which maps all direct and child terms to Entrez Gene IDs.
>
> > get("GO:0031281", org.Hs.egGO2ALLEGS)
>     IDA     IDA     TAS     TAS     IDA     TAS     IEA     TAS     TAS     TAS
>    "49"   "116"   "116"   "117"   "135"   "135"   "136"   "136"   "140"   "153"
>     IDA     IDA     TAS     IDA     IEA     IDA     IDA     TAS     IDA     IEA
>   "154"   "155"   "554"   "796"   "796"   "799"  "1394"  "1394"  "1812"  "1812"
>     IDA     IEA     NAS     TAS     TAS     IEA     TAS     TAS     IEA     TAS
>  "1816"  "1816"  "1816"  "1816"  "1909"  "2692"  "2696"  "2740"  "2774"  "2778"
>     IDA     ISS     ISS     IBA     IBA     IMP     ISS     TAS     TAS     TAS
>  "2852"  "3973"  "4763"  "4842"  "4843"  "4846"  "4846"  "4914"  "4915"  "5032"
>     ISS     NAS     IEA     IDA     IEA     TAS     TAS
>  "5578"  "5894"  "7077"  "7432"  "7434" "10486" "10487"
>
> Or the more powerful select() method:
>
> > select(org.Hs.eg.db, "GO:0031281", c("ENTREZID", "SYMBOL"), "GOALL")
>         GOALL EVIDENCEALL ONTOLOGYALL ENTREZID    SYMBOL
> 1  GO:0031281         IDA          BP       49       ACR
> 2  GO:0031281         IDA          BP      116   ADCYAP1
> 3  GO:0031281         TAS          BP      116   ADCYAP1
> 4  GO:0031281         TAS          BP      117 ADCYAP1R1
> 5  GO:0031281         IDA          BP      135   ADORA2A
> 6  GO:0031281         TAS          BP      135   ADORA2A
> 7  GO:0031281         IEA          BP      136   ADORA2B
> 8  GO:0031281         TAS          BP      136   ADORA2B
> 9  GO:0031281         TAS          BP      140    ADORA3
> 10 GO:0031281         TAS          BP      153     ADRB1
> 11 GO:0031281         IDA          BP      154     ADRB2
> 12 GO:0031281         IDA          BP      155     ADRB3
> 13 GO:0031281         TAS          BP      554     AVPR2
> 14 GO:0031281         IDA          BP      796     CALCA
> 15 GO:0031281         IEA          BP      796     CALCA
> 16 GO:0031281         IDA          BP      799     CALCR
> 17 GO:0031281         IDA          BP     1394     CRHR1
> 18 GO:0031281         TAS          BP     1394     CRHR1
> 19 GO:0031281         IDA          BP     1812      DRD1
> 20 GO:0031281         IEA          BP     1812      DRD1
> 21 GO:0031281         IDA          BP     1816      DRD5
> 22 GO:0031281         IEA          BP     1816      DRD5
> 23 GO:0031281         NAS          BP     1816      DRD5
> 24 GO:0031281         TAS          BP     1816      DRD5
> 25 GO:0031281         TAS          BP     1909     EDNRA
> 26 GO:0031281         IEA          BP     2692     GHRHR
> 27 GO:0031281         TAS          BP     2696      GIPR
> 28 GO:0031281         TAS          BP     2740     GLP1R
> 29 GO:0031281         IEA          BP     2774      GNAL
> 30 GO:0031281         TAS          BP     2778      GNAS
> 31 GO:0031281         IDA          BP     2852     GPER1
> 32 GO:0031281         ISS          BP     3973     LHCGR
> 33 GO:0031281         ISS          BP     4763       NF1
> 34 GO:0031281         IBA          BP     4842      NOS1
> 35 GO:0031281         IBA          BP     4843      NOS2
> 36 GO:0031281         IMP          BP     4846      NOS3
> 37 GO:0031281         ISS          BP     4846      NOS3
> 38 GO:0031281         TAS          BP     4914     NTRK1
> 39 GO:0031281         TAS          BP     4915     NTRK2
> 40 GO:0031281         TAS          BP     5032    P2RY11
> 41 GO:0031281         ISS          BP     5578     PRKCA
> 42 GO:0031281         NAS          BP     5894      RAF1
> 43 GO:0031281         IEA          BP     7077     TIMP2
> 44 GO:0031281         IDA          BP     7432       VIP
> 45 GO:0031281         IEA          BP     7434     VIPR2
> 46 GO:0031281         TAS          BP    10486      CAP2
> 47 GO:0031281         TAS          BP    10487      CAP1
> Warning message:
> In .generateExtraRows(tab, keys, jointype) :
> 'select' resulted in 1:many mapping between keys and return rows
>
> Best,
>
> Jim
>
>
> On 3/19/2014 4:28 AM, Tim Smith wrote:
> > Hi,
> >
> > I was trying to get the genes annotated to the GO term "GO:0031281".
> > My code:
> >
> > library(org.Hs.eg.db)
> >
> > genes <- get("GO:0031281", org.Hs.egGO2EG)
> >
> > When I run the code, I get:
> >
> > > genes <- get("GO:0031281", org.Hs.egGO2EG)
> > Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
> >   value for "GO:0031281" not found
> >
> > If I check in AmiGO for this GO term, it seems to have many gene
> > products for Homo sapiens. Am I doing something wrong? Is there an
> > alternate package that I can try, just to double check the results?
> >
> > thanks!
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
>
>