[BioC] how does an annotation package handle ambiguous probe set id mappings
James W. MacDonald
jmacdon at med.umich.edu
Mon Oct 19 19:00:57 CEST 2009
Hi Andrew,
Andrew Yee wrote:
> Apologies if this has been asked before, but how does an annotation
> package handle an ambiguous probe set ID mapping?
>
> Take for example the Affymetrix chip U133X3P.
>
> When I use the annotation for this chip for probe set ID
> 1552641_3p_s_at, it returns only one match:
>
>> library('u133x3p.db')
>> mget('1552641_3p_s_at', env=u133x3pSYMBOL)
> $`1552641_3p_s_at`
> [1] "ATAD3B"
>> mget('1552641_3p_s_at', env=u133x3pENTREZID)
> $`1552641_3p_s_at`
> [1] "83858"
>
> However, when I search Affymetrix, with:
>
> https://www.affymetrix.com/analysis/netaffx/fullrecord.affx?pk=U133_X3P:1552641_3P_S_AT
>
> it states that it ambiguously maps to three gene symbols, ATAD3A,
> ATAD3B, and LOC732419.
>
> How does the annotation package determine which gene symbol it should map to?
In the past we just used the first probeset ==> Entrez Gene ID mapping.
However, in the soon-to-be-released BioC 2.5 annotation packages, all the
mappings are included (thanks to Marc Carlson):
> tmp <- toggleProbes(u133x3pENTREZID, "all")
> get('1552641_3p_s_at', tmp)
[1] "55210" "732419" "83858"
> tmp2 <- toggleProbes(u133x3pSYMBOL, "all")
> get('1552641_3p_s_at', tmp2)
[1] "ATAD3A" "LOC732419" "ATAD3B"
Oddly enough, this probeset isn't mapped in the 'regular' mappings:
> get('1552641_3p_s_at', u133x3pENTREZID)
[1] NA
> get('1552641_3p_s_at', u133x3pSYMBOL)
[1] NA
Marc?
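If you want to find every probeset on the chip that behaves like this one, something along the following lines might work. It is only a rough sketch: ambiguous_probesets() and the toy list are made-up names for illustration, not part of AnnotationDbi, and the last part assumes the BioC 2.5 (or later) version of u133x3p.db is installed.

```r
# Hypothetical helper (not part of AnnotationDbi): given a named list of
# probeset -> ID vectors, return the names of probesets with more than
# one distinct non-NA ID.
ambiguous_probesets <- function(mapping) {
  n_ids <- vapply(mapping,
                  function(x) length(unique(x[!is.na(x)])),
                  integer(1))
  names(n_ids)[n_ids > 1]
}

# Toy input: the first entry copies the toggleProbes() output shown above,
# the second is an invented single-gene probeset for contrast.
toy <- list(`1552641_3p_s_at` = c("55210", "732419", "83858"),
            `hypothetical_at` = "1")
ambiguous_probesets(toy)

# Same idea on the real chip, run only if the annotation package is available.
if (requireNamespace("u133x3p.db", quietly = TRUE)) {
  library(u133x3p.db)
  all_map <- toggleProbes(u133x3pENTREZID, "all")
  all_ids <- as.list(all_map[mappedkeys(all_map)])
  print(head(ambiguous_probesets(all_ids)))
}
```

The helper works on a plain list so the counting logic can be checked without the package; toggleProbes(), mappedkeys() and as.list() do the actual lookup.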
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-09-21 r49780)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] u133x3p.db_2.3.5 org.Hs.eg.db_2.3.4 RSQLite_0.7-2
[4] DBI_0.2-4 AnnotationDbi_1.7.17 Biobase_2.5.6
loaded via a namespace (and not attached):
[1] tools_2.10.0
>
Best,
Jim
>
> Thanks,
> Andrew
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826