[BioC] Creating annotation package with a new database schema
Fabian Grammes
fabian.grammes at umb.no
Tue Oct 23 22:20:31 CEST 2012
Hi Hervé,
Thanks a lot, that was exactly the information I was looking for!
After updating Bioconductor today, I am struggling a bit to get the
code working again, but I hope to have that fixed tomorrow :)
@Marc
I've looked at the function makeOrgPackageFromNCBI. However, since most
of my annotation information (GO etc., obtained via Blast2GO) is stored
locally and is not yet available at NCBI, I do not think the function
helps in my case.
cheers, F
On Oct 23, 2012, at 12:04 AM, Hervé Pagès wrote:
> Hi Fabian,
>
> On 10/22/2012 05:57 AM, Fabian Grammes wrote:
>> Dear List
>>
>> I am working with Atlantic salmon and would like to build a custom
>> annotation package for the microarray that I am using.
>>
>> I've worked through the tutorial from Gabor Csardi ("Creating an
>> annotation package with a new database schema"), which was very
>> helpful. However, I am struggling to implement the Bimap objects to
>> access the GO annotations that I have in the DB.
>>
>> The GO data is stored in 6 tables (BP, BP_all, MF, MF_all, CC, CC_all)
>> in the same format that I found in the organism packages in BioC:
>> ID GOID evi
>> 6092 GO:0000910 IEA
>> 6092 GO:0040035 IEA
>> 6092 GO:0000398 IEA
>>
>> So if someone could help me or point me to the correct way to
>> implement the GO mappings in an annotation package, that would be
>> great.
>
> If you look, for example, at the hgu95av2.db package, it provides 3
> predefined Bimaps for accessing the GO data: hgu95av2GO (GO map),
> hgu95av2GO2PROBE (GO2PROBE map), and hgu95av2GO2ALLPROBES (GO2ALLPROBES
> map). The 1st is a direct map; the 2nd and 3rd are reverse maps:
>
> > direction(hgu95av2GO)
> [1] 1
> > direction(hgu95av2GO2PROBE)
> [1] -1
> > direction(hgu95av2GO2ALLPROBES)
> [1] -1
>
> All of them are of class "ProbeGo3AnnDbBimap".
>
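To make the direct/reverse distinction concrete outside of AnnotationDbi,
here is a minimal base-R sketch. It does not use Bimaps at all; it only
mimics the two lookup directions on a toy table in the ID/GOID/evi layout
quoted above. The ID "7001" and the object names (go_bp, direct, reverse)
are made up for illustration.

```r
# Toy copy of one GO table in the ID / GOID / evi layout.
# The "7001" row is invented so that one GO id has two IDs.
go_bp <- data.frame(
  ID   = c("6092", "6092", "6092", "7001"),
  GOID = c("GO:0000910", "GO:0040035", "GO:0000398", "GO:0000910"),
  evi  = c("IEA", "IEA", "IEA", "IEA"),
  stringsAsFactors = FALSE
)

# "Direct" direction: left key (ID) -> right keys (GO ids).
direct <- split(go_bp$GOID, go_bp$ID)

# "Reverse" direction: GO id -> IDs.
reverse <- split(go_bp$ID, go_bp$GOID)

print(direct[["6092"]])
print(reverse[["GO:0000910"]])
```

Here direct[["6092"]] lists the GO ids attached to that ID, while
reverse[["GO:0000910"]] lists the IDs attached to that GO id. A reverse
Bimap answers the second kind of question against the same underlying
tables.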
> The predefined Bimaps are created at load-time. The direct maps are
> created with a call to AnnotationDbi:::createAnnDbBimaps(), and the
> reverse maps by "manually" reversing some of the direct maps returned
> by createAnnDbBimaps().
>
> So you need to add an entry for the GO map to the list of "seeds"
> passed to createAnnDbBimaps(). In your case this entry needs to look
> something like (assuming ID is your internal id for genes):
>
> seeds <- list(
>     ...
>     list(
>         objName="GO",
>         Class="ProbeGo3AnnDbBimap",
>         L2Rchain=list(
>             list(
>                 tablename="probes",
>                 Lcolname="probe_id",
>                 Rcolname="gene_id",
>                 filter="{is_multiple}='0'"
>             ),
>             list(
>                 tablename="genes",
>                 Lcolname="gene_id",
>                 Rcolname="ID"
>             ),
>             list(
>                 Lcolname="ID",
>                 tagname=c(Evidence="{evi}"),
>                 Rcolname="GOID",
>                 Rattribnames=c(Ontology="NULL")
>             )
>         ),
>         rightTables=c(BP="BP", CC="CC", MF="MF")
>     )
>     ...
> )
>
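As a rough picture of what the rightTables entry in the seed does, here
is a base-R sketch that stacks three per-ontology tables and labels each
row with the name it has in rightTables. This is only my reading of how
the Ontology attribute gets filled in, not AnnotationDbi's actual
implementation (which works on the SQLite tables); the data frames and
their rows are toy values.

```r
# Toy per-ontology tables in the ID / GOID / evi layout (rows are invented).
tables <- list(
  BP = data.frame(ID = "6092", GOID = "GO:0000910", evi = "IEA",
                  stringsAsFactors = FALSE),
  CC = data.frame(ID = "6092", GOID = "GO:0005634", evi = "IEA",
                  stringsAsFactors = FALSE),
  MF = data.frame(ID = "6092", GOID = "GO:0003677", evi = "IEA",
                  stringsAsFactors = FALSE)
)

# Same shape as the rightTables entry in the seed:
# names are ontology labels, values are table names.
rightTables <- c(BP = "BP", CC = "CC", MF = "MF")

# Stack the tables, tagging each row with its ontology label.
stacked <- do.call(rbind, lapply(names(rightTables), function(ont) {
  tab <- tables[[rightTables[[ont]]]]
  tab$Ontology <- ont
  tab
}))
rownames(stacked) <- NULL

print(stacked)
```

Pointing rightTables at BP_all, CC_all and MF_all instead would stack
the *_all tables in the same way, with the same three ontology labels.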
> Then:
>
> ann_objs <- AnnotationDbi:::createAnnDbBimaps(seeds, seed0)
>
> where 'seed0' is defined by something like:
>
> seed0 <- list(objTarget="chip <name_of_your_chip>",
>               datacache=datacache)
>
> and 'datacache' is the environment that will be used for package-level
> caching of the data loaded from the DB (use NULL for no caching). I'm
> assuming those extra details, which are not GO-specific, are covered
> in Gabor's document, but I don't know.
>
> Then you can append the reverse maps to 'ann_objs' with something
> like:
>
> ## Append GO2PROBE map:
> map <- ann_objs$GO
> map <- revmap(map)
> map@objName <- "GO2PROBE"
> ann_objs$GO2PROBE <- map
>
> ## Append GO2ALLPROBES map:
> map <- ann_objs$GO2PROBE
> map@rightTables <- c(BP="BP_all", CC="CC_all", MF="MF_all")
> map@objName <- "GO2ALLPROBES"
> ann_objs$GO2ALLPROBES <- map
>
> All this needs to happen at load-time (via the .onLoad hook). Again,
> I'm focusing on the GO-specific part of the story here, assuming that
> you've already managed to create the non-GO-specific maps (thanks to
> Gabor's document).
>
> Hope this helps,
>
> H.
>
>>
>> kind regards, Fabian
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319