[BioC] Creating annotation package with a new database schema
Marc Carlson
mcarlson at fhcrc.org
Fri Oct 26 01:16:22 CEST 2012
Hi Fabian,
If the data for GO is not available at NCBI, makeOrgPackageFromNCBI will
try to use blast2GO instead (for GO at least).
Marc
On 10/23/2012 01:20 PM, Fabian Grammes wrote:
> Hi Hervé
>
> Thanks a lot, that was exactly the information that I've been
> looking for !
>
> After updating BioConductor today, I am struggling a bit with
> getting the code to work again, but that should be fixed tomorrow
> I hope :)
>
> @ Marc
>
> I've checked the function: makeOrgPackageFromNCBI,
> however since I have most of my annotation information stored locally
> (GO etc. - obtained via Blast2GO) and not yet available at NCBI,
> I do not think the function helps in my case.
>
> cheers, F
>
> On Oct 23, 2012, at 12:04 AM, Hervé Pagès wrote:
>
>> Hi Fabian,
>>
>> On 10/22/2012 05:57 AM, Fabian Grammes wrote:
>>> Dear List
>>>
>>> I am working with Atlantic salmon and am very interested in making a
>>> custom annotation package for the microarray that I am using.
>>>
>>> I've worked through the tutorial from Gabor Csardi ("Creating an
>>> annotation package with a new database schema"), which was very
>>> helpful. However, I am struggling to implement the bimap objects to
>>> access the GO annotations that I have in the DB.
>>>
>>> The GO data is stored in 6 tables (BP, BP_all, MF, MF_all, CC, CC_all)
>>> looking like the format that I found for
>>> the organism packages in BioC:
>>> ID    GOID        evi
>>> 6092  GO:0000910  IEA
>>> 6092  GO:0040035  IEA
>>> 6092  GO:0000398  IEA
>>>
>>> So if someone could help me, or point me to the correct way to
>>> implement the GO mappings in an annotation package, that would be
>>> great.
>>
>> If you look for example at the hgu95av2.db package, it provides 3
>> predefined Bimaps for accessing the GO data: hgu95av2GO (GO map),
>> hgu95av2GO2PROBE (GO2PROBE map), and hgu95av2GO2ALLPROBES (GO2ALLPROBES
>> map). The 1st is a direct map, the 2nd and 3rd are reverse maps:
>>
>> > direction(hgu95av2GO)
>> [1] 1
>> > direction(hgu95av2GO2PROBE)
>> [1] -1
>> > direction(hgu95av2GO2ALLPROBES)
>> [1] -1
>>
>> All of them are of class "ProbeGo3AnnDbBimap".
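
To make "direct" versus "reverse" concrete, here is a minimal base-R sketch. It does not use AnnotationDbi or Bimap objects at all; it only mimics the two lookup directions with plain lists built from rows in the ID / GOID / evi layout quoted earlier. The second gene ID (7001) is hypothetical and is added only so that one GO term is shared by two genes.

```r
# Toy rows in the ID / GOID / evi layout from the original question.
# Gene "7001" is a hypothetical addition for illustration.
go_tab <- data.frame(
  ID   = c("6092", "6092", "6092", "7001"),
  GOID = c("GO:0000910", "GO:0040035", "GO:0000398", "GO:0000910"),
  evi  = c("IEA", "IEA", "IEA", "IEA"),
  stringsAsFactors = FALSE
)

# Direct direction: gene ID -> GO IDs annotated to it.
direct_map <- split(go_tab$GOID, go_tab$ID)

# Reverse direction: GO ID -> gene IDs annotated with it.
reverse_map <- split(go_tab$ID, go_tab$GOID)

print(direct_map[["6092"]])
print(reverse_map[["GO:0000910"]])
```

A real Bimap keeps the data in the SQLite file and only flips the direction flag, so reversing it does not copy anything the way split() does here.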
>>
>> The predefined Bimaps are created at load-time: the direct maps
>> with a call to AnnotationDbi:::createAnnDbBimaps(), and the reverse
>> maps by "manually" reversing some of the direct maps returned by
>> createAnnDbBimaps().
>>
>> So you need to add an entry for the GO map to the list of "seeds"
>> passed to createAnnDbBimaps(). In your case this entry needs to look
>> something like (assuming ID is your internal id for genes):
>>
>>   seeds <- list(
>>     ...
>>     list(
>>       objName="GO",
>>       Class="ProbeGo3AnnDbBimap",
>>       L2Rchain=list(
>>         list(
>>           tablename="probes",
>>           Lcolname="probe_id",
>>           Rcolname="gene_id",
>>           filter="{is_multiple}='0'"
>>         ),
>>         list(
>>           tablename="genes",
>>           Lcolname="gene_id",
>>           Rcolname="ID"
>>         ),
>>         list(
>>           Lcolname="ID",
>>           tagname=c(Evidence="{evi}"),
>>           Rcolname="GOID",
>>           Rattribnames=c(Ontology="NULL")
>>         )
>>       ),
>>       rightTables=c(BP="BP", CC="CC", MF="MF")
>>     )
>>     ...
>>   )
>>
>> Then:
>>
>> ann_objs <- createAnnDbBimaps(seeds, seed0)
>>
>> where 'seed0' is defined by something like:
>>
>>   seed0 <- list(objTarget="chip <name_of_your_chip>",
>>                 datacache=datacache)
>>
>> and 'datacache' is the environment that will be used for package-level
>> caching of the data loaded from the DB (use NULL for no caching). I'm
>> assuming those extra details, which are not GO-specific, are covered
>> in Gabor's document, but I don't know.
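
As a rough illustration of what "an environment used for caching" means in plain R, here is a small sketch. It is not AnnotationDbi's caching code: the helper cached(), the key "gene_ids" and the loader load_table() are all hypothetical names made up for this example. It only shows that a value stored in an environment is computed once and then reused.

```r
# A package-level cache can be a plain environment with no parent.
datacache <- new.env(hash = TRUE, parent = emptyenv())

# Hypothetical helper: return the cached value for 'key',
# calling compute() only if the key is not in the cache yet.
cached <- function(key, compute, cache = datacache) {
  if (!exists(key, envir = cache, inherits = FALSE)) {
    assign(key, compute(), envir = cache)
  }
  get(key, envir = cache, inherits = FALSE)
}

# Hypothetical stand-in for an expensive DB query; counts its own calls.
n_calls <- 0L
load_table <- function() {
  n_calls <<- n_calls + 1L
  c("6092", "7001")
}

first  <- cached("gene_ids", load_table)
second <- cached("gene_ids", load_table)  # served from the environment
```

After both calls, n_calls is still 1, because the second lookup never reaches load_table().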
>>
>> Then you can append the reverse maps to 'ann_objs' with something like:
>>
>>   ## Append GO2PROBE map:
>>   map <- ann_objs$GO
>>   map <- revmap(map)
>>   map@objName <- "GO2PROBE"
>>   ann_objs$GO2PROBE <- map
>>
>>   ## Append GO2ALLPROBES map:
>>   map <- ann_objs$GO2PROBE
>>   map@rightTables <- c(BP="BP_all", CC="CC_all", MF="MF_all")
>>   map@objName <- "GO2ALLPROBES"
>>   ann_objs$GO2ALLPROBES <- map
>>
>> All this needs to happen at load-time (via the .onLoad hook). Again I'm
>> focusing on the GO-specific part of the story here, assuming that you've
>> already managed to create the non-GO specific maps (thanks to Gabor's
>> document).
>>
>> Hope this helps,
>>
>> H.
>>
>>>
>>> kind regards, Fabian
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fhcrc.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
>