[BioC] retrieve gene symbol/description
Sean Davis
sdavis2 at mail.nih.gov
Wed Apr 18 22:03:32 CEST 2012
On Wed, Apr 18, 2012 at 3:50 PM, array chip <arrayprofile at yahoo.com> wrote:
> Thank you Sean. You are right, there is no annotation of this gene in GenBank
> or Ensembl. But if we dig a little deeper, you can see that both GenBank (section
> "Reference sequence information" on the right panel)
> and EMBL ("Ensembl Genes" in the "Navigation" section) point to the gene
> Pannexin 3 (PANX3) for this clone, and BLAST confirms that this clone aligns
> 100% to PANX3.
>
> Is there a package/function in Bioconductor that still allows me to retrieve
> the gene information for this ID? I have a bunch of GenBank/EMBL IDs in this
> situation and would like to automate the retrieval if possible.
I do not know of a single resource that is complete in this regard.
You could try using the AnnotationDbi package to build an annotation
package if that is your use case. Otherwise, you might try using
UniGene or NCBI Entrez Gene to recover some additional mappings.
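
Since no single resource is complete, one way to automate this is to run the
biomaRt query first and then collect the accessions that came back without a
row, so that only those need a second lookup elsewhere. The following is a
minimal sketch of that idea, not a complete solution. The function names
query_embl and unmapped_ids are hypothetical helpers introduced here for
illustration; the getBM() call is the one from the original post, and the
example data frame is copied from the output shown in this thread rather than
fetched live.

```r
# Live query, as in the original post (needs biomaRt and network access).
# Defined here for reference; it is not called in this example.
query_embl <- function(ids) {
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(attributes = c("embl", "description", "hgnc_symbol"),
                 filters = "embl", values = ids, mart = mart)
}

# Hypothetical helper: return the accessions that have no row in the result.
unmapped_ids <- function(ids, result, id_col = "embl") {
  setdiff(unique(ids), unique(result[[id_col]]))
}

ids <- c("AF133587", "AA456140")

# Stand-in for query_embl(ids), copied from the output quoted in this thread.
result <- data.frame(
  embl = "AF133587",
  description = "rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]",
  hgnc_symbol = "RTDR1",
  stringsAsFactors = FALSE
)

# Accessions that still need a lookup in another resource.
todo <- unmapped_ids(ids, result)
print(todo)
```

The accessions left in todo are the ones to try against UniGene or Entrez
Gene; whether those resources actually cover a given clone is not guaranteed.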
Sean
> ________________________________
> From: Sean Davis <sdavis2 at mail.nih.gov>
> To: array chip <arrayprofile at yahoo.com>
> Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Sent: Wednesday, April 18, 2012 12:16 PM
> Subject: Re: [BioC] retrieve gene symbol/description
>
> On Wed, Apr 18, 2012 at 3:08 PM, array chip <arrayprofile at yahoo.com> wrote:
>> Hi, I am trying to retrieve gene symbol/description with GenBank/EMBL IDs
>> using biomaRt. I was successful with some IDs, but not with others. For
>> example:
>>
>>> library(biomaRt)
>>
>>> ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")
>>
>>
>>> getBM(attributes=c('embl', 'description','hgnc_symbol'), filters =
>>> 'embl', values = c('AF133587','AA456140'), mart = ensembl)
>>
>>       embl                                                             description hgnc_symbol
>> 1 AF133587 rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]       RTDR1
>>
>>
>> As you can see, the first ID returns gene symbol/description successfully,
>> but the 2nd one does not. What is the reason for the 2nd one not working? Are
>> there other ways to get it to work?
>>
>
> Hi, John.
>
> This query is working as expected. The genbank accession "AA456140"
> is not associated with any gene in the Ensembl gene collection. Try
> typing your two accessions into the ensembl search box. You'll note
> that only the first is associated with a gene while the second is
> simply a genomic alignment (and not associated with a gene).
>
> Sean
>
>