[BioC] biomaRt slow down?

Wed Oct 28 21:46:03 CET 2009

Hi,

Lately my biomaRt queries have been taking far longer than they did 
before (last spring/early summer).

For example, I frequently pull down the following information:

library(biomaRt)

mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
WHAT <- c("ensembl_gene_id", "ensembl_transcript_id",
          "chromosome_name",
          "strand",
          "exon_chrom_start",
          "exon_chrom_end",
          "rank",
          "ensembl_exon_id",
          "gene_biotype")
# gene: vector of Ensembl gene IDs produced by earlier processing
anno <- getBM(WHAT, mart = mart,
              filters = "ensembl_gene_id", values = unique(gene),
              verbose = FALSE)

where I sometimes filter by gene (as in the code above) and otherwise 
just bring everything down. Bringing everything down used to be fast 
enough that I simply reran the code each time to make sure I was current 
(I don't have numbers, but I think it took much less than 15 minutes). 
Now it's very slow: the above code hasn't finished after an hour. 
Unfortunately, I don't know how many genes are in the query, because the 
list is the result of other processing and I don't know the results a 
priori. This has been my experience several times now (both with and 
without filtering on gene id), and I know someone else in my department 
has experienced the same thing without any change of code.
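
One way to put actual numbers on this would be to count the IDs and time 
the query in batches. The following is only a sketch under assumptions: 
split_into_batches and timed_batches are hypothetical helper names, the 
batch size of 500 is an arbitrary guess, and query_fun stands in for a 
small wrapper around the getBM call above.

```r
# Hypothetical helper: split a vector of gene IDs into batches of at most `size`.
split_into_batches <- function(ids, size = 500) {
  ids <- unique(ids)
  if (length(ids) == 0) return(list())
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Hypothetical helper: run `query_fun` on each batch and record elapsed seconds.
# `query_fun` is assumed to take a vector of IDs and return a data frame.
timed_batches <- function(ids, query_fun, size = 500) {
  batches <- split_into_batches(ids, size)
  results <- vector("list", length(batches))
  seconds <- numeric(length(batches))
  for (i in seq_along(batches)) {
    seconds[i] <- system.time(
      results[[i]] <- query_fun(batches[[i]])
    )[["elapsed"]]
  }
  list(data = do.call(rbind, results),  # all batches stacked row-wise
       seconds = seconds,               # elapsed time per batch
       n_ids = lengths(batches))        # number of IDs per batch
}

# Usage with the query above (needs biomaRt and a network connection):
# query_fun <- function(ids) getBM(WHAT, filters = "ensembl_gene_id",
#                                  values = ids, mart = mart)
# out <- timed_batches(gene, query_fun)
# out$seconds
```

Comparing out$seconds with out$n_ids would at least show whether the time 
grows with the number of genes or is slow regardless.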

So my question is: is there something about this set of values that 
now makes it a very slow combination when it wasn't before? Would 
dropping one or more specific values speed it up?
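
A rough way to check the second question for the attributes in WHAT might 
be to time the query once with the full set and once with each attribute 
left out in turn. This is a sketch only: time_dropping_each is a 
hypothetical helper name, and query_fun is assumed to wrap getBM for a 
small, fixed set of genes so that each run is comparable.

```r
# Hypothetical helper: time `query_fun` with all attributes, then with
# each attribute left out in turn. Returns a named vector of elapsed seconds.
time_dropping_each <- function(attrs, query_fun) {
  sets <- c(list(attrs),
            lapply(seq_along(attrs), function(i) attrs[-i]))
  names(sets) <- c("(none dropped)", attrs)
  vapply(sets,
         function(a) system.time(query_fun(a))[["elapsed"]],
         numeric(1))
}

# Usage with the query above (needs biomaRt and a network connection;
# the first 50 genes are an arbitrary choice to keep each run short):
# some_genes <- head(unique(gene), 50)
# query_fun <- function(a) getBM(a, filters = "ensembl_gene_id",
#                                values = some_genes, mart = mart)
# time_dropping_each(WHAT, query_fun)
```

A single run per attribute set is likely to be noisy, since server load 
may vary between requests, so repeating the comparison a few times would 
probably be needed before drawing any conclusion.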

Thanks,
Elizabeth

-- 
Elizabeth Purdom
Assistant Professor
Department of Statistics
UC, Berkeley
Evans Hall, Rm 433
epurdom at stat.berkeley.edu
(510) 642-6154 (office)
(510) 642-7892 (fax)