[BioC] biomaRt slow down?
Elizabeth Purdom
epurdom at stat.berkeley.edu
Wed Oct 28 21:46:03 CET 2009
Hi,
Lately, when I make queries via biomaRt, they take far longer than they
did before (last spring/early summer).
For example, I frequently pull down the following information:
library(biomaRt)
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
WHAT <- c("ensembl_gene_id", "ensembl_transcript_id",
          "chromosome_name",
          "strand",
          "exon_chrom_start",
          "exon_chrom_end",
          "rank",
          "ensembl_exon_id",
          "gene_biotype")
anno <- getBM(WHAT, mart = mart,
              filters = "ensembl_gene_id", values = unique(gene),
              verbose = FALSE)
where I sometimes filter by gene (as in the code above) and otherwise
just bring everything down. It used to be so fast to bring everything
down that I automatically reran the code to make sure I was current.
(When I say fast, I don't have numbers, but much less than 15 minutes, I
think.) Now it's very slow: the above code hasn't finished after an hour
(unfortunately, I don't know how many genes it is, because it's the
result of processing something and I don't know the results a priori).
This has been my experience several times now (both with and without
filtering on gene id), and I know someone else in my department has
experienced the same thing without any change of code.
So my question is: Is there something about this set of values that
makes this combination very slow now when it wasn't before? Would
dropping specific value(s) speed it up?
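One way to check this might be to time the query on a small subset of
genes, dropping one attribute at a time. The following is only a rough
sketch: time_drop_one is a hypothetical helper name (not part of
biomaRt), the subset size of 20 genes is arbitrary, and timings against a
remote server will vary from run to run, so any differences would need to
be large to mean much.

```r
# Hypothetical helper: for each attribute, rerun the query without it and
# record the elapsed wall-clock time in seconds.
# query_fun defaults to biomaRt::getBM but can be replaced by a stub.
time_drop_one <- function(attrs, ids, mart, query_fun = biomaRt::getBM) {
  sapply(attrs, function(a) {
    keep <- setdiff(attrs, a)  # all attributes except the one being dropped
    system.time(
      query_fun(attributes = keep, filters = "ensembl_gene_id",
                values = ids, mart = mart)
    )[["elapsed"]]
  })
}

# Example use with the objects defined above (requires network access):
# timings <- time_drop_one(WHAT, head(unique(gene), 20), mart)
# sort(timings)  # the smallest value marks the attribute whose removal helps most
```

The result is a numeric vector named by the attribute that was dropped, so
an attribute whose removal sharply reduces the time would stand out.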
Thanks,
Elizabeth
--
Elizabeth Purdom
Assistant Professor
Department of Statistics
UC, Berkeley
Evans Hall, Rm 433
epurdom at stat.berkeley.edu
(510) 642-6154 (office)
(510) 642-7892 (fax)