[BioC] question about ontoCompare() performance change
Seth Falcon
sfalcon at fhcrc.org
Thu Oct 29 18:26:12 CET 2009
Hi Scott,
Thanks for the reminder and for providing a reproducible example. We will
take a look and see whether we can understand the slowdown and provide a
fix.
+ seth
On 10/28/09 5:23 PM, Scott Markel wrote:
> Just a quick FYI to anyone else using goTools' ontoCompare().
>
> It looks like it's approximately another factor of 2 slower in
> Bioconductor 2.5. User time has gone from 25 seconds (2.3) to
> 150 seconds (2.4) to 290 seconds (2.5). I don't know whether this is
> package-specific or caused by changes in R.
>
> Scott
>
>
> -----Original Message-----
> From: Scott Markel
> Sent: Wednesday, 10 June 2009 5:15 PM
> To: Bioconductor at stat.math.ethz.ch
> Subject: question about ontoCompare() performance change
>
> I'm seeing a noticeable performance change in goTools' ontoCompare()
> from Bioconductor version 2.3 to 2.4. With the same input data, the
> user time reported by system.time() on my Windows XP machine has gone
> from 25 seconds to about 150 seconds. On a RHEL 5 machine the times are
> 30 seconds and 130 seconds, respectively.
>
> I checked the ontoCompare() help, the goTools documentation, the mailing
> list archives, and Google for terms like "ontoCompare goTools performance",
> and didn't find anything.
>
> I'm sure I'm missing something obvious, but I'd appreciate advice on
> how I should now be using ontoCompare() in Bioc 2.4.
>
> The script, BioC 2.3 output, BioC 2.4 output, and two sets of
> sessionInfo() follow.
>
> Scott
>
> ##############################
> Here's the R script, using the same inputs for both BioC 2.3 and 2.4.
>
> prop<-list()
> prop$probeIDs<- c("1007_s_at", "1053_at", "117_at", "121_at",
> "1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at", "1405_i_at")
> prop$microarrayType<- "hgu133a"
>
> library("goTools")
> library("hgu133a.db")
>
> system.time(result<- ontoCompare( list(prop$probeIDs),
> probeType=as.character(prop$microarrayType), method="none", goType="MF"))
> ##############################
> The BioC 2.3 output is
>
> user system elapsed
> 23.31 0.22 25.70
>
>> result
>   binding catalytic activity chemoattractant activity enzyme regulator activity
> 1      10                  4                        2                         1
>   molecular transducer activity structural molecule activity
> 1                             5                            1
>   transcription regulator activity NotFound
> 1                                2        0
> ##############################
> The BioC 2.4 output is
>
> user system elapsed
> 151.16 0.41 169.11
>
>> result
>                                  [,1]
> catalytic activity                  4
> binding                            10
> enzyme regulator activity           1
> transcription regulator activity    2
> chemoattractant activity            2
> molecular transducer activity       5
>
> ##############################
>> sessionInfo()
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] hgu133a_2.2.0 hgu133a.db_2.2.0 goTools_1.12.0
> [4] GO_2.2.0 annotate_1.18.0 xtable_1.5-4
> [7] AnnotationDbi_1.2.2 RSQLite_0.7-0 DBI_0.2-4
> [10] Biobase_2.0.1
> ##############################
>> sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] hgu133a.db_2.2.11 goTools_1.18.0 GO.db_2.2.11
> [4] RSQLite_0.7-1 DBI_0.2-4 AnnotationDbi_1.6.0
> [7] Biobase_2.4.1
> ##############################
>
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect email: smarkel at accelrys.com
> Accelrys (SciTegic R&D) mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603
> San Diego, CA 92121 fax: +1 858 799 5222
> USA web: http://www.accelrys.com
>
> http://www.linkedin.com/in/smarkel
> Vice President, Board of Directors:
> International Society for Computational Biology
> Co-chair: ISCB Publications Committee
> Associate Editor: PLoS Computational Biology
> Editorial Board: Briefings in Bioinformatics
>