[Bioc-sig-seq] Replace Elements of RleList is Slow
Dario Strbenac
D.Strbenac at garvan.org.au
Sun Feb 13 05:00:11 CET 2011
Hello,
I have an RleList in which I'd sometimes like to replace some elements before running viewApply on the modified list. In the example below I replace just one element, and it takes almost a minute. Is there a chance of optimising this code in IRanges? I'd like to avoid casting the RleLists to lists and then back to RleLists, to keep my code shorter.
> class(coverageGenes)
[1] "SimpleRleList"
attr(,"package")
[1] "IRanges"
> length(coverageGenes)
[1] 17805
> w
[1] 8007
> xxx
SimpleRleList of length 1
$chr18
'numeric' Rle of length 51001 with 15108 runs
Lengths: 501 1 ... 14735
Values : 2.18311877482829 2.18461816959122 ... 0
> system.time(coverageGenes[w] <- xxx)
   user  system elapsed
 55.780   2.070  57.866
> cgLIST <- as.list(coverageGenes)
> xxxL <- as.list(xxx)
> system.time(cgLIST[w] <- xxxL)
   user  system elapsed
      0       0       0
So, the plain list-based method works in a flash.
Note that xxx and w can sometimes be longer than 1; I am just illustrating the base case. I ran into memory problems when they were only slightly longer: with xxx and w of length 6, RAM usage shot up by 17 GB.
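For reference, the list round trip I am trying to avoid could be wrapped up roughly as below. This is only a sketch on toy data: coverageToy, newElement and the helper replaceViaList are made-up names for illustration, and it assumes that RleList() accepts a plain list of Rle objects and that compress = FALSE gives back a SimpleRleList.

```r
library(IRanges)

# Toy stand-ins for coverageGenes, xxx and w (hypothetical data, not my real objects).
set.seed(1)
coverageToy <- RleList(lapply(1:100, function(i) Rle(as.numeric(rpois(50, 2)))),
                       compress = FALSE)
newElement <- RleList(chr18 = Rle(c(2.5, 2.5, 0)), compress = FALSE)
w <- 42

# Hypothetical helper: do the replacement on a plain list, then rebuild the RleList.
replaceViaList <- function(x, i, value) {
  xl <- as.list(x)            # ordinary list of Rle objects
  xl[i] <- as.list(value)     # base R list replacement, which is the fast step
  RleList(xl, compress = FALSE)
}

updated <- replaceViaList(coverageToy, w, newElement)
```

The names of the original list are kept, since base R list replacement with [ does not rename the target elements. It works, but it is exactly the extra casting I was hoping not to need.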
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_AU.UTF-8
[7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] aroma.affymetrix_1.7.0 aroma.apd_0.1.7
[3] affxparser_1.22.0 R.huge_0.2.0
[5] aroma.core_1.7.0 aroma.light_1.18.0
[7] matrixStats_0.2.2 R.rsp_0.4.0
[9] R.cache_0.3.0 R.filesets_0.9.0
[11] digest_0.4.2 R.utils_1.5.3
[13] R.oo_1.7.4 R.methodsS3_1.2.1
[15] BSgenome.Hsapiens.UCSC.hg18_1.3.16 BSgenome_1.18.3
[17] Biostrings_2.18.2 GenomicRanges_1.2.3
[19] IRanges_1.8.9
loaded via a namespace (and not attached):
[1] Biobase_2.10.0 tools_2.12.0
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia