[Bioc-sig-seq] viewApplying Efficiently
Martin Morgan
mtmorgan at fhcrc.org
Mon Feb 14 14:33:21 CET 2011
On 02/13/2011 04:44 PM, Martin Morgan wrote:
> On 02/13/2011 03:00 PM, Dario Strbenac wrote:
>> Hello,
>>
>> I have an RleList of about 17000 Rles and I'd like to get the
>> regularly spaced values out of each one of them and have a list of
>> vectors of numbers as the result.
>
> Maybe along the lines of
>
> elt <- Rle(as.numeric(floor(runif(50100, 1, 2.2)))) # simulated
> m <- Rle(rep(c(FALSE, TRUE), 100), rep(c(500, 1), 100)) # mask
> as.numeric((elt * m)[m])
or just index the Rle directly:
i = seq(501, length(elt), by=500)
as.numeric(elt[i])
>
> ? This might be in an lapply on the RleList; m would have to be
> constructed to be the right length for each element.
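
A minimal sketch of the lapply idea, using direct indexing rather than a mask. The data are simulated (3 Rles instead of 17000), and the helper sampleEvery() and its arguments first and by are hypothetical names chosen for illustration; it assumes the sampling positions are the same regular grid (501, 1001, ...) in every element.

```r
library(IRanges)  # provides Rle and RleList

set.seed(1)
# Simulated stand-in for the real data: 3 Rles of length 50100.
rleList <- do.call(RleList, lapply(1:3, function(k) {
    Rle(as.numeric(floor(runif(50100, 1, 2.2))))
}))

# Hypothetical helper: pull out every 'by'-th value starting at 'first'.
sampleEvery <- function(elt, first = 501, by = 500) {
    i <- seq(first, length(elt), by = by)
    as.numeric(elt[i])
}

# One numeric vector per Rle, returned as an ordinary list.
result <- lapply(rleList, sampleEvery)
```

Because the index vector is computed from length(elt) inside the helper, elements of different lengths are handled without building a mask per element.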
>
> Martin
>>
>> e.g. my view locations are 17000 of these:
>>
>>> samplingRL[[1]] # is a RangesList
>> IRanges of length 101
>>       start   end width
>>   [1]   501   501     1
>>   [2]  1001  1001     1
>>   [3]  1501  1501     1
>>   [4]  2001  2001     1
>>   [5]  2501  2501     1
>>   [6]  3001  3001     1
>>   [7]  3501  3501     1
>>   [8]  4001  4001     1
>>   [9]  4501  4501     1
>>   ...   ...   ...   ...
>>  [93] 46501 46501     1
>>  [94] 47001 47001     1
>>  [95] 47501 47501     1
>>  [96] 48001 48001     1
>>  [97] 48501 48501     1
>>  [98] 49001 49001     1
>>  [99] 49501 49501     1
>> [100] 50001 50001     1
>> [101] 50501 50501     1
>>
>> and my RleList has data like :
>>
>>> rleList[[1]]
>> 'numeric' Rle of length 51001 with 38620 runs
>> Lengths: 501 1 ... 1089
>> Values : 0.671728853793319 0.677726432845045 ... 0.224909214439609
>>
>> I do the following to get the sampling position values in one step, but it uses up over 20 GB RAM in a matter of seconds, and I have to kill the process.
>>
>> result <- viewApply(Views(rleList, samplingRL), function(samples) as.numeric(samples), simplify = TRUE)
>>
>> Is there a better way?
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793