[Bioc-sig-seq] viewApplying Efficiently

Martin Morgan mtmorgan at fhcrc.org
Mon Feb 14 14:33:21 CET 2011


On 02/13/2011 04:44 PM, Martin Morgan wrote:
> On 02/13/2011 03:00 PM, Dario Strbenac wrote:
>> Hello,
>>
> 
>> I have an RleList of about 17000 Rles and I'd like to get the
>> regularly spaced values out of each one of them and have a list of
>> vectors of numbers as the result.
> 
> Maybe along the lines of
> 
> elt <- Rle(as.numeric(floor(runif(50100, 1, 2.2)))) # simulated
> m <- Rle(rep(c(FALSE, TRUE), 100), rep(c(500, 1), 100)) # mask
> as.numeric((elt * m)[m])

or, more simply, just index the positions directly:

  i = seq(501, length(elt), by=500)
  as.numeric(elt[i])

> 
> ? This might be in an lapply on the RleList; m would have to be
> constructed to be the right length for each element.
> 
> Martin
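
A minimal sketch of that lapply, using direct indexing rather than a mask. The data here are simulated stand-ins, and sampleEvery() is a hypothetical helper written for illustration (it is not part of IRanges); the starting position of 501 and step of 500 are taken from the sampling ranges quoted below.

```r
library(IRanges)  # Bioconductor package providing Rle and RleList

set.seed(1)
# Simulated stand-in for the real data: Rle vectors of different lengths
rleList <- RleList(
    Rle(round(runif(51001), 1)),
    Rle(round(runif(20000), 1)),
    Rle(round(runif(1200), 1))
)

# Hypothetical helper: pull out every 'by'-th value starting at 'first'.
# The index is rebuilt for each element, so elements may differ in length.
sampleEvery <- function(elt, first = 501, by = 500) {
    if (length(elt) < first)
        return(numeric(0))  # element too short to contain any sample
    as.numeric(elt[seq(first, length(elt), by = by)])
}

# One numeric vector of sampled values per Rle
result <- lapply(rleList, sampleEvery)
sapply(result, length)
```

Because the positions are computed from length(elt) inside the helper, nothing needs to be pre-built to match each element. If the positions are irregular, it may be possible to pass the start() of each element of the sampling ranges as the index instead.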
>>
>> e.g. my view locations are 17000 of these:
>>
>> > samplingRL[[1]] # samplingRL is a RangesList
>> IRanges of length 101
>>       start   end width
>> [1]     501   501     1
>> [2]    1001  1001     1
>> [3]    1501  1501     1
>> [4]    2001  2001     1
>> [5]    2501  2501     1
>> [6]    3001  3001     1
>> [7]    3501  3501     1
>> [8]    4001  4001     1
>> [9]    4501  4501     1
>> ...     ...   ...   ...
>> [93]  46501 46501     1
>> [94]  47001 47001     1
>> [95]  47501 47501     1
>> [96]  48001 48001     1
>> [97]  48501 48501     1
>> [98]  49001 49001     1
>> [99]  49501 49501     1
>> [100] 50001 50001     1
>> [101] 50501 50501     1
>>
>> and my RleList has data like:
>>
>> > rleList[[1]]
>> 'numeric' Rle of length 51001 with 38620 runs
>>   Lengths:               501                 1 ...              1089
>>   Values : 0.671728853793319 0.677726432845045 ... 0.224909214439609
>>
>> I do the following to get the sampling position values in one step, but it uses up over 20 GB of RAM in a matter of seconds, and I have to kill the process.
>>
>> result <- viewApply(Views(rleList, samplingRL), function(samples) as.numeric(samples), simplify = TRUE)
>>
>> Is there a better way?
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


