[Bioc-sig-seq] seqselect on SimpleRleList and RangesList - bug? and request
Janet Young
jayoung at fhcrc.org
Sat Jun 12 01:06:08 CEST 2010
Hi,
I've been playing around with seqselect on scores stored in a
SimpleRleList object to get subregions defined in a RangesList object.
I found a couple of things. First, an enhancement request: would it
be possible to allow seqselect to handle cases where not every
space (name) in the SimpleRleList has a corresponding space/name in
the RangesList object?
The second is either a bug, or else I've misunderstood the way
seqselect is supposed to work, in a dangerous way: it looks like
seqselect doesn't use the names of the list elements to select scores.
It just assumes that the elements of the two lists have the same names
in the same order.
The code below should explain both issues much better than
those descriptions.
thanks,
Janet
> library(IRanges)
Attaching package: 'IRanges'
The following object(s) are masked from 'package:base':
cbind, Map, mapply, order, paste, pmax, pmax.int, pmin, pmin.int,
rbind, rep.int, table
>
> ### generate some arbitrary scores
> track <- RangedData(RangesList(chrA = IRanges(start = c(1, 4, 6),
width=c(3, 2, 4)),chrB = IRanges(start = c(1, 3, 6), width=c(3, 3,
4))) )
> trackCoverage <- coverage(track,
weight=list(chrA=c(2,7,3),chrB=c(1,1,1)) )
>
> ### define subregions
> exons <- RangesList(chrA = IRanges(start = c(2, 4), width =
c(2,2)),chrB = IRanges(start = 3, width = 5))
>
> ### seqselect works if all spaces in trackCoverage have an element in exons
> seqselect(trackCoverage,exons )
SimpleRleList of length 2
$chrA
'integer' Rle of length 4 with 2 runs
Lengths: 2 2
Values : 2 7
$chrB
'integer' Rle of length 5 with 2 runs
Lengths: 1 4
Values : 2 1
>
> ### define subregions only on one chr
> exons_chrAonly <- RangesList(chrA = IRanges(start = c(2, 4), width
= c(2, 2)))
> ### now seqselect doesn't work if some spaces don't have any elements
> seqselect(trackCoverage,exons_chrAonly )
Error in seqselect(trackCoverage, exons_chrAonly) :
'length(start)' must equal 'length(x)' when 'end' and 'width' are
NULL
>
>
> ##### also, defining the regions with spaces in a different order seems to cause trouble, as seqselect doesn't seem to be using the list's names - just going by order of elements
> exons_reorderchrs <- RangesList(chrB = IRanges(start = 3, width =
5),chrA = IRanges(start = c(2, 4), width = c(2,2)))
> seqselect(trackCoverage,exons_reorderchrs )
SimpleRleList of length 2
$chrA
'integer' Rle of length 5 with 3 runs
Lengths: 1 2 2
Values : 2 7 3
$chrB
'integer' Rle of length 4 with 3 runs
Lengths: 1 1 2
Values : 1 2 1
>
> identical ( seqselect(trackCoverage,exons ) ,
seqselect(trackCoverage,exons_reorderchrs ) )
[1] FALSE
>
> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.6.6
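
To make the requested behaviour concrete, here is a minimal base-R sketch of what selecting by name (rather than by position) could look like. It deliberately uses plain named lists instead of the IRanges classes, so it says nothing about how seqselect is or should be implemented. The helper select_by_name and the objects scores and regions are hypothetical names made up for this illustration; the score values are the per-position coverage from the example above, written out by hand.

```r
## Plain-list stand-ins for trackCoverage and exons_reorderchrs
## (assumption: ordinary lists, not SimpleRleList / RangesList).
scores <- list(chrA = c(2L, 2L, 2L, 7L, 7L, 3L, 3L, 3L, 3L),
               chrB = c(1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L))

## Regions per space as start/width pairs, deliberately listed in a
## different order from 'scores'.
regions <- list(chrB = data.frame(start = 3, width = 5),
                chrA = data.frame(start = c(2, 4), width = c(2, 2)))

## Hypothetical helper: pair up elements by name, not by position.
## Spaces that have no regions are simply dropped from the result.
select_by_name <- function(scores, regions) {
  common <- intersect(names(scores), names(regions))
  out <- lapply(common, function(nm) {
    r <- regions[[nm]]
    ## expand each start/width pair into the positions it covers
    idx <- unlist(mapply(function(s, w) seq(s, length.out = w),
                         r$start, r$width, SIMPLIFY = FALSE))
    scores[[nm]][idx]
  })
  names(out) <- common
  out
}

res <- select_by_name(scores, regions)
res$chrA   # 2 2 7 7
res$chrB   # 2 1 1 1 1

## Regions on one space only: returns a list with just chrA
res_chrAonly <- select_by_name(scores, regions["chrA"])
names(res_chrAonly)   # "chrA"
```

With name-based matching the result is the same whichever order the regions are listed in, and a space with no regions is left out instead of causing an error. Dropping such spaces is only one possible choice; returning an empty element for them would be another.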