[Bioc-sig-seq] Resizing a GRanges object with elements on the "*" strand
Martin Morgan
mtmorgan at fhcrc.org
Tue Sep 21 00:56:57 CEST 2010
On 09/14/2010 01:46 PM, Michael Lawrence wrote:
> I just checked in some changes to IRanges, that make this method work:
>
> setMethod("resize", "GenomicRanges",
> function(x, width, fix = "start", use.names = TRUE)
> {
> revFix <- c(start = "end", end = "start", center = "center")
> fix <- ifelse(strand(x) == "-", revFix[fix], fix)
> ranges <-
> resize(ranges(x), width = width, fix = fix, use.names =
> use.names)
> if (!IRanges:::anyMissing(seqlengths(x))) {
> start(x) <- start(ranges)
> end(x) <- end(ranges)
> } else {
> x <- clone(x, ranges = ranges)
> }
> x
> }
> )
>
> That will accept the fix argument, except start and end are reversed for
> negative strand features. '*' is treated just like '+'. If this is
> acceptable to the GenomicRanges guys, I will commit this.
Yes this is acceptable; I'm cc'ing Herve as we were talking about this
just this morning. Martin
>
> On Tue, Sep 14, 2010 at 9:36 AM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
>
>> Hello Leonardo,
>>
>> I believe that the issue here is that resize() does not support the
>> "fix" argument at all when handling GRanges.
>>
>> Actually that would be a nice upgrade of functionality for GRanges.
>>
>> I face the same limitation and I currently resize by hand. :(
>>
>> This is my work around:
>>
>> library(GenomicRanges)
>>
>> # a set of genomic features called C
>> C <- GRanges(seqnames=c("chr1","chr2","chr19","chrX"),
>> ranges=IRanges(start=c(0,0,5,1),
>> end=c(150,150,150,400)),
>> strand=c("*","-","*","+"),
>> score=c(10,20,30,90))
>>
>>
>> # peek at C
>> C
>> GRanges with 4 ranges and 1 elementMetadata value
>> seqnames ranges strand | score
>> <Rle> <IRanges> <Rle> | <numeric>
>> [1] chr1 [0, 150] * | 10
>> [2] chr2 [0, 150] - | 20
>> [3] chr19 [5, 150] * | 30
>> [4] chrX [1, 400] + | 90
>>
>> seqlengths
>> chr1 chr19 chr2 chrX
>> NA NA NA NA
>>
>> # this is the workaround
>> ranges(C) <- resize(ranges(C),1,fix="start")
>>
>> # peek at the resized set C
>> C
>> GRanges with 4 ranges and 1 elementMetadata value
>> seqnames ranges strand | score
>> <Rle> <IRanges> <Rle> | <numeric>
>> [1] chr1 [0, 0] * | 10
>> [2] chr2 [0, 0] - | 20
>> [3] chr19 [5, 5] * | 30
>> [4] chrX [1, 1] + | 90
>>
>> seqlengths
>> chr1 chr19 chr2 chrX
>> NA NA NA NA
>>
>>
>> Cheers,
>>
>> Ivan
>>
>>
>> Ivan Gregoretti, PhD
>> National Institute of Diabetes and Digestive and Kidney Diseases
>> National Institutes of Health
>> 5 Memorial Dr, Building 5, Room 205.
>> Bethesda, MD 20892. USA.
>> Phone: 1-301-496-1016 and 1-301-496-1592
>> Fax: 1-301-496-9878
>>
>>
>>
>> On Tue, Sep 14, 2010 at 11:36 AM, Leonardo Collado Torres
>> <lcollado at ibt.unam.mx> wrote:
>>> Hello,
>>>
>>> I have a rather simple question that involves GenomicRanges' design.
>>>
>>> Basically, I have a GRanges object where all the elements are from the
>>> undefined "*" strand. I just want to resize them to get the 1st (from
>> left
>>> to right) base. However, I'm not able to do so with the "resize" function
>>> even when specifying fix = "start" as it uses the fix = "center" method.
>> Is
>>> this the desired performance? I have 2 workarounds, but I'm puzzled as
>> the
>>> "flank" function actually uses the start (left to right) when elements
>> are
>>> from the "*" strand. Is there a quicker way to do this or should I stick
>> to
>>> the flank + shift workaround?
>>>
>>> Thank you and greetings,
>>> Leonardo
>>>
>>>> testGR <- GRanges( seqnames = rep("test", 3), ranges = IRanges ( start =
>>> c(10,100,1000), width = c(10, 100, 1000)), strand =
>> Rle(strand(c("+","-")),
>>> c(1,2)) )
>>>> testGR
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 10, 19] + |
>>> [2] test [ 100, 199] - |
>>> [3] test [1000, 1999] - |
>>>
>>> seqlengths
>>> test
>>> NA
>>>> resize(testGR, 1, fix="start")
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 10, 10] + |
>>> [2] test [ 199, 199] - |
>>> [3] test [1999, 1999] - |
>>>
>>> seqlengths
>>> test
>>> NA
>>>> testGR2 <- GRanges( seqnames = rep("test", 3), ranges = IRanges ( start
>> =
>>> c(10,100,1000), width = c(10, 100, 1000)), strand = Rle(strand(c("*")),
>>> c(3)) )
>>>> testGR2
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 10, 19] * |
>>> [2] test [ 100, 199] * |
>>> [3] test [1000, 1999] * |
>>>
>>> seqlengths
>>> test
>>> NA
>>>> resize(testGR2, 1, fix="start")
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 14, 14] * |
>>> [2] test [ 149, 149] * |
>>> [3] test [1499, 1499] * |
>>>
>>> seqlengths
>>> test
>>> NA
>>>
>>>> testGR3 <- GRanges ( seqnames = seqnames(testGR2), ranges = IRanges(
>> start
>>> = start(testGR2), width = 1), strand = strand(testGR2) )
>>>>
>>>> testGR3
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 10, 10] * |
>>> [2] test [ 100, 100] * |
>>> [3] test [1000, 1000] * |
>>>
>>> seqlengths
>>> test
>>> NA
>>>>
>>>
>>>> testGR4 <- shift(flank( testGR2, 1), 1)
>>>> testGR4
>>> GRanges with 3 ranges and 0 elementMetadata values
>>> seqnames ranges strand |
>>> <Rle> <IRanges> <Rle> |
>>> [1] test [ 10, 10] * |
>>> [2] test [ 100, 100] * |
>>> [3] test [1000, 1000] * |
>>>
>>> seqlengths
>>> test
>>> NA
>>>
>>>> sessionInfo()
>>> R version 2.12.0 Under development (unstable) (2010-09-08 r52880)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
>>> [7] LC_PAPER=en_US.utf8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] GenomicRanges_1.1.25 IRanges_1.7.33
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.12.0
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
More information about the Bioc-sig-sequencing
mailing list