[Bioc-sig-seq] Add ability for `subset`ing IRanges-like objects based on their elementMetadata?

Stuart Andrews stu.andrews at gmail.com
Sat Jun 5 06:32:15 CEST 2010


Hi,

To answer Steve's first question ... yes.  People (n = 1) are
indeed subsetting GRanges objects.  In my case, I used
GRanges::subset() like this:

R> subset(tags, elementMetadata(tags)$genome.hits < 5 & strand(tags) == '+')

My question is, how does one discover that the single-index square
bracket is implemented like this for GRanges?  I didn't see any
mention of this in GenomicRanges.pdf or "?GRanges".

Are there other methods that do not appear in the documentation, and
how can I learn about the existence of undocumented methods in the
future?
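
One general way to look for methods that the help pages do not mention is to ask R's S4 machinery directly. The sketch below is only an illustration under stated assumptions: `ToyRanges` is a made-up stand-in class (not part of IRanges or GenomicRanges), defined here so the example runs with nothing but the `methods` package. `hasMethod()`, `showMethods()` and `selectMethod()` are the standard introspection functions from `methods`.

```r
library(methods)

# Hypothetical toy class standing in for GRanges; not part of any package.
setClass("ToyRanges", representation(start = "integer", width = "integer"))

# A single-index "[" method, analogous in spirit to the GRanges one.
setMethod("[", "ToyRanges", function(x, i, j, ..., drop = TRUE) {
  new("ToyRanges", start = x@start[i], width = x@width[i])
})
setMethod("length", "ToyRanges", function(x) length(x@start))

x <- new("ToyRanges", start = c(1L, 10L, 20L), width = c(5L, 5L, 5L))
y <- x[c(TRUE, FALSE, TRUE)]  # logical subsetting through the "[" method

# Does a "[" method (direct or inherited) apply to this class?
has_bracket <- hasMethod("[", "ToyRanges")

# List every generic that has a method involving this class.
showMethods(classes = "ToyRanges")

# Retrieve the method definition that would actually be dispatched.
bracket_method <- selectMethod("[", "ToyRanges")
```

With GenomicRanges loaded, the same calls with "GRanges" in place of "ToyRanges" should list the generics that have GRanges methods, whether or not they are written up in the vignette.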

Thx,
- Stu

Stuart Andrews, Ph.D.
Postdoctoral Associate
Institute for Computational Biomedicine
Weill Cornell Medical College, New York, NY



On Jun 4, 2010, at 9:18 PM, Michael Lawrence wrote:

> On Fri, Jun 4, 2010 at 2:27 PM, Steve Lianoglou <
> mailinglist.honeypot at gmail.com> wrote:
>
>> Hi,
>>
>> Random question I thought I'd shoot out there ...
>>
>> I'm finding myself wanting to slice and dice IRanges-like objects (I'm
>> playing with GRanges right now) based on some column of their
>> elementMetadata.
>>
>>
> I guess if this makes sense then it would make sense to support this for all
> Sequence derivatives. It works already for RangedData, btw.
>
> Michael
>
>
>> Are other people finding that they want to do this, too?
>> Would it make sense to add some subset-mojo to do that?
>>
>> Here's a motivating example:
>>
>> Say I have a GRanges object (`tags`), that looks something like:
>>
>> GRanges with 2217486 ranges and 8 elementMetadata values
>>   seqnames           ranges strand   |    tag.id genome.hits gene.hits
>>      <Rle>        <IRanges>  <Rle>   | <integer>   <integer> <integer>
>> [1]     chr1   [ 4850,  4866]      -   |    405384          10         3
>> [2]     chr1   [ 7804,  7820]      -   |    405387           6         4
>> [3]     chr1   [13162, 13178]      -   |    405397           5         4
>> [4]     chr1   [16712, 16728]      +   |        35       12164      2475
>> [5]     chr1   [21381, 21397]      +   |        45         497        79
>> [6]     chr1   [21479, 21495]      -   |      1466        3823       957
>>
>> And say that I want all "tags" with < 5 genome.hits on the "+" strand.
>> I'd like to:
>>
>> R> subset(tags, genome.hits < 5 & strand == '+')
>>
>> To do the same as:
>>
>> R> tags[elementMetadata(tags)$genome.hits < 5 & strand(tags) == '+']
>>
>> I realize that using `genome.hits` (from the elementMetadata) and
>> `strand` (not in the metadata) is crossing some boundaries, but I just
>> wanted to point out one of the more "complex" cases.
>>
>> Just curious,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
