[Bioc-sig-seq] getting integers from SimpleRleViewsList
Janet Young
jayoung at fhcrc.org
Wed Jun 9 20:14:31 CEST 2010
thanks, Michael - very simple. I don't know why I didn't think of that!
On Jun 8, 2010, at 6:35 PM, Michael Lawrence wrote:
>
>
> On Tue, Jun 8, 2010 at 5:10 PM, Janet Young <jayoung at fhcrc.org> wrote:
> Hi,
>
> This is probably a simple question, but I can't quite figure it out
> for myself. I've been using RangedData (etc) quite a lot and have
> often found myself looping through the spaces when I probably don't
> need to. I'm trying not to do that now...
>
> I have some scores on a bunch of genomic intervals I'm interested in
> (they're conservation scores in 4kb promoter regions, e.g. phastCons
> scores). I've stored these as a SimpleRleList object.
>
> I also have some smaller subregions (matches to transcription factor
> motifs) and I want to compare the distribution of scores inside
> those subregions to scores outside those subregions. I have the
> regions as RangedData objects, over multiple spaces. I want to do
> this on individual base pair scores, rather than by taking subregion
> means and comparing those (so I don't want to use viewSums).
>
> I've managed to get the scores inside and outside the regions using
> the RleViewsList function (following the example in this thread: http://thread.gmane.org/gmane.comp.lang.r.sequencing/403/focus=408
> ). The scores are now as SimpleRleViewsList.
>
> I'm currently looping through the scores (the SimpleRleViewsList)
> space by space, converting to simple vector and concatenating the
> scores, but I imagine there's a way to do it without looping. Any
> ideas?
>
> Below is a simple example (extended from the older thread). My real
> scores are not integers and there's many more of them.
>
> #########################
> library(IRanges)
>
> ### generate some arbitrary scores
> track <- RangedData(RangesList(chrA = IRanges(start = c(1, 4, 6),
> width=c(3, 2, 4)),chrB = IRanges(start = c(1, 3, 6), width=c(3, 3,
> 4))) )
> trackCoverage <- coverage(track,
> weight=list(chrA=c(2,7,3),chrB=c(1,1,1)) )
>
> ### define subregions
> exons <- RangesList(chrA = IRanges(start = c(2, 4), width = c(2,
> 2)),chrB = IRanges(start = 3, width = 5))
>
>
> Views are not necessary in your case. Just seqselect() off of the
> RleList with a RangesList. This all works as one would expect.
>
> seqselect(trackCoverage, exons)
>
> That gives you an RleList. That can be unlist()'d to an Rle.
>
> In general, it does not make sense to unlist a list of views, like
> RleViewsList, since a Views can contain only a single subject.
>
> Michael
>
> ###get exon and intron scores
> exonView <- RleViewsList(rleList=trackCoverage, rangesList=exons)
> intronView <- RleViewsList(rleList=trackCoverage,
> rangesList=gaps(ranges(RangedData(exons)),start=1,end=9))
>
> ### now convert those to simple integer objects. Loop through
> chromosomes. Can I do this without looping?
> ### (I tried using unlist, but it does something weird to the
> scores, as it assumes they're all from the same space)
> exonScores <- integer()
> intronScores <- integer()
> for (chr in names(exonView) ) {
> thischrexonscores <- as.integer(seqselect(trackCoverage[[chr]],
> start=start(exonView[[chr]]),end=end(exonView[[chr]]) ))
> exonScores <- c(exonScores,thischrexonscores)
>
> thischrintronscores <-
> as.integer(seqselect(trackCoverage[[chr]],
> start=start(intronView[[chr]]),end=end(intronView[[chr]]) ))
> intronScores <- c(intronScores,thischrintronscores)
>
> }
>
> ###and do something with the scores to compare distributions
> wilcox.test(exonScores,intronScores)
>
> #########################
>
> thanks,
>
> Janet
>
> -------------------------------------------------------------------
>
> Dr. Janet Young (Trask lab)
>
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Avenue N., C3-168,
> P.O. Box 19024, Seattle, WA 98109-1024, USA.
>
> tel: (206) 667 1471 fax: (206) 667 6524
> email: jayoung ...at... fhcrc.org
>
> http://www.fhcrc.org/labs/trask/
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
More information about the Bioc-sig-sequencing
mailing list