[Bioc-sig-seq] weighted coverage on GRanges object
Janet Young
jayoung at fhcrc.org
Fri May 6 21:48:30 CEST 2011
Hi there,
I have a suggestion - is it possible to make it a little easier to directly calculate weighted coverage on a GRanges object that has regions on >1 chromosome? I think the code below explains what I mean.
I've figured out a workaround, but it took me quite a while to get there - perhaps putting the coercion in the underlying coverage/weights code will help others avoid that in future?
thanks very much,
Janet
-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung ...at... fhcrc.org
http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------
library(GenomicRanges)
dat1 <- GRanges(seqnames="chr2L",IRanges(start=((1:4)*5),width=20),score=c(3,5,10,20) )
dat2 <- GRanges(seqnames=rep(c("chr2L","chr3L"),c(4,3)),IRanges(start=c( (1:4)*5 , (1:3)*5 ),width=20),score=c(3,5,10,20,6,10,30) )
seqlengths(dat2) <- c(50,40)
#### this works
coverage(dat1,weight=list(elementMetadata(dat1)$score))
## SimpleRleList of length 1
## $chr2L
## 'integer' Rle of length 39 with 8 runs
## Lengths: 4 5 5 5 5 5 5 5
## Values : 0 3 8 18 38 35 30 20
#### this works
myscores <- list(c(3,5,10,20), c(6,10,30))
coverage(dat2,weight=myscores)
## SimpleRleList of length 2
## $chr2L
## 'integer' Rle of length 50 with 9 runs
## Lengths: 4 5 5 5 5 5 5 5 11
## Values : 0 3 8 18 38 35 30 20 0
## $chr3L
## 'integer' Rle of length 40 with 7 runs
## Lengths: 4 5 5 10 5 5 6
## Values : 0 6 16 46 40 30 0
##### but it'd be great if this could work too....
coverage(dat2,weight=list(elementMetadata(dat2)$score))
## Error in normargWeight(weight, nseq) : 'weight' is longer than 'x'
### for now I'll coerce my scores into a list by chromosome: does this seem to be a reliable way to do it? this is the part it took me a while to figure out on my own...
myscores <- split( elementMetadata(dat2)$score , as.character(seqnames(dat2)) )
coverage(dat2,weight=myscores)
sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.4.3 IRanges_1.10.0
More information about the Bioc-sig-sequencing
mailing list