[Bioc-sig-seq] 'aggregate' error message
Patrick Aboyoun
paboyoun at fhcrc.org
Mon Jan 4 19:27:17 CET 2010
P.,
The error message from aggregate isn't very informative, and I'll clean
it up.
The aggregate function threw an error for the cov.y object because the
ranges in allPeaks reference indices outside the bounds of cov.y. In
particular, cov.y is an Rle of length 11, while allPeaks includes the
interval [17, 19]. If you know the length of the underlying sequence, you
can pass it as the width argument to the coverage function. For
example, if the underlying sequence is of length 19, then the coverage
from the y ranges would be calculated as shown below. (I also added code
for more efficient summation within the specified ranges.)
> cov.y <- coverage(y, width = 19)
> cov.y
'integer' Rle of length 19 with 5 runs
Lengths: 3 2 4 2 8
Values : 0 3 0 3 0
> y.counts <- aggregate(cov.y, allPeaks, sum)
> y.counts
[1] 6 0
> y.counts.efficient <- viewSums(Views(cov.y, allPeaks))
> y.counts.efficient
[1] 6 0
> sessionInfo()
R version 2.10.1 Patched (2009-12-14 r50738)
i386-apple-darwin9.8.0
locale:
[1] C/en_US.UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.4.9
loaded via a namespace (and not attached):
[1] tools_2.10.1
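For readers without IRanges at hand, the length mismatch described above can be reproduced with a minimal base-R sketch. This is not how IRanges implements coverage or aggregate; the helpers coverage_vec and range_sums are hypothetical names introduced only for illustration, and plain integer vectors stand in for the Rle objects.

```r
# Hypothetical helper (not part of IRanges): per-position coverage as a
# plain integer vector. By default the vector ends at the last covered
# position, which mirrors calling coverage() without a width.
coverage_vec <- function(start, width, total_width = max(start + width - 1L)) {
  cov <- integer(total_width)
  for (i in seq_along(start)) {
    idx <- seq.int(start[i], length.out = width[i])
    cov[idx] <- cov[idx] + 1L
  }
  cov
}

# Hypothetical helper: sum the coverage over closed intervals [start, end],
# failing with an explicit message when an interval runs past the vector.
range_sums <- function(cov, start, end) {
  if (any(end > length(cov))) {
    stop("ranges extend past the end of the coverage vector")
  }
  vapply(seq_along(start), function(i) sum(cov[start[i]:end[i]]), integer(1))
}

# The y ranges and the two peaks, [4, 5] and [17, 19], from this thread.
y_start <- c(4L, 4L, 4L, 10L, 10L, 10L)
y_width <- rep(2L, 6)
peak_start <- c(4L, 17L)
peak_end <- c(5L, 19L)

cov_short <- coverage_vec(y_start, y_width)                    # length 11
cov_full <- coverage_vec(y_start, y_width, total_width = 19L)  # length 19

# The short vector cannot serve the [17, 19] peak; capture the error message.
short_result <- tryCatch(range_sums(cov_short, peak_start, peak_end),
                         error = function(e) conditionMessage(e))

# Padding to the full sequence length makes every peak addressable.
y_counts <- range_sums(cov_full, peak_start, peak_end)         # 6 0

# Run-length view of the padded coverage: lengths 3 2 4 2 8, values 0 3 0 3 0.
runs <- rle(cov_full)
```

The trailing run of eight zeros is what the width argument contributes: positions 12 through 19 carry no reads from y, but they must exist for the second peak to be summed.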
Cheers,
Patrick
pterry at huskers.unl.edu wrote:
> Dear bioc-sig-sequencing,
>
> I am working with a toy example to learn the material covered in part 3 (Differential expression, pages 10-11) of 'A ChIP-Seq Data Analysis' handout for a 11/19/09 session at the 'High throughput sequence analysis tools and approaches with Bioconductor' workshop in Seattle.
>
> I generated an error message in the following output. Can you comment?
> (I note that when I use the sample data & code from the handout, ctcf.rda & gfp.rda, no errors are generated)
>
>
>> x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L, 15L, 17L, 17L),
> + width=c(5L, 6L, 3L, 4L, 3L, 3L, 5L, 3L, 3L))
>> y <- IRanges(start=c(4L, 4L, 4L, 10L, 10L, 10L),
> + width=c(2L, 2L, 2L, 2L, 2L, 2L))
>> cov.x <- coverage(x)
>> cov.y <- coverage(y)
>> allPeaks <- slice(cov.x, lower = 3)
>> allPeaks
> Views on a 19-length Rle subject
>
> views:
> start end width
> [1] 4 5 2 [3 3]
> [2] 17 19 3 [3 3 3]
>
>> x.counts <- aggregate(cov.x, allPeaks, sum)
>> x.counts
> [1] 6 9
>
>> y.counts <- aggregate(cov.y, allPeaks, sum)
> Error in findIntervalAndStartFromWidth(start, runLength(x)) :
> 'x' must be less than 'sum(width)'
>
>
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ChIPseqTutorial_0.0.1 BSgenome.Mmusculus.UCSC.mm9_1.3.16
> [3] chipseq_0.2.0 ShortRead_1.4.0
> [5] lattice_0.17-26 BSgenome_1.14.0
> [7] Biostrings_2.14.1 IRanges_1.4.2
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.0 grid_2.10.1 hwriter_1.1
>
>
> Thanks,
> P. Terry
> huskers.unl.edu
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>