[Bioc-sig-seq] Rsamtools: Select reads on a chromosome
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Jan 20 23:08:28 CET 2010
Sorry, forgot to provide sessionInfo()
R version 2.11.0 Under development (unstable) (2009-12-28 r50849)
x86_64-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsamtools_0.1.21 BSgenome_1.15.3 Biostrings_2.15.13 IRanges_1.5.26
loaded via a namespace (and not attached):
[1] Biobase_2.7.0
On Wed, Jan 20, 2010 at 5:04 PM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> About selecting all reads on a chromosome:
>
> On Thu, Jan 7, 2010 at 1:11 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> <snip>
>
>>> which <- RangesList(chr1=IRanges(start=1, end=247249719))
>>> params <- ScanBamParams(which=which)
>>> reads <- scanBam(my.bam.file, param=params)[[1]]
>>>
>>> Is there are "better" way to do it, eg. w/o making the IRanges object
>>> that's stretches over the chromosome?
>>
>> I don't think so, though 'end' doesn't have to be a literal end, e.g,.
>> .Machine$integer.max and 'stretches' doesn't really involve any cost --
>> just two numbers.
>
> I just tried to do this in another context, but this actually send R
> into a tailspin, eg:
>
> R> which <- RangesList(chr1=IRanges(start=1, end=.Machine$integer.max-1))
> R> r <- scanBam('scratch-sorted.bam', param=ScanBamParam(what='pos',
> which=which))
>
> *** caught segfault ***
> address 0x0, cause 'unknown'
>
> Traceback:
> 1: .Call(func, file, index, "rb", list(space(which),
> .uunlist(start(which)), .uunlist(end(which))), flag, simpleCigar,
> ...)
> 2: .io_bam(.scan_bam, file, index, tmpl, param = param)
> 3: .local(file, index, ...)
> 4: scanBam("scratch-sorted.bam", param = ScanBamParam(what = "pos",
> which = which))
> 5: scanBam("scratch-sorted.bam", param = ScanBamParam(what = "pos",
> which = which))
>
> I have access to chromosome length information, so it's not really a
> problem for me, but it seems as if something is happening which you
> didn't expect, so I thought you'd like to know.
>
> Thanks,
> -steve
>
> ps: I'm using IRanges(.., end=.Machine$integer.max-1) because using
> .Machine$integer.max causes an integer overflow
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
More information about the Bioc-sig-sequencing
mailing list