[Bioc-sig-seq] Rsamtools problem reading seq information
ivan.borozan at utoronto.ca
Thu Nov 11 17:25:38 CET 2010
Hello list,
I have scanned a large (15 GB) BAM file produced by BioScope (SOLiD) using
Rsamtools and the code below:
> library(Rsamtools)
Loading required package: GenomicRanges
> p4 <- ScanBamParam(what = c("seq"),
+                    flag = scanBamFlag(isUnmappedQuery = TRUE))
> res3 <- scanBam("test.bam", param = p4, maxMemory = 5000)[[1]]
It is not clear to me why every sequence comes back as a single "N":
> res3$seq[1]
A DNAStringSet instance of length 1
width seq
[1] 1 N
and every Phred-scaled base quality string comes back as a single "!":
> p4 <- ScanBamParam(what = c("qual"),
+                    flag = scanBamFlag(isUnmappedQuery = TRUE))
> res3 <- scanBam("solid0085_20090610_ICGC_Xeno_wholetranscrptome_4041X_F3_sortedByReadId.bam",
+                 param = p4, maxMemory = 5000)[[1]]
> res3$qual[1]
A PhredQuality instance of length 1
width seq
[1] 1 !
Many thanks for any suggestions,
Ivan
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsamtools_1.2.0 GenomicRanges_1.2.0 Biostrings_2.18.0
[4] IRanges_1.8.0
loaded via a namespace (and not attached):
[1] Biobase_2.10.0 tools_2.12.0