[Bioc-sig-seq] alignQuality gives NA

Martin Morgan mtmorgan at fhcrc.org
Tue Mar 31 20:55:15 CEST 2009


joseph wrote:
> Why is alignQuality NA after readAligned() of a Bowtie alignment file?

Bowtie doesn't report alignment qualities, so readAligned() has no values to store and alignQuality is NA. See

http://bowtie-bio.sourceforge.net/manual.shtml#algn_out
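
A quick way to confirm this on the object from the session quoted below is sketched here. It assumes that aln1 is the AlignedRead object returned by readAligned(), and that quality() applied to the NumericQuality object returns the underlying numeric vector; treat it as an illustration rather than a definitive recipe.

```r
library(ShortRead)

# aln1: the AlignedRead object created by readAligned() in the quoted session (assumed to exist)
aq <- quality(alignQuality(aln1))  # assumed to yield a plain numeric vector of alignment scores

# TRUE if no alignment quality was recorded for any read
all(is.na(aq))

# Per-base read qualities are stored separately and remain available
head(quality(aln1))
```

If the first check returns TRUE, the NA values reflect what the aligner output contains and not a problem with how the file was read.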

Martin

> 
>> aln1 = readAligned("~/data", pattern="s_2.map", type="Bowtie")
> 
>> aln1
> class: AlignedRead
> length: 4072324 reads; width: 34..34 cycles
> chromosome: gi|89161210|ref|NC_000006.10|NC_000006 gi|89161210|ref|NC_000006.10|NC_000006 ... gi|89161205|ref|NC_000003.10|NC_000003 gi|89161205|ref|NC_000003.10|NC_000003 
> position: 20601560 20601583 ... 185508441 185508446 
> strand: + - ... + - 
> alignQuality: NumericQuality 
> alignData varLabels: mismatch 
>> head(id(aln1))
>   A BStringSet instance of length 6
>     width seq
> [1]    18 GAII:2:1:3:174#0/1
> [2]    18 GAII:2:1:3:174#0/2
> [3]    18 GAII:2:1:3:170#0/2
> [4]    18 GAII:2:1:3:170#0/1
> [5]    18 GAII:2:1:4:148#0/1
> [6]    18 GAII:2:1:4:148#0/2
>> head(sread(aln1))
>   A DNAStringSet instance of length 6
>     width seq
> [1]    34 CGTGTGTATGAGAAGGAGGGATATGAAGGAAGAT
> [2]    34 CCACCCGACTTACTCTGCAATCCATCTTCCTTCA
> [3]    34 AGCAGGTGCTGAGGTGGGAGGATCTAGCACCACC
> [4]    34 CTGGTGCCTGGTGGTGCTAGCTCCTCCCACCTCA
> [5]    34 TTTTCAAAACCATTCCTCAGTATCTTCAGGCATT
> [6]    34 GGGCCAAGCACATTCAGGAGGTCAAATGCCTGAA
>> head(quality(aln1)) 
> class: SFastqQuality
> quality:
>   A BStringSet instance of length 6
>     width seq
> [1]    34 ababa`aa`X_^``VYX_W][^\a]\W[MWV^V\
> [2]    34 aababbaa\GG\[G^GJVb[I[\BBBBBBBBBBB
> [3]    34 a`aa`WFRZMW]J[MWUL]TM[LWMTW\`U`_YR
> [4]    34 a`a`aaaaaa_``[_V_`_]_____\_]N][_]]
> [5]    34 abbb_aaaa`aaaa`aaa`Z[aaaaa__QWZZaa
> [6]    34 ababbbabbab^MM^bbb_][M^bbZG^_^GR_b
>> head(alignQuality(aln1))
> class: NumericQuality
> quality: NA NA ... NA NA (6 total)
>> head(strand(aln1))
> [1] + - + - + -
> Levels: - + *
>> head(alignData(aln1))
> An object of class "AlignedDataFrame"
>   readName: 1
>   varLabels and varMetadata description:
>     mismatch: Comma-separated mismatch positions
>>
> 
> 
> 
>> sessionInfo()
> R version 2.8.1 Patched (2009-03-03 r48046) 
> i386-apple-darwin9.6.0 
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> other attached packages:
> [1] ShortRead_1.1.51   lattice_0.17-20    BSgenome_1.11.13   Biostrings_2.11.44
> [5] IRanges_1.1.55    
> loaded via a namespace (and not attached):
> Error in x[["Version"]] : subscript out of bounds
> In addition: Warning message:
> In FUN(c("Biobase", "grid", "hwriter")[[3L]], ...) :
>   no package 'hwriter' was found
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


