[Bioc-sig-seq] Quality Value Analysis from a BStringSet
Pratap, Abhishek
APratap at som.umaryland.edu
Thu Jun 3 21:39:32 CEST 2010
Hi All,
I would like to extract the last 5 quality values of each read in a FASTQ file and count them. I have read the file using "readFastq" and have stored the quality values as a BStringSet.
E.g.:
A BStringSet instance of length 5119916
    width seq
[1]    75 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
[2]    75 bbbbbbbbbbbbabbbbbb`bbbbbbab`b_...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
[3]    75 aaaaaaa_aaaaO`aa^aaa_a_T_``^[`S...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
[4]    75 bbbbbbbbbbbbaabbbb`bbb_Uaa___BB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
[5]    75 ``a`aa`aaYaTaaaBBBBBBBBBBBBBBBB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
What I would like to do is use subseq to take the last 5 quality values of each read and count how many of them are "B". We suspect that, despite good average quality, we still have a high number of bad bases at the ends of the reads.
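A minimal sketch of that step, in case it helps frame the question. It assumes Biostrings is loaded, uses a small made-up BStringSet (the toy strings and the variable names quals, tail5 and n_B are only for illustration) in place of the real quality values, and does the counting with base R on the extracted tails rather than with any particular Biostrings counting function:

```r
library(Biostrings)

# Toy stand-in for the real quality strings (hypothetical data, 15 characters each).
quals <- BStringSet(c("bbbbbbbbbbBBBBB",
                      "aaaaaaaaaaaaBBB",
                      "bbbbbbbbbbbbbbb"))

# Keep only the last 5 quality characters of every read.
# Giving 'end' and 'width' lets subseq work out the start position per read.
tail5 <- subseq(quals, end = width(quals), width = 5L)

# Count the "B" characters in each 5-character tail:
# strip everything that is not "B", then measure what is left.
n_B <- nchar(gsub("[^B]", "", as.character(tail5)))
n_B                # 5 3 0 for the toy reads above

# Distribution of "B" counts across reads, and the fraction of reads
# whose last 5 values are all "B".
table(n_B)
mean(n_B == 5L)
```

On the real data, quals would be the BStringSet already obtained from readFastq; only the 5-character tails are converted to character, so the full 75-character strings never need to be copied.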
Any other ideas welcome.
Thanks!
-Abhi