[BioC] shortread quality
Martin Morgan
mtmorgan at fhcrc.org
Wed Jun 6 21:11:43 CEST 2012
Hi,
On 06/06/2012 08:00 AM, David martin wrote:
> Hi,
> I'm reading a fastq file from the solexa sequencer.
> I would like to know how many reads have a phred score (>Q29). The thing
If you mean the average base quality score, then
fq <- readFastq(sp, fqpattern)
score <- alphabetScore(fq)
gives the sum of the base quality scores for each read, so is a vector
as long as the length of the reads. The average is
aveScore <- score / width(fq)
and then you're in the realm of familiar R again, e.g.,
hist(aveScore)
table(aveScore > 29)
etc.
Hope that heps,
Martin
> is that i get the densities so i don't really know how many reads from
> the total pass that filter. It's probaly easy for you so any hint would
> be helpful
>
> library("ShortRead")
> fqpattern <- "1102sdd_SN148_A_s_3_seq_GJH-85.txt"
>
> path = getwd()
> sp <- SolexaPath(path,dataPath=path,analysisPath=path)
>
> # Read fastq File and save report
> fq <- readFastq(sp, fqpattern)
> qaSummary <- qa(fq,fqpattern)
> save(qaSummary, file=file.path("./", paste(fqpattern,".rda",sep="" )))
> report(qaSummary,dest="report")
>
> #Quality
>
> idx = which(qaSummary[["readQualityScore"]]["quality"] > 29)
> a = cbind( qaSummary[["readQualityScore"]][idx,"quality"] ,
> qaSummary[["readQualityScore"]][idx,"density"])
> a #reads with a quality >Q29
>
> #How to get the total number ? or percent compared to the total number
> of reads ?
>
> thanks
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
More information about the Bioconductor
mailing list