[BioC] bug in Biostrings mismatchTable?
Hervé Pagès
hpages at fhcrc.org
Fri Oct 12 09:45:36 CEST 2012
Hi Janet,
Thanks again for the bug report. This one should be fixed in Biostrings
2.26.2 (release) and 2.27.3 (devel).
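
For readers following the thread, the out-of-range behavior can be checked by hand without Biostrings. The sketch below is not the package's implementation. It only compares the two sequences base by base, ungapped, starting from the alignment start positions given in the report quoted below (7 in the pattern, 1 in the subject). All variable names are made up for illustration.

```r
# Sequences and positions taken from the report quoted below.
pattern_seq   <- "GCTGAAGTAGTTCTCCAGAA"
subject_seq   <- "GTAGTTCTCCAAAGT"
pattern_start <- 7    # start of the aligned region in the pattern
subject_start <- 1    # start of the aligned region in the subject
aligned_end   <- 17   # end(pattern(aln1)) as shown in the report

# Split each sequence into single bases from its alignment start onward.
p <- strsplit(substring(pattern_seq, pattern_start), "")[[1]]
s <- strsplit(substring(subject_seq, subject_start), "")[[1]]

# Compare only over the overlapping length (no gaps assumed).
n   <- min(length(p), length(s))
idx <- which(p[seq_len(n)] != s[seq_len(n)])

mismatches <- data.frame(
  PatternPos  = idx + pattern_start - 1,
  SubjectPos  = idx + subject_start - 1,
  PatternBase = p[idx],
  SubjectBase = s[idx],
  stringsAsFactors = FALSE
)

# Flag whether each mismatch lies inside the reported aligned range.
mismatches$InsideAlignment <- mismatches$PatternPos <= aligned_end
print(mismatches)
```

Under these assumptions the comparison flags pattern positions 18 and 20 (subject positions 12 and 14). Both lie past the reported end of 17, so neither falls inside the local alignment.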
Cheers,
H.
On 10/10/2012 05:13 PM, Janet Young wrote:
> Hi there,
>
> I think I've found a bug in mismatchTable (Biostrings). It's reporting a mismatch after the end of the reported alignment. I think the code below shows the problem.
>
> thanks, as usual!
>
> Janet
>
> #####
>
> library(Biostrings)
>
> ### Two sequences: the middle portion aligns, but the last few bases don't. I'm not interested in those last few bases, so I do a local alignment.
> seq1 <- DNAString("GCTGAAGTAGTTCTCCAGAA")
> seq2 <- DNAString("GTAGTTCTCCAAAGT")
> aln1 <- pairwiseAlignment ( seq1, seq2, type="local" )
> aln1
> # Local PairwiseAlignmentsSingleSubject (1 of 1)
> # pattern: [7] GTAGTTCTCCA
> # subject: [1] GTAGTTCTCCA
> # score: 21.79932
>
> end(pattern(aln1))
> # [1] 17
>
> mismatchTable(aln1)
> #   PatternId PatternStart PatternEnd PatternSubstring PatternQuality
> # 1         1           18         18                G              7
> #   SubjectStart SubjectEnd SubjectSubstring SubjectQuality
> # 1           12         12                A              7
> #### The one mismatch that's reported (pattern position 18) is after the end of the alignment as reported above (position 17). There's another mismatch after the end of the alignment that wasn't reported.
>
> sessionInfo()
>
> R Under development (unstable) (2012-10-03 r60868)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Biostrings_2.27.2 IRanges_1.17.0 BiocGenerics_0.5.0
>
> loaded via a namespace (and not attached):
> [1] parallel_2.16.0 stats4_2.16.0
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319