[BioC] a possible bug of trimLRPatterns

Fri Mar 9 12:10:54 CET 2012

On 03/08/2012 04:40 PM, wang peter wrote:
> 	reads <- readFastq(fastqfile)
> 	seqs <- sread(reads)
> 	max.mismatchs <- mismatch_rate * 1:nchar(DNAString(PCR2rc))
> 	trimmedCoords <- trimLRPatterns(Rpattern = PCR2rc, subject = seqs,
> 	    max.Rmismatch = max.mismatchs, with.Rindels = TRUE, ranges = TRUE)
>
>> end(trimmedCoords)[1:20]
>   [1] 22 18 20 33 14 22 22 20 22 22 22 15 20 37 19 13 20 22  0 34
>> start(trimmedCoords)[1:20]
>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>
> There is a "0" among the end values of trimmedCoords, so I cannot get the trimmed sequences.

The sequence has been trimmed entirely. A range with start 1 and end 0 has
width 0, which is valid and selects the empty string:

 > as.character(Views(DNAString("AA"), IRanges(1, 1)))
[1] "A"
 > as.character(Views(DNAString("AA"), IRanges(1, 0)))
[1] ""
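
The same idea applied to a whole set of reads might look like the sketch
below. It is only an illustration: the two toy sequences and the
hand-built trimmedCoords are made up here to stand in for the output of
trimLRPatterns(..., ranges=TRUE), and dropping the empty reads at the end
is one possible choice, not something the function requires.

```r
library(Biostrings)

# Toy stand-ins (assumptions, not data from the original post)
seqs <- DNAStringSet(c("ACGTACGTTT", "TTTT"))
# Same shape as the ranges returned with ranges=TRUE; the second read
# is trimmed entirely, so its range is start 1, end 0
trimmedCoords <- IRanges(start = c(1, 1), end = c(8, 0))

# A zero-width range is legal: its width is simply 0
width(trimmedCoords)

# Keep the part of each read described by the range; a fully trimmed
# read becomes an empty sequence rather than an error
trimmed <- narrow(seqs, start = start(trimmedCoords),
                  end = end(trimmedCoords))
as.character(trimmed)

# Optionally drop reads that were trimmed away completely
keep <- width(trimmedCoords) > 0
nonEmpty <- trimmed[keep]
```

If only the trimmed sequences are needed, calling trimLRPatterns() with
its default ranges=FALSE returns them directly, so the coordinates do not
have to be applied by hand.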

Martin

>
> trimmed3End<- narrow(reads, start=end(trimmedCoords), end=width(reads))
>
> R version 2.14.1 (2011-12-22)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] ShortRead_1.12.4    latticeExtra_0.6-19 RColorBrewer_1.0-5
> [4] Rsamtools_1.6.3     lattice_0.20-0      Biostrings_2.22.0
> [7] GenomicRanges_1.6.7 IRanges_1.12.6
>
> loaded via a namespace (and not attached):
>   [1] Biobase_2.14.0     bitops_1.0-4.1     BSgenome_1.22.0    grid_2.14.1
>   [5] hwriter_1.3        RCurl_1.91-1       rtracklayer_1.14.4 tools_2.14.1
>   [9] XML_3.9-4          zlibbioc_1.0.0
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793