[BioC] trimTails function in ShortRead package gives different results on the same input
Zhenyu Xu
zxu at embl.de
Mon Oct 15 14:34:45 CEST 2012
Hi ShortRead package developer,
I am using the function trimTails to trim low-quality bases from reads produced by a 454 sequencing machine. However, I get different results when I run the same command several times, starting from the same ShortReadQ object and using the same trimming parameters. I see this on CentOS Linux machines (6.2 and 6.3). I also tried it on my own Mac, where the results are identical across runs. So the problem seems to be restricted to CentOS Linux (I am not sure whether other Linux platforms are affected). The data set (~11 MB) can be downloaded at http://dl.dropbox.com/u/68829208/454reads.rds.
best,
zhenyu
Please see the following session transcript:
wget http://dl.dropbox.com/u/68829208/454reads.rds
R
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(ShortRead)
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following object(s) are masked from ‘package:stats’:
xtabs
The following object(s) are masked from ‘package:base’:
anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
rownames, sapply, setdiff, table, tapply, union, unique
Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: lattice
Loading required package: Rsamtools
Loading required package: latticeExtra
Loading required package: RColorBrewer
> readsSub <- readRDS("454reads.rds")
> readsSub
class: ShortReadQ
length: 5460 reads; width: 5..424 cycles
> trimTails(readsSub, 20, "5", successive=TRUE)
class: ShortReadQ
length: 5460 reads; width: 3..416 cycles
> trimTails(readsSub, 20, "5", successive=TRUE)
class: ShortReadQ
length: 5460 reads; width: 3..416 cycles
> trimTails(readsSub, 20, "5", successive=TRUE)
class: ShortReadQ
length: 5460 reads; width: 4..424 cycles
> trimTails(readsSub, 20, "5", successive=TRUE)
class: ShortReadQ
length: 5460 reads; width: 5..416 cycles
> trimTails(readsSub, 20, "5", successive=TRUE)
class: ShortReadQ
length: 5460 reads; width: 4..424 cycles
> x = trimTails(readsSub, 20, "5", successive=TRUE)
> y = trimTails(readsSub, 20, "5", successive=TRUE)
> sum(width(x)!=width(y))
[1] 1325
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.14.4 latticeExtra_0.6-19 RColorBrewer_1.0-5
[4] Rsamtools_1.8.5 lattice_0.20-6 Biostrings_2.24.1
[7] GenomicRanges_1.8.9 IRanges_1.14.4 BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 grid_2.15.1 hwriter_1.3 stats4_2.15.1
[6] zlibbioc_1.2.0
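To narrow the report down, it may help to identify exactly which reads are trimmed differently from run to run. The following is only a minimal sketch: find_unstable_reads is a hypothetical helper written for this message, not part of ShortRead, and the guarded section assumes that ShortRead is installed and that 454reads.rds is in the working directory.

```r
# Hypothetical helper (not part of ShortRead): call `trim_once` n times and
# return the indices of reads whose trimmed width changes between runs.
find_unstable_reads <- function(trim_once, n = 5) {
  # One column of widths per run; cbind keeps a matrix even for a single read.
  widths <- do.call(cbind, lapply(seq_len(n), function(i) trim_once()))
  # A read is unstable if any run disagrees with the first run.
  which(rowSums(widths != widths[, 1]) > 0)
}

# Intended use with the objects from the session above. This part is skipped
# unless ShortRead and the data file are available.
if (requireNamespace("ShortRead", quietly = TRUE) && file.exists("454reads.rds")) {
  library(ShortRead)
  readsSub <- readRDS("454reads.rds")
  unstable <- find_unstable_reads(function()
    width(trimTails(readsSub, 20, "5", successive = TRUE)))
  print(length(unstable))          # reads whose trimming is not reproducible
  print(readsSub[head(unstable)])  # a small subset to attach to a bug report
}
```

If a handful of reads in the unstable subset is enough to reproduce the discrepancy, that much smaller object would be easier to share than the full data set.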