[Bioc-sig-seq] ShortRead, feature request (if not a bug report)
Ivan Gregoretti
ivangreg at gmail.com
Tue May 17 23:35:48 CEST 2011
Hello ShortRead connoisseurs,
ShortRead::readAligned is very smart because it allows you to load the
content of a large file without decompressing it. For example:
aln <- readAligned("s_1_export.txt.gz", type="SolexaExport")
However, the analogous reading function ShortRead::readFasta on my
system complains that it cannot handle gzipped targets:
fas <- readFasta("s_1.fa.gz")
Error in .normargInputFilepath(filepath) :
file "s_1.fa.gz" has unsupported type: gzfile
Currently the solution seems to be:
system("gunzip -f s_1.fa.gz")
fas <- readFasta("s_1.fa")
system("gzip -9f s_1.fa")
but this code is highly inefficient, especially with large files.
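As a possible stopgap, base R can read through a gzfile() connection, so no uncompressed copy is written to disk. The sketch below is only an illustration under stated assumptions: readFastaGz is a hypothetical helper name, not a ShortRead function; it assumes a plain FASTA layout (">" header lines followed by sequence lines); and it loads the whole file into memory with readLines, which may itself be a problem for very large files.

```r
# Hypothetical helper, NOT part of ShortRead: read a gzipped FASTA file
# through a gzfile() connection instead of gunzip-ing it on disk first.
readFastaGz <- function(path) {
    con <- gzfile(path, open = "rt")
    on.exit(close(con))

    lines <- readLines(con)
    lines <- lines[nzchar(lines)]            # drop blank lines
    if (length(lines) == 0L)
        return(character(0))

    isHeader <- substr(lines, 1, 1) == ">"
    if (!isHeader[1])
        stop("file does not start with a FASTA header line")

    ids <- sub("^>", "", lines[isHeader])
    # Number each record; sequence lines inherit the number of their header.
    record <- cumsum(isHeader)
    groups <- factor(record[!isHeader], levels = seq_along(ids))

    # Join multi-line sequences; a record with no sequence lines becomes "".
    seqs <- vapply(split(lines[!isHeader], groups),
                   paste, character(1), collapse = "")
    names(seqs) <- ids
    seqs
}
```

The result is a named character vector, one element per record. If Biostrings is attached, it should be possible to pass that vector to DNAStringSet() to get a sequence container, though this does not reproduce everything readFasta returns.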
Please consider adding gzip support to readFasta, as readAligned already has.
In case it is a bug in my ShortRead version, see my session below.
Thank you,
Ivan
> sessionInfo()
R version 2.14.0 Under development (unstable) (2011-04-14 r55450)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
[7] LC_PAPER=en_US.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] annotate_1.31.0 AnnotationDbi_1.15.1 Biobase_2.13.1
[4] ShortRead_1.11.1 Rsamtools_1.5.9 lattice_0.19-26
[7] Biostrings_2.21.1 GenomicRanges_1.5.0 IRanges_1.11.1
loaded via a namespace (and not attached):
[1] DBI_0.2-5 grid_2.14.0 hwriter_1.3 RSQLite_0.9-4 tools_2.14.0
[6] xtable_1.5-6