[Bioc-sig-seq] Size of Illumina FASTQ files to be read in ShortRead
Anastasia Gioti
anastasia.gioti at ebc.uu.se
Wed Jun 24 20:37:33 CEST 2009
Dear list,
I just started working with the ShortRead package to read FASTQ
files from the Illumina analyzer, and I have some issues.
The most important is that readFastq crashes, because of
memory I suppose, when I try to read files >1GB. Example:
> fqpattern='s_3_1_sequence.txt'
> afrN=file.path(analysisPath(sp), fqpattern)
> afrN
[1] "/Users/nat/Data/Illumina/Solexa_disk_modforR/Data/HJSN_FC1_280409_3//Data/C1-C55Firecrest/Bustard1.3.2_06-05-2009_rdixon/GERALD_06-05-2009_rdixon/s_3_1_sequence.txt"
> afrNq=readFastq(sp, fqpattern)
Error: cannot allocate vector of size 27.0 Mb
R(1337,0xa07a2720) malloc: *** mmap(size=28340224) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(1337,0xa07a2720) malloc: *** mmap(size=28340224) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
I only succeeded in reading a file < 1GB, but I suppose that the
ShortRead package is designed for big files ;-).
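
For illustration only, here is a minimal base-R sketch of reading a FASTQ
file in fixed-size chunks instead of loading it all at once. It does not
use ShortRead; the helper name countFastqChunked and the chunk size are
hypothetical, and it assumes the usual four-lines-per-record layout.

```r
# Hypothetical helper (not part of ShortRead): count records and bases in a
# FASTQ file while holding at most `chunkRecords` records in memory.
# Assumes exactly 4 lines per record (no wrapped sequence lines).
countFastqChunked <- function(path, chunkRecords = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  nRecords <- 0
  nBases <- 0
  repeat {
    # Each call continues from where the previous one stopped.
    lines <- readLines(con, n = 4L * chunkRecords)
    if (length(lines) == 0L) break
    if (length(lines) %% 4L != 0L)
      stop("truncated FASTQ record after record ", nRecords)
    # Line 2 of every 4-line record holds the sequence.
    seqs <- lines[seq(2L, length(lines), by = 4L)]
    nRecords <- nRecords + length(seqs)
    nBases <- nBases + sum(nchar(seqs))
  }
  list(records = nRecords, bases = nBases)
}

# Self-contained demonstration on a temporary two-record file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("@r1", "ACGT", "+", "IIII",
             "@r2", "GGCCA", "+", "IIIII"), tmp)
res <- countFastqChunked(tmp, chunkRecords = 1L)
print(res)
```

Peak memory here is set by the chunk size rather than by the file size,
which is the property a >1GB file would need.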
Another, minor issue concerns the folder names in the Illumina output
directory that I need to specify in exptPath so that
p=SolexaPath(exptPath) is parsed correctly. I eventually worked out
the logic behind this, but I would like to confirm that the path
absolutely needs to contain this string: Data/C1-C(readlength)Firecrest.
At least in my hands it does not work with other names that Illumina
currently produces (for example IPAR instead of Firecrest). Is that
correct? Is the parser perhaps hard-coded for previous versions of the
Illumina output? If so, is there any plan to update it? This is not
very important, though.
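
To make the folder layout described above concrete, the following base-R
sketch tests whether a path contains Data/C1-C(readlength)Firecrest. It
is not ShortRead code and makes no claim about how SolexaPath parses
paths; the function name hasFirecrestDir and the IPAR folder name are
made up for the example.

```r
# Hypothetical check, not ShortRead code: does a path contain the folder
# layout "Data/C1-C<readlength>Firecrest"?
hasFirecrestDir <- function(path) {
  grepl("Data/C1-C[0-9]+Firecrest", path)
}

paths <- c(
  "Data/C1-C55Firecrest/Bustard1.3.2_06-05-2009_rdixon",
  "Data/IPAR_example/Bustard1.3.2_06-05-2009_rdixon"  # made-up IPAR name
)
ok <- hasFirecrestDir(paths)
print(ok)
```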
I use R 2.8 on Mac OS X Leopard with 8GB of memory, so I do not think
that my problem with FASTQ files comes from my computer...
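
One thing that can be checked in this situation is whether the R build
itself is 32-bit or 64-bit, since a 32-bit process cannot address all of
8GB however much memory is installed. Whether that applies here is an
assumption; the snippet below only reports what the running session is.

```r
# Report whether this R session is a 32-bit or 64-bit build.
# .Machine$sizeof.pointer is the pointer size in bytes (4 or 8).
pointerBits <- 8L * .Machine$sizeof.pointer
cat("R version:   ", R.version.string, "\n")
cat("Architecture:", R.version$arch, "\n")
cat("Pointer size:", pointerBits, "bits\n")
```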
Any help or suggestions are welcome!
Thank you,
Anastasia Gioti
Post-Doc, Evolutionary Biology Department
Uppsala University
Norbyvagen 18D
SE-752 36 UPPSALA
anastasia.gioti at ebc.uu.se
Tel: +46-18-471 6465
Fax: +46-18-471 6310