[Bioc-sig-seq] Size of Illumina fastaq files to be read in shortReads
Kasper Daniel Hansen
khansen at stat.berkeley.edu
Thu Jun 25 08:40:47 CEST 2009
Note that you are probably not using a 64-bit version of R, so you
cannot utilize all of your 8GB. Check by looking at
.Machine$sizeof.pointer (4 means a 32-bit build, 8 means 64-bit).
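The pointer-size check can be run at the R prompt, roughly like this. It is only a sketch: the variable names ptr_bytes and bits are illustrative and not part of any package.

```r
# .Machine is a built-in list describing the numerical
# characteristics of the machine R is running on.
ptr_bytes <- .Machine$sizeof.pointer   # size of a pointer, in bytes

# 8 bits per byte: 4 bytes gives a 32-bit build, 8 bytes a 64-bit build.
bits <- 8L * ptr_bytes
cat("This R build is", bits, "bit\n")
```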
At a minimum, upgrade to R-2.9 if you want to use Bioconductor for
short reads.
Kasper
On Jun 24, 2009, at 11:37 , Anastasia Gioti wrote:
> Dear list,
> I just started playing with the ShortRead package in order to read
> FASTQ files from the Illumina analyzer, and I have some issues.
> The most important is that readFastq crashes, because of memory
> I suppose, when I try to read files >1GB. Example:
> fqpattern='s_3_1_sequence.txt'
> > afrN=file.path(analysisPath(sp), fqpattern)
> > afrN
> [1] "/Users/nat/Data/Illumina/Solexa_disk_modforR/Data/
> HJSN_FC1_280409_3//Data/C1-C55Firecrest/
> Bustard1.3.2_06-05-2009_rdixon/GERALD_06-05-2009_rdixon/
> s_3_1_sequence.txt"
> > afrNq=readFastq(sp, fqpattern)
> Error: cannot allocate vector of size 27.0 Mb
> R(1337,0xa07a2720) malloc: *** mmap(size=28340224) failed (error
> code=12)
> *** error: can't allocate region
> *** set a breakpoint in malloc_error_break to debug
> R(1337,0xa07a2720) malloc: *** mmap(size=28340224) failed (error
> code=12)
> *** error: can't allocate region
> *** set a breakpoint in malloc_error_break to debug
>
> I only succeeded in reading a file < 1GB, but I suppose that the
> ShortRead package is designed for big files ;-).
> Another minor issue is the names of the folders in the Illumina
> output directory that I need to designate in exptPath so that
> p=SolexaPath(exptPath) is correctly parsed. I finally managed to
> find the logic behind this, but I would like to confirm that the
> path absolutely needs to contain this string:
> Data/C1-C(readlength)Firecrest. At least in my hands it would not
> work with other names (which are currently produced by Illumina,
> for example IPAR instead of Firecrest). Is that correct? Maybe this
> parser is hard-coded for previous versions of Illumina output? In
> that case, is there any plan to update it? This is not very
> important, though.
>
> I use R 2.8 on Mac OS X Leopard with 8GB of memory, so I think that
> my problem with FASTQ does not come from my computer...
> Any help /suggestions are welcome!
> Thank you,
>
> Anastasia Gioti
> Post-Doc, Evolutionary Biology Department
> Uppsala University
> Norbyvagen 18D
> SE-752 36 UPPSALA
> anastasia.gioti at ebc.uu.se
> Tel: +46-18-471 6465
> Fax: +46-18-471 6310
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing