[Bioc-sig-seq] ShortRead internal: too many 'snap' entries
Yanwei Tan
Tan at nbio.uni-heidelberg.de
Sun Apr 4 23:07:29 CEST 2010
Dear Martin,
I used nFilter to filter out the sequences that contain any "N".
Here is my code:
> # read the fastq file
> fq<-readFastq("/Users/wei/Desktop/Originaldata",pattern="Bic.txt")
> # filter for N containing reads
> filt<-nFilter()
> fq<-fq[filt(fq)]
> # write the output
> writeFastq(fq,file="/Users/wei/Desktop/Originaldata/bicfiltered.txt")
After I got the filtered fastq file, reading it back in fails:
> readFastq("/Users/wei/Desktop/Originaldata", "bicfiltered.txt")
Error in .local(dirPath, pattern,...) :
ShortRead internal: too many 'snap' entries
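Since an earlier message in this thread attributed the same error to a missing newline after the last record, one quick check is to inspect the written file directly. The following is only a sketch in base R, not ShortRead code: check_fastq_layout is a hypothetical helper, and it assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines). It reports whether the file ends with a newline and whether the line count is a multiple of four; it does not confirm that either is the cause of the error above.

```r
# Hypothetical helper (not part of ShortRead): basic layout checks on a FASTQ file.
# Assumes four lines per record: "@" header, sequence, "+" line, qualities.
check_fastq_layout <- function(path) {
  size <- file.info(path)$size
  # Read the raw bytes so the very last byte can be inspected directly.
  bytes <- readBin(path, what = "raw", n = size)
  ends_with_newline <- size > 0 && bytes[size] == as.raw(10L)
  # readLines() warns about an incomplete final line; that case is
  # already reported through ends_with_newline, so silence the warning.
  lines <- suppressWarnings(readLines(path))
  idx <- seq_along(lines)
  list(
    ends_with_newline = ends_with_newline,
    n_lines = length(lines),
    complete_records = length(lines) %% 4 == 0,
    # Lines 1, 5, 9, ... should start with "@"; lines 3, 7, 11, ... with "+".
    markers_ok = all(substr(lines[idx %% 4 == 1], 1, 1) == "@") &&
      all(substr(lines[idx %% 4 == 3], 1, 1) == "+")
  )
}

# Usage on a small made-up file (not real data):
tmp <- tempfile(fileext = ".txt")
writeLines(c("@read1", "ACGT", "+read1", "IIII"), tmp)
str(check_fastq_layout(tmp))
```

To check the real output, pass the full path of the file that was given to writeFastq.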
My sessionInfo():
R version 2.10.1 (2009-12-14)
x86_64-apple-darwin9.8.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ShortRead_1.4.0 lattice_0.17-26 BSgenome_1.14.2 Biostrings_2.14.12 IRanges_1.4.11
loaded via a namespace (and not attached):
[1] Biobase_2.6.1 grid_2.10.1 hwriter_1.1 tools_2.10.1
Many thanks!
Wei
On 4/4/10 10:31 PM, Martin Morgan wrote:
> On 04/04/2010 11:55 AM, Yanwei Tan wrote:
>
>> Hi Ramzi Temanni,
>>
>> I ran into the same problem as you when running ShortRead. As Martin
>> mentioned, a newline is missing after the last record in the file. How
>> did you fix this problem? I do not know how to add a newline after the
>> last line. My data is a fastq file; I just filtered out the reads that
>> contain N using the nFilter function in the ShortRead package.
>>
> In off-list email you said
>
>
>> I used the ShortRead package to filter the data and then saved it as a
>> fastq file. But when I run the qa function again there is an error in
>> .local(dirPath, pattern, ...): ShortRead internal: too many
>> 'snap' entries.
>>
> It is hard to follow what you are trying to accomplish. Please paste
> short code to illustrate. Use data files from ShortRead, so that your
> code is reproducible by others. Include the output of sessionInfo() so
> that it is clear which version of software you are using. Perhaps after
>
> example(readFastq)
>
> you do
>
>
>> rfq
>>
> class: ShortReadQ
> length: 256 reads; width: 36 cycles
>
>> file = tempfile() # a file to save output
>> noNrfq = rfq[nFilter()(rfq)]
>> writeFastq(noNrfq, file)
>> qaresult = qa(dirname(file), basename(file), type="fastq")
>>
> ? But what is the problem? Note also that it is not necessary to write
> the fastq file to disk,
>
>
>> qa(list(noNrfq=noNrfq))
>>
> class: ShortReadQQA(9)
> QA elements (access with qa[["elt"]]):
> readCounts: data.frame(1 3)
> baseCalls: data.frame(1 5)
> readQualityScore: data.frame(512 4)
> baseQuality: data.frame(94 3)
> alignQuality: data.frame(1 3)
> frequentSequences: data.frame(50 4)
> sequenceDistribution: data.frame(3 4)
> perCycle: list(2)
> baseCall: data.frame(141 4)
> quality: data.frame(341 5)
> perTile: list(2)
> readCounts: data.frame(0 4)
> medianReadQualityScore: data.frame(0 4)
>
> This is my sessionInfo()
>
>
>> sessionInfo()
>>
> R version 2.10.1 Patched (2010-03-27 r51570)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ShortRead_1.4.0 lattice_0.18-3 BSgenome_1.14.2 Biostrings_2.14.12
> [5] IRanges_1.4.16
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.1 grid_2.10.1 hwriter_1.2 tools_2.10.1
>
>
>> Many thanks in advance!
>> Wei
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
--
Yanwei Tan
Institute of Neurobiology
1.OG, AG Bading
Im Neuenheimer Feld 364
University of Heidelberg
69120 Heidelberg
Germany
Tel:+49-6221-548319
Fax:+49-6221-546700