[Bioc-sig-seq] readAligned for Illumina's export file
Martin Morgan
mtmorgan at fhcrc.org
Fri Nov 5 17:27:33 CET 2010
On 11/05/2010 08:51 AM, Kunbin Qu wrote:
> Dear all,
>
> Can readAligned() or another function read in reads mapped across
> junctions in the "export" file (e.g., s_1_export.txt) from
> Illumina's pipeline? Below is an example of a regular mapping
> entry and of a read mapped across two exons. I have a test file
> named s1Test, and when I use the following command, only the first
> read is read in. Thanks.
It's tricky to know what your file looks like, but this should be parsed
by readAligned.
> x = readAligned("/tmp/kunbin_export.txt", type="SolexaExport")
> x
class: AlignedRead
length: 2 reads; width: 51 cycles
chromosome: chrX.fa splice_sites-auto.faDHRS7_50_50_chr14.fa_59681484_59685824
position: 108773654 20
strand: + -
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig
> sread(x)
A DNAStringSet instance of length 2
width seq
[1] 51 NTTTTAAAAACAGAATTTCTGCTCTATAATAACACAGCTAAAGGGAAATAA
[2] 51 NGAACTTTAAGAGTGGTGTGGATGCAGACTCTTCTTATTTTAAAATCTTTA
> quality(x)
class: SFastqQuality
quality:
A BStringSet instance of length 2
width seq
[1] 51 BKOJHRQPPO_QQ_____b_b___b_bb_bb__bb__b_b___bbb_b__Q
[2] 51 BKIKKUUTTU_____[[[[[[[[[[_b_____b______QQQ__b___b__
Maybe your 'cfil' filter (in the command quoted below) filters out some
'chromosomes' (a name that should probably have been something else, rseq?).
> chromosome(x)
[1] chrX.fa
[2] splice_sites-auto.faDHRS7_50_50_chr14.fa_59681484_59685824
2 Levels: chrX.fa ...
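To illustrate how a filter on chromosome names could leave only the first read, here is a minimal base-R sketch. The pattern is hypothetical, since the post does not show how the filter passed to readAligned was defined; the two names are those reported by chromosome(x).

```r
## Chromosome names as reported by chromosome(x) above.
chrom <- c("chrX.fa",
           "splice_sites-auto.faDHRS7_50_50_chr14.fa_59681484_59685824")

## Hypothetical pattern (an assumption, not taken from the original post):
## anchored so that only plain chromosome names such as chrX.fa match.
pattern <- "^chr[0-9XY]+\\.fa$"

## TRUE marks the reads that a filter built on this pattern would keep.
keep <- grepl(pattern, chrom)
chrom[keep]
```

If something like this is what happened, the junction read is parsed but then discarded by the filter. Reading without the filter argument, or with a looser pattern passed to ShortRead's chromosomeFilter(), should then return both reads.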
More hints on what 'it can only read the first read' means might help.
Martin
>
> -Kunbin
>
> SEQUENCER01 10 1 1 5110 943 0 1
> NTTTTAAAAACAGAATTTCTGCTCTATAATAACACAGCTAAAGGGAAATAA
> BKOJHRQPPO_QQ_____b_b___b_bb_bb__bb__b_b___bbb_b__Q chrX.fa
> 108773654 F T50 199
> Y
>
> SEQUENCER01 10 1 1 2815 941 0 1
> NGAACTTTAAGAGTGGTGTGGATGCAGACTCTTCTTATTTTAAAATCTTTA
> BKIKKUUTTU_____[[[[[[[[[[_b_____b______QQQ__b___b__
> splice_sites-auto.faDHRS7_50_50_chr14.fa_59681484_59685824 20
> R A50 200 Y
>
>
>
>
>> s1t<-readAligned("./", pattern="s1Test", type="SolexaExport",
>>     filter=cfil)
>> sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-unknown-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] ShortRead_1.6.2     Rsamtools_1.0.1     lattice_0.19-11
> [4] Biostrings_2.16.7   GenomicRanges_1.0.1 IRanges_1.6.8
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.8.0 grid_2.11.0   hwriter_1.2   tools_2.11.0
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793