[Bioc-sig-seq] 'coverage' error message
Patrick Aboyoun
paboyoun at fhcrc.org
Mon Jan 4 22:47:17 CET 2010
P.,
As the error message suggests, names(arab.chromlens) and
levels(chromosome(alns)) do not match, so the chromosome lengths
vector and the AlignedRead object are out of sync. The aligned reads
for this experiment came from a mouse model, not Arabidopsis thaliana,
so you need to reference BSgenome.Mmusculus.UCSC.mm9 when performing
these operations:
> filt1 <- alignDataFilter(expression(filtering=="Y"))
> filt2 <- chromosomeFilter("chr[0-9XYM]+.fa")
> filt <- compose(filt1, filt2)
> alns <- readAligned(extdataDir, pattern, type="SolexaExport",
+                     filter=filt)
> alns
class: AlignedRead
length: 195719 reads; width: 35 cycles
chromosome: chr11.fa chr9.fa ... chr8.fa chr4.fa
position: 104853312 3036336 ... 44295163 47191474
strand: - - ... - -
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig
> levels(alns@chromosome) <- sub(".fa$", "", levels(chromosome(alns)))
> library(BSgenome.Mmusculus.UCSC.mm9)
> mm9.chromlens <- seqlengths(Mmusculus)
> head(mm9.chromlens)
     chr1      chr2      chr3      chr4      chr5      chr6
197195432 181748087 159599783 155630120 152537259 149517037
> cov.mm9 <- coverage(alns, width = mm9.chromlens, extend = 126L)
> cov.mm9
SimpleRleList of length 22
$chr1
'integer' Rle of length 197195432 with 27263 runs
Lengths: 3018534 161 16703 161 68815 161 33063 161 58217 161 ...
Values : 0 1 0 1 0 1 0 1 0 1 ...
$chr10
'integer' Rle of length 129993255 with 21699 runs
Lengths: 3019736 161 11311 161 4238 161 10661 161 793 161 ...
Values : 0 1 0 1 0 1 0 1 0 1 ...
$chr11
'integer' Rle of length 121843856 with 22105 runs
Lengths: 3000315 6 40 79 9 4 23 6 2 38 ...
Values : 0 1 2 3 4 5 6 5 4 5 ...
$chr12
'integer' Rle of length 121257530 with 18183 runs
Lengths: 3002552 161 6903 161 4375 161 5041 161 2491 161 ...
Values : 0 1 0 1 0 1 0 1 0 1 ...
$chr13
'integer' Rle of length 120284312 with 15907 runs
Lengths: 3001262 161 5650 161 29080 161 111 40 121 40 ...
Values : 0 1 0 1 0 1 0 1 2 1 ...
...
<17 more elements>
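The runs of length 161 in the output above appear consistent with each 35-cycle read being extended by 126 bases (35 + 126 = 161). A minimal base-R sketch of that arithmetic follows; the start positions and the chromosome length are hypothetical, only plus-strand reads are considered, and this is an illustration rather than how coverage() is implemented:

```r
# Hypothetical toy data: two non-overlapping plus-strand reads on a
# 1000-base "chromosome". Only read_width and extend come from the post.
read_width <- 35L
extend     <- 126L
starts     <- c(101L, 400L)   # assumed start positions
chrom_len  <- 1000L           # assumed chromosome length

# Add one unit of coverage over each extended read.
cov <- integer(chrom_len)
for (s in starts) {
  idx <- s:(s + read_width + extend - 1L)
  cov[idx] <- cov[idx] + 1L
}

# Run-length encode, analogous to the Lengths/Values display above.
r <- rle(cov)
r$lengths   # 100 161 138 161 440
r$values    # 0 1 0 1 0
```

Overlapping reads would produce values above 1 and shorter runs, as in the chr11 element.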
> sessionInfo()
R version 2.11.0 Under development (unstable) (2010-01-02 r50884)
i386-apple-darwin9.8.0
locale:
[1] C/C/C/C/C/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] BSgenome.Mmusculus.UCSC.mm9_1.3.16
[2] BSgenome.Athaliana.TAIR.04232008_1.3.16
[3] ShortReadTutorial_0.0.1
[4] ShortRead_1.5.10
[5] lattice_0.17-26
[6] BSgenome_1.15.3
[7] Biostrings_2.15.11
[8] IRanges_1.5.23
loaded via a namespace (and not attached):
[1] Biobase_2.7.3 grid_2.11.0 hwriter_1.1 tools_2.11.0
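To see which names disagree before calling coverage, a check along these lines may help. It is a base-R sketch only: read_levels is a hypothetical stand-in for levels(chromosome(alns_8)) (only "chr1.fas" is visible in your output; the other entries are invented), and chromlens copies the values printed by head(arab.chromlens):

```r
# Hypothetical stand-in for levels(chromosome(alns_8)).
read_levels <- c("chr1.fas", "chr2.fas", "chrC.fas")

# Values copied from the head(arab.chromlens) output in the original post.
chromlens <- c(chr1 = 30432563L, chr2 = 19705359L, chr3 = 23470805L,
               chr4 = 18585042L, chr5 = 26992728L, chrC = 154478L)

# Levels with no matching name in the lengths vector.
unmatched <- setdiff(read_levels, names(chromlens))

# Strip a trailing ".fa" or ".fas" suffix, then check again.
clean_levels <- sub("\\.fas?$", "", read_levels)
still_unmatched <- setdiff(clean_levels, names(chromlens))

# Lengths for the chromosomes that actually carry reads.
chromlens[clean_levels]
```

If still_unmatched is empty, the cleaned levels can be assigned back to the AlignedRead object in the same way as in the session above.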
Cheers,
Patrick
pterry at huskers.unl.edu wrote:
> Dear bioc-sig-sequencing,
>
> I am trying to analyze Eland-aligned files for differential expression, using the 'A ChIP-Seq Data Analysis' handout from an 11/19/09 session at the 'High throughput sequence analysis tools and approaches with Bioconductor' workshop in Seattle.
>
> I generated an error message in the following output. Can you comment?
>
> ...
>
>
> > alns_8 <- readAligned(cdataDir, pattern, "SolexaExport")
> > alns_8
> class: AlignedRead
> length: 1380439 reads; width: 35 cycles
> chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas
> position: 7568294 167488 ... 4687256 5376960
> strand: + + ... + +
> alignQuality: NumericQuality
> alignData varLabels: run lane ... filtering contig
>
> > head(sread(alns_8))
> A DNAStringSet instance of length 6
> width seq
> [1] 35 AGCTATGATCAAGAGAACCTTTCACGATCANNNCN
> [2] 35 CGGACGACGGGTAGTTTCGGGCTGTACCAANNNAN
> [3] 35 AGCTCAGCGATCTGAGCCACTTGCTCTTTGNNNTN
> [4] 35 GGGCCATAGGCCCGTTAAAATATTTTTCTCTNNCT
> [5] 35 ATTGTCCATTGACAAATGAAGATATTGGGATNNTT
> [6] 35 ACCCCTCCACCAGTATGTTGGCGAAAATCTCNNCC
>
> > table(strand(alns_8), useNA="ifany")
>
>      -      +      *
> 689912 690527      0
>
> ...
>
>
> > library(BSgenome.Athaliana.TAIR.04232008)
> > arab.chromlens <- seqlengths(Athaliana)
> > head(arab.chromlens)
>     chr1     chr2     chr3     chr4     chr5     chrC
> 30432563 19705359 23470805 18585042 26992728   154478
>
> > cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L)
> Error: UserArgumentMismatch
> 'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))'
> see ?"AlignedRead-class"
>
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] BSgenome.Athaliana.TAIR.04232008_1.3.16
> [2] chipseq_0.2.0
> [3] ShortRead_1.4.0
> [4] lattice_0.17-26
> [5] BSgenome_1.14.0
> [6] Biostrings_2.14.1
> [7] IRanges_1.4.2
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.0 grid_2.10.1 hwriter_1.1
>
>
>
> Thanks,
> P. Terry
> pterry at huskers.unl.edu