[Bioc-sig-seq] behavior of XStringSet after c() step
Thomas Girke
thomas.girke at ucr.edu
Sun Nov 22 00:08:45 CET 2009
Dear List,
Is there an explanation for the difference in behavior between
XStringSet objects that have gone through an append() or c() step
and those that haven't? I did not observe this problem in the
previous R/BioC release.
Below is a simple example to reproduce this error.
Thanks in advance for your help.
Thomas
## Example
> library(Biostrings)
> dset1 <- DNAStringSet(c("GCATATTAC", "AATCGATCC", "GCATATTAC"))
> dset2 <- DNAStringSet(c("CCGCATATTAC", "AAAATCGATCC", "GCATATAATAC"))
> dset3 <- c(dset1, dset2) # using append() doesn't fix the problem
> reverseComplement(dset3)
Error in .local(x, ...) : IRanges internal error: length(x) != 1
> DNAStringSet(dset3, start=1, end=4)
Error in super(x) : Biostrings internal error: length(x@pool) != 1
## The problem goes away by doing the following
> dset3fix <- DNAStringSet(unlist(strsplit(toString(dset3), ", ")))
> reverseComplement(dset3fix)
  A DNAStringSet instance of length 6
    width seq
[1]     9 GTAATATGC
[2]     9 GGATCGATT
[3]     9 GTAATATGC
[4]    11 GTAATATGCGG
[5]    11 GGATCGATTTT
[6]    11 GTATTATATGC
> DNAStringSet(dset3fix, start=1, end=4)
  A DNAStringSet instance of length 6
    width seq
[1]     4 GCAT
[2]     4 AATC
[3]     4 GCAT
[4]     4 CCGC
[5]     4 AAAA
[6]     4 GCAT
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.14.1 IRanges_1.4.3
loaded via a namespace (and not attached):
[1] Biobase_2.6.0