[Bioc-sig-seq] getSeq and Btaurus$chrUn.scaffolds - ambiguous name error
Janet Young
jayoung at fhcrc.org
Tue Nov 16 03:51:01 CET 2010
Hi again,
I'm interested in some sequences on the cow chrUn scaffolds, and am
having a bit of bother getting them. I think I might have uncovered a
bug, although I might just be doing something wrong. The code and
output below should explain all. Any suggestions?
thanks (again!),
Janet Young
-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung ...at... fhcrc.org
http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------
library(BSgenome.Btaurus.UCSC.bosTau4)
###### this works, gives me a 50bp sequence
getSeq(Btaurus,"chr1",start=1,end=50)
[1] "TACCCCACTCACACTTATGGATAGATCAACTAAACAGAAAATTAACAAGG"
####### for some scaffolds in the chrUn.scaffolds pile, I don't get an
####### error message, but getSeq seems to ignore the start and end
####### coordinates requested - e.g. the sequence returned here is the whole
####### scaffold, not just the first 50bp
getSeq(Btaurus,"chrUn.004.11829",start=1,end=50)
[1]
"TCATGTGTTTCTTCCAGTCCAGCATTTCTCATGATGTACTCTGCATATAAGTTAAATAAACAGGGTGACAA
TATACAGCCTTGATGAACTCCTTTTCCTATTTGGAACCAGTCTGTTGTTCCATGTCCAGTTCTAACTGTTGCTTCCTGACCTGCATACAGATTTCTCAAGAGGCAGATCAGGTGTTCTCATCTCCTGAGAATTGAAGGTACAAATTGTAGTGTTTCAATTGGCACCATGCTAATTTATCTTGGCCTAAAATAGTGAATGGGCTTCCCTGGTGGCTCAGGTGGTAAAGAATCTGCCTGCAATGCTGGAGACCTGGGTTCAATATCTGGGTTGGGAAGATTACCTGGAGGAGGGCATGGAGGCTTACTCGAATATTCTTGCCTGGAAAATCTCCATGGACAGAGAAGCTGGGTGGGTTACTGTCCATGGGGTCGCAAAGAGTCAGACGTGACTGAGCAACTAAGCACAGCACAACACAAAATAGTGAATACTGAGCAAGTAAAGGAAAAACCTCTTCCTCTCAGAAATTGGTCTTCATTTTTTCATGAGAATTGCTAGTCTTCCTCCCAAAGCCAAAACCATAAATTTGTTAGTGTTTGACCTCAATATATTTTCTCTTAACTCAGCTTTTAAACCTTCTCTGCCTCCTGCTACCATTCACTTTCTAGTACATTTGAAATCTGTCCAAGCCATTCCTGGGGTTCAGGTGTCTGAGACCTGATTTATTTCATTGATATATTAAAACACCCTTGAATCCAGCCAACGTATGTGGCCAGTTTTACTTGCTTTGCTCCCATACTGGTAATGGAATTTTTATGGCTGTAAAATATCTGGGTCATGTGGCATTTTCATCTTCTGTTGTCTTGAGCTGGTATAGTTTTACCAACGTGCCATTAAGGGATGGTTCCTTTACCATCATTGTGCTTCCTGGGGCCTTGCCCACTTTGCACTGTAAGTCAGAACAAGAGACCCTCCAAGTATTTAATTTCC"
#### for other scaffolds I just get an error message, although the
#### named scaffold definitely exists (is it doing a partial match on the
#### name, not an exact match?)
getSeq(Btaurus,"chrUn.004.1022")
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i], :
  sequence chrUn.004.1022 found more than once, please use a non-ambiguous name
which ( names(Btaurus$chrUn.scaffolds) == "chrUn.004.1022" )
[1] 1022
grep ( "chrUn.004.1022" , names(Btaurus$chrUn.scaffolds) )
[1] 1022 10220 10221 10222 10223 10224 10225 10226 10227 10228 10229
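
The difference between the two lookups above can be reproduced with base R alone. The sketch below uses a made-up vector, scaffold_names, as a stand-in for names(Btaurus$chrUn.scaffolds). It only shows how an unanchored pattern match differs from an exact comparison. Whether getSeq resolves names this way internally is an assumption, not something confirmed here.

```r
# Hypothetical stand-in for names(Btaurus$chrUn.scaffolds); the real
# object has many more entries.
scaffold_names <- c("chrUn.004.1021", "chrUn.004.1022",
                    "chrUn.004.10220", "chrUn.004.10221",
                    "chrUn.004.11829")
target <- "chrUn.004.1022"

# grep() treats the name as an unanchored regular expression, so it also
# hits every longer name that contains the target as a substring.
partial_hits <- grep(target, scaffold_names)

# fixed = TRUE turns off regex interpretation (the "." becomes literal),
# but the match is still a substring match.
fixed_hits <- grep(target, scaffold_names, fixed = TRUE)

# Exact comparison finds only the one scaffold with that name.
exact_hit <- which(scaffold_names == target)

# match() is another exact lookup; it returns the first matching position.
match_hit <- match(target, scaffold_names)

print(partial_hits)
print(fixed_hits)
print(exact_hit)
print(match_hit)
```

With this toy vector, both grep() calls report three positions while the exact lookups report one, which is the same pattern as the 1022 versus 1022, 10220, 10221, ... result above.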
sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Btaurus.UCSC.bosTau4_1.3.16 BSgenome_1.18.1
[3] Biostrings_2.18.0 GenomicRanges_1.2.1
[5] IRanges_1.8.2
loaded via a namespace (and not attached):
[1] Biobase_2.10.0