[BioC] TEQC package issue with chromosome format
nathalie
nac at sanger.ac.uk
Thu Jun 7 15:28:26 CEST 2012
Hi,
I would like to analyse the coverage of my BAM files using the TEQC package.
The reads were aligned to a reference, and my targets file has the following format:
chromosome (1-19, X, Y), start (integer), end (integer)
The chromosome names do not have the "chr" prefix.
When I try to create the targets object from the file without "chr" prefixes, it
fails with the following error message:
> targets <- get.targets("NOchr.txt", chrcol=1, startcol=2, endcol=3,
zerobased=FALSE, sep="\t", skip=0)
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE =
"IRanges") :
solving row 1: range cannot be determined from the supplied arguments
(too many NAs)
It works when I change the format to use "chr" prefixes:
> head(targets)
RangedData with 6 rows and 0 value columns across 21 spaces
space ranges |
<factor> <IRanges> |
1 chr1 [3206100, 3207051] |
2 chr1 [3411780, 3411984] |
3 chr1 [3660630, 3661431] |
4 chr1 [4334678, 4340174] |
5 chr1 [4341988, 4342164] |
6 chr1 [4342280, 4342908] |
But then the chromosome names no longer match my BAM files, which do not
have those prefixes:
> head(mybams.bam)
RangedData with 6 rows and 1 value column across 211 spaces
space ranges | ID
<factor> <IRanges> | <character>
1 1 [3000748, 3000822] | HS10_07304:1:1301:15698:141841#2
2 1 [3000748, 3000822] | HS2_07343:1:2107:4612:106954#2
3 1 [3000748, 3000822] | HS2_07343:2:1204:4374:169685#2
4 1 [3000818, 3000892] | HS10_07304:1:1301:15698:141841#2
5 1 [3000818, 3000892] | HS2_07343:1:2107:4612:106954#2
6 1 [3000818, 3000892] | HS2_07343:2:1204:4374:169685#2
Is it possible to make the function accept coordinates without the "chr"
prefix, or is the only way to change everything back to "chr" prefixes?
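If renaming turns out to be unavoidable, one possible workaround is to normalise the chromosome labels so that targets and reads share the same naming scheme. The following is only a minimal base R sketch under stated assumptions: the helper names add_chr_prefix and strip_chr_prefix are hypothetical and not part of TEQC, and the small data frame merely stands in for a headerless, tab-separated targets file with chromosome, start and end columns.

```r
# Hypothetical helpers (NOT part of TEQC): make chromosome labels consistent.

# Prepend "chr" only to labels that do not already start with it.
add_chr_prefix <- function(x) {
  x <- as.character(x)
  ifelse(grepl("^chr", x), x, paste0("chr", x))
}

# Remove a leading "chr" where present; other labels are left unchanged.
strip_chr_prefix <- function(x) {
  sub("^chr", "", as.character(x))
}

# Illustrative stand-in for a targets table (chromosome, start, end).
targets_df <- data.frame(chr   = c("1", "1", "X"),
                         start = c(3206100L, 3411780L, 5000L),
                         end   = c(3207051L, 3411984L, 6000L),
                         stringsAsFactors = FALSE)

# Add the prefix and write the table back out without header or quotes.
targets_df$chr <- add_chr_prefix(targets_df$chr)
out_file <- tempfile(fileext = ".txt")
write.table(targets_df, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The rewritten file could then be passed to get.targets as in the call above. The reverse helper, strip_chr_prefix, could in principle be applied to the chromosome labels on the read side instead, but how to do that depends on the class of the reads object and is not shown here.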
many thanks
Nat
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] TEQC_2.4.0 hwriter_1.3 Rsamtools_1.8.4
[4] Biostrings_2.24.1 GenomicRanges_1.8.3 IRanges_1.14.2
[7] BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 stats4_2.15.0 zlibbioc_1.2.0