[BioC] TEQC package issue with chromosome format

nathalie nac at sanger.ac.uk
Thu Jun 7 15:28:26 CEST 2012


Hi,
I would like to analyse the coverage of my BAM files using the TEQC 
package. The reads have been aligned to a reference with the following format:
chr number (1-19, X, Y), start (integer), end (integer)
The chromosome names do not carry the "chr" prefix.

When I try to create the targets file with the no-chr nomenclature, it 
fails with the following error message:
 > targets<-get.targets("NOchr.txt", chrcol=1,startcol=2,endcol=3, 
zerobased=F, sep="\t",skip=0)
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = 
"IRanges") :
   solving row 1: range cannot be determined from the supplied arguments 
(too many NAs)

This works when I change the format to use "chr" prefixes:
 > head(targets)
RangedData with 6 rows and 0 value columns across 21 spaces
      space             ranges |
<factor> <IRanges> |
1     chr1 [3206100, 3207051] |
2     chr1 [3411780, 3411984] |
3     chr1 [3660630, 3661431] |
4     chr1 [4334678, 4340174] |
5     chr1 [4341988, 4342164] |
6     chr1 [4342280, 4342908] |
But then my BAM reads are in the wrong format, as they don't have those 
prefixes:
 > head(mybams.bam)
RangedData with 6 rows and 1 value column across 211 spaces
      space             ranges |                               ID
<factor> <IRanges> | <character>
1        1 [3000748, 3000822] | HS10_07304:1:1301:15698:141841#2
2        1 [3000748, 3000822] |   HS2_07343:1:2107:4612:106954#2
3        1 [3000748, 3000822] |   HS2_07343:2:1204:4374:169685#2
4        1 [3000818, 3000892] | HS10_07304:1:1301:15698:141841#2
5        1 [3000818, 3000892] |   HS2_07343:1:2107:4612:106954#2
6        1 [3000818, 3000892] |   HS2_07343:2:1204:4374:169685#2


Is it possible to make the function accept the no-chr coordinates, or is 
the only way to change everything back to chr prefixes?
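
For reference, the workaround I have in mind is to add the prefix to the targets file before calling get.targets(). Below is a minimal base R sketch, assuming a tab-separated file with the chromosome in column 1. The helper add_chr_prefix and the toy data frame are hypothetical names used for illustration only; they are not part of TEQC.

```r
# Hypothetical helper (not part of TEQC): add a "chr" prefix to chromosome
# names that lack one, leaving already-prefixed names untouched.
add_chr_prefix <- function(chrom) {
  chrom <- as.character(chrom)
  ifelse(grepl("^chr", chrom), chrom, paste0("chr", chrom))
}

# Toy stand-in for the tab-separated targets file (chromosome, start, end).
# In practice this would come from read.delim("NOchr.txt", header = FALSE).
bed <- data.frame(chrom = c("1", "1", "X"),
                  start = c(3206100L, 3411780L, 5000L),
                  end   = c(3207051L, 3411984L, 6000L),
                  stringsAsFactors = FALSE)
bed$chrom <- add_chr_prefix(bed$chrom)

# Write a prefixed copy that could then be passed to get.targets().
prefixed_file <- tempfile(fileext = ".txt")
write.table(bed, prefixed_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The same helper could in principle be applied to the chromosome column of the reads if they are first exported to a BED-like table. Whether the space names of an existing RangedData object can simply be rewritten in place is not something this sketch assumes.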

many thanks
Nat
 > sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=C
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] TEQC_2.4.0          hwriter_1.3         Rsamtools_1.8.4
[4] Biostrings_2.24.1   GenomicRanges_1.8.3 IRanges_1.14.2
[7] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 stats4_2.15.0  zlibbioc_1.2.0
