[Bioc-sig-seq] ChIPpeakAnno annotatePeakInBatch problems in R2.13/ChIPpeakAnno_1.8.0 and R2.13/ChIPpeakAnno_2.0.2
Sonia Leach
sonia.leach at gmail.com
Wed Aug 31 21:52:43 CEST 2011
I had a problem with the original ChIPpeakAnno distribution
(ChIPpeakAnno_1.8.0 for R 2.13): depending on the number of spaces
in the RangedData annotation object passed to annotatePeakInBatch, I
would get the error:
Error in FUN(1L[[1L]], ...) : object 'r' not found
(see Problem 1 below). The error went away when I installed the
development version, ChIPpeakAnno_2.0.2, under R 2.13.
However, with that version, calling annotatePeakInBatch(...,
output="overlapping", multiple=FALSE) returned the same number of
results as annotatePeakInBatch(..., output="overlapping",
multiple=TRUE) (see Problem 2 below). The obvious workaround is to
keep one hit from among the multiples returned for each peak, but this
should be fixed.
The annotation file I used is just a bed6 dump from UCSC goldenpath.
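The workaround mentioned above (keeping one hit per peak) could be sketched in base R roughly as follows. This is only an illustration on a made-up data frame: the column names "peak" and "feature" and the IDs are assumptions standing in for the data frame form of an annotatePeakInBatch result, and "first hit wins" is an arbitrary choice, not necessarily what multiple=FALSE is meant to do.

```r
# Hypothetical stand-in for the result of annotatePeakInBatch converted to a
# data frame; the column names and values are assumptions for illustration.
hits <- data.frame(
  peak    = c("ID1_exon", "ID1_exon", "ID2_exon", "ID3_exon", "ID3_exon"),
  feature = c("uc001a", "uc001b", "uc002a", "uc003a", "uc003b"),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later rows for each peak,
# so negating it keeps only the first hit per peak.
one_hit <- hits[!duplicated(hits$peak), ]

nrow(hits)     # number of rows before de-duplication
nrow(one_hit)  # number of rows after keeping one hit per peak
```

If a particular hit is preferred (for example the one with the smallest distance to the feature), the data frame could be sorted on that column before the duplicated() step.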
============ Problem 1:
library(ChIPpeakAnno)
myPeak = RangedData(IRanges(start = c(17208381), end = c(17208381), names = c("Site1")), space = c("chr1"), strand = c('+'))
## This object has 25 spaces for chr1..22,X,Y,M
UCSC = read.delim('Annots/UCSC_knownGene.hg19.bed',header=FALSE)
UCSC_rangeD = RangedData(IRanges(start= UCSC[,2], end= UCSC[,3], names=UCSC[,4]), space=as.character(UCSC[,1]),strand=UCSC[,6])
## This object has just 1 space but the same data as UCSC_rangeD[868,]
feature = RangedData(IRanges(start = c(17066767), end = c(17267729), names = c("Site1")), space = c("chr1"), strand = c('+'))
## with UCSC_rangeD[868,], gives error in R2.13/ChIPpeakAnno_1.8.0
## Error in FUN(1L[[1L]], ...) : object 'r' not found
annotation = annotatePeakInBatch(myPeak, AnnotationData=UCSC_rangeD[868,], output="overlapping", maxgap=0, multiple=FALSE)
## with 1-space feature, no error
annotation = annotatePeakInBatch(myPeak, AnnotationData=feature, output="overlapping", maxgap=0, multiple=FALSE)
(Sorry, I no longer have the session info for this run. It was the
basic R 2.13 install plus biocLite(ChIPpeakAnno), and should have the
same versions as the session info shown for Problem 2 below, minus the
new dev version of ChIPpeakAnno, i.e. everything the same as below
except ChIPpeakAnno_2.0.2.tar.gz, gplots_2.8.0.tar.gz,
caTools_1.12.tar.gz, gdata_2.8.2.tar.gz, gtools_2.6.2.tar.gz.)
======== Problem 2
R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)
> library(ChIPpeakAnno)
Warning message:
replacing previous import 'space' when loading 'IRanges'
> UCSC = read.delim('Annots/UCSC_knownGene.hg19.bed',header=FALSE)
> UCSC_rangeD = RangedData(IRanges(start= UCSC[,2], end= UCSC[,3], names=UCSC[,4]), space=as.character(UCSC[,1]),strand=UCSC[,6])
> data = unique(read.table(file[i], sep="\t", header=FALSE))
> ids = sub("ID=(\\d+);.+", "ID\\1", data[,9], perl=TRUE)
> data_rangeD = RangedData(IRanges(start=data$V4, end=data$V5, names=paste(ids,data$V3, sep="_")), space=data$V1, strand="+")
> dim(data_rangeD)
[1] 19501 1
> annotationU = annotatePeakInBatch(data_rangeD, AnnotationData=UCSC_rangeD, output="overlapping", maxgap=0, multiple=FALSE)
> dim(annotationU)
[1] 16777 9
> annotationU = annotatePeakInBatch(data_rangeD, AnnotationData=UCSC_rangeD, output="overlapping", maxgap=0, multiple=TRUE)
> dim(annotationU)
[1] 16777 9
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ChIPpeakAnno_2.0.2 gplots_2.8.0
[3] caTools_1.12 bitops_1.0-4.1
[5] gdata_2.8.2 gtools_2.6.2
[7] limma_3.8.3 org.Hs.eg.db_2.5.0
[9] GO.db_2.5.0 RSQLite_0.9-4
[11] DBI_0.2-5 AnnotationDbi_1.14.1
[13] BSgenome.Ecoli.NCBI.20080805_1.3.17 BSgenome_1.20.0
[15] GenomicRanges_1.4.8 Biostrings_2.20.2
[17] IRanges_1.10.6 multtest_2.8.0
[19] Biobase_2.12.2 biomaRt_2.8.1
loaded via a namespace (and not attached):
[1] MASS_7.3-12 RCurl_1.6-9 splines_2.13.0 survival_2.36-5
[5] XML_3.4-2
>