[Bioc-sig-seq] ChIPpeakAnno fails to interpret standard chromosome name and strandedness
Julie Zhu
julie.zhu at umassmed.edu
Tue Feb 9 18:22:03 CET 2010
Hi Ivan,
Thank you very much for your valuable suggestion and for the examples.
Both examples should work now. The strand can be given as "+"/"-" or as
1/-1, and the chromosome as "chr1" or "1".
You can get the fixed code from svn (ChIPpeakAnno_1.2.3) or wait for it to
be posted.
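
For anyone stuck on an older version in the meantime, the two naming conventions can be reconciled by hand before annotating. The following is only a rough sketch in base R and is not the code inside ChIPpeakAnno; `normalizeChrom` and `normalizeStrand` are hypothetical helper names made up for this illustration.

```r
# Hypothetical helpers (NOT part of ChIPpeakAnno): map both accepted
# spellings of chromosome and strand onto a single convention.

# "1" -> "chr1"; names that already start with "chr" are left unchanged.
normalizeChrom <- function(x) {
  x <- as.character(x)
  ifelse(grepl("^chr", x), x, paste("chr", x, sep = ""))
}

# 1 / "1" / "+" -> "+";  -1 / "-1" / "-" -> "-";  anything else -> NA.
normalizeStrand <- function(x) {
  x <- as.character(x)
  out <- rep(NA_character_, length(x))
  out[x %in% c("1", "+")] <- "+"
  out[x %in% c("-1", "-")] <- "-"
  out
}

normalizeChrom(c("1", "chr2", "X"))
normalizeStrand(c(1, -1))
normalizeStrand(c("+", "-"))
```

Applying the same helpers to the space and strand vectors of both the peak set and the annotation set before building the RangedData objects should leave the two sets using identical labels.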
Best regards,
Julie
*******************************************
Lihua Julie Zhu, Ph.D
Research Associate Professor
Program Gene Function and Expression
University of Massachusetts Medical School
364 Plantation Street, Room 613
Worcester, MA 01605
508-856-5256
http://www.umassmed.edu/pgfe/faculty/zhu.cfm
On 2/9/10 1:12 AM, "Ivan Gregoretti" <ivangreg at gmail.com> wrote:
> Hello everybody
>
> The package ChIPpeakAnno comes with a couple of example RangedData sets.
>
> With those toy sets you can familiarise yourself with the package. It works.
>
> Now, if you redefine the sets so that space '1' becomes 'chr1' or
> strand '1' becomes '+', the functions do not work.
>
> Notice that 'chr1' and strand '+' are standard nomenclature; '1' and '1' are
> not.
>
> I tried to fix it myself but failed.
>
> Can anybody help?
>
> Thanks,
>
> Ivan
>
>
> ## These are the examples that work ##
> myPeak1 = RangedData(
>     IRanges(start = c(967654, 2010897, 2496704, 3075869, 3123260,
>                       3857501, 201089),
>             end   = c(967754, 2010997, 2496804, 3075969, 3123360,
>                       3857601, 201089),
>             names = c("Site1", "Site2", "Site3", "Site4", "Site5",
>                       "Site6", "site7")),
>     space = c("1", "2", "3", "4", "5", "6", "2"))
>
> TFbindingSites = RangedData(
>     IRanges(start = c(967659, 2010898, 2496700, 3075866, 3123260,
>                       3857500, 96765, 201089, 249670, 307586,
>                       312326, 385750),
>             end   = c(967869, 2011108, 2496920, 3076166, 3123470,
>                       3857780, 96985, 201299, 249890, 307796,
>                       312586, 385960),
>             names = c("t1", "t2", "t3", "t4", "t5", "t6",
>                       "t7", "t8", "t9", "t10", "t11", "t12")),
>     space = c("1", "2", "3", "4", "5", "6",
>               "1", "2", "3", "4", "5", "6"),
>     strand = c(1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1))
> annotatedPeak2 = annotatePeakInBatch(myPeak1, AnnotationData = TFbindingSites)
>
>
> ## These are the examples that are properly named and do not work ##
> myPeak1 = RangedData(
>     IRanges(start = c(967654, 2010897, 2496704, 3075869, 3123260,
>                       3857501, 201089),
>             end   = c(967754, 2010997, 2496804, 3075969, 3123360,
>                       3857601, 201089),
>             names = c("Site1", "Site2", "Site3", "Site4", "Site5",
>                       "Site6", "site7")),
>     space = c("chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr2"))
>
> TFbindingSites = RangedData(
>     IRanges(start = c(967659, 2010898, 2496700, 3075866, 3123260,
>                       3857500, 96765, 201089, 249670, 307586,
>                       312326, 385750),
>             end   = c(967869, 2011108, 2496920, 3076166, 3123470,
>                       3857780, 96985, 201299, 249890, 307796,
>                       312586, 385960),
>             names = c("t1", "t2", "t3", "t4", "t5", "t6",
>                       "t7", "t8", "t9", "t10", "t11", "t12")),
>     space = c("chr1", "chr2", "chr3", "chr4", "chr5", "chr6",
>               "chr1", "chr2", "chr3", "chr4", "chr5", "chr6"),
>     strand = c("+", "+", "+", "+", "+", "+",
>                "-", "-", "-", "-", "-", "-"))
> annotatedPeak2 = annotatePeakInBatch(myPeak1, AnnotationData = TFbindingSites)
> Error in fix.by(by.x, x) : 'by' must specify valid column(s)
>
>
>> sessionInfo()
> R version 2.10.0 (2009-10-26)
> x86_64-redhat-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=C
> [4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=en_US
> [7] LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ChIPpeakAnno_1.2.2 org.Hs.eg.db_2.3.6
> [3] GO.db_2.3.5 RSQLite_0.8-2
> [5] DBI_0.2-5 AnnotationDbi_1.8.1
> [7] BSgenome.Ecoli.NCBI.20080805_1.3.16 BSgenome_1.14.2
> [9] Biostrings_2.14.12 IRanges_1.4.10
> [11] multtest_2.2.0 Biobase_2.6.1
> [13] biomaRt_2.2.0
>
> loaded via a namespace (and not attached):
> [1] MASS_7.3-3 RCurl_1.3-1 XML_2.6-0 splines_2.10.0
> [5] survival_2.35-8
>
>
> Ivan Gregoretti, PhD
> National Institute of Diabetes and Digestive and Kidney Diseases
> National Institutes of Health
> 5 Memorial Dr, Building 5, Room 205.
> Bethesda, MD 20892. USA.
> Phone: 1-301-496-1592
> Fax: 1-301-496-9878
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>