[Bioc-sig-seq] GRangesList with duplicate names
Pages, Herve
hpages at fhcrc.org
Fri Feb 25 09:08:47 CET 2011
Hi Dario,
A GRangesList object with duplicated names is apparently
considered broken:
> grl <- GRangesList(GRanges(), GRanges())
> names(grl) <- c("a", "a")
> validObject(grl)
Error in `rownames<-`(`*tmp*`, value = c("a", "a")) :
duplicate rownames not allowed
If we are OK with requiring unique names, we should fix the "names<-"
method (and any other code around that lets the user generate
broken objects).
But if we are not OK with that requirement, we should relax
the validity method for GRangesList objects. I tend to prefer
this solution for three reasons:
1. Consistency with ordinary vectors: the names of a vector
in R are not required to be unique.
2. It's not uncommon to see the same name used for 2 different
genes. One might still want to be able to stick those names
on a GRangesList object where each top-level element corresponds
to a gene (e.g. exons grouped by gene).
3. It's easier to modify the validity method than to go around
trying to find and fix every piece of code in GenomicRanges
(and maybe other places) that can potentially produce a
GRangesList object with duplicated names.
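To illustrate point 1, here is a minimal base R sketch (no GenomicRanges
involved; the vector x and the variable names are made up for illustration):

```r
# Ordinary vectors accept duplicated names without complaint.
x <- c(a = 1, a = 2, b = 3)
has_dups <- anyDuplicated(names(x)) > 0   # TRUE: "a" appears twice

# Reordering keeps the duplicated names intact.
rev_names <- names(x[3:1])                # "b" "a" "a"

# Lookup by a duplicated name returns the first match only.
first_a <- x[["a"]]                       # 1
```

So duplicated names are legal on ordinary vectors, with the caveat that
subsetting by name only ever reaches the first match.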
How do our power users feel about this?
Thanks,
H.
----- Original Message -----
From: "Dario Strbenac" <D.Strbenac at garvan.org.au>
To: bioc-sig-sequencing at r-project.org
Sent: Thursday, February 24, 2011 10:00:11 PM
Subject: [Bioc-sig-seq] GRangesList with duplicate names
Hello,
It is possible to create a GRangesList with duplicated names, but not to re-order it.
> summary(grl)
     Length       Class        Mode 
          3 GRangesList          S4 
> names(grl) <- c("Cancer", "Cancer", "Normal")
> grl[3:1]
Error in `rownames<-`(`*tmp*`, value = c("Normal", "Cancer", "Cancer")) :
duplicate rownames not allowed
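A possible stopgap, sketched here with base R only, is to make the names
unique before assigning them. Whether reordering then succeeds with this
GenomicRanges version is an assumption and was not checked; nms and
unique_nms are illustrative names.

```r
# Hypothetical workaround sketch: de-duplicate the names before assigning them.
nms <- c("Cancer", "Cancer", "Normal")
unique_nms <- make.unique(nms)   # "Cancer" "Cancer.1" "Normal"

# Assumed follow-up, not run here (grl is the object from the session above):
# names(grl) <- unique_nms
# grl[3:1]
```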
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_AU.UTF-8
[7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.2.3 IRanges_1.8.9
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia