[Bioc-sig-seq] GenomicFeatures, error in type conversion RangeData to GRanges
Martin Morgan
mtmorgan at fhcrc.org
Thu Apr 1 16:22:54 CEST 2010
On 04/01/2010 07:12 AM, Michael Lawrence wrote:
> On Thu, Apr 1, 2010 at 7:09 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
>> On 03/31/2010 07:11 PM, pterry at huskers.unl.edu wrote:
>>> Dear bioc-sig-sequencing,
>>>
>>> I would like to annotate ChIP-seq peaks for the Arabidopsis genome. In
>>> trying to work through the GenomicFeatures vignette dated 03/27/10, I need to
>>> convert my ChIP-seq peaks from a RangedData object to a GRanges object. In a
>>> recent, but earlier, Bioconductor development version, the conversion with
>>> this particular RangedData object worked fine.
>>>
>>> In this more recent Bioconductor development version, I get the following
>>> error message:
>>>
>>>> gr_ChSeqPks <- as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges")
>>> Error in validObject(.Object) :
>>> invalid class "GRanges" object: slot 'strand' contains missing values
>>>> rd0_chr1_s_8_trt_vs_INPctl
>>> RangedData with 57 rows and 2 value columns across 1 space
>>>           space             ranges |     ARAB8 ARAB7INPCTL
>>>     <character>          <IRanges> | <integer>   <integer>
>>> 1          chr1 [ 617092,  617094] |        24           0
>>> 2          chr1 [1808262, 1808262] |         8           0
>>> 3          chr1 [3889445, 3889452] |        64           0
>>> 4          chr1 [4404410, 4404410] |         8           0
>>> 5          chr1 [7081127, 7081127] |         8           0
>>> 6          chr1 [7128574, 7128581] |        64           0
>>> 7          chr1 [7128592, 7128649] |       464           0
>>> 8          chr1 [7530777, 7530781] |        40           0
>>> 9          chr1 [7530784, 7530786] |        24           0
>>> ... ... ... ... ... ...
>>
>> Hi,
>>
>>> rd = RangedData(IRanges(1, 10))
>>> as(rd, "GRanges")
>> Error in validObject(.Object) :
>> invalid class "GRanges" object: slot 'strand' contains missing values
>>> rd[["strand"]] = "*"
>>> as(rd, "GRanges")
>> GRanges with 1 range and 0 elementMetadata values
>>     seqnames    ranges strand |
>>        <Rle> <IRanges>  <Rle> |
>> [1]        1   [1, 10]      * |
>>
>> seqlengths
>>   1
>>  NA
>>
>> Martin
>>
>>
> Shouldn't the coerce function just do this automatically?
Currently GRanges thinks of strand as '+', '-', or '*', whereas IRanges
allows NA as well (hence the error), so coercing NA to '*' represents a
decision on the part of the investigator that '*' (strand irrelevant) is
synonymous with NA (no information about strand available). Part of the
motivation for the current state of affairs is that the use cases for
both NA and '*' were unclear, but course corrections are welcome.
Martin
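
The NA-versus-'*' decision described above can be made explicit before
coercion. The following is a minimal base-R sketch, not part of IRanges
or GenomicRanges: normalize_strand and its na_as argument are
hypothetical names introduced here for illustration. It replaces missing
strand values with a value the caller chooses and rejects anything other
than '+', '-', or '*'.

```r
# Hypothetical helper (not an IRanges/GenomicRanges function): makes the
# investigator's choice of how to treat a missing strand explicit.
normalize_strand <- function(strand, na_as = "*") {
  allowed <- c("+", "-", "*")
  # The replacement for NA must itself be a valid strand value.
  stopifnot(length(na_as) == 1L, na_as %in% allowed)
  strand <- as.character(strand)
  # Replace "no information available" with the caller's chosen value.
  strand[is.na(strand)] <- na_as
  # Anything outside '+', '-', '*' is an error rather than silently kept.
  bad <- setdiff(unique(strand), allowed)
  if (length(bad) > 0L) {
    stop("unexpected strand value(s): ", paste(bad, collapse = ", "))
  }
  factor(strand, levels = allowed)
}

s <- normalize_strand(c("+", NA, "-", NA))
print(as.character(s))
# [1] "+" "*" "-" "*"
```

For an object with no strand column at all, assigning "*" directly, as
in the quoted example, has the same effect; a helper like this only
matters when some strand values are known and others are missing.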
>
>>>
>>>> sessionInfo()
>>> R version 2.12.0 Under development (unstable) (2010-03-30 r51506)
>>> x86_64-unknown-linux-gnu
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] biomaRt_2.3.5 GenomicFeatures_0.5.0 GenomicRanges_0.1.0
>>> [4] IRanges_1.5.73
>>>
>>> loaded via a namespace (and not attached):
>>> [1] Biobase_2.7.5 Biostrings_2.15.26 BSgenome_1.15.20 DBI_0.2-5
>>> [5] RCurl_1.3-1 RSQLite_0.8-4 rtracklayer_1.7.11 tools_2.12.0
>>> [9] XML_2.8-1
>>>>
>>>
>>>
>>> Thanks,
>>> P. Terry
>>> pterry at huskers.unl.edu
>>>
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
>>
>