[Bioc-sig-seq] The "ranges" slot in the sread slot of an AlignedRead class
Nicolas Delhomme
delhomme at embl.de
Fri May 15 14:43:33 CEST 2009
Hi Martin, Hi all,
The following code does what I want quite nicely:
mrna.ranges <- RangedData(
  IRanges(start=position(mrna.aln), width=width(mrna.aln)),
  space = chromosome(mrna.aln),
  universe = "dm3",
  # read indices into mrna.aln, grouped by chromosome level
  indices = unlist(sapply(levels(chromosome(mrna.aln)),
                          function(chr) which(chromosome(mrna.aln) == chr)),
                   use.names = FALSE)
)
Out of my AlignedRead (mrna.aln), I create a RangedData which contains
the ranges split by chromosome and sorted into a RangesList. The
additional parameter, indices (the name is arbitrary), contains the
position of the corresponding read in the original mrna.aln object and
is stored in a SplitXDataFrame.
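
To illustrate what the indices argument computes, here is a minimal base-R sketch that needs no Bioconductor packages. The vectors chr and pos are hypothetical toy stand-ins for chromosome(mrna.aln) and position(mrna.aln), and split() is used only as an equivalent formulation of the sapply()/which() idiom, not as the method used above.

```r
# Toy stand-ins (assumed data, not from a real AlignedRead object)
chr <- factor(c("2R", "2L", "2R", "X", "2L"))       # like chromosome(mrna.aln)
pos <- c(6419544L, 120L, 18694365L, 55L, 900L)      # like position(mrna.aln)

# The idiom from the RangedData call: for each chromosome level,
# collect the indices of the reads that map to it, then flatten.
idx_sapply <- unlist(sapply(levels(chr), function(l) which(chr == l)),
                     use.names = FALSE)

# An equivalent formulation: split the read indices by chromosome.
idx_split <- unlist(split(seq_along(chr), chr), use.names = FALSE)

# Both give the read indices grouped by chromosome level
stopifnot(identical(idx_sapply, idx_split))

# Using the indices to look up the original reads, chromosome by chromosome
grouped <- data.frame(chromosome = chr[idx_split],
                      start      = pos[idx_split],
                      index      = idx_split)
print(grouped)
```

Each row of grouped keeps, in its index column, the position of the read in the original vectors, which is the role the indices column plays in the RangedData above.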
Best,
---------------------------------------------------------------
Nicolas Delhomme
High Throughput Functional Genomics Center
European Molecular Biology Laboratory
Tel: +49 6221 387 8426
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On 13 May 2009, at 16:53, Martin Morgan wrote:
> Nicolas Delhomme <delhomme at embl.de> writes:
>
>> Hi Martin,
>>
>> That's what I thought; I was just curious to learn more. Thanks for
>> the details!
>>
>> I should have thought of it: as I put it after the session info, my
>> second question was most probably invisible :-)
>>
>> I paste it here again:
>>
>>>
>>> And is there an easy way to create a RangesList from an AlignedRead
>>> object? I figured out how to do it, but I just want to be sure
>>> that I
>>> didn't miss it. If it doesn't exist, I think it would be a valuable
>>> addition and I could contribute the few lines of code.
>
> Sorry for missing that; I don't think there is anything built-in. We
> could discuss introducing something based on your code off-list, if
> you like.
>
> Martin
>
>> Best wishes,
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> High Throughput Functional Genomics Center
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8426
>> Email: nicolas.delhomme at embl.de
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>> ---------------------------------------------------------------
>>
>>
>>
>> On 13 May 2009, at 04:30, Martin Morgan wrote:
>>
>>> Hi Nicolas --
>>>
>>> Nicolas Delhomme <delhomme at embl.de> writes:
>>>
>>>> Hi all,
>>>>
>>>> Well the question is quite easy :-) What does this slot hold?
>>>> Because it looks very different from the actual positions, i.e.
>>>>
>>>> these are the 10 first ranges
>>>>
>>>>> sread(aln.clean[chromosome(aln.clean)=="2R"])@ranges[1:10]
>>>
>>> It's internal to the way reads themselves are stored.
>>> sread(aln.clean) returns a DNAStringSet object; the ranges slot of a
>>> DNAStringSet points to offsets into a larger DNAString. As you show
>>> later, you want to use position(aln.clean) for alignment information.
>>>
>>> This representation is meant to be entirely internal to the class.
>>> The intention is that the user manipulate objects with defined
>>> functions and methods (like position()). Of course the user can get
>>> at the contents of slots with @, but there are no guarantees about
>>> what will be there if the user does this!
>>>
>>> Martin
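
The storage scheme described above can be sketched in base R. This is only an illustration of the idea (several reads kept in one pooled string and addressed by start/width offsets), not the actual DNAStringSet implementation; the names reads, pool, start, end and get_read are hypothetical.

```r
# Hypothetical toy reads (assumed data)
reads <- c("ACGT", "GGCCA", "TTA")

# Store all reads back to back in one larger string
pool  <- paste(reads, collapse = "")

# Offsets of each read inside the pool, analogous to an IRanges start/end/width
width <- nchar(reads)
start <- cumsum(c(1L, head(width, -1L)))
end   <- start + width - 1L
ranges <- data.frame(start, end, width)
print(ranges)

# A read is recovered by extracting its range from the pool
get_read  <- function(i) substr(pool, start[i], end[i])
recovered <- vapply(seq_along(reads), get_read, character(1))
stopifnot(identical(recovered, reads))
```

Seen this way, start and end values such as 4141 to 4176 are positions in the pooled sequence storage, not genome coordinates, which is why they bear no relation to position(aln.clean).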
>>>
>>>
>>>> IRanges object:
>>>> start end width
>>>> [1] 4141 4176 36
>>>> [2] 4177 4212 36
>>>> [3] 4357 4392 36
>>>> [4] 4465 4500 36
>>>> [5] 5113 5148 36
>>>> [6] 5365 5400 36
>>>> [7] 5401 5436 36
>>>> [8] 6049 6084 36
>>>> [9] 6301 6336 36
>>>> [10] 6373 6408 36
>>>>
>>>> and these are the 10 first positions
>>>>
>>>>> position(aln.clean[chromosome(aln.clean)=="2R"])[1:10]
>>>> [1] 6419544 18694365 10064416 17228214 5850736 11976428 15335440 3370962
>>>> [9] 15327509 3366816
>>>>
>>>>> sessionInfo()
>>>> R version 2.9.0 (2009-04-17)
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;
>>>> LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;
>>>> LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;
>>>> LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] ShortRead_1.2.0 lattice_0.17-22 BSgenome_1.12.0 Biostrings_2.12.1
>>>> [5] IRanges_1.2.1 rtracklayer_1.4.0 RCurl_0.94-1
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] Biobase_2.4.1 grid_2.9.0 hwriter_1.1 tools_2.9.0 XML_2.3-0
>>>>
>>>> And is there an easy way to create a RangesList from an AlignedRead
>>>> object? I figured out how to do it, but I just want to be sure
>>>> that I
>>>> didn't miss it. If it doesn't exist, I think it would be a valuable
>>>> addition and I could contribute the few lines of code.
>>>>
>>>> Best,
>>>>
>>>> ---------------------------------------------------------------
>>>> Nicolas Delhomme
>>>>
>>>> High Throughput Functional Genomics Center
>>>>
>>>> European Molecular Biology Laboratory
>>>>
>>>> Tel: +49 6221 387 8426
>>>> Email: nicolas.delhomme at embl.de
>>>> Meyerhofstrasse 1 - Postfach 10.2209
>>>> 69102 Heidelberg, Germany
>>>>
>>>> _______________________________________________
>>>> Bioc-sig-sequencing mailing list
>>>> Bioc-sig-sequencing at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>> --
>>> Martin Morgan
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N.
>>> PO Box 19024 Seattle, WA 98109
>>>
>>> Location: Arnold Building M1 B861
>>> Phone: (206) 667-2793
>>
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793