[Bioc-sig-seq] mapping coordinates

Jason Lu jasonlu68 at gmail.com
Wed Mar 17 01:25:15 CET 2010


Hi Patrick,
Yes, I would like to do something like what Sean suggested. You are right
that in my SimpleRleList the elements are Ensembl transcript IDs and the
values are coverages.

My goal is to convert the transcript-relative ranges to genomic
coordinates (no need to map to UCSC genes). I have the exon_start and
exon_end positions from biomaRt. I would like to do something like
shift(), but I am not sure it will work, since I need to keep track of the
strand and of every exon start position. I have written a Python script
to do this, but I would also like to do it directly in R.
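
The mapping described above can be sketched in Python roughly as follows.
This is only a minimal illustration under stated assumptions, not the
script mentioned above: coordinates are 1-based and inclusive, exons do
not overlap, strand is given as 1 or -1, and the function name
transcript_range_to_genome is hypothetical. A transcript range that
crosses an exon boundary comes back as several genomic blocks.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive


def transcript_range_to_genome(
    t_start: int, t_end: int, exons: List[Interval], strand: int
) -> List[Interval]:
    """Map a transcript-relative range to genomic blocks (hypothetical helper).

    exons  : genomic (exon_start, exon_end) pairs, assumed non-overlapping
    strand : 1 for the plus strand, -1 for the minus strand (assumed encoding)
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    total = sum(end - start + 1 for start, end in exons)
    if not (1 <= t_start <= t_end <= total):
        raise ValueError("range falls outside the transcript")

    # Walk the exons in transcription order: ascending on the plus strand,
    # descending on the minus strand.
    ordered = sorted(exons, reverse=(strand == -1))

    blocks = []
    consumed = 0  # transcript bases covered by the exons already visited
    for g_start, g_end in ordered:
        length = g_end - g_start + 1
        # Part of the requested range that lies inside this exon,
        # still in transcript coordinates.
        lo = max(t_start, consumed + 1)
        hi = min(t_end, consumed + length)
        if lo <= hi:
            off_lo = lo - consumed - 1  # 0-based offsets into the exon
            off_hi = hi - consumed - 1
            if strand == 1:
                blocks.append((g_start + off_lo, g_start + off_hi))
            else:
                # On the minus strand, transcript position 1 is the exon's
                # genomic end, so offsets count down from g_end.
                blocks.append((g_end - off_hi, g_end - off_lo))
        consumed += length
    return sorted(blocks)


exons = [(100, 109), (200, 204), (300, 309)]
print(transcript_range_to_genome(8, 17, exons, 1))
# [(107, 109), (200, 204), (300, 301)]
print(transcript_range_to_genome(8, 17, exons, -1))
# [(108, 109), (200, 204), (300, 302)]
```

The same idea should carry over to R: a per-exon shift is enough on the
plus strand, while the minus strand also needs the exon order and the
offsets reversed.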

Thanks all for being helpful.

Jason



On Tue, Mar 16, 2010 at 6:25 PM, Patrick Aboyoun <paboyoun at fhcrc.org> wrote:
> I'm not so sure the shift operator will do the trick, since we don't have a
> mapping of Ensembl transcripts to UCSC known gene transcripts. Also, what is
> stored in the SimpleRleList object? Are the elements transcripts, and do the
> values represent coverage?
>
>
> Patrick
>
>
>
> On 3/16/10 1:53 PM, Michael Lawrence wrote:
>
> So to summarize, Jason needs to use GenomicFeatures to get the ensembl exons
> out, get the start positions, and use them to shift his ranges.
>
> On Tue, Mar 16, 2010 at 1:37 PM, Patrick Aboyoun <paboyoun at fhcrc.org> wrote:
>>
>> The GenomicFeatures package in BioC 2.6 (w/ R-devel) supports the
>> following set of transcript tables provided by UCSC:
>>
>>                                    track           subtrack
>> knownGene                     UCSC Genes               <NA>
>> knownGeneOld3             Old UCSC Genes               <NA>
>> wgEncodeGencodeManualRel2  Gencode Genes    Genecode Manual
>> wgEncodeGencodeAutoRel2    Gencode Genes      Genecode Auto
>> wgEncodeGencodePolyaRel2   Gencode Genes     Genecode PolyA
>> ccdsGene                   Consensus CDS               <NA>
>> refGene                     RefSeq Genes               <NA>
>> xenoRefGene                 Other RefSeq               <NA>
>> vegaGene                      Vega Genes Vega Protein Genes
>> vegaPseudoGene                Vega Genes   Vega Pseudogenes
>> ensGene                    Ensembl Genes               <NA>
>> acembly                    AceView Genes               <NA>
>> sibGene                        SIB Genes               <NA>
>> nscanPasaGene                     N-SCAN    N-SCAN PASA-EST
>> nscanGene                         N-SCAN             N-SCAN
>> sgpGene                        SGP Genes               <NA>
>> geneid                      Geneid Genes               <NA>
>> genscan                    Genscan Genes               <NA>
>> exoniphy                        Exoniphy               <NA>
>> augustusHints                   Augustus     Augustus Hints
>> augustusXRA                     Augustus   Augustus De Novo
>> augustusAbinitio                Augustus Augustus Ab Initio
>> acescan                          ACEScan               <NA>
>>
>>
>> Patrick
>>
>> On 3/16/10 1:18 PM, Michael Lawrence wrote:
>> > I think his problem is not access to UCSC but retrieving the exon
>> > locations. The transformation of the coordinate systems is a simple
>> > shift() call, once you have the exons. I don't know about Ensembl, but
>> > the GenomicFeatures package provides the UCSC predictions.
>> >
>> > I just realized yesterday that all of the gene predictions stored
>> > within UCSC have the same, documented table format. Presumably
>> > rtracklayer could convert these into something like what is provided
>> > by GenomicFeatures, but I think GenomicFeatures already has a function
>> > for doing that (building the DB). Could the utilities in
>> > GenomicFeatures be slightly generalized to support the alternative
>> > gene predictions in UCSC?
>> >
>> > Sorry for sort of hijacking the thread...
>> >
>> > Michael
>> >
>> > On Tue, Mar 16, 2010 at 11:20 AM, Patrick Aboyoun <paboyoun at fhcrc.org> wrote:
>> >
>> >     Jason,
>> >     The rtracklayer package provides an interface to the UCSC browser.
>> >     In particular, rtracklayer::export can export a SimpleRleList to a
>> >     BED file that you can upload to UCSC.
>> >
>> >     Patrick
>> >
>> >
>> >
>> >     On 3/16/10 6:50 AM, Jason Lu wrote:
>> >
>> >         Hi,
>> >
>> >         I would like to get suggestions on the best way to do this. I
>> >         have a "SimpleRleList" object in which the range coordinates
>> >         are relative to an Ensembl transcript (they run from 1 to the
>> >         total length of the transcript). I would like to show my data
>> >         on the UCSC browser, so I need to map those coordinates to the
>> >         genome. Currently I use my own script to do this, based on the
>> >         exon locations from biomaRt. There has got to be an easy way
>> >         to do this, but I can't seem to find a quick answer. Sorry if
>> >         this has been answered before.
>> >
>> >         Thanks.
>> >
>> >         Jason
>> >
>> >         _______________________________________________
>> >         Bioc-sig-sequencing mailing list
>> >         Bioc-sig-sequencing at r-project.org
>> >         <mailto:Bioc-sig-sequencing at r-project.org>
>> >         https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>> >
>> >
>> >
>> >
>>
>>
>
>
>


