[Bioc-sig-seq] makeTranscriptDbFromUCSC question

Hervé Pagès hpages at fhcrc.org
Thu Mar 4 07:41:56 CET 2010


Hi Marc,

This seems like a good use case to add to the vignette. Maybe we could
make it a formal task during the next stand-up.

H.


Marc Carlson wrote:
> Hi Joseph,
> 
> Here is a more generic approach you could take
> 
> ## First, some setup for a toy example
> library(GenomicFeatures)
> txdb <- loadFeatures(system.file("extdata", "UCSC_knownGene_sample.sqlite",
>                                  package="GenomicFeatures"))
> gr <- GRanges(seqnames = rep("chr1", 2),
>               ranges = IRanges(start = c(500, 10500),
>                                end = c(10000, 30000)),
>               strand = strand(rep("-", 2)))
> 
> 
> 
> ## You could get a fresh annotation object (ann) with transcripts() like this
> ann = transcripts(txdb)
> 
> ## Then you could pad the start of each range (start is the upstream side
> ## only for "+" strand transcripts) like so:
> start(ann) = start(ann) - 10000
> 
> ## and you can pad the end by whatever amount you need like this:
> end(ann) = end(ann) + 10000
> 
> ## Then you can use this modified annotation with findOverlaps
> ol = findOverlaps(ann, gr)
> 
> ## Then you could get the indices of the ranges in gr (the subject) that were hit
> hits <- subjectHits(ol)
> 
> ## and use them to see which ranges in gr are annotated.
> gr[hits]
> 
> ## Or use the queryHits to see what annotations go with your ranges
> ann[unique(queryHits(ol))]
> 
> 
> Does that help?
> 
> 
> 
>   Marc
> 
> 
> 
> 
> On 03/03/2010 02:28 PM, joseph wrote:
>> Hello
>> I am using transcriptsByRanges(txdb, gr) to annotate a set of reads with transcript and gene IDs from txdb obtained with makeTranscriptDbFromUCSC(genome = "mm9", tablename = "knownGene").
>> My question is how to make changes to the txdb to include 10 kb upstream of the transcription start site and 3 kb downstream of the end of the transcript. 
>> Thanks
>> Joseph Dhahbi
>>
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
> 

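Since the original question asked for 10 kb upstream of the transcription
start site and 3 kb downstream of the transcript end, the padding has to
depend on strand: on "-" strand transcripts the TSS is at the end coordinate.
Below is a minimal sketch of that arithmetic in base R on plain vectors. The
helper name pad_transcripts and the toy coordinates are made up for
illustration and are not part of GenomicFeatures.

```r
# Hypothetical helper (not a GenomicFeatures function): strand-aware padding
# of transcript coordinates held in plain vectors.
pad_transcripts <- function(start, end, strand,
                            upstream = 10000L, downstream = 3000L) {
  stopifnot(length(start) == length(end),
            length(start) == length(strand),
            all(strand %in% c("+", "-")))
  plus <- strand == "+"
  # On "+" the TSS is at 'start'; on "-" the TSS is at 'end'.
  new_start <- ifelse(plus, start - upstream, start - downstream)
  new_end   <- ifelse(plus, end + downstream, end + upstream)
  # Clamp at 1 so a padded range never begins before the sequence does.
  data.frame(start = pmax(new_start, 1L), end = new_end,
             strand = strand, stringsAsFactors = FALSE)
}

# Toy input (invented values): one "+" and one "-" transcript.
tx <- data.frame(start = c(20000L, 50000L), end = c(25000L, 58000L),
                 strand = c("+", "-"), stringsAsFactors = FALSE)
padded <- pad_transcripts(tx$start, tx$end, tx$strand)
print(padded)
```

The padded values could then be assigned back with start(ann) and end(ann) as
in the quoted example. This sketch does not clamp at the chromosome end, so
ranges near the end of a sequence may need extra handling.
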
-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319