[Bioc-sig-seq] find overlaps compatible with a transcript
Elizabeth Purdom
epurdom at stat.berkeley.edu
Mon Sep 13 20:18:23 CEST 2010
Hello,
I am using a TranscriptDb and trying to find overlaps with transcripts.
For example, I have a gapped alignment and I want to see what
transcripts it is compatible with. If txdb is my TranscriptDb, and gr is
my gapped alignment as a GenomicRanges object, I can use findOverlaps to
see whether my read overlaps in any way with the individual exons of
the transcript, but not whether it is compatible with the implied transcript.
For example, if my gapped read overlaps exons 1, 2, and 3 of the transcript, it
can only be compatible if it overlaps in a particular way (it must
contain the end of exon 1, the beginning of exon 3, and all of exon 2).
Is there a way to check this? This is probably answered somewhere, but I
can't seem to find it.
Thanks,
Elizabeth
An example:
> library(GenomicFeatures)
> txdb <- loadFeatures(system.file("extdata",
+     "UCSC_knownGene_sample.sqlite", package="GenomicFeatures"))
> exByTx <- exonsBy(txdb, "tx")
# this is compatible
> grOk <- GRanges(seqnames = c("chr1", "chr1", "chr1"),
+     ranges = IRanges(c(2000, 2476, 3084), c(2090, 2584, 3089)),
+     strand = rep("*", 3))
# this is not
> grNotOk <- GRanges(seqnames = c("chr1", "chr1", "chr1"),
+     ranges = IRanges(c(2000, 2500, 3084), c(2090, 2584, 3089)),
+     strand = rep("*", 3))
# both overlap the same set of transcripts, but the second is not
# compatible with either transcript
> findOverlaps(GRangesList(grOk,grNotOk),exByTx)
An object of class "RangesMatching"
Slot "matchMatrix":
     query subject
[1,]     1       1
[2,]     1       2
[3,]     2       1
[4,]     2       2
Slot "DIM":
[1] 2 135
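
To make the rule concrete, here is a rough base-R sketch of the check I have in mind, written by hand rather than with any existing Bioconductor function. The helper name isCompatible and the exon coordinates are hypothetical: the exons are made up for illustration and are not read from the knownGene sample database. It assumes the read blocks and the exons are each given as a two-column (start, end) matrix sorted by start, on the same chromosome, and that the exons do not overlap one another.

```r
# Hypothetical helper (not part of any package): TRUE if the blocks of a
# gapped read are compatible with the exons of one transcript.
# Both arguments are two-column matrices (start, end), sorted by start.
isCompatible <- function(blocks, exons) {
  n <- nrow(blocks)
  # Find the exon that contains the start of the first block.
  first <- which(exons[, 1] <= blocks[1, 1] & blocks[1, 1] <= exons[, 2])
  if (length(first) != 1) return(FALSE)
  # The blocks must map onto consecutive exons, one exon per block.
  idx <- first + seq_len(n) - 1
  if (idx[n] > nrow(exons)) return(FALSE)
  ex <- exons[idx, , drop = FALSE]
  # Every block must lie inside its exon.
  inside <- all(blocks[, 1] >= ex[, 1] & blocks[, 2] <= ex[, 2])
  # Every block except the last must stop exactly at the end of its exon.
  endsOk <- all(blocks[-n, 2] == ex[-n, 2])
  # Every block except the first must begin exactly at the start of its exon.
  startsOk <- all(blocks[-1, 1] == ex[-1, 1])
  inside && endsOk && startsOk
}

# Illustrative exon coordinates (an assumption, not taken from txdb).
exons <- cbind(start = c(1116, 2476, 3084), end = c(2090, 2584, 4121))

# Same block coordinates as grOk and grNotOk in the example.
okBlocks    <- cbind(start = c(2000, 2476, 3084), end = c(2090, 2584, 3089))
notOkBlocks <- cbind(start = c(2000, 2500, 3084), end = c(2090, 2584, 3089))

isCompatible(okBlocks, exons)
isCompatible(notOkBlocks, exons)
```

With these made-up exons the first call gives TRUE and the second FALSE, because the middle block of the second read starts at 2500 rather than at the exon start of 2476. What I am looking for is a way to do the same thing directly on the GRangesList returned by exonsBy.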