[Bioc-sig-seq] GRanges, failure assigning chromosome lengths
Chris Seidel
seidel at phaget4.org
Sat Sep 4 00:07:29 CEST 2010
Did anything ever get resolved in terms of assigning chromosome lengths
to a GRanges object when it contains alignments that run off the
chromosome ends? The message below was the last of the original thread
that I could find.
I'm currently reading Solexa export files into a GRanges object, and
setting the chromosome lengths sometimes fails when the object has a
few reads that extend past the chromosome boundary. The only solution
I see is to toss out the offending reads, which means writing a
function that loops through all reads and checks them against the
chromosome length. Since Ivan brought this problem up back in April, I
was wondering whether a solution was ever reached, or whether anyone
knows of an efficient way to address the problem.
-Chris
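
One way to avoid looping over reads is a vectorized bounds check. The
following is only a minimal sketch under stated assumptions, not an
answer taken from the thread: in_bounds() is a hypothetical helper
(not part of GenomicRanges), chr_lens is an assumed named vector of
chromosome lengths, and the reads are made-up data. The GRanges part
runs only if the GenomicRanges package is installed.

```r
# Hypothetical helper (not part of GenomicRanges): TRUE for reads that
# lie entirely within their chromosome, FALSE otherwise.
in_bounds <- function(chrom, start, end, chr_lens) {
  # Look up each read's chromosome length by name (NA if unknown).
  limit <- chr_lens[as.character(chrom)]
  unname(!is.na(limit) & start >= 1 & end <= limit)
}

# Made-up example data: two chromosomes and four reads.
chr_lens   <- c(chr1 = 1000L, chr2 = 500L)
chrom      <- c("chr1", "chr1", "chr2", "chr2")
read_start <- c(10L, 990L, 1L, 480L)
read_end   <- c(45L, 1025L, 36L, 515L)

# Reads 2 and 4 run past the end of their chromosome.
keep <- in_bounds(chrom, read_start, read_end, chr_lens)

# The same check applied to a GRanges object, if the package is available.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  suppressPackageStartupMessages(library(GenomicRanges))
  gr <- GRanges(seqnames = chrom,
                ranges = IRanges(start = read_start, end = read_end))
  ok <- in_bounds(as.character(seqnames(gr)), start(gr), end(gr), chr_lens)
  gr <- gr[ok]                                       # drop offending reads
  seqlengths(gr) <- chr_lens[names(seqlengths(gr))]  # now within bounds
}
```

The comparison runs on whole vectors at once, so no explicit loop over
reads is needed. If discarding reads is undesirable, clipping the ends
with pmin(read_end, limit) would be an alternative under the same
assumptions.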
> -----Original Message-----
> From: bioc-sig-sequencing-bounces at r-project.org
> [mailto:bioc-sig-sequencing-bounces at r-project.org] On Behalf
> Of Patrick Aboyoun
> Sent: Tuesday, April 27, 2010 12:39 PM
> To: Sean Davis
> Cc: bioc-sig-sequencing at r-project.org
> Subject: Re: [Bioc-sig-seq] GRanges, failure assigning
> chromosome lengths
>
>
> Sean and Ivan,
> Thanks for the insight. I'll look at devising a compromise within the
> existing framework. I need to explore the various methods for GRanges
> objects to better understand the impact of a compromise. We started
> with the simplest interpretation of limit bounds because it simplifies
> the code. For example, we need to establish the rules for coverage or
> findOverlaps when the DNA is circular or the alignment runs off the
> end of a linear chromosome.
>
>
> Patrick
>
>
> On 4/27/10 8:05 AM, Sean Davis wrote:
> > On Tue, Apr 27, 2010 at 10:51 AM, Ivan Gregoretti
> > <ivangreg at gmail.com> wrote:
> >
> >> Good morning Sean and everybody,
> >>
> >>
> >>> Actually, the edge case is general as alignments, even on linear
> >>> chromosomes, may extend beyond the end of the chromosome, I
> >>> believe. In the best case, these alignments are clipped (in CIGAR
> >>> terms), but I don't know that all aligners are doing that
> >>> appropriately.
> >>>
> >>> Sean
> >>>
> >> So, you would rather go for an overriding switch than an
> >> infrastructure overhaul?
> >>
> >> I ask this because GRanges is an exceptionally convenient format
> >> for ChIP-seqers and Patrick is trying to make a decision to make it
> >> work for real world data.
> >>
> > I guess that I mean to say that the two issues of aligning off the
> > end of the chromosome and handling circular genomes are related but
> > separate issues. An override seems quite reasonable for dealing with
> > the former. Until aligners or common formats (BAM/SAM) deal with the
> > latter, it will be difficult to deal appropriately with circular
> > genomes, so an override is probably a fine compromise.
> >
> > Sean
> >
> >
> >
> >> And yes indeed: aligners do align a little bit past the boundaries
> >> even for linear chromosomes. Thanks for pointing that out!
> >>
> >> Ivan
> >>
> >>
>