[Bioc-sig-seq] Comparing two chipseq position sets

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu May 7 17:26:28 CEST 2009


Hi,

On May 7, 2009, at 10:53 AM, Steve Goldstein wrote:
> A simple permutation test could be done by selecting random sets of  
> intervals "matching" the query intervals and counting the number of  
> overlaps with the reference intervals.  Each random set of intervals  
> could be picked so that the number and size of the intervals was the  
> same as the query.   A general implementation of the method would  
> need to know the length of each chromosome.

I was just about to suggest something similar, though I didn't think  
to consider chromosome length ... can you give some intuition as to  
why that's important for this question?

I guess you'd expect more "collisions" to happen at random on a
longer chromosome, but I think one generally wouldn't be interested in
the number of reads that "collide" between two experiments on a
particular chromosome so much as in how many collisions happen over
the entire extent of the genome. So is it helpful to think of the
genome as broken up into chromosome-pieces, or would it suffice to
think of it as one contiguous length of sequence for this purpose?
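
To make the question concrete, here is a minimal Python sketch of the
permutation test described in the quoted text. Everything in it is an
assumption made for illustration: the function names, the
(chrom, start, end) half-open interval tuples, the chrom_lengths
dictionary, and the choice to count query intervals that touch at
least one reference interval. The overlap count is a naive O(n*m) loop
that is only suitable for toy data.

```python
import random

def count_overlaps(query, reference):
    """Count query intervals overlapping at least one reference interval.
    Intervals are (chrom, start, end) tuples with half-open coordinates."""
    hits = 0
    for q_chrom, q_start, q_end in query:
        if any(q_chrom == r_chrom and q_start < r_end and r_start < q_end
               for r_chrom, r_start, r_end in reference):
            hits += 1
    return hits

def random_matching_set(query, chrom_lengths, rng):
    """Draw one random set with the same number and sizes as the query.
    Assumption: each interval is placed uniformly over all valid start
    positions in the genome, so it never runs off a chromosome end."""
    shuffled = []
    for _, start, end in query:
        length = end - start
        chroms = [c for c, n in chrom_lengths.items() if n >= length]
        # Longer chromosomes offer more valid starts, so they get more weight.
        weights = [chrom_lengths[c] - length + 1 for c in chroms]
        chrom = rng.choices(chroms, weights=weights, k=1)[0]
        new_start = rng.randrange(chrom_lengths[chrom] - length + 1)
        shuffled.append((chrom, new_start, new_start + length))
    return shuffled

def permutation_p_value(query, reference, chrom_lengths, n_perm=1000, seed=0):
    """Return (observed overlap count, one-sided empirical p-value)."""
    rng = random.Random(seed)
    observed = count_overlaps(query, reference)
    at_least = sum(
        count_overlaps(random_matching_set(query, chrom_lengths, rng),
                       reference) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)
```

In this sketch the chromosome lengths enter in two places: they bound
where a shuffled interval may start, and they weight the choice of
chromosome. Treating the genome as one contiguous sequence would
instead allow random placements that span a chromosome boundary.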

> Of course, if the null hypothesis for this permutation test (the
> sets of intervals are not related) is rejected, then you have to
> think about the next questions:  To what degree are the sets related?

I'm picturing the aligned reads as painting small lines on a large  
canvas (the entire canvas is the genome in this analogy).

Your first experiment paints red lines.
Your second experiment paints blue lines.
The question is how much of the canvas is purple.

So it "feels" like some sort of enrichment test to me. Could you
try to answer this question in a similar fashion to GO enrichment, via
some sort of hypergeometric test? That's not exactly correct ... just
brainstorming is all.
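
A rough Python sketch of the canvas picture, in the same brainstorming
spirit. It assumes the genome is cut into fixed-size bins, that a bin
counts as painted if any interval touches it, and that bins are
independent draws, which real reads are not. The function names, the
bin_size parameter, and the chrom_lengths dictionary are all
hypothetical. The tail probability is computed exactly with math.comb,
which is only practical for small toy inputs.

```python
from math import comb

def covered_bins(intervals, bin_size):
    """Set of (chrom, bin_index) touched by half-open (chrom, start, end)."""
    bins = set()
    for chrom, start, end in intervals:
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            bins.add((chrom, b))
    return bins

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n bins are drawn without replacement from N bins,
    K of which are already painted red."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(K, n) + 1))
    return favourable / total

def purple_enrichment(red, blue, chrom_lengths, bin_size=100):
    """Return (number of purple bins, hypergeometric upper-tail p-value)."""
    # Ceiling division: a partial bin at a chromosome end still counts.
    n_bins = sum(-(-length // bin_size) for length in chrom_lengths.values())
    red_bins = covered_bins(red, bin_size)
    blue_bins = covered_bins(blue, bin_size)
    purple = len(red_bins & blue_bins)
    p = hypergeom_upper_tail(purple, n_bins, len(red_bins), len(blue_bins))
    return purple, p
```

The result depends on the bin size, and neighbouring bins covered by
the same read are clearly not independent, so this is an illustration
of the idea rather than a calibrated test.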

> Where do they differ and where are they the same?

I'll stop with my speculation here ... :-)

-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

http://cbio.mskcc.org/~lianos


