[Bioc-sig-seq] Comparing two chipseq position sets
Ivan Gregoretti
ivangreg at gmail.com
Thu May 7 16:02:52 CEST 2009
Hello Steve, Nicolas and Michael,
I agree with all of you: it is not a trivial question.
I asked the bioc-sig-seq list because I thought this must be an
everyday question for the genome analyst.
Say you ran your ChIP-seq experiment under condition A and then ran it
under condition B. Then you have to decide whether A and B made any
difference. It doesn't get any simpler than that!
I can't compare the two means or the two dispersions. I have to
compare pairs. The problem is that it is not trivial to unambiguously
determine which spot in B must be paired with each spot in A. To start
with, A and B may have different numbers of loci (e.g., 15000 versus
18000).
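
To make the pairing problem concrete, here is a minimal sketch in plain Python (not IRanges or genomeIntervals) that finds overlapping loci between the first ten chr1 entries of each list quoted below. The function name find_overlaps, the brute-force scan, and the treatment of coordinates as closed intervals are assumptions made only for illustration.

```python
from collections import Counter, defaultdict

# First ten chr1 loci from each list in the quoted message: (chrom, start, end).
# Assumption: coordinates are treated as closed intervals.
set_a = [
    ("chr1", 3660781, 3662707), ("chr1", 4481742, 4482656),
    ("chr1", 4482813, 4484003), ("chr1", 4561320, 4562262),
    ("chr1", 4774887, 4776304), ("chr1", 4797291, 4798822),
    ("chr1", 4847807, 4848846), ("chr1", 5008093, 5009386),
    ("chr1", 5009514, 5010046), ("chr1", 5010095, 5010583),
]
set_b = [
    ("chr1", 3659579, 3662079), ("chr1", 4773791, 4776291),
    ("chr1", 4797473, 4799973), ("chr1", 4847394, 4849894),
    ("chr1", 5007460, 5009960), ("chr1", 5072753, 5075253),
    ("chr1", 6204242, 6206742), ("chr1", 7078730, 7081230),
    ("chr1", 9282452, 9284952), ("chr1", 9683423, 9685923),
]


def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs whose intervals overlap.

    Hypothetical helper: a brute-force scan per chromosome, fine for a
    small example but not tuned for tens of thousands of loci.
    """
    by_chrom = defaultdict(list)
    for j, (chrom, start, end) in enumerate(subject):
        by_chrom[chrom].append((start, end, j))
    hits = []
    for i, (chrom, q_start, q_end) in enumerate(query):
        for s_start, s_end, j in by_chrom[chrom]:
            # Two closed intervals overlap when each starts before the other ends.
            if q_start <= s_end and s_start <= q_end:
                hits.append((i, j))
    return hits


hits = find_overlaps(set_a, set_b)

# Loci in B that overlap more than one locus in A: the pairing is ambiguous.
hits_per_b = Counter(j for _, j in hits)
ambiguous_b = sorted(j for j, n in hits_per_b.items() if n > 1)

# Loci in A with no overlapping partner in B.
unpaired_a = sorted(set(range(len(set_a))) - {i for i, _ in hits})

print("overlapping pairs:", hits)
print("B loci hit by several A loci:", ambiguous_b)
print("A loci without a partner:", unpaired_a)
```

On these twenty loci the sketch finds six overlapping pairs. The locus chr1 5007460 5009960 in the second list overlaps two loci in the first list (chr1 5008093 5009386 and chr1 5009514 5010046), and four loci in the first list have no partner at all, which is exactly why a one-to-one pairing is ambiguous.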
I'll take a look at genomeIntervals and IRanges.
By the way, Michael, would you let me know as soon as the new IRanges
documentation comes out? You guys were working on something, I
understand.
Thank you all,
Ivan
Ivan Gregoretti, PhD
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
5 Memorial Dr, Building 5, Room 205.
Bethesda, MD 20892. USA.
Phone: 1-301-496-1592
Fax: 1-301-496-9878
On Thu, May 7, 2009 at 9:24 AM, Michael Lawrence <mflawren at fhcrc.org> wrote:
>
>
> On Wed, May 6, 2009 at 12:40 PM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
>>
>> Hello Bioc-sig-seq,
>>
>> Say you run your ChIP-seq and find binding positions like this
>>
>> chr1 3660781 3662707
>> chr1 4481742 4482656
>> chr1 4482813 4484003
>> chr1 4561320 4562262
>> chr1 4774887 4776304
>> chr1 4797291 4798822
>> chr1 4847807 4848846
>> chr1 5008093 5009386
>> chr1 5009514 5010046
>> chr1 5010095 5010583
>> ...[many more loci and chromosomes]...
>>
>> Then you want to compare it to published data like this
>>
>> chr1 3659579 3662079
>> chr1 4773791 4776291
>> chr1 4797473 4799973
>> chr1 4847394 4849894
>> chr1 5007460 5009960
>> chr1 5072753 5075253
>> chr1 6204242 6206742
>> chr1 7078730 7081230
>> chr1 9282452 9284952
>> chr1 9683423 9685923
>> ...[many more loci and chromosomes]...
>>
>> What method would you use to test whether these two lists are
>> significantly different?
>
> This is a tough statistical question that probably needs to be a bit more
> specific, but as far as technical tools go, in addition to genomeIntervals
> there is the IRanges package and its efficient "overlap" function. IRanges
> is well integrated with the rest of the sequence analysis infrastructure in
> Bioconductor.
>
>>
>> Any pointer would be appreciated.
>>
>> Ivan
>>
>> Ivan Gregoretti, PhD
>> National Institute of Diabetes and Digestive and Kidney Diseases
>> National Institutes of Health
>> 5 Memorial Dr, Building 5, Room 205.
>> Bethesda, MD 20892. USA.
>> Phone: 1-301-496-1592
>> Fax: 1-301-496-9878
>>
>
>