[Bioc-sig-seq] WIG from coverage

Wed Sep 30 20:20:30 CEST 2009

Hi Michael,

Thanks for the feedback. It's helpful.

I think for ChIP Seq analysis a common activity will be going from
mapped reads to a trackfile, with some exploration of normalization,
peaks, smoothing, etc, in between. So I'm trying to figure out the
simplest way to do this, and traverse all the objects correctly.

-----Original Message-----
From: lawremi at gmail.com [mailto:lawremi at gmail.com] On Behalf Of Michael
Lawrence
Sent: Wednesday, September 30, 2009 1:22 AM
To: Seidel, Christopher
Cc: bioc-sig-sequencing at r-project.org
Subject: Re: [Bioc-sig-seq] WIG from coverage

On Tue, Sep 29, 2009 at 7:23 PM, Michael Lawrence <mflawren at fhcrc.org>
wrote:

>> ip1.cov <- coverage(ip1, width=230208)
>> export(ip1.cov, "coverage.wig")
>> Am I doing something wrong?

> Note that in general your code will not work, because you do not tell
rtracklayer the name of the chromosome. This makes me worry that
generating a default chromosome name will often hide user errors.

I would have specified the chromosome if I knew it was necessary and
could have figured out how to do it. The current worflow goes from:

export.txt file -> readAligned Object (one lane) -> GenomeData Object
(one lane) -> GenomeDataList (2 lanes) -> IRanges (from extendReads) ->
Rle (from coverage). 

To go from coverage to a track it would seem I have to make a RangedData
object from the coverage result, rather than use the result directly, is
that right? When I use the (modified) workflow below, now I get the
chromosome names correct:

# read in some IP data
aln <- readAligned(spath, pattern="s_1_export.txt", type="SolexaExport",
filter=filt)
# convert to GenomeData object
s1 <- as(aln,"GenomeData")

# read in some WCE data
aln <- readAligned(spath, pattern="s_3_export.txt", type="SolexaExport",
filter=filt)
# convert to GenomeData object
s3 <- as(aln,"GenomeData")

# create GenomeDataList from ip and wce
ypd <- GenomeDataList(list(ip=s1, wce=s3))

# extend reads
ip1 <- extendReads(ypd$ip$chrI, seqLen=150)
wce1 <- extendReads(ypd$wce$chrI, seqLen=150)

# find coverage for chromosome 1
ip1.cov <- coverage(ip1, width=230208)
wce1.cov <- coverage(wce1, width=230208)

# make a track for just chrI of IP
range1 <- IRanges(start=start(ip1.cov), end=end(ip1.cov))
score <- runValue(ip1.cov)
chrom <- rep("chrI", length(range1))
# put all these things together in a GenomicData object for rtracklayer
export
x <- GenomicData(range1, score, chrom=chrom, genome="sacCer2")
export(x, "ip1_coverage.wig")

So now this works to give chromosome names:

track name="R Track" type=wiggle_0
chrI    0       2       169
chrI    2       4       176
chrI    4       5       178
chrI    5       7       179

But it feels awkward to extract the ranges and values from the coverage
result, just to put them into a new kind of object. I know I'm doing
something wrong...any help?

Also, according to UCSC this is a bedGraph file, not a wig file. So why
is the result of the wig export a bedGraph file? I don't get it.

Thanks for the help.

-Chris