[Bioc-sig-seq] Random Intervals in GenomicRanges
Anna Terry
anna.terry at csc.mrc.ac.uk
Mon Dec 6 12:03:14 CET 2010
Hi Michael,
Thanks for this - I used runif to write the following function, which
will return the same number of fixed width regions for each chromosome
as a given genomic ranges object and make sure that they are non
overlapping.
Cheers
Anna
randomGRangesFixedWidth <- function(granges, width) {
random.gr <- GRanges()
for(i in 1:length(seqnames(granges)@values)) {
curr.ir <- IRanges()
while(length(curr.ir) < seqnames(granges)@lengths[i]) {
nToAdd <- seqnames(granges)@lengths[i] - length(curr.ir)
temp.ir <- IRanges(start=as.integer(runif(nToAdd, min=1,
max=seqlengths(granges)[seqnames(granges)@values[i]]-(width -1))),
width=width)
curr.ir <- c(curr.ir, temp.ir)
curr.ir <- asNormalIRanges(curr.ir)
curr.ir <- curr.ir[width(curr.ir) ==width]
curr.ir <- IRanges(curr.ir)
}
random.gr <- c(random.gr,
GRanges(seqnames=rep(seqnames(granges)@values[i],
seqnames(granges)@lengths[i]), ranges=asNormalIRanges(curr.ir)))
}
seqlengths(random.gr) <- seqlengths(granges)
return(random.gr)
}
On 02/12/10 15:23, Michael Dondrup wrote:
> Hi,
>
> I don't know about a specific function (maybe there is) so I use the following code to simulate ranges.
> I simplified it a bit so it might not directly run but shows the concept.
>
> simulate.GRanges<- function (n, mean.length, genome.size ){
> # assuming read lengths are poisson distributed around the mean length, you could use a different random function
> # assuming reads are uniformly distributed wrt. start position
> ir = IRanges(start=as.integer(runif(n, min=1, max=genome.size)), width=rpois(n, mean.length))
> ir = restrict(ir, 1, genome.size) # cut off reads at the end, not perfect
> strand = Rle(sample(x=c("+","-", "*"), size=n, replace=TRUE)) # make some strand information
> gr = GRanges(ir , seqnames="chr1",
> seqlengths=genome.size,
> strand=strand, universe="Mygenome")
> return(gr)
> }
>
> You just need to get an estimate of read-length distribution from somewhere then.
>
> Hope this is useful
>
> Michael
>
> On Dec 2, 2010, at 4:06 PM, Anna Terry wrote:
>
>
>> Hi,
>>
>> Is there any function in GenomicRanges (or other package) for getting a set of random ranges based on size distribution of a sample set?
>> Similar functions are available in bedTools( http://code.google.com/p/bedtools/wiki/Usage#shuffleBed) and Random Intervals from the ENCODE tools on Galaxy.
>>
>> Many Thanks
>> Anna
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
--
Anna Terry
Lymphocyte Development
MRC Clinical Sciences Centre
Imperial College Faculty of Medicine
Hammersmith Hospital
Du Cane Road
London W12 0NN
Tel: 0208 3832140
0208 3832145
Email: anna.terry at csc.mrc.ac.uk
More information about the Bioc-sig-sequencing
mailing list