[Bioc-sig-seq] Finding Mean Value of Overlapping Ranges
Dario Strbenac
D.Strbenac at garvan.org.au
Fri Jun 25 07:31:15 CEST 2010
Hello,
I have a question about the most efficient way to handle my use case.
What I have done is obtain a matchMatrix from an overlap query, then split it:
regionSiteMap <- findOverlaps(regions, sites)@matchMatrix
indexList <- split(regionSiteMap[, "subject"], regionSiteMap[, "query"])
Now, for each region, I'd like to use the site indices to look up the sites' scores in a vector and take their mean, like:
means <- sapply(indexList, function(indices) mean(scoreVect[indices]))
The problem is that I have ~8 million 'regions' and ~28 million 'sites'. So indexList is a list of ~8 million elements with a few indices in each one, and scoreVect is a numeric vector of scores of length ~28 million.
Can anyone suggest the fastest way to do this task?
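For reference, one direction that might avoid the per-element sapply() call is to skip the split and compute grouped sums and counts directly from the two columns of the match matrix. The following is only a minimal sketch in base R on made-up toy data: query, subject and nRegions are hypothetical stand-ins for regionSiteMap[, "query"], regionSiteMap[, "subject"] and length(regions), and I have not timed it at the ~8 million / ~28 million scale.

```r
# Toy stand-ins for the two match matrix columns (hypothetical data).
query   <- c(1L, 1L, 2L, 4L, 4L, 4L)   # region index of each overlap
subject <- c(1L, 2L, 3L, 4L, 5L, 6L)   # site index of each overlap
scoreVect <- c(0.5, 1.5, 2.0, 1.0, 2.0, 3.0)
nRegions <- 4L                          # assumed number of regions

# Sum of site scores per region; rowsum() does the grouping in compiled code.
sums <- rowsum(scoreVect[subject], query)

# Number of overlapping sites per region (0 for regions with no overlap).
counts <- tabulate(query, nbins = nRegions)

# Regions without any overlapping site stay NA.
means <- rep(NA_real_, nRegions)
hit <- as.integer(rownames(sums))
means[hit] <- sums[, 1] / counts[hit]
```

Unlike the split/sapply version, this returns one value per region, with NA for regions that overlap no site (region 3 in the toy data).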
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia