[Bioc-sig-seq] Input from multiple Solexa runs
Deepayan Sarkar
deepayan.sarkar at gmail.com
Fri Apr 24 00:33:44 CEST 2009
On Thu, Apr 23, 2009 at 3:22 PM, <ig2ar-saf2 at yahoo.co.uk> wrote:
>
> Hi Deepayan,
>
> When I do
>
> control1 <- combineLaneReads(c(expt1_analysis1[c("1", "2")],
> expt1_analysis2[c("3", "4")]))
>
> is there a way to filter reads so that I only get one read per genomic position?

combineLaneReads is a very simple function:

    combineLaneReads <- function(laneList, chromList = names(laneList[[1]])) {
        names(chromList) = chromList ## to get the return value named
        GenomeData(lapply(chromList,
                          function(chr) {
                              list("+" = unlist(lapply(laneList,
                                       function(x) x[[chr]][["+"]]), use.names = FALSE),
                                   "-" = unlist(lapply(laneList,
                                       function(x) x[[chr]][["-"]]), use.names = FALSE))
                          }))
    }

and you can just wrap a unique() around the unlist() to make the start
positions unique. But why would you want that? Within a lane,
duplicates are likely to be PCR artifacts, but for data from different
lanes, aren't duplicates more likely to be real? We could easily add
an argument to support this if you have a valid use-case.
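
For illustration, here is a minimal sketch of what such an argument might look like. It uses plain nested lists in place of GenomeData so that it runs without any Bioconductor packages. The function name combineLaneReadsSketch and the uniqueStarts argument are hypothetical, not part of any package, and the lane data are made up.

```r
## Sketch only: plain lists stand in for GenomeData, and 'uniqueStarts'
## is a hypothetical argument, not an existing one.
combineLaneReadsSketch <- function(laneList,
                                   chromList = names(laneList[[1]]),
                                   uniqueStarts = FALSE) {
    names(chromList) <- chromList  ## to get the return value named
    combineStrand <- function(chr, strand) {
        ## pool start positions for one chromosome/strand across all lanes
        starts <- unlist(lapply(laneList, function(x) x[[chr]][[strand]]),
                         use.names = FALSE)
        ## optionally keep one read per start position
        if (uniqueStarts) unique(starts) else starts
    }
    lapply(chromList, function(chr) {
        list("+" = combineStrand(chr, "+"),
             "-" = combineStrand(chr, "-"))
    })
}

## Two toy lanes with made-up start positions
lane1 <- list(chr1 = list("+" = c(10L, 10L, 25L), "-" = c(40L)))
lane2 <- list(chr1 = list("+" = c(10L, 30L), "-" = c(40L, 55L)))

combined <- combineLaneReadsSketch(list(lane1, lane2))
deduped  <- combineLaneReadsSketch(list(lane1, lane2), uniqueStarts = TRUE)
```

Note that unique() applied to the pooled vector drops duplicates both within and across lanes. To drop only within-lane duplicates, unique() would instead go inside the inner function, around x[[chr]][[strand]].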
-Deepayan