[BioC] Recommended gene model for DESeq
Simon Anders
anders at embl.de
Sat Apr 7 00:59:30 CEST 2012
Hi Gordon
On 2012-04-06 23:24, Assaf Gordon wrote:
> Once per gene - got it. What about a case where a read matches
> multiple genes? (described as "ambiguous" in
> HTSeq-Count/GenomicRanges "modes") Is it OK to count this read
> several times (once for each gene, multiple different genes), or
> would that invalidate the results?
HTSeq-count discards reads that map to several genes and counts them as
"ambiguous". I explained the reason a while ago in a
SeqAnswers thread, in post #4 here:
http://seqanswers.com/forums/showthread.php?t=9129
"Imagine we have two paralogous genes that have identical sequence at
one half of their length and divergent sequence at the other half, and
one of these genes is differentially expressed and the other is not. All
reads that stem from the identical-sequence parts of the transcripts
will map to both genes, and if we include them in our counts, both genes
will appear to be differentially expressed, even though only one
really is. If we count only the uniquely mapping reads (i.e., those
stemming from the divergent parts of the transcripts), we are safe."
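
The effect described in that quote can be sketched with a small toy calculation. This is only an illustration under made-up assumptions, not HTSeq code: the gene names A and B, the read numbers, and the helper functions make_sample, count_reads and fold_changes are all hypothetical. Each read is represented simply by the set of genes it overlaps, and half of each gene's reads are assumed to come from the identical half of the sequence.

```python
# Toy illustration only; gene names, read numbers and helpers are made up.
# A read is represented by the set of genes it overlaps.

def make_sample(expr_a, expr_b):
    """Build a read list for paralogs A and B with the given expression.

    Assumption: half of each gene's reads stem from the identical half of
    the sequence and therefore overlap both genes; the rest are unique.
    """
    reads = []
    reads += [frozenset({"A"})] * (expr_a // 2)       # unique to A
    reads += [frozenset({"A", "B"})] * (expr_a // 2)  # from A, shared part
    reads += [frozenset({"B"})] * (expr_b // 2)       # unique to B
    reads += [frozenset({"A", "B"})] * (expr_b // 2)  # from B, shared part
    return reads

def count_reads(reads, count_ambiguous):
    """Count reads per gene, optionally crediting ambiguous reads to every gene."""
    counts = {}
    for genes in reads:
        if len(genes) > 1 and not count_ambiguous:
            continue  # discard reads that overlap more than one gene
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return counts

def fold_changes(count_ambiguous):
    """Treated/control ratio per gene when only gene A truly changes (4x)."""
    control = count_reads(make_sample(100, 100), count_ambiguous)
    treated = count_reads(make_sample(400, 100), count_ambiguous)
    return {gene: treated[gene] / control[gene] for gene in sorted(control)}

print(fold_changes(count_ambiguous=True))   # {'A': 3.0, 'B': 2.0}
print(fold_changes(count_ambiguous=False))  # {'A': 4.0, 'B': 1.0}
```

In this toy setup, counting the ambiguous reads for both genes makes the unchanged gene B look two-fold up and understates the change in A, while counting only the uniquely assigned reads recovers the true ratios.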
Simon