[Bioc-sig-seq] finding the final nucleotide of trimmed reads

Harris A. Jaffee hj at jhu.edu
Thu Aug 26 16:25:36 CEST 2010


It sounded to me like he simply wanted the distribution of last bases,
which happen to be all T in your example.  We can extract and tabulate
them more directly like this:

# example of trimLRPatterns value:
 > A = DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))

 > n = nchar(A)
 > last = narrow(A, start=n, end=n)
 > alphabetFrequency(last, baseOnly=TRUE, collapse=TRUE)
     A     C     G     T other
     0     0     0     3     0
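
As a cross-check that does not depend on Biostrings, the same tabulation
can be sketched in base R on a plain character vector. This is only a
sketch under assumptions: the names reads, last_base, counts and by_width
are illustrative and not from the thread, and with a DNAStringSet you
would first convert it with as.character(A).

```r
# Same three example reads as above, held as a plain character vector.
# (Assumption: for a DNAStringSet A, use reads <- as.character(A).)
reads <- c("ACAT", "CAT", "AGGGCGT")

# Width of each read, then the single character at that position.
n <- nchar(reads)
last_base <- substring(reads, n, n)

# Tabulate over a fixed A/C/G/T alphabet so absent bases show up as zero.
counts <- table(factor(last_base, levels = c("A", "C", "G", "T")))
print(counts)

# Optional: break the last-base counts down by trimmed read width.
by_width <- table(width = n, base = last_base)
print(by_width)
```

The by_width table may be useful if the last-base composition is
suspected to depend on how much adapter was trimmed.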

On Aug 26, 2010, at 9:29 AM, Joern Toedling wrote:

> Hi,
>
> have a look at the "shift" argument of the function consensusMatrix from
> Biostrings.
>
> This code example should correspond to your question. Three nucleotide
> strings are aligned at their last position and the sequence composition
> is obtained:
>
> A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
> maxlen <- max(nchar(A))
> consensusMatrix(A, shift=maxlen-nchar(A), baseOnly=TRUE)
>
> I tested this with Biostrings_2.17.29, but I guess that it works with the
> current release version, too.
>
> Regards,
> Joern
>
>
> On Thu, 26 Aug 2010 07:35:01 -0500, joseph franklin wrote
>> Hi,
>>
>> I've been trimming adapters from reads using trimLRPatterns.  The
>> resulting, trimmed set contains a heterogeneous mix of widths: from
>> ~18-35 nt.  Can anyone guide me toward an elegant way to find the
>> nucleotide composition of the final (right-most) cycle for each of
>> the trimmed reads?
>>
>> Many thanks,
>> Joe Franklin
>
> ---
> Joern Toedling
> Institut Curie -- U900
> 26 rue d'Ulm, 75005 Paris, FRANCE
> Tel. +33 (0)156246927
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


