[BioC] A problem about trimLRPatterns
    wang peter 
    wng.peter at gmail.com
       
    Fri Mar  2 15:48:28 CET 2012
    
    
  
Dear Harris,
Thank you for your excellent example, but I still have three small questions:
1. When j=15, s = "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"
and p = "CTGTAGGCACCATCAATAGATCGGAAGAGCGGTT".
The edit distance between s and p is 16, not 8, yet the first value in your output below is 8:
> subject = "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
> pattern = "CTGTAGGCACCATCAATAGATCGGAAGAGCGGTTCAGAAGGAATGCCGAG"
>
> sapply(15:nchar(subject), function(j) {
>        s = substr(subject, j, nchar(subject))
>        p = substr(pattern, 1, nchar(subject)-j+1)
>        neditEndingAt(ending.at=nchar(s), pattern = p, subject = s, with.indels=TRUE)
> })
>
>  [1]  8  7  6  5  4  3  2  1  0  2  4  6  8 10 11 11 10  9  8  8  9  8  7  7  6
> [26]  5  5  4  3  2  4  3  2  1
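To make question 1 concrete, here is a minimal base-R sketch (not from the thread) that computes two different quantities for the j=15 strings. The first is the global Levenshtein distance via adist() from the utils package. The second is only a guess at what a match "ending at" a position with a free start might measure: the smallest distance between p and any suffix of s. Whether neditEndingAt computes the second quantity is exactly what is being asked, so treat it as an assumption; the variable names are hypothetical.

```r
# Strings from the j = 15 case discussed above.
s <- "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"
p <- "CTGTAGGCACCATCAATAGATCGGAAGAGCGGTT"

# Global Levenshtein distance: all of p against all of s.
global_dist <- adist(p, s)[1, 1]

# Assumed "free start" variant: compare p against every suffix of s
# (each one ends at nchar(s)) and keep the smallest distance.
starts <- seq_len(nchar(s))
suffixes <- substring(s, starts, nchar(s))
free_start_dist <- min(adist(p, suffixes))

print(c(global = global_dist, free_start = free_start_dist))
```

If the two numbers differ, that would be consistent with the discrepancy in question 1 coming from the definition of the distance rather than from a bug.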
2. If trimLRPatterns trims the longest substring that stays within the
allowed number of mismatches,
it will remove some bases that are not noise, right?
3. So the trimLRPatterns algorithm is based on the edit distance, right? I think
it uses dynamic programming to calculate the Levenshtein distance,
but it seems much faster than my program, which also uses dynamic programming.
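For reference on question 3, this is a textbook dynamic-programming Levenshtein distance in plain R. It is a sketch of the general technique only; it is not claimed to be how trimLRPatterns works, and levenshtein() is a hypothetical helper name rather than a Biostrings function.

```r
# Textbook Levenshtein distance, keeping only two rows of the DP table.
levenshtein <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  prev <- 0:length(y)                      # row for the empty prefix of a
  for (i in seq_along(x)) {
    cur <- c(i, integer(length(y)))        # first cell: delete i characters
    for (k in seq_along(y)) {
      cost <- if (x[i] == y[k]) 0L else 1L
      cur[k + 1] <- min(prev[k + 1] + 1L,  # delete x[i]
                        cur[k] + 1L,       # insert y[k]
                        prev[k] + cost)    # match or substitute
    }
    prev <- cur
  }
  prev[length(y) + 1]
}

# Cross-check against base R's adist() on a classic pair.
print(levenshtein("kitten", "sitting"))
print(adist("kitten", "sitting")[1, 1])
```

An interpreted double loop like this takes time proportional to nchar(a) * nchar(b) and is usually far slower than a compiled routine such as adist(), so a large speed gap between two dynamic-programming implementations is not surprising in itself.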
Thank you very much.
Shan
    
    