[BioC] a question about trimLRPatterns?
wang peter
wng.peter at gmail.com
Tue Oct 30 17:58:47 CET 2012
I would like to know how this function works.
For example:
library(Biostrings)
subject = "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern = "GAATAGTACTGTAGGCACCATCAATAGATCGGAA"
trimLRPatterns(Rpattern = Rpattern, subject = subject,
               max.Rmismatch = 1, with.Lindels = TRUE)
As I understand it, the function calculates the distances with code like this:
sapply((nchar(subject) - nchar(Rpattern) + 1):nchar(subject), function(j) {
    s = substr(subject, j, nchar(subject))
    p = substr(Rpattern, 1, nchar(subject) - j + 1)
    neditEndingAt(ending.at = nchar(s), pattern = p, subject = s,
                  with.indels = TRUE)
})
 [1]  0  2  4  6  8 10 12 14 15 14 13 12 11 10  9  9  8  7  8  7  6  5  6  6  5  4  4  4  3  2  1  0
[33]  1  1
When the function finds the first value that satisfies max.Rmismatch,
it stops. In this case, the function will stop at the first position.
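The stopping rule as described above can be sketched in base R. This is only an illustration of the rule as it is stated in this post, not a confirmed account of how trimLRPatterns is implemented; the names `d` and `max_mismatch` are made up for the example, and the values of `d` are copied from the output above.

```r
# Distances copied from the output above, one per candidate start
# position j = 15..48 of the subject (34 values).
d <- c(0, 2, 4, 6, 8, 10, 12, 14, 15, 14, 13, 12, 11, 10, 9, 9, 8, 7, 8, 7,
       6, 5, 6, 6, 5, 4, 4, 4, 3, 2, 1, 0, 1, 1)

# Hypothetical name for the threshold passed as max.Rmismatch.
max_mismatch <- 1

# Index of the first distance that does not exceed the threshold.
first_hit <- which(d <= max_mismatch)[1]
first_hit
```

Here the first distance is already 0, so the first qualifying index is 1, which matches the statement that the function stops at the first position.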
If instead:
subject = "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern = "GAATAGTACTGTAGGCACCATCAATAGATCGGTT"
the results are:
 [1]  2  3  4  6  8 10 12 14 15 14 13 12 11 10  9  9  8  7  8  7  6  5  6  6  5  4  4  4  3  2  1  0
[33]  1  1
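To see which overlap the first qualifying distance corresponds to, the index can be mapped back to a start position in the subject. The sketch below uses base R only and assumes the stopping rule as it is described in this post (first distance not exceeding 1); it has not been checked against the Biostrings source. The names `d2`, `first_hit`, `subject_suffix` and `pattern_prefix` are made up for the example.

```r
subject  <- "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern <- "GAATAGTACTGTAGGCACCATCAATAGATCGGTT"

# Distances copied from the results above (34 values, j = 15..48).
d2 <- c(2, 3, 4, 6, 8, 10, 12, 14, 15, 14, 13, 12, 11, 10, 9, 9, 8, 7, 8, 7,
        6, 5, 6, 6, 5, 4, 4, 4, 3, 2, 1, 0, 1, 1)

# First index whose distance is within the assumed threshold of 1.
first_hit <- which(d2 <= 1)[1]

# Convert the index back to the start position j used in the sapply() call.
j <- nchar(subject) - nchar(Rpattern) + first_hit

# The suffix of the subject and the prefix of Rpattern compared at that j.
subject_suffix <- substr(subject, j, nchar(subject))
pattern_prefix <- substr(Rpattern, 1, nchar(subject) - j + 1)

c(first_hit = first_hit, j = j)
c(subject_suffix = subject_suffix, pattern_prefix = pattern_prefix)
```

Under that reading, the first qualifying index is 31 (j = 45), where only the last four bases of the subject ("GGAA") are compared with the first four bases of Rpattern ("GAAT").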
In this case, the function will stop at:
subject = "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern = "GAATAGTACTGTAGGCACCATCAATAGATCGGTT"
So the shortcoming is that trimLRPatterns cannot find the sequence
shared between subject and Rpattern:
"GAATAGTACTGTAGGCACCATCAATAGATCGG"
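For reference, the shared sequence can be located with base R alone. The following is a minimal sketch, not a Biostrings feature and not a replacement for trimLRPatterns: it finds where the shared sequence starts in the subject, then measures how many leading bases of Rpattern agree exactly with the subject from that point on. The names `shared`, `start`, `mismatch` and `lcp_len` are made up for the example.

```r
subject  <- "TATAGTAGATATTGGAATAGTACTGTAGGCACCATCAATAGATCGGAA"
Rpattern <- "GAATAGTACTGTAGGCACCATCAATAGATCGGTT"
shared   <- "GAATAGTACTGTAGGCACCATCAATAGATCGG"

# Position of the first exact occurrence of the shared sequence in subject.
start <- as.integer(regexpr(shared, subject, fixed = TRUE))

# Compare the tail of the subject with Rpattern, base by base.
a <- strsplit(substr(subject, start, nchar(subject)), "")[[1]]
b <- strsplit(Rpattern, "")[[1]]
n <- min(length(a), length(b))
mismatch <- which(a[seq_len(n)] != b[seq_len(n)])

# Length of the common prefix: everything before the first mismatch.
lcp_len <- if (length(mismatch) == 0) n else mismatch[1] - 1

start
lcp_len
substr(Rpattern, 1, lcp_len) == shared
```

The shared sequence starts at position 15 of the subject and covers the first 32 bases of Rpattern; only the final two bases (TT versus AA) differ.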
--
shan gao
Room 231(Dr.Fei lab)
Boyce Thompson Institute
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267(day)
Official email:sg839 at cornell.edu
Facebook:http://www.facebook.com/profile.php?id=100001986532253