[Bioc-sig-seq] trimLRPatterns for wild card
Harris A. Jaffee
hj at jhu.edu
Thu Sep 23 21:41:18 CEST 2010
Both pairs of L/R arguments (Lfixed/Rfixed and with.Lindels/with.Rindels)
refer to a "context" in the subject, namely its left or right end.
Neither refers to where the N's might be in your pattern. Whether you
want to allow indels in the matching is a different story, and it's
your decision, maybe depending on the number of N's in your _NNNNN_
and the rest of your whole situation.
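
To make that concrete, here is a minimal sketch with N's in the middle of a right-end pattern. The subject and the pattern are made-up values for illustration (they are not from Kunbin's data), and the sketch assumes the Biostrings package is installed.

```r
library(Biostrings)

# Hypothetical subject: 5 letters to keep, followed at the right end by
# a fixed part, 5 arbitrary bases, and another fixed part.
subject <- DNAString("GGGGGACGTCAGATTG")

# Hypothetical pattern with the N's in the middle. A plain character
# string is accepted; it does not have to be a DNAString.
rpattern <- "ACGNNNNNTTG"

# Default Rfixed=TRUE: an N in the pattern only matches a literal N in
# the subject, so no trimming is expected here.
untrimmed <- trimLRPatterns(Rpattern = rpattern, subject = subject)

# Rfixed="subject": N's in the pattern act as wildcards, while letters
# in the subject are taken literally.
trimmed <- trimLRPatterns(Rpattern = rpattern, subject = subject,
                          Rfixed = "subject")

as.character(untrimmed)
as.character(trimmed)
```

The "R" in Rfixed only says that the pattern is matched at the right end of the subject; the N's can sit anywhere inside the pattern.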
If I understand the question, this is a good example from
?matchPattern:
## A simple inexact matching example with a short subject:
x <- DNAString("AAGCGCGATATG")
m1 <- matchPattern("GCNNNAT", x)
m1
m2 <- matchPattern("GCNNNAT", x, fixed=FALSE)
m2
Except, again, fixed="subject" is probably better than fixed=FALSE,
since it keeps any N's in the subject from acting as wildcards.
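
A short sketch of the difference, assuming Biostrings is installed. The first subject is the one from the ?matchPattern example; the second subject `y` and the N-free pattern are made up here only to show what happens when the subject itself contains N's.

```r
library(Biostrings)

x <- DNAString("AAGCGCGATATG")

# fixed="subject": N's in the pattern are wildcards, subject letters
# are taken literally.
m3 <- matchPattern("GCNNNAT", x, fixed = "subject")
m3

# Hypothetical subject containing N's, searched with an N-free pattern.
y <- DNAString("AAGCNNNATATG")

# fixed=FALSE: the N's in the subject also act as wildcards.
m4 <- matchPattern("GCGCGAT", y, fixed = FALSE)

# fixed="subject": the N's in the subject stay literal.
m5 <- matchPattern("GCGCGAT", y, fixed = "subject")

length(m4)
length(m5)
```

If the subject never contains ambiguity codes, fixed=FALSE and fixed="subject" should give the same matches; they only differ in cases like `y`.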
On Sep 23, 2010, at 3:08 PM, Kunbin Qu wrote:
> Harris, another question following your advice. If I have a case
> like the following:
>
> fixedPattern1_NNNNN_fixedPattern1
>
> would the Rfixed help me, as the NNNNN is not at the right end? Or
> should I consider using with.Lindels and with.Rindels? Thanks.
>
> -Kunbin
>
>
> -----Original Message-----
> From: Harris A. Jaffee [mailto:hj at jhu.edu]
> Sent: Thursday, September 23, 2010 11:56 AM
> To: Kunbin Qu
> Cc: bioc-sig-sequencing at r-project.org
> Subject: Re: [Bioc-sig-seq] trimLRPatterns for wild card
>
> Let me try again. You want Rfixed="subject", only. Rfixed=FALSE
> works in your case by accident since your subject does not contain N.
> If it did, the N's would match anything in your pattern, which isn't
> what you want.
>
> To clean up another issue, with Rfixed=FALSE, one would expect that
> both the pattern and subject would have to be DNA or RNA, and that is
> eventually correct. But, by the time it matters, if your subject is
> one of those types, your pattern will have been converted accordingly
> for you, automatically.
>
> On Sep 23, 2010, at 2:01 PM, Harris A. Jaffee wrote:
>> You need to set 'Rfixed' to either FALSE or "subject":
>>
>> trimLRPatterns(Rpattern=p, subject=a, Rfixed=FALSE)
>>
>> trimLRPatterns(Rpattern=p, subject=a, Rfixed="subject")
>>
>> See ?matchPattern and ?`lowlevel-matching`.
>>
>> By the way, your pattern did not have to be a DNAString. "CCNNT"
>> would have been fine. But the ambiguity machinery with regard to
>> the subject requires that it be an RNA- or DNAString.
>>
>> -Harris
>>
>> On Sep 23, 2010, at 1:32 PM, Kunbin Qu wrote:
>>
>>> Hi, all,
>>>
>>> Does trimLRPatterns function take wild card N? When I tested it,
>>> it does not seem to work. Is there a way to do that? Thanks.
>>>
>>> -Kunbin
>>>
>>>
>>>> a<-DNAString("AAACCCCTTTCCTTT")
>>>> p<-DNAString("CCNNT")
>>>> trimLRPatterns(Rpattern=p, subject=a)
>>> 15-letter "DNAString" instance
>>> seq: AAACCCCTTTCCTTT
>>>> p<-DNAString("CCTTT")
>>>> trimLRPatterns(Rpattern=p, subject=a)
>>> 10-letter "DNAString" instance
>>> seq: AAACCCCTTT
>>>>
>>>
>>> ______________________________________________________________________
>>> The contents of this electronic message, including any
>>> attachments, are intended only for the use of the individual or
>>> entity to which they are addressed and may contain confidential
>>> information. If you are not the intended recipient, you are hereby
>>> notified that any use, dissemination, distribution, or copying of
>>> this message or any attachment is strictly prohibited. If you have
>>> received this transmission in error, please send an e-mail to
>>> postmaster at genomichealth.com and delete this message, along with
>>> any attachments, from your computer.
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
>