[Bioc-sig-seq] trimLRPatterns for wild card

Harris A. Jaffee hj at jhu.edu
Thu Sep 23 21:41:18 CEST 2010


Both pairs of L/R arguments (Lfixed/Rfixed and with.Lindels/with.Rindels)
refer to a "context" in the subject, that is, to its left or right end.
Neither says anything about where the N's might be in your pattern.
Whether you want to allow indels in the matching is a different story,
and it's your decision, maybe depending on the number of N's in your
_NNNNN_ and on the rest of your situation.
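
For instance, the Rpattern from the original question in this thread
already has its N's in the middle rather than at its right end. A minimal
sketch, assuming Biostrings is installed and reusing that subject and
pattern (the variable name "trimmed" is only for illustration):

```r
library(Biostrings)

# Subject taken from the original question in this thread.
a <- DNAString("AAACCCCTTTCCTTT")

# Rpattern with its N's in the middle. Rfixed="subject" lets the N's in
# the pattern act as wildcards while subject letters are taken literally.
# "R" only says that the pattern is looked for at the right end of 'a'.
trimmed <- trimLRPatterns(Rpattern = "CCNNT", subject = a,
                          Rfixed = "subject")
trimmed
```

Here the last five letters of the subject (CCTTT) should be trimmed, just
as with the exact pattern "CCTTT".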

If I understand the question, this is a good example from ?matchPattern:
	
        ## A simple inexact matching example with a short subject:
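        x <- DNAString("AAGCGCGATATG")
        ## Default fixed=TRUE: N in the pattern only matches a literal N
        m1 <- matchPattern("GCNNNAT", x)
        m1
        ## fixed=FALSE: ambiguity codes such as N act as wildcards
        m2 <- matchPattern("GCNNNAT", x, fixed=FALSE)
        m2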

Except, again, fixed="subject" is probably better than fixed=FALSE, since
it keeps any N's in the subject from acting as wildcards.
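
To make the difference concrete, here is a small sketch, assuming
Biostrings is installed. The subject 'x' is the one from ?matchPattern;
the subject 'y', which contains an N, is made up for illustration, as are
the count_* variable names.

```r
library(Biostrings)

pattern <- "GCNNNAT"
x <- DNAString("AAGCGCGATATG")  # subject without any N
y <- DNAString("AAGCGCGNT")     # hypothetical subject containing an N

# N's in the pattern are wildcards; subject letters are taken literally.
count_x_subject <- countPattern(pattern, x, fixed = "subject")

# With fixed=FALSE the N in 'y' can also match the A in the pattern.
count_y_false <- countPattern(pattern, y, fixed = FALSE)

# With fixed="subject" the N in 'y' is an ordinary letter, not a wildcard.
count_y_subject <- countPattern(pattern, y, fixed = "subject")

c(x_subject = count_x_subject,
  y_false   = count_y_false,
  y_subject = count_y_subject)
```

On 'x' both settings should find the same matches. On 'y' only
fixed=FALSE should report a hit, which is the accidental kind of match
you probably want to avoid.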

On Sep 23, 2010, at 3:08 PM, Kunbin Qu wrote:

> Harris, another question following your advice. If I have a case  
> like the following:
>
> fixedPattern1_NNNNN_fixedPattern1
>
> would the Rfixed help me, as the NNNNN is not at the right end? Or  
> should I consider using with.Lindels and with.Rindels? Thanks.
>
> -Kunbin
>
>
> -----Original Message-----
> From: Harris A. Jaffee [mailto:hj at jhu.edu]
> Sent: Thursday, September 23, 2010 11:56 AM
> To: Kunbin Qu
> Cc: bioc-sig-sequencing at r-project.org
> Subject: Re: [Bioc-sig-seq] trimLRPatterns for wild card
>
> Let me try again.  You want Rfixed="subject", only.  Rfixed=FALSE
> works in your case by accident since your subject does not contain N.
> If it did, the N's would match anything in your pattern, which isn't
> what you want.
>
> To clean up another issue, with Rfixed=FALSE, one would expect that
> both the pattern and subject would have to be DNA or RNA, and that is
> eventually correct.  But, by the time it matters, if your subject is
> one of those types, your pattern will have been converted accordingly
> for you, automatically.
>
> On Sep 23, 2010, at 2:01 PM, Harris A. Jaffee wrote:
>> You need to set 'Rfixed' to either FALSE or "subject":
>>
>> 	trimLRPatterns(Rpattern=p, subject=a, Rfixed=FALSE)
>>
>> 	trimLRPatterns(Rpattern=p, subject=a, Rfixed="subject")
>>
>> See ?matchPattern and ?`lowlevel-matching`.
>>
>> By the way, your pattern did not have to be a DNAString.  "CCNNT"
>> would have been fine.  But the ambiguity machinery with regard to
>> the subject requires that it be an RNA- or DNAString.
>>
>> -Harris
>>
>> On Sep 23, 2010, at 1:32 PM, Kunbin Qu wrote:
>>
>>> Hi, all,
>>>
>>> Does trimLRPatterns function take wild card N? When I tested it,
>>> it does not seem to work. Is there a way to do that? Thanks.
>>>
>>> -Kunbin
>>>
>>>
>>>> a<-DNAString("AAACCCCTTTCCTTT")
>>>> p<-DNAString("CCNNT")
>>>> trimLRPatterns(Rpattern=p, subject=a)
>>>   15-letter "DNAString" instance
>>> seq: AAACCCCTTTCCTTT
>>>> p<-DNAString("CCTTT")
>>>> trimLRPatterns(Rpattern=p, subject=a)
>>>   10-letter "DNAString" instance
>>> seq: AAACCCCTTT
>>>>
>>>
>>> ______________________________________________________________________
>>> The contents of this electronic message, including any
>>> attachments, are intended only for the use of the individual or
>>> entity to which they are addressed and may contain confidential
>>> information. If you are not the intended recipient, you are hereby
>>> notified that any use, dissemination, distribution, or copying of
>>> this message or any attachment is strictly prohibited. If you have
>>> received this transmission in error, please send an e-mail to
>>> postmaster at genomichealth.com and delete this message, along with
>>> any attachments, from your computer.
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
>


