[R] Help with text separation
    David Winsemius 
    dwinsemius at comcast.net
       
    Mon Nov 14 18:05:13 CET 2011
    
    
  
On Nov 14, 2011, at 4:20 AM, Michael Griffiths wrote:
> Good morning R list,
>
> My apologies if this has *already* been answered elsewhere, but I have
> not found the answer that I am looking for.
>
> I have a character string, i.e.
>
>
> form<-c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M')
>
> Now, my aim is to find the position of all those instances of '*' and to
> remove said '*'. However, I would also like to remove the preceding
> variable name before the '*', the math operator preceding this, and also
> the variable name after the '*'. So, here I would like to remove '+L*M'
This would be a very narrow implementation that requires the
+/spc/alnum/spc/*/alnum sequence exactly:
 > sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]*", "", form)
[1] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
This is a more general implementation using the "*" quantifier, which
matches the preceding item zero or more times, so the spaces are optional
and the variable names can be any length:
  form <- c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M',
            '~ A + B + C + C / D + E + E / F + G + H + I + J + K + L*M',
            '~ A + B + C + C / D + E + E / F + G + H + I + J + K +Llll*M')
 > sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]*", "", form)
[1] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
[2] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
[3] "~ A + B + C + C / D + E + E / F + G + H + I + J + K "
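Since the question asked about all instances of '*', here is a minimal sketch (not part of the reply above) of how the same idea might be extended to several product terms. It assumes that variable names contain only letters, digits, '.' and '_', and that each product term is preceded by a '+' or '-'. The names form2, pat, star_pos and cleaned are made up for illustration, and form2 is an invented test string.

```r
# The three test strings from the post
form <- c('~ A + B + C + C / D + E + E / F + G + H + I + J + K + L * M',
          '~ A + B + C + C / D + E + E / F + G + H + I + J + K + L*M',
          '~ A + B + C + C / D + E + E / F + G + H + I + J + K +Llll*M')

# Hypothetical string with two product terms (not from the original post)
form2 <- "~ A + B * C + D + E*F + G"

# Assumed pattern: optional spaces, one '+' or '-', a name, '*', a name.
# "+" after each bracket expression requires at least one name character.
pat <- "\\s*[-+]\\s*[[:alnum:]._]+\\s*\\*\\s*[[:alnum:]._]+"

# Character positions of every literal '*' (fixed = TRUE turns off regex)
star_pos <- as.integer(gregexpr("*", form2, fixed = TRUE)[[1]])

# gsub() replaces every match; sub() would replace only the first one
cleaned  <- gsub(pat, "", form2)
cleaned3 <- gsub(pat, "", form)

print(star_pos)
print(cleaned)
print(cleaned3)
```

Because the pattern also consumes the white space in front of the operator, no trailing space is left behind. A product term that directly follows the '~', with no leading operator, would not be removed by this pattern, and neither would terms joined by other operators such as '/' or ':'.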
---stripped out code---
-- 
David Winsemius, MD
West Hartford, CT
    
    