[R] Regular expressions: offsets of groups
    Titus von der Malsburg 
    malsburg at gmail.com
       
    Mon Sep 27 17:48:09 CEST 2010
    
    
  
Dear list!
> gregexpr("a+(b+)", "abcdaabbc")
[[1]]
[1] 1 5
attr(,"match.length")
[1] 2 4
What I want are the offsets of the matches for the group (b+), i.e. 2
and 7, not the offsets of the complete matches.  Is there a way in R
to get those?
I know about gsubfn and strapply, but they only give me the strings
matched by the groups, not their offsets.
I could write something myself that first takes the above matches
("ab" and "aabb") and then searches within each of them again using
only the group (b+).  For this to work, I'd have to parse the regular
expression and search several times (more than twice for nested
groups) instead of just once.  But I'm sure there is a better way to
do this.
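To make the workaround concrete, here is a rough sketch of the two-pass
idea in base R.  The helper name group_offsets is made up, and the
sketch assumes that the group's sub-pattern is passed in by hand (so
no parsing of the full regular expression) and that the first hit of
that sub-pattern inside each complete match is the group of interest:

```r
# Hypothetical helper: offsets (in `text`) of `group` inside each
# complete match of `pattern`. Assumes the first occurrence of `group`
# within a match is the one belonging to the capture group.
group_offsets <- function(pattern, group, text) {
  m <- gregexpr(pattern, text)[[1]]
  if (m[1] == -1) return(integer(0))          # no complete match at all

  # First pass: extract the complete matches ("ab", "aabb").
  full <- substring(text, m, m + attr(m, "match.length") - 1)

  # Second pass: locate the group inside each complete match.
  inner <- regexpr(group, full)

  # Convert match-relative positions to positions in `text`.
  as.integer(m) + as.integer(inner) - 1L
}

group_offsets("a+(b+)", "b+", "abcdaabbc")
# [1] 2 7
```

This gives the right answer for the example, but it breaks as soon as
the group's sub-pattern can also match earlier in the complete match
(e.g. "b+a+(b+)"), which is why a real solution would need to parse
the expression.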
Thanks for any suggestion!
   Titus
    
    