[R] Regular expression help
Duncan Murdoch
murdoch.duncan at gmail.com
Mon Oct  9 18:15:07 CEST 2017

On 09/10/2017 11:23 AM, Ulrik Stervbo wrote:
> Hi Duncan,
> 
> why not split on / and take the correct elements? It is not as elegant 
> as regex but could do the trick.
Thanks for the suggestion.  There are likely many thousands of lines of 
data like the two real examples (which had about 5000 and 60000 lines 
respectively), so I was thinking that would be too slow, as it would 
involve nested strsplit() calls.  But in fact, it's not so bad, so I 
might go with it.  Here's a stab at it:
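
# lines: the lines to be split, e.g. the lines starting with "f" in
# http://sci.esa.int/science-e/www/object/doc.cfm?fobjectid=54726
# (the two examples from my first message are used here)
lines <- c("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587",
           "f 1067 28680 24462")
l2 <- strsplit(lines, " ")   # words on each line
l3 <- lapply(l2, function(x) {
         y <- strsplit(x, "/")   # a/b/c parts of each word
         # keep c when all three parts are present, otherwise ""
         sapply(y, function(z) if (length(z) == 3) z[3] else "")
       })
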
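
The result l3 is a list with one character vector per line, so getting the target strings still needs each vector pasted back together with single spaces. A minimal sketch of that last step follows. The name third_fields() is hypothetical and not from the post, fixed = TRUE is an added assumption that the separators are literal characters, and the input is the two examples quoted in this thread.

```r
# Hypothetical helper (not from the original post): split each line on
# spaces, keep the third "/"-separated part of each word, and paste the
# pieces back together so the word positions are preserved.
third_fields <- function(lines) {
  words <- strsplit(lines, " ", fixed = TRUE)
  vapply(words, function(w) {
    parts <- strsplit(w, "/", fixed = TRUE)
    thirds <- vapply(parts,
                     function(p) if (length(p) == 3) p[3] else "",
                     character(1))
    paste(thirds, collapse = " ")
  }, character(1))
}

lines <- c("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587",
           "f 1067 28680 24462")
result <- third_fields(lines)
# result[1] is " 587 587 587 587"; result[2] is three spaces
```

vapply() is used here instead of sapply() only so that the return type is fixed to one string per word and per line, even for empty input.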
Duncan
> 
> Best,
> Ulrik
> 
> On Mon, 9 Oct 2017 at 17:03 Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> 
>     I have a file containing "words" like
> 
> 
>     a
> 
>     a/b
> 
>     a/b/c
> 
>     where there may be multiple words on a line (separated by spaces).  The
>     a, b, and c strings can contain non-space, non-slash characters. I'd
>     like to use gsub() to extract the c strings (which should be empty if
>     there are none).
> 
>     A real example is
> 
>     "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
> 
>     which I'd like to transform to
> 
>     " 587 587 587 587"
> 
>     Another real example is
> 
>     "f 1067 28680 24462"
> 
>     which should transform to "   ".
> 
>     I've tried a few different regexprs, but am unable to find a way to say
>     "transform words by deleting everything up to and including the 2nd
>     slash" when there might be zero, one or two slashes.  Any suggestions?
> 
>     Duncan Murdoch
>
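
For the gsub() route asked about in the quoted message, one possible pattern is sketched below. It is not from the thread: the name extract_third() is hypothetical, and the pattern assumes every word has a non-empty "a" part and at most two slashes. It matches a whole word at a time and replaces it with whatever follows the second slash, which is empty when the word has zero or one slash.

```r
# Hypothetical helper (not from the original thread). Assumes each word has
# a non-empty "a" part and at most two slashes.
extract_third <- function(x) {
  # [^ /]+          the "a" part
  # (?:/[^ /]*/?)?  optionally "/b", plus the second slash if present
  # ([^ /]*)        the rest of the word: "c", or "" when there is none
  gsub("[^ /]+(?:/[^ /]*/?)?([^ /]*)", "\\1", x, perl = TRUE)
}

examples <- c("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587",
              "f 1067 28680 24462",
              "a a/b a/b/c")
out <- extract_third(examples)
# out[1] is " 587 587 587 587"; out[2] is three spaces; out[3] is "  c"
```

Because the capture group always takes part in the match, the replacement never depends on how an unmatched group is treated. Spaces are never part of a match, so they pass through unchanged and the word positions are kept.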
    
    