[R] UNIX-like "cut" command in R
    Mike Miller 
    mbmiller+l at gmail.com
       
    Tue May  3 06:26:59 CEST 2011
    
    
  
On Mon, 2 May 2011, Gabor Grothendieck wrote:
> On Mon, May 2, 2011 at 10:32 PM, Mike Miller <mbmiller+l at gmail.com> wrote:
>> On Tue, 3 May 2011, Andrew Robinson wrote:
>>
>>> try substr()
>>
>> OK.  Apparently, it allows things like this...
>>
>>> substr("abcdef",2,4)
>>
>> [1] "bcd"
>>
>> ...which is like this:
>>
>> echo "abcdef" | cut -c2-4
>>
>> But that doesn't use a delimiter; it only does character-based cutting, and
>> it is very limited.  With "cut -c" I can do stuff like this:
>>
>> echo "abcdefghijklmnopqrstuvwxyz" | cut -c-3,12-15,17-
>>
>> abclmnoqrstuvwxyz
>>
>> It extracts characters 1 to 3, 12 to 15 and 17 to the end.
>>
>> That was a great tip, though, because it led me to strsplit, which can do
>> what I want, if somewhat awkwardly:
>>
>>> y <- "a b c d e f g h i j k l m n o p q r s t u v w x y z"
>>> delim <- " "
>>> paste(unlist(strsplit(y, delim))[c(1:3,12:15,17:26)], collapse=delim)
>>
>> [1] "a b c l m n o q r s t u v w x y z"
>>
>> That gives me what I want, but it is still a little awkward.  I guess I
>> don't quite get what I'm doing with lists.  I'm not clear on how this would
>> work with a vector of strings.
>>
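
For a vector of strings, one possibility is to wrap the same
strsplit()/paste() idea in a small function and apply it to each element.
This is only a minimal sketch, assuming a fixed (non-regex) delimiter;
cut_fields() is a made-up name, not a base R function.

```r
# Hypothetical helper (not part of base R): keep the requested fields of
# each string in x, splitting and re-joining on a fixed delimiter.
cut_fields <- function(x, fields, delim = " ") {
  # strsplit() returns a list with one character vector per input string
  parts <- strsplit(x, delim, fixed = TRUE)
  vapply(parts, function(p) {
    keep <- fields[fields <= length(p)]   # drop field numbers past the end
    paste(p[keep], collapse = delim)
  }, character(1))
}

y <- c("a b c d e f g h i j k l m n o p q r s t u v w x y z",
       "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18")
res <- cut_fields(y, c(1:3, 12:15, 17:26))
print(res)
```

The list that strsplit() returns is handled one element at a time, so
unlist() is no longer needed and strings of different lengths do not get
mixed together.
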
>
> Try this:
>
>> read.fwf(textConnection("abcdefghijklmnopqrstuvwxyz"), widths = c(3, 8, 4, 1, 10), colClasses = c(NA, "NULL"))
>   V1   V3         V5
> 1 abc lmno qrstuvwxyz

That gives me a few more functions to study.  Of course, the new code
(using read.fwf() and textConnection()) does not do what was requested:
it cuts at fixed character positions rather than on a delimiter.  It also
requires some work to compute the widths from the given numbers
(c(1:3, 12:15, 17:26) has to be converted to c(3, 8, 4, 1, 10)).
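
The conversion could probably be automated with rle().  A rough sketch,
assuming the total line width n is known in advance; fwf_widths() is a
made-up name.  It relies on the documented read.fwf() behaviour that
negative widths mark columns to skip, so the colClasses trick is not
needed here.

```r
# Hypothetical helper: turn the character positions to keep into widths
# for read.fwf(), using negative widths for the runs to be skipped.
fwf_widths <- function(keep, n) {
  # runs of kept (TRUE) and skipped (FALSE) positions along 1..n
  runs <- rle(seq_len(n) %in% keep)
  ifelse(runs$values, runs$lengths, -runs$lengths)
}

w <- fwf_widths(c(1:3, 12:15, 17:26), 26)
print(w)   # absolute values are 3, 8, 4, 1, 10

# Read the kept pieces as character columns
con <- textConnection("abcdefghijklmnopqrstuvwxyz")
pieces <- read.fwf(con, widths = w, colClasses = "character")
close(con)
print(pieces)
```
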

Mike
    
    