[R] Regular Expression
jim holtman
jholtman at gmail.com
Tue Jul 24 19:54:01 CEST 2012
Is this what you want?
> x <- read.table(text = "MONTH QUARTER YEAR
+ 2012-07 2012-3 2012
+ 2001-07 2001-3 2001
+ 2002-01 2002-1 2002", header = TRUE, as.is = TRUE)
> x
MONTH QUARTER YEAR
1 2012-07 2012-3 2012
2 2001-07 2001-3 2001
3 2002-01 2002-1 2002
> x$MONTH <- sub(".*-(.*)", "\\1", x$MONTH)
> x$QUARTER <- sub(".*-(.*)", "\\1", x$QUARTER)
> x
MONTH QUARTER YEAR
1 07 3 2012
2 07 3 2001
3 01 1 2002
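A note on the pattern above, offered as a sketch rather than the one right way to do it: `.*-` is greedy, so with more than one hyphen in a value (say "a-b-c") it keeps only what follows the last hyphen. If the intent is to cut at the first hyphen, a negated class does that. The helper name `after_hyphen` is made up for illustration. The last line shows, as far as I can tell, why the attempt in the original post appeared to do nothing: `[^-\\d]` matches any character that is neither a hyphen nor a digit, and the sample values contain only hyphens and digits.

```r
# Sample data copied from the thread
x <- read.table(text = "MONTH QUARTER YEAR
2012-07 2012-3 2012
2001-07 2001-3 2001
2002-01 2002-1 2002", header = TRUE, as.is = TRUE)

# Hypothetical helper: drop everything up to and including the FIRST hyphen.
# Values that contain no hyphen are returned unchanged.
after_hyphen <- function(v) sub("^[^-]*-", "", v)

month   <- after_hyphen(x$MONTH)
quarter <- after_hyphen(x$QUARTER)

# The character class from the original post: "not a hyphen and not a digit".
# Nothing in these values matches it, so gsub() removes nothing.
unchanged <- gsub("[^-\\d]", "", x$MONTH, perl = TRUE)
```

Assigning `month` and `quarter` back to `x$MONTH` and `x$QUARTER` should give the same table as the `sub(".*-(.*)", "\\1", ...)` calls for this data, since each value has exactly one hyphen.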
On Tue, Jul 24, 2012 at 1:36 PM, Fred G <bayespokerguy at gmail.com> wrote:
> Hi--
>
> I have three columns in an input file:
> MONTH QUARTER YEAR
> 2012-07 2012-3 2012
> 2001-07 2001-3 2001
> 2002-01 2002-1 2002
>
> I want to make output like so:
> MONTH QUARTER YEAR
> 07 3 2012
> 07 3 2001
> 01 1 2002
>
> I was having some trouble getting the regular expression to work. I think
> it should be something like the following:
> tmp <- uncurated$MONTH
> tmp <- gsub("[^-\\d\\d]","",tmp,perl=TRUE)
> tmp[tmp=="-"] <- ""
> curated$MONTH <- tmp
>
> tmp <- uncurated$QUARTER
> tmp <- gsub("[^-\\d]","",tmp,perl=TRUE)
> tmp[tmp=="-"] <- ""
> curated$QUARTER <- tmp
>
> but it's not quite working. I want to be able to isolate any digits that
> occur after the hyphen and to delete everything before and including the
> hyphen. Would greatly appreciate any clarification anyone can provide.
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.