[R] year extraction over a list
Rui Barradas
ruipbarradas at sapo.pt
Tue Jul 31 18:14:36 CEST 2012
Hello,
Try the following.
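x <- c("text, text, 2001, text", "text, 2000, text", "1999, text, text, text")

# Return the n-digit number found in each element of x, as an integer.
# The leading ".*" is greedy, so if a string holds several runs of
# n digits, the last one is returned.
extract.year <- function(x, n = 4){
    pattern <- paste(".*([[:digit:]]{", n, "}).*", sep="")
    as.integer(sub(pattern, "\\1", x))
}

extract.year(x)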
The argument 'n' is the number of digits in the year. Then use the function
as you want, within lapply, for instance, or directly as in
extract.year(foo$a)
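Since the question asks for the first year in each string, here is a possible variant, offered only as a sketch. Because of the greedy ".*", extract.year() returns the last run of n digits when a string holds more than one. The helper name extract.first.year and the sample data below are assumptions made for illustration, not part of the original post. It uses regexpr() and regmatches() to take the first run of n consecutive digits and gives NA where there is none. Note that it would also pick the first n digits of a longer number.

```r
# Illustrative data (made up): one string has two years, one has none.
x <- c("text, text, 2001, text", "text, 2000, text",
       "1999, text, 2005", "no year here")

# Hypothetical helper: first run of n consecutive digits in each string,
# NA where no such run exists.
extract.first.year <- function(x, n = 4) {
    pattern <- paste0("[[:digit:]]{", n, "}")
    m <- regexpr(pattern, x)          # position of first match, -1 if none
    hit <- !is.na(m) & m > 0
    out <- rep(NA_integer_, length(x))
    out[hit] <- as.integer(regmatches(x, m))  # regmatches keeps matches only
    out
}

extract.first.year(x)

# Applied over a list of character vectors, one result per list element.
a <- list(x[1:2], x[3:4])
lapply(a, extract.first.year)
```

The design choice here is to search for the digits directly instead of replacing the whole string, so a string without a year gives NA without a coercion warning.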
Hope this helps,
Rui Barradas
On 31-07-2012 16:33, jimi adams wrote:
> Hello,
> I have a data frame in which one element is a LIST, with each element being a character string. I am trying to extract the first year listed in each of those character strings. The character elements are typically csv, but the position of the year can vary (think citations with varying citation standards), e.g.,
>
> foo$a
> [[1]]
> [1] text, text, 2001, text
> [2] text, 2000, text
> [3] 1999, text, text, text, …
>
> I'm trying to figure out how to create a new list such that each element is that year, i.e., the result on the above would be:
> foo$year
> [[1]]
> [1] 2001
> [2] 2000
> [3] 1999
> …
>
> For some reason I'm not figuring out how to properly get lapply and strsplit (or other alternatives) to play nicely together. Any help is greatly appreciated.
>
> thanks,
> jimi
>
>
> jimi adams
> Assistant Professor
> Department of Sociology
> American University
> e: jadams at american.edu
> w: jimiadams.com