[R] dates in French format

Denis Chabot chabotd at globetrotter.net
Thu Jan 31 15:46:20 CET 2008


(I've put the R Mac list in cc because of the crashes I have  
experienced trying some of the suggestions below)

Hi Gabor and Prof Ripley,

On 31 Jan 2008, at 02:11, Prof Brian Ripley wrote:

> The output from sessionInfo() the posting guide asked for would have  
> been very helpful here.

You are right, sorry about that:


 > library(chron)
 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] chron_2.3-16


>
>
> I think the problem is likely to be that these are not standard French
> abbreviations according to my systems.

I was ready to blame Excel for the use of non-standard abbreviations,
but I would have been wrong: judging from my system settings, "janv" is
a Mac OS X choice. I am not sure what would be a bullet-proof authority
on French abbreviations. My dictionary was of no help, but Wikipedia
seems to endorse the Mac OS X and Windows use of "janv":

<http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>

> On Linux I get
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> and on Windows
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-janv.-08"
>
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc.-07"

I tried this too:
 > format(Sys.Date(), "%d-%b-%y")
[1] "31-jan-08"
 > format(Sys.Date()-50, "%d-%b-%y")
[1] "12-déc-07"

I am lost here: since the OS uses "janv", why does R give "jan" in the
output above?

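A possible explanation, offered as an assumption rather than something confirmed in this thread: format() takes the names for "%b" from the locale data behind the LC_TIME category, which need not match the abbreviations shown in the system settings. The sketch below only prints what R itself uses; the variable names are illustrative.

```r
# Locale category that controls month and day names in format()/strptime().
Sys.getlocale("LC_TIME")

# Abbreviated month names as R produces them under that locale.
# The dates are built from numeric strings, so no month name is parsed here.
first_of_month <- as.Date(sprintf("2000-%02d-01", 1:12))
r_month_abb <- format(first_of_month, "%b")
r_month_abb
```

Comparing r_month_abb with the abbreviations in the data file shows at a glance whether "%b" can be expected to parse them.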
>
>
> And yes, chron is US-centric and so only allows English names.
>
> Assuming you know exactly what is meant by 'French short format', I  
> think the simplest thing to do is to set up a table by
>
> tr <- month.abb
> names(tr)[1] <- c("janv")  # complete it
>
> x <- "9-janv-08"
> x2 <- strsplit(x, "-")
> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x,  
> collapse="-")})
> as.Date(x3, format = "%d-%b-%y")

Thank you Prof Ripley, although I'll have to do my homework to fully  
understand what is happening with the function you wrote.
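For reference, a minimal sketch of the lookup-table idea with the table filled in. It differs from the quoted code in one respect: it maps the French abbreviations to month numbers instead of English names, so that parsing does not depend on what "%b" means in the current locale. The names fr_months, month_num and parse_fr_date are hypothetical, and the abbreviations are assumed to be the ones that appear in the data file.

```r
# French short month names as they appear in the file (assumed spellings).
fr_months <- c("janv", "fév", "mars", "avr", "mai", "juin",
               "juil", "août", "sept", "oct", "nov", "déc")

# Named vector: French abbreviation -> two-digit month number.
month_num <- setNames(sprintf("%02d", 1:12), fr_months)

parse_fr_date <- function(x) {
  # Split "9-janv-08" into c("9", "janv", "08").
  parts <- strsplit(trimws(x), "-", fixed = TRUE)
  numeric_form <- vapply(parts, function(p) {
    m <- sub("\\.$", "", p[2])            # tolerate a trailing dot ("janv.")
    paste(p[1], month_num[m], p[3], sep = "-")
  }, character(1))
  # Unknown abbreviations become "d-NA-yy", which as.Date() turns into NA.
  as.Date(numeric_form, format = "%d-%m-%y")
}

parse_fr_date(c("7-déc-07", "9-janv-08"))
# [1] "2007-12-07" "2008-01-09"
```

The result is a Date vector, which can be passed on to chron if that class is needed.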

But I wonder why I cannot make this a Date object:

 > x <- "9-janv-08"
 > x2 <- strsplit(x, "-")
 > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
 > as.Date(x3, format = "%d-%b-%y")
[1] "2008-01-09"
 > class(x3)
[1] "character"
 > x4 <- as.Date(x3, format = "%d-%b-%y")

  *** caught bus error ***
address 0x8, cause 'non-existent physical address'

Traceback:
  1: strptime(x, format)
  2: as.Date.character(x3, format = "%d-%b-%y")
  3: as.Date(x3, format = "%d-%b-%y")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

The problem may be my system, as I get the same error when trying
Gabor's suggestions (below).

On 31 Jan 2008, at 00:21, Gabor Grothendieck wrote:
> Suppose we have:
>
> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> "16-janv-08", "18-janv-08")
>
> Try this (where we are assuming the just released chron 2.3-17):
>
> library(chron)
> Sys.setlocale("LC_ALL", "French")
> as.chron(as.Date(dd, "%d-%b-%y"))
>
> # or with chron 2.3-16 last line is replaced with:
> chron(unclass(as.Date(dd, "%d-%b-%y")))
>

 > library(chron)
 > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")
 > Sys.setlocale("LC_ALL", "French")
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "French") :
   la requête OS pour spécifier la localisation à "French" n'a pas pu  
être honorée
 > chron(unclass(as.Date(dd, "%d-%b-%y")))

  *** caught bus error ***
address 0x8, cause 'non-existent physical address'

Traceback:
  1: strptime(x, format)
  2: as.Date.character(dd, "%d-%b-%y")
  3: as.Date(dd, "%d-%b-%y")
  4: inherits(dates., "dates")
  5: chron(unclass(as.Date(dd, "%d-%b-%y")))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
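The warning in the transcript above says that the OS did not honour the locale name "French". Locale names are platform-specific, so a name that works on Windows may not exist on Mac OS X or Linux. A hedged sketch of a helper that tries several spellings and reports which one, if any, was accepted; set_french_time_locale and the candidate list are illustrative assumptions, not part of the suggestions quoted here.

```r
# Try several spellings of a French locale and keep the first one the OS accepts.
# Sys.setlocale() returns "" (with a warning) when a name is not honoured.
set_french_time_locale <- function(
    candidates = c("fr_FR.UTF-8", "fr_FR", "French_France.1252", "French")) {
  for (loc in candidates) {
    res <- suppressWarnings(Sys.setlocale("LC_TIME", loc))
    if (nzchar(res)) return(res)    # name accepted by the OS
  }
  NA_character_                     # none of the candidates worked
}

old <- Sys.getlocale("LC_TIME")     # remember the current setting
set_french_time_locale()
Sys.setlocale("LC_TIME", old)       # restore it afterwards
```

Only LC_TIME is changed here, since that is the category that governs month names; this leaves collation and number formatting untouched.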

> If those don't work (the above didn't work on my Vista system but this
> is system dependent and
> might work on yours)  then try this alternative
>
>> library(chron)
>> library(gsubfn)
>> Sys.setlocale('LC_ALL','French')
> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by  
>> = "month"), "%b")
>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y,  
>> sep = "/"))
>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>  [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
>  [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
> [17] 01/18/08

Again, this Sys.setlocale call does not work for me and the use of  
as.Date crashes my copy of R:

 > library(chron)
 > library(gsubfn)
Le chargement a nécessité le package : proto
 > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")

  *** caught bus error ***
address 0x8, cause 'non-existent physical address'

Traceback:
  1: strptime(x, f)
  2: fromchar(x)
  3: as.Date.character("2000-01-01")
  4: as.Date("2000-01-01")
  5: seq(as.Date("2000-01-01"), length = 12, by = "month")
  6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"),      
"%b")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

However, if I replace that call with a hard-coded vector of month
abbreviations, the rest of Gabor's solution works:

 > library(chron)
 > library(gsubfn)
Le chargement a nécessité le package : proto
 > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin",
+ "juil", "août", "sept", "oct", "nov", "déc")
 > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")
 > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
 > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
  [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
  [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
 [17] 01/18/08

So thanks again. I will try to reinstall R on my computer and see if I  
still get these errors.


Denis

>
>
>
> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net>  
> wrote:
>> Hello R users,
>>
>> I have to import a file with one column containing dates written in
>> French short format, such as:
>>
>>   7-déc-07
>>  11-déc-07
>>  14-déc-07
>>  18-déc-07
>>  21-déc-07
>>  24-déc-07
>>  26-déc-07
>>  28-déc-07
>>  31-déc-07
>>  2-janv-08
>>  4-janv-08
>>  7-janv-08
>>  9-janv-08
>> 11-janv-08
>> 14-janv-08
>> 16-janv-08
>> 18-janv-08
>>
>> There are other columns for other (numeric) variables in the data
>> file. In my read.csv2 statement, I indicate that the date column must
>> be imported "as.is" to keep it as character.
>>
>> I would like to transform this into a date object in R. So far I've
>> used chron for my dates and times needs, but I am willing to change if
>> another object/package will ease the task of importing these dates.
>>
>> My reading of the chron help led me to believe that the formats it
>> understands are only month names in English.
>>
>> Are there other "formats" I can use with chron, or must I somehow edit
>> this character variable to replace French month names with English ones
>> (or numbers from 1 to 12)?
>>
>> Thanks in advance,
>>
>> Denis
>> p.s. I read this in digest mode, so I'll get your replies faster if
>> you cc to my email


