[ESS] European characters in R.

Martin Maechler maechler at stat.math.ethz.ch
Wed Nov 25 09:32:13 CET 2009


>>>>> "GJ" == Gérald Jean <gerald.jean at videotron.ca>
>>>>>     on Mon, 23 Nov 2009 12:29:44 -0500 writes:

    GJ> Hello there,
    GJ> I use:

    GJ> R version 2.9.2 (2009-08-24)
    GJ> Copyright (C) 2009 The R Foundation for Statistical Computing
    GJ> ISBN 3-900051-07-0

    GJ> on Ubuntu 9.10, Emacs-22.2.1 and ESS-5.4

    GJ> I have a data file containing lots of European characters, French,
    GJ> German, Italian and so on.  I can read it ok in R but I can't display
    GJ> the characters correctly.  Here is a simple example:

    >> ttt.g <- "gérald"
    GJ> Erreur : caractères multioctets incorrects dans l'analyse de code
    GJ> (parser) Ã  la ligne 1
    [English: "Error: invalid multibyte character in parser at line 1"]

    GJ> outputting the colnames of my data set I get:

    >> names(ttt)
    GJ>  [1] "ID"           "Domaine"      "Nom"          "MillÃ.Â.sime" "Pays"
    GJ>  [6] "RÃ.Â.gion"    "Appellation"  "Vignoble"     "Couleur"      "Alcool"
    GJ> [11] "Classement"   "Cuve"         "mois"         "Bio"          "CÃ.Â.page..1"
    GJ> [16] "X."           "CÃ.Â.page..2" "X..1"         "CÃ.Â.page..3" "X..2"
    GJ> [21] "CÃ.Â.page..4" "X..3"         "CÃ.Â.page..5" "X..4"         "Prix"
    GJ> [26] "QuantitÃ.Â."  "Internet"

    GJ> The locale is set as follows:

    >> Sys.getlocale()
    GJ> [1]
    GJ> "LC_CTYPE=fr_CA.UTF-8;LC_NUMERIC=C;LC_TIME=fr_CA.UTF-8;LC_COLLATE=fr_CA.UTF-8;LC_MONETARY=C;
    GJ> LC_MESSAGES=fr_CA.UTF-8;LC_PAPER=fr_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_CA.UTF-8;
    GJ> LC_IDENTIFICATION=C"

    GJ> I tried to play with Emacs' coding systems with no luck!  Any idea on
    GJ> how to handle this?

Yes, "in principle".

For some reason, Ubuntu has decided to keep ISO-latin1 rather than
Unicode (UTF-8) as its default, at least for Emacs.
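
To see what such a mismatch looks like, here is a small Emacs Lisp
illustration that can be evaluated in *scratch* (C-x C-e after the last
paren).  It encodes "é" as UTF-8 and then deliberately decodes the
resulting bytes as latin-1, which gives the "Ã"-style garbage shown in
the question.  That the column names show "Ã.Â." rather than a single
"Ã" suggests, as a guess only, that the wrong decoding happened more than
once on the way to the screen.

```elisp
;; Illustration only: UTF-8 bytes misread as ISO-latin1.
(let* ((utf8-bytes (encode-coding-string "é" 'utf-8))            ; two bytes: #xC3 #xA9
       (misread    (decode-coding-string utf8-bytes 'latin-1)))  ; each byte becomes one char
  (list (length utf8-bytes) misread))
;; evaluates to (2 "Ã©")
```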

But there is a whole set of commands, with keystrokes, for dealing
with coding systems inside Emacs.
The keystrokes all start with  'C-x RET',  and that is the only
thing you need to remember, since you can quickly get the full list via
C-x RET C-h

   {General principle :
       <key-sequence-beginning> C-h
       always gives help (C-h) on all key sequences beginning
       with <key-sequence-beginning>}

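So, IIRC, you go into the *R* buffer and press
C-x RET p {p: "buffer-[p]rocess"},
which runs set-buffer-process-coding-system and prompts for the coding
system to use for output from the process and for input to it.  Given
your UTF-8 locale, utf-8 is presumably the one to enter for both.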
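
If you would rather not type this in every session, the same command can
be called from Lisp.  A minimal sketch for an init file follows; it
assumes that the hook `ess-post-run-hook' is available in your ESS
version and runs in the process buffer (please check with C-h v), and
that UTF-8 is the right choice for a fr_CA.UTF-8 locale:

```elisp
;; Sketch only: use UTF-8 in both directions for each new ESS process.
;; Assumptions: `ess-post-run-hook' exists in your ESS version and runs
;; in the process buffer after the inferior process has started.
(add-hook 'ess-post-run-hook
          (lambda ()
            ;; Do nothing unless the current buffer really has a process.
            (when (get-buffer-process (current-buffer))
              ;; Arguments: DECODING (output from R), ENCODING (input sent to R).
              (set-buffer-process-coding-system 'utf-8 'utf-8))))
```

For a one-off test, typing M-: (set-buffer-process-coding-system 'utf-8 'utf-8)
in the *R* buffer should have the same effect as C-x RET p.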

You may need a bit more, but this should get you going in
the right direction!

Regards,
Martin Mächler, ETH Zürich

    GJ> Thanks,
    GJ> Gérald Jean


