Quantitative Ecology 2010-11-10 14:56:00

November 10, 2010
(This article was first published on Quantitative Ecology, and kindly contributed to R-bloggers)

At last... For some time, XEmacs has been displaying odd characters in place of the quotation marks used in R help files. This was driving me up the wall because it makes the help files (and R output in general) very hard to read. I finally diagnosed the problem: XEmacs was not recognizing UTF-8 encoding. Below is a quote from Marjan Parsa that describes how to set up Emacs and XEmacs to automatically detect UTF-8 files. My quality of life has already improved.


How can I get XEmacs to work with UTF-8 files?

* Set up XEmacs so that it autodetects UTF-8 encoded files.
* In the case of starting a new file in a non-UTF-8 locale, set the file coding system to UTF-8 using C-x RET f.
* If running XEmacs in non-graphical mode in a UTF-8 xterm, set the terminal coding system to UTF-8 using C-x RET t.

If you want XEmacs to load UTF-8 files correctly, add the following lines to your ~/.xemacs/init.el:

(require 'un-define)
(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)

Note that Emacs does not deal well with these additions, so if you share the same init file with Emacs, use the following guarded version instead, which runs them only under XEmacs and keeps Emacs from complaining:

;; Are we running XEmacs or Emacs?
(defvar running-xemacs (string-match "XEmacs\\|Lucid" emacs-version))

...

(when running-xemacs
  ;; enable Mule-UCS
  (require 'un-define)

  ;; by default XEmacs does not autodetect Unicode
  (set-coding-priority-list '(utf-8))
  (set-coding-category-system 'utf-8 'utf-8))

These lines will get XEmacs to load UTF-8 files in UTF-8 mode (it will display a "u" in the bottom left corner of your status bar). If you have already loaded a file and would like to start inputting UTF-8, you can use C-x RET f to set the file coding system to UTF-8. You may also have to set the terminal coding system to UTF-8. This seems to be necessary, for example, when XEmacs is run in non-graphical mode inside a UTF-8 enabled xterm. You can set the terminal coding system using C-x RET t.
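
The two key sequences can also be issued from Lisp, for instance from a small command kept in the init file. The following is a minimal sketch, assuming that C-x RET f and C-x RET t are bound to set-buffer-file-coding-system and set-terminal-coding-system (the usual bindings in Mule-enabled builds). The command name my-force-utf-8 is hypothetical and is not part of the quoted instructions.

```elisp
;; Hypothetical helper, not part of the quoted instructions.
;; Assumes a Mule-enabled build in which the `utf-8' coding system exists
;; (under XEmacs, after the (require 'un-define) setup shown earlier).
(defun my-force-utf-8 ()
  "Set the current buffer's file coding system, and the terminal's, to UTF-8."
  (interactive)
  ;; Same effect as C-x RET f utf-8 RET: the buffer is written as UTF-8 on save.
  (set-buffer-file-coding-system 'utf-8)
  ;; Same effect as C-x RET t utf-8 RET: matters when running inside an xterm.
  (set-terminal-coding-system 'utf-8))
```

Once defined, the command can be run with M-x my-force-utf-8. It is best used on new or plain-ASCII buffers, for the reason given in the caution that follows.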

Caution: I have had problems with XEmacs double encoding in the case where 1) the file contains UTF-8, 2) the file is loaded in non-UTF-8 mode, 3) the user switches to UTF-8 mode (using C-x RET f), 4) enters some text, and 5) saves. In other words, if your file already contains UTF-8 characters, make sure that it is loaded in UTF-8 mode before editing it.
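
The double encoding described in this caution can be reproduced outside any editing session. The sketch below is written for a recent GNU Emacs, not XEmacs: it builds the character from its code point with string, which is an assumption that does not carry over to XEmacs. The helper name my-hex-bytes is hypothetical. The code encodes é as UTF-8, misreads those bytes as Latin-1 (as a non-UTF-8 load would), and then encodes the result as UTF-8 again.

```elisp
;; Sketch for GNU Emacs; evaluate in *scratch* or with M-x ielm.
(defun my-hex-bytes (bytes)
  "Return the bytes of the unibyte string BYTES as space-separated hex."
  (mapconcat (lambda (b) (format "%02X" b)) bytes " "))

(let* ((original (string #xE9))                           ; the character é
       (utf8     (encode-coding-string original 'utf-8))  ; bytes as stored on disk
       (misread  (decode-coding-string utf8 'iso-8859-1)) ; loaded in non-UTF-8 mode
       (doubled  (encode-coding-string misread 'utf-8)))  ; saved after switching to UTF-8
  (list (my-hex-bytes utf8) (my-hex-bytes doubled)))
;; => ("C3 A9" "C3 83 C2 A9")
```

The two original bytes become four, which is why a doubly encoded file shows several odd characters wherever a single non-ASCII character used to be.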
