
Using Arial in R figures destined for PLOS ONE

[This article was first published on From the Bottom of the Heap - R, and kindly contributed to R-bloggers.]

Despite the refreshing change that the journal PLOS ONE represents in terms of open access, and as an alternative to the stupidity that is quality/novelty selection by the two or three people that review a paper, its submission requirements are far less progressive. Yes, they make you jump through a lot of hoops getting your figures and tables just so, and I can appreciate why they want some control over this in terms of the look and feel of the journal. A couple of things grate though: the choice of EPS as the figure format, and the requirement to use Arial in figures.

The choice of EPS is a pain, but can be worked round relatively easily and R can output to an EPS file via the postscript() device, as long as you follow a few basic guidelines, which I’ll cover below. But the use of Arial! facepalm

Firstly, you may not legally be able to install these fonts on your computer (even though they are available in several forms on the internet) unless you have a licence for a Microsoft product that ships them, though given the dominance of Windows in the consumer PC market, most people will have a valid Windows licence somewhere. Secondly, you need to work very hard to use these fonts in some applications, including R as we’ll see, simply because those applications were built to use different or open font definitions.

What is doubly frustrating about this is that there are entirely free and open fonts that could be mandated by PLOS ONE. The Liberation Fonts suite, for example, is one such set of fonts, the creation of which was sponsored by Red Hat. The aim was to provide a font set that is metric compatible (i.e. the glyphs occupy the same physical space) with Microsoft’s Arial, Times New Roman, and Courier New fonts that are prevalent in the Windows world. The Liberation Fonts aren’t copies of the Microsoft ones, but for a given string, they should occupy the same amount of real estate in the document or on screen. Or PLOS ONE could have stuck with the standard set of PostScript fonts, for which there are free equivalents.

Bob O’Hara raised this issue over a year ago, and Michael Eisen took the time to comment that this was an acknowledged issue and indicated that the problems stemmed from the publishing tools used by PLOS.

Despite the draconian restrictions, PLOS ONE does have a pretty good set of instructions or tips to go alongside them, to help authors prepare figures for the journal. These instructions even include some tips on creating your figures in R with the Arial font family. These instructions basically involve converting the .ttf (TrueType font) files into .afm (Adobe Font Metric) files via the tt2afm utility and subsequently registering the .afm files with R’s postscript() plotting device.

There are a few problems with these instructions. Firstly, they refer to an older way of specifying font families (as a vector of paths to four or five .afm files; the fifth file would be for the Symbol font, and if missing R will use the default); there is no reason to presume that R will continue to maintain this backwards-compatible behaviour. Secondly, they require the user to get and install the tt2afm utility (or some other utility that achieves the same job). Thirdly, the instructions are incomplete if you wish to have the figure reproduce on any system; the fonts need to be embedded for those users that don’t have Arial installed. Admittedly, PLOS ONE’s production system will have these fonts, so the omission is irrelevant for submission, but it is something one may need to consider when working with colleagues using a range of OSes.

The new way of referring to font families is a little more convoluted in modern versions of R. Thankfully, a very simple solution is available that handles the registration of fonts with R’s graphics devices and, as an added bonus, will convert .ttf files into the Type1 equivalents. This solution is Winston Chang’s extrafont package. In the code chunks below, I’ll walk you through installing and using the package to produce a figure suitable for submission to PLOS ONE or any other journal that demands the use of the proprietary Arial font.

First up, install the extrafont package from your local CRAN mirror. For the code below to work, you’ll need version 0.15 or later, which was released to CRAN a few days ago (as of writing). extrafont relies on two additional packages: extrafontdb, which holds the font database, and Rttf2pt1, which provides the ttf2pt1 program used to read the TrueType files.

As all of this is available on CRAN, thanks to Winston and the CRAN maintainers, you can get up and running simply by installing extrafont via

install.packages("extrafont")

Assuming that the package installs for you, then each time you wish to use it, you need to load the package into your R session (as with most other R packages)

library("extrafont")
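Because the code in this post needs version 0.15 or later, it can be worth confirming which version ended up installed. The following is only a minimal sketch; the helper name version_ok() is made up for this illustration and is not part of extrafont.

```r
## Hypothetical helper (not part of extrafont): is an installed version
## at least as new as the one required?
version_ok <- function(installed, required = "0.15") {
    package_version(installed) >= package_version(required)
}

## Only query the version if the package is actually installed
if (nzchar(system.file(package = "extrafont"))) {
    version_ok(as.character(packageVersion("extrafont")))
} else {
    message("extrafont is not installed; install it from CRAN first")
}
```

Comparing package_version objects, rather than strings, means that 0.9 is correctly treated as older than 0.15.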

The first time you use the package, you will need to register fonts with the extrafont database. This process will search your computer for fonts, register them with the database and convert the metric information in the .ttf files into .afm equivalents, plus a number of other steps like linking to the .ttf files for the actual glyphs (the .afm files only contain the metrics or size and other metadata for the font, not the individual characters or glyphs, which only live in the .ttf files) to allow embedding. To initiate this process, run

font_import()

This can take some time, depending on how many fonts you have installed on your system. However, you only need to do this once (assuming you don’t install fonts regularly) or at the most after you’ve added fonts to your system. You’ll need to confirm the process when asked (just hit y followed by return), and as it is doing its thing, you should see a series of statements printed to the console as each font is found and converted, e.g.

> font_import()
Importing fonts may take a few minutes, depending on the number of fonts and the speed of the system.
Continue? [y/n] y
Scanning ttf files in /usr/share/fonts/ ...
Extracting .afm files from .ttf files...
/usr/share/fonts/dejavu/DejaVuSans-Bold.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSans-Bold
/usr/share/fonts/dejavu/DejaVuSans-BoldOblique.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSans-BoldOblique
/usr/share/fonts/dejavu/DejaVuSans-ExtraLight.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSans-ExtraLight
/usr/share/fonts/dejavu/DejaVuSans-Oblique.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSans-Oblique
/usr/share/fonts/dejavu/DejaVuSans.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSans
/usr/share/fonts/dejavu/DejaVuSansCondensed-Bold.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSansCondensed-Bold
/usr/share/fonts/dejavu/DejaVuSansCondensed-BoldOblique.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSansCondensed-BoldOblique
/usr/share/fonts/dejavu/DejaVuSansCondensed-Oblique.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSansCondensed-Oblique
/usr/share/fonts/dejavu/DejaVuSansCondensed.ttf => /home/gavin/R/build/3.0-patched/library/extrafontdb/metrics/DejaVuSansCondensed
....
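If only Arial is needed, the import can be restricted rather than converting every font on the system. The sketch below assumes a version of extrafont whose font_import() accepts a pattern argument (a regular expression matched against the font file names); the file names shown are hypothetical and will differ between systems.

```r
## Hypothetical file names, as they might appear on Linux and Windows
ttf_files <- c("Arial.ttf", "Arial_Bold.ttf", "arialbd.ttf",
               "DejaVuSans.ttf", "Verdana.ttf")

## A pattern that picks out only the Arial files, whatever the case of
## the leading letter
arial_pattern <- "^[Aa]rial"
ttf_files[grepl(arial_pattern, ttf_files)]

## Import only the matching fonts. This writes to the extrafont database,
## so it is guarded to run in an interactive session only.
if (interactive() && nzchar(system.file(package = "extrafont"))) {
    extrafont::font_import(pattern = arial_pattern)
}
```

Checking the pattern against a vector of file names first is a cheap way to see what would be imported before committing to the conversion.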

Once this is finished, you can get a list of fonts it found and registered via the fonts() function

## When it is finished, look at the fonts it has found
fonts()

On my system, this returned

> fonts()
 [1] "Abyssinica SIL"         "Andale Mono"            "Arial Black"           
 [4] "Arial"                  "Comic Sans MS"          "Courier New"           
 [7] "DejaVu Sans"            "DejaVu Sans Light"      "DejaVu Sans Condensed" 
[10] "DejaVu Sans Mono"       "DejaVu Serif"           "DejaVu Serif Condensed"
[13] "Droid Arabic Naskh"     "Droid Sans"             "Droid Sans Arabic"     
[16] "Droid Sans Armenian"    "Droid Sans Devanagari"  "Droid Sans Ethiopic"   
[19] "Droid Sans Fallback"    "Droid Sans Georgian"    "Droid Sans Hebrew"     
[22] "Droid Sans Mono"        "Droid Sans Tamil"       "Droid Sans Thai"       
[25] "Droid Serif"            "FreeMono"               "FreeSans"              
[28] "FreeSerif"              "Georgia"                "Impact"                
[31] "Jomolhari"              "Khmer OS"               "Khmer OS Content"      
[34] "Khmer OS System"        "Liberation Mono"        "Liberation Sans"       
[37] "Liberation Sans Narrow" "Liberation Serif"       "LKLUG"                 
[40] "Lohit Assamese"         "Lohit Bengali"          "Lohit Devanagari"      
[43] "Lohit Gujarati"         "Lohit Kannada"          "Lohit Oriya"           
[46] "Lohit Punjabi"          "Lohit Tamil"            "Lohit Telugu"          
[49] "Meera"                  "Mingzat"                "NanumGothic"           
[52] "NanumGothicExtraBold"   "Eeyek Unicode"          "Nuosu SIL"             
[55] "OpenSymbol"             "Padauk"                 "PT Sans"               
[58] "PT Sans Narrow"         "Tahoma"                 "Times New Roman"       
[61] "Trebuchet MS"           "Verdana"                "VL Gothic"             
[64] "Waree"                  "Webdings"
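With a list this long, it is easier to test for the family you need than to scan the output by eye. Below is a minimal sketch; have_families() is a hypothetical helper, and the hand-written vector merely stands in for the result of fonts().

```r
## Hypothetical helper: which of the requested families are registered?
have_families <- function(wanted, available) {
    setNames(wanted %in% available, wanted)
}

## Example with a hand-written vector standing in for the output of fonts()
available <- c("Arial", "DejaVu Sans", "Liberation Sans")
have_families(c("Arial", "Times New Roman"), available)

## With extrafont installed and fonts imported, use the real list instead
if (nzchar(system.file(package = "extrafont"))) {
    have_families("Arial", extrafont::fonts())
}
```

The name used here must match the entry in the list exactly, since the same string is later passed as the family argument of the graphics device.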

Having loaded the package (or discovered and registered fonts with the system), you need to register the fonts with a particular graphics device. In particular, you need to do this for the pdf() or postscript() devices. For PLOS ONE, you’ll be wanting the postscript() device as that journal requires EPS format files. Here I’ll show both, as PDF is generally more useful than EPS when passing figures between colleagues.

The loadfonts() function is used to register fonts and by default it will register them with the pdf() device. The postscript() device can be specified via the device = "postscript" argument

loadfonts() ## for pdf()
## or
loadfonts(device = "postscript") ## for postscript()

R will print messages about which fonts have been registered; you can silence this by adding quiet = TRUE to the call to loadfonts().
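To confirm that the registration worked, the families known to each device can be listed with the base R functions postscriptFonts() and pdfFonts() from grDevices. This is a small sketch rather than part of the extrafont workflow itself; "Arial" will only show up if the import and registration steps above succeeded on your system.

```r
## Families currently known to the postscript() and pdf() devices
ps_families  <- names(grDevices::postscriptFonts())
pdf_families <- names(grDevices::pdfFonts())

## R's built-in families such as "Helvetica" are always present;
## "Arial" should appear only after the fonts have been registered
c(Helvetica = "Helvetica" %in% ps_families,
  Arial     = "Arial" %in% ps_families)
"Arial" %in% pdf_families
```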

Now we are in a position to produce a plot and export it from R in PDF or EPS formats. By way of illustration, I’ll use a kernel density estimate (KDE) of the probability density function of the waiting times between eruptions of the Old Faithful geyser, available in the object faithful (note that this is now available without an explicit data() call)

dens <- with(faithful, density(waiting))
plot(dens)

Exporting EPS files via postscript()

To export an EPS file, we use the postscript() device, though to get true EPS output, some additional arguments need to be specified. These are paper = "special", onefile = FALSE, and horizontal = FALSE, as used in the call below.

In addition, we need to tell the device to use a particular font family; in this case we use family = "Arial". The character value you pass to family is one of the entries returned by fonts() (see above). Here is a call that will generate an EPS file of the KDE of the Old Faithful data

postscript("myfig.eps", height = 6, width = 6.83,
           family = "Arial", paper = "special", onefile = FALSE,
           horizontal = FALSE)
op <- par(mar = c(5, 4, 0.05, 0.05) + 0.1)
plot(dens, main = "", xlab = "Duration between eruptions (minutes)")
par(op)
dev.off()

Notice that here I use width = 6.83 (in inches), which corresponds to a 3-column figure in the PLOS ONE world. Other dimensions for figures can be found in the PLOS ONE Guidelines for Figure and Table Preparation.
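Since the same three EPS arguments are needed every time, they can be wrapped up in a small function. The sketch below is one possible way to do this; eps_device() is a hypothetical name, and it defaults to R's built-in "Helvetica" family so that it runs even where Arial has not been registered.

```r
## Hypothetical convenience wrapper around postscript() that always sets
## the arguments needed for EPS output
eps_device <- function(file, width = 6.83, height = 6,
                       family = "Helvetica") {
    postscript(file, width = width, height = height, family = family,
               paper = "special", onefile = FALSE, horizontal = FALSE)
}

## Usage: write a small figure to a temporary file with the default family
eps_file <- tempfile(fileext = ".eps")
eps_device(eps_file)
plot(density(faithful$waiting), main = "")
dev.off()

## The first line of the file identifies it as EPS via an EPSF marker
readLines(eps_file, n = 1)
```

Once the fonts have been registered with the postscript() device, passing family = "Arial" to the wrapper gives the same result as the explicit call above.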

Exporting PDF files via pdf()

A similar invocation is required for the pdf() device, but there is no horizontal argument:

pdf("myfig.pdf", height = 6, width = 6.83,
           family = "Arial", paper = "special", onefile = FALSE)
op <- par(mar = c(5, 4, 0.05, 0.05) + 0.1)
plot(dens, main = "", xlab = "Duration between eruptions (minutes)")
par(op)
dev.off()

Setting up for use on other devices

If a device has a family argument, then you can use the following to open a new device using a given font family (here "Arial")

dev.new(family = "Arial")

This is useful if you want to visualise how the figure will look as you create it. However, it works using the system font provision (on Linux and MacOS X that means Pango), not via extrafont, so do read ?X11, ?windows, or ?quartz for details on how fonts are resolved there.

Embedding fonts

In order for the figure to display properly on any computer with a suitable viewer, the person viewing the file we just generated will need to have Arial installed on their system. Embedding the font (or a subset of glyphs actually used) avoids this situation, at the expense of increased file size. However, embedding fonts with R requires the use of Ghostscript, which needs to be installed. As I mentioned earlier, technically you don’t need this for submission to PLOS ONE, but I include instructions for embedding fonts for completeness.

The extrafont package has a wrapper to the standard embedFonts() function in base R: embed_fonts(). This takes the path to the input EPS file, the path/filename for the outputted file with embedded fonts (this can be missing, in which case the input file is overwritten!), plus a format argument that you can usually ignore (see ?embedFonts for details), and an optional argument, options. The options argument is very useful for passing arguments along to the Ghostscript program. On my system, the default Ghostscript output device was set to either US Letter or A4 paper, so the EPS figure had a large amount of white space above and to the right of the figure. This obviously needed addressing, and for that I used the EPSCrop argument, which has to be given as you would include it if working with Ghostscript directly; hence in the code below I pass -dEPSCrop to the options argument.

## EPS file
embed_fonts("./myfig.eps", outfile = "./myfig-embed.eps",
            options = "-dEPSCrop")
## PDF file
embed_fonts("./myfig.pdf", outfile = "./myfig-embed.pdf")

Note the PDF file doesn’t need any additional Ghostscript arguments, so I omit the options argument in that case.
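Embedding will fail if R cannot find Ghostscript, so it may help to check for it first. The sketch below uses find_gs_cmd() from the base tools package, which returns an empty string when no executable is found; the Windows path in the comment is purely hypothetical and must be adapted to your own installation.

```r
## Look for a Ghostscript executable; the result is "" if none is found
gs <- tools::find_gs_cmd()

if (nzchar(gs)) {
    message("Ghostscript found at: ", gs)
} else {
    message("Ghostscript not found; install it or set R_GSCMD")
    ## Hypothetical Windows path, adjust to your own installation:
    ## Sys.setenv(R_GSCMD = "C:/Program Files/gs/gs9.05/bin/gswin32c.exe")
}
```

The R_GSCMD environment variable is the same one consulted by embedFonts(), so setting it once per session is enough for the embedding calls above.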

And that is it. Once you’ve set up extrafont and allowed it to search for and convert any TrueType fonts on your system, using those fonts is as simple as registering them with a particular device via loadfonts(device = "foo") (where "foo" is one of the supported devices; see ?loadfonts), and then specifying the family name of the font when creating the plotting device.

If you have any suggestions for improving these instructions, let me know in the Comments; I should pass them along to PLOS ONE at some point so that they can add them to their Guidelines for Figure and Table Preparation.

