Importing data directly from MS Excel

August 10, 2012
By

(This article was first published on Insights of a PhD student » R, and kindly contributed to R-bloggers)

R is great for exploring, analysing and graphing your valuable data. No question about it. Unfortunately though, there’s no base package support for importing data directly from MS Excel. This means that you have to faff about saving it in another format, and THEN import this new file. This just adds another file to your computer, taking up more space or cluttering (if your machine looks anything like mine at least) your folders.

There are methods though around this. You CAN import files directly from Excel. You can even write Excel files.

Basically, there’s two options, the gdata package which relies on the perl language (dont fret, you dont have to learn perl too, just download it and forget about it!), or the xlsReadWrite package which uses Java (again, not to worry, this is used in web browsers, so you probably already have it).

gdata supports the xlsx format, while xlsReadWrite only supports xls. xlsReadWrite also makes creating xls files super easy! I use this for taking analysis results, putting them into a table in R, create the xls file with the table in and then copy that into Word for inclusion in my manuscripts. Can save lots of copying and pasting, especially if youre running lots of tests or have lots of means and SEs to put in a table.

Ive created a small xlsx file to show how it works. You can download it here: http://db.tt/zOCGC4ve or alternatively just cite the path in the call to the functions.

install.packages(pkgs="gdata")
library(gdata)
trying URL 'http://db.tt/zOCGC4ve'
Content type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' length 8759 bytes
opened URL

Person  Potato Carrot Tomato
1       1       4     10      1
2       2      10      5      1
3       3       6     10      8
4       4       9      2      4
5       5       1      3      9
6       6      10      9      7

Easy huh? Theres loads of other options available too. If you have multiple sheets in your workbook, you can reference them by name or number:

dat <- read.xls("http://db.tt/zOCGC4ve", sheet="bread")
trying URL 'http://db.tt/zOCGC4ve'
Content type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' length 9363 bytes
opened URL

Person White Brown Wholewheat
1      1     4     6          9
2      2     2     9         10
3      3     4     9          8
4      4     9     7          8
5      5     1     1          8
6      6     4     1          3

Handy! Other options are much the same as those in read.table and read.csv, such as head, skip, nrow, dec, etc.

If you dont install perl (referring here to gdata of course) that R cant find the executable and gives you an error message. If this happens you need to find the perl.exe file and specify it in the call to read.xls as below (this is where my perl exe is located, yours might be different).

read.xls(..., perl="C:/Strawberry/perl/bin/perl.exe")

The read.xls function in the xlsReadWrite package works in much the same way so I’m not going to go into it.

UPDATE: Ive just found there is also a package called XLConnect. Ive yet to play with that though…

UPDATE 2: Theres also an xlsx package which I am yet to try which seems to support both reading and writing to xlsx files, which is naturally handy to keep everything the same (and even adding sheets to the original data files perhaps…)

HTH