Importing an Excel Workbook into R

June 5, 2014
By

(This article was first published on The Pith of Performance, and kindly contributed to R-bloggers)

The usual route for importing data from spreadsheet applications like Excel or OpenOffice into R involves first exporting the data in CSV format. A newer (c. 2011) and more efficient CRAN package, called XLConnect, facilitates reading an entire Excel workbook and manipulating worksheets and cells programmatically from within R.

XLConnect doesn't require a running installation of Microsoft Excel or any other special drivers to be able to read and write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE). Moreover, XLConnect can handle older .xls (BIFF) as well as the newer .xlsx (Office Open XML) file formats. Internally, XLConnect uses Apache POI (Poor Obfuscation Implementation) to manipulate Microsoft Office documents.

As a simple demonstration, the following worksheet, from a Guerrilla Capacity Planning workbook, will be displayed in R.

First, the Excel workbook is loaded as an R object:


require(XLConnect)
wb <- loadWorkbook("~/.../XLConnect/82scaling.xlsx")

The structure of the workbook object is:


> str(wb)
Formal class 'workbook' [package "XLConnect"] with 2 slots
..@ filename: chr ".../XLConnect/82scaling.xlsx"
..@ jobj :Formal class 'jobjRef' [package "rJava"] with 2 slots
.. .. ..@ jobj :
.. .. ..@ jclass: chr "com/miraisolutions/xlconnect/integration/r/RWorkbookWrapper"

Next, the workbook object is converted to a data frame:


df <- readWorksheet(wb,
sheet = "SGI-NUMA",
startCol = which(LETTERS=="A"),
startRow = 3,
endCol = which(LETTERS=="P")
endRow = 15,
)

which can be compared with the original worksheet (above):


Measured KRays.Sec RelCap Efficiency Inverse Fit Transform Trendline Parameters
1 CPU (p) X(p) C=X(p)/X(1) C/p p/C p-1 (p/C)-1 Quadratic Coefficients
2 1 20 1.00 1.00 1.00 0 0.00 a 5.0000E-06
3 4 78 3.90 0.98 1.03 3 0.03 b 0.0500
4 8 130 6.50 0.81 1.23 7 0.23 c 0.0000
...

How to apply other XLConnect functions is described in the associated vignette.

To leave a comment for the author, please follow the link and comment on his blog: The Pith of Performance.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.