Importing an Excel Workbook into R

[This article was first published on The Pith of Performance, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The usual route for importing data from spreadsheet applications like Excel or OpenOffice into R involves first exporting the data in CSV format. A newer (c. 2011) and more efficient CRAN package, called XLConnect, facilitates reading an entire Excel workbook and manipulating worksheets and cells programmatically from within R.

XLConnect doesn’t require a running installation of Microsoft Excel or any other special drivers to be able to read and write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE). Moreover, XLConnect can handle older .xls (BIFF) as well as the newer .xlsx (Office Open XML) file formats. Internally, XLConnect uses Apache POI (Poor Obfuscation Implementation) to manipulate Microsoft Office documents.

As a simple demonstration, the following worksheet, from a Guerrilla Capacity Planning workbook, will be displayed in R.

First, the Excel workbook is loaded as an R object:

require(XLConnect)
wb <- loadWorkbook("~/.../XLConnect/82scaling.xlsx")

The structure of the workbook object is:

> str(wb)
Formal class 'workbook' [package "XLConnect"] with 2 slots
  ..@ filename: chr ".../XLConnect/82scaling.xlsx"
  ..@ jobj    :Formal class 'jobjRef' [package "rJava"] with 2 slots
  .. .. ..@ jobj  : 
  .. .. ..@ jclass: chr "com/miraisolutions/xlconnect/integration/r/RWorkbookWrapper"

Next, the workbook object is converted to a data frame:

df <-  readWorksheet(wb, 
    sheet = "SGI-NUMA", 
    startCol = which(LETTERS=="A"), 
    startRow = 3, 
    endCol = which(LETTERS=="P")
    endRow = 15, 
)

which can be compared with the original worksheet (above):

  Measured KRays.Sec      RelCap Efficiency Inverse Fit Transform Trendline   Parameters
1   CPU (p)      X(p) C=X(p)/X(1)        C/p     p/C p-1   (p/C)-1 Quadratic Coefficients
2         1        20        1.00       1.00    1.00   0      0.00         a   5.0000E-06
3         4        78        3.90       0.98    1.03   3      0.03         b       0.0500
4         8       130        6.50       0.81    1.23   7      0.23         c       0.0000
 ...

How to apply other XLConnect functions is described in the associated vignette.

To leave a comment for the author, please follow the link and comment on their blog: The Pith of Performance.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)