Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Introduction

Saving your R dataframe to a .csv can be useful; being able to view the data all at once can help to see the bigger picture. Often though, multiple dataframes, all pieces of the same project, need to be viewed this way and related back to one another. In this case viewing becomes far easier when these dataframes are written to .xlsx across multiple sheets in a single workbook. Not to mention the time and energy saved when you no longer have to find and open multiple files.

Four packages in R are available to do just this. I generated some test data (a 30000 x 40 dataframe with sampled values between 1 and 100) and tested each one with varying levels of success.

# Package use

## The xlsx package

I ran into problems very early on with this package. The installation was dependent on the rJava package which gave an error during installation. Given that we’re using a Unix platform, the fix to this was to run sudo apt-get install r-cran-rjava in console. Executing this command succesfully installed the package and after restarting R-studio I was able to install and load the xlsx package.

The next problem I ran into was a java out of memory error, specifically java.lang.OutOfMemoryError. This time the fix was allocating more memory to Java trough the options(java.parameters = "-Xmx40000m") setting. The memory increase amount (and therefore the amount of data that can be written) is dependent on the amount of memory your computer can allocate to R. In this case I have allocated 40gig, which most of us do not have laying around. Once this was done I was able write up to 10 worksheets without any problems, but again, I ran into memory problems when I tried to scale up.

## The XLConnect package

This package had all the same problems as the previous with one added extra. During testing, the writing process was done many times; each time it was done the previous .xlsx file saved as the same name needed to be deleted. If this wasn’t done R would continue attempting to write to the file indefinitely with no indications of stopping. There seems to be a fix for this using the createNames() function but it didn’t seem worth the effort.

## The WriteXLS package

This package was easy enough to use with no apparent errors. The problem was that because this method writes the data directly into a workbook, instead of first creating it locally within the R environment, it was extremely slow.

## The openxlsx package

Finally, I looked to the opnexlsx package. Again, no problems here; the most user friendly of the four.

# Microbenchmark test findings

After using each of the four packages I decided, taking usability, memory issues and speed into account, not to continue further with the WriteXLS and the XLConnect packages. I then tested the speed of the remaining two, openxlsx and xlsx, using the microbenchmark package with an evaluation number of 5 and varying sheet numbers.

1 Sheet with a 30000 x 40 dataframe run 5 times:

3 Sheets with a 30000 x 40 dataframe run 5 times:

5 Sheets with a 30000 x 40 dataframe run 5 times:

10 Sheets with a 30000 x 40 dataframe run 5 times:

Graphing the mean, median, min and max times for both the openxlsx and the xlsxpackages show that other than min time, the openxlsx package is faster.

# Conclusion

After testing each of the four packages using a dataframe of 30000 x 40 random numbers bewteen 1 and 100, I found that 2 of the packages, the XLConnectpackage and the WriteXLS package, were either not very user friendly or were simply too slow. The remaining two, the openxlsx package and the xlsx package, were then tested using the microbenchmark package with an evaluation number of 5 and various sheet numbers. The tests concluded that the openxlsx package was overall faster. With this in mind and with the ease of use for this package, it comes in as the favorite.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.