Split apply combine in R
The apply family of functions in R is incredibly powerful, yet often somewhat mysterious to newcomers. Bernd therefore gave an overview of the different apply functions and their cousins. The functions differ in the objects they take as input, e.g. vectors, arrays, data frames or lists, and in the outputs they return. Related functions include aggregate and ave. While functions like aggregate reduce the output size, others like ave return as many rows as the input object and repeat the results where necessary.
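To make the difference between reducing and repeating concrete, here is a minimal sketch in base R. The sales data frame and its values are made up for illustration and are not taken from Bernd's slides.

```r
# Toy data: two groups with three observations each (made-up values).
sales <- data.frame(
  region = c("north", "north", "north", "south", "south", "south"),
  amount = c(10, 20, 30, 1, 2, 3)
)

# Split the amounts by region, apply mean to each piece, combine the results:
# sapply returns one named value per group.
group_means <- sapply(split(sales$amount, sales$region), mean)

# aggregate reduces the output to one row per group.
reduced <- aggregate(amount ~ region, data = sales, FUN = mean)

# ave keeps the length of the input and repeats each group's result.
sales$region_mean <- ave(sales$amount, sales$region, FUN = mean)
```

Here reduced has two rows, one per region, whereas sales$region_mean has six entries, one for every row of the input.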
As an alternative to the base R functions, Bernd also touched on the **ply functions of the plyr package. The function names are certainly easier to remember, but their syntax can be a little awkward, e.g. the .() notation for quoting variable names. Bernd’s slides, in German, are already available from our Meetup site.
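For comparison, a rough plyr sketch of a grouped mean might look as follows. It assumes the plyr package is installed, and the sales data frame is again a made-up example. In ddply the first letter stands for the input type (data frame) and the second for the output type (data frame).

```r
library(plyr)  # assumes the plyr package is installed

# Toy data (made-up values).
sales <- data.frame(
  region = c("north", "north", "north", "south", "south", "south"),
  amount = c(10, 20, 30, 1, 2, 3)
)

# .() quotes the grouping variable, so no strings are needed.
# summarise reduces the output to one row per group.
by_region <- ddply(sales, .(region), summarise, mean_amount = mean(amount))

# transform keeps every input row and repeats the group result,
# much like ave in base R.
with_means <- ddply(sales, .(region), transform, region_mean = mean(amount))
```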
When dealing with data stored in spreadsheets, most members of the group rely on read.csv and write.csv in R. However, if you have a spreadsheet with multiple tabs and formatted numbers, read.csv becomes clumsy, as you would have to save each tab without any formatting to a separate file.
Günter presented the XLConnect package as an alternative to read.csv, or indeed RODBC, for reading spreadsheet data. It uses the Apache POI API as the underlying interface.
XLConnect requires a Java runtime environment on your computer, but no installation of Excel. That makes it a truly platform-independent solution for exchanging data between spreadsheets and R. Not only can you read defined rows and columns, or indeed named ranges, from Excel into R, but you can also write data back to Excel files in the same way and, to top it all, add graphic output from R.
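A round trip with XLConnect could look roughly like this. The sketch assumes that XLConnect and a Java runtime are installed; the file path, the sheet name "cars" and the use of the built-in mtcars data are illustrative choices, not taken from Günter's talk. It only covers writing a data frame and reading back a defined range of rows and columns.

```r
library(XLConnect)  # assumes XLConnect and a Java runtime are installed

# Hypothetical output file in the session's temporary directory.
path <- tempfile(fileext = ".xlsx")

# Create a new workbook, add a sheet and write a small data frame to it.
wb <- loadWorkbook(path, create = TRUE)
createSheet(wb, name = "cars")
writeWorksheet(wb, mtcars[1:5, 1:3], sheet = "cars")
saveWorkbook(wb)

# Read back only a defined block: rows 1 to 4 (header plus three data rows)
# and the first two columns.
wb2 <- loadWorkbook(path)
first_rows <- readWorksheet(wb2, sheet = "cars",
                            startRow = 1, endRow = 4,
                            startCol = 1, endCol = 2)
```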
Next Kölner R meeting
The next meeting is scheduled for 13 December 2013. A discussion of the data.table package is already on the agenda.
Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.