After reviewing a book about R designed for beginners (see my previous post), I thought I’d step up the pace slightly and look at a more advanced book. I’m pleased to say that I was not disappointed. This book is so comprehensive that you can find nearly anything you want in it!
The book starts with a brief, but comprehensive, R tutorial. This tutorial is rather light on actual statistics, but gives a very good introduction to the syntax of the R language. This focus on the language continues through the whole of Part II, which contains detailed chapters on the language, syntax, objects, symbols, functions and high-performance programming in R. This is very different to most books on R, which jump straight into the statistics. Although this part may seem rather boring to some, it provides a very good grounding in the basics of R programming, which makes the rest of the book significantly easier to understand.
Through reading Part II I learnt a huge amount about the R language, finding out some things that I never knew, and realising how many of the things that I did know actually worked. This part may well confuse those with no previous programming experience (see my comments on this later), but for those who are at least slightly familiar with the terminology, it gives a very comprehensive explanation of the language itself.
Part III is where the book starts to get into R’s main use: statistics and statistical graphics. Very sensibly, this part starts at the beginning of the process with how to import data (including instruction on how to connect R to databases) and then a lengthy (over thirty pages) section on preparing data for analysis. This is incredibly useful, as data preparation can often take a significant proportion of the time spent on a project. The graphics chapters after this provide a comprehensive introduction to the standard (‘base’) graphics system, and then the lattice graphics system. I’m glad to see that a non-base-graphics system is given space in this book. From what I’ve seen on the web, it seems that very few R programmers use the base graphics system for production-quality graphics, so this is a sensible inclusion.
Finally, after all the preparation, we get to the statistical analysis section (Part IV). Of course, as the book has covered so much of the fundamentals earlier, the statistics section can fly along, focussing mainly on the statistics themselves and the syntax of the tools that R provides, rather than the mechanics of how to write valid R commands. A wide range of statistical tests are included, and they are covered in a very sensible order (in fact, almost exactly the same order that my statistics class covered them: starting with summary statistics, then probability distributions, on to statistical models and then beyond that to classification and machine learning). I won’t lie and say that I’ve read every word of this section, but the bits I have read have been very good: concise but comprehensive.
There’s not really much more I can say about this book: it has become my ‘go-to’ reference for anything I need to do in R. It would be intimidating for a beginner, but it is not aimed at beginners, so that’s fine. For those of us who are slightly more experienced with R, it is a great book, and I thoroughly recommend it.
(Disclaimer: I was provided with a free review copy of this book.)