Mastering Data Analysis with R

October 17, 2015

(This article was first published on rapporter, and kindly contributed to R-bloggers)

Sorry about the noisy post title, but it happens to be the name of the book I have been working on for the past year, which has just been published by Packt.

I do not think that reading this ~400-page book will turn everyone into a true master of R and data analysis, but I believe it can get you on the way. I wrote this book with a relatively large target audience in mind: readers with some prior R experience (for example, from an introductory university course or MOOC covering how to install R, load CSV files or generate a histogram), but without the time or need to walk through a complete series of books on the statistical background, algorithms and domain-specific knowledge for handling different data types.

So this is not a reference book, and it does not include a single formal mathematical formula. Instead, it provides a practical introduction, many references and hands-on examples on the following topics:

  • Reading data from larger text files and databases in an optimal way
  • Loading data from the Web via parsing HTML, XML, JSON and interacting with APIs
  • Filtering, summarizing and restructuring data
  • Building and interpreting generalized linear models
  • Traditional multivariate statistical methods for dimension reduction and latent variables
  • Classification and clustering, including supervised and unsupervised statistical and machine learning methods
  • Handling outliers and missing values
  • Processing unstructured text data
  • A bit of social network analysis
  • Smoothing, seasonal decomposition and modeling time-series 
  • Visualizing spatial data

There is also a free chapter (available from Packt) on “Analyzing the R community”, which combines quite a few techniques described in the chapters above into an actual use case, including some reproducible examples from my past research on this topic:

  • The number of R Foundation members and R conference attendees (previously presented at the useR! 2014 and 2015 conferences, along with an interactive webapp on R-activity around the world)
  • The number of packages per R package maintainer
  • The volume and timeline of messages and posters on the [R-help] mailing list
  • Estimating the number of R users around the world
  • The number of R users on Facebook and Twitter

Besides this free chapter, Packt is offering a 50% discount on the e-book edition of this book for two weeks, which you can activate with the RXI37LH discount code until Friday, October 30, 2015. Another promo code, for a 20% discount on printed copies, is also being generated and should be available early next week. For more details, revisit this page later and look for new comments, or follow me on Twitter.

Some quick statistics on the book:

  • 14 chapters
  • 396 pages
  • 95 packages loaded
  • hflights and data.table used in 7, ggplot2 in 5, dplyr and plyr in 4, microbenchmark and MASS used in 3 chapters
  • 5 reviewers
  • more than 20 people contributing
  • 2,711 lines of the code bundle on GitHub
  • 581 days between signing the author contract and the actual publication date
  • around 320 e-mails sent and received with the ISBN on the subject line
  • 10,000 kilometers between the places where I wrote the first and the last chapters
  • and I forgot to use time tracking software after logging 174.73 hours spent on the book

And most importantly: I’d love to hear any kind of private or public feedback on this book, and I am looking forward to it!
