Mastering Data Analysis with R

[This article was first published on rapporter, and kindly contributed to R-bloggers.]

Sorry about the noisy post title, but it happens to be the name of the book I have been working on for the past year, which has just been published by Packt.


I do not think that reading this ~400-page book will turn everyone into a true master of R and data analysis, but I believe it can get you on the way. I wrote this book with a relatively large target audience in mind: readers with some prior R experience (such as an introductory university course or MOOC covering how to install R, load CSV files or generate a histogram), but without the time or need to work through a complete series of books on the statistical background, algorithms and domain-specific knowledge for handling different data types.

So this is not a reference book, and it does not include a single formal mathematical formula. Instead, it provides a practical introduction, many references and hands-on examples on the following topics:
  • Reading data from larger text files and databases in an optimal way
  • Loading data from the Web via parsing HTML, XML, JSON and interacting with APIs
  • Filtering, summarizing and restructuring data
  • Building and interpreting generalized linear models
  • Traditional multivariate statistical methods for dimension reduction and latent variables
  • Classification and clustering, including supervised and unsupervised statistical and machine learning methods
  • Handling outliers and missing values
  • Processing unstructured text data
  • A bit of social network analysis
  • Smoothing, seasonal decomposition and modeling of time series
  • Visualizing spatial data
There is also a free chapter (available from Packt) on “Analyzing the R community”, which combines quite a few of the techniques described in the chapters above into an actual use case, including reproducible examples from some of my past research on this topic:
  • The number of R Foundation members and R conference attendees (previously presented at the useR! 2014 and 2015 conferences, alongside an interactive web app on R activity around the world)
  • The number of packages per R package maintainer
  • The volume and timeline of messages and posters on the [R-help] mailing list
  • Estimating the number of R users around the world
  • The number of R users on Facebook and Twitter
Besides this free chapter, Packt is offering a 50% discount on the e-book edition of this book for two weeks, which you can activate with the RXI37LH discount code until Friday, October 30, 2015. Another promo code for a 20% discount on printed copies is also being generated and should be available early next week. For more details, revisit this page later and look for new comments, or follow me on Twitter.


Some quick statistics on the book:
  • 14 chapters
  • 396 pages
  • 95 packages loaded
  • hflights and data.table used in 7 chapters, ggplot2 in 5, dplyr and plyr in 4, microbenchmark and MASS in 3
  • 5 reviewers
  • more than 20 contributors
  • 2,711 lines of code in the bundle on GitHub
  • 581 days between signing the author contract and the actual publication date
  • around 320 e-mails sent and received with the ISBN in the subject line
  • 10,000 kilometers between the places where I wrote the first and the last chapters
  • and I forgot to use time-tracking software after logging 174.73 hours spent on the book
And most importantly: I would love to hear, and am looking forward to, any kind of private or public feedback on this book!
