# Mastering Data Analysis with R

[This article was first published on **rapporter**, and kindly contributed to R-bloggers.]

Sorry about the noisy post title, but it happens to be the name of the book I have been working on for the past year, which has just been published by Packt.

I do not think that reading this ~400-page book will turn everyone into a true master of R and data analysis, but I believe it can get you on the way. I wrote this book with a relatively large target audience in mind: readers with some prior R experience (such as an introductory university course or MOOC covering how to install R, load CSV files or generate a histogram), but without the time or need to work through a complete series of books on the statistical background, algorithms and domain-specific knowledge required for handling different data types.

So this is not a reference book, and it does not include even a single formal mathematical formula. Instead, it provides a **practical introduction**, many references and **hands-on examples** on the following topics:

- Reading data from larger text files and databases in an optimal way
- Loading data from the Web via parsing HTML, XML, JSON and interacting with APIs
- Filtering, summarizing and restructuring data
- Building and interpreting generalized linear models
- Traditional multivariate statistical methods for dimension reduction and latent variables
- Classification and clustering, including supervised and unsupervised statistical and machine learning methods
- Handling outliers and missing values
- Processing unstructured text data
- A bit of social network analysis
- Smoothing, seasonal decomposition and modeling time-series
- Visualizing spatial data

The book also includes a **free chapter** (available from Packt) on “Analyzing the R community”, which combines quite a few techniques described in the above-mentioned chapters into an actual use case, including some reproducible examples from my past research on this topic:

- The number of R Foundation members and R conference attendees (previously presented at the useR! 2014 and 2015 conferences, alongside an interactive web app on R activity around the world)
- The number of packages per R package maintainer
- The volume and timeline of messages and posters on the [R-help] mailing list
- Estimating the number of R users around the world
- The number of R users on Facebook and Twitter

There is a **50% discount** on the e-book format of this book for two weeks, which you can activate via the *RXI37LH* discount code until October 30, 2015 (Friday). Another promo code for a 20% discount on printed copies is also being generated and should be available early next week. For more details, revisit this page later and look for new comments, or follow me on Twitter:

`After ~1001 sleepless nights, my #rstats book on #datascience is published w/ a free chapter https://t.co/7lS4pgN06k pic.twitter.com/MnM24P67dE— Gergely Daróczi (@daroczig) October 1, 2015`

Some quick **statistics** on the book:

- 14 chapters
- 396 pages
- 95 packages loaded
  - *hflights* and *data.table* used in 7, *ggplot2* in 5, *dplyr* and *plyr* in 4, *microbenchmark* and *MASS* in 3 chapters
- 5 reviewers
- more than 20 persons contributing
- 2,711 lines of the code bundle on GitHub
- 581 days between signing the author contract and the actual publication date
- around 320 e-mails sent and received with the ISBN in the subject line
- 10,000 kilometers between the places where I wrote the first and the last chapters
- and I forgot to use time tracking software after logging 174.73 hours spent on the book

I am looking forward to any **feedback** on this book!
