Statistics doesn’t have to be so hard: simulate!

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

My second-favourite keynote from yesterday's Strata Hadoop World conference was this one, from Pinterest's John Rauser. To many people (especially in the Big Data world), Statistics is a series of complex equations, but a just a little intuition goes a long way to really understanding data. John illustrates this wonderfully using an example of data collected to determine whether consuming beer causes mosquitoes to bite you more:  


The big lesson here, IMO, is that so many statistical problems can seem complex, but you can actually get a lot of insight by recognizing that your data is just one possible instance of a random process. If you have a hypothesis for what that process is, you can simulate it, and get an intuitive sense of how surprising your data is. R has excellent tools for simulating data, and a couple of hours spent writing code to simulate data can often give insights that will be valuable for the formal data analysis to come. 

(By the way, my favourite keynote from the connference was Amanda Cox's keynote on data visualization at the New York Times, which featured several examples developed in R. Sadly, though, it wasn't recorded.)

O'Reilly Strata: Statistics Without the Agonizing Pain

To leave a comment for the author, please follow the link and comment on their blog: Revolutions. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)