Articles by David Robinson

Slides, videos, and tweets from the 2017 New York R Conference

May 22, 2017 | 0 Comments

In April I attended the 2017 New York R conference, hosted by Lander Analytics and Work-Bench. It was both the third time the conference was held and the third time I’ve attended, and it gets more fun each year, especially because this year eight of us attended from Stack Overflow (...
[Read more...]

Gender and verbs across 100,000 stories: a tidy analysis

April 27, 2017 | 0 Comments

Previously in this series: Examining the arc of 100,000 stories. I was fascinated by my colleague Julia Silge’s recent blog post on what verbs tend to occur after “he” or “she” in several novels, and what they might imply about gender roles within fictional work. This made me wonder what ... [Read more...]

Examining the arc of 100,000 stories: a tidy analysis

April 26, 2017 | 0 Comments

I recently came across a great natural language dataset from Mark Riedl: 112,000 plots of stories downloaded from English language Wikipedia. This includes books, movies, TV episodes, video games: anything that has a Plot section on a Wikipedia page. ... [Read more...]

Simulation of empirical Bayesian methods (using baseball statistics)

January 11, 2017 | 0 Comments

Previously in this series: The beta distribution; Empirical Bayes estimation; Credible intervals; The Bayesian approach to false discovery rates; Bayesian A/B testing; Beta-binomial regression; Understanding empirical Bayesian hierarchical modeling; Mixture models and expectation-maximization; The ebbr package. We’re approaching the end of this series on empirical Bayesian methods, and ... [Read more...]

Introducing the ebbr package for empirical Bayes estimation (using baseball statistics)

January 5, 2017 | 0 Comments

Previously in this series: The beta distribution; Empirical Bayes estimation; Credible intervals; The Bayesian approach to false discovery rates; Bayesian A/B testing; Beta-binomial regression; Understanding empirical Bayesian hierarchical modeling; Mixture models and expectation-maximization. We’ve introduced a number of statistical techniques in this series: estimating a beta prior, beta-binomial ... [Read more...]

Understanding mixture models and expectation-maximization (using baseball statistics)

January 2, 2017 | 0 Comments

Previously in this series: Understanding the beta distribution; Understanding empirical Bayes estimation; Understanding credible intervals; Understanding the Bayesian approach to false discovery rates; Understanding Bayesian A/B testing; Understanding beta binomial regression; Understanding empirical Bayesian hierarchical modeling. In this series on empirical Bayesian methods on baseball data, we’ve been ... [Read more...]

The ‘deadly board game’ puzzle: efficient simulation in R

October 19, 2016 | 0 Comments

Last Friday’s “The Riddler” column on FiveThirtyEight presents an interesting probabilistic puzzle: While traveling in the Kingdom of Arbitraria, you are accused of a heinous crime. Arbitraria decides who’s guilty or innocent not through a court system, but a board game. It’s played on a simple board: ... [Read more...]

Understanding empirical Bayesian hierarchical modeling (using baseball statistics)

October 11, 2016 | 0 Comments

Previously in this series: Understanding the beta distribution; Understanding empirical Bayes estimation; Understanding credible intervals; Understanding the Bayesian approach to false discovery rates; Understanding Bayesian A/B testing; Understanding beta binomial regression. Suppose you were a scout hiring a new baseball player, and were choosing between two that have had 100 ... [Read more...]

Analysis of the #7FavPackages hashtag

August 26, 2016 | 0 Comments

Twitter has seen a recent trend of “first 7” and “favorite 7” hashtags, like #7FirstJobs and #7FavFilms. Last week I added one to the mix, about my 7 favorite R packages: devtools, dplyr, ggplot2, knitr, Rcpp, rmarkdown, shiny. #7FavPackages #rstats (David Robinson (@drob), August 16, 2016) Hadley Wickham agreed to share his own, but on one condition: @drob I'll do ... [Read more...]

useR and JSM 2016 conferences: a story in tweets

August 23, 2016 | 0 Comments

I was amused by a Guardian article last month that declared “I’m a serious academic, not a professional Instagrammer,” arguing that social media is a distraction for scientific research. This attitude was, to say the least, not popular on academic Twitter, which responded with the #seriousacademic hashtag. When someone ... [Read more...]

Does sentiment analysis work? A tidy analysis of Yelp reviews

July 21, 2016 | 0 Comments

This year Julia Silge and I released the tidytext package for text mining using tidy tools such as dplyr, tidyr, ggplot2 and broom. One of the canonical examples of tidy text mining this package makes possible is sentiment analysis. Sentiment analysis is often used by companies to quantify general social ... [Read more...]

One year as a Data Scientist at Stack Overflow

June 20, 2016 | 0 Comments

One day in January 2013 I found myself wasting time on the internet. This wasn’t a good idea: I was as busy as anyone 2.5 years into their PhD. I had to finish a presentation on some yeast genetics research, I was months behind on a paper with an NYU collaborator ...
[Read more...]

Understanding beta binomial regression (using baseball statistics)

May 31, 2016 | 0 Comments

Previously in this series: Understanding the beta distribution; Understanding empirical Bayes estimation; Understanding credible intervals; Understanding the Bayesian approach to false discovery rates; Understanding Bayesian A/B testing. In this series we’ve been using the empirical Bayes method to estimate batting averages of baseball players. Empirical Bayes is useful ... [Read more...]