set_na_where(): a nonstandard evaluation use case

August 14, 2017

In this post, I describe a recent case where I used rlang’s tidy evaluation system to do some data cleaning. This example is not particularly involved, but it demonstrates a basic but powerful idea: that we can capture the expressions that a user...

A tour of the tibble package

July 9, 2017

Dataframes are used in R to hold tabular data. Think of the prototypical spreadsheet or database table: a grid of data arranged into rows and columns. That's a dataframe. The tibble R package provides a fresh take on dataframes to fix some longstan...

Plotting partial pooling in mixed-effects models

June 21, 2017

In this post, I demonstrate a few techniques for plotting information from a relatively simple mixed-effects model fit in R. These plots can help us develop intuitions about what these models are doing and what “partial pooling” means. The sleeps...

New package polypoly (helper functions for orthogonal polynomials)

May 29, 2017

Last week, I released a new package called polypoly to CRAN. It wraps up some common tasks for dealing with orthogonal polynomials into a single package. The README shows off the main functionality, as well as the neat “logo” I made for the packag...

I don’t know Fisher’s exact test, but I know Stan

May 15, 2017

A few days ago, I watched a terrific lecture by Bob Carpenter on Bayesian models. He started with a Bayesian approach to Fisher’s exact test. I had never heard of this classical procedure, so I was curious to play with the example. In this post, I use the same ...

Simulating Unown encounter rates in Pokémon Go

March 21, 2017

Pokémon Go is an augmented reality game where people with smartphones walk around and catch Pokémon. As in the classic games, players are Pokémon “trainers” who have to travel around and collect creatures. Some types are rarer than others, som...

Repeatedly applying a function

January 11, 2017

A colleague of mine sent me the following R question: I have a function that takes a list and does some stuff to it and then returns it. I then take that output and run it through the same function again. But I obviously don’t want to repeatedly ...

RStanARM basics: visualizing uncertainty in linear regression

November 18, 2016

As part of my tutorial talk on RStanARM, I presented some examples of how to visualize the uncertainty in Bayesian linear regression models. This post is an expanded demonstration of the approaches I presented in that tutorial. Data: Does brain mass predict how much mammals sleep in a day? Let’...

August 14, 2016

The lazyeval package is a tool-set for performing nonstandard evaluation in R. Nonstandard evaluation refers to any situation where something special happens with how user input or code is evaluated. For example, the library function doesn’t evalua...

Fixing APA citations from Pandoc with stringr

August 3, 2016

Pandoc is awesome. It's the universal translator for plain-text documents. I especially like that it can do inline citations. I write "@Jones2005 proved aliens exist" and pandoc produces "Jones (2005) proved aliens exist". But it doesn't quite d...

Why is using list() critical for .dots = setNames() uses in dplyr?

March 22, 2016

I wrote an answer about why setNames() shows up sometimes in standard evaluation with dplyr. My explanation turned into a mini-tutorial on why those standard evaluation functions have a .dots argument. The basic idea is that the usual variadic argume...

Confusion matrix statistics on late talker diagnoses

October 5, 2015

How many late talkers are just late bloomers? More precisely, how many children identified as late talkers at 18 months catch up to the normal range one year later? This is an important question. From a clinical perspective, we want to support children with language delays, but it is also ...
