Blog Archives

Programmatically download political science data with the psData package

February 23, 2014

A lot of progress has been made on improving political scientists’ ability to access data ‘programmatically’, i.e. data can be downloaded with R source code. Packages such as WDI for the World Bank Development Indicators and dvn for many data sets stored on the Dataverse Network make it much easier for political scientists to use this data...

Read more »

Three Quick and Simple Data Cleaning Helper Functions (December 2013)

December 6, 2013

As I go about cleaning and merging data sets with R I often end up creating and using simple functions over and over. When this happens, I stick them in the DataCombine package. This makes it easier for me to remember how to do an operation and others can possibly benefit from simplified and (hopefully) more intuitive code....

Read more »

Showing results from Cox Proportional Hazard Models in R with simPH

September 2, 2013

Effectively showing estimates and uncertainty from Cox Proportional Hazard (PH) models, especially for interactive and non-linear effects, can be challenging with currently available software. So researchers often simply display a results table. Such tables are pretty useless for Cox PH models. It is difficult to decipher a simple linear variable’s estimated effect and basically impossible to understand time...

Read more »

GitHub renders CSV in the browser, becomes even better for social data set creation

August 22, 2013

I've written in a number of places about how GitHub can be a great place to store data. Unlike basically all other web data storage sites (many of which I really like, such as Dataverse and FigShare), GitHub enables deep social data set development and f...

Read more »

Getting Started with Reproducible Research: A chapter from my new book

July 15, 2013

(This article was first published on Christopher Gandrud (간드루드 크리스토파), and kindly contributed to R-bloggers) This is an abridged excerpt from Chapter 2 of my new book Reproducible Research with R and RStudio. It's published by Chapman & Hall/CRC Press. You can purchase it on Amazon. "Search inside this book" includes a complete table of contents. Researchers often start...

Read more »

Quick and Simple D3 Network Graphs from R

June 8, 2013

Sometimes I just want to quickly make a simple D3 JavaScript directed network graph with data in R. Because D3 network graphs can be manipulated in the browser–i.e. nodes can be moved around and highlighted–they're really nice for data exploration. They're also really nice in HTML presentations. So I put together a...

Read more »

Slide: one function for lag/lead variables in data frames, including time-series cross-sectional data

May 21, 2013

I often want to quickly create a lag or lead variable in an R data frame. Sometimes I also want to create the lag or lead variable for different groups in a data frame, for example, if I want to lag GDP for each country in a data frame. I've found the various R methods for doing this hard...
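As a rough illustration of the grouped lag described above, here is a minimal base R sketch. It is not the function from the post. The data frame `gdp`, its column names, and the helper `lag_one` are all hypothetical and made up for this example.

```r
# Hypothetical example data: GDP for two countries over three years
gdp <- data.frame(
  country = c("A", "A", "A", "B", "B", "B"),
  year    = c(2000, 2001, 2002, 2000, 2001, 2002),
  gdp     = c(10, 11, 12, 20, 22, 24)
)

# Sort so that each country's rows are in time order
gdp <- gdp[order(gdp$country, gdp$year), ]

# Hypothetical helper: shift a vector down one position, padding with NA
lag_one <- function(x) c(NA, head(x, -1))

# Apply the shift separately within each country, so that the first
# year of country B does not pick up the last value of country A
gdp$gdp_lag1 <- ave(gdp$gdp, gdp$country, FUN = lag_one)

print(gdp)
```

Sorting before shifting matters: the lag is purely positional, so rows that are out of time order would silently produce wrong values.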

Read more »

Reinhart & Rogoff: Everyone makes coding mistakes, we need to make it easy to find them + Graphing uncertainty

April 17, 2013

You may have already seen a lot written on the replication of Reinhart & Rogoff’s (R&R) much-cited 2010 paper done by Herndon, Ash, and Pollin. If you haven’t, here is a roundup of some of what has been written: Konczal, Yglesias, Krugman, Cowen, Peng,

Read more »

Dropbox & R Data

April 11, 2013

I'm always looking for ways to download data from the internet into R. Though I prefer to host and access plain-text data sets (CSV is my personal favourite) from GitHub (see my short paper on the topic), sometimes it's convenient to get data stored on Dropbox. There has been a change in the way Dropbox...

Read more »

FillIn: a function for filling in missing data in one data frame with info from another

February 15, 2013

Sometimes I want to use R to fill in values that are missing in one data frame with values from another. For example, I have data from the World Bank on government deficits. However, there are some country-years with missing data. I gathered data from ...
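The general idea of filling gaps in one data frame from another can be sketched in base R as follows. This is only an illustration under assumed inputs, not the function from the post. The data frames `deficits` and `other`, their columns, and all of the values are hypothetical.

```r
# Hypothetical main data set with some missing country-years
deficits <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2001, 2000, 2001),
  deficit = c(-2.1, NA, NA, -0.5)
)

# Hypothetical second source covering some of the same country-years
other <- data.frame(
  country     = c("A", "B", "B"),
  year        = c(2001, 2000, 2001),
  deficit_alt = c(-3.0, -1.2, -9.9)
)

# Keep every row of the main data set and attach the alternative values
merged <- merge(deficits, other, by = c("country", "year"), all.x = TRUE)

# Only overwrite values that are missing in the main data set
missing <- is.na(merged$deficit)
merged$deficit[missing] <- merged$deficit_alt[missing]

print(merged)
```

Note that the observed value for country B in 2001 is left untouched, even though the second source disagrees with it. Only the missing cells are filled.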

Read more »