Spatial Clustering: Conley Standard Errors for R

September 8, 2014

(This article was first published on fReigeist » R, and kindly contributed to R-bloggers)

I have been working quite a lot with climate and weather data to study the impact of rainfall shocks on violence in India, and how this relationship changed after the social insurance scheme NREGA was introduced.

In my context, it becomes particularly relevant to adjust for spatial correlation when you have either too few or too many clusters. With too few clusters, such as states, clustered standard errors are likely to be too small. With too many clusters, your standard errors may again be too small, because clustering on small units ignores correlation across neighboring units. This is often the case with small geographies, where shocks to your dependent variable are likely to be spatially correlated (for example natural disasters or resource booms).

A feasible alternative may be to compute Conley standard errors following the approaches suggested in Conley (1999) and Conley (2008).
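For reference, the spatial HAC estimator is usually written as a sandwich formula like the one below. The notation is my own shorthand rather than something taken from the papers: x_i is the regressor vector of observation i, \hat e_i its residual, d_{ij} the distance between observations i and j, and \bar d a distance cutoff. The uniform kernel shown is only one possible choice of weighting function.

```latex
\hat V = (X'X)^{-1}
         \left( \sum_{i=1}^{n} \sum_{j=1}^{n} K(d_{ij}) \, x_i \hat e_i \hat e_j x_j' \right)
         (X'X)^{-1},
\qquad
K(d_{ij}) = \mathbf{1}\{ d_{ij} \le \bar d \}
```

When the cutoff is so small that only i = j receives weight, the middle term collapses to \sum_i \hat e_i^2 x_i x_i', which is the usual heteroskedasticity-robust (White) estimator.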

Solomon Hsiang has provided some Stata and MATLAB code to compute such standard errors. Here is my attempt to compute them in R.

Spatial and Serial Correlation Correction

If you use Sol’s code, you need to be cautious about computing the standard errors with fixed-effects models, because his code uses plain OLS to get the estimates and the residuals. This is, of course, problematic if you have a dataset with a large number of fixed effects. The procedure would then compute Conley errors for a whole lot of coefficients (i.e. your fixed effects) that you typically do not care about.

The way to proceed in Stata is to demean the data by your fixed effects and then simply pass the demeaned left-hand side, and the demeaned right-hand-side variables for which you want standard errors, to the “ols_spatial_HAC” procedure. I have written a Stata function that does this, but there are still some caveats and it needs to be thoroughly tested.
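The same demeaning step can be sketched in R. This is only an illustration on simulated data: the variable names (unit, x, y) and the data-generating process are made up for the example. It shows that regressing the within-unit demeaned outcome on the demeaned regressor gives the same slope and the same residuals as the regression with a full set of unit dummies.

```r
set.seed(42)
n_units   <- 30
n_periods <- 10

# Simulated panel: a unit fixed effect that is correlated with x (assumed setup)
d <- data.frame(unit = rep(seq_len(n_units), each = n_periods))
unit_effect <- rnorm(n_units)[d$unit]
d$x <- rnorm(nrow(d)) + unit_effect
d$y <- 2 * d$x + unit_effect + rnorm(nrow(d))

# Demean outcome and regressor within unit; ave() returns group means
d$y_dm <- d$y - ave(d$y, d$unit)
d$x_dm <- d$x - ave(d$x, d$unit)

# Regression on demeaned data (no intercept) vs. explicit unit dummies
fit_dm <- lm(y_dm ~ x_dm - 1, data = d)
fit_fe <- lm(y ~ x + factor(unit), data = d)

coef(fit_dm)["x_dm"]
coef(fit_fe)["x"]

# These residuals are what a spatial HAC routine would be given
e <- resid(fit_dm)
```

Note that lm() on the demeaned data does not know about the absorbed fixed effects, so its own degrees of freedom (and hence its default standard errors) differ from the dummy-variable regression, even though coefficients and residuals agree.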

My R functions are almost literal translations of Sol’s function, except that they are not (yet) as flexible as his and that they take the residuals directly instead of running the regression. This saves you time, and it also gives you a high degree of flexibility when it comes to the type of R functions you want to use for your regressions.

The first core function is called “iterateObs”, which does what it says: it iterates over the observations, because a correction term needs to be computed for each one.
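To make the per-observation term concrete, here is a minimal sketch. It is not the iterateObs function itself: the names haversine_km and obs_term are hypothetical, and the uniform kernel, the great-circle distance and the simulated data are all assumptions made for the example. For observation i, the term is x_i e_i multiplied by the weighted sum of e_j x_j over all observations j within the cutoff.

```r
# Great-circle distance in km between one point and a vector of points
# (haversine formula; the Earth radius of 6371 km is an approximation)
haversine_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: k x k correction term contributed by observation i.
# X: n x k regressor matrix, e: residuals, cutoff_km: distance cutoff
obs_term <- function(i, X, e, lat, lon, cutoff_km) {
  d <- haversine_km(lat[i], lon[i], lat, lon)
  w <- as.numeric(d <= cutoff_km)  # uniform kernel: 1 inside the cutoff, else 0
  # x_i e_i times sum_j w_ij e_j x_j' (row j of X is scaled by w_j * e_j)
  (X[i, ] * e[i]) %o% colSums(X * (w * e))
}

# Simulated inputs (assumed for illustration only)
set.seed(1)
n   <- 50
lat <- runif(n, 20, 25)
lon <- runif(n, 75, 80)
X   <- cbind(1, rnorm(n))
e   <- rnorm(n)

# Summing the per-observation terms gives the middle part of the sandwich
meat <- Reduce(`+`, lapply(seq_len(n), obs_term, X = X, e = e,
                           lat = lat, lon = lon, cutoff_km = 100))
```

Looping over observations keeps memory use at order n, since the full n x n distance matrix is never stored, at the price of being slower than a fully vectorised version.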


Originally, when I started working in R, the fact that you actually do need to know how to specify your standard errors was a bit scary. You could not just type “, cluster”. I have developed a function that does in R essentially what Sol’s ols_spatial_HAC function does.


The code I present here is a bit more flexible. It is very simple: you pass a set of regression residuals, since this is all you need to compute the standard errors, together with the coefficients for which you want the variance-covariance matrix computed.
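A compact sketch of this interface might look as follows. The function name conley_vcov_sketch, its arguments, the uniform kernel and the planar coordinates are all assumptions for illustration, not the functions from this post. It takes the regressor matrix, the residuals and a precomputed distance matrix, and returns the sandwich variance-covariance matrix.

```r
# Hypothetical helper: spatial HAC variance-covariance matrix, uniform kernel.
# X: n x k regressors, e: residuals, D: n x n distances, cutoff: same units as D
conley_vcov_sketch <- function(X, e, D, cutoff) {
  X <- as.matrix(X)
  stopifnot(nrow(X) == length(e), all(dim(D) == nrow(X)))
  W     <- (D <= cutoff) * 1        # weight 1 for pairs within the cutoff
  Xe    <- X * e                    # row i is x_i * e_i
  meat  <- t(Xe) %*% W %*% Xe       # sum_i sum_j w_ij x_i e_i e_j x_j'
  bread <- solve(crossprod(X))      # (X'X)^{-1}
  bread %*% meat %*% bread
}

# Simulated example (assumed): points on a unit square, planar distances
set.seed(7)
n      <- 200
coords <- cbind(runif(n), runif(n))
D      <- as.matrix(dist(coords))
x      <- rnorm(n)
y      <- 1 + 0.5 * x + rnorm(n)

# Any regression routine will do, as long as it yields residuals and regressors
fit <- lm(y ~ x)
X   <- model.matrix(fit)
e   <- resid(fit)

V  <- conley_vcov_sketch(X, e, D, cutoff = 0.1)
se <- sqrt(diag(V))
```

Two caveats apply to this sketch. The uniform kernel does not guarantee a positive semi-definite matrix, so in unlucky samples a diagonal entry can be negative. And building the full n x n weight matrix is only practical for moderate n; for large datasets a loop over observations uses far less memory.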

