Reading SPSS Data into R with Haven

[This article was first published on Stats Can Be Fun, and kindly contributed to R-bloggers.]

When psychology researchers switch from SPSS to R, a common first question is “Can I load SPSS data in R?” The answer is yes, and it’s now easier than ever thanks to the Haven package, which both reads and writes SPSS files. Previously, you might have used the foreign package and its read.spss command, but I don’t recommend that approach. Currently, the Haven package is your best bet for quickly and accurately loading SPSS data. It is written by Hadley Wickham (of ggplot2 fame) and is based on Evan Miller’s ReadStat. It also reads Stata and SAS files.

As with any R package, Haven must be installed once before you use it for the first time:

install.packages("haven")

For every R session in which you use the Haven package, you need to activate it using the library command. Also, when you load a file by name using the Haven package, it looks for the file in R’s working directory. You can set the working directory using the menus in R or RStudio (or with the setwd command). The example below illustrates how to load SPSS data from R’s working directory, using the goggles data from Discovering Statistics Using SPSS. The lines below activate the Haven package and then read the “goggles.sav” file into a data frame called “my.data”.

library(haven)
my.data <- read_spss("goggles.sav")
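If you are unsure where R is looking for files, the following sketch may help. It is only an illustration: because I can't assume you have goggles.sav on hand, it first writes a tiny stand-in file called demo.sav to a temporary folder (the folder, file name, and columns are all made up), then points the working directory at that folder and reads the file by its bare name.

```r
library(haven)

# Stand-in for a real SPSS file: write a tiny data frame to a temporary folder.
# The folder, file name, and columns are made up for illustration.
demo_dir  <- tempfile("spss_demo_")
dir.create(demo_dir)
demo_data <- data.frame(id = 1:3, score = c(4.5, 3.0, 5.5))
write_sav(demo_data, file.path(demo_dir, "demo.sav"))

old_wd <- getwd()   # remember where R is currently looking for files
setwd(demo_dir)     # point the working directory at the data folder

file.exists("demo.sav")            # TRUE when the file is in the working directory
my.data <- read_spss("demo.sav")   # a bare file name is resolved against getwd()

setwd(old_wd)       # restore the original working directory
```

With your own data, you would replace demo_dir with the folder that holds your .sav file and skip the write_sav step.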

If working directories are confusing for you, you might prefer the slightly longer command below, which brings up a window you can use to select the data file you want to load. It takes a little more typing but is much easier to use. I give separate versions for OS X and Windows below; note, though, that file.choose() is part of base R and also works on Windows, whereas choose.files() is available only on Windows.

On OS X, the R commands for loading SPSS data using a file selector window are:

library(haven)
my.data <- read_spss(file.choose())

On Windows, the R commands for loading SPSS data using a file selector window are:

library(haven)
my.data <- read_spss(choose.files())
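If you would rather have one command for every platform, something like the sketch below might work. The helper name read_spss_chosen is my own invention and is not part of Haven. It relies on file.choose() from base R and adds a small check so that picking the wrong kind of file gives a clear error message.

```r
library(haven)

# Hypothetical helper (not part of haven): read an SPSS file chosen in a dialog.
# file.choose() is base R and is available on OS X, Windows, and Linux.
read_spss_chosen <- function(path = file.choose()) {
  # Stop early with a clear message if the selection is not an SPSS file.
  if (!grepl("\\.(sav|zsav|por)$", path, ignore.case = TRUE)) {
    stop("Expected a .sav, .zsav, or .por file, got: ", basename(path))
  }
  read_spss(path)
}

# Interactive use opens the file selector window:
# my.data <- read_spss_chosen()
```

Because the path is an ordinary argument, you can also pass a file name directly, for example read_spss_chosen("goggles.sav"), when you don't want the selector window.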

That's it! Now you know how to load SPSS data into R.  

If you use OS X, I have one additional tip for you. You might want to consider getting TextExpander (also see the update below). This software lets you define keystroke shortcuts that work in OS X applications. Consequently, you can set up a shortcut such that when you're in R or RStudio and you type ";load", it is automatically replaced by the two lines above (";load" is only my suggestion; you can define any shortcut you like). This is a quick and easy way to load files and takes much of the hassle out of this step.

Update: August 28, 2015
@hadleywickham pointed out to me that RStudio also has built-in code snippets that work similarly to TextExpander (thanks!). You can read more about them here. I gave them a quick try this morning and they work very well. They appear to work in the script window of RStudio but not in the console window, which isn't an issue for me since I script everything. Hopefully you'll find this a helpful feature.
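For what it's worth, a snippet for the two file-selector lines might look roughly like the sketch below. The snippet name spssload is just an example I made up, and I am assuming the usual RStudio snippet format: a "snippet" line with the name, followed by tab-indented lines. The ${1:my.data} part is a placeholder you can overtype after the snippet expands.

```
snippet spssload
	library(haven)
	${1:my.data} <- read_spss(file.choose())
```

You would paste this into the R snippets file (in my version, under Tools > Global Options > Code > Edit Snippets), then type spssload in a script and press Tab to expand it.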
