Reading/Writing Stata (.dta) files with Foreign

[This article was first published on is.R(), and kindly contributed to R-bloggers.]

Oftentimes we find ourselves collaborating with others who do not use R, or who prefer to clean and manage their data in Stata. Luckily, the foreign package lets us handle data stored in other formats (SAS, SPSS, Stata, etc.) from within R. The documentation can be found here:

http://cran.r-project.org/web/packages/foreign/foreign.pdf

In today’s gist, I’ll show how to do the two most basic things one would probably want to do with a Stata (.dta) file: read it into R, and write a data frame from R into a new .dta file. The foreign package makes both very easy. Unless you give a complete file path, the functions look for the .dta files in your current working directory, so either set the working directory accordingly or specify the full path.

read.dta()

The first command you’ll want to use from within R is read.dta(), which loads a Stata dataset into a data frame. Here I’m using a small subset of the 2010 CCES that I have saved as “stata.dta”.
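Since the CCES subset is not included here, the following is only a minimal sketch of the same call. The toy data frame and its column names are made up for illustration, and the file is first written to a temporary folder so the example runs on its own.

```r
library(foreign)

# Hypothetical stand-in for the post's "stata.dta": a tiny data frame
# saved as a .dta file in a temporary directory.
toy <- data.frame(
  id     = 1:4,
  gender = factor(c("Male", "Female", "Female", "Male")),
  age    = c(34, 51, 28, 45)
)
path <- file.path(tempdir(), "stata.dta")
write.dta(toy, path)

# read.dta() returns an ordinary data frame; by default, variables
# that carry Stata value labels come back as factors.
STATA <- read.dta(path)
str(STATA)
```

With a real file you would skip the setup and call read.dta("stata.dta") directly, provided the file sits in your working directory.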

As you can see, six of the seven variables in the data are factors. While factors are sometimes useful, we can avoid some of the frustrations of working with them by using the “convert.factors=” option: when convert.factors is FALSE, R keeps the underlying numeric value stored in Stata instead of turning the labelled variable into a factor. The underlying values can be seen in Stata with “tab Var, nolab”.
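To see the difference, here is a small sketch that reads the same file twice. The toy variable is made up, and the file is created on the fly as a stand-in for a real Stata dataset with value labels.

```r
library(foreign)

# Hypothetical labelled variable, written out so there is a file to read.
toy <- data.frame(gender = factor(c("Male", "Female", "Female", "Male")))
path <- tempfile(fileext = ".dta")
write.dta(toy, path)

# Default behaviour: value labels become factor levels.
with_labels <- read.dta(path)
class(with_labels$gender)

# convert.factors = FALSE: keep the numeric codes stored in the file.
codes_only <- read.dta(path, convert.factors = FALSE)
class(codes_only$gender)
table(codes_only$gender)
```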

One other useful option within the read.dta() command is “convert.underscore”, which replaces the underscores used in Stata variable names with periods.
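A quick sketch of what that looks like, again with made-up variable names and a temporary file standing in for a real Stata dataset:

```r
library(foreign)

# Hypothetical variables with Stata-style underscores in their names.
toy <- data.frame(birth_year = c(1976, 1959), vote_2008 = c(1, 0))
path <- tempfile(fileext = ".dta")
write.dta(toy, path)

# Default: names come through with their underscores.
names(read.dta(path))

# convert.underscore = TRUE: "_" in variable names becomes ".".
dotted <- read.dta(path, convert.underscore = TRUE)
names(dotted)
```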

write.dta()


Writing data files from R into Stata is also very straightforward. To save your data frame (DF) as a Stata file (fromR.dta), you simply use write.dta(DF, "fromR.dta"). My example below uses the line:

write.dta(STATA, "fromR.dta")
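Here is a self-contained sketch of that call. The STATA data frame below is a made-up placeholder for the CCES subset, and I write to a temporary folder rather than the working directory so nothing gets overwritten.

```r
library(foreign)

# Hypothetical placeholder for the data frame used in the post.
STATA <- data.frame(
  id     = 1:3,
  party  = factor(c("Dem", "Rep", "Ind")),
  income = c(42000, 58000, 37500)
)

# Write the data frame out as a Stata file.
out <- file.path(tempdir(), "fromR.dta")
write.dta(STATA, out)

# Quick sanity check: the file exists and reads back with the same shape.
file.exists(out)
dim(read.dta(out))
```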

Of course, there are some additional options specifying how to deal with factors and dates, but those are discussed in the package documentation linked above.
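As one small illustration of those options, the sketch below uses the convert.factors argument of write.dta() to store a factor as a plain string variable instead of a labelled numeric one. The data are made up, and the date handling (the convert.dates argument) is not shown here.

```r
library(foreign)

# Hypothetical data frame with a single factor column.
DF <- data.frame(party = factor(c("Dem", "Rep", "Ind", "Dem")))
out <- tempfile(fileext = ".dta")

# convert.factors = "string" writes the factor as a string variable
# rather than as numeric codes with value labels (the default, "labels").
write.dta(DF, out, convert.factors = "string")

# Reading the file back, the column is now character, not a factor.
back <- read.dta(out)
class(back$party)
```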

Once you open the file in Stata, you will see that it was written by R.
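If you do not have Stata at hand, a rough way to check the same thing from R is to read the file back and look at the data label that read.dta() attaches as an attribute. This is only a sketch with a made-up one-column data frame.

```r
library(foreign)

# Write a tiny hypothetical data frame and read it straight back.
out <- file.path(tempdir(), "fromR.dta")
write.dta(data.frame(x = 1:3), out)
back <- read.dta(out)

# The data label stored in the file header; Stata shows this
# label when the dataset is opened.
attr(back, "datalabel")
```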

Full code is below, enjoy:
