Oftentimes we find ourselves collaborating with others who might not use R or who prefer to use Stata to clean and manage their data. Luckily, the foreign package makes it possible to handle data in other formats (SAS, SPSS, Stata, etc.) within the R environment. The documentation can be found here:
In today’s gist, I’ll show how to do the two most basic things one would probably want to do with a Stata (.dta) file: read it into R and write a data frame from R into a new .dta file. The foreign package makes this very easy to do. In this code, the functions resolve the .dta file names relative to your current working directory, so please set the directory accordingly or specify the complete file path.
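As a quick sketch of the working-directory point, the lines below check where R is currently looking and build a full path to the file. The setwd() path is purely hypothetical, so it is left commented out; “stata.dta” is simply the file name used in this post.

```r
# Show the directory that relative file names are resolved against.
getwd()

# Hypothetical project folder; uncomment and adjust to your own setup.
# setwd("path/to/your/project")

# Alternatively, build the complete path and pass that to the functions instead.
dta_path <- file.path(getwd(), "stata.dta")

# TRUE only if the file actually exists at that location.
file.exists(dta_path)
```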
The first command you’ll want to use from within R is read.dta(), which loads a Stata dataset. Here I’m using a small subset of the 2010 CCES that I have saved as “stata.dta”.
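Since the CCES subset itself isn’t bundled with this post, here is a minimal sketch that uses a made-up stand-in: a tiny data frame (the variable names and values are invented for illustration) is first written to a temporary “stata.dta” so that the read.dta() call has something to load.

```r
library(foreign)

# Hypothetical stand-in for the CCES subset (names and values are invented).
toy <- data.frame(
  id        = 1:4,
  gender    = factor(c("Male", "Female", "Female", "Male")),
  vote_2008 = factor(c("Obama", "McCain", "Obama", "Obama"))
)

# Write it to a temporary .dta file so the example is self-contained.
path <- file.path(tempdir(), "stata.dta")
write.dta(toy, path)

# Read the Stata file into a data frame; by default, variables with
# Stata value labels come back as factors.
dat <- read.dta(path)
str(dat)
```

With your own data you would skip the toy data frame and call read.dta() directly on your file.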
As you can see, six of the seven variables in the data are factors. While factors are useful, we can sometimes avoid the frustrations of working with them by using the “convert.factors=” option; when convert.factors is FALSE, R keeps the underlying numeric values from Stata instead of turning labelled variables into factors. The values can be checked in Stata by running tabulate with its nolabel option, as in “tab Var, nolab”:
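To illustrate the difference, here is a small sketch that reads the same file twice, once with the default and once with convert.factors = FALSE. The one-column data frame is invented for the example and is not the CCES data.

```r
library(foreign)

# Hypothetical labelled variable, written out so it carries Stata value labels.
toy <- data.frame(gender = factor(c("Male", "Female", "Female", "Male")))
path <- file.path(tempdir(), "factors.dta")
write.dta(toy, path)

# Default: value labels are converted to factor levels.
labelled <- read.dta(path)

# convert.factors = FALSE: keep the underlying numeric codes instead.
codes <- read.dta(path, convert.factors = FALSE)

class(labelled$gender)
class(codes$gender)
```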
One other useful option within the read.dta() command is “convert.underscore”, which replaces the underscores used in Stata variable names with periods:
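A short sketch of the effect on variable names, again using an invented data frame (birth_yr and vote_2008 are hypothetical names chosen only because they contain underscores):

```r
library(foreign)

# Hypothetical variables whose names contain underscores.
toy <- data.frame(birth_yr = c(1970L, 1985L), vote_2008 = c(1L, 0L))
path <- file.path(tempdir(), "underscore.dta")
write.dta(toy, path)

# Default: names are kept as they appear in the Stata file.
as_is <- read.dta(path)

# convert.underscore = TRUE: every "_" in a variable name becomes ".".
dotted <- read.dta(path, convert.underscore = TRUE)

names(as_is)
names(dotted)
```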
Writing data files from R into Stata is also very straightforward. To save your data frame (DF) as a Stata file (fromR.dta), you simply use write.dta(DF, “fromR.dta”). My example below uses the line:
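As a minimal sketch of that call, assuming a made-up data frame named DF and writing to R’s temporary directory rather than the working directory so nothing on disk gets overwritten:

```r
library(foreign)

# Hypothetical data frame standing in for your own data.
DF <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Write it out as a Stata file; the temporary directory is used here only
# to keep the example tidy.
out <- file.path(tempdir(), "fromR.dta")
write.dta(DF, out)

# Confirm the file was created, then read it back as a sanity check.
file.exists(out)
check <- read.dta(out)
dim(check)
```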
Of course, there are some additional options specifying how to deal with factors and dates, but those are discussed in the package documentation linked above.
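For a flavor of those options, here is a hedged sketch using two write.dta() arguments, convert.factors and convert.dates. The data frame is invented, and the package documentation remains the place to check the full set of choices and their exact behavior.

```r
library(foreign)

# Hypothetical data frame with one factor and one Date column.
DF <- data.frame(
  party = factor(c("Dem", "Rep", "Ind")),
  when  = as.Date(c("2010-11-02", "2010-11-03", "2010-11-04"))
)
out <- file.path(tempdir(), "options.dta")

# convert.factors = "string" stores the factor as a Stata string variable
# rather than as numeric codes with value labels; convert.dates = TRUE
# (the default) writes Date columns as Stata dates.
write.dta(DF, out, convert.factors = "string", convert.dates = TRUE)

back <- read.dta(out)
str(back)
```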
Once you open the file in Stata, you will see that it was written by R:
Full code is below. Enjoy: