In this blog article from October 2014, we gave a brief introduction to translate2R and explained the main reasons for using it as well as its advantages. Now we want to showcase the inner workings of the migration process, with a focus on the R environment and the accompanying R package translateSPSS2R. We will also explain our motivation for providing the data science community with a one-click solution for migrating from SPSS to R.
General Motivation – Why would someone consider transferring from SPSS to R?
Companies and public institutions still work with SPSS, but many of them might consider moving to state-of-the-art analytic technologies.
There are quite a few alternatives to SPSS, but to us the most promising is the R statistical language: its close ties to the data science community keep it current in a way that sets it apart from other analytical solutions. Additionally, the ever-growing number of extension packages makes R a highly flexible data science environment that covers an expanding scope of analytical functions. Since R is open source, all of this comes essentially without licensing costs. A comprehensive overview of the advantages of R can be found here.
translate2R – one click lowers the threshold of change
There are certain complications when someone tries to translate tens of thousands of lines of code from SPSS to R. First of all, the effort required is significant and hard to estimate in advance. Furthermore, a manual translation is error-prone, and writing a functionally equivalent R script can prove rather complex, for example because the two languages handle data attributes differently. It is difficult to compare the migrated script against the original code and challenging to debug it when inconsistencies appear. In addition, translate2R can support proof-of-concept projects for migrating proven SPSS code to R.
translate2R aims to solve this problem by automating the bulk of this workload: one button translates the majority of SPSS code into a fully functional R script.
From the outside, translate2R presents itself as a two-pane web front end, with one window containing the input as SPSS syntax and the other containing the output as R script. After you paste, type, or import SPSS syntax into the input section, a click on the “translate” button automatically creates an equivalent R script.
translateSPSS2R – a new R package enables R to unfold typical SPSS functions
To make this new R script work within the R environment, a new object type was created. This object type, called xpssFrame, carries attributes that are specific to common SPSS objects, enabling us to simulate the functional spectrum of SPSS within the R environment. For example, SPSS has two types of missing values, system-missing and user-defined, whereas base R has no built-in notion of user-defined missing values. Hence, an SPSS command that deals with user-defined missing values cannot be translated directly with common R objects, as they do not carry missing-value attributes.
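To illustrate the idea of carrying SPSS-style metadata on an R object, here is a minimal base R sketch. It is not the xpssFrame implementation: the helper names `set_user_missing` and `apply_user_missing`, the attribute name `user_missing`, and the sample values are all assumptions made for this example.

```r
# Hypothetical helpers for illustration only; not part of translateSPSS2R.

# Attach user-defined missing codes to a vector as an attribute.
set_user_missing <- function(x, codes) {
  attr(x, "user_missing") <- codes
  x
}

# Return a plain vector in which the user-defined codes are turned into NA.
apply_user_missing <- function(x) {
  codes <- attr(x, "user_missing")
  values <- as.vector(x)            # drop the custom attribute
  values[values %in% codes] <- NA   # user-defined missing -> NA
  values
}

# Made-up data: -9 and -8 stand for "refused" and "don't know",
# NA plays the role of the SPSS system-missing value.
income <- set_user_missing(c(2500, -9, 3100, NA, -8), codes = c(-9, -8))

mean(income, na.rm = TRUE)                      # 1395.75, codes counted as data
mean(apply_user_missing(income), na.rm = TRUE)  # 2800, codes treated as missing
```

The point of the sketch is that the missing-value codes travel with the data as an attribute, so a function that knows about the attribute can honor them while the original coded values remain available.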
Consequently, the translateSPSS2R package for R contains R functions equivalent to most of the core SPSS functions. These functions, which are able to deal with the migrated SPSS structure, generally start with the prefix xpss.
Here, we have a short example to compare SPSS with R script depicting the clear analogy of functions:
translate2R is still under development, so which SPSS functions can already be used in R? And what is the added value of using them in R?
As the majority of existing SPSS code consists of data-management commands like labeling, categorizing, recoding and computing new variables, we focused on this area at the beginning.
For the future, we plan to implement the core analytical functions of base SPSS. However, we also believe that the transfer of SPSS syntax to R opens up a plethora of new possibilities and angles to apply advanced analytics to your data.
Exemplary Translation of SPSS to R
The data set at hand contains various car models from three different continents, along with their sales price, resale value, and total sales figures. It also contains characteristics of the car models, such as horsepower and capacity.
Here, we read an SPSS data file into R and create a new variable from the sum of three existing variables. The newly created variable is then renamed, labeled, and recoded. Next, data management functions are applied to reduce the data to a subset, and descriptive statistics are produced for that subset.
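As a rough orientation, the steps just described can be sketched in plain base R. This is not the output of translate2R and does not use the xpss functions of translateSPSS2R; the data frame, its column names, the category cut-off, and all values are invented for illustration. In practice the data would come from an SPSS file, for instance via `read.spss()` from the foreign package.

```r
# Toy stand-in for the car data; every name and value here is made up.
cars <- data.frame(
  model     = c("A", "B", "C", "D"),
  continent = c(1, 2, 3, 1),        # hypothetical numeric continent codes
  price     = c(20, 35, 15, 50),
  resale    = c(12, 20, 8, 30),
  sales     = c(100, 40, 250, 10),
  stringsAsFactors = FALSE
)

# Compute: new variable as the sum of three existing variables.
cars$total <- cars$price + cars$resale + cars$sales

# Rename the new variable.
names(cars)[names(cars) == "total"] <- "value_sum"

# Label: store a variable label as an attribute on the column.
attr(cars$value_sum, "label") <- "Sum of price, resale value and sales"

# Recode: collapse the numeric variable into two categories.
cars$value_class <- cut(cars$value_sum,
                        breaks = c(-Inf, 100, Inf),
                        labels = c("low", "high"))

# Subset: keep only the cars from continent code 1.
selected <- subset(cars, continent == 1)

# Descriptive statistics on the reduced data.
summary(selected$value_sum)
table(selected$value_class)
```

Note that base R keeps the variable label only as a loose attribute, which is dropped again by the row subsetting; preserving this kind of metadata through data management steps is the sort of gap a dedicated object type is meant to close.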
To Sum Up
All in all, the development of translate2R was motivated by our wish to open up the possibilities of R to SPSS users by providing them with an efficient, automated one-click solution. To achieve this, we had to develop the R package translateSPSS2R with the object type xpssFrame, which simulates SPSS-typical data properties in R as closely as possible. This object enables us to transfer the rationale of SPSS data analysis to R.
We would like to invite you to experience translate2R and the possibilities of our R package translateSPSS2R on our website.