# Read sas7bdat files in R with GGASoftware Parso library

[This article was first published on **BioStatMatt » R**, and kindly contributed to R-bloggers.]

… using the new R package `sas7bdat.parso`.

The software company GGASoftware has extended the work that others and I have done on the `sas7bdat` R package by developing a Java library called Parso, which also reads `sas7bdat` files. They have worked out most of the remaining kinks. For example, the Parso library reads `sas7bdat` files with compressed data (i.e., written with `COMPRESS=yes` or `COMPRESS=binary`). I hope to eventually bring the project full circle by incorporating their improvements into the `sas7bdat` file format documentation and the code in the `sas7bdat` package.

The Parso library is made available under the terms of the GPLv3, and is also available under a commercial license. So, last weekend, with the help of Tobias Verbeke’s `helloJavaWorld` R package template, I implemented an R package that wraps the functionality of the Parso library. The new package, `sas7bdat.parso` (currently hosted exclusively on GitHub), depends on the R package `rJava`, and implements the functions `s7b2csv` and `read.sas7bdat.parso`. The former function is the workhorse: it reads a `sas7bdat` file and writes a corresponding CSV file. All of the file input/output happens in the Java implementation (for speed and simplicity). The latter function, `read.sas7bdat.parso`, simply converts a `sas7bdat` file to a temporary CSV file (i.e., one created with `tempfile`), and then reads the CSV file using `read.csv`. There may still be some kinks in the Parso library, or in the wrapper R package, but I hope that this additional resource will help finally eliminate the SAS data file barrier that many of us have experienced for years.
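The convert-then-read pattern described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's actual source: `read_via_csv` and `fake_convert` are hypothetical names, and the stand-in converter simply writes a small fixed table so that the example runs without Java or a SAS file.

```r
# Generic "convert to a temporary CSV, then read it" pattern.
# `convert` is any function(input_path, output_path) that writes a CSV file.
read_via_csv <- function(file, convert, ...) {
  csv_file <- tempfile(fileext = ".csv")
  on.exit(unlink(csv_file), add = TRUE)  # remove the temporary file on exit
  convert(file, csv_file)
  read.csv(csv_file, ...)
}

# Hypothetical stand-in converter: it ignores the input path and writes
# a small fixed table, so the sketch does not need Java or a real file.
fake_convert <- function(input, output) {
  write.csv(data.frame(id = 1:3, x = c(0.5, 1.5, 2.5)),
            output, row.names = FALSE)
}

dat <- read_via_csv("example.sas7bdat", fake_convert)
str(dat)
```

With the package installed, one would presumably pass `s7b2csv` as the converter, assuming it takes the input and output paths as its first two arguments; check the package documentation before relying on that.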

Installation of the R package `rJava` is more complicated than simply calling `install.packages("rJava")`. For `rJava`, and hence `sas7bdat.parso`, to work, a JDK (Java Development Kit) must be installed. You can download the Oracle JDK from the Oracle website. Once the JDK is installed, the easiest way to install the `sas7bdat.parso` package is to use the `install_github` function in the `devtools` package (e.g., `library("devtools"); install_github("biostatmatt/sas7bdat.parso")`).
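Before installing, it can help to check whether a JDK is visible from R at all. The following is a rough heuristic, not something the package provides: `has_jdk` is a hypothetical helper that looks for the `javac` compiler on the `PATH` or under `JAVA_HOME`. `rJava` may locate Java by other means, so a negative result is a hint rather than proof. The installation calls are left commented out so that the snippet has no side effects.

```r
# Rough heuristic (an assumption, not part of rJava or sas7bdat.parso):
# is a Java compiler, which ships with a JDK, on the PATH or under JAVA_HOME?
has_jdk <- function() {
  on_path <- nzchar(Sys.which("javac"))
  java_home <- Sys.getenv("JAVA_HOME")
  in_home <- nzchar(java_home) &&
    any(file.exists(file.path(java_home, "bin", c("javac", "javac.exe"))))
  unname(on_path) || in_home
}

if (!has_jdk()) {
  message("No JDK found; install one before installing rJava.")
} else {
  message("JDK found; the commented lines below should now work.")
  # install.packages(c("rJava", "devtools"))
  # devtools::install_github("biostatmatt/sas7bdat.parso")
  # Assumes the file path is the first argument:
  # dat <- sas7bdat.parso::read.sas7bdat.parso("path/to/file.sas7bdat")
}
```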
