sas7bdat database reader update

June 14, 2011
(This article was first published on BioStatMatt » R, and kindly contributed to R-bloggers)

An earlier post (1216) introduced a compatibility study (i.e., reverse engineering) of the sas7bdat database file format. The code and documentation for this study are here: http://github.com/biostatmatt/sas7bdat. I've recently restructured the code as an R package and added some functionality. Look for the sas7bdat package on CRAN. Also, the read.sas7bdat code has been ported to a Java framework by Kasper Sørensen, available under the LGPL: http://eobjects.org/svn/SassyReader/trunk/.

The read.sas7bdat function now returns a data frame with a column.info attribute, which describes the database fields. The column.info attribute is a list of lists, one inner list per field. Each inner list contains zero or more of the following elements:

  • name: The field name
  • label: The field label (usually a longer description)
  • offset: The field offset in packed binary row data (bytes)
  • length: The field length (bytes)
  • type: The field type, either 'character' or 'numeric'
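As a rough sketch of how the column.info attribute might be inspected, the example below flattens the list of lists into a single table. To keep it runnable without a sas7bdat file, it builds a hypothetical stand-in for the data frame that read.sas7bdat would return; the field names and values, and the helper get_key, are invented for illustration and are not part of the package.

```r
# Hypothetical stand-in for the result of read.sas7bdat("some_file.sas7bdat").
# The fields and their values are made up for this example.
dat <- data.frame(id = c(1, 2), site = c("A", "B"), stringsAsFactors = FALSE)
attr(dat, "column.info") <- list(
  list(name = "id", offset = 0L, length = 8L, type = "numeric"),
  list(name = "site", label = "Study site", offset = 8L, length = 1L,
       type = "character")
)

info <- attr(dat, "column.info")

# Illustrative helper (not from the package): pull one element from every
# field's list, returning NA where a field omits that element.
get_key <- function(info, key) {
  vapply(info, function(field) {
    value <- field[[key]]
    if (is.null(value)) NA_character_ else as.character(value)
  }, character(1))
}

# One row per field, one column per documented element.
field_table <- data.frame(
  name   = get_key(info, "name"),
  label  = get_key(info, "label"),
  type   = get_key(info, "type"),
  offset = as.integer(get_key(info, "offset")),
  length = as.integer(get_key(info, "length")),
  stringsAsFactors = FALSE
)
print(field_table)
```

Because each inner list may omit any element, the helper checks for NULL rather than assuming that every field has, say, a label.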

The document describing the sas7bdat binary format is included as a vignette (using rst2latex). Here is a preview of the R package: sas7bdat_0.1.tar.gz. The package comes with a list of internet resources for sas7bdat test files (see data(sas7bdat.sources)).
