Another Mystery: sas7bdat != sd2

October 14, 2011

(This article was first published on BioStatMatt » R, and kindly contributed to R-bloggers)

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2-formatted file is incompatible with the sas7bdat format (regular SAS users will probably know this already).

However, the structures of sd2- and sas7bdat-formatted files appear superficially similar. For example, the sd2 file seems to have a 'header' followed by 'pages' of data and metadata. Also, the metadata appear to be structured into 'subheaders', much like the metadata of sas7bdat files.
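For anyone who wants to repeat that first glance, here is a minimal sketch in Python. It hex-dumps the start of a file and checks whether it begins with the 32-byte sas7bdat magic number. The constant is taken from the reverse-engineered sas7bdat format notes and should be treated as an assumption to verify. The function names and the file name in the usage note are hypothetical, and nothing here encodes any knowledge of the sd2 layout.

```python
# Magic number at the start of a sas7bdat file, per the reverse-engineered
# format notes. This is an assumption; verify it before relying on it.
SAS7BDAT_MAGIC = bytes.fromhex(
    "000000000000000000000000c2ea8160"
    "b31411cfbd92080009c7318c181f1011"
)


def looks_like_sas7bdat(data: bytes) -> bool:
    """Return True if the data starts with the sas7bdat magic number."""
    return data[:len(SAS7BDAT_MAGIC)] == SAS7BDAT_MAGIC


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII."""
    pad = width * 3 - 1  # width of a full row of hex pairs
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


def peek(path: str, nbytes: int = 256) -> str:
    """Read the first nbytes of a file and summarize what is there."""
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    verdict = "matches" if looks_like_sas7bdat(head) else "does not match"
    return f"{path}: {verdict} the sas7bdat magic number\n{hexdump(head)}"
```

Calling `print(peek("old_data.sd2"))` on a file of your own (the name is a placeholder) shows the leading bytes side by side with any readable text, which is usually enough to spot where a header ends and repeating page-like blocks begin.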

Given these similarities, and that [good] software developers will tend to reuse code and concepts, I think the sd2 mystery would crack fairly easily, if someone could devote the effort. Since the format is obsolete, this might be a good project, perhaps for a CS [graduate] student, or another computer-savvy student. I'd be happy to facilitate this if someone is interested.
