Another Mystery: sas7bdat != sd2

October 14, 2011

(This article was first published on BioStatMatt » R, and kindly contributed to R-bloggers)

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that the sd2 format is incompatible with the sas7bdat format (regular SAS users will probably know this already).

However, the structures of sd2 and sas7bdat formatted files appear superficially similar. For example, the sd2 file seems to have a ‘header’ followed by ‘pages’ of data and metadata. Also, the metadata appear to be structured into ‘subheaders’, much like the metadata of sas7bdat files.
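For anyone who wants to take a similar first look, here is a minimal sketch in Python of the kind of inspection described above: dump the leading bytes of each file and count how many leading bytes the two share. The file names and the helper functions (read_head, hex_dump, common_prefix_length) are hypothetical and only for illustration; nothing here encodes actual knowledge of either format.

```python
from pathlib import Path


def read_head(path, n=256):
    """Return the first n bytes of the file at path."""
    with open(path, "rb") as fh:
        return fh.read(n)


def hex_dump(data, width=16):
    """Format bytes as offset, hex, and printable-ASCII columns."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def common_prefix_length(a, b):
    """Count how many leading bytes two byte strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names; substitute real files to try this.
    old_path, new_path = Path("example.sd2"), Path("example.sas7bdat")
    if old_path.exists() and new_path.exists():
        old_head, new_head = read_head(old_path), read_head(new_path)
        print(hex_dump(old_head))
        print(hex_dump(new_head))
        print("shared leading bytes:", common_prefix_length(old_head, new_head))
```

A short or empty shared prefix would only say that the two headers begin differently. Similarities in the page and subheader layout would still need to be checked by hand further into the file.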

Given these similarities, and that [good] software developers will tend to reuse code and concepts, I think the sd2 mystery would crack fairly easily, if someone could devote the effort. Since the format is obsolete, this might be a good project, perhaps for a CS [graduate] student, or another computer-savvy student. I’d be happy to facilitate this if someone is interested.
