Blog Archives

Another Mystery: sas7bdat != sd2

October 14, 2011

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2-formatted file is incompatible with the

Read more »

A Note on Antoniak’s Approximation for Dirichlet Processes

September 21, 2011

Antoniak's 1974 article titled Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems (Annals of Statistics 2(6):1152-1174) is a fundamental work for most modern developments in this area. The article gives two expressions for the expected number of distinct values in a sample of size n, drawn from a Dirichlet process-distributed probability distribution with

Read more »
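
As a rough illustration of the quantity mentioned in the excerpt above, here is a minimal Python sketch of the expected number of distinct values in a sample of size n from a Dirichlet process. It assumes the process has precision (concentration) parameter alpha. The function names are hypothetical, and because the excerpt is truncated, it is an assumption that the exact sum and the logarithmic approximation shown here are the two expressions the full post discusses.

```python
import math


def expected_distinct_exact(n: int, alpha: float) -> float:
    """Exact expectation: sum over i = 1..n of alpha / (alpha + i - 1)."""
    return sum(alpha / (alpha + i - 1) for i in range(1, n + 1))


def expected_distinct_approx(n: int, alpha: float) -> float:
    """Logarithmic approximation: alpha * log((n + alpha) / alpha)."""
    return alpha * math.log((n + alpha) / alpha)


if __name__ == "__main__":
    # Compare the two expressions for a few illustrative settings.
    for n, alpha in [(10, 1.0), (100, 1.0), (100, 5.0)]:
        exact = expected_distinct_exact(n, alpha)
        approx = expected_distinct_approx(n, alpha)
        print(f"n={n}, alpha={alpha}: exact={exact:.4f}, approx={approx:.4f}")
```

The approximation is the integral of the same decreasing summand, so it never exceeds the exact sum. The gap between the two is at most 1 for any n and alpha, so it matters most when the expected count itself is small.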

More sas7bdat progress

September 13, 2011

The development version of the read.sas7bdat function (in the sas7bdat package) now reads field labels and formats. In addition, errors of the type "found <x> <type> subheaders where 1 expected" are now a thing of the past. These improvements are largely due to work by Clint Cummins. The function also works on some files generated

Read more »

The Open Governance Index: Results for The R Project

August 24, 2011

Just over two weeks ago, I invited readers to complete the Open Governance Index (OGI) Questionnaire regarding The R Project. The OGI evaluates several facets of governance in open source projects (OGI publication). The OGI questionnaire is reproduced below, and each question is linked from the table of useR responses. The table below presents the

Read more »

tty Connection + sas7bdat: useR! 2011 Presentation Slides

August 21, 2011

Experimenting with a tty Connection for R: I presented twice at this year's useR! conference. The first was a regular talk on the tty connection patch for R. The talk went smoothly, despite a live demonstration using the DLP-232PC data acquisition module (datasheet). The slides for this presentation are here: shotwell-tty-useR-2011.pdf. The image above is a

Read more »

The Open Governance Index: How open is the R project?

August 8, 2011

The Open Governance Index is a new measure developed by VisionMobile that rates open-source projects on their governance processes. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. access: These criteria assess the availability of source code, a permissive license, developer support mechanisms, a roadmap, and openness

Read more »

Outlier Detection with DPM Slides from JSM 2011

August 5, 2011

Here are the 14 slides I used during my talk at the Joint Statistical Meetings 2011: shotwell-jsm-2011.pdf. I'm trying hard to minimize the text in my presentation slides, but this usually requires that I practice more. Hence, you will know which talks I have practiced thoroughly by the amount of text in the slides.

Read more »

sas7bdat reader ported to ActionScript

July 25, 2011

By Brian Kimball: http://code.google.com/p/sasquatch

Read more »

Prepping for useR! 2011 – tty connection update

July 22, 2011

I'm putting together my presentation for useR! 2011 titled "Experimenting with a tty connection for R". Hence, I've updated the tty connection patch to work with R versions 2.13.0 and 2.13.1. And, instead of re-listing the patch files and re-writing instructions on their application, I've devoted a small portion of my Code page for this

Read more »

Slides for Reproducible Research Talk at Interface 2011

July 20, 2011

I gave a talk at the Interface Symposium on reproducible research in practice. I went first in the session, so the slides have a bit more background and philosophy. It was a great session; one of Jon Claerbout's colleagues spoke, Sergey Fomel, a founding author of Madagascar; Sorin Mitran from UNC Chapel Hill talked about

Read more »