Articles by BioStatMatt

Why balloons are better than balls (in urn schemes)

November 18, 2011 | BioStatMatt

The following is taken from a work in progress: The Polya urn is a heuristic associated with Dirichlet process mixtures. We present the scheme in a modified format, using balloons instead of balls, where the probability of drawing a balloon from the urn is proportional to its volume. Balloons are ... [Read more...]
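As a rough illustration of volume-proportional drawing, here is a minimal sketch in Python. The excerpt is truncated, so the update rule is an assumption and not necessarily the scheme in the post: a hypothetical "base" balloon of volume `alpha` stands in for the base measure, drawing it adds a new unit-volume balloon, and drawing any other balloon inflates it by one unit. The names `balloon_urn`, `alpha`, and `seed` are illustrative only.

```python
import random


def balloon_urn(n_draws, alpha, seed=0):
    """Simulate a Polya-urn-style scheme with volume-weighted draws.

    Assumption (not taken from the post): a 'base' balloon of volume
    alpha is always in the urn. Drawing it adds a new balloon of unit
    volume; drawing an existing balloon inflates it by one unit.
    """
    rng = random.Random(seed)
    volumes = []  # volumes[k] is the current volume of balloon k
    labels = []   # index of the balloon selected (or created) at each draw
    for _ in range(n_draws):
        # Index 0 is the base balloon; indices 1.. are existing balloons.
        weights = [alpha] + volumes
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == 0:
            volumes.append(1.0)
            labels.append(len(volumes) - 1)
        else:
            volumes[k - 1] += 1.0
            labels.append(k - 1)
    return labels, volumes


labels, volumes = balloon_urn(50, alpha=1.0)
print(len(volumes), "distinct balloons after", len(labels), "draws")
```

Under this assumed rule, the total volume of the non-base balloons always equals the number of draws, and larger balloons are more likely to be drawn again.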

Parameter vs. Observation Dimension?

October 24, 2011 | BioStatMatt

Bill Bolstad's response to Xi'an's review of his book Understanding Computational Bayesian Statistics included the following comment, which I found interesting: Frequentist p-values are constructed in the parameter dimension using a probability distribution defined only in the observation dimension. Bayesian credible intervals are constructed in the parameter dimension using a ... [Read more...]

Another Mystery: sas7bdat != sd2

October 14, 2011 | BioStatMatt

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2 formatted ... [Read more...]

A Note on Antoniak’s Approximation for Dirichlet Processes

September 21, 2011 | BioStatMatt

Antoniak's 1974 article titled Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems (Annals of Statistics 2(6):1152-1174) is a fundamental work for most modern developments in this area. The article gives two expressions for the expected number of distinct values in a sample of size n, drawn from a Dirichlet ... [Read more...]
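For orientation, the quantity discussed above can be computed directly. The sketch below assumes a Dirichlet process with concentration parameter `alpha` and uses two standard expressions: the exact sum of alpha / (alpha + i - 1) for i = 1..n, and the logarithmic approximation alpha * log((n + alpha) / alpha) commonly attributed to Antoniak. The excerpt is truncated, so whether these match the article's two expressions term for term is an assumption; the function names are illustrative.

```python
import math


def expected_distinct_exact(n, alpha):
    """Exact expected number of distinct values in n draws:
    sum over i = 1..n of alpha / (alpha + i - 1)."""
    return sum(alpha / (alpha + i) for i in range(n))


def expected_distinct_approx(n, alpha):
    """Logarithmic approximation: alpha * log((n + alpha) / alpha)."""
    return alpha * math.log((n + alpha) / alpha)


for n in (10, 100, 1000):
    exact = expected_distinct_exact(n, alpha=1.0)
    approx = expected_distinct_approx(n, alpha=1.0)
    print(n, round(exact, 3), round(approx, 3))
```

The approximation is the integral of a decreasing function whose left Riemann sum is the exact expression, so it sits below the exact value for every n and alpha > 0.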

More sas7bdat progress

September 13, 2011 | BioStatMatt

The development version of the read.sas7bdat function (in the sas7bdat package) now reads field labels and formats. In addition, errors of the type "found subheaders where 1 expected" are now a thing of the past. These improvements are largely due to work by Clint Cummins. The function also ... [Read more...]

The Open Governance Index: Results for The R Project

August 24, 2011 | BioStatMatt

Just over two weeks ago, I invited readers to complete the Open Governance Index (OGI) Questionnaire regarding The R Project. The OGI evaluates several facets of governance in open source projects (OGI publication). The OGI questionnaire is reproduced below, and each question is linked from the table of useR responses. ... [Read more...]

The Open Governing Index: How open is the R project?

August 8, 2011 | BioStatMatt

The Open Governance Index is a new measure developed by VisionMobile that rates open-source projects on their governance process. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. access - These criteria assess the availability of source code, a permissive license, developer support ... [Read more...]

Outlier Detection with DPM Slides from JSM 2011

August 5, 2011 | BioStatMatt

Here are the 14 slides I used during my talk at the Joint Statistical Meetings 2011: shotwell-jsm-2011.pdf. I'm trying hard to minimize the text in my presentation slides. But, this usually requires that I practice more. Hence, you will know which talks I have practiced thoroughly by the amount of text ... [Read more...]

Prepping for useR! 2011 – tty connection update

July 22, 2011 | BioStatMatt

I'm putting together my presentation for useR! 2011 titled "Experimenting with a tty connection for R". Hence, I've updated the tty connection patch to work with R versions 2.13.0 and 2.13.1. And, instead of re-listing the patch files and re-writing instructions on their application, I've devoted a small portion of my Code page ... [Read more...]

sas7bdat database reader update

June 14, 2011 | BioStatMatt

An earlier post (1216) introduced a compatibility study (i.e. reverse engineering) of the sas7bdat database file format. The code and documentation for this are here: http://github.com/biostatmatt/sas7bdat. I've recently restructured the code as an R package, and added some functionality. Look for the sas7bdat ... [Read more...]

David Banks on Reproducible Research

June 8, 2011 | BioStatMatt

Just got an email linking to Reproducible Research: A Range of Response, in the new journal Statistics, Politics, and Policy 2(1) by David Banks, who is also the journal's editor. Interestingly, the commentary doesn't mention the journal's policy (if one exists) on the reproducibility of research submitted there. Banks' writing is ... [Read more...]

Sweave diagram, following Knuth’s original

June 2, 2011 | BioStatMatt

In preparation for a talk, I updated the original diagram from Donald E. Knuth, "Literate programming," The Computer Journal, 27(2):97–111, May 1984. The new diagram is Sweave specific. Click the Sweave diagram for a PDF version, or right-click and select 'save image as' for the PNG version. Permission is granted for any ... [Read more...]

Comments on an R Connections API

May 9, 2011 | BioStatMatt

I wrote this post months ago but never hit 'Publish'. But, the subject has changed little since then. So, here's to cleaning out the draft folder... R's connections are the heart of data/code/text input and output. Without connections, R would be crippled. Additional connections make R more ... connected ... [Read more...]

Progress reading SAS sas7bdat files (natively) in R

April 18, 2011 | BioStatMatt

This post describes some preliminary results from a compatibility study of the SAS sas7bdat file format. The most current results are stored in a GitHub repository here: sas7bdat. The ultimate goal is a native solution to the incompatibility between open-source statistical software (e.g. R) and sas7bdat database ... [Read more...]

Some comments on peer-review and a year of blogging

April 13, 2011 | BioStatMatt

It's been a year since I began keeping a web log. This post presents some thoughts related to the experience. Blogging is Sharing Ideas: Blogging is online self-publishing. There is no faster way to share your ideas so broadly. Last year at the useR! conference (in Gaithersburg, MD, just a ... [Read more...]