Technical

Resampling Hierarchically Structured Data Recursively

April 4, 2012 | BioStatMatt

That's a mouthful! I presented this topic to a group of Vandy statisticians a few days ago. My notes (essentially reproduced in this post) are recorded at the Dept. of Biostatistics wiki: HowToBootstrapCorrelatedData. The presentation covers some bootstrap strategies for hierarchically structured (correlated) data, but focuses on the multi-stage bootstrap; ... [Read more...]
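
As a rough illustration of the multi-stage idea named in this teaser (not the code from the post or the wiki notes), the sketch below resamples a nested list recursively: at each level it draws the children with replacement, then repeats the procedure inside each drawn child. It is written in Python rather than R, and the function name `resample` and the toy data are hypothetical. Whether every level should be resampled is a modeling choice that this sketch does not address.

```python
import random

def resample(node, rng):
    """Recursively resample a nested list with replacement.

    Hypothetical helper for illustration only. Lists are treated as
    groups (e.g. clusters), anything else as an observation.
    """
    if not isinstance(node, list):
        return node  # an observation: nothing to resample
    # Draw as many children as the group has, with replacement.
    picks = [rng.choice(node) for _ in node]
    # Repeat the same procedure inside each drawn child.
    return [resample(child, rng) for child in picks]

# Toy data: three clusters holding 3, 2, and 1 observations.
data = [[1, 2, 3], [10, 20], [100]]
print(resample(data, random.Random(0)))
```

Each call returns one bootstrap replicate with the same top-level size as the input; a statistic would be recomputed on many such replicates.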

useR! 2012 Simple Abstract Helper

January 3, 2012 | BioStatMatt

useR! 2012 has issued a call for abstracts! I've extended the WebSweave concept to offer a tool to create simple abstracts online, including those with markup, which may then be submitted at the conference website. Use the following link for the Simple Abstract Helper. [Read more...]

Mortgage Refinance Calculator

December 20, 2011 | BioStatMatt

Mortgage rates are low, considering historical rates for the last 50 years. It may be timely to consider a mortgage refinance. The image above links to a simple tool for exploring mortgage refinance, built using rapache and the yet-to-be-archived yarr package for R. Hence, there are now two mortgage-related calculators on ... [Read more...]

New Powerball (lottery) Rules Will Cost You More

December 16, 2011 | BioStatMatt

Popular news outlets are reporting [1,2,3,4,5] that the Multi-State Lottery Commission (MUSL) will change the rules for its lottery game Powerball, effective Jan. 15, 2012. I sent an email to the MUSL (at 8:00am, Dec. 14th) asking for the new official rules, but haven't received a response yet (as of 10:30am, Dec. 16th). ... [Read more...]

Why balloons are better than balls (in urn schemes)

November 18, 2011 | BioStatMatt

The below is taken from a work in progress: The Polya urn is a heuristic associated with Dirichlet process mixtures. We present the scheme in a modified format, using balloons instead of balls, where the probability of drawing a balloon from the urn is proportional to its volume. Balloons are ... [Read more...]

Parameter vs. Observation Dimension?

October 24, 2011 | BioStatMatt

Bill Bolstad's response to Xi'an's review of his book Understanding Computational Bayesian Statistics included the following comment, which I found interesting: Frequentist p-values are constructed in the parameter dimension using a probability distribution defined only in the observation dimension. Bayesian credible intervals are constructed in the parameter dimension using a ... [Read more...]

Another Mystery: sas7bdat != sd2

October 14, 2011 | BioStatMatt

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2 formatted ... [Read more...]

A Note on Antoniak’s Approximation for Dirichlet Processes

September 21, 2011 | BioStatMatt

Antoniak's 1974 article titled Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems (Annals of Statistics 2(6):1152-1174) is a fundamental work for most modern developments in this area. The article gives two expressions for the expected number of distinct values in a sample of size n, drawn from a Dirichlet ... [Read more...]
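
The teaser cuts off before the two expressions appear, so the sketch below assumes they are the standard ones: the exact sum of alpha / (alpha + i - 1) for i = 1..n, and the logarithmic approximation alpha * log((n + alpha) / alpha), where alpha is the Dirichlet process precision parameter. The function names are hypothetical, and Python is used here rather than R.

```python
import math

def expected_distinct_exact(n, alpha):
    """Exact expected number of distinct values among n draws.

    Assumed form: sum over i = 1..n of alpha / (alpha + i - 1).
    """
    return sum(alpha / (alpha + i) for i in range(n))

def expected_distinct_approx(n, alpha):
    """Logarithmic approximation, assumed form: alpha * log((n + alpha) / alpha)."""
    return alpha * math.log((n + alpha) / alpha)

# Compare the two expressions over a few sample sizes.
for n in (10, 100, 1000):
    print(n, expected_distinct_exact(n, 1.0), expected_distinct_approx(n, 1.0))
```

Because each term of the sum is decreasing in i, the sum is a left Riemann sum of the integral that yields the logarithm, so the approximation never exceeds the exact value.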

More sas7bdat progress

September 13, 2011 | BioStatMatt

The development version of the read.sas7bdat function (in the sas7bdat package) now reads field labels and formats. In addition, errors of the type "found subheaders where 1 expected" are now a thing of the past. These improvements are largely due to work by Clint Cummins. The function also ... [Read more...]

The Open Governance Index: Results for The R Project

August 24, 2011 | BioStatMatt

Just over two weeks ago, I invited readers to complete the Open Governance Index (OGI) Questionnaire regarding The R Project. The OGI evaluates several facets of governance in open source projects (OGI publication). The OGI questionnaire is reproduced below, and each question is linked from the table of useR responses. ... [Read more...]

The Open Governing Index: How open is the R project?

August 8, 2011 | BioStatMatt

The Open Governance Index is a new measure developed by VisionMobile that rates open-source projects on their governance process. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. access - These criteria assess the availability of source code, a permissive license, developer support ... [Read more...]

Outlier Detection with DPM Slides from JSM 2011

August 5, 2011 | BioStatMatt

Here are the 14 slides I used during my talk at the Joint Statistical Meetings 2011: shotwell-jsm-2011.pdf. I'm trying hard to minimize the text in my presentation slides. But, this usually requires that I practice more. Hence, you will know which talks I have practiced thoroughly by the amount of text ... [Read more...]

Prepping for useR! 2011 – tty connection update

July 22, 2011 | BioStatMatt

I'm putting together my presentation for useR! 2011 titled "Experimenting with a tty connection for R". Hence, I've updated the tty connection patch to work with R versions 2.13.0 and 2.13.1. And, instead of re-listing the patch files and re-writing instructions on their application, I've devoted a small portion of my Code page ... [Read more...]

sas7bdat database reader update

June 14, 2011 | BioStatMatt

An earlier post (1216) introduced a compatibility study (i.e. reverse engineering) of the sas7bdat database file format. The code and documentation for this are here: http://github.com/biostatmatt/sas7bdat. I've recently restructured the code as an R package, and added some functionality. Look for the sas7bdat ... [Read more...]

David Banks on Reproducible Research

June 8, 2011 | BioStatMatt

Just got an email linking to Reproducible Research: A Range of Response, in the new journal Statistics, Politics, and Policy 2(1) by David Banks, who is also the journal's editor. Interestingly, the commentary doesn't mention the journal's policy (if one exists) on the reproducibility of research submitted there. Banks' writing is ... [Read more...]

Sweave diagram, following Knuth’s original

June 2, 2011 | BioStatMatt

In preparation for a talk, I updated Knuth's original diagram in Donald E. Knuth. Literate programming. The Computer Journal, 27(2):97–111, May 1984. The new diagram is Sweave specific. Click the Sweave diagram for a PDF version, or right-click and select 'save image as' for the PNG version. Permission is granted for any ... [Read more...]