Blog Archives

Bayesian Nonparametrics in R

June 25, 2012

On July 25th, I’ll be presenting at the Seattle R Meetup about implementing Bayesian nonparametrics in R. If you’re not sure what Bayesian nonparametric methods are, they’re a family of methods that allow you to fit traditional statistical models, such as mixture models or latent factor models, without having to fully specify the number of

Read more »

The Great Julia RNG Refactor

June 21, 2012

Many readers of this blog will know that I’m a big fan of Bayesian methods, in large part because automated inference tools like JAGS allow modelers to focus on the types of structure they want to extract from data rather than worry about the algorithmic details of how they will fit their models to data.

Read more »

Criticism 4 of NHST: No Mechanism for Producing Substantive Cumulative Knowledge

May 18, 2012

In this fourth part of my series of criticisms of NHST, I’m going to focus on broad

Read more »

Criticism 3 of NHST: Essential Information is Lost When Transforming 2D Data into a 1D Measure

May 14, 2012

Introduction Continuing on with my series on the weaknesses of NHST, I’d like to focus on an issue that’s not specific to NHST, but rather one that’s relevant to all quantitative analysis: the destruction caused by an inappropriate reduction of dimensionality. In our case, we’ll be concerned with the loss of essential information caused by

Read more »

Criticism 2 of NHST: NHST Conflates Rare Events with Evidence Against the Null Hypothesis

May 12, 2012

Introduction This is my second post in a series describing the weaknesses of the NHST paradigm. In the first post, I argued that NHST is a dangerous tool for a community of researchers because p-values cannot be interpreted properly without perfect knowledge of the research practices of other scientists — knowledge that we cannot hope

Read more »

Criticism 1 of NHST: Good Tools for Individual Researchers are not Good Tools for Research Communities

May 10, 2012

Introduction Over my years as a graduate student, I have built up a long list of complaints about the use of Null Hypothesis Significance Testing (NHST) in the empirical sciences. In the next few weeks, I’m planning to publish a series of blog posts, each of which will articulate one specific weakness of NHST. The

Read more »

cumplyr: Extending the plyr Package to Handle Cross-Dependencies

May 3, 2012

Introduction For me, Hadley Wickham’s reshape and plyr packages are invaluable because they encapsulate omnipresent design patterns in statistical computing: reshape handles switching between the different possible representations of the same underlying data, while plyr automates what Hadley calls the Split-Apply-Combine strategy, in which you split up your data into several subsets, perform some computation

Read more »

Implementing the Exact Binomial Test in Julia

April 14, 2012

One major benefit of spending my time recently adding statistical functionality to Julia is that I’ve learned a lot about the inner guts of algorithmic null hypothesis significance testing. Implementing Welch’s two-sample t-test last week was a trivial task because of the symmetry of the null hypothesis, but implementing the exact binomial test has proven

Read more »

Floating Point Arithmetic and The Descent into Madness

April 13, 2012

While I should confess upfront that I’ve always had a weaker command of the details of floating point arithmetic than I feel I ought to have, this sort of thing still blows my mind when I stumble upon it. These moments invariably make me realize that floating point math will simply never satisfy my naive

Read more »

Comparing Julia and R’s Vocabularies

April 9, 2012

While exploring the Julia manual recently, I realized that it might be helpful to put the basic vocabularies of Julia and R side-by-side for easy comparison. So I took Hadley Wickham’s R Vocabulary section from the book he’s putting together on the devtools wiki, put all of the functions Hadley listed into a CSV file,

Read more »