Blog Archives

What’s Next

May 9, 2013

The last two weeks have been full of changes for me. For those who’ve been asking about what’s next, I thought I’d write up a quick summary of all the news. (1) I successfully defended my thesis this past Monday. Completing a Ph.D. has been a massive undertaking for the past five years, and it’s

Read more »

Using Norms to Understand Linear Regression

March 22, 2013

Introduction

In my last post, I described how we can derive modes, medians and means as three natural solutions to the problem of summarizing a list of numbers, \((x_1, x_2, \ldots, x_n)\), using a single number, \(s\). In particular, we measured the quality of different potential summaries in three different ways, which led us to

Read more »
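
As a hedged sketch of the three quality measures mentioned in the excerpt above: the excerpt is cut off before naming them, so the notation \(E_0, E_1, E_2\) is an assumption made here for illustration and not necessarily the post's own. The standard correspondence between discrepancy measures and summary statistics is:

\[
\begin{aligned}
E_0(s) &= \sum_{i=1}^{n} |x_i - s|^0 && \text{minimized by a mode (with the convention } 0^0 = 0\text{)} \\
E_1(s) &= \sum_{i=1}^{n} |x_i - s|^1 && \text{minimized by a median} \\
E_2(s) &= \sum_{i=1}^{n} |x_i - s|^2 && \text{minimized by the mean}
\end{aligned}
\]

For example, with the list \((1, 1, 2, 6)\), \(E_0\) is smallest at \(s = 1\) (the mode), \(E_1\) takes its minimum value of 6 for every \(s\) between 1 and 2 (which includes the conventional median, 1.5), and \(E_2\) is smallest at \(s = 2.5\) (the mean).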

Modes, Medians and Means: A Unifying Perspective

March 22, 2013

Introduction / Warning

Any traditional introductory statistics course will teach students the definitions of modes, medians and means. But, because introductory courses can’t assume that students have much mathematical maturity, the close relationship between these three summary statistics can’t be made clear. This post tries to remedy that situation by making it clear that all

Read more »

Writing Better Statistical Programs in R

January 24, 2013

A while back a friend asked me for advice about speeding up some R code that they’d written. Because they were running an extensive Monte Carlo simulation of a model they’d been developing, the poor performance of their code had become an impediment to their work. After I looked through their code, it was clear

Read more »

Americans Live Longer and Work Less

January 21, 2013

Today I saw an article on Hacker News entitled “America’s CEOs Want You to Work Until You’re 70”. I was particularly surprised by this article appearing out of the blue because I take it for granted that America will eventually have to raise the retirement age to avoid bankruptcy. After reading the article, I wasn’t

Read more »

Symbolic Differentiation in Julia

January 7, 2013

A Brief Introduction to Metaprogramming in Julia

In contrast to my previous post, which described one way in which Julia allows (and expects) the programmer to write code that directly employs the atomic operations offered by computers, this post is meant to introduce newcomers to some of Julia’s higher level functions for metaprogramming. To make

Read more »

Computers are Machines

January 3, 2013

When people try out Julia for the first time, many of them are worried by the following example:

    julia> factorial(n) = n == 0 ? 1 : n * factorial(n - 1)

    julia> factorial(20)
    2432902008176640000

    julia> factorial(21)
    -4249290049419214848

If you’re not familiar with computer architecture, this

Read more »
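
The negative result in the factorial example above is consistent with wraparound in a 64-bit signed integer. That word size is an assumption made here, since the excerpt is cut off before giving its own explanation. Under that assumption, the arithmetic can be checked by hand:

\[
\begin{aligned}
2^{63} - 1 &= 9223372036854775807 \\
20! &= 2432902008176640000 \le 2^{63} - 1 \\
21! &= 51090942171709440000 > 2^{63} - 1 \\
21! - 3 \cdot 2^{64} &= 51090942171709440000 - 55340232221128654848 = -4249290049419214848
\end{aligned}
\]

So \(20!\) is the largest factorial that fits in that range, while \(21!\) is reduced modulo \(2^{64}\) into the representable range, which yields exactly the negative value printed above.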

What is Correctness for Statistical Software?

December 14, 2012

Introduction

A few months ago, Drew Conway and I gave a webcast that tried to teach people about the basic principles behind linear and logistic regression. To illustrate logistic regression, we worked through a series of progressively more complex spam detection problems. The simplest data set we used was the following: This data set has

Read more »

A Cheap Criticism of p-Values

December 6, 2012

One of these days I am going to finish my series on problems with how NHST is used in the social sciences. Until then, I came up with a cheap criticism of p-values today. To make sense of my complaint, you’ll want to head over to Andy Gelman’s blog and read the comments on his

Read more »

The State of Statistics in Julia

December 2, 2012

Updated 12.2.2012: Added sample output based on a suggestion from Stefan Karpinski.

Introduction

Over the last few weeks, the Julia core team has rolled out a demo version of Julia’s package management system. While the Julia package system is still very much in beta, it nevertheless provides the first plausible way for non-expert users to

Read more »