Reshape Wide to Long

Let's use the Loblolly dataset from the datasets package. These data track the growth of some loblolly pine trees.

> Loblolly
   height age Seed
1    4.51   3  301
15  10.89 ...
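As a rough illustration of the reshaping this post is about, here is a minimal sketch using base R's `reshape()`. Loblolly already ships in long format, so the sketch first spreads it to wide and then stacks it back; the intermediate names (`df`, `wide`, `height_cols`, `ages`, `long`) are my own and are not taken from the original post, which may well use a different approach.

```r
# Loblolly ships with R's datasets package: one row per (Seed, age) pair.
df <- as.data.frame(Loblolly)

# Long -> wide: one row per seed, one height column per age
# (height.3, height.5, ...).
wide <- reshape(df, v.names = "height", idvar = "Seed",
                timevar = "age", direction = "wide")

# Recover the ages from the generated column names rather than hard-coding them.
height_cols <- grep("^height\\.", names(wide), value = TRUE)
ages <- as.numeric(sub("^height\\.", "", height_cols))

# Wide -> long: stack the height.* columns back into a single height column.
long <- reshape(wide, direction = "long",
                varying = height_cols, v.names = "height",
                timevar = "age", times = ages, idvar = "Seed")

dim(wide)
dim(long)
```

Passing `varying`, `v.names`, and `times` explicitly avoids relying on the attributes that the first `reshape()` call attaches to `wide`, so the same call pattern should carry over to a wide data frame that came from somewhere else.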

One issue I continually encounter when starting to work with a new dataset is that of the codebook. In general, I prefer to load a codebook into R like any other data source, specifically as a data frame. And ideally, one data frame provides the variable names with descriptions and any other metadata available, and a separate...
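A minimal sketch of the idea of a codebook held as a data frame might look like the following. The `codebook` object, its column names, and the `describe()` helper are all hypothetical, chosen for illustration; the descriptions are paraphrased from the Loblolly help page, not from the post.

```r
# Hypothetical codebook: one row per variable in the dataset.
codebook <- data.frame(
  variable    = c("height", "age", "Seed"),
  description = c("Tree height (ft)",
                  "Tree age (yr)",
                  "Seed source for the tree"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the description for one or more variables.
describe <- function(vars) {
  codebook$description[match(vars, codebook$variable)]
}

describe("age")

# Sanity check: every column in the data has a codebook entry.
setdiff(names(Loblolly), codebook$variable)
```

Keeping the codebook as ordinary data means the usual tools (`match()`, `merge()`, subsetting) work on it, which is presumably the appeal of the approach.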

Factor Analysis of Baseball's Hall of Fame Voters

Recently, Nate Silver wrote a post analyzing how voters who voted for and against Barry Bonds for Baseball's Hall of Fame differed. Not surprisingly, those who voted for Bonds were more likely to vote for other suspected steroid users (like Roger Clemens). This got...

A trackback from Martin Hawksey’s recent post on Analysing WordPress post velocity and momentum stats with Google Sheets (Spreadsheet), which demonstrates how to pull WordPress stats into a Google Spreadsheet and generate charts and reports therein, reminded me of the WordPress stats API. So here’s a quick function for pulling WordPress reports into R. (Code

In two previous posts, I have written about how you can speed up your R computations either by using strange notation and non-standard functions or by compiling your code. Last year my department bought a 64-core computational server, which allowed me ...
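The excerpt is cut off before it says how the extra cores were used, so the following is only a guess at the kind of thing meant: a minimal sketch of spreading independent tasks over several cores with the base `parallel` package. The `slow_square()` function and the choice of two cores are assumptions made for illustration.

```r
library(parallel)

# Hypothetical stand-in for an expensive, independent computation.
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

# mclapply() forks worker processes; forking is unavailable on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

res <- unlist(mclapply(1:8, slow_square, mc.cores = n_cores))
res
```

On a many-core server, `mc.cores` would be raised accordingly; `detectCores()` reports how many cores the machine has.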

Students in any basic statistics class are taught linear regression, which is one of the simplest forms of a statistical model. The basic idea is that a ‘response’ variable can be mathematically related to one or more ‘explanatory’ variables through a linear equation and a normally distributed error term. With any statistical tool,
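To make the description above concrete, here is a minimal sketch that simulates a response from a linear equation plus normally distributed error and then recovers the coefficients with `lm()`. The true intercept (2), slope (3), sample size, and seed are arbitrary assumptions chosen for the illustration, not values from the post.

```r
set.seed(42)

# Simulate data from y = 2 + 3 * x + e, with e ~ Normal(0, 1).
n <- 200
x <- runif(n, min = 0, max = 10)
y <- 2 + 3 * x + rnorm(n, mean = 0, sd = 1)

# Fit the linear model; the formula y ~ x includes an intercept by default.
fit <- lm(y ~ x)

# Estimated intercept and slope; these should land near 2 and 3.
coef(fit)
```

`summary(fit)` would additionally report standard errors and the residual standard deviation, which estimates the spread of the error term.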

This is a brief guide to using R in collaborative, social ways. R is a powerful open-source programming language for data analysis, statistics, and visualization, but much of its power derives from a large, engaged community of users. This is an introduction to tools for engaging the community to improve your R code and collaborate with others. (Am I...

A Brief Introduction to Metaprogramming in Julia

In contrast to my previous post, which described one way in which Julia allows (and expects) the programmer to write code that directly employs the atomic operations offered by computers, this post is meant to introduce newcomers to some of Julia’s higher-level functions for metaprogramming. To make

This blog post by Sean Taylor generated quite a stir. He discussed the signals one sends by using certain software packages and seems to think that R users are more competent. The reactions ranged from amusement to bashing:

In defense of hard to learn statistical tools, i.e. #rstats prsm.tc/gyTBRK <- pretty funny 'who uses what I encourage you...

Calibrations of 2013 predictions for 18 equity indices, plus some publicly available predictions.

Orientation

The distributions are an attempt to see the variability if there were no market-driving news for the whole year. Another way of thinking: mentally moving the distribution to center on a prediction gives a sense of the variability of results …