Turns out that Megan would have never gotten a callback for an audition. (Via Ben Schmidt.)

Finally, I get around to telling you that… on Friday 24th February, I took a day out from my regular job to attend a meeting on Open Source Drug Discovery for Malaria. I should state straight away that whilst drug discovery and chem(o)informatics are topics that I find very interesting, I have no professional experience

I highly recommend Anthony Damico's excellent two-minute videos on programming in R. You can find the full list of 90+ videos here. This is the first of the series, which tells you how to download and install R. More generally, Anthony's video collectio...

In case you missed them, here are some articles from April of particular interest to R users. Information Age published a feature article on R, describing how new graduates are driving adoption of R in industry. Bob Muenchen has updated his list of R package equivalents to SAS and SPSS procedures. A history of Data Science, including Bill Cleveland's...

Imagine you perform a statistical analysis on a time series of stock market data. After some transformation, averaging, and “renormalization” you find that the resulting quantity, let’s call it , behaves as a function of time like . Since you are a physicist you get excited because you have just discovered a power law. Physicists

In part 2, we saw that adding a volatility filter to a single instrument test did little to improve performance or risk-adjusted returns. How will the volatility filter impact a multiple instrument portfolio? In part 3 of the follow-up, I will evaluate the impact of the volatility filter on a multiple instrument test. …

Introduction Over my years as a graduate student, I have built up a long list of complaints about the use of Null Hypothesis Significance Testing (NHST) in the empirical sciences. In the next few weeks, I’m planning to publish a series of blog posts, each of which will articulate one specific weakness of NHST. The

In yesterday's webinar, Revolution Analytics CTO David Champagne demonstrated how to integrate statistical graphics and analytic computations created using R software with a variety of third-party applications. In each case Revolution R Enterprise Server is running as a compute server to the client application, with R scripts launched on each user interaction via the RevoDeployR Web Services API. David...

Accounting for temporal dependence in econometric analysis is important, as the presence of temporal dependence violates the assumption that observations are independent units. Historically, much less attention has been paid to correcting for spatial dependence, which, if present, also violates this independence assumption. The comparability of temporal and spatial dependence is useful for illustrating why

From August 1990. It was in the form of a note sent to all the people in the statistics group of Bell Labs, where I’d worked that summer. To all: Here’s the abstract of the work I’ve done this summer. It’s stored in the file, /fs5/gelman/abstract.bell, and copies of the Figures 1-3 are on Trevor’s

This is a guest post written by Branson Owen, an enthusiastic R and data.table user. Wow, a long-desired feature of data.table finally came true in version 1.8.1! data.table now allows numeric columns and big numbers (via bit64) in …

I can never seem to get exactly what I want from an R text editor. Let me correct that: I can never seem to get exactly what I want from an R text editor on a Mac. I used to use Tinn-R, which met most of my needs: Free, lightweight with ...

There are quite a few books out now on “data science”. I’ve picked out three that I think are the best place to start for computational journalists. First is Machine Learning for Hackers, by Drew Conway and John Myles White. The autho...

Foursquare, the mobile location-sharing app (of which I'm a big fan), has an excellent recommendation system. Based on your recent check-ins, places your friends found popular, and even the time of day, Foursquare Explore will recommend a great place for a sushi lunch, or the best place to buy new shoes. This presentation from Foursquare engineer Ben Lee shows...

I have posted previously about the open data available on Socrata (https://opendata.socrata.com/), and I was looking at the site again today when I stumbled upon a listing of levels of various radioactive isotopes by US city and state. The data is available at https://opendata.socrata.com/Government/Sorted-RadNet-Laboratory-Analysis/w9fb-tgv6 . You will need to click export, and then download it as a...

The recent Hack/Reduce hackathon in Montreal was a tonne of fun. Our team tackled a data set consisting of Bixi (Montreal’s bicycle share system) station states at one-minute temporal resolution. We used Hadoop and MapReduce to pull out some features of user behaviours. One of the things we extracted was the flux at

Yair pointed me to this awesome blog of how the NYT people make their graphs. This blows away all other stat graphics blogs (including this one). Lots of examples from mockup to first tries to final version. I recognize a lot of what they’re doing from my own experience. Also from my experience it’s hard

Milano R net, in collaboration with Quantide, organizes an "Introduction to R" course in Milano, June 7-8, 2012.

In R, the traditional way to load packages can sometimes lead to situations where several lines of code need to be written just to load packages. These lines can cause errors if the packages are not installed, and can also be hard to maintain, particularly during deployment. Fortunately, there is a way to create a function in R...