Blog Archives

Calculating AUC the hard way

October 10, 2013

The Area Under the Receiver Operating Characteristic curve (AUC) is a commonly used metric of model performance in machine learning and many other binary classification/prediction problems. The idea is to generate a threshold-independent measure of how well a model is able to distinguish between two possible outcomes. Threshold-independent here just means that for any model

Read more »
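To make the idea of a threshold-independent measure concrete, here is a minimal sketch in Python. It is not the code from the full post. It relies on the standard equivalence between AUC and the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The function name `auc_pairwise` and the toy labels and scores are made up for illustration.

```python
from itertools import product


def auc_pairwise(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes; scores: model scores, higher = more
    likely positive. Ties between a positive and a negative count as half.
    This is an O(n_pos * n_neg) illustration, not an efficient implementation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0   # positive ranked above negative
        elif p == n:
            wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: one of the four positive/negative pairs is misordered.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auc_pairwise(labels, scores))  # 3 of 4 pairs correct -> 0.75
```

No classification threshold appears anywhere in the calculation: only the ordering of the scores matters, which is what makes the measure threshold-independent.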

Time-series forecasting: Bike Accidents

August 20, 2013

About a year ago I posted this video visualization of all the reported accidents involving bicycles in Montreal between 2006 and 2010. In the process I also calculated and plotted the accident rate using a monthly moving average. The results followed a pattern that was, for the most part, to be expected. The rate shoots up

Read more »
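As a rough illustration of the smoothing step mentioned above, here is a minimal moving-average sketch in Python. It assumes a simple trailing window over evenly spaced counts. The excerpt does not say exactly how the original window was defined, so the function name `moving_average`, the window length, and the counts below are all hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average over a list of numbers.

    Returns one average per full window, so the output has
    len(values) - window + 1 entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")

    running = sum(values[:window])       # sum of the first full window
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


if __name__ == "__main__":
    # Made-up monthly accident counts, not data from the post.
    counts = [3, 6, 9, 12]
    print(moving_average(counts, 3))  # [6.0, 9.0]
```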

From Whale Calls to Dark Matter: Competitive Data Science with R and Python

July 12, 2013

Back in June I gave a fun talk at Montreal Python on some of my dabbling in the competitive data science scene. The good people at Savoir-faire Linux recorded the talk and have edited it all together into a pretty slick video. If you can spare twenty minutes or so, have a look. If you want

Read more »

How likely is the NSA PRISM program to catch a terrorist?

June 6, 2013

Recent revelations about PRISM, the NSA’s massive program of surveillance of civilian communications, have caused quite a stir. And rightfully so, as it appears that the agency has been granted warrantless direct access to just about any form of digital communication engaged in by American citizens, and that their access to such data has been

Read more »

What is probabilistic truth? Part 2 – Everything is conditional

May 24, 2013

Read Part 1. When making a statement of the form “1/2 is the correct probability that this coin will land tails”, there are a few things which are left unsaid, but which are typically implied. The statement is one about the probability of an unknown event occurring, and it would seem reasonable to write this

Read more »

What is probabilistic truth?

May 18, 2013

I am currently working on a validation metric for binary prediction models. That is, models that make predictions about outcomes that can take on either of two possible states (e.g., dead/not dead, heads/tails, cat in picture/no cat in picture, etc.). The most commonly used metric for this class of models is AUC, which assesses the

Read more »

CAISN

May 7, 2013

Reblogged from Zero to R Hero: Canadian Aquatic Invasive Species Networks Annual General Meeting in Kananaskis, Alberta. May 03, 3:25-5:30. This 2-hour workshop will focus on how and why we do numerical simulation in R. Time permitting, we will also look at how to build and fit likelihood-based statistical models. We ask that you bring your

Read more »

Mathematical abstraction and the robustness to assumptions

April 12, 2013

I’ve been showing my new favourite toys to just about anyone foolish enough to actually engage me in conversation. I described how my shiny new set of non-transitive dice work here, complete with a map showing all the relevant probabilities. All was neat and tidy and wonderful until fellow ecologist, Aaron Ball, tried to burst

Read more »

A quick guide to non-transitive Grime Dice

April 7, 2013

A very special package that I am rather excited about arrived in the mail recently. The package contained a set of 6-sided dice. These dice, however, don’t have the standard numbers one to six on their faces. Instead, they have assorted numbers between zero and nine. Here’s the exact configuration: Aside from maybe making for

Read more »

Open Data Exchange 2013, April 6. Montreal

March 29, 2013

UPDATE: The day was great! There are many people doing really amazing things with open data and it was wonderful to meet them. Here are my slides from the panel talk. Next Saturday, I’ll be sitting on a panel discussing future avenues for open data at ODX13. From the odx13 site: Odx13 is a mini-conference

Read more »
