Blog Archives

KDD Cup 2015: The story of how I built hundreds of predictive models….And got so close, yet so far away from 1st place!

June 25, 2015

(This article was first published on Data Until I Die!, and kindly contributed to R-bloggers) The challenge from the KDD Cup this year was to use their data on student enrollment in MOOCs to predict who would drop out and who would stay. The short story is that using H2O and a lot of my free time,...

An R Enthusiast Goes Pythonic!

May 28, 2015

I’ve spent many years using R and broadcasting my love for it, while using Python only minimally. Having read recently about machine learning in Python, I decided to take on a fun little ML project using Python from start to … Continue reading →

Predicting Mobile Phone Prices

April 6, 2015

Recently a colleague of mine showed me a nauseating interactive scatterplot that plots mobile phones according to two dimensions of the user’s choice from a list of possible dimensions.  Although the interactive visualization was offensive to my tastes, the JSON … Continue reading →

Contraceptive Choice in Indonesia

January 20, 2015

I wanted yet another opportunity to use the fabulous caret package, but also to finally give plot.ly a try. To scratch both itches, I dipped into the UCI Machine Learning Repository yet again and came up with a … Continue reading →

Predictive modelling fun with the caret package

December 9, 2014

I’m back!  6 months after my second child was born, I’ve finally made it back to my blog with something fun to write about.  I recently read through the excellent Machine Learning with R ebook and was impressed by the caret … Continue reading →

Data Until I Die: My blog title and statement of values

April 28, 2014

When I started keeping this blog, my intent was to write about and keep helpful snippets of R code that I used in my line of work. It was the start of my second job after grad school and I … Continue reading →

Ontario First Nations Libraries Compared Using Ontario Open Data

April 7, 2014

I recently downloaded a very cool dataset on Ontario libraries from the Ontario Open Data Catalogue.  The dataset contains 142 columns of information describing 386 libraries in Ontario, representing a fantastically massive data collection effort for such important cultural institutions (although … Continue reading →

A Delicious Analysis! (aka topic modelling using recipes)

February 17, 2014

A few months ago, I saw a link on Twitter to an awesome graph charting the similarities of different foods based on their flavour compounds, in addition to their prevalence in recipes (see the whole study, The Flavor Network and the … Continue reading →

UofT R session went well. Thanks RStudio Server!

February 9, 2014

Apart from running longer than I had anticipated, very little of any significance went wrong during my R session at UofT on Friday! It took a while at the beginning for everyone to get set up. Everyone was connecting to … Continue reading →

Teaching a Class of Undergrads, RStudio Server, and My Ubuntu Machine

February 2, 2014

I was chatting about public speaking with my brother, who is a Lecturer in the Faculty of Pharmacy at UofT, when he offered me the opportunity to come to his class and teach about R.  Always eager to spread the … Continue reading →
