Blog Archives

Predicting Mobile Phone Prices

April 6, 2015

Recently a colleague of mine showed me a nauseating interactive scatterplot that plots mobile phones according to two dimensions of the user’s choice from a list of possible dimensions. Although the interactive visualization was offensive to my tastes, the JSON …

Contraceptive Choice in Indonesia

January 20, 2015

I wanted yet another opportunity to use the fabulous caret package, but also to finally give plot.ly a try. To scratch both itches, I dipped into the UCI Machine Learning Repository yet again and came up with a …

Predictive modelling fun with the caret package

December 9, 2014

I’m back! Six months after my second child was born, I’ve finally made it back to my blog with something fun to write about. I recently read through the excellent Machine Learning with R ebook and was impressed by the caret …

Data Until I Die: My blog title and statement of values

April 28, 2014

When I started keeping this blog, my intent was to write about and keep helpful snippets of R code that I used in my line of work. It was the start of my second job after grad school and I …

Ontario First Nations Libraries Compared Using Ontario Open Data

April 7, 2014

I recently downloaded a very cool dataset on Ontario libraries from the Ontario Open Data Catalogue. The dataset contains 142 columns of information describing 386 libraries in Ontario, representing a fantastically massive data collection effort for such important cultural institutions (although …

A Delicious Analysis! (aka topic modelling using recipes)

February 17, 2014

A few months ago, I saw a link on Twitter to an awesome graph charting the similarities of different foods based on their flavour compounds, in addition to their prevalence in recipes (see the whole study, The Flavor Network and the …

UofT R session went well. Thanks RStudio Server!

February 9, 2014

Apart from going longer than I had anticipated, very little of any significance went wrong during my R session at UofT on Friday! It took a while at the beginning for everyone to get set up. Everyone was connecting to …

Teaching a Class of Undergrads, RStudio Server, and My Ubuntu Machine

February 2, 2014

I was chatting about public speaking with my brother, who is a Lecturer in the Faculty of Pharmacy at UofT, when he offered me the opportunity to come to his class and teach about R. Always eager to spread the …

Nuclear vs Green Energy: Share the Wealth or Get Your Own?

December 12, 2013

Thanks to Ontario Open Data, a survey dataset was recently made public containing people’s responses to questions about Ontario’s Long Term Energy Plan (LTEP). The survey did fairly well in terms of raw response numbers, with 7,889 responses in total …

Enron Email Corpus Topic Model Analysis Part 2 – This Time with Better regex

November 4, 2013

After posting my analysis of the Enron email corpus, I realized that the regex patterns I set up to capture and filter out the cautionary/privacy messages at the bottoms of people’s emails were not working. Let’s have a look at …
