Blog Archives

Data Until I Die: My blog title and statement of values

April 28, 2014

When I started keeping this blog, my intent was to write about and keep helpful snippets of R code that I used in my line of work. It was the start of my second job after grad school and I …

Ontario First Nations Libraries Compared Using Ontario Open Data

April 7, 2014

I recently downloaded a very cool dataset on Ontario libraries from the Ontario Open Data Catalogue. The dataset contains 142 columns of information describing 386 libraries in Ontario, representing a fantastically massive data collection effort for such important cultural institutions (although …

A Delicious Analysis! (aka topic modelling using recipes)

February 17, 2014

A few months ago, I saw a link on Twitter to an awesome graph charting the similarities of different foods based on their flavour compounds, in addition to their prevalence in recipes (see the whole study, The Flavor Network and the …

UofT R session went well. Thanks RStudio Server!

February 9, 2014

Apart from going longer than I had anticipated, very little of any significance went wrong during my R session at UofT on Friday! It took a while at the beginning for everyone to get set up. Everyone was connecting to …

Teaching a Class of Undergrads, RStudio Server, and My Ubuntu Machine

February 2, 2014

I was chatting about public speaking with my brother, who is a Lecturer in the Faculty of Pharmacy at UofT, when he offered me the opportunity to come to his class and teach about R. Always eager to spread the …

Nuclear vs Green Energy: Share the Wealth or Get Your Own?

December 12, 2013

Thanks to Ontario Open Data, a survey dataset was recently made public containing people’s responses to questions about Ontario’s Long Term Energy Plan (LTEP). The survey did fairly well in terms of raw response numbers, with 7,889 responses in total …

Enron Email Corpus Topic Model Analysis Part 2 – This Time with Better regex

November 4, 2013

After posting my analysis of the Enron email corpus, I realized that the regex patterns I set up to capture and filter out the cautionary/privacy messages at the bottoms of people’s emails were not working. Let’s have a look at …

A Rather Nosy Topic Model Analysis of the Enron Email Corpus

November 3, 2013

Having only ever played with Latent Dirichlet Allocation using gensim in Python, I was very interested to see a nice example of this kind of topic modelling in R. Whenever I see a really cool analysis done, I get the …

When did “How I Met Your Mother” become less legen.. wait for it…

October 21, 2013

…dary! Or, as you’ll see below, when did it become slightly less legendary? The analysis in this post was inspired by DiffusePrioR’s analysis of when The Simpsons became less Cromulent. When I read his post a while back, I thought …

Big and small daycares in Toronto by building type, mapped using RGoogleMaps and Toronto Open Data

October 17, 2013

Before my daughter was born, I thought that my wife and I would have to send her to a licensed child care centre somewhere in Toronto. I had heard over and over how long a waiting list I should …
