In my last post I said that I would try to investigate the question of who actually does want a casino, and whether place of residence is a factor in where they want the casino to be built. So, here … Continue reading →
Toronto City Council is in the midst of a very lengthy process of considering whether or not to allow the OLG to build a new casino in Toronto, and where. The process started in November of 2012, and set … Continue reading →
(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers) The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something). Take the following simple data frame: df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c")) I expect that if I call the reorder function on the a2...
I had a very long file of monetary transactions (about 207,000 rows) with about two handfuls of columns describing each transaction (including date). The task I needed to perform on this file was to select the value from one of … Continue reading →
Call me incompetent, but I just can’t get ffdfdply to work with my ffdf dataframes. I’ve tried repeatedly and it just doesn’t seem to work! I’ve seen numerous examples on stackoverflow, but maybe I’m applying them incorrectly. Wanting to do some … Continue reading →
It’s survey analysis season for me at work! When analyzing survey data, the one kind of analysis I have realized that I’m not used to doing is finding patterns in binary data. In other words, if I have a question … Continue reading →
Sitting in my synagogue this past Saturday, I started thinking about the authorship analysis that I did using function word counts from texts authored by Shakespeare, Austen, etc. I started to wonder if I could do something similar with the … Continue reading →
After the work I did for my last post, I wanted to practice doing multiple classification. I first thought of using the famous iris dataset, but felt that was a little boring. Ideally, I wanted to look for a practice … Continue reading →
Now that I’m on my winter break, I’ve been taking a little bit of time to read up on some modelling techniques that I’ve never used before. Two such techniques are Random Forests and Conditional Trees. Since both can be used … Continue reading →
Recently at work we got sent a data file containing information on donations to a specific charitable organization, ranging all the way back to the ’80s. Usually, when we receive a dataset with a donation history in it, each row … Continue reading →