Following up on my previous posts about H2O Deep Learning (TTTAR1) and RUGSMAPS (TTTAR2), here is a quick update on two interesting things I have been working on: a Kaggle tutorial and a new RUGSMAPS app.
Short Tutorials based on a Kaggle Competition
First of all, I would like to share with you my first ever guest post on Domino Data Lab’s blog – “How to use R, H2O and Domino for a Kaggle competition”.
|My guest post on Domino Data Lab’s blog.|
As a sequel to TTTAR1, this blog post is a more in-depth machine learning article with starter code and short tutorials. The purpose is to get more people started with R, H2O Deep Learning and Domino Data Lab using a recent Kaggle competition as a case study. The short tutorials should be generic enough for Kaggle competitions as well as general data mining exercises. I hope you will find it useful if you are interested in machine learning.
RUGSMAPS2: A Crowd-Sourcing Experiment
Shortly after RUGSMAPS went public, Ines Garmendia kindly pointed out that there were several mistakes in the app (thanks, Ines! … at the same time, doh!!!). For example, the two groups in Madrid are supposed to be much further apart than I thought. More importantly, they are subgroups of the main “Comunidad R Hispano” group. Without Ines’s feedback, I would never have noticed this myself, as I was relying only on the source data from the contest.
I know a lot more local knowledge is required to make RUGSMAPS a much better app for the R community. It is also necessary to streamline the updating process so that new groups can be added easily. Therefore, I am now proposing a crowd-sourcing experiment and am hoping that more RUGs organisers/members can contribute in future. My idea is a dynamic web app (RUGSMAPS2) that reads information directly from a live Google spreadsheet.
Let’s start with my favourite LondonR and its sister group ManchesterR. I know a lot about them personally so I can provide information like their key sponsor (Mango Solutions), venues and websites. To make this information available to all other R users, I just need to update the Google spreadsheet and the new RUGSMAPS2 will automatically render maps with new data.
|Adding venue, key sponsors, websites and other information.|
|LondonR and ManchesterR with additional information.|
For the Comunidad R Hispano issues I mentioned above, you can see that Ines helped me to fill in some new information about the four subgroups:
|Entering main and subgroup information.|
|Fixing RUGSMAPS to show subgroups of Comunidad R Hispano correctly.|
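To make the spreadsheet-driven idea a bit more concrete, here is a minimal sketch of the data flow in Python. It is not the actual RUGSMAPS2 code. The column names (`group`, `parent`, `lat`, `lon`, `sponsor`), the "Example subgroup" row and the helper functions `load_groups` and `build_markers` are all assumptions made up for illustration. In a real app the CSV text would presumably come from the live spreadsheet's CSV export rather than from an inline string.

```python
import csv
import io

# Illustrative stand-in for the spreadsheet's CSV export.
# Column names and the "Example subgroup" row are hypothetical.
SAMPLE_CSV = """\
group,parent,lat,lon,sponsor
LondonR,,51.5074,-0.1278,Mango Solutions
ManchesterR,,53.4808,-2.2426,Mango Solutions
Comunidad R Hispano,,,,
Example subgroup,Comunidad R Hispano,40.4168,-3.7038,
"""


def load_groups(csv_text):
    """Parse CSV text into a list of dicts, converting coordinates to floats."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        # Treat missing cells as empty strings and trim stray whitespace.
        row = {key: (value or "").strip() for key, value in raw.items()}
        row["lat"] = float(row["lat"]) if row["lat"] else None
        row["lon"] = float(row["lon"]) if row["lon"] else None
        rows.append(row)
    return rows


def build_markers(rows):
    """Turn rows that have coordinates into map markers with readable labels."""
    markers = []
    for row in rows:
        if row["lat"] is None or row["lon"] is None:
            continue  # umbrella groups without a location get no marker
        label = row["group"]
        if row["parent"]:
            label += ", part of " + row["parent"]
        if row["sponsor"]:
            label += " (sponsor: " + row["sponsor"] + ")"
        markers.append({"label": label, "lat": row["lat"], "lon": row["lon"]})
    return markers


if __name__ == "__main__":
    for marker in build_markers(load_groups(SAMPLE_CSV)):
        print(marker["label"], marker["lat"], marker["lon"])
```

The point of this layout is that a contributor only has to add or correct a row in the spreadsheet. A subgroup is linked to its main group through the `parent` column, so the map labels update on the next read without any change to the app code.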
So if you’re interested in helping me out or know someone else who might be able to help, please spread the word and forward this Google spreadsheet. Oh, wait … can EVERYONE edit that spreadsheet??? Yes. I understand there might be issues if everyone can edit the spreadsheet without my permission. That’s why I am calling it an experiment! I would like to see whether I can turn this into a successful crowd-sourcing project (or, otherwise, organised chaos). The app won’t go live until I am happy with the new information and features. You can check out the RUGSMAPS2 repository for all the latest development!