In my last post, I mentioned a surprise I had to share, so here it is...
*cracks a knuckle or two*
It’s a web app called Billion Dollar Questions!
It’s a simple and fun web app that anyone can use to predict what sort of billionaire they’ll become. Simply tell the app who you are, and a model works its magic and tells you your future billionaire status. You can share your prediction on Twitter and Facebook to rack up cool points (if you are going to be Consistent anyway).
At this point, I think I should say that I can in no way guarantee you’ll become a billionaire. My skills are in data science, not in making money rain.
Here’s how to use it
Before you go any further, I highly recommend that you read my last post. That way, a lot of the stuff on the app will be familiar to you.
Using the app is pretty simple: fill in the form in the way that best describes you, click “Predict”, and in a few seconds the app will tell you what sort of billionaire you’ll become. Here’s a GIF on how it works:
You can also use it on your desktop, tablet or mobile device!
Now that you’ve seen how it works, here’s the app:
Interested in how I did it?
My work is divided into two parts and can be found on my GitHub repo here:
Shiny is an amazing tool from RStudio that lets you create R-driven web apps. These apps can be easily deployed so that anyone can use them without ever having to touch code. A Shiny app usually has three parts:
- The UI: This is what you see at the front end, made up of R-wrapped HTML, JS and CSS.
- The Server side: This is basically your usual R code. All R calculations, functions and scripts are run server side. In my case, this is where the model takes all your inputs, converts them to a data frame and carries out predictions.
- Global: This is optional and is used to declare variables globally so that they can be accessed by multiple objects/functions. It is advisable to use this only when necessary, because such variables or objects are loaded when the app starts, and if they are large or take a long time to create, they can slow down the loading time of your app. In my case, I read in the original data frame here, as well as the model I used for the app, since both objects would be needed by multiple functions.
My UI, server and global variables are all in the app.R file in the GitHub repo shared above.
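To make the three parts concrete, here is a minimal single-file sketch of how such an app could be laid out. It is not the code from my repo: `training_data`, `toy_model`, `inputs_to_frame`, the input names and the output labels are all hypothetical stand-ins chosen for illustration, and the real app uses a trained model rather than a hand-written rule.

```r
library(shiny)

# --- "Global" part: objects created once, shared by the UI and the server ---
# Hypothetical stand-in for the original data frame the real app reads in.
training_data <- data.frame(
  age = c(45, 60, 72),
  industry = factor(c("Technology", "Retail", "Energy"))
)

# Hypothetical stand-in for the trained model: any function that takes a
# one-row data frame and returns a label is enough for this sketch.
toy_model <- function(new_data) {
  if (new_data$age < 50) "Label A" else "Label B"
}

# Turn raw form values into a one-row data frame whose factor levels
# match those of the training data.
inputs_to_frame <- function(age, industry) {
  data.frame(
    age = as.numeric(age),
    industry = factor(industry, levels = levels(training_data$industry))
  )
}

# --- UI part: R functions that generate the HTML form ---
ui <- fluidPage(
  titlePanel("Billion Dollar Questions (sketch)"),
  numericInput("age", "Age", value = 30, min = 18, max = 100),
  selectInput("industry", "Industry",
              choices = levels(training_data$industry)),
  actionButton("predict", "Predict"),
  textOutput("result")
)

# --- Server part: runs the prediction only when "Predict" is clicked ---
server <- function(input, output, session) {
  prediction <- eventReactive(input$predict, {
    toy_model(inputs_to_frame(input$age, input$industry))
  })
  output$result <- renderText(prediction())
}

# Build the app object; call runApp(app) to launch it.
app <- shinyApp(ui = ui, server = server)
```

Everything above the `ui` definition plays the role of the global part: it runs once when the app starts, which is why heavy objects placed there slow down loading.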
Some Custom HTML and CSS
Things to Keep in Mind
A number of people asked me why I used h2o and not R’s famous caret for my machine learning. The answer is: it was the use case. The billionaire data had a significant amount of missing values and had variables with over 50 different categories. Most machine learning algorithm implementations in R don’t deal well with either of these, while h2o handles both gracefully. You can check out h2o’s implementation approach here.
The code that I used to create the final model used on the app, along with some interesting research which I did not introduce at this time, can be found here.
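As a rough illustration of that use case, the sketch below trains an h2o model on a toy data frame that has both missing values and a categorical column with 60 levels, with no imputation or manual encoding beforehand. The data is made up (it is not the billionaire dataset), the column names are hypothetical, and the choice of a gradient boosting machine via `h2o.gbm` is an assumption made for this example. Running it needs the h2o package and a Java runtime.

```r
library(h2o)

# Start (or connect to) a local H2O cluster; this requires Java.
h2o.init(nthreads = 1)

set.seed(42)
n <- 300
industries <- paste0("industry_", 1:60)

# Hypothetical toy data: one categorical column with 60 levels and
# one numeric column with missing values.
toy <- data.frame(
  industry = factor(sample(industries, n, replace = TRUE), levels = industries),
  age = sample(25:80, n, replace = TRUE),
  self_made = factor(sample(c("yes", "no"), n, replace = TRUE))
)
toy$age[sample(n, 60)] <- NA  # knock out 20% of the ages

# Copy the data frame into the H2O cluster.
toy_hex <- as.h2o(toy)

# Train directly on the raw columns: the NAs and the 60-level factor
# are passed in as they are.
model <- h2o.gbm(
  x = c("industry", "age"),
  y = "self_made",
  training_frame = toy_hex,
  ntrees = 20,
  seed = 42
)

# Predict and pull the results back into a regular R data frame.
preds <- as.data.frame(h2o.predict(model, toy_hex))
head(preds)
```

Since the labels here are random, the predictions themselves mean nothing; the point is only that the model trains and predicts without any preprocessing of the missing values or the many-level factor.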