From my perspective, the most important event at useR! 2014 was that I got to meet the 0xdata team, and now, long story short, here I am introducing the latest version of H2O, labeled Lagrange (18.104.22.168), to the R and greater data science communities. Before joining 0xdata, I was working at a competitor on a rival project and was repeatedly asked why my generalized linear model analytic didn’t run as fast as H2O’s GLM. The answer then, as now, is the same: H2O has a cutting-edge, distributed, in-memory parallel computing architecture. The difference is that I no longer receive an electric shock every time I say so.
For those hearing about H2O for the first time, it is an open-source, distributed, in-memory data analysis tool designed for extremely large data sets. The H2O Lagrange (22.214.171.124) release provides scalable solutions for the following analysis techniques:
- Generalized Linear Model
- Random Forest
- Principal Components Analysis
- Gradient Boosted Regression and Classification
- Naive Bayes
- Deep Learning
In my first blog post at 0xdata, I wanted to keep it simple and make sure R users know how to get the h2o package, which is cross-referenced on the High-Performance and Parallel Computing and the Machine and Statistical Learning CRAN Task Views, up and running on their computers. To do so, open an R console of your choice and type:
    # Download, install, and initialize the H2O package
    install.packages("h2o",
                     repos = c("http://h2o-release.s3.amazonaws.com/h2o/rel-lagrange/11/R",
                               getOption("repos")))
    library(h2o)
    localH2O <- h2o.init()

    # List and run some demos to see H2O at work
    demo(package = "h2o")
    demo(h2o.glm)
    demo(h2o.deeplearning)
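The demos wrap calls that you can also make yourself. As a minimal sketch of roughly what a GLM fit looks like, the following assumes the R API of the 2.x-era h2o package, in which functions take the connection object returned by `h2o.init()` as their first argument. It uses the `prostate.csv` sample file that ships with the package. The column names and the `prostate.hex` key label are assumptions based on that sample data, and later H2O releases changed some of these argument names, so check `?h2o.glm` for the version you have installed.

```r
library(h2o)

# Start (or connect to) a local H2O instance
localH2O <- h2o.init()

# Locate the sample data bundled with the h2o package
prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")

# Upload the file into H2O's memory; "prostate.hex" is only an illustrative key name
prostate_hex <- h2o.uploadFile(localH2O, path = prostate_path, key = "prostate.hex")

# Fit a logistic regression (assumed 2.x-style arguments: x, y, data, family)
prostate_glm <- h2o.glm(x = c("AGE", "RACE", "PSA", "DCAPS"),
                        y = "CAPSULE",
                        data = prostate_hex,
                        family = "binomial")

# Print the fitted model summary returned by H2O
print(prostate_glm)
```

In this sketch the data lives in the H2O process rather than in the R session, and the R objects act as handles to it, which is what lets the same calls scale to data sets far larger than R itself could hold.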
After you are done experimenting with the demos in R, you can open a web browser to http://localhost:54321/ to give the H2O web interface a once-over, and then hop over to 0xdata’s YouTube channel for some in-depth talks.
Over the coming weeks we at 0xdata will continue to blog about how to use H2O through R and other interfaces. If there is a particular use case you would like to see addressed, join our h2ostream Google Groups conversation or e-mail us at [email protected]. Until then, happy analyzing.