
An introduction to H2O.ai

[This article was first published on The Jumping Rivers Blog, and kindly contributed to R-bloggers].

If you came here looking for an introduction to water, or a synopsis of the 2006 TV series about teenage mermaids, you have sadly come to the wrong place. The H2O that we will talk about is H2O.ai, a company which develops products for easy, scalable machine learning and artificial intelligence.

Introduction

Machine learning and artificial intelligence (or AI for short) are topics which have attracted a lot of interest over the past 4-5 years. Some of this interest has come from businesses as they begin to utilise the information they collect on a day-to-day basis to streamline or automate processes, or to gain insight. A lot of companies are now looking to hire data scientists and engineers, and in turn this is making a lot more people interested in machine learning and AI.

Now, as you look at upskilling in machine learning and AI, you might start by reading some books, taking some online courses and, if you are anything like me, going through many, many, many online tutorials and blog posts on different techniques. It’s at this point you will probably start to realise there are a lot of tools out there that you can use for your machine learning or AI problems. Deciding which tool is best for the job at hand can be very difficult. Hopefully, after reading this blog post you will have a better idea of H2O.ai’s products and whether they are what you have been looking for.

Who are H2O.ai?

H2O.ai are a company who describe themselves as the visionary leaders in making AI accessible for everyone. Currently, they are the AI partner for over twenty thousand organisations, including over half of the companies listed on the Fortune 500, and their tools are used by over one million data scientists around the world. They also employ twenty of the world’s Kaggle Grandmasters (of which, at the point of writing, there are 262 in the world), which shows the level of talent working there.




Current products

H2O.ai are an open-source company that supply both free and proprietary tools. In line with their stated aim of democratising machine learning and AI, they have a range of tools to help everyone with their machine learning projects from idea to production, no matter their level of expertise. Below, you can read a short overview of the different tools that they provide.

Open source tools


Proprietary tools

If you are interested in trying out any of the above proprietary tools (excluding Enterprise Puddle), H2O.ai are offering a free 90-day trial.

Technical features

H2O.ai products are built as distributed, in-memory machine learning platforms. They achieve this by distributing data across an H2O cluster and storing it in memory in a compressed format, which allows for parallelisation. H2O.ai use Java as their main coding language. REST APIs allow you to access and code against H2O products from languages such as R and Python, so if you already know R or Python you can use H2O, H2O Wave, Sparkling Water and Driverless AI without needing to learn another coding language!
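To make the idea of compressed, in-memory chunks and parallel computation a little more concrete, here is a toy sketch using only the Python standard library. It is not how H2O is implemented: the function names are made up for illustration, zlib stands in for whatever compression H2O uses, and threads stand in for the nodes of a cluster. It only shows the general pattern of splitting a column into compressed chunks, computing partial results per chunk in parallel and then combining them.

```python
import zlib
from array import array
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a column of floats into roughly equal, zlib-compressed chunks."""
    size = -(-len(values) // n_chunks)  # ceiling division
    return [
        zlib.compress(array("d", values[i:i + size]).tobytes())
        for i in range(0, len(values), size)
    ]


def chunk_sum_and_count(chunk):
    """Decompress one chunk and return its partial (sum, count)."""
    data = array("d")
    data.frombytes(zlib.decompress(chunk))
    return sum(data), len(data)


def parallel_mean(chunks, workers=4):
    """Compute per-chunk partial results in parallel, then combine them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# A made-up column of 100,000 values, held in memory as 8 compressed chunks.
column = [float(i % 10) for i in range(100_000)]
chunks = split_into_chunks(column, n_chunks=8)
print(parallel_mean(chunks))
```

Because each chunk can be processed independently, the same pattern extends from threads on one machine to nodes in a cluster, which is the property the paragraph above is describing.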

Another feature of H2O and H2O Driverless AI that you might find useful is that any model created with either tool can be exported for later use. In H2O, a model can be exported as a hierarchical data format (HDF5) file, a MOJO (model object, optimized) or a POJO (plain old Java object). If you want to learn more about these different formats, here is a useful link. In H2O Driverless AI you can export ‘Scoring Pipelines’. These can be used to deploy the models that you have developed within Driverless AI to production. They can be exported as either Python Scoring Pipelines or MOJO Scoring Pipelines. The Python Scoring Pipeline includes an example Python script that shows you how to use the pipeline in practice. If you would like to know more about exporting your Scoring Pipelines from Driverless AI, take a look here.
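As a rough illustration of what exporting a model from open-source H2O can look like from Python, here is a minimal sketch. It assumes the h2o Python package and a Java runtime are installed. The function name, the choice of a gradient boosting model and the ntrees setting are all made up for this example, and only the MOJO route is shown. The imports sit inside the function so that it can be defined without H2O being available.

```python
def train_and_export_mojo(csv_path, target, out_dir):
    """Train a small GBM on a CSV file and export it as a MOJO.

    Hypothetical helper for illustration; needs the `h2o` package and Java.
    """
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    # Start (or connect to) a local H2O cluster; the Python client
    # communicates with it over H2O's REST API.
    h2o.init()

    # Parse the file into the cluster as an H2OFrame.
    frame = h2o.import_file(csv_path)
    predictors = [col for col in frame.columns if col != target]

    # Illustrative model settings, not a recommendation.
    model = H2OGradientBoostingEstimator(ntrees=20)
    model.train(x=predictors, y=target, training_frame=frame)

    # Write the model to out_dir as a MOJO and return the file path.
    return model.download_mojo(path=out_dir)
```

You might call it as train_and_export_mojo("my_data.csv", "my_target", "models"), where the file, column and directory names are placeholders for your own.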

Conclusion

Now that you have read about H2O.ai and the tools that they provide, I hope that you have a better idea of which H2O.ai tools you could use for your machine learning projects. H2O.ai have created a set of tools which knit together nicely when used with each other. If you want to see how these tools are used in production, H2O.ai have a full section on their website dedicated to showing use cases for their tools.


For updates and revisions to this article, see the original post

