Advent of 2020, Day 1 – What is Azure Databricks

[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers].

Azure Databricks is a data analytics platform (PaaS), optimised specifically for the Microsoft Azure cloud platform. It is an enterprise-grade, unified platform service built around a data lake architecture for large-scale analytical operations.

Azure Databricks combines:

Azure Databricks is optimised for Microsoft Azure and offers an interactive workspace for collaboration between data engineers, data scientists, and machine learning engineers. Its multi-language capabilities let you create notebooks in Python, R, Scala, SQL and others, all running on Spark.

It also lets you run SQL queries on the data lake, create multiple visualisation types to explore query results from different perspectives, and build and share dashboards.
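
As a rough illustration of the SQL side, here is a minimal sketch of an aggregation query run from a Python notebook cell. The table name `sales` and the columns `region` and `amount` are hypothetical, invented for this example. The `spark` argument is assumed to be the SparkSession that a Databricks notebook provides by default.

```python
def revenue_by_region_sql(table: str, top_n: int = 10) -> str:
    """Build a Spark SQL aggregation over a hypothetical sales table.

    The table name is interpolated directly, so it must come from trusted
    code and never from user input.
    """
    return (
        "SELECT region, SUM(amount) AS revenue "
        f"FROM {table} "
        "GROUP BY region "
        "ORDER BY revenue DESC "
        f"LIMIT {int(top_n)}"
    )


def run_query(spark, query: str):
    """Run the query on the given SparkSession and return a DataFrame."""
    return spark.sql(query)
```

In a notebook cell, something like `display(run_query(spark, revenue_by_region_sql("sales")))` would render the result as a table that can then be switched to a chart type and pinned to a dashboard.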

Azure Databricks is designed to build and handle big data pipelines, with data ingestion (raw or structured) into Azure through several different Azure services, such as:

It also supports connectivity to several persistent storage services for creating a data lake, such as:

Your analytics workflow will use Spark to read data from multiple different sources and create state-of-the-art analytics in Azure Databricks.
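
To make the "multiple sources" idea more concrete, here is a minimal sketch, under stated assumptions, of reading two datasets with Spark's DataFrame reader. It assumes the data sits in Azure Data Lake Storage Gen2, which is addressed through `abfss://` URIs. The storage account `mystorageacct`, the containers `raw` and `curated`, and the folder paths are all hypothetical. The `spark` argument is again the session a Databricks notebook predefines, and access credentials are assumed to be configured already.

```python
def abfss_uri(container: str, account: str, path: str) -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical sources: name -> (file format, location, reader options).
SOURCES = {
    "sales": ("csv", abfss_uri("raw", "mystorageacct", "/sales/2020/"), {"header": "true"}),
    "customers": ("parquet", abfss_uri("curated", "mystorageacct", "customers/"), {}),
}


def load_sources(spark, sources):
    """Read every source into a Spark DataFrame, returned as a dict keyed by name."""
    return {
        name: spark.read.format(fmt).options(**opts).load(location)
        for name, (fmt, location, opts) in sources.items()
    }
```

Calling `frames = load_sources(spark, SOURCES)` in a notebook would give one DataFrame per source, which can then be joined or aggregated like any other Spark DataFrame.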

The Azure Databricks welcome page gives you an easy, fast and collaborative interface.

The complete set of code and notebooks will be available at the GitHub repository.

Happy Coding and Stay Healthy!
