Query Azure Cosmos DB in R

[This article was first published on R – Detroit Data Lab, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

It’s 2019, and your company has chosen to store data in Cosmos DB, the world’s most versatile and powerful database format, because they’re all-in on cloud native architectures. You’re excited! This means you’re working with an ultra low latency, planet-scale data source!

Just one problem: your codebase is R, and connecting to retrieve the data is trickier than you’d hoped.

I give you cosmosR: a simple package I wrote to give you a head start on connections to your Cosmos DB. It won’t solve all your problems, but it’ll help you overcome the authentication headers, create some simple queries, and get you up in running much faster than rolling your own connectivity from scratch.

*Note: this is only for Cosmos DB storing documents, with retrieval via the SQL API. For more information on the SQL API please visit the Microsoft documentation.

Begin by making sure we have devtools installed, then loaded:

install.packages("devtools")
library("devtools")

With that set, install cosmosR via GitHub:

install_github("aaron2012r2/cosmosR")

We’re two lines of code away from your first data frame! Be sure to have your Access KeyURIDatabase Name, and Collection Name for your Cosmos DB handy.

Store the access key for reusable querying via cosmosAuth:

cosmosAuth("KeyGoesHere", "uri", "dbName", "collName")

Then we can use cosmosQuery to perform a simple “SELECT * from c” query, which will retrieve all documents from the Cosmos DB. By setting content.response to TRUE, we are specifying that we want to convert the raw response to a character response that we can read.

dfAllDocuments <- cosmosQuery(content.response = TRUE)

Just like that, we have our information! Because documents are stored as JSON, being that Cosmos DB is primarily controlled using Javascript, you should carefully inspect all responses.

 

 

The package used in this tutorial is available at my GitHub. Repo linked here. Please visit the link for further information on other tips or known limitations. If you see opportunities to contribute, please branch, create issues, and help further develop!

To leave a comment for the author, please follow the link and comment on their blog: R – Detroit Data Lab.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)