It’s 2019, and your company has chosen to store data in Cosmos DB, the world’s most versatile and powerful database format, because they’re all-in on cloud native architectures. You’re excited! This means you’re working with an ultra low latency, planet-scale data source!
Just one problem: your codebase is R, and connecting to retrieve the data is trickier than you’d hoped.
I give you cosmosR: a simple package I wrote to give you a head start on connections to your Cosmos DB. It won’t solve all your problems, but it’ll help you overcome the authentication headers, create some simple queries, and get you up in running much faster than rolling your own connectivity from scratch.
*Note: this is only for Cosmos DB storing documents, with retrieval via the SQL API. For more information on the SQL API please visit the Microsoft documentation.
Begin by making sure we have devtools installed, then loaded:
With that set, install cosmosR via GitHub:
We’re two lines of code away from your first data frame! Be sure to have your Access Key, URI, Database Name, and Collection Name for your Cosmos DB handy.
Store the access key for reusable querying via cosmosAuth:
cosmosAuth("KeyGoesHere", "uri", "dbName", "collName")
Then we can use cosmosQuery to perform a simple “SELECT * from c” query, which will retrieve all documents from the Cosmos DB. By setting content.response to TRUE, we are specifying that we want to convert the raw response to a character response that we can read.
dfAllDocuments <- cosmosQuery(content.response = TRUE)
The package used in this tutorial is available at my GitHub. Repo linked here. Please visit the link for further information on other tips or known limitations. If you see opportunities to contribute, please branch, create issues, and help further develop!