Retrieve data.world data sets in Power BI using R

February 10, 2018
By

(This article was first published on R – Detroit Data Lab, and kindly contributed to R-bloggers)

[update: the data.world connector is now available in beta, as a preview feature in Power BI. However, this post is relevant in case you’d like to manage what you retrieve through your own code]

data.world is a website promising “open, secure, social, and linked” data sets. Using external data sets is a great way to learn, benchmark against competition or industry, and spark new ideas. With this tutorial, we’ll show you how to use their R Library in Power BI to retrieve and build a simple visual of some basic information.

To begin, you’ll want to install their R SDK. The short version here is to open your favorite R IDE, and enter:

install.packages("data.world")

After, get your API key from data.world by going to advanced settings in your data.world profile. Open your favorite IDE back up, and make sure to save the API key so the library will have it on open in the future:

api_key_config <- data.world::save_config("API KEY")
data.world::set_config(api_key_config)

Find your favorite data set, and we can begin. For this exercise, I’ll choose the Original Six Hockey data set.

The author of this set is scuttlemonkey, the name is original-six-hockey, and the CSV I want in particular is DET-captains, a list of all captains of the Detroit Red Wings. This gives us everything we need.

In Power BI, choose ‘Get Data’ and then find ‘R Script’

RScript

Enter the following code into the window:

library(data.world)
detroit.captains <- query(qry_sql("SELECT * FROM `DET-captains`"), dataset = "scuttlemonkey/original-six-hockey")

Hit submit, and in no time, you’ll see a preview of a data frame containing all the Detroit Red Wings captains in their history!

DataFrame

From here, we have a dataset available for us to analyze in whatever ways we need within Power BI.

Total points in a season by each Detroit Red Wings captain from 1980 to 2017

With a data set analyzed, we can ask questions like:

What happened in the 2004-2005 season? (There was an NHL lockout)

Why are there two captains in 1980? (Reed Larson and Errol Thompson split captainship that season, with Errol Thompson eventually being traded to the Pittsburgh Penguins)

 

 

To leave a comment for the author, please follow the link and comment on their blog: R – Detroit Data Lab.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)