Accessing GraphQL from R
Introduction
A few months ago, I embarked on a full-stack spatial data project at work. The project kicked off amazingly, until I found my back against the wall: some of the data sources were served via a GraphQL API. I had never worked with GraphQL before, although I had heard a lot about it and how good it is for querying data.
GraphQL is a query language for application programming interfaces (APIs) that prioritizes giving clients exactly the data they request. It’s designed to make APIs flexible, fast and friendly. In practice, a client sends a query that names the fields it wants and the server returns only those fields, typically in a single request. This can be more efficient than REST-style endpoints that return fixed payloads.
I started off by installing the GraphQL playground, which is now my go-to tool for interacting with GraphQL APIs. It offers a great workflow for understanding the schema and structure of any GraphQL API. I recommend you try it out.
After hours of trying out the GraphQL playground, I finally understood the schema of the GraphQL API endpoint for the project I was working on. Well, that was the easy part, I must say.
Now that I could query the API, everything should have been great, right? Well, it wasn’t. The GraphQL API I was working with returned the geospatial data as plain JSON instead of GeoJSON. Perhaps this came down to my limited understanding of how GraphQL works with spatial data. So I could see my data, just not in a format I could work with. I am tempted to talk about the technicalities behind it, but I won’t… maybe another time.
How did I find out about ghql?
At work, our scripting language of choice is R. R is a free software environment for statistical computing and graphics. Well, that’s how it’s officially defined, but trust me, it’s gradually evolving to do more than statistical computing and beautiful graphics. It was time for me to programmatically access data from a GraphQL API in R. I felt excited because I’m an R lover, but I knew it was going to be a lot of work. I guess I was ready for it.
After spending days researching R packages that can interact with a GraphQL API, I found three. I picked ghql over the others because it is an rOpenSci package.
Moving on
For R to interact with any GraphQL API, it needs a GraphQL client. That’s where ghql, a GraphQL client for R developed by Scott Chamberlain, comes into play. Still confused? Well, so was I at first. I tried interacting with a GraphQL server in R and felt like giving up before getting anywhere. Hopefully this blog post will help others.
Working with Countries List, a GraphQL public API
Let’s say you were working on a project that required country-specific data, such as currency, or language. You could get such data from the Countries GraphQL API which is a public GraphQL API for information about countries, continents, and languages. This public API uses Countries List and provinces as data sources, so the schema follows the shape of those data, with a few exceptions:
- The codes used to key the objects in the original data are available as a code property on each item returned from the API.
- The country.continent and country.languages are objects and arrays of objects, respectively.
- Each Country has an array of states populated by their states/provinces, if any.
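To make the shape described in the list above concrete, here is a minimal sketch that parses a hand-written JSON sample with jsonlite. The sample values (XX, Exampleland and so on) are made up for illustration and are not real API output; only the field layout follows the description.

```r
library(jsonlite)

# Hand-written sample that mimics the described layout (illustrative values only)
sample_json <- '{
  "data": {
    "country": {
      "code": "XX",
      "name": "Exampleland",
      "continent": { "code": "YY", "name": "Examplia" },
      "languages": [ { "code": "ex", "name": "Examplish" } ],
      "states": []
    }
  }
}'

country <- fromJSON(sample_json)$data$country

country$code              # the key from the source data, exposed as a property
country$continent$name    # continent is a single object, so it parses to a list
class(country$languages)  # languages is an array of objects, so it parses to a data frame
length(country$states)    # states is an empty array when there are none
```

Knowing which fields are objects and which are arrays helps when deciding how to reshape the result into a table later on.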
Loading the libraries
library(ghql)
library(jsonlite)
library(dplyr)
Link to the GraphQL API endpoint
link <- 'https://countries.trevorblades.com/'
Create a new GraphqlClient object
conn <- GraphqlClient$new(url = link)
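The Countries API is public, so the URL alone is enough here. If an endpoint requires authentication, GraphqlClient$new() also accepts a headers list. A minimal sketch, assuming a hypothetical endpoint URL and a token stored in a hypothetical MY_API_TOKEN environment variable:

```r
library(ghql)

# Hypothetical values: replace the URL and the environment variable name
# with the ones your own API uses
private_link <- "https://example.com/graphql"
token <- Sys.getenv("MY_API_TOKEN")

# Creating the client only stores the URL and headers; no request is sent yet
private_conn <- GraphqlClient$new(
  url = private_link,
  headers = list(Authorization = paste0("Bearer ", token))
)
```

Reading the token from an environment variable keeps it out of the script itself. Whether the API expects a Bearer token or some other header depends on the service.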
Define a GraphQL query
query <- '
query($code: ID!){
  country(code: $code){
    name
    native
    capital
    currency
    phone
    languages{
      code
      name
    }
  }
}'
Create a ghql Query object and register the query string under the name 'link'
new <- Query$new()$query('link', query)
Inspecting the stored query
new$link
##
##
## query($code: ID!){
##   country(code: $code){
##     name
##     native
##     capital
##     currency
##     phone
##     languages{
##       code
##       name
##     }
##   }
## }
Define a variable as a named list
variable <- list( code = "DE" )
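Why a named list? A GraphQL request is commonly sent as a JSON body with a query field and a variables field, and a named list maps naturally onto a JSON object. The sketch below only illustrates that mapping with jsonlite; it is not necessarily the exact code ghql runs internally, and the shortened query is just for readability.

```r
library(jsonlite)

# A shortened version of the query, for illustration
short_query <- 'query($code: ID!){ country(code: $code){ name } }'
variable <- list(code = "DE")

# auto_unbox = TRUE turns length-one vectors into JSON scalars rather than arrays
body <- toJSON(list(query = short_query, variables = variable), auto_unbox = TRUE)
body
```

The name code in the list must match the $code variable declared in the query, otherwise the server cannot fill in the value.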
Make the request by passing in the query and then the variables, then parse the returned JSON string into an R list
result <- conn$exec(new$link, variables = variable) %>%
  fromJSON(flatten = FALSE)
result
## $data
## $data$country
## $data$country$name
## [1] "Germany"
##
## $data$country$native
## [1] "Deutschland"
##
## $data$country$capital
## [1] "Berlin"
##
## $data$country$currency
## [1] "EUR"
##
## $data$country$phone
## [1] "49"
##
## $data$country$languages
##   code   name
## 1   de German
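A GraphQL server can respond with an errors field alongside (or instead of) data, so it can be worth checking for it before using the result. Here is a minimal sketch using a hypothetical helper, check_graphql(), and hand-written JSON strings that stand in for what conn$exec() returns:

```r
library(jsonlite)

# Hypothetical helper: parse a GraphQL JSON response and stop if it reports errors
check_graphql <- function(json) {
  parsed <- fromJSON(json)
  if (!is.null(parsed$errors)) {
    stop(
      "GraphQL request failed: ",
      paste(parsed$errors$message, collapse = "; "),
      call. = FALSE
    )
  }
  parsed$data
}

# Hand-written responses for illustration (not real API output)
good <- '{"data":{"country":{"name":"Germany"}}}'
bad  <- '{"errors":[{"message":"Something went wrong"}],"data":null}'

ok_data <- check_graphql(good)
err_msg <- tryCatch(check_graphql(bad), error = function(e) conditionMessage(e))
```

In a real script, the string passed to check_graphql() would be the output of conn$exec().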
Convert the parsed data into a tibble object
country_data <- result$data$country %>%
  as_tibble()
country_data
## # A tibble: 1 x 6
##   name    native      capital currency phone languages$code $name
##   <chr>   <chr>       <chr>   <chr>    <chr> <chr>          <chr>
## 1 Germany Deutschland Berlin  EUR      49    de             German
More examples
Querying a GraphQL API without passing a named list of variables
link <- 'https://countries.trevorblades.com/'

## Create the GraphQL client (an R6 object)
conn <- GraphqlClient$new(url = link)

## Create a Query object to hold the query
qry <- Query$new()

## Register the query under the name 'x'
qry$query('x', '{
  continent(code: "AF") {
    countries{
      code
      name
      native
      capital
      currency
      phone
      languages {
        name
      }
    }
  }
}')

## Execute the query
res <- conn$exec(qry$queries$x)

## Parse the returned JSON string into R objects
res <- jsonlite::fromJSON(res, flatten = TRUE)

## Convert the countries data frame to a tibble
res_data <- res$data$continent$countries %>% as_tibble()

## Inspect the data (a tibble prints its first 10 rows)
res_data
## # A tibble: 58 x 7
##    code  name             native            capital   currency phone languages
##    <chr> <chr>            <chr>             <chr>     <chr>    <chr> <list>
##  1 AO    Angola           Angola            Luanda    AOA      244   <df[,1] [1~
##  2 BF    Burkina Faso     Burkina Faso      Ouagadou~ XOF      226   <df[,1] [2~
##  3 BI    Burundi          Burundi           Bujumbura BIF      257   <df[,1] [2~
##  4 BJ    Benin            Bénin             Porto-No~ XOF      229   <df[,1] [1~
##  5 BW    Botswana         Botswana          Gaborone  BWP      267   <df[,1] [2~
##  6 CD    Democratic Repu~ République démoc~ Kinshasa  CDF      243   <df[,1] [5~
##  7 CF    Central African~ Ködörösêse tî Bê~ Bangui    XAF      236   <df[,1] [2~
##  8 CG    Republic of the~ République du Co~ Brazzavi~ XAF      242   <df[,1] [2~
##  9 CI    Ivory Coast      Côte d'Ivoire     Yamousso~ XOF      225   <df[,1] [1~
## 10 CM    Cameroon         Cameroon          Yaoundé   XAF      237   <df[,1] [2~
## # ... with 48 more rows
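In the output above, languages is a list column: each row holds a small data frame of language names. One way to get a flat table is to collapse each of those into a single string. A minimal sketch, using a small hand-built tibble named demo_data with made-up values as a stand-in for res_data:

```r
library(dplyr)

# Hand-built stand-in for res_data (illustrative values, not API output)
demo_data <- tibble(
  code = c("AA", "BB"),
  languages = list(
    data.frame(name = "Alpha"),
    data.frame(name = c("Beta", "Gamma"))
  )
)

# Collapse each nested data frame of language names into one comma-separated string
flat_data <- demo_data %>%
  mutate(
    languages = vapply(
      languages,
      function(x) paste(x$name, collapse = ", "),
      character(1)
    )
  )
flat_data
```

This keeps one row per country. If one row per country and language is preferred instead, the list column could be expanded rather than collapsed.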
The last dance (conclusion)
So, you have stuck with me this far? Thanks!
My final thought: I think GraphQL can greatly simplify data needs for client product developers, server-side engineers and data scientists alike. It’s still too early to ascertain the extent of its impact in the technological world. But it seems very promising, since the team behind GraphQL is continuously improving the technology and there is a growing community.