R Quick tip: Microsoft Cognitive Services’ Text Analytics API

Posted on March 1, 2017 by Steph in R bloggers | 0 Comments

[This article was first published on R – Locke Data, and kindly contributed to R-bloggers.]


Today in class, I taught some fundamentals of API consumption in R. As it was aligned to some Microsoft content, we first used HaveIBeenPwned.com’s API and then played with Microsoft Cognitive Services’ Text Analytics API. This brief post gives an overview of what you need to get started and shows how to chain consecutive calls to these APIs to perform multilingual sentiment analysis.

Getting started

To use the analytics API, you need two things:

A key
A URL

Get the key by signing up for free on the Cognitive Services site for the Text Analytics API. Make sure to verify your email address!

The URL can be retrieved from the API documentation and you can even play in an online sandbox before you start implementing it.
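Once you have a key, one option is to keep it out of your scripts by reading it from an environment variable. This is a minimal sketch, assuming the key is stored under the hypothetical name COGNITIVE_API_KEY (for example in ~/.Renviron). Both that name and the helper get_api_key are my own choices for illustration, not something the service requires.

```r
# Read the API key from an environment variable instead of hard-coding it.
# COGNITIVE_API_KEY is a hypothetical variable name chosen for this example.
get_api_key <- function(var = "COGNITIVE_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("No API key found in environment variable ", var)
  }
  key
}

# Usage: set the variable for this session (placeholder value), then read it back
Sys.setenv(COGNITIVE_API_KEY = "XXX")
cogapikey <- get_api_key()
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the API later on.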

To use the API in R, you need:

httr
jsonlite, which httr will kindly install for you

To make some things easier on myself, I’m also going to use dplyr and data.table.

library(httr)
library(jsonlite)
library(data.table)
library(dplyr)

Starting info

To talk to the API, we need our URL, our API key, and some data.

# Setup
cogapikey<-"XXX"
cogapi<-"https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages"

text=c("is this english?"
       ,"tak er der mere kage"
       ,"merci beaucoup"
       ,"guten morgen"
       ,"bonjour"
       ,"merde"
       ,"That's terrible"
       ,"R is awesome")

# Prep data
df<-tibble(id=1:8,text)
mydata<-list(documents= df)
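It can help to look at the JSON that will be sent before making any request. Here is a small sketch using jsonlite's toJSON on a two-row subset of the data. The names preview and payload exist only for this example.

```r
library(jsonlite)

# A two-row subset of the documents, built with base R for illustration
preview <- data.frame(id = 1:2,
                      text = c("is this english?", "merci beaucoup"),
                      stringsAsFactors = FALSE)

# toJSON turns each data.frame row into one JSON object inside "documents"
payload <- toJSON(list(documents = preview), pretty = TRUE)
print(payload)
```

Each row becomes an object with an id and a text field, which is the shape the request body in the following sections relies on.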

Language detection

Our phrases are in several different languages, so we first need to detect the language of each one before we can analyse its sentiment.

# Construct a request
response<-POST(cogapi, 
               add_headers(`Ocp-Apim-Subscription-Key`=cogapikey),
               body=toJSON(mydata))
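The request above assumes the call succeeds. As a hedged sketch, a small wrapper can stop on an HTTP error before any parsing happens. post_documents is a hypothetical helper name, and setting the JSON content type explicitly is my own addition rather than something the original code does.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: POST a data.frame of documents and return the parsed JSON
post_documents <- function(url, key, documents) {
  response <- POST(url,
                   add_headers(`Ocp-Apim-Subscription-Key` = key),
                   content_type_json(),
                   body = toJSON(list(documents = documents)))
  # Raise an R error for 4xx/5xx responses (bad key, wrong URL, and so on)
  stop_for_status(response)
  fromJSON(content(response, as = "text", encoding = "UTF-8"))
}

# Usage (not run here, since it needs a valid key and network access):
# parsed <- post_documents(cogapi, cogapikey, df)
```

Wrapping the call this way also means the language and sentiment requests can share one function and differ only in the URL.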

Now we need to consume our response so that we can add the language code to our existing data.frame. The structure of the response JSON doesn’t play well with others, so I use data.table’s nifty rbindlist. It is a very good candidate for purrr, but I’m not up to speed on that yet.

# Process response
respcontent<-content(response, as="text")
respdf<-tibble(
    id=as.numeric(fromJSON(respcontent)$documents$id), 
    iso6391Name=rbindlist(
      fromJSON(respcontent)$documents$detectedLanguages
    )$iso6391Name
  )
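To see why the flattening step is needed, here is a sketch run on a hand-written JSON string shaped the way the code above reads the language response: a documents array whose entries each carry an id and a nested detectedLanguages array. The sample values are made up for illustration and are not real API output. It uses base R's vapply as an alternative to rbindlist and takes the first detected language per document.

```r
library(jsonlite)

# Hand-written sample, NOT real API output
sample_json <- '{"documents":[
  {"id":"1","detectedLanguages":[{"name":"English","iso6391Name":"en","score":1}]},
  {"id":"2","detectedLanguages":[{"name":"French","iso6391Name":"fr","score":1}]}
]}'

# documents becomes a data.frame whose detectedLanguages column
# is a list holding one small data.frame per document
docs <- fromJSON(sample_json)$documents

# Pull the first language code out of each nested data.frame
langs <- data.frame(
  id = as.numeric(docs$id),
  iso6391Name = vapply(docs$detectedLanguages,
                       function(x) x$iso6391Name[1],
                       character(1)),
  stringsAsFactors = FALSE
)
print(langs)
```

Taking the first entry explicitly keeps one row per document even if a document were to come back with more than one candidate language, whereas binding all the nested tables would then produce more rows than ids.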

Now that we have a table, we can join the two together.

# Combine
df%>%
  inner_join(respdf, by = "id") %>%
  select(id, language=iso6391Name, text) ->
  dft

Sentiment analysis

With an ID, text, and a language code, we can now request the sentiment of our text be analysed.

# New info
mydata<-list(documents= dft)
cogapi<-"https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
# Construct a request
response<-POST(cogapi, 
               add_headers(`Ocp-Apim-Subscription-Key`=cogapikey),
               body=toJSON(mydata))

Processing this response is simpler than processing the language response.

# Process response
respcontent<-content(response, as="text")

fromJSON(respcontent)$documents %>%
   mutate(id=as.numeric(id)) ->
   responses

# Combine
dft %>%
  left_join(responses, by = "id") ->
  dfts

And… et voila! A multi-language dataset with the language identified and the sentiment scored where the language can be scored.

id	language	text	score
1	en	is this english?	0.2852910
2	da	tak er der mere kage	NA
3	fr	merci beaucoup	0.8121097
4	de	guten morgen	NA
5	fr	bonjour	0.8118965
6	fr	merde	0.0515683
7	en	That’s terrible	0.1738841
8	en	R is awesome	0.9546152
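The NA scores mark the phrases that came back without a sentiment score in this run. Here is a sketch of pulling those rows out, using a hand-copied version of the result table above. The name results is an assumption standing in for dfts, and base R subsetting is used so the snippet runs on its own.

```r
# Hand-copied from the result table above; "results" stands in for dfts
results <- data.frame(
  id = 1:8,
  language = c("en", "da", "fr", "de", "fr", "fr", "en", "en"),
  text = c("is this english?", "tak er der mere kage", "merci beaucoup",
           "guten morgen", "bonjour", "merde", "That's terrible",
           "R is awesome"),
  score = c(0.2852910, NA, 0.8121097, NA,
            0.8118965, 0.0515683, 0.1738841, 0.9546152),
  stringsAsFactors = FALSE
)

# Rows the sentiment call did not score, and the languages they belong to
unscored <- results[is.na(results$score), c("id", "language", "text")]
unscored_languages <- unique(unscored$language)
print(unscored)
```

Listing the unscored languages makes it easy to filter those rows out before the sentiment request, or to report them separately.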

The post R Quick tip: Microsoft Cognitive Services’ Text Analytics API appeared first on Locke Data.
