Literacy rates using semantics and R

[This article was first published on - R, and kindly contributed to R-bloggers].

Literacy Rates

Somehow I stumbled into the world of linked open data while trying to pull information off a Wikipedia page without having to write a custom scraper. Enter DBpedia, semantic technologies, and some wonderful R packages that take care of the back-end coding.

The Research Group Data and Web Science at the University of Mannheim has exposed a SPARQL endpoint for the CIA Factbook.

Using this endpoint and the following query, I was able to quickly pull the gender-specific literacy rates:

PREFIX db: <>
PREFIX rdfs: <>
PREFIX foaf: <>
PREFIX d2r: <>
PREFIX owl: <>
PREFIX map: <file:/var/www/>
PREFIX xsd: <>
PREFIX factbook: <>
PREFIX rdf: <>

  SELECT DISTINCT ?label
    ((?litMale - ?litFemale) AS ?litDiff)
  WHERE {
    ?resource factbook:literacy_female ?litFemale;
      factbook:literacy_male ?litMale;
      rdfs:label ?label .
  }
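
For readers who want to run a query like this from R, here is a minimal sketch rather than the author's actual code. It assumes the SPARQL package is installed, and it takes the endpoint URL and the Factbook namespace URI as arguments because neither is given in this post. The helper names build_literacy_query and run_literacy_query are hypothetical, and the sketch also returns ?litFemale and ?litMale (an addition to the query above) so the raw rates are available for plotting.

```r
# Hypothetical helper: builds the query text for a given Factbook namespace URI.
# The namespace is a parameter because it is not stated in the post.
build_literacy_query <- function(factbook_ns) {
  paste(
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
    sprintf("PREFIX factbook: <%s>", factbook_ns),
    # Assumption: the raw rates are selected too, so they can be plotted later.
    "SELECT DISTINCT ?label ?litFemale ?litMale",
    "  ((?litMale - ?litFemale) AS ?litDiff)",
    "WHERE {",
    "  ?resource factbook:literacy_female ?litFemale ;",
    "    factbook:literacy_male ?litMale ;",
    "    rdfs:label ?label .",
    "}",
    sep = "\n"
  )
}

# Hypothetical helper: sends the query to the endpoint and returns the
# result table as a data frame. Requires network access and the SPARQL package.
run_literacy_query <- function(endpoint, factbook_ns) {
  if (!requireNamespace("SPARQL", quietly = TRUE)) {
    stop("The SPARQL package is required for this sketch.")
  }
  response <- SPARQL::SPARQL(
    url = endpoint,
    query = build_literacy_query(factbook_ns)
  )
  response$results
}
```

Calling run_literacy_query() with the real endpoint URL and namespace should give one row per country, ready to hand to a plotting function.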

What’s the next logical step after getting data back in tabular form?

Visualization* using ggplot2!

Female literacy rates are on the x-axis and male literacy rates are on the y-axis. The size of each country name represents the gap between the male and female rates, and the color of the name is based on the relative “strength” of that gender difference.
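
A plot along these lines could be sketched in ggplot2 as follows. This is an illustration under stated assumptions, not the code from the repo: the three rows are made-up placeholder values, the column names (label, litFemale, litMale, litDiff) are assumed to match the query variables, and using the raw difference for color is only a guess at what “strength” means here.

```r
library(ggplot2)

# Made-up placeholder rows for illustration only; not real Factbook figures.
literacy <- data.frame(
  label     = c("Country A", "Country B", "Country C"),
  litFemale = c(95, 60, 40),
  litMale   = c(97, 75, 70)
)
literacy$litDiff <- literacy$litMale - literacy$litFemale

p <- ggplot(literacy, aes(x = litFemale, y = litMale, label = label)) +
  # Dashed reference line where male and female rates are equal.
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", colour = "grey60") +
  # Country names as the plotted marks: size and color both follow the gap.
  geom_text(aes(size = abs(litDiff), colour = litDiff)) +
  scale_colour_gradient(low = "grey40", high = "red") +
  labs(
    x = "Female literacy rate (%)",
    y = "Male literacy rate (%)",
    size = "Gap (points)",
    colour = "Male minus female"
  ) +
  theme_minimal()

print(p)
```

Names that sit far above the dashed line mark countries where male literacy exceeds female literacy by the widest margin.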

Full code is available in a GitHub repo: dataparadigms – SemanticR.
