Querying DBpedia from R
DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using SPARQL, an SQL-like query language for RDF. There is already an R package for this kind of query, named SPARQL.
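To give an idea of what such a query looks like, here is a minimal sketch that uses the SPARQL package directly. The endpoint URL, the query, and the helper name run_query are my assumptions for illustration; the call needs the SPARQL package and network access, so it is left commented out.

```r
# Assumption: the public DBpedia endpoint lives at this URL.
endpoint <- "https://dbpedia.org/sparql"

# Illustrative query: the English label of the DBpedia resource for Berlin.
label_query <- paste(
  "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
  "SELECT ?label WHERE {",
  "  <http://dbpedia.org/resource/Berlin> rdfs:label ?label .",
  "  FILTER (lang(?label) = \"en\")",
  "}",
  sep = "\n"
)

# Hypothetical helper: sends a query and returns the result table.
# SPARQL::SPARQL() returns a list; its `results` element holds the table.
run_query <- function(query, url = endpoint) {
  SPARQL::SPARQL(url = url, query = query)$results
}

# Needs the SPARQL package and network access, so it is not run here:
# run_query(label_query)
```

Building the query with paste() keeps each SPARQL line readable and avoids stray whitespace from a multi-line string literal.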
There is an S4 class Dbpedia, part of my datamart package, that aims to support the creation of predefined parameterized queries. Here is an example that retrieves data on German Federal States:
> library(datamart)
> dbp <- dbpedia()
> # see a list of predefined queries
> queries(dbp)
  Dbpedia#Nuts1 Xsparql#character
        "Nuts1"     "character"
> # lists Federal States
> head(query(dbp, "Nuts1"))
                     name nuts    popDate      pop
1           Niedersachsen  DE9 2007-10-31  7977000
2                  Hessen  DE7 2007-09-30  6073000
3     Nordrhein-Westfalen  DEA 2009-01-31 17920000
4 Freie Hansestadt Bremen  DE5 2007-10-31   664000
5                  Berlin  DE3 2010-09-30  3450889
6             Brandenburg  DE4 2008-12-31  2522493
         area   gdp popMetro
1 47624200000   188       NA
2 21100000000   225       NA
3 34084100000 54107       NA
4   408000000    24       NA
5   891850000    95  4429847
6 29478600000    48       NA
It is straightforward to extend the Dbpedia class for further queries. The more challenging part, in my opinion, is figuring out useful queries. Some examples can be found at Bob DuCharme’s blog, in the article by Jos van den Oever at kde.org, in a discussion on a mailing list, in a tutorial at the W3C, at Kingsley Idehen’s blog, and at DBpedia’s wiki.
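The idea of a parameterized query can be sketched in base R, independent of datamart. The function resource_label_query below is a hypothetical helper of my own, not the mechanism the Dbpedia class uses: it fills a SPARQL template with a resource name and a language tag, and rejects values that would not be safe to paste into the query.

```r
# Hypothetical helper, not part of datamart: fills a SPARQL template
# with a DBpedia resource name and a language tag.
resource_label_query <- function(resource, lang = "en") {
  # Reject anything that could break out of the IRI or the string literal.
  stopifnot(
    is.character(resource), length(resource) == 1,
    grepl("^[A-Za-z0-9_().,-]+$", resource),
    is.character(lang), length(lang) == 1,
    grepl("^[a-z]{2,3}$", lang)
  )
  template <- paste(
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
    "SELECT ?label WHERE {",
    "  <http://dbpedia.org/resource/%s> rdfs:label ?label .",
    "  FILTER (lang(?label) = \"%s\")",
    "}",
    sep = "\n"
  )
  sprintf(template, resource, lang)
}

# Print the query for the German label of Brandenburg.
cat(resource_label_query("Brandenburg", lang = "de"), "\n")
```

The resulting string can be sent to an endpoint with whatever SPARQL client is at hand. Validating the parameters before substitution matters because the template is filled by plain string formatting.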