Building a pokemon graph database
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
What happens when you combine Pokemon with Neo4j?
I’m a huge Pokemon fan. So, when I found out about this awesome post from Joshua Kunst, I just couldn’t wait to throw all that data into Neo4j.
It also happens to be a great way to learn how to build a graph database from scratch. The objective of this exercise is to build a graph database where the nodes are the pokemon and the types, and the relationships are the effectiveness between the pokemon based only on their types.
Getting the data
First of all, be sure to check Joshua’s post to learn how to import all that pokemon data. We will assume that the data is in a data frame called df.
Then, we need to get the relationships between types. The easiest way to accomplish that is to scrape the type chart from pokemondb.net.
library(RNeo4j)
library(rvest)
library(methods)
library(dplyr)
link <- "http://pokemondb.net/type"
link_html <- read_html(link)
types <- link_html %>%
  html_nodes("table") %>%
  .[[1]] %>%
  html_table()
#Format the table: type names become column names, blanks become 1, ½ becomes 0.5
names(types)[1] <- "Type"
types$Type <- tolower(types$Type)
names(types)[2:ncol(types)] <- types$Type
types[is.na(types)] <- 1
types[types == ""] <- 1
types[types == "½"] <- 0.5
knitr::kable(types, format = "html")
Type | normal | fire | water | electric | grass | ice | fighting | poison | ground | flying | psychic | bug | rock | ghost | dragon | dark | steel | fairy |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
normal | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.5 | 0 | 1 | 1 | 0.5 | 1 |
fire | 1 | 0.5 | 0.5 | 1 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 2 | 0.5 | 1 | 0.5 | 1 | 2 | 1 |
water | 1 | 2 | 0.5 | 1 | 0.5 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 2 | 1 | 0.5 | 1 | 1 | 1 |
electric | 1 | 1 | 2 | 0.5 | 0.5 | 1 | 1 | 1 | 0 | 2 | 1 | 1 | 1 | 1 | 0.5 | 1 | 1 | 1 |
grass | 1 | 0.5 | 2 | 1 | 0.5 | 1 | 1 | 0.5 | 2 | 0.5 | 1 | 0.5 | 2 | 1 | 0.5 | 1 | 0.5 | 1 |
ice | 1 | 0.5 | 0.5 | 1 | 2 | 0.5 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 0.5 | 1 |
fighting | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 0.5 | 1 | 0.5 | 0.5 | 0.5 | 2 | 0 | 1 | 2 | 2 | 0.5 |
poison | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 0.5 | 0.5 | 1 | 1 | 1 | 0.5 | 0.5 | 1 | 1 | 0 | 2 |
ground | 1 | 2 | 1 | 2 | 0.5 | 1 | 1 | 2 | 1 | 0 | 1 | 0.5 | 2 | 1 | 1 | 1 | 2 | 1 |
flying | 1 | 1 | 1 | 0.5 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 2 | 0.5 | 1 | 1 | 1 | 0.5 | 1 |
psychic | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 0.5 | 1 | 1 | 1 | 1 | 0 | 0.5 | 1 |
bug | 1 | 0.5 | 1 | 1 | 2 | 1 | 0.5 | 0.5 | 1 | 0.5 | 2 | 1 | 1 | 0.5 | 1 | 2 | 0.5 | 0.5 |
rock | 1 | 2 | 1 | 1 | 1 | 2 | 0.5 | 1 | 0.5 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 0.5 | 1 |
ghost | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 2 | 1 | 0.5 | 1 | 1 |
dragon | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 0.5 | 0 |
dark | 1 | 1 | 1 | 1 | 1 | 1 | 0.5 | 1 | 1 | 1 | 2 | 1 | 1 | 2 | 1 | 0.5 | 1 | 0.5 |
steel | 1 | 0.5 | 0.5 | 0.5 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 0.5 | 2 |
fairy | 1 | 0.5 | 1 | 1 | 1 | 1 | 2 | 0.5 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 0.5 | 1 |
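To see what the formatting step does without hitting the website, here is a minimal sketch on a hand-made three-type excerpt of the chart. The `raw` data frame and its abbreviated column names are assumptions standing in for the scraped `html_table()` output, where a blank cell means neutral damage and "½" means half damage. The last step, converting the columns to numeric, is an addition that the original code does not perform.

```r
# Toy stand-in for the scraped table (hypothetical excerpt, not the real scrape).
# Rows are attacking types, columns are defending types.
raw <- data.frame(
  Type = c("Fire", "Electric", "Ground"),
  Fir  = c("\u00bd", "", "2"),
  Ele  = c("", "\u00bd", "2"),
  Gro  = c("", "0", ""),
  stringsAsFactors = FALSE
)

chart <- raw
chart$Type <- tolower(chart$Type)      # lower-case the type names
names(chart)[-1] <- chart$Type         # reuse them as column names
chart[chart == ""] <- "1"              # blank cell = neutral damage
chart[chart == "\u00bd"] <- "0.5"      # the half symbol = half damage
chart[-1] <- lapply(chart[-1], as.numeric)  # extra step: store real numbers
print(chart)
```

Converting to numeric early avoids comparing strings against numbers later on, at the cost of one more line.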
Then we need to separate the types of the pokemon. Each pokemon has one or two types (type_1 and type_2), and we want one row per pokemon and type.
df %>% select(id, type = type_1) -> t1
df %>% select(id, type = type_2) -> t2
rbind(t1,t2) -> tf
poke_df <- df %>% select(-type_1, -type_2) %>%
  left_join(tf, by = "id") %>%
  filter(!is.na(type))
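The reshaping above can be tried on a tiny table first. The sketch below uses base R only, so it runs without dplyr. `df_toy` is a hypothetical three-row stand-in for `df` that keeps only the columns needed here.

```r
# Hypothetical stand-in for `df`: two dual-type pokemon and one single-type.
df_toy <- data.frame(
  id      = c(6, 25, 373),
  pokemon = c("charizard", "pikachu", "salamence"),
  type_1  = c("fire", "electric", "dragon"),
  type_2  = c("flying", NA, "flying"),
  stringsAsFactors = FALSE
)

# Stack both type columns into one long id/type table.
tf_toy <- rbind(
  data.frame(id = df_toy$id, type = df_toy$type_1, stringsAsFactors = FALSE),
  data.frame(id = df_toy$id, type = df_toy$type_2, stringsAsFactors = FALSE)
)

# Join the remaining attributes back on and drop the missing second types.
poke_toy <- merge(df_toy[, c("id", "pokemon")], tf_toy, by = "id")
poke_toy <- poke_toy[!is.na(poke_toy$type), ]
print(poke_toy)
```

Dual-type pokemon end up with two rows and single-type pokemon with one, which is the shape the node-creation step expects.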
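We are ready to import into Neo4j, so we first need to set up the connection. The variables url, username and password are assumed to hold your server address and credentials.
Then, we create the pokemon nodes and the type nodes, and link each pokemon to its types with a TYPE relationship.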
#Connect to Graph
graph = startGraph(url = url,
                   username = username,
                   password = password)
#Constraints
addConstraint(graph, "Pokemon", "id")
addConstraint(graph, "Type", "type")
#Create nodes and relationships within the same function
pokenodes <- function(x) {
  pokemon <- getOrCreateNode(graph, "Pokemon", id = x["id"], name = x["pokemon"],
                             height = x["height"], weight = x["weight"],
                             attack = x["attack"], defense = x["defense"],
                             hp = x["hp"], special_attack = x["special_attack"],
                             special_defense = x["special_defense"], speed = x["speed"],
                             url_image = x["url_image"], url_icon = x["url_icon"])
  type <- getOrCreateNode(graph, "Type", type = x["type"])
  createRel(pokemon, "TYPE", type)
}
#Apply to every row
apply(poke_df, 1, pokenodes)
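One thing worth knowing about the `apply()` call above: `apply()` converts a data frame to a matrix first, so with mixed column types every row reaches the function as a character vector. Numeric properties such as height or attack would then likely be sent to Neo4j as strings. The sketch below shows the effect on toy data, with no database needed. The `rows` data frame and its values are illustrative assumptions.

```r
# Illustrative rows with mixed column types (values are made up for the demo).
rows <- data.frame(
  id      = c(6, 25),
  pokemon = c("charizard", "pikachu"),
  height  = c(17, 4),
  stringsAsFactors = FALSE
)

# apply() hands each row over as a character vector.
via_apply <- apply(rows, 1, function(x) class(x[["height"]]))

# Indexing the data frame row by row keeps the original column types.
via_index <- vapply(seq_len(nrow(rows)),
                    function(i) class(rows$height[i]),
                    character(1))

print(via_apply)
print(via_index)
```

If numeric properties matter for later queries, looping over `seq_len(nrow(poke_df))` and passing `poke_df$height[i]` and friends is one way to keep the types intact.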
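We define the desired relationship (effectiveness) using the scraped table. We reshape it to one row per pair of attacking and defending type, and keep only the pairs whose multiplier is not neutral.
#gather() comes from tidyr
library(tidyr)
#Type is the attacking type (table row), Type_Rel the defending type (table column)
types <- types %>% gather(key = "Type_Rel", value = "value", -Type)
effectiveness <- types %>% filter(value != 1)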
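For readers without tidyr at hand, the same wide-to-long reshape can be sketched in base R. The `chart` data frame below is a hypothetical three-type excerpt of the cleaned table with numeric values, and the column names `Type`, `Type_Rel` and `value` follow the ones used in the post.

```r
# Hypothetical cleaned excerpt: rows attack, columns defend.
chart <- data.frame(
  Type     = c("fire", "electric", "ground"),
  fire     = c(0.5, 1, 2),
  electric = c(1, 0.5, 2),
  ground   = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Build one row per (attacking type, defending type) pair, column by column.
long <- data.frame(
  Type     = rep(chart$Type, times = ncol(chart) - 1),
  Type_Rel = rep(names(chart)[-1], each = nrow(chart)),
  value    = unlist(chart[-1], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Keep only the non-neutral pairs; these become the graph edges.
effectiveness_toy <- long[long$value != 1, ]
print(effectiveness_toy)
```

Dropping the neutral pairs keeps the graph small, and it loses nothing because a missing edge can be read as a multiplier of 1.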
And we are ready to upload the effectiveness, this time using a transaction. Thanks to Nicole White for this useful post.
#Query for creating relationships for the pokenodes
query = "
MERGE (n:Type {type:{type_1}})
MERGE (m:Type {type:{type_2}})
CREATE (n)-[r:EFECTIVENESS]->(m)
SET r.value = {value}
"
#Transaction endpoint
t = newTransaction(graph)
for (i in 1:nrow(effectiveness)) {
  type_1 = effectiveness[i, ]$Type
  type_2 = effectiveness[i, ]$Type_Rel
  value = effectiveness[i, ]$value
  appendCypher(t,
               query,
               type_1 = type_1,
               type_2 = type_2,
               value = value)
}
commit(t)
It’s time to query our database! Let’s find all the pokemon against which Salamence is doubly effective, that is, where the combined multiplier equals 2:
library(visNetwork)
#Query to check for effectiveness for Salamence
final_query <- "
match (n:Pokemon)-[t:TYPE]->(l:Type)-[e:EFECTIVENESS]->(s:Type)<-[j:TYPE]-(z:Pokemon)
where n.name = 'salamence'
return n.name as poke1, e.value as value, z.name as poke2, n.url_icon as icon1,
z.url_icon as icon2, n.url_image as image1, z.url_image as image2"
#Execute the query
poke_cypher = cypher(graph, final_query)
#Get data for VisNetwork
poke_cypher <- poke_cypher %>%
  mutate(value = as.numeric(value)) %>%
  group_by(poke1, poke2, image1, image2, icon1, icon2) %>%
  summarise(value = prod(value)) %>%
  ungroup()
#Filter by double effective
poke_sp_eft <- poke_cypher %>%
  filter(value == 2)
#More data for VisNetwork
poke <- unique(c(poke_sp_eft$poke1, poke_sp_eft$poke2))
img <- unique(c(poke_sp_eft$icon1, poke_sp_eft$icon2))
nodes <- data.frame(id = poke, label = poke, image = img, shape = "image")
edges <- poke_sp_eft %>%
  select(from = poke1, to = poke2)
#The VISUALIZATION
visNetwork(nodes, edges, width = "100%")
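The `summarise(value = prod(value))` step above multiplies the stored multipliers for each pair of pokemon. The idea behind it is that damage against a dual-type defender is the product of the multipliers against each of its types. Here is a minimal sketch of that arithmetic, using a small excerpt of the type chart shown earlier. The `chart` matrix and the `combined()` helper are hypothetical names introduced for illustration only.

```r
# Excerpt of the type chart: rows attack, columns defend.
chart <- matrix(
  c(2, 2, 0.5,   # ice    attacking dragon, flying, steel
    2, 1, 0.5),  # dragon attacking dragon, flying, steel
  nrow = 2, byrow = TRUE,
  dimnames = list(attack = c("ice", "dragon"),
                  defend = c("dragon", "flying", "steel"))
)

# Hypothetical helper: multiply the multipliers over the defender's types.
combined <- function(attack, defend_types) {
  prod(chart[attack, defend_types])
}

print(combined("ice", c("dragon", "flying")))     # against a dragon/flying defender
print(combined("dragon", c("dragon", "flying")))
print(combined("dragon", "steel"))
```

Because neutral pairs were never uploaded, the product over the stored edges matches the product over all pairs. Note that the query in the post groups by pokemon pair, so it multiplies over every stored pair formed from both pokemon's types, not over a single attacking type as in this sketch.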
And that’s how you do it! With RNeo4j it’s easy to set up a graph. Maybe in the future it could be expanded into a recommender system or something like that.
Check out a shiny app for the pokemon database!