Having fun with the rgexf package and Sigma.js in R

(This article was first published on jkunst.com: Entries for category R, and kindly contributed to R-bloggers)

Last week I discovered the R package rgexf, made by George Vega Yon. rgexf is an R library for working with GEXF graph files. This file type represents networks in XML. So, if you have a list of nodes and a data frame of edges (source-target), you can obtain a GEXF file with the write.gexf function. You can then open that file with Gephi or use it with Sigma.js to show it on the web. Simple, right?
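To make the "network as XML" idea concrete, here is a minimal base-R sketch that builds a bare-bones GEXF document by hand from a node table and an edge table. The helper `tiny_gexf` and the `toy_*` objects are hypothetical names used only for illustration; this is not how rgexf works internally, the real output of write.gexf carries more metadata and attributes, and the sketch does no XML escaping of labels.

```r
# Hypothetical helper, for illustration only; in practice use write.gexf.
# Assumes `nodes` has columns id and label, and `edges` has source and target.
tiny_gexf <- function(nodes, edges) {
  # One <node> element per row of the node table
  node_xml <- sprintf('      <node id="%s" label="%s"/>',
                      nodes$id, nodes$label)
  # One <edge> element per row of the edge table, with ids counted from 0
  edge_xml <- sprintf('      <edge id="%d" source="%s" target="%s"/>',
                      seq_len(nrow(edges)) - 1L, edges$source, edges$target)
  paste(c(
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">',
    '  <graph mode="static" defaultedgetype="undirected">',
    '    <nodes>', node_xml, '    </nodes>',
    '    <edges>', edge_xml, '    </edges>',
    '  </graph>',
    '</gexf>'
  ), collapse = "\n")
}

toy_nodes <- data.frame(id = 1:3, label = c("a", "b", "c"))
toy_edges <- data.frame(source = c(1, 1), target = c(2, 3))
toy_xml <- tiny_gexf(toy_nodes, toy_edges)
cat(toy_xml, "\n")
```

Printing `toy_xml` shows the two building blocks of the format: a list of `<node>` elements and a list of `<edge>` elements that refer to the nodes by id.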

Now, if you are working with your data and want to visualize your gexf object, you must either open Gephi or set up a simple HTTP server to view your chart with Sigma.js (I guess this is for a security reason, like this). To avoid this extra step, we can use the Rook R library to display gexf objects more quickly and easily.

Basically, we need to create an Rhttpd object with two apps: one to show the plot (using Sigma.js) and another to serve the data produced by write.gexf. The first app needs a template; I used the example found here and modified it a little (here, "save link as"). This index.html must be in your working directory. Now the function is:

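plot.gexf <- function(gexf.object){
      library(Rook)
      # XML string of the graph, as produced by write.gexf
      graph <- gexf.object$graph
      # Start R's built-in web server on localhost
      s <- Rhttpd$new()
      s$start(listen='127.0.0.1')

      # App 1: serves the Sigma.js page (index.html in the working directory)
      my.app <- function(env){
            res <- Response$new()
            res$write(paste(readLines("index.html", warn=FALSE), collapse="\n"))
            res$finish()
      }

      s$add(app=my.app, name='plot')

      # App 2: serves the GEXF data
      my.app2 <- function(env){
            res <- Response$new()
            res$write(graph)
            res$finish()
      }

      s$add(app=my.app2, name='data')
      # Open the 'plot' app in the default browser
      s$browse('plot')
}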
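Since the function reads index.html from the working directory, a missing template only shows up once the browser requests the page. A small pre-flight check can make that failure easier to spot. The helper name `check_template` is hypothetical, and this is just a sketch using base R:

```r
# Hypothetical helper: stop early with a readable message if the
# Sigma.js template is not where the plotting function expects it.
check_template <- function(path = "index.html") {
  if (!file.exists(path)) {
    stop("Template not found: ", normalizePath(path, mustWork = FALSE),
         " (working directory is ", getwd(), ")")
  }
  invisible(TRUE)
}

# Demonstration with a throwaway file instead of the real template
tmp_template <- tempfile(fileext = ".html")
writeLines("<html></html>", tmp_template)
template_ok <- check_template(tmp_template)
```

In a real session you would simply call `check_template()` before plotting.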

Now we need to create the gexf object.


nNodes <- 100
nRelations <- 200

nodes <- data.frame(id = c(1:nNodes),
                    names = c(1:nNodes))


allrelations <- as.data.frame(t(combn(nNodes, 2)))
relations <- allrelations[sample(1:nrow(allrelations),
                                 size = min(c(nRelations, nrow(allrelations)))),]
names(relations) <- c("target", "source")

nodecolors <- data.frame(r = sample(1:249, size = nrow(nodes), replace=TRUE),
                         g = sample(1:249, size = nrow(nodes), replace=TRUE),
                         b = sample(1:249, size = nrow(nodes), replace=TRUE),
                         a = runif(nrow(nodes), min=.5, max=1))


nodesizes <- sample(50:500, size=nrow(nodes), replace=TRUE)
edgethicks <- sample(50:500, size=nrow(relations), replace=TRUE)

All of this is random: the colors, the sizes, etc. But if the position of each node is also random, the visualization will not achieve its purpose: finding agglomerations, or groups of nodes that are closely connected to each other. For this reason we can use the sna package to compute a layout that places the nodes according to the links between them (if you know other packages with more algorithms, please email me!).

# Adjacency matrix: links[target, source] is set to 1 for every edge
links <- matrix(0, nrow = nNodes, ncol = nNodes)
links[cbind(relations$target, relations$source)] <- 1
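Note that the code above sets only links[target, source] for each edge, so the matrix is not symmetric. If you want to treat the network as undirected (an assumption on my part; it depends on what the layout function expects), you can mirror the matrix. A small sketch on toy data, with hypothetical `toy_*` names:

```r
# Toy edge list on 4 nodes, using the same column names as above
toy_edges <- data.frame(target = c(1, 1, 2), source = c(2, 3, 4))
n_toy <- 4

# One-directional adjacency matrix, filled by matrix indexing
toy_adj <- matrix(0, nrow = n_toy, ncol = n_toy)
toy_adj[cbind(toy_edges$target, toy_edges$source)] <- 1

# Mirror it so that toy_sym[i, j] == toy_sym[j, i]
toy_sym <- pmax(toy_adj, t(toy_adj))

isSymmetric(toy_sym)                   # should hold after mirroring
sum(toy_sym) == 2 * nrow(toy_edges)    # each edge is now counted twice
```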

library(sna)

positions <- gplot.layout.mds(links, layout.par=list())

positions <- cbind(positions, 0) # needs a z axis
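To get a feeling for what an MDS layout does, here is a rough base-R sketch of the idea: treat each row of the adjacency matrix as a description of a node, compute distances between those rows, and project them onto two dimensions with cmdscale. This is my own approximation of the idea, not the code of gplot.layout.mds, and the `toy_*` names are hypothetical.

```r
# Toy path graph 1-2-3-4-5 as a symmetric adjacency matrix
toy_adj <- matrix(0, nrow = 5, ncol = 5)
toy_adj[cbind(1:4, 2:5)] <- 1
toy_adj <- pmax(toy_adj, t(toy_adj))

# Nodes with similar connection patterns get small distances ...
toy_dist <- dist(toy_adj)

# ... and classical (metric) MDS turns those distances into x, y coordinates
toy_xy <- cmdscale(toy_dist, k = 2)

# Add a z column of zeros, as done above for the real graph
toy_xyz <- cbind(toy_xy, 0)
dim(toy_xyz)
```

Each row of `toy_xyz` is the (x, y, z) position of one node, which is the shape the position attribute needs.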

Finally we create the graph with the parameters and we plot it!

library(rgexf)

graph <- write.gexf(nodes=nodes,
                    edges=relations,
                    nodesVizAtt=list(
                      color=nodecolors,
                      size=nodesizes,
                      position=positions
                    ),
                    edgesVizAtt=list(
                      thickness= edgethicks
                    ))
                    
plot.gexf(graph)

And you'll obtain something like this live example. Have fun ;)!

Update:

There are many algorithms for laying out the network in the sna package:

# positions <- gplot.layout.adj(links, layout.par=list())
# positions <- gplot.layout.circle(links, layout.par=list())
# positions <- gplot.layout.circrand(links, layout.par=list())
# positions <- gplot.layout.eigen(links, layout.par=list())
# positions <- gplot.layout.fruchtermanreingold(links, layout.par=list())
# positions <- gplot.layout.geodist(links, layout.par=list())
# positions <- gplot.layout.hall(links, layout.par=list())
# positions <- gplot.layout.kamadakawai(links, layout.par=list())
positions <- gplot.layout.mds(links, layout.par=list())
# positions <- gplot.layout.princoord(links, layout.par=list())
# positions <- gplot.layout.random(links, layout.par=list())
# positions <- gplot.layout.rmds(links, layout.par=list())
# positions <- gplot.layout.segeo(links, layout.par=list())
# positions <- gplot.layout.seham(links, layout.par=list())
# positions <- gplot.layout.spring(links, layout.par=list())
# positions <- gplot.layout.springrepulse(links, layout.par=list())
# positions <- gplot.layout.target(links, layout.par=list())
