**Brokering Closure » R**, and kindly contributed to R-bloggers)

Elijah asked the following via SOCNET mailing list:

I was wondering if anyone knew of a script or tool which would give me the network distance of nodes to a particular class of nodes. I think of this as an Erdos number, except instead of getting the distance to one node, I want the distance to the closest node of a particular class. Let’s say I have a network of people and I know their professions. Some are Students, some are Journalists and a small number are Engineers. I’d like to be able to find out the network distance of each node to the closest Engineer node. It would be particularly useful if the script also had the option to total edge weight into the calculation.

If you get your network data into R it is fairly straightforward to do this using igraph package. Here is the function:

# shortest paths to nodes with a specified value on certain node attribute spnt <- function(g, aname, avalue, weights=NULL, ...) { require(igraph) stopifnot(inherits(g, "igraph")) a <- get.vertex.attribute(g, aname) m <- shortest.paths(g, v=V(g)[a==avalue], weights=weights, ...) apply(m, 2, min) } |

It assumes that ‘g’ is a network (object of class ‘igraph’), ‘aname’ is a name of the node attribute, ‘avalue’ is the value of the attribute ‘aname’ that designates the nodes to/from which we would like to calculate distances, finally ‘weights’ can be optionally used to include weights in the calculation (as a numeric vector).

The function will return a vector of distances in ‘g’ from all the nodes to the closest node that have a value ‘avalue’ on attribute ‘aname’.

As an example consider the network below. It is undirected and has 15 nodes. It has two attributes defined: a node attribute called “color” having values “orange”, “lightblue”, and “lightgreen”, and an edge attribute called “w” with values 1 or 2. Both attributes are shown in the picture as a node color and edge label. The numbers on the nodes are node ids.

Assuming that the network is called ‘g’ we can use the function above in the following way:

# from lightblue nodes to all others spnt(g, "color", "lightblue") ## [1] 0 1 2 1 2 3 0 1 2 1 2 3 2 3 4 # from orange nodes to all others spnt(g, "color", "orange") ## [1] 1 0 1 0 1 0 1 0 0 2 1 0 2 1 0 # to lightblue, but using weights (shortest path = minimal weight) spnt(g, "color", "lightblue", weights=E(g)$w) ## [1] 0 2 3 1 2 3 0 2 4 2 3 4 3 5 5 |

A couple of end notes:

- In the result vector you will get 0s for the nodes of specified type, i.e. in the last example there are 0s for the “lightblue” nodes.
- If a certain node is not connected (directly or via other nodes) to any node of specified type the vector will contain ‘Inf’ (plus infinity).
- The algorithm will not accept negative weights. But this limitation can be effectively dodged by transforming the weights so that they are all positive (for example adding some number), performing the computation, and then transforming back the results to the original scale.
- You can exploit other features of ‘shortest.paths’ function, on which this function is based. Any extra arguments to ‘spnt’ are passed to ‘shortest.paths’. For example, if the network is directed you can calculate shortest paths that are either incoming, or outgoing (via ‘mode’ argument). See help page of ‘shortest.paths’.

**leave a comment**for the author, please follow the link and comment on their blog:

**Brokering Closure » R**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...