Based on Launchpad traffic and mailing list responses, Gabor and Tamas will soon be releasing igraph 0.6. In celebration, I’ll be publishing a number of helpful lists and tables I’ve put together to organize information about igraph.
In this post, we’ll cover the community detection algorithms (~i.e., clustering, partitioning, segmenting) available in 0.6 and their characteristics, such as their worst-case runtime performance and whether they support directed or weighted edges. Much of the information below is gleaned from the igraph C documentation, source algorithm publications, and three years of tracking the 0.6 trunk.
Optimal Modularity
In my mind, modularity is more of a framework than just a simple function over a graph. I don’t know how Mark feels, but the amount of work based on this idea, implicitly (~87,000) or explicitly (~1000), is staggering given the short six years since.
That said, modularity is just a framework, and, like all frameworks, has its shortcomings. If you are going to talk about modularity in a quantitative way, there are two must-read ideas on the topic:
- The performance of modularity maximization in practical contexts.
B. H. Good, Y.-A. de Montjoye and A. Clauset.
Physical Review E 81, 046106 (2010). - Resolution limit in community detection
Santo Fortunato, Marc Barthelemy - More work on tradeoffs in this limit by Fortunato:
Limits of modularity maximization in community detection
Andrea Lancichinetti, Santo Fortunato
TL;DR: The plateau of the modularity surface we care about most behaves poorly in real networks, and, depending on |V| and |E| in absolute and relative terms, we might not see “smaller” clusters.
If you’re still going to calculate the optimal modularity configuration, here are the details:
- New to version 0.6: TRUE
- Directed edges: FALSE
- Weighted edges: FALSE
- Handles multiple components: TRUE
- Runtime: B^(|V|^2)
Edge-Betweenness
OK, time to get less verbose. I’ll stick to the primary reference and igraph characteristics going forward:
- Community structure in social and biological networks
M. Girvan and M. E. J. Newman - New to version 0.6: FALSE
- Directed edges: TRUE
- Weighted edges: TRUE
- Handles multiple components: TRUE
- Runtime: |V||E|^2
Leading Eigenvector
- Finding community structure in networks using the eigenvectors of matrices
MEJ Newman
Phys Rev E 74:036104 (2006) - New to version 0.6: FALSE
- Directed edges: FALSE
- Weighted edges: FALSE
- Handles multiple components: TRUE
- Runtime: c|V|^2 + |E|
Fast-Greedy
- Finding community structure in very large networks
Aaron Clauset, M. E. J. Newman, Cristopher Moore - Finding Community Structure in Mega-scale Social Networks
Ken Wakita, Toshiyuki Tsurumi - New to version 0.6: FALSE
- Directed edges: FALSE
- Weighted edges: TRUE
- Handles multiple components: TRUE
- Runtime: |V||E| log |V|
Multi-Level
- Fast unfolding of communities in large networks
Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre - New to version 0.6: TRUE
- Directed edges: FALSE
- Weighted edges: TRUE
- Handles multiple components: TRUE
- Runtime: “linear” when |V| \approx |E|; (a quick glance at the algorithm suggests this would be quadratic for fully-connected graphs)
Walktrap
- Computing communities in large networks using random walks
Pascal Pons, Matthieu Latapy - New to version 0.6: FALSE
- Directed edges: FALSE
- Weighted edges: TRUE
- Handles multiple components: FALSE
- Runtime: |E||V|^2
Label Propagation
- Near linear time algorithm to detect community structures in large-scale networks.
Raghavan, U.N. and Albert, R. and Kumara, S.
Phys Rev E 76, 036106. (2007) - New to version 0.6: TRUE
- Directed edges: FALSE
- Weighted edges: TRUE
- Handles multiple components: FALSE
- Runtime: |V| + |E|
InfoMAP
- The map equation
M. Rosvall, D. Axelsson, C. T. Bergstrom - New to version 0.6: TRUE
- Directed edges: TRUE
- Weighted edges: TRUE
- Weighted nodes: TRUE
- Handles multiple components: FALSE
- Runtime: none given; looks worst case like |V|(|V| + |E|) based on quick reading.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).