(This article was first published on **StaTEAstics.**, and kindly contributed to R-bloggers)

This week, I got my hands on some agricultural trade data. Trade data are typically extremely dirty, so handle them with care when you get your hands on them. Standard lab equipment is required.

So I decided to look at how countries trade by plotting the network. (The data are confidential, so I will not disclose the countries or the commodity.)

## Load the library

library(XML)

library(reshape)

library(igraph)

t1001.df = read.csv("http://dl.dropbox.com/u/18161931/trade.csv", header = TRUE)

## Create the graph

net.mat = as.matrix(t1001.df)

net.g = graph(t(unique(net.mat[, 1:2])))

## Delete vertices with no edges

full.g = delete.vertices(net.g, which(degree(net.g) == 0))

## Set edge width according to share of trade volume (floored at 0.05, then scaled by 20)

E(full.g)$width = net.mat[, 3]/sum(net.mat[, 3])

E(full.g)$width[E(full.g)$width < 0.05] = 0.05

E(full.g)$width = E(full.g)$width * 20

## Compute the total trade value of each exporting country

sum.df = with(t1001.df, aggregate(TradeValue, list(rtCode), sum))

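## Change size and color of exporting countries

## Map each exporter code to its vertex index after the deletion above

## (assumes the country codes are the vertex ids used to build net.g)

exp.idx = match(sum.df[, 1], which(degree(net.g) > 0))

exp.share = sum.df[, "x"]/sum(sum.df[, "x"])

V(full.g)$size = 8

V(full.g)$size[exp.idx] = (exp.share - min(exp.share)) * 15 + 8

V(full.g)$color = "lightblue"

V(full.g)$color[exp.idx] = "steelblue"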

## Plot the network

set.seed(587)

plot(full.g, edge.arrow.size = 0.3, edge.curved = TRUE,

     vertex.label.color = "black")

The exporters are coloured in dark blue, while the importers are in light blue. The width of each connection is proportional to the amount of trade between the two countries.

Looking at the plot, one can easily identify that countries 43 and 13 are major exporters, while countries 7 and 37 are major importers. This information can be easily extracted with some simple analysis; however, there are some subtle points that are a little hard to identify without a network diagram.
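The "simple analysis" could be as small as summing the trade value per exporter and per importer. Here is a minimal sketch in base R on a made-up toy table. The column names `rtCode` and `TradeValue` follow the code above, while `ptCode` (the partner, assumed here to be the importer) and all of the numbers are assumptions for illustration only.

```r
## Toy data, invented for illustration; not the confidential data set
toy.df = data.frame(
    rtCode = c(43, 43, 13, 13, 28),       # exporter (reporter)
    ptCode = c(7, 37, 7, 37, 7),          # importer (partner); hypothetical column name
    TradeValue = c(500, 300, 200, 100, 50)
)

## Total value shipped by each exporter and received by each importer
export.df = with(toy.df, aggregate(TradeValue, list(country = rtCode), sum))
import.df = with(toy.df, aggregate(TradeValue, list(country = ptCode), sum))

## Sort so that the largest traders come first
export.df = export.df[order(-export.df$x), ]
import.df = import.df[order(-import.df$x), ]

print(export.df)
print(import.df)
```

The first rows of `export.df` and `import.df` then give the major exporters and importers without any plotting.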

(1) There are clear clusters: certain countries import from only one of 43, 13, or 28, while others import from more than one. There could be cost, logistic, trade, or geographical reasons for this kind of pattern.

(2) Country 10 is isolated, meaning that it does not trade with the rest of the world!
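Once the diagram has pointed them out, both observations can also be checked numerically. The sketch below counts how many distinct exporters each importer buys from, and lists the countries that have no trade record with any other country. The data, the `ptCode` column name, and the `all.codes` vector of known country codes are hypothetical, and treating an isolated country as one that only has a self-trade record is an assumption, not something the original data confirm.

```r
## Toy data, invented for illustration
toy.df = data.frame(
    rtCode = c(43, 43, 13, 13, 28, 10),   # exporter
    ptCode = c(7, 37, 7, 5, 7, 10)        # importer; hypothetical column name
)
all.codes = c(5, 7, 10, 13, 28, 37, 43)   # hypothetical list of known countries

## Drop self-trade records before counting partners
cross.df = toy.df[toy.df$rtCode != toy.df$ptCode, ]

## Number of distinct exporters supplying each importer
n.suppliers = with(cross.df,
                   tapply(rtCode, ptCode, function(x) length(unique(x))))

## Importers tied to a single exporter versus those buying from several
single.source = names(n.suppliers)[n.suppliers == 1]
multi.source = names(n.suppliers)[n.suppliers > 1]

## Countries that never trade with another country
isolated = setdiff(all.codes, c(cross.df$rtCode, cross.df$ptCode))

print(single.source)
print(multi.source)
print(isolated)
```

Importers in `single.source` are the ones that sit in a single exporter's cluster in the plot; those in `multi.source` bridge clusters.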

The network reveals some subtle information very quickly and is a very good exploratory tool for trade data.
