Non-Uniform Population Density in some European Countries

April 17, 2016
By

(This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers)

A few months ago, I did mention that France was a country with strong inequalities, especially when you look at higher education, and research teams. Paris has almost 50% of the CNRS researchers, while only 3% of the population lives there.

It looks like Paris is the only city, in France. And I wanted to check that, indeed, France is a country with strong inequalities, when we look at population density.

Using data from sedac.ciesin.columbia.edu, it is possible to get population density on a small granularity level,

> rm(list=ls())
> base=read.table(
+      "/home/charpentier/glp00ag.asc",
+      skip=6)
> X=t(as.matrix(base,ncol=8640))
> X=X[,ncol(X):1]

The scales for latitudes and longitudes can be obtained from the text file,

> #ncols         8640
> #nrows         3432
> #xllcorner     -180
> #yllcorner     -58
> #cellsize      0.0416666666667

Hence, we have

> library(maps)
> world=map(database="world")
> vx=seq(-180,180,length=nrow(X)+1)
> vx=(vx[2:length(vx)]+vx[1:(length(vx)-1)])/2
> vy=seq(-58,85,length=ncol(X)+1)
> vy=(vy[2:length(vy)]+vy[1:(length(vy)-1)])/2

If we plot our density, as in a previous post, on Where People Live,

> I=seq(1,nrow(X),by=10)
> J=seq(1,ncol(X),by=10)
> image(vx[I],vy[J],log(1+X[I,J]),
+ col=rev(heat.colors(101)))
> lines(world[[1]],world[[2]])

we can see that we have a match, between the big population matrix, and polygons of countries.

Consider France, for instance. We can download the contour polygon with higher precision,

> library(rgdal)
> fra=download.file(
"http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm0.rds",
+ "fr.rds")
> Fra=readRDS("fr.rds")
> n=length([email protected][[1]]@Polygons)
> L=rep(NA,n)
> for(i in 1:n) L[i]=nrow([email protected][[1]]@Polygons[[i]]@coords)
> idx=which.max(L)
> polygon_Fr=
+       [email protected][[1]]@Polygons[[idx]]@coords
> min_poly=apply(polygon_Fr,2,min)
> max_poly=apply(polygon_Fr,2,max)
> idx_i=which((vx>min_poly[1])&(vx> idx_j=which((vy>min_poly[2])&(vy> sub_X=X[idx_i,idx_j]
> image(vx[idx_i],vy[idx_j],
+       log(sub_X+1),col=rev(heat.colors(101)),
+       xlab="",ylab="")
> lines(polygon_Fr)

We are now able to extract information about population for France, only (actually, it is only mainland France, islands are not considered here… to avoid complicated computations

> library(sp)
> xy=expand.grid(x = vx[idx_i], y = vy[idx_j])
> dim(xy)
[1] 65730     2

Here, we have 65,730 small squares, in France.

> pip=point.in.polygon(xy[,1],xy[,2],
+     polygon_Fr[,1],polygon_Fr[,2])>0
> dim(pip)=dim(sub_X)
> Fr=sub_X[pip]
> sum(Fr)
[1] 58105272

Observe that the total population within the French polygon is close to 60 million people, which is consistent with actual figures. Now, if we look more carefully at repartition over the French territory

> library(ineq)
> Gini(Fr)
[1] 0.7296936

Gini coefficient is rather high (over 70%), but it is also possible to visualize Lorenz curve,

> plot(Lc(Fr))

Observe that in 5% of the territory, we can find almost 54% of the population

> 1-min(LcF$L[LcF$p>.95])
[1] 0.5462632

In order to compare with other countries, consider the

> LC=function(rds="fr.rds"){
+ Fra=readRDS(rds)
+ n=length([email protected][[1]]@Polygons)
+ L=rep(NA,n)
+ for(i in 1:n) 
L[i]=nrow([email protected][[1]]@Polygons[[i]]@coords)
+ idx=which.max(L)
+ polygon_Fr=
+      [email protected][[1]]@Polygons[[idx]]@coords
+ min_poly=apply(polygon_Fr,2,min)
+ max_poly=apply(polygon_Fr,2,max)
+ idx_i=which((vx>min_poly[1])&(vx+ idx_j=which((vy>min_poly[2])&(vy+ sub_X=X[idx_i,idx_j]
+ xy=expand.grid(x = vx[idx_i], y = vy[idx_j])
+ dim(xy)
+ pip=point.in.polygon(xy[,1],xy[,2],
+     polygon_Fr[,1],polygon_Fr[,2])>0
+ dim(pip)=dim(sub_X)
+ Fr=sub_X[pip]
+ return(list(gini=Gini(Fr),LC=Lc(Fr))
+ }
> FRA=LC()

For instance, consider Germany, or Italy

> deu=download.file(
"http://biogeo.ucdavis.edu/data/gadm2.8/rds/DEU_adm0.rds","deu.rds")
> DEU=LC("deu.rds")
> ita=download.file(
"http://biogeo.ucdavis.edu/data/gadm2.8/rds/ITA_adm0.rds","ita.rds")
> ITA=LC("ita.rds")

It is possible to plot Lorenz curve, together,

> plot(FRA$LC,col="blue")
> lines(DEU$LC,col="black")
> lines(ITA$LC,col="red")

Observe that France is clearly below the other ones. Compared with Germany, there is a significant difference

> FRA$gini
[1] 0.7296936
> DEU$gini
[1] 0.5088853

More precisely, if 54% of French people live in 5% of the territory, only 40% of Italians, and 32% of the Germans,

> 1-min(FRA$LC$L[FRA$LC$p>.95])
[1] 0.5462632
> 1-min(ITA$LC$L[ITA$LC$p>.95])
[1] 0.3933227
> 1-min(DEU$LC$L[DEU$LC$p>.95])
[1] 0.3261124

To leave a comment for the author, please follow the link and comment on their blog: R-english – Freakonometrics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)