Shapefiles from Isodensity Curves

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Recently, with @3wen, we wanted to play with isodensity curves. The problem is that it is difficult to get – numerically – the equation of the contour (even if we can easily plot it). Consider the following surface (just for fun, in order to illustrate the idea)

> f=function(x,y) x*y+(1-x)*(1-y)
> u=v=seq(0,1,length=21)
> v=seq(0,1,length=11)
> f=outer(u,v,f)
> persp(u,v,f,theta=angle,phi=10,box=TRUE,
+ shade=TRUE,ticktype="detailed",xlab="",
+ ylab="",zlab="",col="yellow")

For instance, assume that we want to locate areas where the density exceed 0.7 (here in the lower left corner, SW, and the upper right corner, NE)

> image(u,v,f)
> contour(u,v,f,add=TRUE,levels=.7)

Is it possible to get the shapefile of the area(s) where the densityexceed some given threshold?

Recall that our density is defined on a grid. The points (on the grid) such that the density exceed the threshold are obtained using

> x=matrix(c(vectu[vecti==TRUE],
+ vectv[vecti==TRUE]),ncol=2)
> plot(x,xlim=0:1,ylim=0:1)

(here it is not perfectly symmetric since I wanted to have a thinner grid on one axis, and a larger one on the other one, just for fun). To get a nice shapefile, let us consider the convex hull of the points (or to be more specific some hull, see Goswami (2013) or Pless (2012)), but actually, we’d better add a random noise to avoid straight lines (for computational issues)

> x=matrix(c(vectu[vecti==TRUE],
+ vectv[vecti==TRUE]),ncol=2)+
+ rnorm(sum(idx)*2)/10000

The is obtained using

> library(alphahull)
> alphashape <- ashape(x, alpha = .2)

The contour is obtained by connecting the following points (here, we have indices of  points, used to draw segments ),

> alphashape$edges[, 1:2]
      ind1 ind2
 [1,]    3    2
 [2,]    7   13
 [3,]    6    5
 [4,]    5    4
 [5,]    6   12
 [6,]   14   13
 [7,]   16   15
 [8,]   15   14
 [9,]   12   16
[10,]   17   18
[11,]   18   22
[12,]   22   28
[13,]   21   27
[14,]   34   27
[15,]   30   29
[16,]   29   28
[17,]   31   30
[18,]   33   32
[19,]   32   31
[20,]   34   33
[21,]    1    2
[22,]    1    7
[23,]    3    4
[24,]   17   21

The graph of the convex hull is the following

> plot(alphashape, col = "blue")

The problem is that we do not have a shapefile here. We have indices of points used to draw some segments. We should start from a given point, and then, we go the its neighbor, etc

> id =alphashape$edges[, 1:2]
> boucle=FALSE
> listi=id[1,1:2]
> vk=2:nrow(id)
> i0=as.numeric(listi[length(listi)])
> while(boucle==FALSE){
+ idxi0=which(id[vk,1]==i0)
+ if(length(idxi0)>0)  {nb=id[vk,2][idxi0]}
+ if(length(idxi0)==0) {idxi0=which(id[vk,2]==i0)
+ nb=id[vk,1][idxi0]}
+ if(length(idxi0)==0) {boucle=TRUE}
+ if(boucle==FALSE){
+ listi=c(listi,nb)
+ vk=vk[-idxi0]
+ i0=nb}}

The shapefile for the region in the lower part is

> px=x[listi,]
> px
               [,1]          [,2]
 [1,]  1.000752e-01 -1.416283e-05
 [2,]  5.003630e-02  1.089980e-05
 [3,] -8.050675e-05 -1.453569e-04
 [4,]  6.210392e-05  9.996789e-02
 [5,]  1.017049e-04  1.999146e-01
 [6,]  5.006795e-02  1.998075e-01
 [7,]  1.001361e-01  2.001562e-01
 [8,]  1.499776e-01  1.998995e-01
 [9,]  2.500919e-01  1.000922e-01
[10,]  2.499919e-01 -6.847855e-06
[11,]  1.999688e-01  1.454993e-04
[12,]  1.499061e-01  1.595938e-04
[13,]  1.000752e-01 -1.416283e-05

If we build the polygon associated to those points, we can draw it

> polygon(px,col="red")

Now, remember that we did add a noise. If we round the values, we get

> px=x[listi,]
> pxt=round(px*20)/20
> plot(x)
> polygon(pxt,col="purple")

Based on that function, it is possible to draw anything. For instance, it is possible to visualize isodensity curves of the bivariate Gaussian distribution,

Now, we did use a similar function to visualize hot spot on various maps. In our original work, we have visualization such as

@3wen recently uploaded a long post on his blog, to describe the algorithm we used, to go from the level curves on the graph above to the clusters described below,

Similar graphs can be found in the revised version of our joint paper, Kernel Density Estimation with Ripley’s Circumferential Correction.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)