(This article was first published on

I few years ago, while I was working on kernel based density estimation
on compact support distribution (like copulas) I went through a series
of papers on circular distributions. By that time, I thought it was
something for mathematicians working on weird spaces.... but during the
past weeks, I saw several potential applications of those estimators.**Freakonometrics - Tag - R-english**, and kindly contributed to R-bloggers)- circular data density estimation

with a circular relationship, i.e. . It can be seen as an invariance by rotation.

von Mises proposed a parametric model in 1918 (see here or there), assuming that

where is Bessel modified function of order 1,

(which is simply a normalization parameter). There are two parameters here, (some concentration parameter) and mu a direction.

From a series of observed angles, the maximum likelihood estimator for kappa is solution of

where

and

and where , where those functions are modified Bessel functions. Well, that estimator is biased, but it is possible to improve it (see here or there). This can be done easily in R (actually Jeff Gill - here - used that package in several applications). But I am not a big fan of that technique....

- density estimation for hours on simulated data

The time is now seen as an angle. It is possible to consider the equivalent of an histogram,

set.seed(1)

library(circular)

X=rbeta(100,shape1=2,shape2=4)*24

Omega=2*pi*X/24

Omegat=2*pi*trunc(X)/24

H=circular(Omega,type="angle",units="radians",rotation="clock")

Ht=circular(Omegat,type="angle",units="radians",rotation="clock")

plot(Ht, stack=FALSE, shrink=1.3, cex=1.03,

axes=FALSE,tol=0.8,zero=c(rad(90)),bins=24,ylim=c(0,1))

points(Ht, rotation = "clock", zero =c(rad(90)),

col = "1", cex=1.03, stack=TRUE )

rose.diag(Ht-pi/2,bins=24,shrink=0.33,xlim=c(-2,2),ylim=c(-2,2),

axes=FALSE,prop=1.5)

circ.dens = density(Ht+3*pi/2,bw=20)

plot(Ht, stack=TRUE, shrink=.35, cex=0, sep=0.0,

axes=FALSE,tol=.8,zero=c(0),bins=24,

xlim=c(-2,2),ylim=c(-2,2), ticks=TRUE, tcl=.075)

lines(circ.dens, col="darkgrey", lwd=3)

text(0,0.8,"24", cex=2); text(0,-0.8,"12",cex=2);

text(0.8,0,"6",cex=2); text(-0.8,0,"18",cex=2)

The code is simply the following

U=seq(0,1,by=1/250)

O=U*2*pi

U12=seq(0,1,by=1/24)

O12=U12*2*pi

X=rbeta(100,shape1=2,shape2=4)*24

OM=2*pi*X/24

XL=c(X-24,X,X+24)

d=density(X)

d=density(XL,bw=d$bw,n=1500)

I=which((d$x>=6)&(d$x<=30))

Od=d$x[I]/24*2*pi-pi/2

Dd=d$y[I]/max(d$y)+1

plot(cos(O),-sin(O),xlim=c(-2,2),ylim=c(-2,2),

type="l",axes=FALSE,xlab="",ylab="")

for(i in pi/12*(0:12)){

abline(a=0,b=tan(i),lty=1,col="light yellow")}

segments(.9*cos(O12),.9*sin(O12),1.1*cos(O12),1.1*sin(O12))

lines(Dd*cos(Od),-Dd*sin(Od),col="red",lwd=1.5)

text(.7,0,"6"); text(-.7,0,"18")

text(0,-.7,"12"); text(0,.7,"24")

R=1/24/max(d$y)/3+1

lines(R*cos(O),R*sin(O),lty=2)

Note that it is possible to stress more (visually) on hours having few phone calls, or a lot (compared with an homogeneous Poisson process), e.g.

plot(cos(O),-sin(O),xlim=c(-2,2),ylim=c(-2,2),

type="l",axes=FALSE,xlab="",ylab="")

for(i in pi/12*(0:12)){

abline(a=0,b=tan(i),lty=1,col="light yellow")}

segments(2*cos(O12),2*sin(O12),1.1*cos(O12),1.1*sin(O12),

col="light grey")

segments(.9*cos(O12),.9*sin(O12),1.1*cos(O12),1.1*sin(O12))

text(.7,0,"6")

text(-.7,0,"18")

text(0,-.7,"12")

text(0,.7,"24")

R=1/24/max(d$y)/3+1

lines(R*cos(O),R*sin(O),lty=2)

AX=R*cos(Od);AY=-R*sin(Od)

BX=Dd*cos(Od);BY=-Dd*sin(Od)

COUL=rep("blue",length(AX))

COUL[R<Dd]="red"

CM=cm.colors(200)

a=trunc(100*Dd/R)

COUL=CM[a]

segments(AX,AY,BX,BY,col=COUL,lwd=2)

lines(Dd*cos(Od),-Dd*sin(Od),lwd=2)

We get here those two graphs,

- density of wind direction

- density of 911 phone calls

*midnight crime*" myth, using hours of 911 phone calls in Montréal.

That was for all phone calls. But if we look more specifically, for

**, we have the distribution on the left, and for**

*burglaries***the one on the right**

*conflicts***, we have the distribution on the left, and for**

*gun shots***(basically people making too much noisy in parties) or**

*"troubles"***the one on the right**

*"noise"*- density of earth temperatures, or earthquakes

The idea being that point on the left above are extremely close to the one on the right. An application can be e.g. on earthquakes occurrence. Data can be found here.

library(ks)

X=cbind(EQ$Longitude,EQ$Latitude)

Hpi1 = Hpi(x = X)

DX=kde(x = X, H = Hpi1)

library(maps)

map("world")

plot(DX,add=TRUE,col="red")

points(X,cex=.2,col="blue")

Y=rbind(cbind(X[,1],X[,2]),cbind(X[,1]+360,X[,2]),

cbind(X[,1]-360,X[,2]),cbind(X[,1],X[,2]+180),

cbind(X[,1]+360,X[,2]+180),cbind(X[,1]-360,X[,2]+180),

cbind(X[,1],X[,2]-180),cbind(X[,1]+360,

X[,2]-180),cbind(X[,1]-360,X[,2]-180))

DY=kde(x = Y, H = Hpi1)

library(maps)

plot(DY,add=TRUE,col="purple")

To

**leave a comment**for the author, please follow the link and comment on his blog:**Freakonometrics - Tag - R-english**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...