Generating Hurricanes with a Markov Spatial Process

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The National Hurricane Center (NHC) collects datasets with all  storms in North Atlantic, the North Atlantic Hurricane Database (HURDAT). For all sorms, we have the location of the storm, every six jours (at midnight, six a.m., noon and six p.m.). Note that we have also the date, the maximal wind speed – on a 6 hour window – and the pressure in the eye of the storm.

It is possible to run the following function

library(XML)
extract.track=function(year=2012,p=TRUE){

First, we need to extract the list of all storms

loc <- paste("http://weather.unisys.com/hurricane/atlantic/",year,"/index.php",sep="")
tabs <- readHTMLTable(htmlParse(loc)) 
storms <- unlist(strsplit(as.character(
tabs[[1]]$Name),split=" "))
index <- storms%in%c("Tropical","Storm",paste("Hurricane-",1:6,sep=""),"Depression","Subtropical","Extratropical","Low",paste("Storm-",1:6,sep=""),
"Xxx")
nstorms  <- storms[!index]

and then, for all storms, we go to the appropriate page, and we extract the dataset,

TRACK=NULL
for(i in length(nstorms):1){
loc=paste("http://weather.unisys.com/hurricane/atlantic/",year,"/",nstorms[i],"/track.dat",sep="")
track=read.fwf(loc,skip=3,widths = c(4,6,8,12,4,6,20))
names(track)=c("ADV","LAT","LON","TIME","WIND","PR","STAT")
track$LAT=as.numeric(as.character(track$LAT))
track$LON=as.numeric(as.character(track$LON))
track$WIND=as.numeric(as.character(track$WIND))
track$PR=as.numeric(as.character(track$PR))
track$year=year
track$name=nstorms[i]
TRACK=rbind(TRACK,track)
if(p==TRUE){  cat(year,i,nstorms[i],nrow(track),"n")}}
return(TRACK)}

For instance, for 2012, we have

> m=extract.track(2012)
2012 19 TONY 20 
2012 18 SANDY 45 
2012 17 RAFAEL 56 
2012 16 PATTY 11 
2012 15 OSCAR 13 
2012 14 NADINE 96 
2012 13 MICHAEL 42 
2012 12 LESLIE 60 
2012 11 KIRK 22 
2012 10 JOYCE 12 
2012 9 ISAAC 51 
2012 8 GORDON 26 
2012 7 HELENE 38 
2012 6 FLORENCE 21 
2012 5 ERNESTO 39 
2012 4 DEBBY 18 
2012 3 CHRIS 31 
2012 2 BERYL 33 
2012 1 ALBERTO 20 
> head(m)
  ADV  LAT   LON      TIME WIND  PR  STAT year name
1   1 20.1 -50.8  10/21/18Z   25 1011 LOW 2012 TONY
2   2 20.4 -51.2  10/22/00Z   25 1011 LOW 2012 TONY
3   3 20.8 -51.5  10/22/06Z   25 1010 LOW 2012 TONY
4   4 21.3 -51.7  10/22/12Z   30 1009 LOW 2012 TONY

With a simple loop we can extract all the years (note some manual corrections are necessary, since there are about ten typos in the lists, on the hurricane names).

TOTTRACK=NULL
for(y in 2012:1851){
TOTTRACK=rbind(TOTTRACK,extract.track(y))
}

We can visualize all those trajectories,

library(maps)
map("world",xlim=c(-80,-40),ylim=c(10,50),col="light yellow",fill=TRUE)
for(n in unique(TOTTRACK$name)){
lines(TOTTRACK$LON[TOTTRACK$name==n],TOTTRACK$LAT[TOTTRACK$name==n],lwd=.5,col="red")}

If we look at the distribution of those points, we have

library(ks)
U=TOTTRACK[,c("LON","LAT")]
U=U[!is.na(U$LON),]
H=diag(c(.2,.2))
fat=kde(U,H,xmin=c(min(U[,1]),min(U[,2])),xmax=c(max(U[,1]),max(U[,2])))
z=fat$estimate
image(z)
map("world",add=TRUE)

Almost a year ago, Christophe Denuse-Baillon – in his actuarial thesis – suggested to use a Markov space process.  I do not want to fit a parametric model, but for all locations, it is possible to use a nonparametric approach. For all locations, on a given grid, it is possible to get all past movements (if any). To be more specific, given a location (a truncated latitude and a
truncated longitude), let us extract two informations,

  • the proportion of trajectories which ended at this location
  • for all trajectories that did not end at this location, we can extract location 6 hours lated, after being at this location.

The code will be

mtransition=function(i,j){
MOVEMENT=NA
sumstop=0
p=1
idx=which( (TOTTRACK$LON>=gridx[i])&(TOTTRACK$LON<gridx[i+1]) &
           (TOTTRACK$LAT>=gridy[j])&(TOTTRACK$LAT<gridy[j+1]) )
if(length(idx)>0){
MOVEMENT=data.frame(LON=rep(NA,length(idx)),LAT=rep(NA,length(idx)),D=rep(NA,length(idx)))
for(s in 1:length(idx)){
stops=TRUE
if((is.na(TOTTRACK$name[idx[s]+1])==FALSE)&(is.na(TOTTRACK$name[idx[s]])==FALSE)){
stops=TOTTRACK$name[idx[s]+1]!=TOTTRACK$name[idx[s]]}
	if(stops==TRUE ){sumstop=sumstop+1}	
	if(stops==FALSE){
	d=(TOTTRACK$LON[idx[s]+1]-TOTTRACK$LON[idx[s]])^2+(TOTTRACK$LAT[idx[s]+1]-TOTTRACK$LAT[idx[s]])^2
	locx=locy=NA
	if((d<90)& TOTTRACK$LON[idx[s]+1]<0){
locx=floor(TOTTRACK$LON[idx[s]+1]/pasgrid)
locy=floor(TOTTRACK$LAT[idx[s]+1]/pasgrid)
	}
	MOVEMENT[s,]=c(locx,locy,d)
	}}
p=sumstop/length(idx)}
return(list(listemouv=MOVEMENT,probstop=p))}

For convenience, for all values on a grid, let us store all those values in a big list

ListTransMat=list()
ListStopMat =list()
for(i in 1:(length(gridx)-1)){
for(j in 1:(length(gridy)-1)){
nom=abs(gridx[i])*1000+abs(gridy[j])
M=matricetransition(i,j)
ListTransMat[[nom]] = M$listemouv
ListStopMat[[nom]]  = M$probstop
}}

Now we can start some code to generate a trajectory. First, we need a starting point, from one we did observe.

n=nrow(TOTTRACK)
idx=which(TOTTRACK$name[1:(n-1)]!=TOTTRACK$name[2:n])
STARTINGVALUE=TOTTRACK[c language="(1,idx+1),c("LON","LAT")"][/c]

If we plot them on the map above, we have

abline(v=gridx,col="white",lwd=.5)
abline(h=gridy,col="white",lwd=.5)
points(STARTINGVALUE$LON,STARTINGVALUE$LAT,
pch=19,col="blue")

From that location we look at possible trajectories, and then, we generate two random variables. One to see if we move, or if the trajectory ended where we were. And in the case we move, we draw
randomly among past trajectories. The idea is the following. Start at some specific location

and  then, using the previous function, look were you might be, 6 hours later,

Draw one possible location, from those observe previously (note that we might reamin in the same square)

sim.mouv=function(LOC){
lon=LOC[1]
lat=LOC[2]
nom=as.numeric(abs(lon)*1000+abs(lat))
M=ListTransMat[[nom]]
M=M[!is.na(M$LON),]
stop=ListStopMat[[nom]]
u=runif(1)
if(u<=stop) LOC2=NA
if(u>stop) LOC2=M[sample(1:nrow(M),size=1),c("LON","LAT")]
return(LOC2)
}

And then, from that new location

we move again (according to some previous trajectory, observed in the past)

Actually, keep in mind that it might also be possible to stop here. The code is – more or less – the following

sim.traj=function(){
i=sample(1:nrow(STARTINGVALUE),size=1)
LOC=round(STARTINGVALUE[i,])
MLOC=LOC
stop=FALSE
while(stop==FALSE){
LOC2=sim.mouv(LOC)
if(is.na(LOC2))  stop=TRUE
if(!is.na(LOC2)) LOC=LOC2; MLOC=rbind(MLOC,LOC)
}
return(MLOC)
}

Below, we can visualize two possible trajectories, for hurricanes that we did generate,

Nice, isn’t it? Now we need to add additional information, about the strength of the hurricaine, again using past experience… But let’s keep that for another post!

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)