# Earthquakes in Netherlands

February 24, 2013
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

In the Netherlands we have Natural Gas. Unfortunately winning this gas seems to cause some quakes. As quakes go, they are not strong. However, our buildings are not made to resist quakes, before 1986 they were unheard of, so there is some damage. It is now predicted they could get stronger and more frequent. This caused a bit of a stir in the Netherlands and I wondered if I could display the data so I can understand the quakes a bit better.

The source of the data is knmi.nl. I found a spreadsheet with data upto 28 January 2013 called geinduceerde-bevingen-nl.pdf. The data would not convert very well in calibre, but after zooming out, I could ctrl-A, ctrl-C it from Adobe into notepad. As so often, a bit of post processing is needed. These days I tend to do that in R.
library(ggplot2)
# read as lines of text for processing
[1] "Geïnduceerde aardbevingen in Nederland"
[2] "YYMMDD TIME LOCATION LAT LON X_RD Y_RD INT MAG.DEPTH"
[3] "19861226 074751.00 Assen 52.992 6.548 232,924 556,587 4.5 2.8 1.0"
[4] "19871214 204951.05 Hooghalen 52.928 6.552 233,266 549,537 4 2.5 1.5"
[5] "19891201 200914.35 Purmerend 52.529 4.971 126,697 504,593 5 2.7 1.2"
[6] "19910215 021116.54 Emmen 52.771 6.914 257,992 532,491 3.5 2.2 3.0"
Although not clear from this head, it seems there are a number of problems running this through read.table(). The worst are; each page of the .pdf contains a header, LOCATION can contain a space, INT may be missing, MAG.DEPTH is actually two variables.
#print notes, then remove them
grep('^YYMMDD|Ge|[[:digit:]]{8}',Datain,invert=TRUE,value=TRUE)
[1] "www.knmi.nl"
[2] "Any use of this material [text, illustrations, graphics] is only permitted when the source www.knmi.nl is acknowledged."
Datain <-grep('^(YYMMDD|[[:digit:]]{8})',Datain,value=TRUE)
#column header is repeated. remove all but first
# Quote around location names
Datain <-gsub('([[:digit:]]) ([[:alpha:]\'])','\\1 "\\2',Datain,fixed=FALSE)
Datain <-gsub('([[:alpha:]() ]) ([[:digit:]])','\\1" \\2',Datain,fixed=FALSE)
#quote around INT, so it can be read even if it is absent
Datain <-gsub('([[:digit:]]{2,3},[[:digit:]]{3} [[:digit:]]{3},[[:digit:]]{3} )','\\1" ',Datain,fixed=FALSE)
Datain <-gsub('( -?[[:digit:]]*\\.[[:digit:]]* [[:digit:]]*\\.[[:digit:]]*$)',' "\\1',Datain,fixed=FALSE) # remove one occurence 'not yet known' Datain <-grep('Nog niet bekend',Datain,value=TRUE,invert=TRUE) # and split the last two words Datain <-gsub('MAG.DEPTH','Mag Depth',Datain,fixed=TRUE) # now read into a dataframe r1 <-read.table(textConnection(Datain),header=TRUE, colClasses=c(c('character','character'),rep(NA,8))) To use the data, I need a proper data-time variable. My interest is limited to Groningen and Drente, so some data is removed. The projection into a map I stole from another blog (rpsychologist.com) and updated for a newer version of ggplot2. r1$DTC <-as.POSIXct(paste(r1$YYMMDD,r1$TIME),format='%Y%m%d %H%M%S')
topred$Mag <- predict(lo2,topred) png('earthquake time.png') ggplot(r2, aes(DTC,diftime,col=Mag)) + geom_point() + scale_x_datetime('Date',limits=as.POSIXct(c('1995-01-01','2013-02-01'))) + scale_colour_gradient(low="blue",high="red") + scale_y_log10('Time between quakes (days)',limits=c(1e-1,50)) + geom_line(size=2,data=topred) dev.off() ### Where? I wanted a good reason to plot some data in a map for ages. I am very happy this is the good occasion. And it is actually very simple. The only difficulty was using colour for progression. ggplot2 didn't like the POSIXct variable DTC, so I had to make a second variable, mydate, which was numeric and mimics years. The red is more recent. Most quakes are north-east of Groningen city, which is the region I associate them with. There are definitely 'hot spots' and 'hot regions'. I was surprised to learn of quakes in Ter Apel (bottom right), and just south of Assen, that is (relatively) far away. #where is the quake? mp <-openmap(upperLeft=c(max(ylim),min(xlim)),lowerRight=c(min(ylim),max(xlim))) r2$mydate <-1995+(2013-1995)*(as.numeric(r2\$DTC-as.POSIXct('1995-01-01')))/
as.numeric(as.POSIXct('2013-01-01')-as.POSIXct('1995-01-01'))
png('earthquake map.png')
autoplot(mp) +
geom_point(aes(x,y, color=mydate,guide='legend'),
alpha = I(7/10), data=r2) +
theme(axis.line=element_blank(),axis.text.x=element_blank(),
axis.text.y=element_blank(),axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank()) +
theme(legend.position = "bottom")
dev.off()