on the Spatial Analysis blog a nice visualisation of the major shipping routes of the British, Dutch and Spanish fleets in 1750-1850 was presented recently, based on the Climatological Database for the World’s Oceans (CLIWOC). another, even nicer, visualisation of the same data was presented by unconsenting. in neither case was any code provided, in particular with regard to just getting the data into a readable format. so i did some digging into the CLIWOC home page with just that in mind.
being a Linux Fedora user, i started by limiting my (albeit quick) trial to what was supposed to be a linux/unix-readable format on the CLIWOC main database release page, this being the CLIWOC15.Z file. that trial was not very successful. i managed to read the data into R after getting rid of some “unwanted” characters that R did not “like”. however, the fields (columns supposed to be semicolon-delimited) were more or less messed up, and the number of records was also lower than specified on the page. most importantly, the trial was based on non-reproducible code within the R environment.
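for what it is worth, one way to see where the columns of a semicolon-delimited file go wrong is to count the fields per line before trying to read it in. the snippet below is only a minimal sketch on a made-up toy file (the file content and the variable names are my own assumptions, not the real CLIWOC15 data); it uses count.fields() from base R.

```r
# toy file standing in for a semicolon-delimited export (hypothetical content)
tmpfile <- tempfile(fileext = ".txt")
writeLines(c("id;lat;lon",
             "1;52.1;4.3",
             "2;51.9",            # one field short
             "3;50.0;3.1;extra"), # one field too many
           tmpfile)

# number of fields on each line; quoting and comment handling switched off
n_fields <- count.fields(tmpfile, sep = ";", quote = "", comment.char = "")

# lines whose field count differs from the header line
bad_lines <- which(n_fields != n_fields[1])
print(table(n_fields))
print(bad_lines)
```

lines flagged in bad_lines are the ones worth inspecting by hand before blaming read.table().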
so, reluctantly, i decided to have a go at the MS Access database source, CLIWOC15_2000.zip, albeit not having high hopes that i could find a solution that would work within the Linux environment. but after some searching on the web on how to read the mdb format directly into the R environment within Linux, i stumbled across this post. in particular: “Use mdb.get() from Hmisc package to import entire tables from the database into dataframes.” just what the doctor ordered. now, i had the Hmisc library already installed, but i did not have any success with the mdb.get() function. reading the help file on mdb.get (?mdb.get) one “gets” that it relies on the mdbtools package being installed on the system:
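require(Hmisc)
# need also mdbtools, in Fedora do: yum install mdbtools
path <- "yourworkingdirectory"   # prefix for the downloaded zip file name
URL  <- "http://www.knmi.nl/"
PATH <- "cliwoc/download/"
FILE <- "CLIWOC15_2000.zip"
# download the zipped Access database
download.file(paste(URL,PATH,FILE,sep=""), paste(path,"CLIWOC15_2000.zip",sep=""))
# unzip into the current directory; unzip() returns the extracted path as "./name"
dir <- unzip(paste(path,"CLIWOC15_2000.zip",sep=""))
file <- substr(dir,3,nchar(dir))   # drop the leading "./"
# read all tables of the mdb file into a list of data frames
dat <- mdb.get(file)
tmp <- dat$CLIWOC15[,c("Lon3","Lat3")]
require(ggplot2)
# plot every logged position as a faint point on a map projection
ggplot(tmp,aes(Lon3,Lat3)) + geom_point(alpha=0.01,size=1) + coord_map() + ylim(-90,90)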
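since mdb.get() fails when mdbtools is missing, it may be worth checking for it before calling the function. the following is a minimal sketch, assuming that the executables mdb.get() relies on are mdb-tables, mdb-schema and mdb-export (as described in its help file); the helper name have_mdbtools is hypothetical, just something i made up for illustration.

```r
# hypothetical helper: TRUE if all the assumed mdbtools executables are on the PATH
have_mdbtools <- function(cmds = c("mdb-tables", "mdb-schema", "mdb-export")) {
  # Sys.which() returns "" for every command it cannot find
  all(nzchar(Sys.which(cmds)))
}

if (!have_mdbtools()) {
  message("mdbtools not found; in Fedora do: yum install mdbtools")
}
```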