R and the Geotechnical Exchange Format

December 24, 2011

(This article was first published on Bart Rogiers - Sreigor, and kindly contributed to R-bloggers)

Quite some time ago, I wrote this function to read some *.gef files into R. “GEF” stands for Geotechnical Exchange Format. Details on this data format, as well as several software tools, can be found at http://www.geffiles.nl/. I had a long list of very specific *.gef files, so the function is not very generic, but it might give you a good starting point if you need to do something similar. I might update it in the future, though.

read.gef <- function(filename)
{
  # Read the file line by line; header lines start with '#', the rest is data
  gef.lines <- scan(filename, what=character(), sep='\n')
  is.comment <- substr(gef.lines, 1, 1)=='#'
  gef.lines.comments <- gef.lines[is.comment]
  gef.lines.data <- gef.lines[!is.comment]
  gef <- list()
  # One data column per COLUMNINFO header line
  column.info <- grep('COLUMNINFO=', gef.lines.comments, value=TRUE)
  nc <- length(column.info)
  nr <- length(gef.lines.data)
  gef$data <- matrix(ncol=nc, nrow=nr)
  for(i in seq_len(nr))
  {
    # Split the row on spaces and drop the empty strings left by repeated spaces
    fields <- remove.empty.strings(strsplit(gef.lines.data[i], ' ')[[1]])
    for(j in seq_len(nc))
    {
      gef$data[i,j] <- as.numeric(fields[j])
    }
  }
  gef$data <- as.data.frame(gef$data)
  # Name each column after the fourth space-separated token of its COLUMNINFO line
  for(i in seq_len(nc)) names(gef$data)[i] <- strsplit(strsplit(column.info[i], ' ')[[1]][4], ',')[[1]][1]
  # Coordinates and surface level: third and fourth tokens of the XYID and ZID lines
  gef$x <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('XYID=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$y <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('XYID=',gef.lines.comments)], ' ')[[1]])[4], ',')[[1]])
  gef$surface <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('ZID=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  if(length(grep('PARENT=',gef.lines.comments))==1) # file is child
  {
    gef$depth <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('PARENT=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$z <- gef$surface-gef$depth
  }
  # A parent file has no PARENT line, so depth and z are left unset
  # Note: 'MEASUREMENTVAR= 1' also matches entries 10-19; [[1]] keeps the first matching line
  gef$cone <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 1',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 2',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$a.cone <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 3',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  gef$a.sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 4',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
  return(gef)
}
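
The header lookups in read.gef all follow the same pattern: find the line for a keyword, split it, and pick one value. A minimal sketch of that pattern as a single helper could look like the code below. The name gef.header.field and the sample header lines are made up for illustration and are not part of the original function; the sketch also splits on commas rather than spaces, which assumes the values on a header line are comma-separated.

```r
# Hypothetical helper (not part of the original post): return the k-th
# comma-separated value that follows '#KEY=' on a header line.
gef.header.field <- function(header.lines, key, k = 1) {
  # Keep only the lines that start with '#KEY='
  prefix <- paste0('#', key, '=')
  hits <- header.lines[startsWith(header.lines, prefix)]
  if (length(hits) == 0) return(NA_character_)
  # Drop the key, split the remainder on commas, and trim whitespace
  rest <- substring(hits[1], nchar(prefix) + 1)
  fields <- trimws(strsplit(rest, ',', fixed = TRUE)[[1]])
  fields[k]
}

# Made-up header lines, for illustration only
header <- c('#XYID= 31000, 123456.78, 234567.89',
            '#ZID= 31000, 25.40',
            '#COLUMNINFO= 1, m, length, 1')

# Same values read.gef picks: the second and third entries after XYID=,
# and the second entry after ZID=
x <- as.numeric(gef.header.field(header, 'XYID', 2))
y <- as.numeric(gef.header.field(header, 'XYID', 3))
surface <- as.numeric(gef.header.field(header, 'ZID', 2))
```

Returning NA for a missing keyword makes it possible to test for a PARENT line with is.na() instead of counting grep matches.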

The function below is used in the code above:

remove.empty.strings <- function(stringArray)
{
  # Collect the non-empty elements of a character vector, in order
  newStringArray <- NULL
  for(i in seq_along(stringArray))
    if(stringArray[i] != '') {newStringArray <- c(newStringArray, stringArray[i])}
  return(newStringArray)
}
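
The same filtering can be done without a loop. The sketch below uses a made-up data row, not one taken from a real *.gef file, and shows how base R's nzchar() keeps the non-empty tokens before they are converted to numbers.

```r
# A made-up data row with repeated spaces between the values
row <- '0.02  1.250   0.013'

# Splitting on a single space leaves empty strings between repeated spaces
tokens <- strsplit(row, ' ')[[1]]

# nzchar() is TRUE for non-empty strings, so this keeps the same elements
# as the loop in remove.empty.strings
non.empty <- tokens[nzchar(tokens)]

# Convert the remaining tokens to numbers, as read.gef does for each field
values <- as.numeric(non.empty)
print(values)
```

One difference to keep in mind: for an input with no non-empty strings, this version returns character(0), whereas the loop returns NULL.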
