R and the Geotechnical Exchange Format

December 24, 2011
(This article was first published on Bart Rogiers - Sreigor, and kindly contributed to R-bloggers)

Quite some time ago, I wrote this function to read *.gef files into R. "GEF" stands for Geotechnical Exchange Format. Details on the data format, as well as several software tools, can be found at http://www.geffiles.nl/. I had a long list of very specific *.gef files, so the function is not very generic, but it might provide a good starting point if you need to do something similar. I might update it in the future, though.
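To make the token positions used in the function below easier to follow, here is a minimal sketch of how a single header line breaks apart. The sample line is hypothetical (its values are made up) and only assumes the `#KEYWORD= value, value, ...` layout that the function expects.

```r
# Hypothetical header line; the values are made up for illustration
line <- "#XYID= 31000, 1000.5, 2000.25"

# Split on spaces and drop the empty strings that repeated spaces produce
tokens <- strsplit(line, " ")[[1]]
tokens <- tokens[nzchar(tokens)]

# Token 1 is the keyword; tokens 3 and 4 hold the coordinates, each possibly
# followed by a comma that has to be stripped before conversion
x <- as.numeric(sub(",$", "", tokens[3]))
y <- as.numeric(sub(",$", "", tokens[4]))
```

This is why the function indexes the third and fourth tokens of the XYID line, and the third token of the other header lines it reads.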


read.gef <- function(filename)
{
  # Read all non-empty lines; header lines start with '#', the rest is data
  gef.lines <- scan(filename, what=character(), sep='\n')
  is.header <- substr(gef.lines, 1, 1) == '#'
  gef.lines.comments <- gef.lines[is.header]
  gef.lines.data <- gef.lines[!is.header]
  gef <- list()

  # n-th space-separated token of a header line, without its trailing comma
  header.token <- function(line, n)
  {
    strsplit(remove.empty.strings(strsplit(line, ' ')[[1]])[n], ',')[[1]][1]
  }
  # Same, as a number, for the first header line containing 'keyword'
  # (NA if there is no such line)
  header.value <- function(keyword, n)
  {
    line <- grep(keyword, gef.lines.comments, fixed=TRUE, value=TRUE)
    if(length(line) == 0) return(NA_real_)
    as.numeric(header.token(line[1], n))
  }

  # One column per COLUMNINFO header line, one row per data line
  column.lines <- grep('COLUMNINFO=', gef.lines.comments, fixed=TRUE, value=TRUE)
  nc <- length(column.lines)
  nr <- length(gef.lines.data)
  gef$data <- matrix(ncol=nc, nrow=nr)
  for(i in seq_len(nr))
  {
    fields <- remove.empty.strings(strsplit(gef.lines.data[i], ' ')[[1]])
    gef$data[i,] <- as.numeric(fields[seq_len(nc)])
  }
  gef$data <- as.data.frame(gef$data)
  # Name each column after the fourth token of its COLUMNINFO line
  for(i in seq_len(nc)) names(gef$data)[i] <- header.token(column.lines[i], 4)

  # Coordinates and surface level
  gef$x <- header.value('XYID=', 3)
  gef$y <- header.value('XYID=', 4)
  gef$surface <- header.value('ZID=', 3)
  if(length(grep('PARENT=', gef.lines.comments, fixed=TRUE)) == 1) # file is child
  {
    gef$depth <- header.value('PARENT=', 3)
    gef$z <- gef$surface - gef$depth
  }
  # (nothing extra is read when the file is a parent)

  # MEASUREMENTVAR 1 to 4; the comma in the pattern keeps variable 1 from
  # also matching variables 10, 11, ...
  gef$cone <- header.value('MEASUREMENTVAR= 1,', 3)
  gef$sleeve <- header.value('MEASUREMENTVAR= 2,', 3)
  gef$a.cone <- header.value('MEASUREMENTVAR= 3,', 3)
  gef$a.sleeve <- header.value('MEASUREMENTVAR= 4,', 3)
  return(gef)
}

The function below is used in the code above:

remove.empty.strings <- function(stringArray)
{
  # Keep only the non-empty elements; this also works for zero-length input
  return(stringArray[nzchar(stringArray)])
}
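
Since the function above is tied to a specific set of files, a somewhat more generic variant could collect every header line by keyword and let read.table parse the data block. The following is only a sketch under assumptions: `parse.gef.lines` and the sample lines are hypothetical, header lines are assumed to look like `#KEYWORD= value, value, ...`, and data columns are assumed to be whitespace-separated.

```r
# 'parse.gef.lines' is a hypothetical helper, not part of the code above
parse.gef.lines <- function(lines, sep = "")
{
  is.header <- startsWith(lines, "#")

  # Split each header line at the first '=' into a keyword and its values
  header.lines <- lines[is.header & grepl("=", lines, fixed = TRUE)]
  keys <- trimws(sub("^#([^=]*)=.*$", "\\1", header.lines))
  values <- strsplit(trimws(sub("^[^=]*=", "", header.lines)), " *, *")

  # Keywords such as COLUMNINFO occur several times, so group by keyword
  header <- split(values, keys)

  # Let read.table parse the whole data block in one call
  data.lines <- lines[!is.header]
  data <- if (length(data.lines) > 0) {
    read.table(text = data.lines, sep = sep, header = FALSE)
  } else {
    data.frame()
  }

  # Third COLUMNINFO value as column name, i.e. the same token that
  # read.gef uses for the column names
  info <- header[["COLUMNINFO"]]
  if (length(info) > 0 && length(info) == ncol(data)) {
    names(data) <- sapply(info, `[`, 3)
  }

  list(header = header, data = data)
}

# Hypothetical sample content; the values are made up for illustration
sample.lines <- c("#COLUMNINFO= 1, m, depth, 1",
                  "#COLUMNINFO= 2, MPa, resistance, 2",
                  "#XYID= 31000, 1000.5, 2000.25",
                  "#ZID= 31000, 12.5",
                  "#EOH=",
                  "0.00 1.25",
                  "0.02 1.50")
gef <- parse.gef.lines(sample.lines)
gef$header$XYID[[1]]   # character vector with the values of the XYID line
names(gef$data)        # column names taken from the COLUMNINFO lines
```

Keeping the header values as character vectors leaves the conversion to numbers to the caller, which avoids guessing which keywords carry numeric values.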