R and the Geotechnical Exchange Format

[This article was first published on Bart Rogiers - Sreigor, and kindly contributed to R-bloggers].

Quite some time ago, I wrote the function below to read a number of *.gef files into R. “gef” stands for Geotechnical Exchange Format; details on the format, as well as several software tools, can be found at http://www.geffiles.nl/. I had a long list of very specific *.gef files, so the function is not very generic, but it might give you a good starting point if you need to do something similar. I might update it in the future, though.


read.gef <- function(filename)
{
    gef.lines <- scan(filename, what=character(), sep='\n')
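    # header lines start with '#'; everything else is treated as a data row
    is.comment <- substr(gef.lines, 1,1)=='#'
    gef.lines.comments <- gef.lines[is.comment]
    # logical indexing: x[-which(...)] would drop every line if no header line were found
    gef.lines.data <- gef.lines[!is.comment]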
    gef <- NULL
 
    nc <- length(grep('COLUMNINFO=',gef.lines.comments))
    nr <- length(gef.lines.data)
    gef$data <- matrix(ncol=nc, nrow=nr)    
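    for(i in seq_len(nr))   # seq_len() skips the loop when there are no data rows
    {
        # split each row once on spaces and drop the empty fields left by repeated spaces
        fields <- remove.empty.strings(strsplit(gef.lines.data[i],' ')[[1]])
        gef$data[i,] <- as.numeric(fields[seq_len(nc)])
    }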
    gef$data <- as.data.frame(gef$data)
    column.lines <- gef.lines.comments[grep('COLUMNINFO=',gef.lines.comments)]
    # the column name is the 4th space-separated token of each COLUMNINFO line, minus its trailing comma
    for(i in seq_len(nc)) names(gef$data)[i] <- strsplit(strsplit(column.lines[i], ' ')[[1]][4], ',')[[1]][1]
    gef$x <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('XYID=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$y <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('XYID=',gef.lines.comments)], ' ')[[1]])[4], ',')[[1]])
    gef$surface <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('ZID=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    if(length(grep('PARENT=',gef.lines.comments))==1) # file is child
    {
        gef$depth <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('PARENT=',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
        gef$z <- gef$surface-gef$depth
    }
#    else # file is parent
#    {
#    
#    }   
    #cat(paste(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 1',gef.lines.comments)], ' ')[[1]]), ','),'\n'))
    # the trailing comma in each pattern keeps e.g. 'MEASUREMENTVAR= 1' from also matching 'MEASUREMENTVAR= 10'
    gef$cone <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 1,',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 2,',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$a.cone <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 3,',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    gef$a.sleeve <- as.numeric(strsplit(remove.empty.strings(strsplit(gef.lines.comments[grep('MEASUREMENTVAR= 4,',gef.lines.comments)], ' ')[[1]])[3], ',')[[1]])
    return(gef)
}

The read.gef function above relies on the following helper, which has to be defined before read.gef is called:

remove.empty.strings <- function(stringArray)
{
    # keep only the non-empty elements; vectorised, so zero-length input is handled as well
    stringArray[stringArray != '']
}
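
For anyone who wants something a bit more generic, here is a minimal sketch of the same idea using vectorised base R calls. It is not the function I used on my own files. The sample file content is invented for illustration and only mimics the header keywords that read.gef looks for, and gef.header is a hypothetical helper name. The sketch assumes that header values are comma-separated and that data rows are whitespace-separated, which real files may not always respect.

```r
# Synthetic GEF-like content, invented for illustration only
sample.lines <- c(
    '#COLUMN= 2',
    '#COLUMNINFO= 1, m, length, 1',
    '#COLUMNINFO= 2, MPa, resistance, 2',
    '#XYID= 31000, 1000.5, 2000.25',
    '#ZID= 31000, 12.3',
    '#EOH=',
    '0.10  1.5',
    '0.20  1.7'
)
gef.file <- tempfile(fileext='.gef')
writeLines(sample.lines, gef.file)

# Hypothetical helper: for every header line with the given keyword,
# return its comma-separated values as a trimmed character vector
gef.header <- function(lines, keyword)
{
    hits <- lines[startsWith(lines, paste0('#', keyword, '='))]
    values <- sub('^#[^=]*=', '', hits)
    lapply(strsplit(values, ','), trimws)
}

gef.lines <- readLines(gef.file)
is.header <- startsWith(gef.lines, '#')

# the third value of each COLUMNINFO line is taken as the column name
column.info <- gef.header(gef.lines, 'COLUMNINFO')
column.names <- vapply(column.info, function(v) v[3], character(1))

# read.table splits on any run of whitespace, so no empty strings need removing
gef.data <- read.table(text=gef.lines[!is.header], col.names=column.names)

# same positions as in read.gef: the first value is an id, coordinates follow
xyid <- gef.header(gef.lines, 'XYID')[[1]]
gef.x <- as.numeric(xyid[2])
gef.y <- as.numeric(xyid[3])
gef.surface <- as.numeric(gef.header(gef.lines, 'ZID')[[1]][2])
```

Matching on the full '#KEYWORD=' prefix avoids one keyword accidentally matching another (COLUMN versus COLUMNINFO, for instance), and letting read.table handle the data rows removes the need for the nested loop.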
