Estimation of hydraulic conductivity and its uncertainty from grain-size data using GLUE and artificial neural networks.

June 13, 2012
By

(This article was first published on Bart Rogiers - Sreigor, and kindly contributed to R-bloggers)

Abstract
Various approaches exist to relate saturated hydraulic conductivity (Ks) to grain-size data. Most methods use a single grain-size parameter and hence omit the information encompassed by the entire grain-size distribution. This study compares two data-driven modelling methods—multiple linear regression and artificial neural networks—that use the entire grain-size distribution data as input for Kprediction. Besides the predictive capacity of the methods, the uncertainty associated with the model predictions is also evaluated, since such information is important for stochastic groundwater flow and contaminant transport modelling.
Artificial neural networks (ANNs) are combined with a generalised likelihood uncertainty estimation (GLUE) approach to predict Ks from grain-size data. The resulting GLUE-ANN hydraulic conductivity predictions and associated uncertainty estimates are compared with those obtained from the multiple linear regression models by a leave-one-out cross-validation. The GLUE-ANN ensemble prediction proved to be slightly better than multiple linear regression. The prediction uncertainty, however, was reduced by half an order of magnitude on average, and decreased at most by an order of magnitude. This demonstrates that the proposed method outperforms classical data-driven modelling techniques. Moreover, a comparison with methods from the literature demonstrates the importance of site-specific calibration. The data set used for this purpose originates mainly from unconsolidated sandy sediments of the Neogene aquifer, northern Belgium. The proposed predictive models are developed for 173 grain-size Ks-pairs. Finally, an application with the optimised models is presented for a borehole lacking Kdata.

Keywords
Early stopping - Cross-validation - Generalised likelihood uncertainty estimation - Artificial neural networks - Sedimentary aquifer - Data-driven modelling - Likelihood measures - Principal component analysis - GLUE-ANN

Reference
Rogiers, B., Mallants, D., Batelaan, O., Gedeon, M., Huysmans, M., Dassargues, A. 2012. Estimation of hydraulic conductivity and its uncertainty from grain-size data using GLUE and artificial neural networks. Mathematical Geosciences 44: 739–763. Paper - Author's version - R script - Blog post



An example script for performing such analysis in the R language is provided with the paper. The most up to date version can be found below. Two simple examples are provided:

The first one models the [0,100] range of x² in function of x. It plots two figures: one with the results for the training data, and one that predicts new data on the [-100,200] range. The data outside of the training data range can of course not be predicted, as is clear from the second figure.
 The second example shows a cross-validation with the air quality dataset:

R script with examples of GLUE-ANN analyses:

################################################################################
### GLUE-ANN parameters + examples (Rogiers et al. 2011) ###
################################################################################
# A dataframe has to be provided with the independent variable(s), and a vector
# with the dependent variable. A list of parameters has to be set to define the
# a priori distributions of the stochastic parameters, and fixed values for the
# ones considered to be constant.
# Be sure to load the additional GLUE-ANN functions into R before trying
# the examples provided below. The AMORE package should be installed and loaded:
install.packages(c('AMORE', 'Hmisc'))
library(AMORE)
library(Hmisc)
 
### Parameter list #############################################################
# A list of all required parameters with their default values.
 
### Pre-processing parameters ################################################
# Orthogonalise input data; options are: 'Cov', 'Cor', and 'No'
orthogonal <- 'Cor'
# Number of input variables, provide vector for stochastic treatment
variableNumber <- c(1:ncol(inputData))
# T: Select always most variance explaining PCs; F: Choose PCs randomly
dimRed <- T
# Rescale output to 0-1 range
rescale.output <- T
 
### ANN parameters ###########################################################
# Enable early stopping
ES <- T
# Part of the data used for early stopping
randomESPart <- 0.5
# Early stopping samples should have a similar mean as the training samples
nonEqualMeans <- T
# Maximum difference between early stopping and training observation means
maxMeanDiff <- mean(outputData)*0.2
# maximum number of training cycles, provide a vector for stochastic treatment
nCycles <- 100
# Amount of hidden layers, provide vector for stochastic treatment
HLrange <- c(1:1)
# Amount of hidden nodes, provide vector for stochastic treatment
HNrange <- c(1:10)
# Maximum amount of hidden nodes per input variable
maxHNPerVar <- Inf
# Hidden layer activation function; options are 'tansig', 'sigmoid', 'purelin'
hidden.layer <- 'tansig'
# Output layer activation function; options are 'tansig', 'sigmoid', 'purelin'
output.layer <- 'sigmoid'
 
### GLUE parameters ##########################################################
# Number of models in the ensemble
nSets <- 100
# Number of random variates for weights based on Stedinger et al (2008)
nStedinger <- 50
# GLUE weighting procedure; options are 'MSE', 'MSEes' (for weights based on
# the early stopping subset performance), 'NSEff' and 'Stedinger'
weighting <- 'Stedinger'
 
### Cross-validation parameters ##############################################
# Use leave-one-out cross-validation
cv <- T
 
### Examples ###################################################################
 
### Example of training and prediction for fictional data ####################
# Create fictional data for x range [1,100]
inputData <- data.frame(x=seq(1,100,0.5))
# Dependent variable y = x^2
outputData <- inputData$x ^ 2
# add random noise to y, representing different sources of error
outputData <- jitter(outputData, amount=mean(outputData)*0.2)
# Run GLUE-ANN functions
glue.ann.ensemble <- train.glue.ann(inputData, outputData, nCycles=500)
training.results <- predict.glue.ann(inputData, glue.ann.ensemble)
# Visualise results for training data
plot(inputData$x, training.results$eMean, type='l', xlab='x', ylab=
expression(y = x^2 + epsilon), ylim=c(-2000,13000))
lines(inputData$x, glue.ann.ensemble$outputData, lty=2, col='red')
lines(inputData$x, training.results$eQ025, lty=3, col='blue')
lines(inputData$x, training.results$eQ975, lty=3, col='blue')
legend('topleft', c('GLUE-ANN ensemble mean prediction','Observations',
'95% prediction uncertainty'), lty=c(1,2,3),col=c('black','red','blue'),
bty='n')
# Predict new data
newData <- data.frame(x=c(-100:200)+0.5)
newData.predictions <- predict.glue.ann(newData, glue.ann.ensemble)
# Visualise results of prediction
plot(newData$x, newData.predictions$eMean, type='l', xlab='x', ylab=
expression(y = x^2 + epsilon), ylim=c(-2000,13000))
lines(newData$x, jitter(newData$x^2, amount=mean(outputData)*0.2) , lty=2,
col='red')
lines(newData$x, newData.predictions$eQ025, lty=3, col='blue')
lines(newData$x, newData.predictions$eQ975, lty=3, col='blue')
abline(v=c(0,100))
text(50,13000,'Training data range')
legend('bottomright', c('GLUE-ANN ensemble mean prediction','Observations',
'95% prediction uncertainty'), lty=c(1,2,3),col=c('black','red','blue'),
bty='n')
 
### Example of cross-validation with an R dataset ############################
# Read input and output data (ozone concentration in function of wind,
# temperature and solar radiation)
outputData <- na.omit(airquality)[,1]
inputData <- na.omit(airquality)[,2:4]
# Perform cross-validation
cross_validation <- train.glue.ann(inputData, outputData, cv=T, nCycles=1000
, nSets=10)
# Visualisation
plot(cross_validation$target, ylim=c(-100,200), xlab='Sample number', ylab=
'Ozone')
points(cross_validation$eMean, pch=2, col='red')
lines(cross_validation$eQ025, lty=3, col='blue')
lines(cross_validation$eQ975, lty=3, col='blue')
legend('bottomright', c('Observations','GLUE-ANN ensemble mean prediction',
'95% prediction uncertainty'), pch=c(1,2,NA), lty=c(NA,NA,3),col=c('black',
'red','blue'),bty='n')

R functions for GLUE-ANN analysis with the AMORE package (Castejón Limas et al 2010):
################################################################################
### GLUE-ANN - functions (Rogiers et al. 2012) ###
################################################################################
 
### ANN training function ######################################################
nncalc <- function(Ptrain, Pval, targetTrain, targetVal, architecture, n, Stao=1
, es=T, hidden.layer="sigmoid", output.layer="purelin")
{
neuralnetwork <- newff(n.neurons=architecture, learning.rate.global=1e-2,
momentum.global=0.5, error.criterium = 'LMS', Stao = Stao, hidden.layer=
hidden.layer, output.layer=output.layer, method="ADAPTgdwm")
if(!es) { Pval <- NULL; targetVal <- NULL }
capture.output(results <- train(neuralnetwork, Ptrain, Stao=Stao, Pval=Pval,
Tval=targetVal,targetTrain, error.criterium = 'LMS', report=T, show.step=1,
n.shows=n ))
return(results)
}
 
### Sampling parameter vectors function ########################################
resample <- function(x, n, ...) x[sample.int(length(x), size=n, ...)]
 
### Model ensemble training function ###########################################
train.glue.ann <- function(inputData,outputData,orthogonal='Cor',variableNumber=
c(1:ncol(inputData)),dimRed=T,rescale.output=T,ES=T,randomESPart=0.5,
nonEqualMeans=T,maxMeanDiff=mean(outputData)*0.2,nCycles=100,HLrange=c(1:1),
HNrange=c(1:10),maxHNPerVar=Inf,hidden.layer='tansig',output.layer='sigmoid',
nSets=100,nStedinger=50,weighting='Stedinger',cv=F)
{
### Initialisation ###########################################################
parameters <- NULL
parameters$orthogonal <- orthogonal
parameters$variableNumber <- variableNumber
parameters$dimRed <- dimRed
parameters$rescale.output <- rescale.output
parameters$ES <- ES
parameters$randomESPart <- randomESPart
parameters$nonEqualMeans <- nonEqualMeans
parameters$maxMeanDiff <- maxMeanDiff
parameters$nCycles <- nCycles
parameters$HLrange <- HLrange
parameters$HNrange <- HNrange
parameters$maxHNPerVar <- maxHNPerVar
parameters$hidden.layer <- hidden.layer
parameters$output.layer <- output.layer
parameters$nSets <- nSets
parameters$nStedinger <- nStedinger
parameters$weighting <- weighting
parameters$cv <- cv
glue.ann <- NULL
cvPredictions <- NULL
glue.ann$parameters <- parameters
glue.ann$esNumber <- matrix(nrow=nrow(inputData), ncol=nSets)
 
### Training loop ############################################################
for(crossValidationSampleNumber in 1:ifelse(cv,nrow(inputData),1))
{
if(cv)
{
inputDataModellingCV <- as.data.frame(
inputData[crossValidationSampleNumber,])
inputDataModelling <- as.data.frame(
inputData[-crossValidationSampleNumber,])
outputDataModellingCV <- outputData[crossValidationSampleNumber]
outputDataModelling <- outputData[-crossValidationSampleNumber]
} else {
inputDataModelling <- inputData
outputDataModelling <- outputData
}
names(inputDataModelling) <- c(1:ncol(inputDataModelling))
if(cv) names(inputDataModellingCV) <- c(1:ncol(inputDataModellingCV))
glue.ann$inputData <- inputDataModelling
glue.ann$outputData <- outputDataModelling
P <- cbind(inputDataModelling)
if(orthogonal != 'No'){Ppca <- princomp(P, cor=ifelse(orthogonal=='Cor', T,
F));P <- Ppca$scores}
trans <- P
glue.ann$P <- P
for(i in 1:ncol(inputData)) trans[,i] <- (trans[,i]-mean(P[,i]))/sd(P[,i])
if(rescale.output) outputDataModelling <- (outputDataModelling - (
min(glue.ann$outputData)))/(((max(glue.ann$outputData))-(
min(glue.ann$outputData))))
if(rescale.output & cv) outputDataModellingCV <- (outputDataModellingCV -(
min(glue.ann$outputData)))/(((max(glue.ann$outputData))-(
min(glue.ann$outputData))))
variableslist <- list();networks <- list(); esNumber <- NULL
earlyStoppingMSE <- NULL; earlyStoppingCycle <- NULL; hn <- list()
totalVar <- NULL; trainingMSE <- NULL
for(j in 1:nSets)
{
if(cv) cat(paste('# Cross-validating sample',crossValidationSampleNumber,
' '))
esDataNumbers <- sample(1:nrow(inputDataModelling),
round(nrow(inputDataModelling)*randomESPart,0))
while(nonEqualMeans)
{
esDataNumbers <- sample(1:nrow(inputDataModelling),
round(nrow(inputDataModelling)*randomESPart,0))
esMean <- mean(outputDataModelling[esDataNumbers])
trainMean <- mean(outputDataModelling[-esDataNumbers])
if(abs(esMean-trainMean) < maxMeanDiff) nonEqualMeans <- F
}
trainingDataNumbers <- c(1:nrow(inputDataModelling))[-esDataNumbers]
if(!ES) {trainingDataNumbers <- c(1:nrow(inputDataModelling))
esDataNumbers <- trainingDataNumbers}
nvars <- resample(variableNumber,1)
variables <- resample(c(1:ncol(inputDataModelling)),nvars)
if(dimRed) variables <- c(1:length(variables))
variableslist[[j]] <- variables
hiddenLayers <- resample(HLrange,1, replace=T)
hiddenNodes <- resample(HNrange,hiddenLayers, replace=T)
for(i in 1:hiddenLayers) if(maxHNPerVar < hiddenNodes[i]/nvars) {
hiddenNodes[i] <- resample((min(HNrange):maxHNPerVar),1, replace=T)}
ncyc <- resample(nCycles,1)
cat('# Training ANN nr. ',j,', with architecture: ',sep='')
cat(c(nvars,hiddenNodes[1:hiddenLayers],1), sep='-')
Ptrain <- as.data.frame(trans[trainingDataNumbers,variables] )
Pval <- as.data.frame(trans[esDataNumbers,variables])
targetTrain <- outputDataModelling[trainingDataNumbers]
targetVal <- outputDataModelling[esDataNumbers]
networks[[j]] <- nncalc(Ptrain, Pval, targetTrain, targetVal,
architecture=c(nvars,hiddenNodes,1), ncyc, Stao=10, es=ES, hidden.layer=
hidden.layer, output.layer=output.layer)
if(ES) esNumber[j] <- which.min(networks[[j]]$Merror[,2])
predTrain <- sim(networks[[j]]$net, Ptrain)
trainingMSE[j] <- mean((targetTrain-predTrain)^2)
if(ES) predEarlyStopping <- sim(networks[[j]]$net, Pval)
if(ES) earlyStoppingMSE[j] <- mean((targetVal-predEarlyStopping)^2)
if(ES) totalVar[j] <- var(c(predTrain-targetTrain, predEarlyStopping-
targetVal))
if(!ES) totalVar[j] <- var(c(predTrain-targetTrain))
if(ES) earlyStoppingCycle[j] <- which.min(networks[[j]]$Merror[,2])
hn[[j]] <- hiddenNodes
if(ES) cat(paste(' # Training MSE: ',round(trainingMSE[j],2),
', early stopping MSE: ',round(earlyStoppingMSE[j],2),', cycles: ',
earlyStoppingCycle[j],'\n', sep=''))
if(!ES) cat(paste(' # Training MSE: ',round(trainingMSE[j],2),
', cycles: ',earlyStoppingCycle[j],'\n', sep=''))
}
glue.ann$networks <- networks
glue.ann$variables <- variableslist
glue.ann$trainMSE <- trainingMSE
if(ES) glue.ann$esMSE <- earlyStoppingMSE
if(ES) glue.ann$totalMSE <- (1-randomESPart) * glue.ann$trainMSE +
randomESPart * glue.ann$esMSE
if(!ES) glue.ann$totalMSE <- glue.ann$trainMSE
glue.ann$totalVar <- totalVar
glue.ann$hn <- hn
if(orthogonal != 'No') glue.ann$pca <- Ppca
 
### Predict cross-validated sample #########################################
if(cv)
{
cvPrediction <- predict.glue.ann(inputDataModellingCV, glue.ann)
cvPredictions$MSE[crossValidationSampleNumber] <- mean(
(outputData[crossValidationSampleNumber] - cvPrediction$eMean)^2)
cvPredictions$eMean[crossValidationSampleNumber] <- cvPrediction$eMean
cvPredictions$eMedian[crossValidationSampleNumber] <- cvPrediction$eMedian
cvPredictions$eMin[crossValidationSampleNumber] <- cvPrediction$eMin
cvPredictions$eMax[crossValidationSampleNumber] <- cvPrediction$eMax
cvPredictions$eQ025[crossValidationSampleNumber] <- cvPrediction$eQ025
cvPredictions$eQ975[crossValidationSampleNumber] <- cvPrediction$eQ975
cvPredictions$MLE[crossValidationSampleNumber] <- cvPrediction$MLE
cvPredictions$target[crossValidationSampleNumber] <-
outputData[crossValidationSampleNumber]
}
}
if(cv){return(cvPredictions)} else {return(glue.ann)}
}
 
### Model ensemble prediction function #########################################
predict.glue.ann <- function(inputData, glue.ann,
weighting=glue.ann$parameters$weighting)
{
names(inputData) <- c(1:ncol(inputData))
if(glue.ann$parameters$orthogonal != 'No') inputData <-
predict(glue.ann$pca, newdata=inputData) # orthogonalise
for(i in 1:ncol(inputData)) inputData[,i] <-
(inputData[,i]-mean(glue.ann$P[,i]))/sd(glue.ann$P[,i]) # standardise
 
### Predict model ensemble results for new data ##############################
resultsMatrix <- matrix(nrow=nrow(inputData), ncol=glue.ann$parameters$nSets)
if(nrow(inputData) > 1) {for(i in 1:glue.ann$parameters$nSets) {
resultsMatrix[,i] <- sim(glue.ann$networks[[i]]$net,
inputData[,glue.ann$variables[[i]]])}}
if(nrow(inputData) == 1) {for(i in 1:glue.ann$parameters$nSets) {
resultsMatrix[,i] <- sim(glue.ann$networks[[i]]$net, t(
inputData[,glue.ann$variables[[i]]]))}}
 
### Construct GLUE weights ###################################################
if(glue.ann$parameters$weighting=='MSEes'){weightsGLUE <- (max(glue.ann$esMSE)
- glue.ann$esMSE)
weightsGLUE[which(weightsGLUE < 0)] <- 0
weightsGLUE <- weightsGLUE/sum(weightsGLUE)}
if(glue.ann$parameters$weighting=='MSE'){weightsGLUE <- (
max(glue.ann$totalMSE) - glue.ann$totalMSE)^2
weightsGLUE[which(weightsGLUE < 0)] <- 0
weightsGLUE <- weightsGLUE/sum(weightsGLUE)}
if(glue.ann$parameters$weighting=='Stedinger')
{
se2 <- min(glue.ann$totalVar)
n <- length(glue.ann$outputData)
weightsGLUE <- exp((-n/2)* (glue.ann$totalMSE)/(se2) )
}
if(glue.ann$parameters$weighting=='NSEff')
{
weightsGLUE <- (1-(glue.ann$totalMSE/var(glue.ann$outputData)))
weightsGLUE[which(weightsGLUE < 0)] <- 0
}
 
### Weight model predictions #################################################
predictions <- NULL
predictions$MLE <- resultsMatrix[1,which.min(glue.ann$totalVar)]
if(glue.ann$parameters$weighting != 'Stedinger')
{
predictions$eMean <- apply(resultsMatrix, 1, "wtd.mean", weights =
weightsGLUE, normwt=T)
predictions$eMedian <- apply(resultsMatrix, 1, "wtd.quantile", weights =
weightsGLUE, probs = c(0.5), normwt=T)
predictions$eQ025 <- apply(resultsMatrix, 1, "wtd.quantile", weights =
weightsGLUE, probs = c(0.025), normwt=T)
predictions$eQ975 <- apply(resultsMatrix, 1, "wtd.quantile", weights =
weightsGLUE, probs = c(0.975), normwt=T)
predictions$eMin <- apply(resultsMatrix, 1, "min")
predictions$eMax <- apply(resultsMatrix, 1, "max")
predictions$eResults <- resultsMatrix
}
if(glue.ann$parameters$weighting=='Stedinger')
{
resultsMatrix_sted <- matrix(nrow=nrow(inputData), ncol=
glue.ann$parameters$nSets*glue.ann$parameters$nStedinger)
for(i in 1:nrow(inputData))
{
for(k in 1:glue.ann$parameters$nSets)
{
errorVector <- rnorm(glue.ann$parameters$nStedinger,mean=0,sd=sqrt(se2))
resultsMatrix_sted[i,(((k-1)*glue.ann$parameters$nStedinger)+c(
1:glue.ann$parameters$nStedinger))] <- errorVector + resultsMatrix[i,k]
}
}
weightsGLUE <- rep(weightsGLUE, each=glue.ann$parameters$nStedinger)
predictions$eMean <- apply(resultsMatrix_sted, 1, "wtd.mean", weights =
weightsGLUE, normwt=T)
for(i in 1:nrow(inputData))
{
quantiles <- wtd.quantile(resultsMatrix_sted[i,], weights = weightsGLUE,
probs = c(0.025, 0.5, 0.975), normwt=T)
predictions$eMedian[i] <- quantiles[2]
predictions$eQ025[i] <- quantiles[1]
predictions$eQ975[i] <- quantiles[3]
}
predictions$eResults <- resultsMatrix_sted
predictions$eMin <- apply(resultsMatrix_sted, 1, "min")
predictions$eMax <- apply(resultsMatrix_sted, 1, "max")
}
 
### Backtransform of predictions according to training data output ###########
if(glue.ann$parameters$rescale.output)
{
outputRange <- max(glue.ann$outputData)-min(glue.ann$outputData)
outputMin <- min(glue.ann$outputData)
predictions$MLE <- predictions$MLE * outputRange + outputMin
predictions$eMean <- predictions$eMean * outputRange + outputMin
predictions$eMedian <- predictions$eMedian * outputRange + outputMin
predictions$eQ025 <- predictions$eQ025 * outputRange + outputMin
predictions$eQ975 <- predictions$eQ975 * outputRange + outputMin
predictions$eMin <- predictions$eMin * outputRange + outputMin
predictions$eMax <- predictions$eMax * outputRange + outputMin
predictions$eResults <- predictions$eResults * outputRange + outputMin
}
return(predictions)
}

To leave a comment for the author, please follow the link and comment on his blog: Bart Rogiers - Sreigor.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.