**The Lab-R-torian**, and kindly contributed to R-bloggers)

## Background

At the AACC meeting last week, some of my friends were bugging me that I had not made a blog post in 10 months. Without getting into it too much, let's just say I can blame Cerner. Thanks also to a prod from a friend, here is an approach to a fairly common problem.

We all report calculated quantities out of our laboratories–quantities such as LDL cholesterol, non-HDL cholesterol, aldosterone:renin ratio, free testosterone, eGFR etc. How does one determine the precision (i.e. imprecision) of a calculated quantity. While earlier in my life, I might go to the trouble of trying to do such calculations analytically using the rules of error propagation, in my later years, I am more pragmatic and I'm happy to use a computational approach.

In this example, we will model the precision in calculated bioavailable testosterone (CBAT). Without explanation, I provide an R function for CBAT (and free testosterone) where testosterone is reported in nmol/L, sex hormone binding globulin (SHBG) is reported in nmol/L, and albumin is reported in g/L. Using the Vermeulen Equation as discussed in this publication, you can calculate CBAT as follows:

cbat <- function(TT,SHBG,ALB = 43){ Kalb <- 3.6*10^4 Kshbg <- 10^9 N <- 1 + Kalb*ALB/69000 a <- N*Kshbg b <- N + Kshbg*(SHBG - TT)/10^9 c <- -TT/10^9 FT <- (-b + sqrt(b^2 - 4*a*c))/(2*a)*10^9 cbat <- N*FT return(list(free.T = FT, cbat = cbat)) }

To sanity-check this, we can use this online calculator. Taking a typical male testosterone of 20 nmol/L, an SHBG of 50 nmol/L and an albumin of 43 g/L, we get the following:

cbat(20,50)

## $free.T ## [1] 0.3273049 ## ## $cbat ## [1] 7.670319

which is confirmed by the online calculator. Because the function is vectorized, we an submit a vector of testosterone results and SHBG results and get a vector of CBAT results.

cbat(c(10,20,30), c(40,50,60))

## $free.T ## [1] 0.1738837 0.3273049 0.4661380 ## ## $cbat ## [1] 4.074926 7.670319 10.923842

## Precision of Components

We now need some precision data for the three components. However, in our lab, we just substitute 43 g/L for the albumin, so we will leave that term out of the analysis and limit our precision calculation to testosterone and SHBG. This will allow us to present the precision as surface plots as a function of total testosterone and SHBG.

We do testosterone by LC-MS/MS using Deborah French's method. In the last three months, the precision has been 3.9% at 0.78 nmol/L, 5.5% at 6.7 nmol/L, 5.2% at 18.0 nmol/L, and 6.0% at 28.2 nmol/L. We are using the Roche Cobas e601 SHBG method which, according to the package insert, has precision of 1.8% at 14.9 nmol/L, 2.1 % at 45.7 nmol/L, and 4.0% at 219 nmol/L.

cv.tt <- c(3.9, 5.5, 5.2, 6.0) conc.tt <- c(0.78, 6.7, 18.0, 28.2) tt.df <- data.frame(conc.tt,cv.tt) plot(cv.tt ~ conc.tt, data = tt.df, main = "Precision Profile of Testosterone", xlab = "Testosterone (nmol/L)", ylab = "CV Testosterone (%)", ylim = c(0,8), type = "o")

cv.shbg <- c(1.8, 2.1, 4.0) conc.shbg <- c(14.9,45.7,219) shbg.df <- data.frame(cv.shbg, conc.shbg) plot(cv.shbg ~ conc.shbg, data = shbg.df, main = "Precision Profile of SHBG", xlab = "SHBG (nmol/L)", ylab = "CV SHGB (%)", ylim = c(0,5), type = "o")

## Build Approximation Functions

We will want to generate linear interpolations of these precision profiles. Generally, we might watnt to use non-linear regression to do this but I will just linearly interpolate with the `approxfun()`

function. This will allow us to just call a function to get the approximate CV at concentrations other than those for which we have data.

tt.fun <- approxfun(x = tt.df$conc.tt, y = tt.df$cv.tt) shbg.fun <- approxfun(x = shbg.df$conc.shbg, y = shbg.df$cv.shbg)

Now, if we want to know the precision of SHBG at, say, 100 nmol/L, we can just write,

shbg.fun(100)

## [1] 2.695326

to obtain our precision result.

## Random Simulation

Now let's build a grid of SHBG and total testosterone (TT) values at which we will calculate the precision for CBAT.

shbg <- seq(from = 15, to = 200, by = 5) tt <- seq(from = 1, to = 28, by = 1)

At each point on the grid, we will have to generate, say, 100000 random TT values and 100000 random SHBG values with the appropriate precision and then calculate the expected precision of CBAT at those concentrations.

Let's do this for a single pair of concentrations by way of example modelling the random analytical error as Gaussian using the `rnorm()`

function.

# [SHBG] = 15 nmol/L # [TT] = 5.0 nmol/L set.seed(100) #just to get consistent results rng.tt <- rnorm(100000, mean = 5.0, sd = tt.fun(5.0)/100*5.0) rng.shbg <- rnorm(100000, mean = 15, sd = shbg.fun(15)/100*15) rng.cbat <- cbat(rng.tt, rng.shbg) cv.cbat <- sd(rng.cbat$cbat)/mean(rng.cbat$cbat)*100 cv.cbat

## [1] 5.30598

So, we can build the process of calculating the CV of CBAT into a function as follows:

cbat.cv <- function(TT, SHBG, N = 100000){ rng.tt <- rnorm(N, mean = TT, sd = tt.fun(TT)/100*TT) rng.shbg <- rnorm(N, mean = SHBG, sd = shbg.fun(SHBG)/100*SHBG) rng.cbat <- cbat(rng.tt, rng.shbg) cv <- sd(rng.cbat$cbat)/mean(rng.cbat$cbat)*100 return(cv) }

Now, we can make a matrix of the data for presenting a plot, calculating the CV and appending it to the dataframe.

cv.grid <- expand.grid(tt, shbg) names(cv.grid) <- c("tt", "shbg") cv.grid$cv.cbat <- mapply(cbat.cv, cv.grid$tt, cv.grid$shbg)

Now make plot using the `wireframe()`

function.

library(lattice) wireframe(cv.cbat ~ tt*shbg, data = cv.grid, xlab = "Testo \n (nmol/L)", ylab = "SHBG \n (nmol/L)", zlab = "CV \n (%)", drape = TRUE, colorkey = TRUE, col.regions = colorRampPalette(c("blue", "red", "yellow"))(100), scales = list(arrows=FALSE,cex=.5,tick.number = 10) )

This shows us that the CV of CBAT ranges from about 4–8% over the TT and SHBG ranges we have looked at.

## Conclusion

We have determined the CV of calculated bioavailable testosterone using random number simulations using empirical CV data and produced a surface plot of CV. This allows us to comment on the CV of this lab reportable as a function of the two variables by which it is determined.

**Parting Thought on Monte Carlo Simulations**

*The die is cast into the lap, but its every decision is from the LORD.*

(Prov 16:33)

**leave a comment**for the author, please follow the link and comment on their blog:

**The Lab-R-torian**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...