[This article was first published on Back Side Smack » R Stuff, and kindly contributed to R-bloggers].

While responding to this thread on Reddit I made a rough guess as to the heat retention of my french press when completely full of coffee. When I went to bed I realized there was no good reason why I couldn’t actually test my guess and perform a little data analysis of my own. So I took my Frieling stainless steel french press (no promotional consideration offered!), filled it with boiling water, and recorded the temperature with a digital thermometer every five minutes. In contrast to my very rough guess, I found that after an hour and a half the water temperature inside was still 151 F (66.1 C).

Knowing the time and temperature (though both were probably recorded with error, more on that in a bit), I can estimate temperature as a function of time. Leaving aside the problem that temperature is not only a function of time, the linear model is pretty decent! A quadratic specification (just adding I(Time^2)) is even closer. With an ambient temperature of 67 F, the linear model predicts the press would “reach” ambient at about 4 hours and the quadratic model at 7.8 hours!

Now, neither of these predictions comes from a theoretical model of heat transfer, and neither would converge on the ambient temperature. The linear model would predict a continuous temperature drop and the quadratic model would reach the temperature of the sun in a few weeks. However, within the relevant range both are decent approximations.

One of the default regression diagnostic plots provided in R can shed light on both the potential mis-specification of the linear model and two data points I may have recorded incorrectly. For each observation I tried to get as close to five minutes as possible and to place the thermometer in approximately the same place, but the 1st and 6th observations seem a bit out of whack.
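
For readers who want to try the linear and quadratic fits themselves, here is a minimal sketch. It does not use my recorded readings: the data frame is synthetic, generated from a made-up cooling curve plus noise, and the column name `Temp` is an assumption (only `Time`, `french.lin` and `french.quad` match names used in this post). The ambient-crossing times it prints will therefore differ from the figures quoted above.

```r
# Illustrative only: synthetic readings, NOT the measurements from the post.
# Assumed columns: Time (minutes) and Temp (degrees F).
set.seed(1)
ambient <- 67
french <- data.frame(Time = seq(0, 90, by = 5))
french$Temp <- ambient + (205 - ambient) * exp(-0.0055 * french$Time) +
  rnorm(nrow(french), sd = 0.5)

# Linear and quadratic specifications
french.lin  <- lm(Temp ~ Time, data = french)
french.quad <- lm(Temp ~ Time + I(Time^2), data = french)

# Time (minutes) at which the linear fit crosses ambient
b <- coef(french.lin)
t.lin <- unname((ambient - b["(Intercept)"]) / b["Time"])

# For the quadratic fit, solve a + b*t + c*t^2 = ambient and keep the
# smallest positive real root; the fitted curve may never reach ambient
q <- unname(coef(french.quad))
roots <- polyroot(c(q[1] - ambient, q[2], q[3]))
real.roots <- Re(roots)[abs(Im(roots)) < 1e-8 & Re(roots) > 0]
t.quad <- if (length(real.roots) > 0) min(real.roots) else NA_real_

c(linear = t.lin, quadratic = t.quad) / 60  # hours; NA means "never"

# Residuals vs. fitted for the linear model: curvature hints at
# mis-specification, isolated large residuals flag suspect readings
plot(french.lin, which = 1)
```

With data that really follow an exponential decay, the quadratic fit bends upward and may bottom out above ambient, in which case the sketch reports `NA` for it rather than a crossing time.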

We don’t have to stop here. nls() offers us the ability to fit a more realistic, though not necessarily more accurate, model. If we know the weight of water in the press, the surface area of the press and the ambient temperature, we can use a very simplified form of Fourier’s law to both estimate the heat transfer coefficient and fit a predicted temperature given our known starting temperature. At first it seems like we might have to muck about with computing derivatives manually, but nls() can give us both the estimate and the fitted values. How?

• $T(t) = T_{\mathrm{env}} + (T(0) - T_{\mathrm{env}}) \ e^{-r t}$ is our solution to the differential equation, where $r = h \cdot (\mathrm{Area} / \mathrm{Heat\ Capacity})$ and $h$ is the heat transfer coefficient
• We fit an exponential function with nls() and regress our estimate against $T(t)_{observed}$ to find $\hat{h}$
• Then we can use $\hat{h}$ to find predicted values for temperature and compare them to observed values
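
The differential equation behind the first bullet is the usual lumped cooling form, $dT/dt = -r \, (T - T_{\mathrm{env}})$; separating variables and integrating from $0$ to $t$ gives the exponential solution shown there. The steps above can be sketched roughly as follows. This is not the code I ran: the readings are synthetic, and the area and mass used to back out $\hat{h}$ are hypothetical placeholders, so the resulting number is not comparable to the estimates reported in this post. Only `french.nls` and `Time` match names used here; `Temp`, `T.env`, `T0`, `area` and `heat.cap` are assumed names.

```r
# Illustrative only: synthetic readings, NOT the measurements from the post.
set.seed(1)
T.env <- 67                                   # ambient temperature, deg F
french <- data.frame(Time = seq(0, 90, by = 5))   # minutes
french$Temp <- T.env + (205 - T.env) * exp(-0.0055 * french$Time) +
  rnorm(nrow(french), sd = 0.5)

T0 <- french$Temp[1]   # treat the first reading as the known T(0)

# Fit T(t) = T.env + (T0 - T.env) * exp(-r * t); r is the only parameter
french.nls <- nls(Temp ~ T.env + (T0 - T.env) * exp(-r * Time),
                  data = french, start = list(r = 0.01))
r.hat <- coef(french.nls)[["r"]]

# Back out h from r = h * Area / Heat Capacity.
# Both values below are hypothetical placeholders.
area     <- 0.05        # heat transfer area in m^2 (assumed)
heat.cap <- 1.0 * 4186  # 1 kg of water times 4186 J/(kg K) (assumed mass)
h.hat <- (r.hat / 60) * heat.cap / area   # r is per minute, so convert to per second

# Predicted temperatures next to the observed ones
comparison <- data.frame(Time      = french$Time,
                         observed  = french$Temp,
                         predicted = predict(french.nls))
head(comparison)
```

Fixing $T(0)$ at the first reading keeps the model to a single parameter, but it also means any error in that first measurement propagates into every predicted value.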

What does this look like? On its face, the fit is poorer than the quadratic model. The heat transfer model I created is exceedingly simple, so the poor fit may result from the assumptions regarding the heat transfer area. Another reason may be that the model depends on the first data point, the one measured with what looks like a great degree of error. Removing the first point and refitting the model returns a different estimate for $\hat{h}$ (0.8322 vs. the original 0.968) and an improved fit. We can show the fit of each model with a call to deviance():

 dev.french <- lapply(list(french.lin, french.quad, french.nls, french.nls2), deviance)
 names(dev.french) <- c("linear", "quadratic", "first exponential", "updated exponential")
 dev.french

 $linear
 [1] 56.81051

 $quadratic
 [1] 8.431236

 $first exponential
 [1] 17.24654

 $updated exponential
 [1] 4.985709

It isn't quite predicting the cosmic background radiation, but I am happy to see that science works! As always, code is below.