(This article was first published on **Back Side Smack » R Stuff**, and kindly contributed to R-bloggers)

While responding to this thread on Reddit, I made a rough guess as to the heat retention of my French press when completely full of coffee. When I went to bed I realized there was no good reason why I couldn’t actually test my guess and perform a little data analysis of my own. So I took my Frieling stainless steel French press (no promotional consideration offered!), filled it with boiling water and recorded the temperature with a digital thermometer every five minutes. In contrast to my very rough guess, I found that after an hour and a half the water temperature inside was still 151 F (66.1 C). Knowing the time and temperature (though both were probably recorded with error, more on that in a bit), I can estimate temperature as a function of time:
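A minimal sketch of this kind of fit is below. The readings are synthetic stand-ins generated from a simple cooling curve that passes through 151 F at 90 minutes; they are not the recorded measurements. The 212 F starting point (boiling at sea level) and the names `french` and `Temp` are assumptions for illustration.

```r
# Synthetic stand-in data, NOT the recorded measurements: a cooling curve
# from 212 F (assumed boiling point) toward a 67 F room, tuned to pass
# through 151 F at 90 minutes, plus a little measurement noise.
set.seed(1)
Time <- seq(0, 90, by = 5)                          # minutes
k.true <- -log((151 - 67) / (212 - 67)) / 90        # decay rate per minute
Temp <- 67 + (212 - 67) * exp(-k.true * Time) + rnorm(length(Time), sd = 0.5)
french <- data.frame(Time = Time, Temp = Temp)

# Temperature as a linear function of time
french.lin <- lm(Temp ~ Time, data = french)
print(coef(french.lin))                  # intercept (F) and slope (F per minute)
print(summary(french.lin)$r.squared)
```

The slope is the estimated temperature change per minute, so a negative value means the press is cooling.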

Leaving aside the problem that temperature is not only a function of time, the linear model is pretty decent! A quadratic specification (just adding `I(Time^2)`) is even closer. With an ambient temperature of 67 F, the linear model predicts the press would “reach” ambient at about 4 hours and the quadratic model at 7.8 hours! Now, neither of these predictions derives from a theoretical model of heat transfer, and neither would converge on the ambient temperature. The linear model would predict a continuous temperature drop and the quadratic model would reach the temperature of the sun in a few weeks. However, within the relevant range both are decent approximations. One of the default regression diagnostic plots provided in R can shed light on both the potential mis-specification of the linear model and two data points I may have recorded incorrectly.
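The comparison and the “reaches ambient” extrapolation can be sketched as follows, again on synthetic stand-in data (not the recorded measurements), so the crossing times will not match the figures above. The data frame `french` and the column `Temp` are assumed names. Depending on the data, the fitted parabola may bottom out above ambient, in which case it has no real crossing at all.

```r
# Synthetic stand-in data, NOT the recorded measurements (212 F start assumed)
set.seed(1)
Time <- seq(0, 90, by = 5)                          # minutes
k.true <- -log((151 - 67) / (212 - 67)) / 90
Temp <- 67 + (212 - 67) * exp(-k.true * Time) + rnorm(length(Time), sd = 0.5)
french <- data.frame(Time = Time, Temp = Temp)

french.lin  <- lm(Temp ~ Time, data = french)
french.quad <- lm(Temp ~ Time + I(Time^2), data = french)

ambient <- 67   # F

# Linear model: solve b0 + b1 * t = ambient for t
b <- unname(coef(french.lin))
t.lin <- (ambient - b[1]) / b[2]

# Quadratic model: roots of (c0 - ambient) + c1 * t + c2 * t^2 = 0.
# polyroot() takes coefficients in increasing order and returns complex roots.
cq <- unname(coef(french.quad))
roots <- polyroot(c(cq[1] - ambient, cq[2], cq[3]))
t.quad <- Re(roots)[abs(Im(roots)) < 1e-6 & Re(roots) > 0]  # real, positive roots only

print(c(linear = deviance(french.lin), quadratic = deviance(french.quad)))
print(t.lin / 60)    # hours until the linear fit hits ambient
print(t.quad / 60)   # empty if the parabola never gets that low
```

Either way, both crossing times are extrapolations far outside the measured range, which is why they should not be taken literally.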

For each observation I tried to get as close to five minutes as possible and to place the thermometer in approximately the same spot, but the 1st and 6th observations seem a bit out of whack.
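One way to look at this is the residuals-versus-fitted plot, the first of the default `lm` diagnostics. The sketch below uses synthetic stand-in data (assumed names `french` and `Temp`), so it will show the curvature that hints at mis-specification but not the particular suspect readings described above.

```r
# Synthetic stand-in data, NOT the recorded measurements (212 F start assumed)
set.seed(1)
Time <- seq(0, 90, by = 5)                          # minutes
k.true <- -log((151 - 67) / (212 - 67)) / 90
Temp <- 67 + (212 - 67) * exp(-k.true * Time) + rnorm(length(Time), sd = 0.5)
french <- data.frame(Time = Time, Temp = Temp)

french.lin <- lm(Temp ~ Time, data = french)

# Residuals vs fitted: a U-shaped pattern suggests a missing nonlinear term
plot(french.lin, which = 1)

# Standardized residuals flag observations that sit far from the fit
std.res <- rstandard(french.lin)
print(round(std.res, 2))
print(which(abs(std.res) > 2))   # candidate outliers, if any
```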

We don’t have to stop here. `nls()` offers us the ability to fit a more realistic, though not necessarily more accurate, model. If we know the weight of water in the press, the area of the press and the ambient temperature, we can use a very simplified form of Fourier’s law to both estimate the heat transfer coefficient and fit a predicted temperature given our known starting temperature. At first it seems like we might have to muck about with computing derivatives manually, but `nls()` can give us both. How?

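- $T(t) = T_a + (T_0 - T_a)e^{-kt}$ is our solution to the differential equation $\frac{dT}{dt} = -k(T - T_a)$, where $k = \frac{hA}{mc_p}$ and $h$ is the heat transfer coefficient ($T_a$ is the ambient temperature, $T_0$ the starting temperature, $A$ the area of the press, $m$ the mass of water and $c_p$ its specific heat)
- We fit an exponential function with `nls()` and regress our estimate against the observed temperatures to find $h$
- Then we can use $h$ to find predicted values for temperature and compare them to observed values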
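The steps above can be sketched as follows. This is not the original code: the data are synthetic stand-ins, and the mass, specific heat and surface area used to convert the fitted decay rate into a heat transfer coefficient are illustrative assumptions (roughly one litre of water and a guessed area), as are the names `french`, `Temp`, `k` and `T0`. The exponential form is the standard lumped cooling model, often called Newton's law of cooling.

```r
# Synthetic stand-in data, NOT the recorded measurements (212 F start assumed)
set.seed(1)
Time <- seq(0, 90, by = 5)                          # minutes
k.true <- -log((151 - 67) / (212 - 67)) / 90
Temp <- 67 + (212 - 67) * exp(-k.true * Time) + rnorm(length(Time), sd = 0.5)
french <- data.frame(Time = Time, Temp = Temp)

ambient <- 67                 # F, from the text
T0 <- french$Temp[1]          # known starting temperature (first reading)

# Fit T(t) = ambient + (T0 - ambient) * exp(-k * t) for the decay rate k
french.nls <- nls(Temp ~ ambient + (T0 - ambient) * exp(-k * Time),
                  data = french, start = list(k = 0.01))
k.hat <- coef(french.nls)[["k"]]      # per minute

# Convert k to a heat transfer coefficient via k = h * A / (m * cp).
# The mass and area below are assumptions for illustration only.
m  <- 1.0      # kg of water (assumed)
cp <- 4186     # J / (kg K), specific heat of water
A  <- 0.06     # m^2 of heat transfer area (assumed)
h.hat <- (k.hat / 60) * m * cp / A    # W / (m^2 K); k converted to per second

# Predicted vs observed temperatures
print(cbind(observed = french$Temp, predicted = fitted(french.nls)))
print(c(k = k.hat, h = h.hat))
```

Because the curve is pinned to `T0`, any error in the first reading propagates into every predicted value, which is worth keeping in mind when judging the fit.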

What does this look like?

On its face, the fit is poorer than the quadratic model. The heat transfer model I created is exceedingly simple, so the poor fit may result from the assumptions regarding the heat transfer area. Another reason may be that the model depends on the first data point, the one measured with what *looks* like a great degree of error. Removing the first point and refitting the model returns a different estimate for the coefficient (`0.8322` vs. the original `0.968`) and an improved fit. We can show the fit of each model with a call to `deviance()`:

```
dev.french <- lapply(list(french.lin, french.quad, french.nls, french.nls2), deviance)
names(dev.french) <- c("linear", "quadratic", "first exponential", "updated exponential")
dev.french
```

```
$linear
[1] 56.81051

$quadratic
[1] 8.431236

$`first exponential`
[1] 17.24654

$`updated exponential`
[1] 4.985709
```
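For completeness, here is a hedged sketch of how the four models and the comparison could be put together on synthetic stand-in data. How the refit handled the starting temperature after the first point was dropped is not stated, so treating it as a free parameter (`Tstart`, a hypothetical name) is an assumption. Since the refit uses one fewer observation, the sketch also reports deviance per residual degree of freedom, which is a fairer comparison than raw deviance.

```r
# Synthetic stand-in data, NOT the recorded measurements (212 F start assumed)
set.seed(1)
Time <- seq(0, 90, by = 5)                          # minutes
k.true <- -log((151 - 67) / (212 - 67)) / 90
Temp <- 67 + (212 - 67) * exp(-k.true * Time) + rnorm(length(Time), sd = 0.5)
french <- data.frame(Time = Time, Temp = Temp)

ambient <- 67
T0 <- french$Temp[1]

french.lin  <- lm(Temp ~ Time, data = french)
french.quad <- lm(Temp ~ Time + I(Time^2), data = french)
french.nls  <- nls(Temp ~ ambient + (T0 - ambient) * exp(-k * Time),
                   data = french, start = list(k = 0.01))

# Drop the first reading and let the starting temperature be estimated
# (an assumption about how the refit might be done).
french.nls2 <- nls(Temp ~ ambient + (Tstart - ambient) * exp(-k * Time),
                   data = french[-1, ], start = list(Tstart = 200, k = 0.01))

models <- list(linear = french.lin, quadratic = french.quad,
               "first exponential" = french.nls,
               "updated exponential" = french.nls2)
dev.french <- sapply(models, deviance)                   # residual sum of squares
dev.per.df <- dev.french / sapply(models, df.residual)   # adjusts for n and parameters
print(rbind(deviance = dev.french, per.df = dev.per.df))
```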

It isn't quite predicting the cosmic background radiation but I am happy to see that science works! As always, code is below.
