Clarifying vague interactions

August 18, 2013

(This article was first published on Ecology in silico, and kindly contributed to R-bloggers)

For some reason, authors occasionally present linear model results with vague or unintelligible interaction effects. One way to be vague when presenting interaction effects is to provide only a table of model coefficients, including no information on the range of covariate values observed, and no plots to aid in interpretation. Here’s an example:

Suppose you have discovered a statistically significant interaction effect between two continuous covariates in the context of a linear model.

Suppose also that you have decided to present the model results with the following table, and the reviewers requested no additional information:

             Estimate   SE      P-value
$\beta_0$    -0.004     0.037   0.921
$\beta_1$     1.055     0.038   <0.05
$\beta_2$    -0.496     0.037   <0.05
$\beta_3$     2.002     0.040   <0.05
RSE           0.517

Without knowing the range of covariate values observed, this table tells an incomplete story about the relationship between the covariates and the response variable. Assuming the reader has a decent guess about the range of possible values for the covariates, this is what they can piece together:

library(ggplot2)

# parameter estimates
beta0 <- -.004
beta1 <- 1.055
beta2 <- -.496
beta3 <- 2.002

# reader's guess: range of possible covariate values
x1 <- seq(-5, 5, .1)
x2 <- seq(-5, 5, .1)
X <- expand.grid(x1=x1, x2=x2)

# reader's attempt to know how the covariates relate to E(y)
mu <- with(X, beta0 + beta1*x1 + beta2*x2 + beta3*x1*x2)

# tile and contour plot of E(y) over the guessed covariate grid
d <- data.frame(mu=mu, x1=X$x1, x2=X$x2)
p1 <- ggplot(d, aes(x1, x2, z=mu)) + theme_bw() +
  geom_tile(aes(fill=mu)) +
  stat_contour(binwidth=1.5) +
  scale_fill_gradient2(low="blue", mid="white", high="orange") +
  xlab("Covariate 1") + ylab("Covariate 2") +
  ggtitle("Contour plot of E(y)")
print(p1)

If the reader does not know where the observations fell in this plot, it is difficult to know whether the response variable was increasing or decreasing with each covariate across the range of observed values.
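To make the difficulty concrete: under the model in the code above, the slope of E(y) with respect to covariate 1 is $\beta_1 + \beta_3 x_2$, and the slope with respect to covariate 2 is $\beta_2 + \beta_3 x_1$. A minimal sketch, using only the estimates from the table (the helper names are mine, purely for illustration), shows where each slope changes sign:

```r
# parameter estimates from the table
beta1 <- 1.055
beta2 <- -.496
beta3 <- 2.002

# slope of E(y) in each covariate, given the value of the other
# (illustrative helpers, not part of the original analysis)
slope_x1 <- function(x2) beta1 + beta3 * x2
slope_x2 <- function(x1) beta2 + beta3 * x1

# covariate values at which each slope is zero
x2_flip <- -beta1 / beta3   # about -0.53
x1_flip <- -beta2 / beta3   # about  0.25

# slope of covariate 1 at a few values of covariate 2
slope_x1(c(-2, 0, 2))
```

If the observed values of covariate 2 all lie above `x2_flip`, E(y) increases with covariate 1 everywhere in the data; if they straddle it, the direction depends on where you look.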

Consider the following two cases, where the observed covariate combinations are included as points.

These two plots tell somewhat different stories despite identical model parameters. On the left, across the range of observed covariates, the expected value of $y$ increases as either covariate increases, and the interaction term affects the magnitude of this increase. On the right, increases in covariate 1 or 2 could increase or decrease $\mu$, depending on the value of the other covariate.
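The data behind these two cases are not given here, so the sketch below invents two sets of observed covariates that match the descriptions (the ranges and sample sizes are my assumptions) and checks the sign of the slopes across the observed points in each:

```r
set.seed(1)

# parameter estimates from the table
beta1 <- 1.055
beta2 <- -.496
beta3 <- 2.002

# hypothetical observed covariates (assumed for illustration only)
# case A: both covariates strictly positive
caseA <- data.frame(x1 = runif(100, 1, 5), x2 = runif(100, 1, 5))
# case B: both covariates spread around zero
caseB <- data.frame(x1 = runif(100, -5, 5), x2 = runif(100, -5, 5))

# range of the slope of E(y) in each covariate over the observed points
slope_ranges <- function(d) {
  rbind(x1 = range(beta1 + beta3 * d$x2),
        x2 = range(beta2 + beta3 * d$x1))
}

slope_ranges(caseA)  # both slopes stay positive
slope_ranges(caseB)  # both slopes change sign
```

To see either case on the contour plot, the points can be overlaid with something like `p1 + geom_point(data = caseA, aes(x1, x2), inherit.aes = FALSE)`.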

I won’t get into the nitty gritty of how to present interaction effects (but if you’re interested, there are articles out there, e.g. Lamina et al. 2012). My main goal here is to point out the ambiguity associated with presenting only a table of parameter estimates. My preference would be that authors at least report the observed covariate ranges (or, better yet, the values themselves) and provide a plot that illustrates the interaction.
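As a rough sketch of what that could look like, the code below fits an interaction model to simulated data (the data, sample size, and choice of quantiles are all assumptions standing in for a real analysis), reports the observed covariate ranges, and plots predicted E(y) against covariate 1 at low, median, and high observed values of covariate 2:

```r
set.seed(42)

# simulated data standing in for a real data set (all values assumed)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1.0 * x1 - 0.5 * x2 + 2.0 * x1 * x2 + rnorm(n, sd = 0.5)
fit <- lm(y ~ x1 * x2)

# 1. report observed covariate ranges alongside the coefficient table
sapply(data.frame(x1, x2), range)

# 2. predict E(y) across the observed range of x1
#    at the 10th, 50th, and 90th percentiles of observed x2
newd <- expand.grid(
  x1 = seq(min(x1), max(x1), length.out = 50),
  x2 = unname(quantile(x2, c(.1, .5, .9)))
)
newd$mu <- predict(fit, newdata = newd)

# one line per x2 level; expand.grid varies x1 fastest,
# so each matrix column holds the predictions for one x2 level
matplot(unique(newd$x1), matrix(newd$mu, nrow = 50), type = "l", lty = 1,
        xlab = "Covariate 1", ylab = "E(y)")
```

Restricting the prediction grid to observed values keeps the plot from implying anything about covariate combinations that were never sampled.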
