It’s the interactions

[This article was first published on R – Michael's and Christian's Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

What makes a ML model a black-box? It is the interactions. Without any interactions, the ML model is additive and can be exactly described.

Studying interaction effects of ML models is challenging. The main XAI approaches are:

  1. Looking at ICE plots, stratified PDP, and/or 2D PDP.
  2. Study vertical scatter in SHAP dependence plots, or even consider SHAP interaction values.
  3. Check partial-dependence based H-statistics introduced in Friedman and Popescu (2008), or related statistics.

This post is mainly about the third approach. Its beauty is that we get information about all interactions. The downside: it is as good/bad as partial dependence functions. And: the statistics are computationally very expensive to compute (of order n^2).

Different R packages offer some of these H-statistics, including {iml}, {gbm}, {flashlight}, and {vivid}. They all have their limitations. This is why I wrote the new R package {hstats}:

  • It is very efficient.
  • Has a clean API. DALEX explainers and meta-learners (mlr3, Tidymodels, caret) work out-of-the-box.
  • Supports multivariate predictions, including classification models.
  • Allows to calculate unnormalized H-statistics. They help to compare pairwise and three-way statistics.
  • Contains fast multivariate ICE/PDPs with optional grouping variable.

In Python, there is the very interesting project artemis. I will write a post on it later.

Statistics supported by {hstats}

Furthermore, a global measure of non-additivity (proportion of prediction variability unexplained by main effects), and a measure of feature importance is available. For technical details and references, check the following pdf or github.

Classification example

Let’s fit a probability random forest on iris species.


v <- setdiff(colnames(iris), "Species")
fit <- ranger(Species ~ ., data = iris, probability = TRUE, seed = 1)
s <- hstats(fit, v = v, X = iris)  # 8 seconds run-time
# Proportion of prediction variability unexplained by main effects of v:
#      setosa  versicolor   virginica 
# 0.002705945 0.065629375 0.046742035

plot(s, normalize = FALSE, squared = FALSE) +
  ggtitle("Unnormalized statistics") +
  scale_fill_viridis_d(begin = 0.1, end = 0.9)

ice(fit, v = "Petal.Length", X = iris, BY = "Petal.Width", n_max = 150) |> 
  plot(center = TRUE) +
  ggtitle("Centered ICE plots")
Unnormalized H-statistics, i.e., values are roughly on the scale of the predictions (here: probabilities).
Centered ICE plots per class.


  • The features with strongest interactions are Petal Length and Petal Width. These interactions mainly affect species “virginica” and “versicolor”. The effect for “setosa” is almost additive.
  • Unnormalized pairwise statistics show that the strongest absolute interaction happens indeed between Petal Length and Petal Width.
  • The centered ICE plots shows how the interaction manifests: The effect of Petal Length heavily depends on Petal Width, except for species “setosa”. Would a SHAP analysis show the same?

DALEX example

Here, we consider a random forest regression on “Sepal.Length”.



fit <- ranger(Sepal.Length ~ ., data = iris)
ex <- explain(fit, data = iris[-1], y = iris[, 1])

s <- hstats(ex)  # 2 seconds
s  # Non-additivity index 0.054
plot(ice(ex, v = "Sepal.Width", BY = "Petal.Width"), center = TRUE)
Centered ICE plot of strongest relative interactions.


  • Petal Length and Width show the strongest overall associations. Since we are considering normalized statistics, we can say: “About 3.5% of prediction variability comes from interactions with Petal Length”.
  • The strongest relative pairwise interaction happens between Sepal Width and Petal Width: Again, because we study normalized H-statistics, we can say: “About 4% of total prediction variability of the two features Sepal Width and Petal Width can be attributed to their interactions.”
  • Overall, all interactions explain only about 5% of prediction variability (see text output).

Try it out!

The complete R script can be found here. More examples and background can be found on the Github page of the project.

To leave a comment for the author, please follow the link and comment on their blog: R – Michael's and Christian's Blog. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)