Attribution modelling in R

[This article was first published on R – Data Integration | Attribution Modelling, and kindly contributed to R-bloggers.]

Attribution modelling in R: an example

Here I walk through some examples of attribution modelling in R. It is a complex topic, and much more can be said about it than I can cover here. I will mostly go hands-on with the Markov model, using the ChannelAttribution package in R.

For this example we pull data into a data frame from our REST API.


Pulling data via our REST API

Diving straight into code here:


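library(httr)      # GET(), add_headers(), content()
library(jsonlite)  # fromJSON()

# The endpoint URL is blank in the original; fill in your own API endpoint.
data_req <- GET("", add_headers("Authorization" = "Bearer API_KEY"))

journey_data <- content(data_req, "text")
journey_data_json <- fromJSON(journey_data, flatten = TRUE)
# Assumption: the right-hand side is missing in the original; as.data.frame() is one plausible completion.
journey_data_df <- as.data.frame(journey_data_json)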


The data retrieved then looks like this.

[Figure: Customer journeys for attribution modelling]


The data contains both converting and non-converting journeys. This is important for the model to give reliable attribution values. In the example above, the journeys contain both clicks and impressions. The image shows the source, medium and campaign along each customer journey, but the modelling can just as easily go down to keyword level. That way one gets a data-driven attributed value for every keyword along every customer journey.
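As a rough illustration of the shape such journey data can take, here is a minimal base R sketch that collapses raw touchpoints into one path string per journey. The `touches` and `outcomes` data frames, their values and the `journey_id` column are made up for this example; only the `sourcepath`, `totalconversions` and `totalconversionvalue` column names come from the code in this post. The " > " separator is an assumption about how the paths are encoded.

```r
# Hypothetical raw touchpoints: one row per click or impression,
# already ordered in time within each journey.
touches <- data.frame(
  journey_id = c(1, 1, 1, 2, 2, 3),
  source     = c("google", "facebook", "direct", "facebook", "google", "direct"),
  stringsAsFactors = FALSE
)

# Hypothetical outcome per journey: journey 2 did not convert.
outcomes <- data.frame(
  journey_id           = c(1, 2, 3),
  totalconversions     = c(1, 0, 1),
  totalconversionvalue = c(120, 0, 80)
)

# Collapse the touchpoints of each journey into a single path string.
paths <- aggregate(source ~ journey_id, data = touches,
                   FUN = paste, collapse = " > ")
names(paths)[2] <- "sourcepath"

# One row per journey, converting and non-converting alike.
journeys <- merge(paths, outcomes, by = "journey_id")
print(journeys)
```

Keeping the non-converting journey 2 in the table is the point: it is the kind of row the model needs in order to see where journeys fail.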


Loading R packages and calculating the attributions

We use the following R packages for this example.

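# Install these libraries (only do this once)
# install.packages("ChannelAttribution")
# install.packages("reshape")
# install.packages("ggplot2")

Load the packages

library(ChannelAttribution)  # heuristic_models(), markov_model()
library(reshape)             # melt()
library(ggplot2)             # plotting
library(dplyr)               # top_n(), used further down; install it first if needed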


Here we calculate the first-touch, last-touch and linear-touch models.

H <- heuristic_models(journey_data_df, 'sourcepath', 'totalconversions', var_value='totalconversionvalue')
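To make the three heuristics concrete, here is a small base R sketch that computes them by hand on made-up journeys. It illustrates the general idea (all credit to the first touchpoint, all credit to the last, or an equal share per touchpoint) and is not meant to reproduce the internals of `heuristic_models`. The toy data and the `credit` helper are hypothetical.

```r
# Made-up journeys; column names mirror the ones used in this post.
toy <- data.frame(
  sourcepath       = c("A > B > C", "B > C", "A"),
  totalconversions = c(3, 2, 1),
  stringsAsFactors = FALSE
)

# Split each path into its touchpoints.
steps    <- lapply(strsplit(toy$sourcepath, ">", fixed = TRUE), trimws)
channels <- sort(unique(unlist(steps)))

# Hypothetical helper: distribute each journey's conversions over its
# touchpoints according to a weighting function of the path length.
credit <- function(weights_fun) {
  out <- setNames(numeric(length(channels)), channels)
  for (i in seq_along(steps)) {
    w <- weights_fun(length(steps[[i]]))
    for (k in seq_along(steps[[i]])) {
      ch <- steps[[i]][k]
      out[ch] <- out[ch] + toy$totalconversions[i] * w[k]
    }
  }
  out
}

first_touch  <- credit(function(n) c(1, rep(0, n - 1)))  # A = 4, B = 2, C = 0
last_touch   <- credit(function(n) c(rep(0, n - 1), 1))  # A = 1, B = 0, C = 5
linear_touch <- credit(function(n) rep(1 / n, n))        # A = 2, B = 2, C = 2

print(rbind(first_touch, last_touch, linear_touch))
```

All three rows sum to the same six conversions; the heuristics only differ in how they split them across channels.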

And here we calculate the Markov model.

M <- markov_model(journey_data_df, 'sourcepath', 'totalconversions', var_value='totalconversionvalue', order = 1)
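For intuition about what a first-order Markov attribution does, the following base R sketch computes the commonly described "removal effect" exactly on a tiny made-up data set. It is an illustration under stated assumptions, not the package's implementation, and `markov_model` may produce different numbers on real data. The toy data, the `totalnulls` column and the `conversion_prob` helper are hypothetical.

```r
# Made-up journeys with counts of converting and non-converting outcomes.
toy <- data.frame(
  sourcepath       = c("A > B", "A", "B"),
  totalconversions = c(2, 0, 1),
  totalnulls       = c(0, 2, 1),
  stringsAsFactors = FALSE
)

steps     <- lapply(strsplit(toy$sourcepath, ">", fixed = TRUE), trimws)
channels  <- sort(unique(unlist(steps)))
transient <- c("(start)", channels)
states    <- c(transient, "(conversion)", "(null)")

# Count weighted transitions along every journey.
counts <- matrix(0, length(states), length(states),
                 dimnames = list(states, states))
for (i in seq_along(steps)) {
  p <- c("(start)", steps[[i]])
  n <- toy$totalconversions[i] + toy$totalnulls[i]
  for (k in seq_len(length(p) - 1)) {
    counts[p[k], p[k + 1]] <- counts[p[k], p[k + 1]] + n
  }
  last <- p[length(p)]
  counts[last, "(conversion)"] <- counts[last, "(conversion)"] + toy$totalconversions[i]
  counts[last, "(null)"]       <- counts[last, "(null)"] + toy$totalnulls[i]
}

# Row-normalise to transition probabilities (transient rows only).
P <- counts[transient, ] / rowSums(counts[transient, ])

# Probability of reaching "(conversion)" from "(start)", optionally with
# one channel removed (all transitions into it are redirected to "(null)").
conversion_prob <- function(P, removed = NULL) {
  if (!is.null(removed)) {
    P[, "(null)"] <- P[, "(null)"] + P[, removed]
    P[, removed]  <- 0
  }
  Q <- P[, rownames(P), drop = FALSE]   # transient -> transient
  R <- P[, "(conversion)"]              # transient -> conversion
  absorb <- solve(diag(nrow(Q)) - Q, R) # absorbing-chain solution
  unname(absorb[1])                     # "(start)" is the first transient state
}

p_all   <- conversion_prob(P)
removal <- sapply(channels, function(ch) 1 - conversion_prob(P, ch) / p_all)

# Share the observed conversions in proportion to the removal effects.
markov_attribution <- sum(toy$totalconversions) * removal / sum(removal)
print(markov_attribution)  # A = 1, B = 2
```

Channel B gets more credit than A here because every converting journey passes through B, so removing B drops the conversion probability to zero.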


Then we join the data frames by channel name so that the attribution models are easier to compare.

attributions <- merge(H, M, by='channel_name')


We remove the columns we don't need and keep only the interesting ones.

# Column names can differ between package versions; check colnames(attributions) if a column goes missing.
attributions <- attributions[, (colnames(attributions) %in% c('channel_name', 'first_touch_conversions', 'last_touch_conversions', 'linear_touch_conversions', 'total_conversion'))]

# Renames the columns
colnames(attributions) <- c('channel_name', 'first_touch', 'last_touch', 'linear_touch', 'markov_model')

Before plotting we need to filter the data frame, because in our case there were more than 500 different converting sources. Here we keep the top 10 channels by Markov-attributed conversions.

attributions <- top_n(attributions, 10, markov_model)

Here we reshape the data frame into long format so ggplot can use it more easily.

attributions <- melt(attributions, id='channel_name')
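To show what this wide-to-long step produces, here is a base R sketch on a made-up table. It mimics the `channel_name` / `variable` / `value` layout that ggplot expects and is only an illustration, not how `melt` is implemented. The `wide` data frame and its numbers are hypothetical.

```r
# Made-up wide table: one row per channel, one column per model.
wide <- data.frame(
  channel_name = c("google", "facebook"),
  first_touch  = c(10, 5),
  markov_model = c(8, 7),
  stringsAsFactors = FALSE
)

# Every column except the id column is a measure to stack.
measure <- setdiff(names(wide), "channel_name")

# Long table: one row per (channel, model) pair.
long <- data.frame(
  channel_name = rep(wide$channel_name, times = length(measure)),
  variable     = factor(rep(measure, each = nrow(wide)), levels = measure),
  value        = unlist(wide[measure], use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```

With the data in this shape, `fill = variable` in the plot gives one bar per model within each channel.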


Plotting the data

And here we can plot the conversions in a bar chart.

ggplot(attributions, aes(channel_name, value, fill = variable)) +
  geom_bar(stat='identity', position='dodge') +
  ggtitle('Attributed conversions with the different models') +
  theme(axis.title.x = element_text(vjust = -2)) +
  theme(axis.title.y = element_text(vjust = +2)) +
  theme(title = element_text(size = 16)) +
  theme(plot.title=element_text(size = 20))


In this example the chart looks like this.

[Figure: Attribution modelling in R]

To make attribution modelling more actionable, one has to join it with cost data to get a ROAS (return on ad spend) or a CPA (cost per acquisition) based on the chosen attribution model. That way one can allocate budget and spend where it has the biggest impact. Multi-touch attribution models help significantly here because they simplify the analysis: one does not have to account separately for bounce rates, click-through rates and so on. Everything is included in the model once it's put into perspective against how much was spent on the channel.
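A minimal sketch of that join, assuming a per-channel cost table is available. All names and numbers here (`attributed`, `cost`, `spend`, `markov_conversions`, `markov_value`) are hypothetical and only illustrate the arithmetic: CPA is spend divided by attributed conversions, and ROAS is attributed value divided by spend.

```r
# Hypothetical attributed results per channel (e.g. from a Markov model).
attributed <- data.frame(
  channel_name       = c("google", "facebook", "newsletter"),
  markov_conversions = c(40, 25, 10),
  markov_value       = c(4000, 2000, 900),
  stringsAsFactors = FALSE
)

# Hypothetical spend per channel over the same period.
cost <- data.frame(
  channel_name = c("google", "facebook", "newsletter"),
  spend        = c(1000, 800, 100),
  stringsAsFactors = FALSE
)

perf <- merge(attributed, cost, by = "channel_name")
perf$cpa  <- perf$spend / perf$markov_conversions  # cost per attributed conversion
perf$roas <- perf$markov_value / perf$spend        # attributed value per unit of spend

# Highest return first.
perf <- perf[order(-perf$roas), ]
print(perf)
```

Swapping the Markov columns for the first-touch or last-touch ones shows how much the ranking depends on the attribution model chosen.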


Budget optimisations

Optimising the budget and making the data actionable is where our budget optimiser comes in handy. The budget optimiser in our software takes the impact of budget changes into account and gives you prioritised optimisations. Get in touch for a demo or sign up for a free trial!

Through our REST API it is also possible to pull the attributed conversions per keyword directly into Excel or a Google Sheet. Let us know if you are interested in this!

