Revisited: Forecasting Last Christmas Search Volume

[This article was first published on r-bloggers – STATWORX, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

It is June and nearly half of the year is over, marking the middle between Christmas 2018 and 2019. Last year in autumn, I’ve published a blog post about predicting Wham’s „Last Christmas“ search volume using Google Trends data with different types of neural network architectures. Of course, now I want to know how good the predictions were, compared to the actual search volumes.

The following table shows the predicted values by the different network architectures, the true search volume data in the relevant time region from November 2018 until January 2019, as well as the relative prediction error in brackets:

monthMLPCNNLSTMactual
2018-110.166 (0.21)0.194 (0.078)0.215 (0.023)0.21
2018-120.858 (0.057)0.882 (0.031)0.817 (0.102)0.91
2019-010.035 (0.153)0.034 (0.149)0.035 (0.153)0.03

There’s no clear winner in this game. For the month November, the LSTM model performs best with a relative error of only 2.3%. However, in the „main“ month December, the LSTM drops in accuracy in favor of the 1-dimensional CNN with 3.1% error and the MLP with 5.7% error. Compared to November and December, January exhibits higher prediction errors >10% regardless of the architecture.

To bring a little more data science flavor into this post, I’ve created a short R script that presents the results in a cool „heatmap“ style.

library(dplyr)
library(ggplot2)

# Define data frame for plotting
df_plot <- data.frame(MONTH=rep(c("2018-11", "2018-12", "2019-01"), 3),
                      MODEL = c(rep("MLP", 3), rep("CNN", 3), rep("LSTM", 3)),
                      PREDICTION = c(0.166, 0.858, 0.035, 0.194, 0.882, 0.034, 0.215, 0.817, 0.035),
                      ACTUAL = rep(c(0.21, 0.91, 0.03), 3))

# Do plot
df_plot %>%
  mutate(MAPE = round(abs(ACTUAL - PREDICTION) / ACTUAL, 3)) %>%
  ggplot(data = ., aes(x = MONTH, y = MODEL)) +
  geom_tile(aes(fill = MAPE)) +
  scale_fill_gradientn(colors = c('navyblue', 'darkmagenta', 'darkorange1')) +
  geom_text(aes(label = MAPE), color = "white") +
  theme_minimal()
prediction heat map

This year, I will (of course) redo the experiment using the newly acquired data. I am curious to find out if the prediction improves. In the meantime, you can sign up to our mailing list, bringing you the best data science, machine learning and AI reads and treats directly into your mailbox!

Über den Autor
Sebastian Heinz

Sebastian Heinz

I am the founder and CEO of STATWORX. I enjoy writing about machine learning and AI, especially about neural networks and deep learning. In my spare time, I love to cook, eat and drink as well as traveling the world.

ABOUT US


STATWORX
is a consulting company for data science, statistics, machine learning and artificial intelligence located in Frankfurt, Zurich and Vienna. Sign up for our NEWSLETTER and receive reads and treats from the world of data science and AI. If you have questions or suggestions, please write us an e-mail addressed to blog(at)statworx.com.  

Der Beitrag Revisited: Forecasting Last Christmas Search Volume erschien zuerst auf STATWORX.

To leave a comment for the author, please follow the link and comment on their blog: r-bloggers – STATWORX.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)