Predicting the NASDAQ 100 with Hyperparameter Tuning

[This article was first published on DataGeeek, and kindly contributed to R-bloggers].

Pressure on the markets has intensified ahead of the Wednesday deadline for trade deals. We will model monthly Nasdaq 100 data using the Federal Funds Effective Rate and the Unemployment Rate as predictors, fitting boosted trees (XGBoost) with hyperparameter tuning.

library(tidyverse)
library(tidyquant) #tq_get(), tq_transmute()
library(tidymodels)
library(modeltime)
library(timetk)

#Unemployment Rate (UNRATE)
df_unrate <- 
  tq_get("UNRATE", get = "economic.data") %>% 
  select(date, unrate = price)
  
#Federal Funds Effective Rate (FEDFUNDS)
df_fedfunds <- 
  tq_get("FEDFUNDS", get = "economic.data") %>% 
  select(date, fedfunds = price)

#Nasdaq 100
df_nasdaq <- 
  tq_get("^NDX") %>% 
  tq_transmute(select = close,
               mutate_fun = to.monthly,
               col_rename = "nasdaq") %>% 
  mutate(date = as.Date(date))


#Merging the datasets
df_merged <- 
  df_unrate %>% 
  left_join(df_fedfunds, by = "date") %>% 
  left_join(df_nasdaq, by = "date") %>% 
  drop_na() 


#Split Data 
splits <- 
  time_series_split(
    df_merged,
    date_var   = date,
    assess     = "1 year",
    cumulative = TRUE
  )

df_train <- training(splits)
df_test <- testing(splits)



#Recipe
recipe_ml <- 
  recipe(nasdaq ~ ., df_train) %>%
  step_date(date, features = "month", ordinal = FALSE) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% 
  step_mutate(date_num = as.numeric(date)) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_rm(date)
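To see exactly what the recipe hands to the model (month dummies, a numeric date trend, and the normalized rates), it can be prepped and baked on the training set. This quick inspection step is not part of the original post:

```r
#Preview the engineered predictors the model will see
recipe_ml %>% 
  prep(training = df_train) %>% 
  bake(new_data = NULL) %>% 
  glimpse()
```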

#Model spec
mod_spec <- 
  boost_tree(trees = tune(),
             min_n = tune(),
             tree_depth = tune(),
             learn_rate = tune()) %>%
  set_engine("xgboost") %>% 
  set_mode("regression")

#Hyperparameter Tuning
mod_param <- extract_parameter_set_dials(mod_spec)

set.seed(1234)
model_tbl <- 
  mod_param %>% 
  grid_random(size = 50) %>%
  create_model_grid(
    f_model_spec = boost_tree,
    engine_name  = "xgboost",
    mode         = "regression"
  )

#Extracting the model list
model_list <- model_tbl$.models

#Workflowsets
model_wfset <- 
  workflow_set(
  preproc = list(recipe_ml),
  models = model_list, 
  cross = TRUE
)


#Fitting Using Parallel Backend
model_parallel_tbl <- 
  model_wfset %>%
  modeltime_fit_workflowset(
    data    = df_train,
    control = control_fit_workflowset(
      verbose   = TRUE,
      allow_par = TRUE
    )
  )



#Accuracy 
model_parallel_tbl %>% 
  modeltime_calibrate(new_data = df_test) %>% 
  modeltime_accuracy() %>%
  table_modeltime_accuracy()
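The winning workflow is hard-coded by name in the next step. Because model names can shift between runs, the lowest-RMSE model id could instead be pulled programmatically; a sketch, using the columns `modeltime_accuracy()` returns:

```r
#Pick the model id with the lowest test-set RMSE
best_id <- 
  model_parallel_tbl %>% 
  modeltime_calibrate(new_data = df_test) %>% 
  modeltime_accuracy() %>% 
  slice_min(rmse, n = 1) %>% 
  pull(.model_id)
```

The calibration step below could then filter on `.model_id == best_id` rather than on the model description.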



#Calibrating the best model (by test-set accuracy) to the test set
calibration_tbl <- 
  model_parallel_tbl %>%
  filter(.model_desc == "RECIPE_BOOST_TREE_27") %>% 
  modeltime_calibrate(new_data = df_test)
  


#Prediction Intervals
calibration_tbl %>% 
  modeltime_forecast(new_data = df_test, 
                     actual_data = df_merged %>%
                                   filter(date >= as.Date("2024-07-01"))) %>%
  plot_modeltime_forecast(.interactive = FALSE,
                          .legend_show = FALSE,
                          .line_size = 1.5,
                          .color_lab = "",
                          .title = "NASDAQ 100") +
  
  labs(subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span><br><span style = 'color:red;'>ML Model</span>") + 
  scale_x_date(expand = expansion(mult = c(.1, .15)),
               labels = scales::label_date(format = "%b'%y")) +
  scale_y_continuous(labels = scales::label_currency()) +
  theme_minimal(base_family = "Roboto Slab", base_size = 20) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "azure", 
                                       color = "azure"),
        plot.title = element_text(face = "bold"),
        axis.text = element_text(face = "bold"),
        plot.subtitle = ggtext::element_markdown(face = "bold"))
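The plot above back-tests on the hold-out year only. To produce a genuinely out-of-sample forecast, the calibrated workflow would typically be refit on the full history and fed future regressor values. The carry-forward values below are an assumption for illustration only, since future UNRATE and FEDFUNDS readings are not known in advance:

```r
#Refit the calibrated model on the full dataset
refit_tbl <- 
  calibration_tbl %>% 
  modeltime_refit(data = df_merged)

#Future dates with naive carry-forward regressors (an assumption)
future_tbl <- 
  df_merged %>% 
  future_frame(.date_var = date, .length_out = "6 months") %>% 
  mutate(unrate   = last(df_merged$unrate),
         fedfunds = last(df_merged$fedfunds))

refit_tbl %>% 
  modeltime_forecast(new_data = future_tbl, actual_data = df_merged) %>% 
  plot_modeltime_forecast(.interactive = FALSE)
```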

According to the model, it would be better to wait until the tariff uncertainty ends before entering the market.
