Predicting the NASDAQ 100 with Hyperparameter Tuning
[This article was first published on DataGeeek, and kindly contributed to R-bloggers]. (You can report issues about the content on this page here.)
Pressure on the markets has intensified ahead of Wednesday's deadline for trade deals. We will model monthly NASDAQ 100 data against the Federal Funds Effective Rate (FEDFUNDS) and the Unemployment Rate (UNRATE), using boosted trees (XGBoost) with hyperparameter tuning.
library(tidyverse)
library(tidymodels)
library(modeltime)
library(timetk)
library(tidyquant) #provides tq_get() and tq_transmute()

#Unemployment Rate (UNRATE)
df_unrate <- tq_get("UNRATE", get = "economic.data") %>%
  select(date, unrate = price)

#Federal Funds Effective Rate (FEDFUNDS)
df_fedfunds <- tq_get("FEDFUNDS", get = "economic.data") %>%
  select(date, fedfunds = price)

#Nasdaq 100
df_nasdaq <- tq_get("^NDX") %>%
  tq_transmute(select = close,
               mutate_fun = to.monthly,
               col_rename = "nasdaq") %>%
  mutate(date = as.Date(date))

#Merging the datasets
df_merged <- df_unrate %>%
  left_join(df_fedfunds, by = "date") %>%
  left_join(df_nasdaq, by = "date") %>%
  drop_na()

#Splitting the data (last year held out for testing)
splits <- time_series_split(
  df_merged,
  assess = "1 year",
  cumulative = TRUE
)

df_train <- training(splits)
df_test <- testing(splits)

#Preprocessing recipe
recipe_ml <- recipe(nasdaq ~ ., df_train) %>%
  step_date(date, features = "month", ordinal = FALSE) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>%
  step_mutate(date_num = as.numeric(date)) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_rm(date)

#Model specification
mod_spec <- boost_tree(
  trees = tune(),
  min_n = tune(),
  tree_depth = tune(),
  learn_rate = tune()
) %>%
  set_engine("xgboost") %>%
  set_mode("regression")

#Hyperparameter tuning: a random grid of 50 candidate models
mod_param <- extract_parameter_set_dials(mod_spec)

set.seed(1234)
model_tbl <- mod_param %>%
  grid_random(size = 50) %>%
  create_model_grid(
    f_model_spec = boost_tree,
    engine_name = "xgboost",
    mode = "regression"
  )

#Extracting the model list
model_list <- model_tbl$.models

#Workflow set: one recipe crossed with all candidate models
model_wfset <- workflow_set(
  preproc = list(recipe_ml),
  models = model_list,
  cross = TRUE
)

#Fitting using a parallel backend
model_parallel_tbl <- model_wfset %>%
  modeltime_fit_workflowset(
    data = df_train,
    control = control_fit_workflowset(
      verbose = TRUE,
      allow_par = TRUE
    )
  )

#Accuracy of all tuned models on the test set
model_parallel_tbl %>%
  modeltime_calibrate(new_data = df_test) %>%
  modeltime_accuracy() %>%
  table_modeltime_accuracy()

#Calibrating the best model to the test set
calibration_tbl <- model_parallel_tbl %>%
  filter(.model_desc == "RECIPE_BOOST_TREE_27") %>%
  modeltime_calibrate(df_test)

#Prediction intervals
calibration_tbl %>%
  modeltime_forecast(
    new_data = df_test,
    actual_data = df_merged %>% filter(date >= as.Date("2024-07-01"))
  ) %>%
  plot_modeltime_forecast(
    .interactive = FALSE,
    .legend_show = FALSE,
    .line_size = 1.5,
    .color_lab = "",
    .title = "NASDAQ 100"
  ) +
  labs(subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span><br><span style = 'color:red;'>ML Model</span>") +
  scale_x_date(
    expand = expansion(mult = c(.1, .15)),
    labels = scales::label_date(format = "%b'%y")
  ) +
  scale_y_continuous(labels = scales::label_currency()) +
  theme_minimal(base_family = "Roboto Slab", base_size = 20) +
  theme(
    legend.position = "none",
    plot.background = element_rect(fill = "azure", color = "azure"),
    plot.title = element_text(face = "bold"),
    axis.text = element_text(face = "bold"),
    plot.subtitle = ggtext::element_markdown(face = "bold")
  )

According to the model, it would be better to wait for the tariff uncertainty to end before entering the market.
To leave a comment for the author, please follow the link and comment on their blog: DataGeeek.