[This article was first published on DataGeeek, and kindly contributed to R-bloggers].
The graph below, based on an observation by Fernando Leibovici, shows that the rise in trade policy uncertainty that began in late 2024 coincides with a jump in imports, suggesting that US importers front-loaded purchases as a precaution against expected tariff increases or supply chain disruptions.
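A minimal sketch (not the author's original chart code) to reproduce a chart like the one described, with the uncertainty index rescaled onto a secondary axis. The column names returned by tq_get() follow the post's own code and may vary by tidyquant version:

library(tidyverse)
library(tidyquant)

#Imports of Goods and Services (BOPTIMP) and trade policy uncertainty (EPUTRADE)
imp <- tq_get("BOPTIMP", get = "economic.data") %>%
  select(date = observation_date, imports = BOPTIMP)

unc <- tq_get("EPUTRADE", get = "economic.data") %>%
  select(date = observation_date, uncertainty = EPUTRADE)

#Rescale uncertainty so both series fit on one panel
k <- max(imp$imports, na.rm = TRUE) / max(unc$uncertainty, na.rm = TRUE)

imp %>%
  inner_join(unc, by = "date") %>%
  ggplot(aes(x = date)) +
  geom_line(aes(y = imports), color = "steelblue") +
  geom_line(aes(y = uncertainty * k), color = "firebrick") +
  scale_y_continuous(
    name = "Imports (BOPTIMP)",
    sec.axis = sec_axis(~ . / k, name = "Trade policy uncertainty (EPUTRADE)")
  ) +
  labs(x = NULL, title = "US imports and trade policy uncertainty")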
When we model the variables with the glmnet engine, the estimated impact of uncertainty on imports turns out to be limited and negative. This is consistent with Fernando Leibovici's suggestion that the alignment is driven mostly by gold imports.
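A quick way to check the sign and size of that effect, assuming the fitted workflow wflw_fit from the source code below. tidy() on the extracted parsnip fit returns the glmnet coefficients at the tuned penalty; since the recipe normalizes all numeric predictors, the coefficient magnitudes are directly comparable:

library(tidymodels)

wflw_fit %>%
  extract_fit_parsnip() %>%
  tidy() %>%
  filter(term == "uncertainty")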
Source code:
library(tidyverse)
library(tidymodels)
library(tidyquant)
library(timetk)

#Imports of Goods and Services: Balance of Payments Basis (BOPTIMP)
df_imports <-
  tq_get("BOPTIMP", get = "economic.data") %>%
  select(date = observation_date, imports = BOPTIMP)

#Economic Policy Uncertainty Index: Categorical Index: Trade policy (EPUTRADE)
df_uncertainty <-
  tq_get("EPUTRADE", get = "economic.data") %>%
  select(date = observation_date, uncertainty = EPUTRADE)

#Merging the datasets
df_merged <-
  df_imports %>%
  left_join(df_uncertainty) %>%
  drop_na()

#Cross-correlation diagnostics
df_merged %>%
  plot_acf_diagnostics(date, imports, .ccf_vars = "uncertainty")

#Correlation over the last three months
#(the original used first(date), which filters out nothing; last(date) matches the intent)
df_merged %>%
  filter(date >= last(date) - months(3)) %>%
  tq_performance(Ra = imports, Rb = uncertainty, performance_fun = table.Correlation)

#Data split
splits <- initial_time_split(df_merged, prop = 0.8)
df_train <- training(splits)
df_test <- testing(splits)

#Bootstrapping for tuning
set.seed(12345)
df_folds <- bootstraps(df_train, times = 100)

#Recipe preprocessing specification
rec_spec <-
  recipe(imports ~ ., data = df_train) %>%
  step_fourier(date, period = 30, K = 1) %>%
  step_date(date, features = c("month", "year")) %>%
  step_rm(date) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>%
  step_normalize(all_numeric_predictors())

#Model specification
model_spec <-
  linear_reg(mode = "regression", penalty = tune()) %>%
  set_engine("glmnet")

#Workflow set
wflow_ <-
  workflow_set(
    preproc = list(recipe = rec_spec),
    models = list(model = model_spec)
  )

#Tuning and evaluating all the models
grid_ctrl <-
  control_grid(
    save_pred = TRUE,
    parallel_over = "everything",
    save_workflow = TRUE
  )

grid_results <-
  wflow_ %>%
  workflow_map(
    seed = 98765,
    resamples = df_folds,
    grid = 10,
    control = grid_ctrl
  )

#Accuracy of the grid results
grid_results %>%
  rank_results(select_best = TRUE, rank_metric = "rsq") %>%
  select(Models = wflow_id, .metric, mean)

# A tibble: 2 × 3
#  Models       .metric   mean
#  <chr>        <chr>    <dbl>
#1 recipe_model rmse    18389.
#2 recipe_model rsq         0.804

#Finalizing the model with the best parameters
best_param <-
  grid_results %>%
  extract_workflow_set_result("recipe_model") %>%
  select_best(metric = "rsq")

wflw_fit <-
  grid_results %>%
  extract_workflow("recipe_model") %>%
  finalize_workflow(best_param) %>%
  fit(df_train)

#Variable importance
library(DALEXtra)

#Processed data frame for variable importance calculation
imp_data <-
  rec_spec %>%
  prep() %>%
  bake(new_data = NULL)

#Explainer object
explainer_ <-
  explain_tidymodels(
    wflw_fit %>% extract_fit_parsnip(),
    data = imp_data %>% select(-imports),
    y = imp_data$imports,
    label = "",
    verbose = FALSE
  )

#Model Studio
library(modelStudio)
set.seed(1983)
modelStudio::modelStudio(explainer_, B = 100, viewer = "browser")
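Note that the code above tunes on bootstrap resamples of the training data but never scores the held-out 20%. A minimal sketch of a test-set check, reusing wflw_fit and df_test from above; metrics() from yardstick reports RMSE, R-squared, and MAE:

library(tidymodels)

wflw_fit %>%
  predict(new_data = df_test) %>%
  bind_cols(df_test %>% select(imports)) %>%
  metrics(truth = imports, estimate = .pred)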