[This article was first published on DataGeeek, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Bitcoin hit an all-time high of $125,664 on October 5. This increase was fueled by a historic net inflow of $3.24 billion into spot Bitcoin ETFs and rising public demand.
In this article, we will predict the trend of two blockchain ETFs using nested forecasting with the Spark backend.
I couldn’t use lagged and smoothing values in recipes because NA caused a problem in the nested data structure.
library(modeltime) library(timetk) library(tidymodels) library(dplyr) library(tidyquant) library(sparklyr) #Connection sc <- spark_connect(master = "local") #Setup the Spark Backend parallel_start(sc, .method = "spark") #Invesco CoinShares Global Blockchain UCITS ETF (BCHN.L) df_bchn <- tq_get("BCHN.L") %>% select(date, 'Invesco CoinShares Global Blockchain' = close) #iShares Blockchain and Tech ETF (IBLC) df_iblc <- tq_get("IBLC") %>% select(date, 'iShares Blockchain and Tech' = close) #Creating the survey data df_survey <- df_bchn %>% left_join(df_iblc) %>% pivot_longer(-date, names_to = "id", values_to = "value") %>% filter(date >= last(date) - months(6)) %>% drop_na() #Nested Data nested_data_tbl <- df_survey %>% dplyr::select(id, date = date, value = value) %>% extend_timeseries( .id_var = id, .date_var = date, .length_future = 15 ) %>% nest_timeseries( .id_var = id, .length_future = 15 ) %>% split_nested_timeseries( .length_test = 15 ) #Modeling #XGBoost rec_xgb <- recipe(value ~ ., extract_nested_train_split(nested_data_tbl)) %>% step_timeseries_signature(date) %>% step_rm(date) %>% step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% step_zv(all_predictors()) %>% step_impute_linear(all_numeric_predictors()) wflw_xgb <- workflow() %>% add_model(boost_tree("regression") %>% set_engine("xgboost")) %>% add_recipe(rec_xgb) #Prophet rec_prophet <- recipe(value ~ date, extract_nested_train_split(nested_data_tbl)) %>% step_date(date, features = c("dow", "month", "year", "doy")) %>% step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% step_zv(all_predictors()) %>% step_impute_linear(all_numeric_predictors()) wflw_prophet <- workflow() %>% add_model( prophet_reg("regression") %>% set_engine("prophet", seasonality_yearly = FALSE, seasonality_weekly = TRUE, seasonality_daily = TRUE)) %>% add_recipe(rec_prophet) #Nested Forecasting with Spark nested_modeltime_tbl <- nested_data_tbl %>% modeltime_nested_fit( wflw_xgb, wflw_prophet, control = control_nested_fit(allow_par = TRUE, verbose = TRUE) ) #Model Test Accuracy nested_modeltime_tbl %>% extract_nested_test_accuracy() %>% table_modeltime_accuracy(.interactive = T) #Extract Nested Test Accuracy best_nested_modeltime_tbl <- nested_modeltime_tbl %>% modeltime_nested_select_best( metric = "mape", minimize = TRUE, filter_test_forecasts = TRUE ) #Extract Nested Best Model Report best_nested_modeltime_tbl %>% extract_nested_best_model_report() #Extract Nested Best Test Forecasts best_nested_modeltime_tbl %>% extract_nested_test_forecast() %>% group_by(id) %>% plot_modeltime_forecast( .facet_ncol = 1, .interactive = FALSE, .line_size = 1 ) + labs(title = "Nested Forecasting", subtitle = "<span style = 'color:dimgrey;'>Predictive Intervals</span> of <span style = 'color:red;'>Prophet</span> Model", y = "", x = "") + facet_wrap(~ id, ncol = 1, scales = "free_y") + scale_y_continuous(labels = scales::label_currency()) + scale_x_date(labels = scales::label_date("%b'%Y"), date_breaks = "30 days") + theme_tq(base_family = "Roboto Slab", base_size = 16) + theme(plot.subtitle = ggtext::element_markdown(face = "bold"), plot.title = element_text(face = "bold"), strip.text = element_text(face = "bold"), axis.text = element_text(face = "bold"), axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1), legend.position = "none")
To leave a comment for the author, please follow the link and comment on their blog: DataGeeek.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.