Regularized Greedy Forest in R

[This article was first published on mlampros, and kindly contributed to R-bloggers].

This blog post is about my newly released RGF package (the post consists mainly of the package vignette). The RGF package is a wrapper of the Regularized Greedy Forest Python package, which also includes a multi-core implementation (FastRGF). Porting from Python to R was made possible by the reticulate package, and the installation requires basic knowledge of Python. On operating systems other than Linux (Macintosh and Windows) the installation might be somewhat cumbersome (on Windows the package can currently be used only from within the command prompt). Detailed installation instructions for all three operating systems can be found in the README.md file and in the rgf_python GitHub repository.

The Regularized Greedy Forest algorithm is explained in detail in the paper by Rie Johnson and Tong Zhang, Learning Nonlinear Functions Using Regularized Greedy Forest. A short synopsis from the paper: “… the resulting method, which we refer to as regularized greedy forest (RGF), integrates two ideas: one is to include tree-structured regularization into the learning formulation; and the other is to employ the fully-corrective regularized greedy algorithm ….”.

At the time of writing this blog post (14 – 02 – 2018), there isn’t a corresponding implementation of the algorithm in the R language, so I decided to port the Python package to R, taking advantage of the reticulate package. In the next lines, I will explain the functionality of the package and compare RGF with other similar implementations, such as ranger (random forest algorithm) and xgboost (gradient boosting algorithm), in terms of time efficiency and error rate improvement.

The RGF package

The RGF package includes the following R6-classes / functions,

classes

RGF_Regressor    RGF_Classifier    FastRGF_Regressor    FastRGF_Classifier
fit()            fit()             fit()                fit()
predict()        predict()         predict()            predict()
cleanup()        predict_proba()   cleanup()            predict_proba()
get_params()     cleanup()         get_params()         cleanup()
score()          get_params()      score()              get_params()
                 score()                                score()

functions

dgCMatrix_2scipy_sparse()

RGF_cleanup_temp_files()

mat_2scipy_sparse()

The package documentation includes details and examples for all R6-classes and functions. In the following code chunks, I’ll explain how a user can work with sparse matrices, since all RGF algorithms accept, besides a dense matrix, a Python (scipy) sparse matrix as input.

Sparse matrices as input

The RGF package includes two functions (mat_2scipy_sparse and dgCMatrix_2scipy_sparse) which allow the user to convert from a matrix / dgCMatrix to a scipy sparse matrix,

<span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">RGF</span><span class="p">)</span><span class="w">

</span><span class="c1"># conversion from a matrix object to a scipy sparse matrix</span><span class="w">
</span><span class="c1">#----------------------------------------------------------</span><span class="w">

</span><span class="n">set.seed</span><span class="p">(</span><span class="m">1</span><span class="p">)</span><span class="w">

</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">matrix</span><span class="p">(</span><span class="n">runif</span><span class="p">(</span><span class="m">1000</span><span class="p">),</span><span class="w"> </span><span class="n">nrow</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">100</span><span class="p">,</span><span class="w"> </span><span class="n">ncol</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w">

</span><span class="n">x_sparse</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mat_2scipy_sparse</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">format</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sparse_row_matrix"</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="nf">dim</span><span class="p">(</span><span class="n">x</span><span class="p">))</span><span class="w">

</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w"> </span><span class="m">100</span><span class="w">  </span><span class="m">10</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="n">x_sparse</span><span class="o">$</span><span class="n">shape</span><span class="p">)</span><span class="w">

</span><span class="p">(</span><span class="m">100</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w">
  
</span>

<span class="w">
</span><span class="c1"># conversion from a dgCMatrix object to a scipy sparse matrix</span><span class="w">
</span><span class="c1">#-------------------------------------------------------------</span><span class="w">

</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">,</span><span class="w"> </span><span class="m">5</span><span class="p">,</span><span class="w"> </span><span class="m">6</span><span class="p">)</span><span class="w">

</span><span class="c1"># by default column-oriented format</span><span class="w">

</span><span class="n">dgcM</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Matrix</span><span class="o">::</span><span class="n">Matrix</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">nrow</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">3</span><span class="p">,</span><span class="w">
                      </span><span class="n">ncol</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="n">byrow</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w">
                      </span><span class="n">sparse</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="nf">dim</span><span class="p">(</span><span class="n">dgcM</span><span class="p">))</span><span class="w">

</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w"> </span><span class="m">3</span><span class="w"> </span><span class="m">3</span><span class="w">

</span><span class="n">x_sparse</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dgCMatrix_2scipy_sparse</span><span class="p">(</span><span class="n">dgcM</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="n">x_sparse</span><span class="o">$</span><span class="n">shape</span><span class="p">)</span><span class="w">

</span><span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="m">3</span><span class="p">)</span><span class="w">
  
</span>
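To make the "column-oriented format" comment above more concrete, the following minimal sketch inspects how a dgCMatrix stores its non-zero values. It uses only the Matrix package (no Python is involved) and rebuilds the same 3 x 3 example matrix; it is meant as an illustration of the storage layout, not as a description of what the conversion function does internally.

```r
library(Matrix)

data = c(1, 0, 2, 0, 0, 3, 4, 5, 6)

# same call as in the example above: fill by row, store as sparse
dgcM = Matrix::Matrix(data = data, nrow = 3, ncol = 3, byrow = TRUE, sparse = TRUE)

print(class(dgcM))

# non-zero values, stored column by column
print(dgcM@x)

# zero-based row index of every non-zero value
print(dgcM@i)

# column pointers: the entries of column j are x[(p[j] + 1):p[j + 1]]
print(dgcM@p)
```

The three slots (x, i, p) are the compressed sparse column representation, which is why the values appear in column order even though the matrix was filled by row.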

Comparison of RGF with ranger and xgboost

First, the data and the libraries are loaded and the cross-validation functions are defined (the MLmetrics library is also required),

<span class="w">
</span><span class="n">data</span><span class="p">(</span><span class="n">Boston</span><span class="p">,</span><span class="w"> </span><span class="n">package</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'KernelKnn'</span><span class="p">)</span><span class="w">

</span><span class="n">library</span><span class="p">(</span><span class="n">RGF</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">ranger</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">xgboost</span><span class="p">)</span><span class="w">



</span><span class="c1"># shuffling function for cross-validation folds</span><span class="w">
</span><span class="c1">#-----------------------------------------------</span><span class="w">


</span><span class="n">func_shuffle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">vec</span><span class="p">,</span><span class="w"> </span><span class="n">times</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">10</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">

  </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">times</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">out</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sample</span><span class="p">(</span><span class="n">vec</span><span class="p">,</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">vec</span><span class="p">))</span><span class="w">
  </span><span class="p">}</span><span class="w">
  </span><span class="n">out</span><span class="w">
</span><span class="p">}</span><span class="w">


</span><span class="c1"># cross-validation folds [ regression]</span><span class="w">
</span><span class="c1">#-------------------------------------</span><span class="w">


</span><span class="n">regr_folds</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">folds</span><span class="p">,</span><span class="w"> </span><span class="n">RESP</span><span class="p">,</span><span class="w"> </span><span class="n">stratified</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">

  </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">is.factor</span><span class="p">(</span><span class="n">RESP</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">stop</span><span class="p">(</span><span class="n">simpleError</span><span class="p">(</span><span class="s2">"this function is meant for regression; for classification use the 'class_folds' function"</span><span class="p">))</span><span class="w">
  </span><span class="p">}</span><span class="w">

  </span><span class="n">samp_vec</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="n">folds</span><span class="p">,</span><span class="w"> </span><span class="n">folds</span><span class="p">)</span><span class="w">

  </span><span class="n">sort_names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="s1">'fold_'</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">folds</span><span class="p">)</span><span class="w">

  </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">stratified</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">stratif</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cut</span><span class="p">(</span><span class="n">RESP</span><span class="p">,</span><span class="w"> </span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">folds</span><span class="p">)</span><span class="w">

    </span><span class="n">clas</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lapply</span><span class="p">(</span><span class="n">unique</span><span class="p">(</span><span class="n">stratif</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">stratif</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">x</span><span class="p">))</span><span class="w">

    </span><span class="n">len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lapply</span><span class="p">(</span><span class="n">clas</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">))</span><span class="w">

    </span><span class="n">prop</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lapply</span><span class="p">(</span><span class="n">len</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">samp_vec</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="nf">round</span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">samp_vec</span><span class="p">[</span><span class="n">x</span><span class="p">])))</span><span class="w">

    </span><span class="n">repl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">lapply</span><span class="p">(</span><span class="n">prop</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">paste0</span><span class="p">(</span><span class="s1">'fold_'</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">),</span><span class="w"> </span><span class="n">x</span><span class="p">[</span><span class="n">y</span><span class="p">]))))</span><span class="w">

    </span><span class="n">spl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">suppressWarnings</span><span class="p">(</span><span class="n">split</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">RESP</span><span class="p">),</span><span class="w"> </span><span class="n">repl</span><span class="p">))}</span><span class="w">

  </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">prop</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lapply</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">RESP</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">samp_vec</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="nf">round</span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">samp_vec</span><span class="p">[</span><span class="n">x</span><span class="p">])))</span><span class="w">

    </span><span class="n">repl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">func_shuffle</span><span class="p">(</span><span class="n">unlist</span><span class="p">(</span><span class="n">lapply</span><span class="p">(</span><span class="n">prop</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="n">sapply</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">paste0</span><span class="p">(</span><span class="s1">'fold_'</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">),</span><span class="w"> </span><span class="n">x</span><span class="p">[</span><span class="n">y</span><span class="p">])))))</span><span class="w">

    </span><span class="n">spl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">suppressWarnings</span><span class="p">(</span><span class="n">split</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">RESP</span><span class="p">),</span><span class="w"> </span><span class="n">repl</span><span class="p">))</span><span class="w">
  </span><span class="p">}</span><span class="w">

  </span><span class="n">spl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">spl</span><span class="p">[</span><span class="n">sort_names</span><span class="p">]</span><span class="w">

  </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">table</span><span class="p">(</span><span class="n">unlist</span><span class="p">(</span><span class="n">lapply</span><span class="p">(</span><span class="n">spl</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">)))))</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">warning</span><span class="p">(</span><span class="s1">'the folds are not equally split'</span><span class="p">)</span><span class="w">
  </span><span class="p">}</span><span class="w">

  </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">unlist</span><span class="p">(</span><span class="n">spl</span><span class="p">))</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">RESP</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">

    </span><span class="n">stop</span><span class="p">(</span><span class="n">simpleError</span><span class="p">(</span><span class="s2">"the total length of the splits is not equal to the length of the response"</span><span class="p">))</span><span class="w">
  </span><span class="p">}</span><span class="w">

  </span><span class="n">spl</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
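The fold-based evaluation pattern can be sketched in base R as follows. This is a simplified illustration under stated assumptions: simple_folds and rmse are hypothetical helpers (they are not part of the RGF package and they are not the regr_folds function above, since they do no stratification), the data is synthetic, and lm is used only as a stand-in model so that the sketch runs without Python.

```r
# hypothetical helper: randomly assign n observations to k named folds
simple_folds = function(n, k) {
  fold_names = paste0('fold_', seq_len(k))
  # repeat the fold labels up to length n, then shuffle them
  labels = sample(rep(fold_names, length.out = n))
  # split the row indices by label and return the folds in numeric order
  split(seq_len(n), labels)[fold_names]
}

# base R root mean squared error (stand-in for MLmetrics::RMSE)
rmse = function(y_true, y_pred) sqrt(mean((y_true - y_pred)^2))

set.seed(1)
n = 506                                  # same number of rows as the Boston data
toy = data.frame(x1 = runif(n), x2 = runif(n))
toy$y = 2 * toy$x1 - toy$x2 + rnorm(n, sd = 0.1)

folds = simple_folds(n, k = 5)
cv_rmse = cv_secs = rep(NA_real_, length(folds))

for (i in seq_along(folds)) {
  train_idx = unlist(folds[-i])          # all folds except the i-th
  test_idx = folds[[i]]                  # held-out fold

  start = Sys.time()
  fit = lm(y ~ x1 + x2, data = toy[train_idx, ])    # stand-in model
  pred = predict(fit, newdata = toy[test_idx, ])
  # fix the units so that the stored numbers are always seconds
  cv_secs[i] = as.numeric(difftime(Sys.time(), start, units = 'secs'))

  cv_rmse[i] = rmse(toy$y[test_idx], pred)
}

print(round(cv_rmse, 3))
```

Setting the units explicitly in difftime is a small design choice: subtracting two Sys.time() values picks the units automatically, so longer runs could otherwise be stored in minutes rather than seconds.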

Single threaded (small data set)

In the next code chunk, I’ll perform 5-fold cross-validation on the Boston dataset and compare the execution time and error rate of all three algorithms (without any hyper-parameter tuning),

<span class="w">
</span><span class="n">NUM_FOLDS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">5</span><span class="w">

</span><span class="n">set.seed</span><span class="p">(</span><span class="m">1</span><span class="p">)</span><span class="w">
</span><span class="n">FOLDS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">regr_folds</span><span class="p">(</span><span class="n">folds</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">NUM_FOLDS</span><span class="p">,</span><span class="w"> </span><span class="n">Boston</span><span class="p">[,</span><span class="w"> </span><span class="s1">'medv'</span><span class="p">],</span><span class="w"> </span><span class="n">stratified</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">


</span><span class="n">boston_rgf_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">boston_ranger_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">boston_xgb_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">boston_rgf_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">boston_ranger_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">boston_xgb_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="kc">NA</span><span class="p">,</span><span class="w"> </span><span class="n">NUM_FOLDS</span><span class="p">)</span><span class="w">


</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">

  </span><span class="n">cat</span><span class="p">(</span><span class="s2">"fold : "</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="s2">"\n"</span><span class="p">)</span><span class="w">

  </span><span class="n">samp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">[</span><span class="o">-</span><span class="n">i</span><span class="p">])</span><span class="w">
  </span><span class="n">samp_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w">


  </span><span class="c1"># RGF</span><span class="w">
  </span><span class="c1">#----</span><span class="w">

  </span><span class="n">rgf_start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">init_regr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RGF_Regressor</span><span class="o">$</span><span class="n">new</span><span class="p">(</span><span class="n">l</span><span class="m">2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">)</span><span class="w">

  </span><span class="n">init_regr</span><span class="o">$</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)]),</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Boston</span><span class="p">[</span><span class="n">samp</span><span class="p">,</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)])</span><span class="w">

  </span><span class="n">pr_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">init_regr</span><span class="o">$</span><span class="n">predict</span><span class="p">(</span><span class="n">as.matrix</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)]))</span><span class="w">

  </span><span class="n">rgf_end</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">boston_rgf_time</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rgf_end</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">rgf_start</span><span class="w">

  </span><span class="n">boston_rgf_te</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MLmetrics</span><span class="o">::</span><span class="n">RMSE</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="s1">'medv'</span><span class="p">],</span><span class="w"> </span><span class="n">pr_te</span><span class="p">)</span><span class="w">


  </span><span class="c1"># ranger</span><span class="w">
  </span><span class="c1">#-------</span><span class="w">

  </span><span class="n">ranger_start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">fit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ranger</span><span class="p">(</span><span class="n">dependent.variable.name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"medv"</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Boston</span><span class="p">[</span><span class="n">samp</span><span class="p">,</span><span class="w"> </span><span class="p">],</span><span class="w"> </span><span class="n">write.forest</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w">
               </span><span class="n">probability</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">F</span><span class="p">,</span><span class="w"> </span><span class="n">num.threads</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">num.trees</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">500</span><span class="p">,</span><span class="w"> </span><span class="n">verbose</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">,</span><span class="w">
               </span><span class="n">classification</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">F</span><span class="p">,</span><span class="w"> </span><span class="n">mtry</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">min.node.size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">5</span><span class="p">,</span><span class="w"> </span><span class="n">keep.inbag</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">

  </span><span class="n">pred_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">predict</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)],</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'se'</span><span class="p">)</span><span class="o">$</span><span class="n">predictions</span><span class="w">

  </span><span class="n">ranger_end</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">boston_ranger_time</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ranger_end</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">ranger_start</span><span class="w">

  </span><span class="n">boston_ranger_te</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MLmetrics</span><span class="o">::</span><span class="n">RMSE</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="s1">'medv'</span><span class="p">],</span><span class="w"> </span><span class="n">pred_te</span><span class="p">)</span><span class="w">


  </span><span class="c1"># xgboost</span><span class="w">
  </span><span class="c1">#--------</span><span class="w">

  </span><span class="n">xgb_start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">dtrain</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">xgb.DMatrix</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)]),</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Boston</span><span class="p">[</span><span class="n">samp</span><span class="p">,</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)])</span><span class="w">

  </span><span class="n">dtest</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">xgb.DMatrix</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)]),</span><span class="w"> </span><span class="n">label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)])</span><span class="w">

  
  </span><span class="n">watchlist</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">train</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dtrain</span><span class="p">,</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dtest</span><span class="p">)</span><span class="w">

  
  </span><span class="n">param</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="s2">"objective"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"reg:linear"</span><span class="p">,</span><span class="w"> </span><span class="s2">"bst:eta"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.05</span><span class="p">,</span><span class="w"> </span><span class="s2">"max_depth"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">4</span><span class="p">,</span><span class="w"> 
               
               </span><span class="s2">"subsample"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.85</span><span class="p">,</span><span class="w"> </span><span class="s2">"colsample_bytree"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.85</span><span class="p">,</span><span class="w"> </span><span class="s2">"booster"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"gbtree"</span><span class="p">,</span><span class="w">
               
               </span><span class="s2">"nthread"</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w">

  </span><span class="n">fit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">xgb.train</span><span class="p">(</span><span class="n">param</span><span class="p">,</span><span class="w"> </span><span class="n">dtrain</span><span class="p">,</span><span class="w"> </span><span class="n">nround</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">500</span><span class="p">,</span><span class="w"> </span><span class="n">print_every_n</span><span class="w">  </span><span class="o">=</span><span class="w"> </span><span class="m">100</span><span class="p">,</span><span class="w"> </span><span class="n">watchlist</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">watchlist</span><span class="p">,</span><span class="w"> </span><span class="n">early_stopping_rounds</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">20</span><span class="p">,</span><span class="w">
                  
                  </span><span class="n">maximize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w"> </span><span class="n">verbose</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w">

  </span><span class="n">p_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">xgboost</span><span class="o">:::</span><span class="n">predict.xgb.Booster</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="n">ncol</span><span class="p">(</span><span class="n">Boston</span><span class="p">)]),</span><span class="w"> </span><span class="n">ntreelimit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fit</span><span class="o">$</span><span class="n">best_iteration</span><span class="p">)</span><span class="w">

  </span><span class="n">xgb_end</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">boston_xgb_time</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">xgb_end</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">xgb_start</span><span class="w">

  </span><span class="n">boston_xgb_te</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MLmetrics</span><span class="o">::</span><span class="n">RMSE</span><span class="p">(</span><span class="n">Boston</span><span class="p">[</span><span class="n">samp_</span><span class="p">,</span><span class="w"> </span><span class="s1">'medv'</span><span class="p">],</span><span class="w"> </span><span class="n">p_te</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span>

<span class="w">
</span><span class="n">fold</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">1</span><span class="w"> 
</span><span class="n">fold</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">2</span><span class="w"> 
</span><span class="n">fold</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">3</span><span class="w"> 
</span><span class="n">fold</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">4</span><span class="w"> 
</span><span class="n">fold</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">5</span><span class="w"> 

</span>

<span class="w">
</span><span class="n">cat</span><span class="p">(</span><span class="s2">"total time rgf 5 fold cross-validation : "</span><span class="p">,</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">boston_rgf_time</span><span class="p">),</span><span class="w"> </span><span class="s2">" mean rmse on test data : "</span><span class="p">,</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">boston_rgf_te</span><span class="p">),</span><span class="w"> </span><span class="s2">"\n"</span><span class="p">)</span><span class="w">

</span><span class="n">cat</span><span class="p">(</span><span class="s2">"total time ranger 5 fold cross-validation : "</span><span class="p">,</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">boston_ranger_time</span><span class="p">),</span><span class="w"> </span><span class="s2">" mean rmse on test data : "</span><span class="p">,</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">boston_ranger_te</span><span class="p">),</span><span class="w"> </span><span class="s2">"\n"</span><span class="p">)</span><span class="w">

</span><span class="n">cat</span><span class="p">(</span><span class="s2">"total time xgb 5 fold cross-validation : "</span><span class="p">,</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">boston_xgb_time</span><span class="p">),</span><span class="w"> </span><span class="s2">" mean rmse on test data : "</span><span class="p">,</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">boston_xgb_te</span><span class="p">),</span><span class="w"> </span><span class="s2">"\n"</span><span class="p">)</span><span class="w">

</span>

<span class="w">
</span><span class="n">total</span><span class="w"> </span><span class="n">time</span><span class="w"> </span><span class="n">rgf</span><span class="w"> </span><span class="m">5</span><span class="w"> </span><span class="n">fold</span><span class="w"> </span><span class="n">cross</span><span class="o">-</span><span class="n">validation</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">0.7730639</span><span class="w">  </span><span class="n">mean</span><span class="w"> </span><span class="n">rmse</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">3.832135</span><span class="w"> 
</span><span class="n">total</span><span class="w"> </span><span class="n">time</span><span class="w"> </span><span class="n">ranger</span><span class="w"> </span><span class="m">5</span><span class="w"> </span><span class="n">fold</span><span class="w"> </span><span class="n">cross</span><span class="o">-</span><span class="n">validation</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">3.826846</span><span class="w">  </span><span class="n">mean</span><span class="w"> </span><span class="n">rmse</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">4.17419</span><span class="w"> 
</span><span class="n">total</span><span class="w"> </span><span class="n">time</span><span class="w"> </span><span class="n">xgb</span><span class="w"> </span><span class="m">5</span><span class="w"> </span><span class="n">fold</span><span class="w"> </span><span class="n">cross</span><span class="o">-</span><span class="n">validation</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">0.4316094</span><span class="w">  </span><span class="n">mean</span><span class="w"> </span><span class="n">rmse</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">test</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">:</span><span class="w">  </span><span class="m">3.949122</span><span class="w"> 

</span>
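A note on the timings above: subtracting two `Sys.time()` values returns a `difftime` whose unit is chosen automatically (seconds, minutes, ...), and storing it in a numeric vector keeps only the number. The runs here are short, so the totals are presumably all in seconds, but for longer fits it might be safer to fix the unit explicitly. A minimal sketch, where `time_secs` is a hypothetical helper (not part of the RGF package) and `Sys.sleep()` stands in for the fit and predict steps:

```r
# Hypothetical helper: evaluate an expression and return the elapsed
# wall-clock time, always expressed in seconds.
time_secs = function(expr) {
  start = Sys.time()
  force(expr)                                   # the lazily passed expression is evaluated here
  as.numeric(difftime(Sys.time(), start, units = "secs"))
}

fold_time = rep(NA_real_, 3)

for (i in seq_along(fold_time)) {
  fold_time[i] = time_secs(Sys.sleep(0.05))     # stand-in for fit + predict
}

cat("total time (secs) : ", sum(fold_time), "\n")
```

Because the unit is fixed, the per-fold values can be summed and compared across algorithms even if one of them needs more than a minute per fold.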

5 threads (high-dimensional dataset and presence of multicollinearity)

For the high-dimensional data (which can be downloaded from my Github repository) I’ll use the FastRGF_Regressor rather than the RGF_Regressor, and I’ll compare the algorithms without any hyper-parameter tuning:

<span class="w">
</span><span class="c1"># download the data from my Github repository (tested on a Linux OS)</span><span class="w">

</span><span class="n">system</span><span class="p">(</span><span class="s2">"wget https://raw.githubusercontent.com/mlampros/DataSets/master/africa_soil_train_data.zip"</span><span class="p">)</span><span class="w">


</span><span class="c1"># load the data in the R session</span><span class="w">

</span><span class="n">train_dat</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">read.table</span><span class="p">(</span><span class="n">unz</span><span class="p">(</span><span class="s2">"africa_soil_train_data.zip"</span><span class="p">,</span><span class="w"> </span><span class="s2">"train.csv"</span><span class="p">),</span><span class="w"> </span><span class="n">nrows</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1157</span><span class="p">,</span><span class="w"> </span><span class="n">header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">,</span><span class="w"> </span><span class="n">quote</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"\""</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">","</span><span class="p">)</span><span class="w">


</span><span class="c1"># c("Ca", "P", "pH", "SOC", "Sand") : response variables            </span><span class="w">


</span><span class="c1"># exclude response-variables and factor variable</span><span class="w">

</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">train_dat</span><span class="p">[,</span><span class="w"> </span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">colnames</span><span class="p">(</span><span class="n">train_dat</span><span class="p">)</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"Ca"</span><span class="p">,</span><span class="w"> </span><span class="s2">"P"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pH"</span><span class="p">,</span><span class="w"> </span><span class="s2">"SOC"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Sand"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Depth"</span><span class="p">)))]</span><span class="w">


</span><span class="c1"># take (arbitrarily) the first of the response variables as the target</span><span class="w">

</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">train_dat</span><span class="p">[,</span><span class="w"> </span><span class="s2">"Ca"</span><span class="p">]</span><span class="w">


</span><span class="c1"># dataset for ranger</span><span class="w">

</span><span class="n">tmp_rg_dat</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cbind</span><span class="p">(</span><span class="n">Ca</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">)</span><span class="w">


</span><span class="c1"># cross-validation folds</span><span class="w">

</span><span class="n">set.seed</span><span class="p">(</span><span class="m">2</span><span class="p">)</span><span class="w">
</span><span class="n">FOLDS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">regr_folds</span><span class="p">(</span><span class="n">folds</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">NUM_FOLDS</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">stratified</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">


</span><span class="n">highdim_rgf_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">highdim_ranger_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">highdim_xgb_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">highdim_rgf_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">highdim_ranger_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">highdim_xgb_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="kc">NA</span><span class="p">,</span><span class="w"> </span><span class="n">NUM_FOLDS</span><span class="p">)</span><span class="w">


</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">

  </span><span class="n">cat</span><span class="p">(</span><span class="s2">"fold : "</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="s2">"\n"</span><span class="p">)</span><span class="w">

  </span><span class="n">new_samp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">[</span><span class="o">-</span><span class="n">i</span><span class="p">])</span><span class="w">
  </span><span class="n">new_samp_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">unlist</span><span class="p">(</span><span class="n">FOLDS</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w">


  </span><span class="c1"># RGF</span><span class="w">
  </span><span class="c1">#----</span><span class="w">

  </span><span class="n">rgf_start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">init_regr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">FastRGF_Regressor</span><span class="o">$</span><span class="n">new</span><span class="p">(</span><span class="n">n_jobs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">5</span><span class="p">,</span><span class="w"> </span><span class="n">l</span><span class="m">2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">)</span><span class="w">                  </span><span class="c1"># I added 'l2' regularization</span><span class="w">

  </span><span class="n">init_regr</span><span class="o">$</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">new_samp</span><span class="p">,</span><span class="w"> </span><span class="p">]),</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">y</span><span class="p">[</span><span class="n">new_samp</span><span class="p">])</span><span class="w">

  </span><span class="n">pr_te</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">init_regr</span><span class="o">$</span><span class="n">predict</span><span class="p">(</span><span class="n">as.matrix</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">new_samp_</span><span class="p">,</span><span class="w"> </span><span class="p">]))</span><span class="w">

  </span><span class="n">rgf_end</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sys.time</span><span class="p">()</span><span class="w">

  </span><span class="n">highdim_rgf_time</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rgf_end</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">rgf_start</span><span class="w">

  </span><span class="n">highdim_rgf_te</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MLmetrics</span><span class="o">::</span><span class="n">RMSE</span><span class="p">(</span>...
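Since this section's title mentions multicollinearity, a rough way to check for it is to count the pairs of predictors whose absolute correlation exceeds some threshold. The sketch below uses a small synthetic matrix rather than the soil data, and both the object names and the 0.9 cutoff are arbitrary choices made for illustration:

```r
set.seed(1)

n = 200
base = rnorm(n)

# synthetic predictors: three near-duplicates of 'base' plus two independent columns
x_demo = cbind(a = base,
               b = base + rnorm(n, sd = 0.05),
               c = base + rnorm(n, sd = 0.05),
               d = rnorm(n),
               e = rnorm(n))

cor_mat = cor(x_demo)

# look only at the upper triangle, so that each pair is counted once
high = which(abs(cor_mat) > 0.9 & upper.tri(cor_mat), arr.ind = TRUE)

cat("highly correlated pairs : ", nrow(high), "\n")
```

The same lines could be applied to `x` from the code above, though for a wide data set the correlation matrix can become large.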
