Extreme Learning Machine

[This article was first published on mlampros, and kindly contributed to R-bloggers].

As of 2018-06-17 the elmNN package is archived on CRAN. It was one of the machine-learning packages I used when I started learning R (it also returns results pretty fast), and last week I needed it for a personal task, so I decided to reimplement the R code in Rcpp. It didn't take long, because the original author wrote the R package in a clear way. In the next lines I'll explain the differences and the functionality, just for reference.

Differences between the elmNN (R package) and the elmNNRcpp (Rcpp package)

  • The reimplementation assumes that both the predictors ( x ) and the response variable ( y ) are in the form of a matrix. This means that character, factor or boolean columns have to be transformed (one-hot encoding would be an option) before using either the elm_train or the elm_predict function.
  • The output predictions are in the form of a matrix. In case of regression the matrix has one column, whereas in case of classification the number of columns equals the number of unique labels.
  • In case of classification the unique labels should be consecutive integers beginning from 0, i.e. the difference between successive labels should equal 1. For instance, unique_labels = c(0, 1, 2, 3) is acceptable, whereas unique_labels = c(0, 2, 3, 4) will raise an error.
  • I renamed the poslin activation to relu, as it's easier to remember ( both share the same properties ). Moreover, I added the leaky_relu_alpha parameter, so that if its value is greater than 0.0 a leaky-relu activation is used for the single hidden layer.
  • The initialization weights in elmNN were set by default to uniform in the range [-1, 1] ( 'uniform_negative' ). I added two more options : 'normal_gaussian' ( in the range [0, 1] ) and 'uniform_positive' ( in the range [0, 1] ).
  • The user has the option to include or exclude the bias of the one-layer feed-forward neural network.
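The relu and leaky-relu activations mentioned above have one-line definitions. The base-R sketch below mirrors the actfun / leaky_relu_alpha naming, but it is only illustrative and not the package's Rcpp internals.

```r
# relu: max(x, 0) elementwise
relu = function(x) pmax(x, 0)

# leaky relu: keeps a small slope 'alpha' for negative inputs
leaky_relu = function(x, alpha = 0.01) ifelse(x >= 0, x, alpha * x)

relu(c(-2, 0, 3))               # 0 0 3
leaky_relu(c(-2, 0, 3), 0.01)   # -0.02 0.00 3.00
```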

The elmNNRcpp functions

The functions included in the elmNNRcpp package are the following, and details for each parameter can be found in the package documentation:

elmNNRcpp
elm_train(x, y, nhid, actfun, init_weights = "normal_gaussian", bias = FALSE, ...)
elm_predict(elm_train_object, newdata, normalize = FALSE)
onehot_encode(y)

elmNNRcpp in case of Regression

The following code chunk shows how to use elm_train in case of regression and compares the results with the lm ( linear model ) base function:

# load the data and split it in two parts
#----------------------------------------

data(Boston, package = 'KernelKnn')

library(elmNNRcpp)

Boston = as.matrix(Boston)
dimnames(Boston) = NULL

X = Boston[, -dim(Boston)[2]]
xtr = X[1:350, ]
xte = X[351:nrow(X), ]


# prepare / convert the train-data-response to a one-column matrix
#-----------------------------------------------------------------

ytr = matrix(Boston[1:350, dim(Boston)[2]], nrow = length(Boston[1:350, dim(Boston)[2]]),
             ncol = 1)


# perform a fit and predict [ elmNNRcpp ]
#----------------------------------------

fit_elm = elm_train(xtr, ytr, nhid = 1000, actfun = 'purelin',
                    init_weights = "uniform_negative", bias = TRUE, verbose = TRUE)

## Input weights will be initialized ...
## Dot product of input weights and data starts ...
## Bias will be added to the dot product ...
## 'purelin' activation function will be utilized ...
## The computation of the Moore-Pseudo-inverse starts ...
## The computation is finished!
##
## Time to complete : 0.09112573 secs

pr_te_elm = elm_predict(fit_elm, xte)


# perform a fit and predict [ lm ]
#---------------------------------

data(Boston, package = 'KernelKnn')

fit_lm = lm(medv ~ ., data = Boston[1:350, ])

pr_te_lm = predict(fit_lm, newdata = Boston[351:nrow(X), ])


# evaluation metric
#------------------

rmse = function (y_true, y_pred) {

  out = sqrt(mean((y_true - y_pred)^2))

  out
}


# test data response variable
#----------------------------

yte = Boston[351:nrow(X), dim(Boston)[2]]


# root mean squared error for 'elm' and 'lm'
#-------------------------------------------

cat('the rmse error for extreme-learning-machine is :', rmse(yte, pr_te_elm[, 1]), '\n')

## the rmse error for extreme-learning-machine is : 22.00705

cat('the rmse error for linear-model is :', rmse(yte, pr_te_lm), '\n')

## the rmse error for linear-model is : 23.36543
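Conceptually, the elm_train call above performs three steps: draw random input weights (and bias), apply the activation to get the hidden-layer output, and solve for the output weights in closed form via the Moore-Penrose pseudo-inverse. The base-R sketch below illustrates this on simulated data; the variable names are illustrative, and it uses MASS::ginv rather than the package's Rcpp routine.

```r
library(MASS)   # ginv() : Moore-Penrose pseudo-inverse

set.seed(1)
n = 100; p = 5; nhid = 20
x = matrix(rnorm(n * p), n, p)                              # predictors
y = matrix(x %*% rnorm(p) + rnorm(n, sd = 0.1), ncol = 1)   # noisy linear response

W = matrix(runif(p * nhid, -1, 1), p, nhid)   # random input weights ('uniform_negative')
b = runif(nhid, -1, 1)                        # random bias
H = tanh(sweep(x %*% W, 2, b, '+'))           # hidden-layer output (tanh activation)

beta = ginv(H) %*% y                          # output weights, closed-form least squares

preds = H %*% beta                            # in-sample predictions
```

Because the output weights come from a single linear solve, with no iterative backpropagation, training is fast, which matches the timings printed by elm_train.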

elmNNRcpp in case of Classification

The following code chunk illustrates how elm_train can be used in classification and compares the results with the glm ( generalized linear model ) base function:

# load the data
#--------------

data(ionosphere, package = 'KernelKnn')

y_class = ionosphere[, ncol(ionosphere)]

x_class = ionosphere[, -c(2, ncol(ionosphere))]     # second column has 1 unique value

x_class = scale(x_class[, -ncol(x_class)])

x_class = as.matrix(x_class)                        # convert to matrix
dimnames(x_class) = NULL


# split data in train-test
#-------------------------

xtr_class = x_class[1:200, ]
xte_class = x_class[201:nrow(ionosphere), ]

ytr_class = as.numeric(y_class[1:200])
yte_class = as.numeric(y_class[201:nrow(ionosphere)])

ytr_class = onehot_encode(ytr_class - 1)            # class labels should begin from 0 (subtract 1)


# perform a fit and predict [ elmNNRcpp ]
#----------------------------------------

fit_elm_class = elm_train(xtr_class, ytr_class, nhid = 1000, actfun = 'relu',
                          init_weights = "uniform_negative", bias = TRUE, verbose = TRUE)

## Input weights will be initialized ...
## Dot product of input weights and data starts ...
## Bias will be added to the dot product ...
## 'relu' activation function will be utilized ...
## The computation of the Moore-Pseudo-inverse starts ...
## The computation is finished!
##
## Time to complete : 0.03604198 secs

pr_elm_class = elm_predict(fit_elm_class, xte_class, normalize = FALSE)

pr_elm_class = max.col(pr_elm_class, ties.method = "random")


# perform a fit and predict [ glm ]
#----------------------------------

data(ionosphere, package = 'KernelKnn')

fit_glm = glm(class ~ ., data = ionosphere[1:200, -2], family = binomial(link = 'logit'))

pr_glm = predict(fit_glm, newdata = ionosphere[201:nrow(ionosphere), -2], type = 'response')

pr_glm = as.vector(ifelse(pr_glm < 0.5, 1, 2))


# accuracy for 'elm' and 'glm'
#-----------------------------

cat('the accuracy for extreme-learning-machine is :', mean(yte_class == pr_elm_class), '\n')

## the accuracy for extreme-learning-machine is : 0.9337748

cat('the accuracy for glm is :', mean(yte_class == pr_glm), '\n')

## the accuracy for glm is : 0.8940397
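As a side note, the onehot_encode step above can be mimicked in a few lines of base R. The sketch below is illustrative (not the package implementation) and assumes the labels are 0-based consecutive integers, as the package requires.

```r
# one-hot encode a vector of 0-based integer labels into a matrix
onehot_sketch = function(y) {
  out = matrix(0, nrow = length(y), ncol = length(unique(y)))
  out[cbind(seq_along(y), y + 1)] = 1   # set one entry per row
  out
}

onehot_sketch(c(0, 1, 2, 1))
# row-wise: (1,0,0) (0,1,0) (0,0,1) (0,1,0)
```

max.col on such a matrix (as used after elm_predict above) recovers the labels, shifted by one: max.col(onehot_sketch(y)) - 1 equals y.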

Classify MNIST digits using elmNNRcpp

I found an interesting Python implementation on the web and thought I'd give it a try to reproduce the results. I downloaded the MNIST data from my GitHub repository and used the following parameter setting:

# using system('wget..') on a linux OS
#-------------------------------------

system("wget https://raw.githubusercontent.com/mlampros/DataSets/master/mnist.zip")

mnist <- read.table(unz("mnist.zip", "mnist.csv"), nrows = 70000, header = TRUE,
                    quote = "\"", sep = ",")

x = mnist[, -ncol(mnist)]

y = mnist[, ncol(mnist)]

y_expand = onehot_encode(y)


# split the data randomly in train-test
#--------------------------------------

idx_train = sample(1:nrow(y_expand), round(0.85 * nrow(y_expand)))

idx_test = setdiff(1:nrow(y_expand), idx_train)

fit = elm_train(as.matrix(x[idx_train, ]), y_expand[idx_train, ], nhid = 2500,
                actfun = 'relu', init_weights = 'uniform_negative', bias = TRUE,
                verbose = TRUE)

# Input weights will be initialized ...
# Dot product of input weights and data starts ...
# Bias will be added to the dot product ...
# 'relu' activation function will be utilized ...
# The computation of the Moore-Pseudo-inverse starts ...
# The computation is finished!
#
# Time to complete : 1.607153 mins


# predictions for test-data
#--------------------------

pr_test = elm_predict(fit, newdata = as.matrix(x[idx_test, ]))

pr_max_col = max.col(pr_test, ties.method = "random")

y_true = max.col(y_expand[idx_test, ])


cat('Accuracy ( Mnist data ) :', mean(pr_max_col == y_true), '\n')

# Accuracy ( Mnist data ) : 0.9613
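With the predicted and true labels as integer vectors (the roles pr_max_col and y_true play above), base R's table gives a quick per-class view of the errors; the vectors below are made up for illustration.

```r
pred = c(1, 2, 2, 3, 1)   # hypothetical predicted labels
true = c(1, 2, 3, 3, 1)   # hypothetical true labels

table(predicted = pred, actual = true)   # confusion matrix

mean(pred == true)                       # overall accuracy : 0.8
```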

An updated version of the elmNNRcpp package can be found in my GitHub repository. To report bugs or issues please use the following link: https://github.com/mlampros/elmNNRcpp/issues.
