Disclaimer: I have no affiliation with The Next Web (cf. online article)

A few weeks ago I read this interesting and accessible article about explainable AI, which focuses on the question of self-explainable AI. I’m no longer sure that AI models which explain themselves are strictly necessary, since model-agnostic tools such as the teller, among many others, can help explain them.

With that being said, the new `LSBoost` algorithm implemented in mlsauce does explain itself. `LSBoost` is a cousin of the `LS_Boost` algorithm introduced in Greedy Function Approximation: A Gradient Boosting Machine (GFAGBM). GFAGBM’s `LS_Boost` is outlined in Algorithm 2 of that paper.

So, what makes the new `LSBoost` different, you would be legitimately entitled to ask? Well, about the seemingly new name: I actually misspelled `LS_Boost` in my code in the first place! So the name will stay as it is, now and forever. Beyond the name, the new `LSBoost` differs as follows:

• Page 1203, section 5 of GFAGBM is used: `LSBoost` contains a learning rate, which can accelerate or slow down the convergence of residuals towards 0, and hence make the model overfit quickly or slowly.
• Function h (referring to Algorithm 2 in GFAGBM) returns a columnwise concatenation of x and a so-called neuron or node.

• a (referring to Algorithm 2 in GFAGBM) contains elements of a matrix of simulated uniform random numbers, whose size can be controlled, in the fashion of randomized networks.
• Both columns and rows of X (containing x’s) can be subsampled, in order to increase the diversity of the weak learners h fitting the successive residuals.
• Instead of optimizing least squares at line 4 of Algorithm 2, penalized least squares are used. Currently, ridge regression is implemented, and its bias has the effect of slowing down the convergence of residuals towards 0.
• An early stopping criterion is implemented, and is based on the magnitude of successive residuals.
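
To make the list above more concrete, here is a minimal NumPy sketch of a boosting loop with these ingredients. It is not mlsauce’s implementation: the function names, the `tanh` activation, the range of the uniform weights and the exact stopping rule are all assumptions made for illustration, and the argument names only mirror the hyperparameters printed by `get_params()` later in this post. Row and column subsampling are left out to keep it short.

```python
import numpy as np


def fit_boosting_sketch(X, y, n_estimators=100, learning_rate=0.1,
                        n_hidden_features=5, reg_lambda=0.1,
                        tolerance=1e-4, seed=123):
    """Boost ridge regressions on [x, tanh(xW)] features (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    intercept = float(np.mean(y))          # constant initial fit
    residuals = y - intercept
    learners = []                          # one (W, beta) pair per iteration
    for _ in range(n_estimators):
        # a: simulated uniform random weights defining the hidden nodes
        W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden_features))
        # h(x; a): columnwise concatenation of x and the nodes (tanh is assumed)
        H = np.hstack([X, np.tanh(X @ W)])
        # penalized (ridge) least squares on the current residuals
        gram = H.T @ H + reg_lambda * np.eye(H.shape[1])
        beta = np.linalg.solve(gram, H.T @ residuals)
        new_residuals = residuals - learning_rate * (H @ beta)
        learners.append((W, beta))
        # early stopping when successive residual norms barely change
        old_norm = np.linalg.norm(residuals)
        new_norm = np.linalg.norm(new_residuals)
        residuals = new_residuals
        if abs(old_norm - new_norm) <= tolerance * old_norm:
            break
    return {"intercept": intercept, "learning_rate": learning_rate,
            "learners": learners}


def predict_boosting_sketch(model, X):
    """Sum the shrunken contributions of all fitted weak learners."""
    pred = np.full(X.shape[0], model["intercept"])
    for W, beta in model["learners"]:
        H = np.hstack([X, np.tanh(X @ W)])
        pred += model["learning_rate"] * (H @ beta)
    return pred


# toy usage on simulated data (hypothetical example, not from the benchmarks below)
rng_demo = np.random.default_rng(0)
X_demo = rng_demo.normal(size=(200, 3))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1]
model_demo = fit_boosting_sketch(X_demo, y_demo)
rmse_demo = np.sqrt(np.mean((predict_boosting_sketch(model_demo, X_demo) - y_demo) ** 2))
print(len(model_demo["learners"]), rmse_demo)
```

A smaller `learning_rate` or a larger `reg_lambda` shrinks each update, so the residuals approach 0 more slowly, which is the trade-off described in the first and fifth bullet points.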

Besides this, we can also remark that `LSBoost` is as explainable as a linear model, while being a highly nonlinear one. Indeed, using some calculus, it’s possible to compute derivatives of F (still referring to Algorithm 2) relative to x, wherever the function h admits a derivative.
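
As an illustration of what such derivatives look like in practice, the sketch below approximates them numerically with central finite differences for any prediction function. This is a generic, model-agnostic approximation and not the analytic computation mentioned above; the function name `numerical_gradients` and the step size `eps` are hypothetical choices.

```python
import numpy as np


def numerical_gradients(predict, X, eps=1e-4):
    """Approximate d predict / d x_j at each row of X by central differences."""
    X = np.asarray(X, dtype=float)
    n_obs, n_features = X.shape
    grads = np.empty((n_obs, n_features))
    for j in range(n_features):
        step = np.zeros(n_features)
        step[j] = eps                      # perturb only the j-th feature
        grads[:, j] = (predict(X + step) - predict(X - step)) / (2.0 * eps)
    return grads


# toy usage: F(x) = 3 * x_0 + x_1 ** 2, so dF/dx_0 = 3 and dF/dx_1 = 2 * x_1
X_toy = np.array([[0.0, 1.0], [2.0, -1.5]])
grads_toy = numerical_gradients(lambda Z: 3.0 * Z[:, 0] + Z[:, 1] ** 2, X_toy)
print(grads_toy)
```

Averaging each column of the returned matrix gives one sensitivity per feature, which can be read in the same way as the coefficients of a linear model.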

In the following Python and R examples (both tested on Linux and macOS so far), we’ll use `LSBoost` with default hyperparameters to solve regression and classification problems. There’s still some room for improving the models’ performance.

# I – Python version

## I – 0 – Install and import packages

Install mlsauce (command line)

``````
pip install mlsauce
``````

Import packages

``````

import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from time import time
from os import chdir
from sklearn import metrics
# used for the classification reports below
from sklearn.metrics import classification_report

import mlsauce as ms

``````

## I – 1 – Classification

### I – 1 – 1 Breast cancer dataset

``````

# data 1
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target
# split data into training set and test set
np.random.seed(15029)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2)

print("dataset 1 -- breast cancer -----")

print(X.shape)
obj = ms.LSBoostClassifier()
# using default parameters
print(obj.get_params())

start = time()
obj.fit(X_train, y_train)
print(time()-start)
start = time()
print(obj.score(X_test, y_test))
print(time()-start)

# classification report
y_pred = obj.predict(X_test)
print(classification_report(y_test, y_pred))

``````
``````
dataset 1 -- breast cancer -----

(569, 30)

{'backend': 'cpu', 'col_sample': 1, 'direct_link': 1, 'dropout': 0, 'learning_rate': 0.1, 'n_estimators': 100, 'n_hidden_features': 5, 'reg_lambda': 0.1, 'row_sample': 1, 'seed': 123, 'tolerance': 0.0001, 'verbose': 1}

0.16006875038146973
0.9473684210526315
0.015897750854492188

              precision    recall  f1-score   support

           0       1.00      0.86      0.92        42
           1       0.92      1.00      0.96        72

    accuracy                           0.95       114
   macro avg       0.96      0.93      0.94       114
weighted avg       0.95      0.95      0.95       114

``````

### I – 1 – 2 Wine dataset

``````

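# data 2
from sklearn.datasets import load_wine
wine = load_wine()
Z = wine.data
t = wine.target
# split data into training set and test set
np.random.seed(879423)
X_train, X_test, y_train, y_test = train_test_split(Z, t,
                                                    test_size=0.2)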

print("dataset 2 -- wine -----")

print(Z.shape)
obj = ms.LSBoostClassifier()
# using default parameters
print(obj.get_params())

start = time()
obj.fit(X_train, y_train)
print(time()-start)
start = time()
print(obj.score(X_test, y_test))
print(time()-start)

# classification report
y_pred = obj.predict(X_test)
print(classification_report(y_test, y_pred))

``````
``````
dataset 2 -- wine -----

(178, 13)

{'backend': 'cpu', 'col_sample': 1, 'direct_link': 1, 'dropout': 0, 'learning_rate': 0.1, 'n_estimators': 100, 'n_hidden_features': 5, 'reg_lambda': 0.1, 'row_sample': 1, 'seed': 123, 'tolerance': 0.0001, 'verbose': 1}

0.1548290252685547
0.9722222222222222
0.021778583526611328

              precision    recall  f1-score   support

           0       1.00      0.93      0.97        15
           1       0.92      1.00      0.96        12
           2       1.00      1.00      1.00         9

    accuracy                           0.97        36
   macro avg       0.97      0.98      0.98        36
weighted avg       0.97      0.97      0.97        36

``````

### I – 1 – 3 Iris dataset

``````

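# data 3
from sklearn.datasets import load_iris
iris = load_iris()
Z = iris.data
t = iris.target
# split data into training set and test set
np.random.seed(734563)
X_train, X_test, y_train, y_test = train_test_split(Z, t,
                                                    test_size=0.2)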

print("dataset 3 -- iris -----")

print(Z.shape)
obj = ms.LSBoostClassifier()
# using default parameters
print(obj.get_params())

start = time()
obj.fit(X_train, y_train)
print(time()-start)
start = time()
print(obj.score(X_test, y_test))
print(time()-start)

# classification report
y_pred = obj.predict(X_test)
print(classification_report(y_test, y_pred))

``````
``````
dataset 3 -- iris -----

(150, 4)

{'backend': 'cpu', 'col_sample': 1, 'direct_link': 1, 'dropout': 0, 'learning_rate': 0.1, 'n_estimators': 100, 'n_hidden_features': 5, 'reg_lambda': 0.1, 'row_sample': 1, 'seed': 123, 'tolerance': 0.0001, 'verbose': 1}

100%|██████████| 100/100 [00:00<00:00, 1157.03it/s]

0.0932917594909668
0.9666666666666667
0.007458209991455078

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        13
           1       1.00      0.90      0.95        10
           2       0.88      1.00      0.93         7

    accuracy                           0.97        30
   macro avg       0.96      0.97      0.96        30
weighted avg       0.97      0.97      0.97        30

``````

## I – 2 – Regression

### I – 2 – 1 Boston dataset

``````

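# data 1
# note: load_boston was removed in scikit-learn 1.2,
# so this example needs an older scikit-learn release
from sklearn.datasets import load_boston
boston = load_boston()
X = boston.data
y = boston.target
# split data into training set and test set
np.random.seed(15029)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2)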

print("dataset 4 -- boston -----")

print(X.shape)
obj = ms.LSBoostRegressor()
# using default parameters
print(obj.get_params())

start = time()
obj.fit(X_train, y_train)
print(time()-start)
start = time()
print(np.sqrt(np.mean(np.square(obj.predict(X_test) - y_test))))
print(time()-start)

``````
``````
dataset 4 -- boston -----

(506, 13)

{'backend': 'cpu', 'col_sample': 1, 'direct_link': 1, 'dropout': 0, 'learning_rate': 0.1, 'n_estimators': 100, 'n_hidden_features': 5, 'reg_lambda': 0.1, 'row_sample': 1, 'seed': 123, 'tolerance': 0.0001, 'verbose': 1}

100%|██████████| 100/100 [00:00<00:00, 896.24it/s]
``````

### I – 2 – 2 Diabetes dataset

``````

# data 2
from sklearn.datasets import load_diabetes
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target
# split data into training set and test set
np.random.seed(15029)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2)

print("dataset 5 -- diabetes -----")

print(X.shape)
obj = ms.LSBoostRegressor()
# using default parameters
print(obj.get_params())

start = time()
obj.fit(X_train, y_train)
print(time()-start)
start = time()
print(np.sqrt(np.mean(np.square(obj.predict(X_test) - y_test))))
print(time()-start)

``````
``````
dataset 5 -- diabetes -----

(442, 10)

{'backend': 'cpu', 'col_sample': 1, 'direct_link': 1, 'dropout': 0, 'learning_rate': 0.1, 'n_estimators': 100, 'n_hidden_features': 5, 'reg_lambda': 0.1, 'row_sample': 1, 'seed': 123, 'tolerance': 0.0001, 'verbose': 1}

100%|██████████| 100/100 [00:00<00:00, 1000.60it/s]

0.10351037979125977
55.867989174555625
0.012843847274780273

``````

# II – R version

## II – 0 – Install and import packages

``````

library(devtools)
devtools::install_github("thierrymoudiki/mlsauce/R-package")
library(mlsauce)

``````

## II – 1 – Classification

``````

library(datasets)

X <- as.matrix(iris[, 1:4])
y <- as.integer(iris[, 5]) - 1L

n <- dim(X)[1]
p <- dim(X)[2]
set.seed(21341)
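# rows are drawn with replacement (bootstrap sample), so the test set,
# made of the rows never drawn, holds more than 20% of the data
train_index <- sample(x = 1:n, size = floor(0.8*n), replace = TRUE)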
test_index <- -train_index
X_train <- as.matrix(X[train_index, ])
y_train <- as.integer(y[train_index])
X_test <- as.matrix(X[test_index, ])
y_test <- as.integer(y[test_index])

# using default parameters
obj <- mlsauce::LSBoostClassifier()

start <- proc.time()[3]
obj$fit(X_train, y_train)
print(proc.time()[3] - start)

start <- proc.time()[3]
print(obj$score(X_test, y_test))
print(proc.time()[3] - start)

``````
``````
elapsed
0.051
0.9253731
elapsed
0.011
``````

## II – 2 – Regression

``````

library(datasets)

X <- as.matrix(datasets::mtcars[, -1])
y <- as.integer(datasets::mtcars[, 1])

n <- dim(X)[1]
p <- dim(X)[2]
set.seed(21341)
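# rows are drawn with replacement (bootstrap sample), so the test set,
# made of the rows never drawn, holds more than 20% of the data
train_index <- sample(x = 1:n, size = floor(0.8*n), replace = TRUE)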
test_index <- -train_index
X_train <- as.matrix(X[train_index, ])
y_train <- as.double(y[train_index])
X_test <- as.matrix(X[test_index, ])
y_test <- as.double(y[test_index])

# using default parameters
obj <- mlsauce::LSBoostRegressor()

start <- proc.time()[3]
obj$fit(X_train, y_train)
print(proc.time()[3] - start)

start <- proc.time()[3]
print(sqrt(mean((obj$predict(X_test) - y_test)^2)))
print(proc.time()[3] - start)

``````
``````
elapsed
0.044
6.482376
elapsed
0.01
``````
