# On model specification, identification, degrees of freedom and regularization

**T. Moudiki's Webpage - R**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I had a lot of fun this week, revisiting this blog post (Monte Carlo simulation of a 2-factor interest rates model with ESGtoolkit) I wrote a few years ago in 2014 â€“ that somehow generated a heatwave. This 2020 post is about model **specification, identification, degrees of freedom** and **regularization**. The first part is on **Monte Carlo simulation for financial pricing**, and the second part on optimization in **deep learning neural networks**. I wonâ€™t draw a lot of conclusions here, but will let you draw your own. Of course, feel free to **reach out** if something seems/sounds wrong to you. Thatâ€™s still the best way to deal with issues.

### Simulation of a G2++ short rates model

Letâ€™s start by loading **ESGtoolkit** for the first part of this post:

```
# In R console
suppressPackageStartupMessages(library(ESGtoolkit))
```

G2++ Model input **parameters**:

```
# Observed maturities
u <- 1:30
# Yield to maturities
txZC <- c(0.01422,0.01309,0.01380,0.01549,0.01747,0.01940,
0.02104,0.02236,0.02348, 0.02446,0.02535,0.02614,
0.02679,0.02727,0.02760,0.02779,0.02787,0.02786,
0.02776,0.02762,0.02745,0.02727,0.02707,0.02686,
0.02663,0.02640,0.02618,0.02597,0.02578,0.02563)
# Zero-coupon prices = 'Observed' market prices
p <- c(0.9859794,0.9744879,0.9602458,0.9416551,0.9196671,
0.8957363,0.8716268,0.8482628,0.8255457,0.8034710,
0.7819525,0.7612204,0.7416912,0.7237042,0.7072136
,0.6922140,0.6785227,0.6660095,0.6546902,0.6441639,
0.6343366,0.6250234,0.6162910,0.6080358,0.6003302,
0.5929791,0.5858711,0.5789852,0.5722068,0.5653231)
```

G2++ **simulation function** (HCSPL stands for Hermite Cubic Spline interpolation of the Yield Curve):

```
# Function of the number of scenarios
simG2plus <- function(n, methodyc = "HCSPL", seed=13435,
b_opt=NULL, rho_opt=NULL, eta_opt=NULL,
randomize_params=FALSE)
{
set.seed(seed)
# Horizon, number of simulations, frequency
horizon <- 20
freq <- "semi-annual"
delta_t <- 1/2
# Parameters found for the G2++
a_opt <- 0.50000000 + ifelse(randomize_params, 0.5*runif(1), 0)
if(is.null(b_opt))
b_opt <- 0.35412030 + ifelse(randomize_params, 0.5*runif(1), 0)
sigma_opt <- 0.09416266
if(is.null(rho_opt))
rho_opt <- -0.99855687
if(is.null(eta_opt))
eta_opt <- 0.08439934
print(paste("a:", a_opt))
print(paste("b:", b_opt))
print(paste("sigma:", sigma_opt))
print(paste("rho:", rho_opt))
print(paste("eta:", eta_opt))
# Simulation of gaussian correlated shocks
eps <- ESGtoolkit::simshocks(n = n, horizon = horizon,
frequency = "semi-annual",
family = 1, par = rho_opt)
# Simulation of the factor x
x <- ESGtoolkit::simdiff(n = n, horizon = horizon,
frequency = freq,
model = "OU",
x0 = 0, theta1 = 0, theta2 = a_opt, theta3 = sigma_opt,
eps = eps[[1]])
# Simulation of the factor y
y <- ESGtoolkit::simdiff(n = n, horizon = horizon,
frequency = freq,
model = "OU",
x0 = 0, theta1 = 0, theta2 = b_opt, theta3 = eta_opt,
eps = eps[[2]])
# Instantaneous forward rates, with spline interpolation
methodyc <- match.arg(methodyc)
fwdrates <- ESGtoolkit::esgfwdrates(n = n, horizon = horizon,
out.frequency = freq, in.maturities = u,
in.zerorates = txZC, method = methodyc)
fwdrates <- window(fwdrates, end = horizon)
# phi
t.out <- seq(from = 0, to = horizon,
by = delta_t)
param.phi <- 0.5*(sigma_opt^2)*(1 - exp(-a_opt*t.out))^2/(a_opt^2) +
0.5*(eta_opt^2)*(1 - exp(-b_opt*t.out))^2/(b_opt^2) +
(rho_opt*sigma_opt*eta_opt)*(1 - exp(-a_opt*t.out))*
(1 - exp(-b_opt*t.out))/(a_opt*b_opt)
param.phi <- ts(replicate(n, param.phi),
start = start(x), deltat = deltat(x))
phi <- fwdrates + param.phi
colnames(phi) <- c(paste0("Series ", 1:n))
# The short rates
r <- x + y + phi
colnames(r) <- c(paste0("Series ", 1:n))
return(r)
}
```

Simulations of G2++ for **4 types of parametersâ€™ sets**:

```
r.HCSPL <- simG2plus(n = 10000, methodyc = "HCSPL", seed=123)
r.HCSPL2 <- simG2plus(n = 10000, methodyc = "HCSPL", seed=2020)
r.HCSPL3 <- simG2plus(n = 10000, methodyc = "HCSPL", seed=123,
randomize_params=TRUE)
r.HCSPL4 <- simG2plus(n = 10000, methodyc = "HCSPL", seed=123,
b_opt=1, rho_opt=0, eta_opt=0,
randomize_params=FALSE)
```

Stochastic **discount factors** derived from short rates simulations:

```
deltat_r <- deltat(r.HCSPL)
Dt.HCSPL <- ESGtoolkit::esgdiscountfactor(r = r.HCSPL, X = 1)
Dt.HCSPL <- window(Dt.HCSPL, start = deltat_r, deltat = 2*deltat_r)
Dt.HCSPL2 <- ESGtoolkit::esgdiscountfactor(r = r.HCSPL2, X = 1)
Dt.HCSPL2 <- window(Dt.HCSPL2, start = deltat_r, deltat = 2*deltat_r)
Dt.HCSPL3 <- ESGtoolkit::esgdiscountfactor(r = r.HCSPL3, X = 1)
Dt.HCSPL3 <- window(Dt.HCSPL3, start = deltat_r, deltat = 2*deltat_r)
Dt.HCSPL4 <- ESGtoolkit::esgdiscountfactor(r = r.HCSPL4, X = 1)
Dt.HCSPL4 <- window(Dt.HCSPL4, start = deltat_r, deltat = 2*deltat_r)
```

**Prices** (*observed* vs Monte Carlo for previous 4 examples):

```
# Observed market prices
horizon <- 20
marketprices <- p[1:horizon]
# Monte Carlo prices
## Example 1
montecarloprices.HCSPL <- rowMeans(Dt.HCSPL)
## Example 2
montecarloprices.HCSPL2 <- rowMeans(Dt.HCSPL2)
## Example 3
montecarloprices.HCSPL3 <- rowMeans(Dt.HCSPL3)
## Example 4
montecarloprices.HCSPL4 <- rowMeans(Dt.HCSPL4)
```

**Plots** *observed* prices vs Monte Carlo prices:

```
par(mfrow=c(4, 2))
ESGtoolkit::esgplotbands(r.HCSPL, xlab = 'time', ylab = 'short rate',
main="short rate simulations \n for example 1")
plot(marketprices, col = "blue", type = 'l',
xlab = "time", ylab = "prices", main = "Prices for example 1 \n (observed vs Monte Carlo)")
points(montecarloprices.HCSPL, col = "red")
ESGtoolkit::esgplotbands(r.HCSPL2, xlab = 'time', ylab = 'short rate',
main="short rate simulations \n for example 2")
plot(marketprices, col = "blue", type = 'l',
xlab = "time", ylab = "prices", main = "Prices for example 2 \n (observed vs Monte Carlo)")
points(montecarloprices.HCSPL2, col = "red")
ESGtoolkit::esgplotbands(r.HCSPL3, xlab = 'time', ylab = 'short rate',
main="short rate simulations \n for example 3")
plot(marketprices, col = "blue", type = 'l',
xlab = "time", ylab = "prices", main = "Prices for example 3 \n (observed vs Monte Carlo)")
points(montecarloprices.HCSPL3, col = "red")
ESGtoolkit::esgplotbands(r.HCSPL4, xlab = 'time', ylab = 'short rate',
main="short rate simulations \n for example 4")
plot(marketprices, col = "blue", type = 'l',
xlab = "time", ylab = "prices", main = "Prices for example 4 \n (observed vs Monte Carlo)")
points(montecarloprices.HCSPL4, col = "red")
```

What do we **observe** on these graphs, both on simulations and prices? What will happen if we add a **third factor** to this model, meaning, three more parameters; a G3++/any other *hydra*?

### Optimization in Deep learning neural networks

On a different type of question/problem, but still on the subject of model specification, identification, degrees of freedom and regularization: **Deep learning neural networks**. Some people suggest that if you keep adding parameters (degrees of freedom?) to these models, youâ€™ll **still obtain a good generalization**. Well, thereâ€™s this picture that I like a lot:

When we optimize the loss function in Deep learning neural networks models, we are most likely using gradient descent, which is **fast and scalable**. Still, no matter how sophisticated the gradient descent procedure weâ€™re using, we will likely get stuck into a local minimum â€“ because the loss function is rarely convex.

*Stuck* is a rather unfortunate term here, because itâ€™s not an actual problem, but instead, an indirect way to avoid overtraining. Also, in our gradient descent procedure, we tune the number of epochs (number of iterations in the descent/ascent), the learning rate (how fast we roll in the descent/ascent), in addition to the dropout (randomly dropping out some nodes in networksâ€™ layers), etc. These are also ways to **avoid learning too much, to stop the optimization relatively early, and preserve the modelâ€™s ability to generalize**. They regularize the model, whereas the millions of network nodes serve as degrees of freedom. This is a different problem than the first one we examined, with different objectives, butâ€¦ still on the subject of model specification, identification, degrees of freedom and regularization.

For those who are working from home because of the COVID-19, Iâ€™d recommend this **book about work-life balance**, that I literally devoured a few months ago: REMOTE: Office Not Required (and nope, Iâ€™m not paid to promote this book).

**Note:** I am currently looking for a *gig*. You can hire me on Malt or send me an email: **thierry dot moudiki at pm dot me**. I can do descriptive statistics, data preparation, feature engineering, model calibration, training and validation, and model outputsâ€™ interpretation. I am fluent in Python, R, SQL, Microsoft Excel, Visual Basic (among others) and French. My rÃ©sumÃ©? Here!

**leave a comment**for the author, please follow the link and comment on their blog:

**T. Moudiki's Webpage - R**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.