Arbitrage-free Nelson Siegel model with R code



Sang-Heon Lee

This article explains how to estimate the parameters of the arbitrage-free dynamic Nelson-Siegel (AFNS) model (Christensen, Diebold, and Rudebusch 2009, 2011) with the Kalman filter. Using R code, we estimate not only the parameters but also the filtered estimates of the latent factors: level, slope, and curvature.




1. AFNS model


The AFNS model can be expressed as a state space model consisting of a measurement equation and a state equation as follows.
\begin{align} y_t (\boldsymbol{\tau}) &= B(\boldsymbol{\tau}) X_t - \frac{A(\boldsymbol{\tau})}{\boldsymbol{\tau}} + \epsilon_t (\boldsymbol{\tau}) \\ dX_t &= K^P (\theta^P - X_t) dt + \Sigma dW_t^P \end{align}
\[ B(\boldsymbol{\tau}) = \begin{bmatrix} 1 & \displaystyle \frac{1-e^{- \tau_1 \lambda }}{\tau_1 \lambda } & \displaystyle \frac{1-e^{- \tau_1 \lambda }}{\tau_1 \lambda }-e^{- \tau_1 \lambda } \\ 1 & \displaystyle \frac{1-e^{- \tau_2 \lambda }}{\tau_2 \lambda } & \displaystyle \frac{1-e^{- \tau_2 \lambda }}{\tau_2 \lambda }-e^{- \tau_2 \lambda } \\ \vdots & \vdots & \vdots \\ 1 & \displaystyle \frac{1-e^{- \tau_N \lambda }}{\tau_N \lambda } & \displaystyle \frac{1-e^{- \tau_N \lambda }}{\tau_N \lambda }-e^{- \tau_N \lambda } \end{bmatrix}, \] \[ y_t (\boldsymbol{\tau}) = \begin{bmatrix} y_{t} ( \tau_{1} ) \\ y_{t} ( \tau_{2} ) \\ \vdots \\ y_{t} ( \tau_{N} ) \end{bmatrix}, \quad \epsilon_t (\boldsymbol{\tau}) = \begin{bmatrix} \epsilon_{t} ( \tau_{1} ) \\ \epsilon_{t} ( \tau_{2} ) \\ \vdots \\ \epsilon_{t} ( \tau_{N} ) \end{bmatrix}, \] \begin{align} K^P &= \begin{bmatrix} \kappa_{11}^P & 0 & 0 \\ 0 & \kappa_{22}^P & 0 \\ 0 & 0 & \kappa_{33}^P \end{bmatrix}, \quad \theta^P = \begin{bmatrix} \theta_{1}^P \\ \theta_{2}^P \\ \theta_{3}^P \end{bmatrix}, \\ \Sigma &= \begin{bmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{bmatrix}, \quad dW_t^P = \begin{bmatrix} dW_{1t}^P \\ dW_{2t}^P \\ dW_{3t}^P \end{bmatrix} \end{align} Christensen et al. (2009, 2011) introduce the yield adjustment term to ensure the no-arbitrage condition for bond prices. This term looks complicated, but it can be calculated from the maturities \(\tau\) and the parameters \(\lambda\) and \(\Sigma\) (see Christensen et al. (2009, 2011) for the detailed derivation).
\begin{align} \frac{A(\tau)}{\tau} &= A \frac{\tau^{2}}{6} \\ &+ B \left[ \frac{1}{2 \lambda^{2}} - \frac{1}{\lambda^{3}} \frac{1-e^{-\lambda \tau}}{\tau} + \frac{1}{4 \lambda^{3}} \frac{1-e^{-2\lambda \tau}}{\tau} \right] \\ &+ C \left[ \frac{1}{2 \lambda^{2}} + \frac{1}{\lambda^{2}} e^{-\lambda \tau} - \frac{1}{4 \lambda} \tau e^{-2\lambda \tau} - \frac{3}{4 \lambda^{2}} e^{-2\lambda \tau} - \frac{2}{\lambda^{3}} \frac{1-e^{-\lambda \tau}}{\tau} + \frac{5}{8 \lambda^{3}} \frac{1-e^{-2\lambda \tau}}{\tau} \right] \\ &+ D \left[ \frac{1}{2 \lambda} \tau + \frac{1}{\lambda^{2}} e^{-\lambda \tau} - \frac{1}{\lambda^{3}} \frac{1-e^{-\lambda \tau}}{\tau} \right] \\ &+ E \left[ \frac{3}{\lambda^{2}} e^{-\lambda \tau} + \frac{1}{2 \lambda} \tau + \frac{1}{\lambda} \tau e^{-\lambda \tau} - \frac{3}{\lambda^{3}} \frac{1-e^{-\lambda \tau}}{\tau} \right] \\ &+ F \left[ \frac{1}{\lambda^{2}} + \frac{1}{\lambda^{2}} e^{-\lambda \tau} - \frac{1}{2\lambda^{2}} e^{-2\lambda \tau} - \frac{3}{\lambda^{3}} \frac{1-e^{-\lambda \tau}}{\tau} + \frac{3}{4\lambda^{3}} \frac{1-e^{-2\lambda \tau}}{\tau} \right] \end{align} where \(A=coef(1,1)\), \(B=coef(2,2)\), \(C=coef(3,3)\), \(D=coef(1,2)\), \(E=coef(1,3)\), \(F=coef(2,3)\) if we denote \(coef(i,j) = \sigma_{i1}\sigma_{j1}+\sigma_{i2}\sigma_{j2}+\sigma_{i3}\sigma_{j3}\).
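
Note that \(coef(i,j)\) is just the \((i,j)\) element of \(\Sigma\Sigma^{'}\), so the six constants can be read off a single matrix product. A minimal R sketch (the \(\Sigma\) entries below are purely illustrative, not estimates):

# coef(i,j) = (Sigma %*% t(Sigma))[i,j]; Sigma entries are illustrative only
Sigma <- rbind(c( 0.0063,  0.0000, 0.0000),
               c(-0.0055,  0.0034, 0.0000),
               c(-0.0007, -0.0034, 0.0109))
SS <- Sigma %*% t(Sigma)

A <- SS[1,1]; B <- SS[2,2]; C <- SS[3,3]   # coef(1,1), coef(2,2), coef(3,3)
D <- SS[1,2]; E <- SS[1,3]; F <- SS[2,3]   # coef(1,2), coef(1,3), coef(2,3)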

Since the AFNS model is formulated in continuous time, we discretize it so that the parameters can be estimated from discrete time series data: \begin{align} y_t (\boldsymbol{\tau}) &= B(\boldsymbol{\tau}) X_t - \frac{A(\boldsymbol{\tau})}{\boldsymbol{\tau}} + \epsilon_t (\boldsymbol{\tau}) \\ X_t &= (I - e^{-K^P \Delta t}) \theta^P + e^{-K^P \Delta t} X_{t-1} + \eta_{t} \end{align} where \(\Delta t\) is 1/12 for monthly data and 1/52 for weekly data.
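
For example, the exact discretization of the state equation can be computed with a matrix exponential. A minimal sketch using the expm package (the \(K^P\) and \(\theta^P\) values are placeholders, not estimates):

library(expm)

dt    <- 1/12                          # monthly data
kappa <- diag(c(1.23, 0.84, 0.60))     # placeholder K^P
theta <- c(0.033, -0.013, -0.003)      # placeholder theta^P

Phi1 <- expm(-kappa*dt)                # e^{-K^P dt}
Phi0 <- (diag(3) - Phi1) %*% theta     # (I - e^{-K^P dt}) theta^P
# one-step transition: X_t = Phi0 + Phi1 %*% X_{t-1} + eta_t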

Since the AFNS model in state space form is linear in the factors, we can use the Kalman filter to estimate the parameters. Defining \( \psi_0 = (I - e^{-K^P \Delta t}) \theta^P \) and \( \psi_1 = e^{-K^P \Delta t} \), the Kalman filter is the following recursion. \begin{align} X_{t|t-1} &= \psi_0 + \psi_1 X_{t-1|t-1} \\ V_{t|t-1} &= \psi_1 V_{t-1|t-1} \psi_1^{'} + \Sigma_{\eta} \\ e_{t|t-1} &= y_t - \left( B(\boldsymbol{\tau}) X_{t|t-1} - \frac{A(\boldsymbol{\tau})}{\boldsymbol{\tau}} \right) \\ ev_{t|t-1} &= B(\boldsymbol{\tau}) V_{t|t-1} B(\boldsymbol{\tau})^{'} + \Sigma_{\epsilon}\\ X_{t|t} &= X_{t|t-1} + K_t e_{t|t-1} \\ V_{t|t} &= V_{t|t-1} - K_t B(\boldsymbol{\tau}) V_{t|t-1} \end{align} where \(K_t = V_{t|t-1} B(\boldsymbol{\tau})^{'} ev_{t|t-1}^{-1}\) is the Kalman gain, which reflects the uncertainty of the prediction and is used as a weight for updating the time \(t-1\) prediction after observing the time \(t\) data.
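
A minimal sketch of one prediction and update step is given below; the argument names are illustrative, and the full filter loop appears in the R code in the next section.

# one Kalman filter prediction/update step (illustrative argument names)
kalman_step <- function(X_prev, V_prev, y, B, Ayt, psi0, psi1, Q, H)
{
    Xp <- psi0 + psi1 %*% X_prev             # X_{t|t-1}
    Vp <- psi1 %*% V_prev %*% t(psi1) + Q    # V_{t|t-1}
    e  <- y - (B %*% Xp - Ayt)               # prediction error e_{t|t-1}
    ev <- B %*% Vp %*% t(B) + H              # prediction error covariance
    KG <- Vp %*% t(B) %*% solve(ev)          # Kalman gain
    list(X = Xp + KG %*% e,                  # X_{t|t}
         V = Vp - KG %*% B %*% Vp,           # V_{t|t}
         e = e, ev = ev)
}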

The conditional covariance (\(\Sigma_{\eta}\)) and unconditional covariance (\(\Sigma_{\eta,0}\)) are defined as \begin{align} \Sigma_{\eta} &= \int_{0}^{\Delta t} e^{-K^P s}\Sigma\Sigma^{'} e^{-(K^P)^{'} s} ds \\ \Sigma_{\eta,0} &= \int_{0}^{\infty} e^{-K^P s}\Sigma\Sigma^{'} e^{-(K^P)^{'} s} ds \end{align} Christensen and Rudebusch (2015) show how to calculate this covariance analytically: \begin{align} \Sigma_{\eta} = V \left( \frac{\omega_{ij}}{\lambda_i + \lambda_j}(1-e^{-(\lambda_i + \lambda_j)\Delta t}) \right)_{n \times n} V^{'} \end{align} where \(V\) and \(\Lambda\) are the eigenvector matrix and the diagonal eigenvalue matrix from the diagonalization \(K^P = V\Lambda V^{-1}\), \(\lambda_i\) is the (\(i,i\))-th element of \(\Lambda\), and \(\omega_{ij}\) is the (\(i,j\))-th element of \(\Omega = V^{-1}\Sigma\Sigma^{'}(V^{-1})^{'}\).
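
As a sanity check, the analytical \(\Sigma_{\eta}\) can be compared with a brute-force numerical integration of the defining integral. A rough sketch with placeholder values for \(K^P\), \(\Sigma\), and \(\Delta t\):

# analytical conditional covariance vs. numerical integration (placeholders)
library(expm)
kappa <- diag(c(1.23, 0.84, 0.60)); dt <- 1/12
Sigma <- diag(c(0.006, 0.010, 0.015))

eig <- eigen(kappa); V <- eig$vectors; lam <- eig$values
Om  <- solve(V) %*% Sigma %*% t(Sigma) %*% t(solve(V))
M   <- outer(lam, lam, function(li, lj) (1 - exp(-(li+lj)*dt)) / (li+lj))
S_analytic <- V %*% (Om * M) %*% t(V)

# crude Riemann sum of the integral over [0, dt]
s <- seq(0, dt, length.out = 1001); h <- s[2] - s[1]
S_numeric <- Reduce(`+`, lapply(s, function(u)
    expm(-kappa*u) %*% Sigma %*% t(Sigma) %*% t(expm(-kappa*u)))) * h
max(abs(S_analytic - S_numeric))  # difference should be tiny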

From the expression for the conditional covariance matrix, we can easily derive the unconditional covariance matrix. \begin{align} \Sigma_{\eta,0} = V \left( \frac{\omega_{ij}}{\lambda_i + \lambda_j} \right)_{n \times n} V^{'} \end{align} Since the Kalman filter is an iterative procedure, we use \(X_{1|0} = \theta^P \) and \(V_{1|0} = \Sigma_{\eta,0} \) as initial values. We then use a numerical optimization algorithm to search for the parameters that maximize the log-likelihood function, which is built from the conditional prediction errors (\(e_{t|t-1}\)) and their covariances (\(ev_{t|t-1}\)). \begin{align} \ln L (\boldsymbol{\theta}) &= -\frac{NT}{2} \ln(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \ln |ev_{t|t-1}| \\ &\quad - \frac{1}{2} \sum_{t=1}^{T} e_{t|t-1}^{'}(ev_{t|t-1})^{-1}e_{t|t-1} \end{align}
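
Each period's contribution to this log-likelihood is simply a zero-mean multivariate normal log-density evaluated at the prediction error; a small sketch of the increment added inside the filter loop (e and ev are generic inputs):

# per-period log-likelihood increment for prediction error e with covariance ev
loglik_increment <- function(e, ev)
{
    N <- length(e)
    drop(-0.5*N*log(2*pi) - 0.5*log(det(ev)) - 0.5*t(e) %*% solve(ev) %*% e)
}
# equivalently: mvtnorm::dmvnorm(as.numeric(e), sigma = ev, log = TRUE)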



2. R code for AFNS model


Now we can implement the AFNS model in R as follows. Our estimation is based on monthly KTB (Korean Treasury Bond) yields from January 2011 to December 2019. The data are end-of-month, zero-coupon yields at thirteen maturities: 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 4, 5, 7, 10, and 20 years.
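
The code below assumes the Excel sheet stores dates in the first column and one column per maturity, with the maturity in years as the column name. A toy example of the expected layout (synthetic numbers, not the actual KTB data):

# toy example of the expected layout of the spot-rate sheet (synthetic numbers)
df.toy <- data.frame(
    date   = as.Date(c("2011-01-31", "2011-02-28")),
    "0.25" = c(0.0285, 0.0290),
    "3"    = c(0.0340, 0.0338),
    "20"   = c(0.0405, 0.0401),
    check.names = FALSE)                         # keep numeric column names
v.mat.toy  <- as.numeric(colnames(df.toy)[-1])   # maturities: 0.25 3 20
m.spot.toy <- as.matrix(df.toy[, -1])            # T x N yield matrix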


#=========================================================================#
# Financial Econometrics & Engineering, ML/DL using R, Python, Tensorflow 
# by Sang-Heon Lee 
#
# https://kiandlee.blogspot.com
#-------------------------------------------------------------------------#
# AFNS estimation using Kalman filter
#=========================================================================#
 
library(readxl)
library(expm)
library(numDeriv)
 
graphics.off()  # clear all graphs
rm(list = ls()) # remove all files from your workspace
 
setwd("D:/a_book_FIER_Ki_Lee/ch04_KICS/code/dns_afns_est")
 
# DNS factor loading matrix
NS.B <- function(lambda, tau)
{
    col1 <- rep.int(1, length(tau))
    col2 <- (1-exp(-lambda*tau))/(lambda*tau)
    col3 <- col2 - exp(-lambda*tau)
    return(cbind(col1, col2, col3))
}
 
# yield adjustment term in AFNS (returns -A(tau)/tau, added to B %*% X below)
AFNS.C <- function(sigma, lambda, tau)
{
    s <- sigma; la <- lambda; t <- tau
    
    s11 <- s[1,1]; s12 <- s[1,2]; s13 <- s[1,3]
    s21 <- s[2,1]; s22 <- s[2,2]; s23 <- s[2,3]
    s31 <- s[3,1]; s32 <- s[3,2]; s33 <- s[3,3]
  
    A <- s11^2+s12^2+s13^2; D <- s11*s21+s12*s22+s13*s23
    B <- s21^2+s22^2+s23^2; E <- s11*s31+s12*s32+s13*s33 
    C <- s31^2+s32^2+s33^2; F <- s21*s31+s22*s32+s23*s33
    
    r1 <- -A*t^2/6
    r2 <- -B*(1/(2*la^2)-(1-exp(-la*t))/(la^3*t)+
              (1-exp(-2*la*t))/(4*la^3*t))
    r3 <- -C*(1/(2*la^2)+exp(-la*t)/(la^2)-t*exp(-2*la*t)/(4*la)-
              3*exp(-2*la*t)/(4*la^2)-2*(1-exp(-la*t))/(la^3*t)+
              5*(1-exp(-2*la*t))/(8*la^3*t))
    r4 <- -D*(t/(2*la)+exp(-la*t)/(la^2)-(1-exp(-la*t))/(la^3*t))
    r5 <- -E*(3*exp(-la*t)/(la^2)+t/(2*la)+t*exp(-la*t)/(la)-
              3*(1-exp(-la*t))/(la^3*t))
    r6 <- -F*(1/(la^2)+exp(-la*t)/(la^2)-exp(-2*la*t)/(2*la^2)-
              3*(1-exp(-la*t))/(la^3*t)+3*(1-exp(-2*la*t))/(4*la^3*t))
    return(r1+r2+r3+r4+r5+r6)
}
 
# parameter restrictions
trans <- function(b)
{
    bb <- b
    bb[1]        <- 1/(1+exp(b[1]))   # kappa11 restricted to (0,1)
    bb[13]       <- b[13]^2           # lambda > 0
    bb[14:npara] <- b[14:npara]^2     # measurement error variances >= 0
    return(bb)
}
 
# negative log likelihood function
loglike <- function(para_un, m.spot)
{
    # parameter restrictions
    para <- trans(para_un)
  
    # restricted parameters
    kappa  <- rbind(c(para[1],0,0),
                    c(0,para[2],0),
                    c(0,0,para[3]))
    sigma  <- rbind(c(para[4],0,0),
                    c(para[5],para[6],0),
                    c(para[7],para[8],para[9]))
    theta  <- para[10:12]
    lambda <- para[13]
    H      <- diag(para[14:npara])
   
    B  <- NS.B(lambda,v.mat); tB <- t(B) # factor loading matrix
    C  <- AFNS.C(sigma,lambda,v.mat)     # yield adjustment term
  
    # conditional and unconditional covariance matrices : Q, Q0
    m    <- eigen(kappa) 
    eval <- m$values 
    evec <- m$vectors; ievec <- solve(evec)
    Smat <- ievec%*%sigma%*%t(sigma)%*%t(ievec)
    Vdt  <- Vinf <- matrix(0,nf,nf)
    
    for(i in 1:nf) { for(j in 1:nf) {
        Vdt[i,j]  <- Smat[i,j]*(1-exp(-(eval[i]+eval[j])*dt))/
                     (eval[i]+eval[j])            # conditional
        Vinf[i,j] <- Smat[i,j]/(eval[i]+eval[j])  # unconditional
    }}
    
    # analytical covariance matrices
    # Q : conditional, Q0 : unconditional
    Q  <- evec%*%Vdt%*%t(evec)
    Q0 <- evec%*%Vinf%*%t(evec)
    
    # initialization of state vector and covariance matrix
    prevX <- theta; prevV <- Q0
    Phi1  <- expm(-kappa*dt)
    Phi0  <- (diag(nf)-Phi1)%*%theta
    loglike <- 0 # log likelihood function
    
    for(i in 1:nobs) {
        Xhat <- Phi0+Phi1%*%prevX          # predicted state
        Vhat <- Phi1%*%prevV%*%t(Phi1)+Q   # predicted cov
        
        y        <- m.spot[i,]   # observed yields
        yimplied <- B%*%Xhat+C   # model-implied yields
        er       <- y-yimplied   # prediction error
 
        # updating 
        ev <- B%*%Vhat%*%tB+H; iev <- solve(ev)
        KG <- Vhat%*%tB%*%iev    # Kalman gain
        
        prevX <- Xhat+KG%*%er        # E[X|y_t]   updated state 
        prevV <- Vhat-KG%*%B%*%Vhat  # Cov[X|y_t] updated cov
        
        # log likelihood function
        loglike <- loglike - 0.5*nmat*log(2*pi) -
                   0.5*log(det(ev)) - 0.5*t(er)%*%iev%*%er
        gm.factor[i,] <<- prevX
    }
    
    return(-loglike) # negative log likelihood for minimization by optim
}
 
 
#=========================================================================#
#  Main : AFNS term structure model estimation
#=========================================================================#
 
    dt <- 1/12 # monthly (1/52 for weekly)
    nf <- 3
    
    # read excel spot data
    file <- "spot_2011_2019.xlsx"; sheet <- "monthly"
    df.spot <- read_excel(file, sheet, col_names=TRUE)
    
    # divide date and data
    v.ymd  <- df.spot[,1]
    v.mat  <- as.numeric(colnames(df.spot)[-1])
    m.spot <- as.matrix(df.spot[,-1])
    
    nmat <- length(v.mat) # number of maturities
    nobs <- nrow(m.spot)  # number of observations
    
    # factor estimates
    gm.factor <- matrix(0,nobs,nf)
    
    #-----------------------------------------------------
    # initial guess for unconstrained parameters (para_un)
    #-----------------------------------------------------
    init_para_un <- c(
        1.226637,  0.840692,  0.603496, # kappa
        0.006327,  
        0.005464,  0.003441,  
        0.000707,  0.003399,  0.010891, # sigma
        0.032577,  0.012536,  0.002748, # theta 
        0.5,                            # lambda
        rep(0.000524, nmat)             # measurement error
    )
    
    npara <- length(init_para_un) # number of parameters
    
    m <- optim(init_para_un, loglike,
               control = list(maxit=5000, trace=2),
               method=c("Nelder-Mead"), m.spot=m.spot)
    prev_likev <- m$value
    v.likev    <- m$value
    
    i <- 1
    repeat {
        print(paste(i,"-th iteration"))
        m <- optim(m$par, loglike,
                   control = list(maxit=5000, trace=2),
                   method=c("Nelder-Mead"), m.spot=m.spot)
        
        v.likev <- cbind(v.likev, m$value)
        print(paste(m$value," <- likelihood value"))
        
        if (abs(m$value - prev_likev) < 0.00000000001) {
            break
        }
        prev_likev <- m$value
        i <- i + 1
        
        print(v.likev)
    }
    
    name_theta <- 
      c("kappa_1"   , "kappa_2"   , "kappa_3" ,
        "sigma_11"  , "sigma_21"  , "sigma_22",
        "sigma_31"  , "sigma_32"  , "sigma_33",
        "theta_1"   , "theta_2"   , "theta_3" ,
        "lambda"    , "epsilon_1" ,
        "epsilon_2" , "epsilon_3" , "epsilon_4",
        "epsilon_5" , "epsilon_6" , "epsilon_7",
        "epsilon_8" , "epsilon_9" , "epsilon_10",
        "epsilon_11", "epsilon_12", "epsilon_13")
 
    # draw 3 factor estimates
    x11(width=6, height=5)
    matplot(gm.factor, type="l", ylab="L,S,C", lty = 1,
            main = "AFNS 3 Factor Estimates (L,S,C)", lwd=2)
 
    # Delta method for statistical inference
    grad    <- jacobian(trans, m$par)
    hess    <- hessian(func=loglike, x=m$par, m.spot=m.spot)
    vcv_con <- grad%*%solve(hess)%*%t(grad)
    
    # parameter | std.err | t-value | p-value
    theta   <- trans(m$par)
    stderr  <- sqrt(diag(vcv_con))
    tstat   <- theta/stderr
    pvalue  <- 2*pt(-abs(tstat), df=nobs-npara)
    df.est  <- cbind(theta, round(stderr,4),
                     round(tstat,4), round(pvalue,4))
 
    rownames(df.est) <- name_theta # parameter names
    colnames(df.est) <- c("parameter","std.err","t-stat","p-value")
    print(df.est)


3. Estimated AFNS model


Running the above R code, we obtain the estimated parameters and the filtered latent factor estimates (\(L, S, C\)). The following output shows the convergence of the log-likelihood function and the estimated parameters with standard errors, t-statistics, and p-values; statistical inference is based on the delta method.
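
Concretely, if \(\hat\theta_{un}\) denotes the unconstrained estimates and \(g(\cdot)\) the transformation applied in trans(), the delta method used here is \begin{align} \widehat{Var}\left(g(\hat\theta_{un})\right) \approx J \hat{H}^{-1} J^{'}, \quad J = \frac{\partial g}{\partial \theta_{un}}\bigg|_{\hat\theta_{un}} \end{align} where \(\hat{H}\) is the Hessian of the negative log-likelihood at the optimum, so jacobian(), hessian(), and the matrix product in the code correspond to \(J\), \(\hat{H}\), and \(J \hat{H}^{-1} J^{'}\).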

(Figure: convergence of the log-likelihood and table of estimated AFNS parameters)

We can also plot the time-varying filtered estimates of the latent factors (level, slope, and curvature).

(Figure: filtered level, slope, and curvature factor estimates)

References

Christensen, J. H. E., F. X. Diebold, and G. D. Rudebusch, 2009, An Arbitrage-free Generalized Nelson-Siegel Term Structure Model, Econometrics Journal 12, 33-64.

Christensen, J. H. E., F. X. Diebold, and G. D. Rudebusch, 2011, The Arbitrage-free Class of Nelson-Siegel Term Structure Models, Journal of Econometrics 164, 4-20.

Christensen, J. H. E. and G. D. Rudebusch, 2015, Analytical Formulas for the Second Moment in Affine Models with Stochastic Volatility, working paper.
\(\blacksquare\)
