Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Sang-Heon Lee

This article explains how to estimate parameters of the Arbitrage-Free dynamic Nelson-Siegel (AFNS) model (Christensen, Diebold, and Rudebusch; 2009, Christensen, Diebold, and Rudebusch; 2011) using Kalman filter. We estimate not only parameters but also filtered latent factor estimates such as level, slope, and curvature using R code.

### 1. AFNS model

AFNS model can be expressed as a state state model which consists of measurement equation and state equation as follows.
\begin{align} y_t (\boldsymbol{\tau}) &= B(\boldsymbol{\tau}) X_t + \frac{A(\boldsymbol{\tau})}{\boldsymbol{\tau}} + \epsilon_t (\boldsymbol{\tau}) \\ dX_t &= K^P (\theta^P – X_t) dt + \Sigma dW_t^P \end{align}
$B(\boldsymbol{\tau}) = \begin{bmatrix} 1 & \displaystyle \frac{1-e^{- \tau_1 \lambda }}{\tau_1 \lambda } & \displaystyle \frac{1-e^{- \tau_1 \lambda }}{\tau_1 \lambda }-e^{- \tau_1 \lambda } \\ 1 & \displaystyle \frac{1-e^{- \tau_2 \lambda }}{\tau_2 \lambda } & \displaystyle \frac{1-e^{- \tau_2 \lambda }}{\tau_2 \lambda }-e^{- \tau_2 \lambda } \\ ⋮&⋮&⋮\\ 1 & \displaystyle \frac{1-e^{- \tau_N \lambda }}{\tau_N \lambda } & \displaystyle \frac{1-e^{- \tau_N \lambda }}{\tau_N \lambda }-e^{- \tau_N \lambda } \end{bmatrix},$ $y_t (\boldsymbol{\tau}) = \begin{bmatrix} y _{t} ( \tau _{1} )\\y _{t} ( \tau _{2} )\\ ⋮ \\y _{t} ( \tau _{N} )\end{bmatrix}, \epsilon_t (\boldsymbol{\tau}) = \begin{bmatrix} \epsilon_{t} ( \tau _{1} )\\\epsilon_{t} ( \tau _{2} )\\ ⋮ \\\epsilon_{t} ( \tau _{N} )\end{bmatrix},$ \begin{align} K^P &= \begin{bmatrix} \kappa_{11}^P & 0& 0 \\ 0& \kappa_{22}^P &0 \\ 0& 0& \kappa_{33}^P \end{bmatrix}, \theta^P = \begin{bmatrix} \theta_{1}^P \\ \theta_{2}^P \\ \theta_{3}^P \end{bmatrix}, \\ \Sigma &= \begin{bmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{bmatrix}, dW_t^P = \begin{bmatrix} dW_{1t}^P \\ dW_{2t}^P \\ dW_{3t}^P \end{bmatrix} \end{align} Christensen et al. (2009, 2011) introduces the yield adjustment term to ensure the no-arbitrage condition for bond price. This yield adjustment term looks very complicated but can be calculated using estimated parameters such as $$\tau$$, $$\lambda$$, $$\Sigma$$ (see Christensen et al. (2009, 2011) for the detailed derivation).
\begin{align} &\scriptstyle\frac{A( τ )} {τ } =A \frac{τ ^{2}} {6} \\ &\scriptstyle+B \left[ \frac{1}{2 λ ^{2}} – \frac{1}{λ ^{3}} \frac{1-e ^{- λ τ }}{τ } + \frac{1}{4 λ ^{3}} \frac{1-e ^{-2 λ τ }}{τ } \right]\\ &\scriptstyle+C \left [ \frac{1}{2 λ ^{2}} + \frac{1}{λ ^{2}} e ^{- λ τ } – \frac{1}{4 λ } τ e ^{-2 λ τ } – \frac{3}{4 λ ^{2}} e ^{-2 λ τ } – \frac{2}{λ ^{3}} \frac{1-e ^{- λ τ }}{τ } + \frac{5}{8 λ ^{3}} \frac{1-e ^{-2 λ τ }}{τ } \right ]\\ &\scriptstyle+D \left [ \frac{1}{2 λ } τ + \frac{1}{λ ^{2}} e ^{- λ τ } – \frac{1}{λ ^{3}} \frac{1-e ^{- λ τ }}{τ } \right ]\\ &\scriptstyle+E \left [ \frac{3}{λ ^{2}} e ^{- λ τ } + \frac{1}{2 λ } τ + \frac{1}{λ } τ e ^{- λ τ } – \frac{3}{λ ^{3}} \frac{1-e ^{- λ τ }}{τ } \right ]\\ &\scriptstyle+F \left [ \frac{1}{λ^2} + \frac{1}{λ ^{2}} e ^{- λ τ } – \frac{1}{2λ ^{2}} {e ^{- 2λ τ }} – \frac{3}{λ ^{3}} \frac{1-e ^{- λ τ }}{τ } + \frac{3}{4λ ^{3}} \frac{1-e ^{-2 λ τ }}{τ } \right ] \end{align} where $$A=coef(1,1)$$,$$B=coef(2,2)$$,$$C=coef(3,3)$$,$$D=coef(1,2)$$,$$E=coef(1,3)$$,$$F=coef(2,3)$$ if we denote $$coef(i,j) = σ_{i1}σ_{j1}+σ_{i2}σ_{j2}+σ_{i3}σ_{j3}$$.

Since AFNS model have a continuous-time representation, we should express AFNS model as the discretized form for estimating parameters using discrete time series data as follows. \begin{align} y_t (\boldsymbol{\tau}) &= B(\boldsymbol{\tau}) X_t + \frac{A(\boldsymbol{\tau})}{\boldsymbol{\tau}} + \epsilon_t (\boldsymbol{\tau}) \\ X_t &= (I – e^{-K^P \Delta t}) \theta_P + e^{-K^P \Delta t} X_{t-1} + \eta_{t} \end{align} where $$\Delta t$$ is 1/12 for monthly data and 1/52 for weekly data.

AFNS model as a state space model is linear with respect to factors, we can use Kalman filter to estimate parameters. If we use $$\psi_0 = (I – e^{-K^P \Delta t}) \theta_P$$ , $$\psi_1 = e^{-K^P \Delta t}$$ for initialization, Kalman filtering is represented as the following recursive calculations. \begin{align} X_{t|t-1} &= \psi_0 + \psi_1 X_{t-1|t-1} \\ V_{t|t-1} &= \psi_1 V_{t-1|t-1} \psi_1^{‘} + \Sigma_{\eta} \\ e_{t|t-1} &= y_t – B(\boldsymbol{\tau}) X_t – A(\boldsymbol{\tau})/\boldsymbol{\tau} \\ ev_{t|t-1} &= B(\boldsymbol{\tau}) V_{t|t-1} B(\boldsymbol{\tau})^{‘} + \Sigma_{\epsilon}\\ X_{t|t} &= X_{t|t-1} + K_t e_{t|t-1} \\ V_{t|t} &= V_{t|t-1} – K_t B(\boldsymbol{\tau}) V_{t|t-1} \end{align} where $$K_t = V_{t|t-1} B(\boldsymbol{\tau})^{‘} ev_{t|t-1}^{-1}$$ is the Kalman gain which reflects the uncertainty of prediction and is used as a weight for updating the time $$t-1$$ prediction after observing time $$t$$ data.

The conditional covariance ($$\Sigma_{\eta}$$) and unconditional covariance ($$\Sigma_{\eta,0}$$) are defined as \begin{align} \Sigma_{\eta} &= \int_{0}^{Δt} exp⁡(-K^p s)ΣΣ^{‘} exp⁡(-K^p s) ds \\ \Sigma_{\eta,0} &= \int_{0}^{\infty} exp⁡(-K^p s)ΣΣ^{‘} exp⁡(-K^p s) ds \end{align} Christensen and Rudebusch (2015) explain the way to calculate the analytical covariance as follows. \begin{align} \Sigma_{\eta} = V \left( \frac{\omega_{ij}}{\lambda_i + \lambda_j}(1-e^{-(\lambda_i + \lambda_j)\Delta t}) \right)_{n \times n} V^{‘} \end{align} where $$V$$ and $$\Lambda$$ are the eigenvector matrix and the diagonal matrix with eigenvalues respectively from the diagonalization of $$K^P = V\Lambda V^{-1}$$. $$\lambda_i$$ is the ($$i,i$$)-th element of $$\Lambda$$ and $$\omega_{ij}$$ is the ($$i,j$$)-th element of $$\Omega = V^{-1}ΣΣ^{‘}(V^{-1})^{‘}$$.

From the equation for the conditional covariance matrix, we can eaily derive the unconditional covariance matrix. \begin{align} \Sigma_{\eta,0} = V \left( \frac{\omega_{ij}}{\lambda_i + \lambda_j} \right)_{n \times n} V^{‘} \end{align} Since Kalman fitler is an iterative procedure, we use $$X_{1|0} = \theta_P$$ and $$V_{1|0} = \Sigma_{\eta,0}$$ for initial guess. We use numerical optimization algorithm to search parameters for maximizing the log likelihood function that is constructed from conditional prediction errors ($$e_{t|t-1}$$) and its uncertainties ($$ev_{t|t-1}$$). \begin{align} ln L_t (\boldsymbol{\theta}) &= -\frac{NT}{2} ln(2\pi) – \frac{1}{2} \sum_{t=1}^{T} ln |ev_{t|t-1}| \\ &\quad – \frac{1}{2} \sum_{t=1}^{T} e_{t|t-1}^{‘}(ev_{t|t-1})^{-1}e_{t|t-1} \end{align}

### 2. R code for AFNS model

Now we can implement the AFNS model using R code as follows. Our estimation is based on monthly KTB (Korean Treasury Bond) from January 2011 to December 2019. The data are end-of-month, zero-coupon yields at thirteen maturities: 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 4, 5, 7, 10, and 20 years.

 #=========================================================================## Financial Econometrics & Engineering, ML/DL using R, Python, Tensorflow # by Sang-Heon Lee ## https://kiandlee.blogspot.com#————————————————————————-## AFNS estimation using Kalman filter#=========================================================================# library(readxl)library(expm)library(numDeriv) graphics.off()  # clear all graphsrm(list = ls()) # remove all files from your workspace setwd(“D:/a_book_FIER_Ki_Lee/ch04_KICS/code/dns_afns_est”) # DNS factor loading matrixNS.B<–function(lambda, tau){    col1 <– rep.int(1,length(tau))    col2 <– (1–exp(–lambda*tau))/(lambda*tau)    col3 <– col2 – exp(–lambda*tau)     return(cbind(col1,col2,col3))} # yield adjustment term in AFNSAFNS.C<–function(sigma, lambda, tau){    s <– sigma; la <– lambda; t <– tau        s11<–s[1,1]; s12<–s[1,2]; s13<–s[1,3]    s21<–s[2,1]; s22<–s[2,2]; s23<–s[2,3]    s31<–s[3,1]; s32<–s[3,2]; s33<–s[3,3]      A<–s11^2+s12^2+s13^2; D<–s11*s21+s12*s22+s13*s23    B<–s21^2+s22^2+s23^2; E<–s11*s31+s12*s32+s13*s33     C<–s31^2+s32^2+s33^2; F<–s21*s31+s22*s32+s23*s33        r1<––A*t^2/6;     r2<––B*(1/(2*la^2)–(1–exp(–la*t))/(la^3*t)+                       (1–exp(–2*la*t))/(4*la^3*t))    r3<––C*(1/(2*la^2)+exp(–la*t)/(la^2)–t*exp(–2*la*t)/(4*la)–           3*exp(–2*la*t)/(4*la^2)–2*(1–exp(–la*t))/(la^3*t)+           5*(1–exp(–2*la*t))/(8*la^3*t))    r4<––D*(t/(2*la)+exp(–la*t)/(la^2)–(1–exp(–la*t))/(la^3*t))    r5<––E*(3*exp(–la*t)/(la^2)+t/(2*la)+t*exp(–la*t)/(la)–           3*(1–exp(–la*t))/(la^3*t))    r6<––F*(1/(la^2)+exp(–la*t)/(la^2)–exp(–2*la*t)/(2*la^2)–           3*(1–exp(–la*t))/(la^3*t)+3*(1–exp(–2*la*t))/(4*la^3*t))    return(r1+r2+r3+r4+r5+r6)} # parameter restrictionstrans<–function(b){    bb <– b    bb[1]  <– 1/(1+exp(b[1]))  # kappa11    bb[13] <– b[13]^2          # lambda    bb[14:npara] <– b[14:npara]^2          # measurement error    return(bb)} # log likelihood functionloglike<–function(para_un,m.spot){    # parameter restrictions    para <– trans(para_un)      # restricted parameters    kappa  <– rbind(c(para[1],0,0),                    c(0,para[2],0),                    c(0,0,para[3]))    sigma  <– rbind(c(para[4],0,0),                    c(para[5],para[6],0),                    c(para[7],para[8],para[9]))    theta  <– para[10:12]    lambda <– para[13]    H      <– diag(para[14:npara])       B  <– NS.B(lambda,v.mat); tB <– t(B) # factor loading matrix    C  <– AFNS.C(sigma,lambda,v.mat)     # yield adjustment      # Conditional and Unconditional covariance matrix : Q, Q0    m    <– eigen(kappa)     eval <– m$values evec <– m$vectors; ievec<–solve(evec)    Smat <– ievec%*%sigma%*%t(sigma)%*%t(ievec)    Vdt  <– Vinf <– matrix(0,nf,nf)        for(i in 1:nf) { for(j in 1:nf) {        Vdt[i,j] <–Smat[i,j]*(1–exp(–(eval[i]+eval[j])*dt))/                             (eval[i]+eval[j]) # conditional        Vinf[i,j]<–Smat[i,j]/(eval[i]+eval[j]) # unconditional    }}        # Analytical Covariance matrix    # Q : conditional, Q0 : unconditional    Q  <– evec%*%Vdt%*%t(evec)    Q0 <– evec%*%Vinf%*%t(evec)        # initialzation of vector and matrix    prevX <– theta; prevV <– Q0    Phi1  <– expm(–kappa*dt)    Phi0  <– (diag(nf)–Phi1)%*%theta    loglike <– 0 # log likelihood function        for(i in 1:nobs) {        Xhat <– Phi0+Phi1%*%prevX        # predicted state        Vhat <– Phi1%*%prevV%*%t(Phi1)+Q # predicted cov                y        <– m.spot[i,] # the observed yield        yimplied <– B%*%Xhat+C # the model-implied yields        er       <– y–yimplied # prediction error         # updating         ev <– B%*%Vhat%*%tB+H; iev<–solve(ev)        KG <– Vhat%*%tB%*%iev # Kalman Gain                prevX <– Xhat+KG%*%er       # E[X|y_t]   updated state         prevV <– Vhat–KG%*%B%*%Vhat # Cov[X|y_t] updated cov                # log likelihood function        loglike <– loglike – 0.5*(nmat)*log(2*pi)–                             0.5*log(det(ev))–0.5*t(er)%*%iev%*%er        gm.factor[i,] <<– prevX    }        return(–loglike)}  #=========================================================================##  Main : AFNS term structure model estimation#=========================================================================#     dt <– 1/12 #1/52 # weekly    nf <– 3        # read excel spot data    file <– “spot_2011_2019.xlsx”; sheet <– “monthly”    df.spot <– read_excel(file,sheet,col_names=TRUE)        # divide date and data    v.ymd  <– df.spot[,1]    v.mat  <– as.numeric(colnames(df.spot)[–1])    m.spot <– as.matrix(df.spot[,–1])        nmat <– length(v.mat) # # of maturities    nobs <– nrow(m.spot)  # # of observations        # factor estimates    gm.factor <– matrix(0,nobs,nf)        #—————————————————–    # initial guess for unconstrained parameters(para_un)    #—————————————————–    init_para_un <– c(      1.226637,  0.840692,  0.603496, # kappa      0.006327,       –0.005464,  0.003441,       –0.000707, –0.003399,  0.010891, # sigma      0.032577, –0.012536, –0.002748, # theta       0.5    ,                       # lambda      rep(0.000524,nmat)              # measurement error    )        npara <– length(init_para_un) # # of observations        m<–optim(init_para_un,loglike,             control = list(maxit=5000, trace=2),             method=c(“Nelder-Mead”),m.spot=m.spot)    prev_likev <– m$value v.likev <– m$value        i <– 1    repeat {        print(paste(i,“-th iteration”))        m<–optim(m$par,loglike, control = list(maxit=5000, trace=2), method=c(“Nelder-Mead”),m.spot=m.spot) v.likev <– cbind(v.likev, m$value)        print(paste(m$value,” <- likelihood value")) if (abs(m$value – prev_likev) < 0.00000000001) {            break        }        prev_likev <– m$value i <– i + 1 print(v.likev) } name_theta <– c(“kappa_1” , “kappa_2” , “kappa_3” , “sigma_11” , “sigma_21” , “sigma_22”, “sigma_31” , “sigma_32” , “sigma_33”, “theta_1” , “theta_2” , “theta_3” , “lambda” , “epsilon_1” , “epsilon_2” , “epsilon_3” , “epsilon_4”, “epsilon_5” , “epsilon_6” , “epsilon_7”, “epsilon_8” , “epsilon_9” , “epsilon_10”, “epsilon_11”, “epsilon_12”, “epsilon_13”) # draw 3 factor estimates x11(width=6, height=5); matplot(gm.factor,type=“l”, ylab=“L,S,C”, lty = 1, main = “AFNS 3 Factor Estimates (L,S,C)”, lwd=2) # Delta method for statistical inference grad <– jacobian(trans, m$par)    hess    <– hessian(func=loglike, x=m$par,m.spot=m.spot) vcv_con <– grad%*%solve(hess)%*%t(grad) # parameter | std.err | t-value | p-value theta <– trans(m$par)    stderr  <– sqrt(diag(vcv_con))    tstat   <– theta/stderr    pvalue  <– 2*pt(–abs(tstat),df=nobs–npara)    df.est  <– cbind(theta, round(stderr,4),                     round(tstat,4), round(pvalue,4))     rownames(df.est) <– name_theta # parameter name    colnames(df.est) <–c(“parameter”,“std.err”,“t-stat”,“p-value”)    print(df.est)Colored by Color Scripter cs

### 3. Estimated AFNS model

Running the above R code for the AFNS model, we can get the estimated parameters and the latent factor estimates($$L, S, C$$). The following figure prints out the convergence of the log-likelihood function and estimated parameters with standard errors, t-statistics, and p-values. We use the delta method for statistical inference.

We can also plot the time-varying filtered estimates of latent factors (level, slope, curvature factors).

Reference

Christensen, J. H. E., F. X. Diebold, and G. D. Rudebusch, 2009, An Arbitrage-free Generalized Nelson-Siegel Term Structure Model, Econometrics Journal 12, 33~64.

Christensen, J. H. E., F. X. Diebold, and G. D. Rudebusch, 2011, The Arbitrage-free Class of Nelson-Siegel Term Structure Models. Journal of Econometrics 164, 4-20.

Christensen, J. H. E. and G. D. Rudebusch (2015), Analytical Formulas for the Second Moment in Affine Models with Stochastic Volatility, working paper
$$\blacksquare$$