
In post 6 we introduced some econometrics code that helps those working with time series to gain asymptotically efficient results.  In this post we look at the commands and libraries needed to test our assumptions.

Testing our Assumptions and Meeting the Gauss-Markov Theorem

In this section we will seek to test and verify the assumptions of the simple linear regression model.  These assumptions are laid out as follows and are extracted from Hill, Griffiths and Lim 2008:

SR1. The value of y, for each value of x, is
y = β_{1} + β_{2}x + µ
SR2. The expected value of the random error µ is
E(µ) = 0
which is equivalent to assuming
E(y) = β_{1} + β_{2}x
SR3. The variance of the random error µ is
var(µ)=sigma^2 = var(y)
The random variables y and µ have the same variance because they differ only by a constant.
SR4. The covariance between any pair of random errors µ_{i} and µ_{j} is
cov(µ_{i}, µ_{j})=cov(y_{i},y_{j})=0
SR5. The variable x is not random and must take at least two different values.
SR6. The values of µ are normally distributed about their mean
µ ~ N(0, sigma^2)
if the y values are normally distributed and vice-versa
Central to this topic's objective is meeting the conditions set forth by the Gauss-Markov Theorem.  The Gauss-Markov Theorem states that if the error term has zero mean, constant variance and no serial correlation, then the OLS parameter estimate is the Best Linear Unbiased Estimate, or BLUE, which implies that no other linear unbiased estimate has a smaller variance. An estimator that has the smallest possible variance is called an “efficient” estimator.  In essence, the Gauss-Markov theorem requires that the error term have no structure; the residual levels must exhibit no trend and the variance must be constant through time.
When the error term in the regression does not satisfy the assumptions set forth by Gauss-Markov, OLS is still unbiased, but it is no longer BLUE because it fails to give the most efficient parameter estimates. In this scenario, a strategy that transforms the regression's variables so that the error has no structure is in order. In time-series analysis, autocorrelation between the residual values is a common problem.  There are several ways to approach the transformations necessary to ensure BLUE estimates, and the previous post used the following method to gain asymptotic efficiency and improve our estimates:
1. Estimate the OLS regression

2. Fit the OLS residuals to an AR(p) process using the Yule-Walker method and find the order p.

3. Re-estimate the model using Generalized Least Squares fit by Maximum Likelihood, using the p estimated in step 2 as the order of the autocorrelated error term.

4. Fit the GLS residuals to an AR(p) process and use the estimated AR coefficients as the final parameter estimates for the error term.
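The four steps above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the code from the previous post: the variables `x`, `y` and `dat` are made up, the AR order is picked by AIC with an arbitrary cap of 3, and we assume the `nlme` package (shipped with R) for `gls()` and `corARMA()`.

```r
library(nlme)  # gls() and corARMA()

set.seed(1)
n <- 200
x <- rnorm(n)
# Simulated AR(1) errors; illustrative data only, not the post's series
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))
dat <- data.frame(y = 1 + 2 * x + e, x = x)

# Step 1: estimate the OLS regression
ols <- lm(y ~ x, data = dat)

# Step 2: fit the OLS residuals to an AR(p) by Yule-Walker; order chosen by AIC
ar_ols <- ar(as.numeric(resid(ols)), order.max = 3, method = "yule-walker")
p <- max(ar_ols$order, 1)  # fall back to AR(1) if AIC selects order 0

# Step 3: re-estimate by GLS (maximum likelihood) with an AR(p) error term
gls_fit <- gls(y ~ x, data = dat, correlation = corARMA(p = p), method = "ML")

# Step 4: fit the GLS residuals to an AR(p) of the same order
ar_gls <- ar(as.numeric(resid(gls_fit)), order.max = p, aic = FALSE,
             method = "yule-walker")
ar_gls$ar  # estimated AR coefficients of the error term
```

Capping the order keeps the maximum likelihood step in `gls()` well behaved on a short sample; with real data you would want to inspect the chosen `p` rather than accept it blindly.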

What have we done?  First we have to find out what the error term's autocorrelation process is. What order is p? To find this out we fit the OLS residuals to an AR(p) using the Yule-Walker method. Then we take the order p of our estimated error term and run a GLS regression with an AR(p) error term.  This gives us better estimates for our model.  In theory, when the errors are autocorrelated and the error structure is correctly specified, GLS estimators are asymptotically more efficient than OLS estimators. Notice that in every regression from the previous post, the GLS estimator with a twice-iterated AR(p) error term results in a lower standard deviation of the residuals. The model has therefore gained efficiency, which translates into tighter confidence intervals.  Additionally, by fitting the GLS residuals to an AR(p) we remove any autocorrelation (or structure) that may have been present in the residuals.
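One way to check whether an AR fit has actually removed the structure from a residual series is a Ljung-Box test before and after the fit. The sketch below uses a simulated AR(1) series as a stand-in for regression residuals; the test itself, the lag of 5 and the series are our own choices for illustration and were not part of the original analysis.

```r
set.seed(5)
n <- 200
# Simulated autocorrelated series standing in for regression residuals
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))

# Ljung-Box test on the raw series: H0 is no autocorrelation up to lag 5
before <- Box.test(e, lag = 5, type = "Ljung-Box")

# Fit an AR(1) by Yule-Walker and keep its innovations (the first one is NA)
ar_fit <- ar(e, order.max = 1, aic = FALSE, method = "yule-walker")
innov <- as.numeric(na.omit(ar_fit$resid))

# Same test on the innovations; fitdf = 1 accounts for the fitted AR coefficient
after <- Box.test(innov, lag = 5, type = "Ljung-Box", fitdf = 1)

before$p.value  # small: the raw series is autocorrelated
after$p.value   # compare with the value above
```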

Testing For Model Misspecification and Omitted Variable Bias

The Ramsey RESET test (Regression Specification Error Test) is designed to detect omitted variable bias and incorrect functional form. Rejection of H_{0} implies that the original model is inadequate and can be improved.  A failure to reject H_{0} means only that the test has not been able to detect any misspecification.

Unfortunately our models of short-term risk premia over both estimation periods reject the null hypothesis, which suggests that a better model is out there somewhere.  Correcting for this functional misspecification or omitted variable bias will not be pursued here, but we must keep in mind that our model can be improved upon and is thus not BLUE.

In R you can run the Ramsey RESET test on standard lm objects using the library lmtest:

> library(lmtest)

> resettest(srp1.lm)

RESET test

data:  srp1.lm
RESET = 9.7397, df1 = 2, df2 = 91, p-value = 0.0001469

For GLS objects, however, you'll need to do it manually, and that procedure will not be outlined here.  If you really want to know, please feel free to email or leave a comment below.
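To give a rough idea of what the test computes, here is the lm version done by hand in base R: add powers of the fitted values to the regression and F-test whether they belong there. The data are simulated with a quadratic term deliberately left out of the fitted model, and all names are hypothetical. This is a sketch of the lm case only, not the GLS procedure mentioned above.

```r
set.seed(42)
n <- 100
x <- runif(n, 0, 4)
# Simulated data: the true model has a quadratic term that we omit on purpose
y <- 1 + 2 * x + 0.5 * x^2 + rnorm(n)

# Restricted (possibly misspecified) model
restricted <- lm(y ~ x)
yhat <- fitted(restricted)

# Augmented model: add the squared and cubed fitted values
augmented <- lm(y ~ x + I(yhat^2) + I(yhat^3))

# RESET statistic: F-test that the two added terms are jointly zero
reset <- anova(restricted, augmented)
reset_F <- reset$F[2]
reset_p <- reset$`Pr(>F)`[2]
```

With two added powers the numerator degrees of freedom are 2, which matches the `df1 = 2` in the output above.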

In the original formulation of the model there existed an independent variable called CreditMarketSupport that was very similar to our FedBalance variable.  Both variables are percentages and share the same numerator while also having very similar denominators.  As a result we suffered from near-exact collinearity, as the correlation between these two variables was nearly one.

> cor(FedBalance1,CreditMarketSupport1)

0.9994248

With near-exact collinearity we were unable to obtain reliable least squares estimates of our β coefficients, and these variables behaved opposite to what we were expecting.  This runs against the multiple-regression counterpart of assumption SR5, which requires that the values of x_{ik} are not exact linear functions of the other explanatory variables.  To remedy this problem, we removed CreditMarketSupport from the models, which removes this obstacle to BLUE estimates.
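A quick way to see how much damage a correlation like this does is the variance inflation factor, 1 / (1 - R^2), where R^2 comes from regressing one explanatory variable on the others. The sketch below uses simulated stand-ins (`x1` and `x2` are hypothetical, built to share a numerator and have nearly equal denominators); it is not the post's data.

```r
set.seed(3)
n <- 100
# Hypothetical stand-ins: same numerator, nearly identical denominators
numerator <- runif(n, 1, 10)
denom_a <- runif(n, 95, 105)
denom_b <- denom_a + rnorm(n, sd = 0.5)
x1 <- 100 * numerator / denom_a
x2 <- 100 * numerator / denom_b

# Pairwise correlation, as in the cor() call above
r <- cor(x1, x2)

# Variance inflation factor for x1: regress it on the other regressor(s)
r2 <- summary(lm(x1 ~ x2))$r.squared
vif_x1 <- 1 / (1 - r2)
```

A large `vif_x1` means the variance of the coefficient on `x1` is inflated many times over relative to the case of uncorrelated regressors, which is why the estimates become unstable and can flip sign.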

Suspected Endogeneity

In our estimation of long-term risk premia over the first time period we suspect endogeneity in the cyclical variable Output Gap.  To remedy this situation we replace it with an instrumental variable, the percentage change in the S&P 500, and perform the Hausman Test, which is laid out as follows:

H_{0}: delta = 0 (no correlation between x_{i} and µ_{i})

H_{1}: delta ≠ 0 (correlation between x_{i} and µ_{i})

When we perform the Hausman Test using the S&P 500 as our instrumental variable, delta ≠ 0 and is statistically significant.  This means that our Output Gap variable is indeed endogenous and correlated with the error term.  If you want to learn more about the Hausman Test and how to perform it in R, please leave a comment or email me and I'll make sure to get the code over to you.  When we perform the Two-Stage Least Squares regression to correct for this, not a single term is significant.  This can reasonably be attributed to the problem of weak instruments: the correlation between the percentage change in the S&P 500 and the Output Gap is only 0.110954.  The Two-Stage Least Squares estimation is provided below.  We choose not to search for a better instrumental variable for the Output Gap; instead we will keep in mind that we have an endogenous variable when interpreting our coefficient estimates, which will end up being slightly biased.
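For readers who want a starting point, one common regression-based form of the Hausman test can be sketched in base R: regress the suspect variable on the instrument and the exogenous variables, then add the first-stage residuals to the original regression and test whether their coefficient delta is zero. The data below are simulated and every variable name is hypothetical; this may differ in detail from the code used for the results in this post.

```r
set.seed(7)
n <- 500
z <- rnorm(n)        # instrument (hypothetical)
w <- rnorm(n)        # exogenous regressor (hypothetical)
shock <- rnorm(n)    # shock shared by x and the error, making x endogenous
x <- 0.8 * z + shock + rnorm(n)
u <- shock + rnorm(n)
y <- 1 + 0.5 * x + 0.3 * w + u

# First stage: suspect variable on the instrument and exogenous regressors
first_stage <- lm(x ~ z + w)
v_hat <- resid(first_stage)

# Augmented regression: the coefficient on v_hat is delta
aug <- lm(y ~ x + w + v_hat)
delta_row <- coef(summary(aug))["v_hat", ]
delta_hat <- delta_row["Estimate"]
delta_p <- delta_row["Pr(>|t|)"]  # small p-value: reject H0, x is endogenous
```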

Below is how to perform a two-stage least squares regression in R when you're replacing an endogenous variable with an instrument. First you'll need to load the library sem into R. In the regression below, the first formula includes all the variables from the original model and the second lists all of our exogenous and instrumental variables; the only instrument in this case is the percentage change in the S&P 500.

> library(sem)

> tSLRP1<-tsls(lrp1~yc1+CP1+FF1+default1+Support1+ER1+FedGDP1+FedBalance1+govcredit1+ForeignDebt1+UGAP1+OGAP1,~ yc1+CP1+FF1+default1+Support1+ER1+FedGDP1+FedBalance1+govcredit1+ForeignDebt1+sp500ch+OGAP1 )

> summary(tSLRP1)

2SLS Estimates

Model Formula: lrp1 ~ yc1 + CP1 + FF1 + default1 + Support1 + ER1 + FedGDP1 +
FedBalance1 + govcredit1 + ForeignDebt1 + UGAP1 + OGAP1

Instruments: ~yc1 + CP1 + FF1 + default1 + Support1 + ER1 + FedGDP1 + FedBalance1 +
govcredit1 + ForeignDebt1 + sp500ch + OGAP1

Residuals:
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-9.030  -1.870   0.021   0.000   2.230   7.310

             Estimate Std. Error  t value Pr(>|t|)
(Intercept)  -5.28137   44.06906 -0.11984   0.9049
yc1          -1.48564   10.60827 -0.14005   0.8889
CP1          -0.01584    0.09206 -0.17204   0.8638
FF1           0.20998    2.43849  0.08611   0.9316
default1     -7.16622   65.35728 -0.10965   0.9129
Support1      6.39893   47.72244  0.13409   0.8936
ER1           4.56290   35.91837  0.12704   0.8992
FedGDP1       1.86392    9.16081  0.20347   0.8392
FedBalance1   0.73087   12.96474  0.05637   0.9552
govcredit1    0.17051    0.89452  0.19062   0.8492
ForeignDebt1 -0.22396    1.41749 -0.15799   0.8748
UGAP1         4.55897   35.33446  0.12902   0.8976
OGAP1         0.01331    0.09347  0.14235   0.8871

Residual standard error: 3.3664 on 93 degrees of freedom

Notice that our model now doesn't have any significant terms.  This is why we will choose to ignore the endogeneity of our Output Gap and probably Unemployment Gap variables.  Correcting for endogeneity does more harm than good in this case.
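To see why a weak instrument can wipe out significance like this, it helps to look at the first-stage F statistic and at how the 2SLS standard errors compare with OLS. The sketch below does 2SLS by hand in base R on simulated data with a deliberately weak instrument; the names and numbers are illustrative assumptions, not the post's variables.

```r
set.seed(11)
n <- 100
z <- rnorm(n)                      # instrument, only weakly related to x
shock <- rnorm(n)                  # makes x endogenous
x <- 0.1 * z + shock + rnorm(n)
y <- 1 + 0.5 * x + shock + rnorm(n)

# First stage: a small F statistic signals a weak instrument
first_stage <- lm(x ~ z)
first_F <- unname(summary(first_stage)$fstatistic["value"])

# Second stage: replace x with its first-stage fitted values
x_hat <- fitted(first_stage)
b <- coef(lm(y ~ x_hat))

# Correct 2SLS standard errors use residuals computed with the actual x
res <- y - (b[1] + b[2] * x)
sigma2 <- sum(res^2) / (n - 2)
Xhat <- cbind(1, x_hat)
se_2sls <- sqrt(diag(sigma2 * solve(crossprod(Xhat))))

# OLS standard errors for comparison
se_ols <- coef(summary(lm(y ~ x)))[, "Std. Error"]
```

Because the fitted values from a weak first stage barely vary, the slope's 2SLS standard error is never smaller than the OLS one and is often many times larger, which is the pattern in the table above.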

Results and Concluding Thoughts

As this paper hopefully shows, the Fed's actions did directly impact the easing of broader credit conditions in the financial markets.

Over our first estimation period, from 1971 to 1997, we find that the Fed's support of Depository Institutions as a percentage of savings and time deposits is positively related to the short-term risk premia. Specifically, we find that a 1 percentage point increase in Support leads to a 2.1 percent increase in short-term risk premia. This was as expected because Depository Institutions would only borrow from the Fed if no other options existed. We also find that a 1 percentage point increase in the federal funds rate leads to a .19 percent increase in short-term risk premia.  This is consistent with our original hypothesis, as an increased FF puts positive pressure on short-term rates like the 3-month commercial paper rate, thus resulting in a widened spread.  With respect to long-term risk premia, we find that a 1 percentage point increase in FF leads the long-term risk premia to decrease by .66 percentage points and a 1 percent increase in the federal funds rate leads to a .07 decrease in the long-term risk premia.

Over our second estimation period the composition of the Fed's balance sheet is considered.  We see that the CCLF did decrease short-term risk premiums, with every one percent increase translating to a decrease in short-term risk premia of .1145 percentage points.  Another important result is that Fed purchases of Agency Debt and Agency MBS did have a significant, although almost negligible, effect on short-term risk premia.  One surprising result in the estimation of the long-term risk premia is that our Fed balance sheet size variable has a sign opposite to what we expected, and its significance is particularly surprising.  This may be explained by the fact that this period is largely characterized by both a shrinking balance sheet and narrowing risk premia, as investments were considered relatively safe.  However, towards the end of the period risk premiums shot up, and only afterwards did the size of the balance sheet also increase; thus the sample period may place too much weight on the beginning of the period and not enough on the end. This is a reasonable assumption given that our estimate of the balance sheet size showed a large negative impact on risk premia over our longer estimation period.

Please people keep dancing and we’ll delve further into some additional econometrics tests next week.