“Thanks for this blog post. I enjoyed reading it. I’m wondering how straightforward you think this would be to extend orthogonal regression to the case of two independent variables? Assume both independent variables are meaningfully measured in the same units.”
What was I referring to, exactly?
Well, just recall how we define the Principal Components of a multivariate set of data. Suppose that the data are in the form of an (n x p) matrix, X. There are n observations, and p variables. An orthogonal transformation is applied to X. This results in r (≤ p) new variables that are linearly uncorrelated. These are the principal components (PC’s) of the data, and they are ordered as follows. The first PC accounts for the largest possible share of the variability in the original data. The second PC accounts for the maximum amount of the remaining variability in the data, subject to the constraint that it is uncorrelated with (i.e., orthogonal to) the first PC.
Note how orthogonality has crept into the story!
We then continue – the third PC accounts for the maximum amount of the remaining variability in the data, subject to the constraint that it is orthogonal to both the first and second PC’s, and so on.
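To make the construction concrete, here's a minimal sketch in Python with NumPy. It's my own illustration, not code from any of the posts linked here. The function name principal_components and the simulated data are made up for the example, and I've assumed that the columns of X are centred (but not rescaled) before the transformation is applied.

```python
import numpy as np

def principal_components(X):
    """Return (scores, loadings, variance shares) for an (n x p) data matrix X."""
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                               # the PCs, in order of decreasing variance
    variances = s**2 / (X.shape[0] - 1)          # sample variance of each PC
    return scores, Vt.T, variances / variances.sum()

# Simulated data: n = 200 observations on p = 3 correlated variables
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.1],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mix

scores, loadings, ratio = principal_components(X)

# Off-diagonal covariances of the PCs are zero, up to rounding error
print(np.round(np.cov(scores, rowvar=False), 6))
# Shares of total variability, largest first
print(ratio)
```

The columns of loadings are the orthogonal directions, and ratio reports the share of the total variability accounted for by each PC in turn.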
You’ll find examples of PC analysis being used in a statistically descriptive way in some earlier posts of mine – e.g., here and here.
We can use (some of) the PC’s of the regressor data as explanatory variables in a regression model. A useful reference for this can be found here. Note that, by construction, these transformed explanatory variables are mutually orthogonal, so there is zero multicollinearity among them.
So, in the multivariate case, orthogonal regression is just least squares regression using a sub-set of the principal components of the original regressor matrix as the explanatory variables. We also sometimes call it Total Least Squares.
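Here's a rough sketch of least squares on a sub-set of the PC's, again in Python with NumPy. The helper pcr_fit and the simulated data are hypothetical, and centring the regressors and the dependent variable (so that the intercept can be recovered afterwards) is my own choice for the illustration.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Least squares of y on the first k PCs of X.
    Returns the intercept and the implied coefficients on the original regressors."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                               # loadings of the first k PCs (p x k)
    Z = Xc @ V_k                                 # PC scores used as regressors (n x k)
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = V_k @ gamma                           # map back to the original variables
    return y_mean - x_mean @ beta, beta

# Simulated data with two nearly collinear regressors
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x2 + 0.5 * x3 + rng.normal(size=n)

b0_full, b_full = pcr_fit(X, y, k=3)             # all three PCs
b0_pcr, b_pcr = pcr_fit(X, y, k=2)               # drop the last PC
ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)

print(np.r_[b0_full, b_full])                    # same as OLS on the original regressors
print(ols)
print(np.r_[b0_pcr, b_pcr])                      # differs once a PC is dropped
```

When all of the PC's are retained, the fit is just OLS on the original regressors in a different coordinate system. It's only when some PC's are dropped that the estimates change.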
In this earlier post I talked about using Principal Components Regression (PCR) in the context of simultaneous equations models. The problem there was that we can’t construct the 2SLS estimator if the sample size is smaller than the total number of predetermined variables in the entire system. (This used to be referred to as the “under-sized sample” problem.) One solution was to use a few of the principal components of the matrix of data on the predetermined variables, instead of all of the latter variables, at the first stage of 2SLS. (Usually, just the first few principal components will capture almost all of the variability in the original data.)
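The idea can be sketched as follows, again in Python with NumPy. This is a toy illustration under my own assumptions, not the code behind the earlier post: the data are simulated, there's a single endogenous regressor and no included exogenous variables apart from the intercept, and the choice of four PC's is arbitrary. Only point estimates are computed. The residuals from the second-stage regression should not be used as they stand to get standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, k = 30, 40, 4                              # fewer observations than predetermined variables

# Predetermined variables driven by a few common factors (simulated)
common = rng.normal(size=(n, 3))
X = common @ rng.normal(size=(3, K)) + 0.1 * rng.normal(size=(n, K))

# One endogenous regressor, y2, correlated with the structural error, u
v = rng.normal(size=n)
y2 = common @ np.array([1.0, -1.0, 0.5]) + v
u = 0.8 * v + rng.normal(size=n)
y1 = 2.0 + 1.5 * y2 + u

# X'X is singular here, so replace X by its first k PCs (plus a constant)
Xc = X - X.mean(axis=0)
U, s, _ = np.linalg.svd(Xc, full_matrices=False)
Z = np.column_stack([np.ones(n), U[:, :k] * s[:k]])

# First stage: regress y2 on the constant and the k PCs
pi_hat, *_ = np.linalg.lstsq(Z, y2, rcond=None)
y2_hat = Z @ pi_hat

# Second stage: regress y1 on the constant and the first-stage fitted values
W = np.column_stack([np.ones(n), y2_hat])
delta, *_ = np.linalg.lstsq(W, y1, rcond=None)
print(delta)                                     # estimates of the intercept and the coefficient on y2
```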
There are some useful discussions of this that you might want to refer to. For instance, Vincent Zoonekynd has a nice illustration here. I particularly recommend two other pieces that discuss PCR using R – this post, “Principal components regression in R, an operational tutorial”, by John Mount, on the Revolutions blog; and this post, “Performing principal components regression (PCR) in R”, by Michy Alice, on the Quantide site.
PCR also gets a brief mention in this earlier post of mine – see the discussion of the last paper mentioned in that post.