Iterative OLS Regression Using Gauss-Seidel

October 24, 2014
By

(This article was first published on Stable Markets » R, and kindly contributed to R-bloggers)

Figure shows 10 iterations of Gauss-Seidel's OLS estimates. Estimates get successively closer to the true line, shown in green.
Figure shows 10 iterations of Gauss-Seidel’s OLS estimates. Estimates get successively closer to the true line, shown in green.

I just finished covering a few numerical techniques for solving systems of equations, which can be applied to find best-fit lines through a give set of data points.

The four points \{(0,0), (1,3), (2,3), (5,6)\} are arranged into an inconsistent system of four equations and two unknowns:

b+a(0) = 0 \\  b+a(1) = 3 \\  b+a(2) = 3 \\  b+a(5) = 6

The system can be represented in matrix form:

\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}  \begin{bmatrix} b \\ a \end{bmatrix}  =  \begin{bmatrix} 1 \\ 3 \\ 3 \\ 6 \end{bmatrix}

The least-squares solution vector can be found by solving the normal equations

A^TA\vec{x_*}=A^T\vec{b}

Solving for A^TA and A^T \vec{b} yields the following matrix system:

\begin{bmatrix} 4 & 8 \\ 8 & 30 \end{bmatrix}  \begin{bmatrix} b \\ a \end{bmatrix}  =  \begin{bmatrix} 12 \\ 39 \end{bmatrix}

The system can be solved using Gauss-Seidel method.

Below, I run 10 iterations of Gauss-Seidel (visualized in the figure above). The estimated line gets successively closer to the true solution in green. The estimates are shown in blue – each iteration is shown in a darker shade than the next (see highlighted lines).

library(MASS) # package needed for ginv() function 

A=cbind(c(4,8),c(8,30)) # coefficient matrix
b<-t(t(c(12,39))) # b-vector
x0<-t(t(c(0,0)))  # starting vector
iter<-10          # number of Gauss-Seidel iterations to run

L<-lower.tri(A)*A  # lower triang. A
U<-upper.tri(A)*A  # upper triang. A
D<-diag(diag(A))   # diag of A

# plot points 
xc<-c(0,1,2,5)
yc<-c(0,3,3,6)
plot(xc,yc,
     col='red',
     xlab="X-Values",
     ylab="Y-Values",
     bty='n')
title(main="OLS Estimates")
legend("bottomright",
       c("Data","Estimates","True Line"),
       lty=c(0,1,1),
       pch=c(1,NA,NA),
       col=c("red","blue","green"),
       bty='n')

# create color palette - lines will get darker with each iter
pal<-colorRampPalette(c("#f2f2f2", "Blue"))
colors<-pal(iter) # creates color palette of length(iter)

# plot true line
abline(a=1.0714,b=.8571,col='green')

n<-1
while(n<=iter){
  print(x0)
  
  # Gauss-Seidel formula
  x1<-(ginv((L+D)))%*%((-U%*%x0)+b)
  x0<-x1
  n=n+1
  
  # plot estimated line  
  abline(a=as.numeric(x0[2,1]), # slope of estimated line
         b=as.numeric(x0[1,1]), # y-intercept of estimated line
         col=colors[n]) # pick nth color in palette
}

To leave a comment for the author, please follow the link and comment on their blog: Stable Markets » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)