MAT8886 reducing dimension using factors

February 16, 2012

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

First, let us recall a standard result from linear algebra: “real symmetric matrices are diagonalizable by orthogonal matrices”. Thus, any variance-covariance matrix http://freakonometrics.blog.free.fr/public/perso5/ellex32.gif can be written

http://freakonometrics.blog.free.fr/public/perso5/ACP10.gif

since a variance-covariance matrix is also positive semi-definite (so its eigenvalues are nonnegative).
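As a quick numerical check of this result, here is a minimal sketch using the base R function eigen(). The matrix Sigma below is a hypothetical 3x3 covariance matrix made up for illustration, and the names P and Delta are assumed notation for the orthogonal matrix and the diagonal matrix of eigenvalues.

```r
# Hypothetical covariance matrix (values chosen for illustration only)
Sigma <- matrix(c(4, 2, 1,
                  2, 3, 1,
                  1, 1, 2), nrow = 3, byrow = TRUE)

# Spectral decomposition; eigen() returns eigenvalues in decreasing order
E <- eigen(Sigma, symmetric = TRUE)
P <- E$vectors           # columns are orthonormal eigenvectors
Delta <- diag(E$values)  # diagonal matrix of eigenvalues

# P is orthogonal: t(P) %*% P should be the identity matrix
orthogonality_error <- max(abs(crossprod(P) - diag(3)))

# Sigma should be recovered as P %*% Delta %*% t(P)
reconstruction_error <- max(abs(P %*% Delta %*% t(P) - Sigma))

print(E$values)
print(c(orthogonality_error, reconstruction_error))
```

Both errors should be of the order of machine precision, and all eigenvalues should be positive for this particular matrix.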
In the context of Gaussian random vectors (or more generally elliptical distributions), we can write

http://freakonometrics.blog.free.fr/public/perso5/ACP11.gif
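To illustrate the Gaussian case, the following sketch simulates a random vector from independent standard normal components. It assumes the usual representation X = mu + P Delta^{1/2} Z (our reading of the formula, with the same assumed notation P and Delta as in the spectral decomposition), and the matrix Sigma is again a hypothetical example.

```r
set.seed(1)

# Hypothetical mean vector and covariance matrix (illustration only)
mu <- c(0, 0, 0)
Sigma <- matrix(c(4, 2, 1,
                  2, 3, 1,
                  1, 1, 2), nrow = 3, byrow = TRUE)

E <- eigen(Sigma, symmetric = TRUE)
A <- E$vectors %*% diag(sqrt(E$values))  # A = P Delta^{1/2}, so A %*% t(A) = Sigma

n <- 1e5
Z <- matrix(rnorm(3 * n), nrow = 3)  # each column is one draw of Z ~ N(0, I)
X <- t(mu + A %*% Z)                 # mu is recycled down each column; X is n x 3

# The empirical covariance of X should be close to Sigma
max_cov_error <- max(abs(cov(X) - Sigma))
print(round(cov(X), 2))
```

The empirical covariance matrix only matches Sigma up to sampling error, which shrinks as n grows.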


The idea in factor models is that a simplified version of the diagonal matrix can be considered

http://freakonometrics.blog.free.fr/public/perso5/ACP12.gif

where

http://freakonometrics.blog.free.fr/public/perso5/ACP14.gif


assuming that eigenvalues were sorted http://freakonometrics.blog.free.fr/public/perso5/ACO17.gif.
The idea is to write the expression above

http://freakonometrics.blog.free.fr/public/perso5/ACP15.gif


where the http://freakonometrics.blog.free.fr/public/perso5/ACP16.gif largest eigenvalues are considered. This can also be written

http://freakonometrics.blog.free.fr/public/perso5/ACP17.gif


where the so-called factors http://freakonometrics.blog.free.fr/public/perso5/ACP20.gif are assumed to be orthogonal, i.e. uncorrelated. Thus, components are driven by those factors, and the remaining term http://freakonometrics.blog.free.fr/public/perso5/ACP21.gif is called (in finance) the idiosyncratic component.
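For readers who cannot see the formulas, the chain of ideas can be sketched as follows. The notation is an assumption on our side, not necessarily the one used in the formulas above: P is the orthogonal matrix with columns p_j, Delta the diagonal matrix of sorted eigenvalues lambda_j, Z a standard Gaussian vector, d the dimension and k the number of factors kept.

```latex
\begin{align*}
\Sigma &= P \Delta P^{\top}, \qquad
  \Delta = \operatorname{diag}(\lambda_1, \dots, \lambda_d), \qquad
  \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0, \\
X &= \mu + P \Delta^{1/2} Z
   = \mu + \sum_{j=1}^{d} \sqrt{\lambda_j}\, p_j Z_j,
   \qquad Z \sim \mathcal{N}(0, I_d), \\
X &= \mu + \underbrace{\sum_{j=1}^{k} \sqrt{\lambda_j}\, p_j Z_j}_{\text{driven by the } k \text{ factors}}
       + \underbrace{\sum_{j=k+1}^{d} \sqrt{\lambda_j}\, p_j Z_j}_{\varepsilon}
   = \mu + \sum_{j=1}^{k} a_j F_j + \varepsilon,
\end{align*}
```

with a_j = \sqrt{\lambda_j} p_j and F_j = Z_j. In this sketch the factors F_1, ..., F_k are uncorrelated with each other and with the remainder, whose covariance matrix is \sum_{j>k} \lambda_j p_j p_j^{\top}; it is small when the discarded eigenvalues are small.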
This technique is extremely popular in finance, to model returns of multiple stocks, from the capital asset pricing model (CAPM, Sharpe (1964) or Mossin (1966)) – with one factor (the so-called market) – to the arbitrage pricing theory (APT, Ross (1976)). For instance, with the following code, we can extract prices of 35 French stocks,
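# Names and ticker codes of the stocks
code=read.table(
"http://perso.univ-rennes1.fr/arthur.charpentier/code-CAC.csv",
sep=";",header=TRUE)
code$Nom=as.character(code$Nom)
code$Code=as.character(code$Code)
head(code)
library(tseries)
# Drop row 8 of the table before downloading
code=code[-8,]
# Closing prices of the first stock
i=1
X<-get.hist.quote(code$Code[i])
Xc=X$Close
# Merge in the closing prices of the remaining stocks, aligned by date
for(i in 2:nrow(code)){
x<-get.hist.quote(code$Code[i])
xc=x$Close
Xc=merge(Xc,xc)}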

It is natural to consider log-returns, and their correlations,

# Log-returns, one column per stock
R=diff(log(Xc))
colnames(R)=code$Code
# Pairwise correlations, using for each pair only the dates
# where both returns are observed
correlation=matrix(NA,ncol(R),ncol(R))
colnames(correlation)=code$Code
rownames(correlation)=code$Code
for(i in 1:ncol(R)){
for(j in 1:ncol(R)){
I=(is.na(R[,i])==FALSE)&(is.na(R[,j])==FALSE)
correlation[i,j]=cor(R[I,i],R[I,j])
}}
# Correlogram of the correlation matrix
library(corrgram)
corrgram(correlation, order=NULL,
lower.panel=panel.shade,
upper.panel=NULL, text.panel=panel.txt, main="")
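The double loop above computes each correlation on the dates where both series are observed. As a side note, base R offers the same computation through the use argument of cor(). The sketch below checks this on simulated data; the matrix R_demo and its column names are made up for illustration and stand in for the matrix of log-returns.

```r
set.seed(42)

# Hypothetical "returns" with some missing values (illustration only)
R_demo <- matrix(rnorm(200 * 4), ncol = 4,
                 dimnames = list(NULL, c("A", "B", "C", "D")))
R_demo[sample(length(R_demo), 40)] <- NA

# Same double loop as in the text, on the simulated matrix
d <- ncol(R_demo)
loop_cor <- matrix(NA, d, d,
                   dimnames = list(colnames(R_demo), colnames(R_demo)))
for (i in 1:d) {
  for (j in 1:d) {
    I <- !is.na(R_demo[, i]) & !is.na(R_demo[, j])
    loop_cor[i, j] <- cor(R_demo[I, i], R_demo[I, j])
  }
}

# Built-in pairwise deletion
builtin_cor <- cor(R_demo, use = "pairwise.complete.obs")

max_diff <- max(abs(loop_cor - builtin_cor))
print(max_diff)
```

One caveat with pairwise deletion: since each entry is computed on a different subset of dates, the resulting matrix is not guaranteed to be positive semi-definite, so small negative eigenvalues can appear.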

In that case, there is one extremely large eigenvalue, and then all the others are extremely small,

L=eigen(correlation)
plot(1:ncol(R),L$values,type="b",col="red")

This suggests considering a factor model with http://freakonometrics.blog.free.fr/public/perso5/ACP16.gif equal to one.
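To quantify how dominant the first eigenvalue is, one can look at the share of the total variance carried by each eigenvalue. The sketch below does so on simulated data, not on the stock returns: the one-factor structure, the loadings beta and the noise level are all assumptions chosen for illustration.

```r
set.seed(123)

n <- 2000                       # number of dates
d <- 10                         # number of simulated "stocks"
market <- rnorm(n)              # a single common factor
beta <- runif(d, 0.8, 1.2)      # hypothetical loadings on the factor
eps <- matrix(rnorm(n * d, sd = 0.5), n, d)  # idiosyncratic noise
returns <- outer(market, beta) + eps         # n x d one-factor returns

L_demo <- eigen(cor(returns), symmetric = TRUE)

# Eigenvalues of a correlation matrix sum to its dimension,
# so dividing by the sum gives the share explained by each one
explained <- L_demo$values / sum(L_demo$values)
print(round(explained, 3))

plot(1:d, L_demo$values, type = "b", col = "red",
     xlab = "rank", ylab = "eigenvalue")
```

With a single common factor, the first share is large and the remaining ones are all small, which is the pattern described above.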

In a Gaussian (or elliptical) world, building factor models is extremely close to the theory of principal component analysis, where we seek axes, or planes, with the “best” projection of scatterplots.
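The link with principal component analysis can be checked numerically. In the sketch below, on simulated data (the mixing matrix is an arbitrary choice for illustration), the variances of the principal components returned by prcomp() on standardized data coincide with the eigenvalues of the correlation matrix, and the principal axes coincide with the eigenvectors up to sign.

```r
set.seed(2012)

# Hypothetical correlated data: independent normals mixed by an arbitrary matrix
mix <- matrix(c(1, 0.5, 0.2,
                0, 1,   0.3,
                0, 0,   1), nrow = 3, byrow = TRUE)
x <- matrix(rnorm(500 * 3), ncol = 3) %*% mix

# PCA on centered and scaled data
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Spectral decomposition of the correlation matrix
E <- eigen(cor(x), symmetric = TRUE)

# Variances of the principal components versus eigenvalues
variance_gap <- max(abs(pca$sdev^2 - E$values))

# Principal axes versus eigenvectors (compared in absolute value,
# since each direction is only defined up to sign)
direction_gap <- max(abs(abs(pca$rotation) - abs(E$vectors)))

print(c(variance_gap, direction_gap))
```

Both gaps should be numerically zero, provided the eigenvalues are distinct, which is the case here.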
