(This article was first published on **Freakonometrics » R-english**, and kindly contributed to R-bloggers)

Last week, Eric Chemi and Ariana Giorgi published an interesting article, “The Pay-for-Performance Myth”:

With all the public chatter about exorbitant executive compensation and income inequality, it’s useful to look at the relationship between chief executive officer pay and corporate performance. Typically, when the subject of their big pay packages arises, CEOs—usually through their spokespeople—say they are paid for performance. Does data back that up?

An analysis of compensation data publicly released by Equilar shows little correlation between CEO pay and company performance. Equilar ranked the salaries of 200 highly paid CEOs. When compared to metrics such as revenue, profitability, and stock return, the scattering of data looks pretty random, as though performance doesn’t matter. The comparison makes it look as if there is zero relationship between pay and performance.

In the article, they produce a copula-type plot (since only ranks are considered). Ariana kindly sent me the dataset (the one used in The Pay at the Top) so that I could play with it:

    > base=read.table("ceo.csv",sep=";",header=TRUE)

Here I normalize the ranks (dividing by n+1, where n is the size of the dataset) to get a uniform distribution on the unit interval, instead of working with raw ranks, i.e. integers. If we remove that scaling factor, the scatterplot is the same as the one shown in The Pay-for-Performance Myth.

    > n=nrow(base)
    > U=rank(base[,1])/(n+1)
    > V=rank(base[,2])/(n+1)
    > plot(U,V,xlab="Rank CEO Pay",
    +      ylab="Rank Stock Return")

This is the copula type representation.
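The effect of the rescaling can be checked on a toy sample. The sketch below uses simulated values (not the CEO data; the variable names are only illustrative) and shows that dividing the ranks by n+1 gives points strictly inside the unit interval, on an evenly spaced grid, whereas dividing by n would put the largest observation exactly at 1.

```r
# Toy illustration on simulated data (not the CEO dataset)
set.seed(123)
x <- rexp(10)                 # any continuous sample will do
n <- length(x)

u <- rank(x) / (n + 1)        # pseudo-observations, strictly inside (0, 1)
range(u)                      # 1/(n+1) and n/(n+1)
sort(u)                       # the grid k/(n+1), k = 1, ..., n

u.alt <- rank(x) / n          # alternative scaling: the maximum sits at 1
max(u.alt)
```

Keeping the points away from 0 and 1 matters as soon as the pseudo-observations are pushed through a quantile function, which is infinite at the boundaries.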

If we visualize the density of the copula (using the algorithm described in the joint paper with Gery and Davy), we get either

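    > library("copula")
    > library("ks")
    > library("MASS")
    > library("locfit")
    > # probtranscopkde() is our own function (see the joint paper); it is not
    > # part of the packages above and has to be loaded separately.
    > # UVs is not defined earlier in this post; it is assumed here to be
    > # the matrix of normalized ranks.
    > UVs=cbind(U,V)
    > n.res=32
    > ctilde1=probtranscopkde(UVs,p=1,
    +   u.out=seq(1/(2*n.res+1),1-1/(2*n.res+1),
    +   length=n.res),plots=TRUE)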

or

    > ctilde2=probtranscopkde(UVs,p=2,
    +   u.out=seq(1/(2*n.res+1),1-1/(2*n.res+1),
    +   length=n.res),plots=TRUE)

(depending on the parameter p used in our function). It looks like we can consider that the underlying copula is the independence copula: both nonparametric estimates of the copula density are relatively flat, roughly constant and equal to one.
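For readers who do not have `probtranscopkde` at hand, here is a rough stand-in. The function name suggests a probit transformation, so the sketch assumes that idea in its simplest form: move the pseudo-observations to the Gaussian scale, apply a plain bivariate kernel estimator (`MASS::kde2d`), and transform the density back. It is not the estimator of the paper, it runs on simulated independent data rather than the CEO data, and the grid size and limits are arbitrary choices.

```r
library(MASS)   # for kde2d

# Simulated independent pseudo-observations (not the CEO data)
set.seed(1)
n <- 200
U <- rank(runif(n)) / (n + 1)
V <- rank(runif(n)) / (n + 1)

# Step 1: probit transformation to the Gaussian scale
Sx <- qnorm(U)
Sy <- qnorm(V)

# Step 2: ordinary kernel density estimate on a grid in the Gaussian scale
g <- 25
dens <- kde2d(Sx, Sy, n = g, lims = c(-2, 2, -2, 2))

# Step 3: back-transform, c(u,v) = f(qnorm(u), qnorm(v)) / (dnorm(qnorm(u)) * dnorm(qnorm(v)))
c.hat <- dens$z / outer(dnorm(dens$x), dnorm(dens$y))
u.grid <- pnorm(dens$x)        # the same grid, expressed on the unit interval

image(u.grid, u.grid, c.hat, xlab = "u", ylab = "v")
contour(u.grid, u.grid, c.hat, add = TRUE)
```

With independent data the estimated surface should hover around one, which is the behaviour described above for the CEO data.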

Since we work with copulas, a lot of things could be done. For instance, we can plot Kendall’s function

    > X=cbind(U,V)
    > i=rep(1:n,each=n)
    > j=rep(1:n,n)
    > S=((X[i,1]>X[j,1])&(X[i,2]>X[j,2]))
    > Z=tapply(S,i,sum)/(n-1)
    > plot(sort(Z),(1:n)/n,type="s",col="blue",lwd=2)

The line here is very close to the one we should have with independent variables (the red line). Actually, if we use Monte Carlo simulations to visualize a 90% pointwise confidence band (for a sample of size 200), it seems reasonable to assume that both variables are independent.
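The reference curve under independence has a closed form: for the independence copula, Kendall's function is K(t) = t - t log(t). The sketch below, on simulated independent data (not the CEO data, and with illustrative variable names), overlays that curve in red on the empirical Kendall function, computed in the same way as above but with `outer` instead of index vectors.

```r
# Simulated independent sample (not the CEO data)
set.seed(1)
n <- 500
U <- runif(n)
V <- runif(n)

# S[i, j] is TRUE when point i lies strictly below and to the left of point j
S <- outer(U, U, "<") & outer(V, V, "<")
Z <- colSums(S) / (n - 1)      # share of the sample dominated by each point

# Kendall function of the independence copula: K(t) = t - t * log(t)
K.indep <- function(t) ifelse(t > 0, t - t * log(t), 0)

plot(sort(Z), (1:n) / n, type = "s", col = "blue", lwd = 2,
     xlab = "t", ylab = "K(t)")
curve(K.indep(x), from = 0, to = 1, add = TRUE, col = "red", lwd = 2)

# Largest gap between the empirical and theoretical curves on a grid
t.grid <- seq(0.01, 0.99, by = 0.01)
max.gap <- max(abs(ecdf(Z)(t.grid) - K.indep(t.grid)))
max.gap
```

A small gap is what independence predicts; how small is "small enough" for a given sample size is exactly what the simulated band is for.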

    > nsim=100   # number of simulated samples (assumed; not given in the original code)
    > M=NULL
    > for(s in 1:nsim){
    +   set.seed(s)
    +   # independent ranks, stored under new names so that U and V are not overwritten
    +   Us=1:n
    +   Vs=sample(1:n,size=n)
    +   Xs=cbind(Us,Vs)
    +   i=rep(1:n,each=n)
    +   j=rep(1:n,n)
    +   S=((Xs[i,1]>Xs[j,1])&(Xs[i,2]>Xs[j,2]))
    +   Z=tapply(S,i,sum)/(n-1)
    +   lines(sort(Z),(1:n)/n,type="s",col="light blue")
    +   M=cbind(M,sort(Z))
    + }
    > Q1=apply(M,1,function(x) quantile(x,.05))
    > Q2=apply(M,1,function(x) quantile(x,.95))
    > lines(Q1,(1:n)/n,col="blue")
    > lines(Q2,(1:n)/n,col="blue")

This can be confirmed by more formal tests, one based on the (rank) correlation,

    > cor.test(U,V)
            Pearson's product-moment correlation
    data:  U and V
    t = 1.1178, df = 198, p-value = 0.265
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     -0.06021553  0.21555981
    sample estimates:
           cor
    0.07918704
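Although the output is labelled "Pearson", the test is applied to normalized ranks, and Pearson's correlation computed on ranks is Spearman's rank correlation. The sketch below checks this on simulated data (not the CEO data) and shows the built-in rank-based alternatives in `cor.test`.

```r
# Simulated independent sample (not the CEO data)
set.seed(42)
n <- 200
x <- rlnorm(n)
y <- rlnorm(n)
U <- rank(x) / (n + 1)
V <- rank(y) / (n + 1)

# Pearson's correlation on (rescaled) ranks is Spearman's rho on the raw data
r.pearson.on.ranks <- cor(U, V)
r.spearman <- cor(x, y, method = "spearman")
all.equal(r.pearson.on.ranks, r.spearman)

# Rank-based tests available directly in cor.test
cor.test(x, y, method = "spearman")
cor.test(x, y, method = "kendall")
```

The estimates coincide, but the p-values need not be identical, since `cor.test` does not use the same reference distribution for the Pearson and Spearman versions.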

and one based on a contingency table, obtained by discretizing our variables (the chi-square test)

    > Uc=cut(U,seq(0,1,by=.1))
    > Vc=cut(V,seq(0,1,by=.1))
    > chisq.test(Uc,Vc)
            Pearson's Chi-squared test
    data:  Uc and Vc
    X-squared = 90, df = 81, p-value = 0.2313
    Warning message:
    In chisq.test(Uc, Vc) : Chi-squared approximation may be incorrect
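The warning comes from the small expected counts: with 200 observations spread over a 10 x 10 table, each cell expects only 2 observations under independence. Two standard ways around this are a Monte Carlo p-value (`simulate.p.value=TRUE`) or a coarser grid. The sketch below illustrates both on simulated independent data (not the CEO data); the number of replicates and the choice of quartiles are arbitrary.

```r
# Simulated independent pseudo-observations (not the CEO data)
set.seed(2014)
n <- 200
U <- rank(runif(n)) / (n + 1)
V <- rank(runif(n)) / (n + 1)

# Deciles: 20 observations per row and per column, so 20 * 20 / 200 = 2 per cell
Uc <- cut(U, seq(0, 1, by = 0.1))
Vc <- cut(V, seq(0, 1, by = 0.1))
expected <- suppressWarnings(chisq.test(Uc, Vc))$expected
range(expected)

# Option 1: Monte Carlo p-value instead of the chi-square approximation
res.mc <- chisq.test(Uc, Vc, simulate.p.value = TRUE, B = 2000)
res.mc$p.value

# Option 2: coarser grid (quartiles), with 12.5 expected observations per cell
Uq <- cut(U, seq(0, 1, by = 0.25))
Vq <- cut(V, seq(0, 1, by = 0.25))
res.q <- chisq.test(Uq, Vq)
res.q$expected[1, 1]
```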

It is also possible to look at various functions to get a better understanding of what is happening in the four corners (inspired by what was done in tails of copulas)

    > z=seq(0,1,by=.001)
    > L00=L10=L01=L11=rep(NA,length(z))
    > for(i in 1:length(z)){
    +   L00[i]=sum((U<=z[i])&(V<=z[i]))/sum(U<=z[i])
    +   L10[i]=sum((U>1-z[i])&(V<=z[i]))/sum(U<=z[i])
    +   L01[i]=sum((U<=z[i])&(V>1-z[i]))/sum(U<=z[i])
    +   L11[i]=sum((U>1-z[i])&(V>1-z[i]))/sum(U<=z[i])
    + }
    > par(mfrow=c(2,2))
    > plot(z,L01,type="l",xlab="",ylab="")
    > segments(0,0,1,1,lty=2,col="red")
    > plot(z,L11,type="l",xlab="",ylab="")
    > segments(0,0,1,1,lty=2,col="red")
    > plot(z,L00,type="l",xlab="",ylab="")
    > segments(0,0,1,1,lty=2,col="red")
    > plot(z,L10,type="l",xlab="",ylab="")
    > segments(0,0,1,1,lty=2,col="red")

The four corners are, respectively: low pay, high return (top left); high pay, high return (top right); low pay, low return (bottom left); high pay, low return (bottom right).

Again, the red line is the independence copula. On those graphs, we can see that the null correlation holds not only *on average* but also in the corners: very low pay (relative to other CEOs) is not particularly associated with low stock returns, and very high pay is not associated with very high returns.
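To see why the diagonal is the natural benchmark, the bottom-left curve can be written in terms of the copula C of the pair (U,V). This short derivation is added for illustration, with L00 denoting the population version of the quantity computed in the code:

$$
L_{00}(z) = P(V \le z \mid U \le z) = \frac{C(z,z)}{z},
\qquad\text{so with } C(u,v) = uv:\quad L_{00}(z) = \frac{z^2}{z} = z .
$$

The same argument applies in the other three corners, for instance P(V > 1-z | U > 1-z) = z under independence, which is why all four panels share the same red diagonal.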

So here again, we can confirm that this “*pay for performance*” principle is a myth.
