Finishing football postings

November 4, 2012
(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

For now, this is the last post about these football data. The series started in August; it is now November. To finish up, here is the model as it should have been last week.

Model

As most of the approach was described last week, only the model as it went into JAGS is shown here.
Explanation: each team has a total strength (TStr) and an attack/defense balance (AD). These combine into two numbers: defense strength (DStr) and attack strength (AStr). The attack strength of one team against the defense strength of the other determines the expected number of goals. Logically each team has one TStr, AD, DStr and AStr. In the model this is not so; AStr and DStr are calculated for each game.
The intention was to build in strategy: if a team plays a much stronger or weaker opponent, it might shift between attack and defense strength. Unfortunately, that did not work, either because it does not happen or because my formulation did not work out well.
A second possibility in this framework is to incorporate statements like: team X never wins when they play at time Y. That could translate into a change in TStr depending on day and time. Maybe later.
fbmodel1 <- function() {
  for (i in 1:N) {
    # Goals scored by each side follow a Poisson distribution;
    # the log rate combines home advantage, the attacker's attack
    # strength and the opponent's defense strength
    HomeMadeGoals[i] ~ dpois(HS[i])
    OutMadeGoals[i]  ~ dpois(OS[i])
    log(HS[i]) <- Home + DstrO[i] + AstrH[i]
    log(OS[i]) <-        DstrH[i] + AstrO[i]
    # Per-game attack/defense strengths derived from TStr and AD
    AstrH[i] <- TStr[HomeClub[i]] + AD[HomeClub[i]]
    DstrH[i] <- TStr[HomeClub[i]] - AD[HomeClub[i]]
    AstrO[i] <- TStr[OutClub[i]]  + AD[OutClub[i]]
    DstrO[i] <- TStr[OutClub[i]]  - AD[OutClub[i]]
  }
  # First club fixed at 0 to anchor the scale
  TStr[1] <- 0
  AD[1]   <- 0
  # Three-component normal mixture priors (needs the JAGS 'mix' module)
  for (i in 2:nclub) {
    TStr[i] ~ dnormmix(MT, tauT1, EtaT)
    AD[i]   ~ dnormmix(MAD, tauAD1, EtaAD)
  }
  for (i in 1:3) {
    MT[i]     ~ dnorm(0, .01)
    MAD[i]    ~ dnorm(0, .01)
    tauT1[i]  <- tauT
    tauAD1[i] <- tauAD
    eee[i]    <- 3
  }
  # Dirichlet priors on the mixture weights
  EtaT[1:3]  ~ ddirch(eee[1:3])
  EtaAD[1:3] ~ ddirch(eee[1:3])
  sigmaT  <- 1/sqrt(tauT)
  tauT    ~ dgamma(.001, .001)
  sigmaAD <- 1/sqrt(tauAD)
  tauAD   ~ dgamma(.001, .001)
  # Home advantage
  Home ~ dnorm(0, .0001)
}
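A minimal sketch of how a model like this might be run with R2jags. The data object names here (a hypothetical scores data frame with goal counts and club factors) are assumptions, not the original code; the variable names passed to JAGS match those used in the model above, and the mix module must be loaded for dnormmix:

library(R2jags)
load.module('mix')  # needed for the dnormmix distribution

# Hypothetical input data; clubs coded as integer indices 1..nclub
datain <- list(
  HomeMadeGoals = scores$HomeGoals,
  OutMadeGoals  = scores$OutGoals,
  HomeClub      = as.numeric(scores$HomeClub),
  OutClub       = as.numeric(scores$OutClub),
  N             = nrow(scores),
  nclub         = nlevels(scores$HomeClub))

parameters <- c('TStr', 'AD', 'Home', 'sigmaT', 'sigmaAD')

fit <- jags(data = datain, parameters.to.save = parameters,
            model.file = fbmodel1,
            n.chains = 3, n.iter = 10000, n.burnin = 5000, n.thin = 5)
fit

The chain settings (3 chains, 10000 iterations, 5000 burn-in, thinning of 5) mirror the run shown in the results below.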

Results

As the estimates for ADO Den Haag are now fixed at 0, I am not sure any of the estimates can be interpreted as such. So this is just to show that the model runs through JAGS.
Inference for Bugs model at "C:/Users/.../Rtmp4uyOHL/model1be0596d2d4b.txt", fit using jags,
 3 chains, each with 10000 iterations (first 5000 discarded), n.thin = 5
 n.sims = 3000 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
AD[1]       0.000   0.000    0.000    0.000    0.000    0.000    0.000 1.000     1
AD[2]       0.676   0.132    0.421    0.585    0.676    0.763    0.935 1.004  1100
AD[3]       0.534   0.133    0.271    0.447    0.534    0.624    0.795 1.001  3000
AD[4]      -0.005   0.134   -0.261   -0.098   -0.006    0.089    0.261 1.002  1300
AD[5]      -0.084   0.140   -0.362   -0.179   -0.083    0.012    0.192 1.004   770
AD[6]       0.122   0.133   -0.143    0.035    0.121    0.210    0.388 1.002  1500
AD[7]       0.547   0.130    0.288    0.463    0.545    0.636    0.795 1.002  2600
AD[8]       0.250   0.133   -0.010    0.159    0.252    0.340    0.517 1.002  1100
AD[9]       0.555   0.131    0.298    0.469    0.555    0.641    0.802 1.001  2300
AD[10]      0.198   0.133   -0.073    0.111    0.197    0.286    0.466 1.002  1300
AD[11]      0.199   0.134   -0.061    0.110    0.195    0.289    0.468 1.003  1700
AD[12]      0.244   0.135   -0.019    0.154    0.243    0.334    0.516 1.003  2800
AD[13]      0.563   0.129    0.309    0.479    0.561    0.650    0.814 1.002  3000
AD[14]      0.196   0.135   -0.069    0.105    0.195    0.286    0.461 1.003   940
AD[15]      0.168   0.130   -0.081    0.080    0.166    0.255    0.413 1.003  1300
AD[16]      0.428   0.132    0.165    0.340    0.427    0.519    0.683 1.002  1600
AD[17]      0.321   0.140    0.054    0.225    0.317    0.413    0.604 1.002  3000
AD[18]      0.026   0.131   -0.227   -0.065    0.027    0.114    0.277 1.001  3000
Home        0.358   0.062    0.237    0.315    0.360    0.401    0.477 1.001  3000
TStr[1]     0.000   0.000    0.000    0.000    0.000    0.000    0.000 1.000     1
TStr[2]     0.186   0.082    0.012    0.134    0.189    0.241    0.346 1.008   320
TStr[3]     0.036   0.084   -0.126   -0.017    0.032    0.091    0.204 1.001  3000
TStr[4]     0.095   0.086   -0.067    0.032    0.094    0.158    0.262 1.003   840
TStr[5]     0.045   0.088   -0.126   -0.012    0.044    0.102    0.219 1.001  3000
TStr[6]     0.066   0.082   -0.092    0.009    0.066    0.123    0.224 1.001  2300
TStr[7]     0.209   0.080    0.044    0.158    0.211    0.262    0.361 1.001  2000
TStr[8]     0.147   0.084   -0.020    0.089    0.151    0.206    0.303 1.001  3000
TStr[9]     0.080   0.083   -0.077    0.021    0.077    0.137    0.241 1.002  1900
TStr[10]    0.153   0.084   -0.012    0.095    0.155    0.212    0.310 1.001  2400
TStr[11]    0.054   0.082   -0.106   -0.001    0.052    0.110    0.214 1.001  3000
TStr[12]   -0.010   0.083   -0.188   -0.061   -0.007    0.045    0.141 1.001  3000
TStr[13]    0.235   0.078    0.086    0.183    0.235    0.288    0.397 1.002  3000
TStr[14]   -0.002   0.084   -0.179   -0.055    0.000    0.054    0.152 1.003   960
TStr[15]    0.215   0.078    0.054    0.165    0.215    0.266    0.370 1.002  2000
TStr[16]    0.268   0.080    0.118    0.213    0.265    0.320    0.437 1.004   910
TStr[17]    0.008   0.082   -0.158   -0.044    0.007    0.061    0.165 1.001  3000
TStr[18]    0.164   0.083   -0.001    0.108    0.166    0.220    0.320 1.002  1400
sigmaAD     0.172   0.077    0.044    0.112    0.168    0.225    0.329 1.001  3000
sigmaT      0.081   0.041    0.025    0.050    0.075    0.105    0.176 1.015   140
deviance 1886.560   8.367 1872.324 1880.527 1886.032 1891.708 1904.662 1.003   680

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 34.9 and DIC = 1921.5
DIC is an estimate of expected predictive error (lower deviance is better).
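The DIC rule quoted above can be reproduced from the saved deviance draws. A small sketch, assuming fit is an object returned by jags() as in the earlier run:

# pD = var(deviance)/2 is the rule R2jags reports;
# DIC is the mean deviance plus the effective number of parameters
dev <- fit$BUGSoutput$sims.list$deviance
pD  <- var(dev) / 2
DIC <- mean(dev) + pD
c(pD = pD, DIC = DIC)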
