Finishing football postings
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers.]
For now, this is the last post about these football data. The series started in August and it is now November. To finish up, here is the model as it should have been last week.
Model
Since most of what I did was described last week, only the model as it went into JAGS is shown.
Explanation: Each team has a total strength (TStr) and an attack/defense strength (AD). These combine into two numbers, defense strength (Dstr) and attack strength (Astr). Attack strength together with the opponent's defense strength determines the number of goals. Logically, each team has one TStr, AD, Dstr and Astr. In the model this is not so: Astr and Dstr are calculated for each game.
The intention was to build in strategy. If a team plays a much stronger or weaker team, it might shift between attack and defense strength. Unfortunately, that did not work, either because it does not happen or because my formulation did not work out well.
A second possibility in this framework is to incorporate statements like "team X never wins when they play at time Y". That could translate to a change in TStr depending on day and time. Maybe later.
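The way TStr and AD combine per game, as described above, can be sketched in plain R. This is only an illustration of the arithmetic in the model equations; the function name `expected_goals` and the parameter values for the three clubs are made up for the example and are not estimates from the data.

```r
# Sketch: expected goals for one game from team-level parameters.
# TStr, AD: numeric vectors indexed by club number.
# Home: home advantage on the log scale.
expected_goals <- function(TStr, AD, Home, home_club, out_club) {
  AstrH <- TStr[home_club] + AD[home_club]  # attack term, home team
  DstrH <- TStr[home_club] - AD[home_club]  # defense term, home team
  AstrO <- TStr[out_club] + AD[out_club]    # attack term, away team
  DstrO <- TStr[out_club] - AD[out_club]    # defense term, away team
  c(home = exp(Home + DstrO + AstrH),       # Poisson mean, home goals
    out  = exp(DstrH + AstrO))              # Poisson mean, away goals
}

# Made-up values for three clubs (club 1 is the reference, fixed at 0)
TStr <- c(0, 0.2, 0.1)
AD   <- c(0, 0.5, -0.1)
expected_goals(TStr, AD, Home = 0.3, home_club = 2, out_club = 3)
```

The two returned numbers are the Poisson means that the model calls HS[i] and OS[i] for that game.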
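fbmodel1 <- function() {
  # Likelihood: goals per game are Poisson with log-linear means
  for (i in 1:N) {
    HomeMadeGoals[i] ~ dpois(HS[i])
    OutMadeGoals[i] ~ dpois(OS[i])
    log(HS[i]) <- Home + DstrO[i] + AstrH[i]
    log(OS[i]) <- DstrH[i] + AstrO[i]
    # Attack and defense terms are computed per game from TStr and AD
    AstrH[i] <- TStr[HomeClub[i]] + AD[HomeClub[i]]
    DstrH[i] <- TStr[HomeClub[i]] - AD[HomeClub[i]]
    AstrO[i] <- TStr[OutClub[i]] + AD[OutClub[i]]
    DstrO[i] <- TStr[OutClub[i]] - AD[OutClub[i]]
  }
  # Club 1 is the reference: both parameters fixed at 0
  TStr[1] <- 0
  AD[1] <- 0
  # Remaining clubs: three-component normal mixtures
  # (dnormmix comes from the JAGS 'mix' module)
  for (i in 2:nclub) {
    TStr[i] ~ dnormmix(MT, tauT1, EtaT)
    AD[i] ~ dnormmix(MAD, tauAD1, EtaAD)
  }
  # Component means; all components share one precision
  for (i in 1:3) {
    MT[i] ~ dnorm(0, .01)
    MAD[i] ~ dnorm(0, .01)
    tauT1[i] <- tauT
    tauAD1[i] <- tauAD
    eee[i] <- 3
  }
  # Mixture weights
  EtaT[1:3] ~ ddirch(eee[1:3])
  EtaAD[1:3] ~ ddirch(eee[1:3])
  # Precisions, reported as standard deviations
  sigmaT <- 1/sqrt(tauT)
  tauT ~ dgamma(.001, .001)
  sigmaAD <- 1/sqrt(tauAD)
  tauAD ~ dgamma(.001, .001)
  # Home advantage on the log scale
  Home ~ dnorm(0, .0001)
}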
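For readers who want to run a model function like this, here is a rough sketch of how it could be passed to JAGS through R2jags. The data preparation was covered in the earlier posts and is not repeated there, so the `games` data frame and its column names (`home`, `out`, `home_goals`, `out_goals`), as well as the helpers `make_jags_data` and `fit_football`, are assumptions made up for this example. The chain settings are the ones reported in the output under Results.

```r
# Hypothetical input: one row per game, with columns
# home, out (club names) and home_goals, out_goals (counts).
make_jags_data <- function(games) {
  clubs <- sort(unique(c(games$home, games$out)))
  list(
    N = nrow(games),
    nclub = length(clubs),
    HomeClub = match(games$home, clubs),  # club index, 1..nclub
    OutClub = match(games$out, clubs),
    HomeMadeGoals = games$home_goals,
    OutMadeGoals = games$out_goals
  )
}

# Needs JAGS plus the rjags and R2jags packages; not run here.
fit_football <- function(model, games) {
  rjags::load.module("mix")  # provides dnormmix
  R2jags::jags(
    data = make_jags_data(games),
    inits = NULL,            # let JAGS generate initial values
    parameters.to.save = c("TStr", "AD", "Home", "sigmaT", "sigmaAD"),
    model.file = model,      # R2jags accepts a model function
    n.chains = 3, n.iter = 10000, n.burnin = 5000, n.thin = 5
  )
}

# Tiny made-up data set, only to show the shape of the data list
games <- data.frame(
  home = c("A", "B", "C"), out = c("B", "C", "A"),
  home_goals = c(2, 0, 1), out_goals = c(1, 1, 3),
  stringsAsFactors = FALSE
)
dat <- make_jags_data(games)
str(dat)
```

With this helper, whichever club sorts first becomes club 1, the reference club whose TStr and AD are fixed at 0. A call such as `fit_football(fbmodel1, games)` would then fit the model on a full season of games.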
Results
As the estimates for ADO Den Haag are now fixed at zero, I am not sure any of the estimates can be interpreted as such. So this is just to show that the model runs through JAGS.
Inference for Bugs model at "C:/Users/…/Rtmp4uyOHL/model1be0596d2d4b.txt", fit using jags,
3 chains, each with 10000 iterations (first 5000 discarded), n.thin = 5
n.sims = 3000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
AD[1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
AD[2] 0.676 0.132 0.421 0.585 0.676 0.763 0.935 1.004 1100
AD[3] 0.534 0.133 0.271 0.447 0.534 0.624 0.795 1.001 3000
AD[4] -0.005 0.134 -0.261 -0.098 -0.006 0.089 0.261 1.002 1300
AD[5] -0.084 0.140 -0.362 -0.179 -0.083 0.012 0.192 1.004 770
AD[6] 0.122 0.133 -0.143 0.035 0.121 0.210 0.388 1.002 1500
AD[7] 0.547 0.130 0.288 0.463 0.545 0.636 0.795 1.002 2600
AD[8] 0.250 0.133 -0.010 0.159 0.252 0.340 0.517 1.002 1100
AD[9] 0.555 0.131 0.298 0.469 0.555 0.641 0.802 1.001 2300
AD[10] 0.198 0.133 -0.073 0.111 0.197 0.286 0.466 1.002 1300
AD[11] 0.199 0.134 -0.061 0.110 0.195 0.289 0.468 1.003 1700
AD[12] 0.244 0.135 -0.019 0.154 0.243 0.334 0.516 1.003 2800
AD[13] 0.563 0.129 0.309 0.479 0.561 0.650 0.814 1.002 3000
AD[14] 0.196 0.135 -0.069 0.105 0.195 0.286 0.461 1.003 940
AD[15] 0.168 0.130 -0.081 0.080 0.166 0.255 0.413 1.003 1300
AD[16] 0.428 0.132 0.165 0.340 0.427 0.519 0.683 1.002 1600
AD[17] 0.321 0.140 0.054 0.225 0.317 0.413 0.604 1.002 3000
AD[18] 0.026 0.131 -0.227 -0.065 0.027 0.114 0.277 1.001 3000
Home 0.358 0.062 0.237 0.315 0.360 0.401 0.477 1.001 3000
TStr[1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
TStr[2] 0.186 0.082 0.012 0.134 0.189 0.241 0.346 1.008 320
TStr[3] 0.036 0.084 -0.126 -0.017 0.032 0.091 0.204 1.001 3000
TStr[4] 0.095 0.086 -0.067 0.032 0.094 0.158 0.262 1.003 840
TStr[5] 0.045 0.088 -0.126 -0.012 0.044 0.102 0.219 1.001 3000
TStr[6] 0.066 0.082 -0.092 0.009 0.066 0.123 0.224 1.001 2300
TStr[7] 0.209 0.080 0.044 0.158 0.211 0.262 0.361 1.001 2000
TStr[8] 0.147 0.084 -0.020 0.089 0.151 0.206 0.303 1.001 3000
TStr[9] 0.080 0.083 -0.077 0.021 0.077 0.137 0.241 1.002 1900
TStr[10] 0.153 0.084 -0.012 0.095 0.155 0.212 0.310 1.001 2400
TStr[11] 0.054 0.082 -0.106 -0.001 0.052 0.110 0.214 1.001 3000
TStr[12] -0.010 0.083 -0.188 -0.061 -0.007 0.045 0.141 1.001 3000
TStr[13] 0.235 0.078 0.086 0.183 0.235 0.288 0.397 1.002 3000
TStr[14] -0.002 0.084 -0.179 -0.055 0.000 0.054 0.152 1.003 960
TStr[15] 0.215 0.078 0.054 0.165 0.215 0.266 0.370 1.002 2000
TStr[16] 0.268 0.080 0.118 0.213 0.265 0.320 0.437 1.004 910
TStr[17] 0.008 0.082 -0.158 -0.044 0.007 0.061 0.165 1.001 3000
TStr[18] 0.164 0.083 -0.001 0.108 0.166 0.220 0.320 1.002 1400
sigmaAD 0.172 0.077 0.044 0.112 0.168 0.225 0.329 1.001 3000
sigmaT 0.081 0.041 0.025 0.050 0.075 0.105 0.176 1.015 140
deviance 1886.560 8.367 1872.324 1880.527 1886.032 1891.708 1904.662 1.003 680
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 34.9 and DIC = 1921.5
DIC is an estimate of expected predictive error (lower deviance is better).
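Two quick checks on the printed output can be sketched in R. The first flags parameters whose Rhat or n.eff suggest longer chains would help; the thresholds and the helper name `flag_mixing` are arbitrary choices for this example, and the small matrix just copies a few rows from the table above. It assumes a summary matrix with columns named Rhat and n.eff, as in `fit$BUGSoutput$summary` from R2jags. The second check reproduces the DIC from the reported mean deviance and pD.

```r
# Flag parameters with a low effective sample size or a high Rhat.
# Thresholds are arbitrary illustration values.
flag_mixing <- function(summary, min_neff = 400, max_rhat = 1.01) {
  bad <- summary[, "n.eff"] < min_neff | summary[, "Rhat"] > max_rhat
  rownames(summary)[bad]
}

# A few rows copied from the output above (Rhat and n.eff only)
s <- matrix(
  c(1.008,  320,
    1.001, 3000,
    1.001, 3000,
    1.015,  140),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("TStr[2]", "TStr[3]", "sigmaAD", "sigmaT"),
                  c("Rhat", "n.eff"))
)
flag_mixing(s)

# DIC as reported: mean deviance plus pD
mean_dev <- 1886.560
pD <- 34.9
dic <- mean_dev + pD
round(dic, 1)
```

On a full summary, the constants TStr[1] and AD[1] would also be flagged because they report n.eff = 1; those can be ignored since they are fixed, not sampled.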