Site icon R-bloggers

European MEP Data, Part 2

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Following last week’s short examination, I now wanted to drill down a bit more in the voting behaviour as given in data from votewatch.eu on voting of MEPs.
Votewatch’s Data describe how often MEPs voted what in the European Parliament. For each MEP the number of votes, percentages Yes, No, Abstain, number of elections and number of elections not voted. A description of the data can be found in this pdf.

Data

Compared to last week there is a slight adaptation in reading data. The same preprocessed spreadsheet is used, but next processing has been adapted.
library(gdata)
library(ggplot2)

r1 <- read.xls(‘votewatch-europe-yes-no-votes-data-11-december-2013.preprocessed.xls’,
    stringsAsFactors=TRUE)
# adapted in souce file:
# headers
# number of decimals
# “

# fix a double
r1$National.Party <- as.character(r1$National.Party)
(uu <- unique(r1$National.Party[r1$Country==’Hungary’ & 
                  grepl(‘ri Sz’,r1$National.Party)]))
[1] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”      
[2] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”
r1$National.Party[r1$National.Party %in% uu] <- uu[1]
r1$National.Party <- factor(r1$National.Party)

r1$Date <- as.Date(r1$Sdate)
r1$Group <- relevel(r1$Group,’S&D’)
r1$nYES=round(r1$pYES*r1$TotalDone/100)
r1$nNO=round(r1$pNO*r1$TotalDone/100)
r1$nAbstain=round(r1$pAbstain*r1$TotalDone/100)
# 3 vars before  sum to TotalDone
r1$nNoVote=round(r1$pNoVote*r1$TotalPossible/100)
r1$nNotIn <- r1$TotalPossible-r1$nNoVote-r1$TotalDone
# 5 vars before sum to TotalPossible

Group data

In this section data per group is analyzed. The analysis results are mean estimates, their dispersion is a function of the group size. The analysis result has three stages

  1. proportion present, 
  2. proportion voted conditional on being present, 
  3. proportion votes Yes, No, Abstain, conditional on voting. 
This means the proportions do not add to 1. There is an ANOVA calculation in the analysis. However, since it is highly significant for all responses, they are not displayed.
The script itself consists of a wrapper around a glm() call. This was the most easy approach when I did think displaying ANOVA would be a good idea. It is retained because it makes the dependent variable used clearly readable. 
test1 <- function(response) {
  g1 <- glm(response ~ Group,
      data=r1,family=binomial)
  g0 <- glm(response ~ 1,
      data=r1,family=binomial)
  print(anova(g1,g0,test=’Chisq’))
  g1
}
g1 <- test1(with(r1,
        cbind(
            nNotIn,
            nIn=TotalPossible-nNotIn)))

h1 <- test1(with(r1,
        cbind(
            nNoVote,
            nVote=TotalPossible-nNotIn-nNoVote)))

i1 <- test1(with(r1,
        cbind(
            nYES,
            nNotYES=TotalDone-nYES)))

j1 <- test1(with(r1,
        cbind(
            nNO,
            nNotNO=TotalDone-nNO)))

k1 <- test1(with(r1,
        cbind(
            nAbstain,
            nNotAb=TotalDone-nAbstain)))
It would be nice to display the results graphically. For this a prediction variable is made. In addition, the results are merged with decent labels
ug <- unique(subset(r1,,Group))
lpreds <- lapply(list(g1,h1,i1,j1,k1),
    function(x) {
      predsg <- 
          as.data.frame(predict.glm(x,ug,type=’response’,se.fit=TRUE)[-3])
      predsg$Group <- ug$Group
      predsg$Q <- dimnames(x$model[[1]])[[2]][1]
      predsg 
    })

preds <- do.call(rbind,lpreds)
rel <- data.frame(
    Q=c(“nNotIn”,   “nNoVote”,  “nYES”,     “nNO”,      “nAbstain”),
    Vote=factor(1:5,labels=c(
            ‘Not Present’,
            ‘Not Voted’,
            ‘Yes’,’No’,’Abstain’)))
preds <- merge(preds,rel)

limits <- aes(ymax = fit + 2*se.fit, ymin=fit- 2* se.fit)
p1 <- ggplot(preds, aes(y=fit, x=Group,col=Vote))
p1 + geom_point(stat=”identity”) +
    geom_errorbar(limits,  width=0.25) +
    ylab(‘Proportion’) +
    coord_flip()
One can see that each of the groups in general votes sooner Yes than No.
As I found the result a bit disappointing, I made a mosaic plot where distribution of results are next to each other. 
ag <- aggregate(subset(r1,,c(nYES,nNO,nAbstain,nNoVote,nNotIn)),
    by=list(Group=r1$Group),
    sum)
lag <- reshape(ag,direction=’long’,
    idvar=’Group’,
    times=names(ag[-1]),
    timevar=’Q’,
    v.names=’Counts’,
    varying=list(names(ag[-1])))
lag <- merge(lag,rel)

mosaicplot(xtabs(Counts ~ Group + Vote ,data=lag),
    color=rainbow(nlevels(lag$Group)),
    main=’Votes’,
    las=2,cex=.7)

By Party

At this point, I decided that the first plot would be interesting if displayed by party. However, there are far too many parties. So, to get a bit of order, the focus is on ‘interesting’ parties. Interesting in this case is anything not S&D, ADLE/ADLE and EPP because these are far too predictable. This leaves still quite some parties, so they have to be ordered by country too. Since there are a number of parties (e.g. independent) which are present in several countries, these to be labelled by country too. 
First a National.PartyF, which is ordered by country.
tween <- unique(subset(r1,,c(Country,National.Party)))
tween$National.PartyC <- as.character(tween$National.Party)
selection <- tween$National.Party %in% c(‘Independent’,’Labour Party’,’Parti Socialiste’,’Partido Popular’)
tween$National.PartyC[selection] <- 
    paste(tween$National.PartyC[selection],tween$Country[selection])
tween <- tween[order(tween$Country,tween$National.Party),]
tween$National.PartyF <- factor(tween$National.PartyC,
    levels=tween$National.PartyC,
    labels=tween$National.PartyC)
r2 <- merge(r1,tween)
The analysis, stripped down to minimum.
test2 <- function(response) {
  glm(response ~ National.PartyF,
      data=r2,family=binomial)
}
g2 <- test2(with(r2,
        cbind(
            nNotIn,
            nIn=TotalPossible-nNotIn)))

h2 <- test2(with(r2,
        cbind(
            nNoVote,
            nVote=TotalPossible-nNotIn-nNoVote)))

i2 <- test2(with(r2,
        cbind(
            nYES,
            nNotYES=TotalDone-nYES)))

j2 <- test2(with(r2,
        cbind(
            nNO,
            nNotNO=TotalDone-nNO)))

k2 <- test2(with(r2,
        cbind(
            nAbstain,
            nNotAb=TotalDone-nAbstain)))
Predicting
ug2 <- unique(subset(r2,,National.PartyF))
lpreds2 <- lapply(list(g2,h2,i2,j2,k2),
    function(x) {
      predsg <- 
          as.data.frame(predict.glm(x,ug2,type=’response’,se.fit=TRUE)[-3])
      predsg$National.PartyF <- ug2$National.PartyF
      predsg$Q <- dimnames(x$model[[1]])[[2]][1]
      predsg 
    })

preds2 <- do.call(rbind,lpreds2)
preds2 <- merge(preds2,rel)
Adding supporting variables.  Shortening a very long name.
preds2 <- merge(preds2,rel)
preds2 <- merge(preds2,unique(subset(r2,,c(National.PartyF,Country,Group))))
levels(preds2$National.PartyF)[
    levels(preds2$National.PartyF)==
        “People for Real, Open and United Democracy / Conservative Party for Democracy and Success”
    ]  <- “People for Real, Open and United Democracy / …”
The plot is still a bit large. In the top the UK has a number of parties which do know where the No button is. As do Dutch ‘Partij voor de Vrijheid’ and French ‘Mouvement pour la France’. In contrast, ‘Front National’ is probably better described as Abstained. Do not think this is a right wing thing, the ‘Communist Party of Greece’ likes the No button too, though not as much as the parties mentioned earlier.
p1 <- ggplot(preds2[!(preds2$Group %in% c(‘S&D’,’EPP’,’ALDE/ADLE’)),], 
    aes(y=fit, x=National.PartyF,col=Vote))
png(‘long2.png’,width=800,height=800)
p1 + geom_point(stat=”identity”) +
    geom_errorbar(limits,  width=0.25) +
    ylab(‘Proportion’) + 
    coord_flip() +
    xlab(‘National Party’) +
    theme(legend.position=”bottom”)

Other noticeable

Anything over 0.3 which was neither Yes nor No.
preds2[preds2$fit>.3 & !(preds2$Vote %in% c(‘Yes’,’No’)),c(1,3,5,6,7)]
                     National.PartyF       fit        Vote   Country     Group
136                    Darbo partija 0.5702479 Not Present Lithuania ALDE/ADLE
138                    Darbo partija 0.3653846   Not Voted Lithuania ALDE/ADLE
246 Freiheitliche Partei Österreichs 0.3248408     Abstain   Austria        NI
257                   Front national 0.3598585     Abstain    France        NI
396  JOBBIK MAGYARORSZÁGÉRT MOZGALOM 0.3164147     Abstain   Hungary        NI
522         Mouvement pour la France 0.3060686   Not Voted    France       EFD
To note: 
Darbo partija is the Labour Party
Jobbik is Radical Nationalist
FPÖ and FN got often enough in the newspapers, so don’t need a link. MPF is clearly anti EU just from the name.
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.