European MEP Data, Part 2

Posted on May 11, 2014 by Wingfeet in R bloggers | 0 Comments

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Following last week’s short examination, I now wanted to drill down a bit more in the voting behaviour as given in data from votewatch.eu on voting of MEPs.
Votewatch’s Data describe how often MEPs voted what in the European Parliament. For each MEP the number of votes, percentages Yes, No, Abstain, number of elections and number of elections not voted. A description of the data can be found in this pdf.

Data

Compared to last week there is a slight adaptation in reading data. The same preprocessed spreadsheet is used, but next processing has been adapted.

library(gdata)

library(ggplot2)

r1 <- read.xls('votewatch-europe-yes-no-votes-data-11-december-2013.preprocessed.xls',

stringsAsFactors=TRUE)

# adapted in souce file:

# headers

# number of decimals

# “

# fix a double

r1$National.Party <- as.character(r1$National.Party)

(uu <- unique(r1$National.Party[r1$Country=='Hungary' &

grepl(‘ri Sz’,r1$National.Party)]))

[1] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”

[2] “Fidesz-Magyar Polgári Szövetség-Keresztény Demokrata Néppárt”

r1$National.Party[r1$National.Party %in% uu] <- uu[1]

r1$National.Party <- factor(r1$National.Party)

r1$Date <- as.Date(r1$Sdate)

r1$Group <- relevel(r1$Group,'S&D')

r1$nYES=round(r1$pYES*r1$TotalDone/100)

r1$nNO=round(r1$pNO*r1$TotalDone/100)

r1$nAbstain=round(r1$pAbstain*r1$TotalDone/100)

# 3 vars before sum to TotalDone

r1$nNoVote=round(r1$pNoVote*r1$TotalPossible/100)

r1$nNotIn <- r1$TotalPossible-r1$nNoVote-r1$TotalDone

# 5 vars before sum to TotalPossible

Group data

In this section data per group is analyzed. The analysis results are mean estimates, their dispersion is a function of the group size. The analysis result has three stages

proportion present,
proportion voted conditional on being present,
proportion votes Yes, No, Abstain, conditional on voting.

This means the proportions do not add to 1. There is an ANOVA calculation in the analysis. However, since it is highly significant for all responses, they are not displayed.
The script itself consists of a wrapper around a glm() call. This was the most easy approach when I did think displaying ANOVA would be a good idea. It is retained because it makes the dependent variable used clearly readable.

test1 <- function(response) {

g1 <- glm(response ~ Group,

data=r1,family=binomial)

g0 <- glm(response ~ 1,

data=r1,family=binomial)

print(anova(g1,g0,test=’Chisq’))

}

g1 <- test1(with(r1,

cbind(

nNotIn,

nIn=TotalPossible-nNotIn)))

h1 <- test1(with(r1,

cbind(

nNoVote,

nVote=TotalPossible-nNotIn-nNoVote)))

i1 <- test1(with(r1,

cbind(

nYES,

nNotYES=TotalDone-nYES)))

j1 <- test1(with(r1,

cbind(

nNO,

nNotNO=TotalDone-nNO)))

k1 <- test1(with(r1,

cbind(

nAbstain,

nNotAb=TotalDone-nAbstain)))

It would be nice to display the results graphically. For this a prediction variable is made. In addition, the results are merged with decent labels

ug <- unique(subset(r1,,Group))

lpreds <- lapply(list(g1,h1,i1,j1,k1),

function(x) {

predsg <-

as.data.frame(predict.glm(x,ug,type=’response’,se.fit=TRUE)[-3])

predsg$Group <- ug$Group

predsg$Q <- dimnames(x$model[[1]])[[2]][1]

predsg

})

preds <- do.call(rbind,lpreds)

rel <- data.frame(

Q=c(“nNotIn”, “nNoVote”, “nYES”, “nNO”, “nAbstain”),

Vote=factor(1:5,labels=c(

‘Not Present’,

‘Not Voted’,

‘Yes’,’No’,’Abstain’)))

preds <- merge(preds,rel)

limits <- aes(ymax = fit + 2*se.fit, ymin=fit- 2* se.fit)

p1 <- ggplot(preds, aes(y=fit, x=Group,col=Vote))

p1 + geom_point(stat=”identity”) +

geom_errorbar(limits, width=0.25) +

ylab(‘Proportion’) +

coord_flip()

One can see that each of the groups in general votes sooner Yes than No.

As I found the result a bit disappointing, I made a mosaic plot where distribution of results are next to each other.

ag <- aggregate(subset(r1,,c(nYES,nNO,nAbstain,nNoVote,nNotIn)),

by=list(Group=r1$Group),

sum)

lag <- reshape(ag,direction='long',

idvar=’Group’,

times=names(ag[-1]),

timevar=’Q’,

v.names=’Counts’,

varying=list(names(ag[-1])))

lag <- merge(lag,rel)

mosaicplot(xtabs(Counts ~ Group + Vote ,data=lag),

color=rainbow(nlevels(lag$Group)),

main=’Votes’,

las=2,cex=.7)

By Party

At this point, I decided that the first plot would be interesting if displayed by party. However, there are far too many parties. So, to get a bit of order, the focus is on ‘interesting’ parties. Interesting in this case is anything not S&D, ADLE/ADLE and EPP because these are far too predictable. This leaves still quite some parties, so they have to be ordered by country too. Since there are a number of parties (e.g. independent) which are present in several countries, these to be labelled by country too.

First a National.PartyF, which is ordered by country.

tween <- unique(subset(r1,,c(Country,National.Party)))

tween$National.PartyC <- as.character(tween$National.Party)

selection <- tween$National.Party %in% c('Independent','Labour Party','Parti Socialiste','Partido Popular')

tween$National.PartyC[selection] <-

paste(tween$National.PartyC[selection],tween$Country[selection])

tween <- tween[order(tween$Country,tween$National.Party),]

tween$National.PartyF <- factor(tween$National.PartyC,

levels=tween$National.PartyC,

labels=tween$National.PartyC)

r2 <- merge(r1,tween)

The analysis, stripped down to minimum.

test2 <- function(response) {

glm(response ~ National.PartyF,

data=r2,family=binomial)

}

g2 <- test2(with(r2,

cbind(

nNotIn,

nIn=TotalPossible-nNotIn)))

h2 <- test2(with(r2,

cbind(

nNoVote,

nVote=TotalPossible-nNotIn-nNoVote)))

i2 <- test2(with(r2,

cbind(

nYES,

nNotYES=TotalDone-nYES)))

j2 <- test2(with(r2,

cbind(

nNO,

nNotNO=TotalDone-nNO)))

k2 <- test2(with(r2,

cbind(

nAbstain,

nNotAb=TotalDone-nAbstain)))

Predicting

ug2 <- unique(subset(r2,,National.PartyF))

lpreds2 <- lapply(list(g2,h2,i2,j2,k2),

function(x) {

predsg <-

as.data.frame(predict.glm(x,ug2,type=’response’,se.fit=TRUE)[-3])

predsg$National.PartyF <- ug2$National.PartyF

predsg$Q <- dimnames(x$model[[1]])[[2]][1]

predsg

})

preds2 <- do.call(rbind,lpreds2)

preds2 <- merge(preds2,rel)

Adding supporting variables. Shortening a very long name.

preds2 <- merge(preds2,rel)

preds2 <- merge(preds2,unique(subset(r2,,c(National.PartyF,Country,Group))))

levels(preds2$National.PartyF)[

levels(preds2$National.PartyF)==

“People for Real, Open and United Democracy / Conservative Party for Democracy and Success”

] <- "People for Real, Open and United Democracy / ..."

The plot is still a bit large. In the top the UK has a number of parties which do know where the No button is. As do Dutch ‘Partij voor de Vrijheid’ and French ‘Mouvement pour la France’. In contrast, ‘Front National’ is probably better described as Abstained. Do not think this is a right wing thing, the ‘Communist Party of Greece’ likes the No button too, though not as much as the parties mentioned earlier.

p1 <- ggplot(preds2[!(preds2$Group %in% c('S&D','EPP','ALDE/ADLE')),],

aes(y=fit, x=National.PartyF,col=Vote))

png(‘long2.png’,width=800,height=800)

p1 + geom_point(stat=”identity”) +

geom_errorbar(limits, width=0.25) +

ylab(‘Proportion’) +

coord_flip() +

xlab(‘National Party’) +

theme(legend.position=”bottom”)

Other noticeable

Anything over 0.3 which was neither Yes nor No.

preds2[preds2$fit>.3 & !(preds2$Vote %in% c(‘Yes’,’No’)),c(1,3,5,6,7)]

National.PartyF fit Vote Country Group

136 Darbo partija 0.5702479 Not Present Lithuania ALDE/ADLE

138 Darbo partija 0.3653846 Not Voted Lithuania ALDE/ADLE

246 Freiheitliche Partei Österreichs 0.3248408 Abstain Austria NI

257 Front national 0.3598585 Abstain France NI

396 JOBBIK MAGYARORSZÁGÉRT MOZGALOM 0.3164147 Abstain Hungary NI

522 Mouvement pour la France 0.3060686 Not Voted France EFD

To note:

Darbo partija is the Labour Party

Jobbik is Radical Nationalist

FPÖ and FN got often enough in the newspapers, so don’t need a link. MPF is clearly anti EU just from the name.

To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

European MEP Data, Part 2