An R Package for the “controversial” counts of registered voters in Uganda
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Some members of the Ugandan media have claimed that the counts of registered voters for the 2016 General Election contain almost 20,000 “ghost voters”. This is according to their analyses (using Excel) of data released by the Uganda Electoral Commission (EC), the body charged with conducting a free and fair election. With voting just 8 days away, I wanted to compile this data into an R package and make it available to the general public (R users/data analysts) to analyze and draw their own conclusions. It's open data now!
The package is currently available on GitHub here: https://github.com/Emaasit/UGvoters16
Motivation for Developing this R Package
Several members of the media claimed to have found discrepancies in the Uganda Electoral Commission's voter counts. Their claims created a storm on social media, which caught my attention.
With the data readily available in PDF format on the EC's website, I wanted to compile it into an R package so that others can analyze it and draw their own conclusions.
How to use the Package
Before you can use the data in R, you need to install the package from GitHub using the following commands:
# devtools provides the GitHub installer
install.packages("devtools")
# install the package from its GitHub repository
devtools::install_github("Emaasit/UGvoters16")
library(UGvoters16)
The package contains two datasets:
- “UGvoters16”: This is the original data set released by the Commission. It’s made up of 14 variables and 280,010 observations.
- “analyzed”: This dataset contains an extra column (“ANALYZED_VOTER_COUNT”) added by a member of the media to make their comparison.
After loading the library, you can create local data frames using the following commands:
df1 <- UGvoters16
df2 <- analyzed

## You can take a glimpse of the data by using
## the head() function.
head(df1)
##   SER_NO DIST_CODE DISTRICT_NAME EA_CODE       EA_NAME SCTY_CODE
## 1      1        01          APAC     002 KWANIA COUNTY        01
## 2      2        01          APAC     002 KWANIA COUNTY        01
## 3      3        01          APAC     002 KWANIA COUNTY        01
## 4      4        01          APAC     002 KWANIA COUNTY        01
## 5      5        01          APAC     002 KWANIA COUNTY        01
## 6      6        01          APAC     002 KWANIA COUNTY        01
##   SCOUNTY_NAME PAR_CODE PARISH_NAME PS_CODE             PS_NAME
## 1        ADUKU       01      ADYEDA      01       ADYEDA CENTRE
## 2        ADUKU       01      ADYEDA      02 APORWEGI P.7 SCHOOL
## 3        ADUKU       01      ADYEDA      03        ADYEDA IMALO
## 4        ADUKU       02       ALIRA      01             ALIRA B
## 5        ADUKU       02       ALIRA      02              AKOT A
## 6        ADUKU       02       ALIRA      03               OLEKE
##   NO_OF_FEMALES NO_OF_MALES EC_VOTER_COUNTS ANALYZED_VOTER_COUNT
## 1           134         143             277                  277
## 2           379         323             703                  702
## 3           164         157             322                  321
## 4           461         411             872                  872
## 5           386         364             750                  750
## 6           443         383             826                  826

head(df2)
##   SER_NO DIST_CODE DISTRICT_NAME EA_CODE       EA_NAME SCTY_CODE
## 1      1         1          APAC       2 KWANIA COUNTY         1
## 2      2         1          APAC       2 KWANIA COUNTY         1
## 3      3         1          APAC       2 KWANIA COUNTY         1
## 4      4         1          APAC       2 KWANIA COUNTY         1
## 5      5         1          APAC       2 KWANIA COUNTY         1
## 6      6         1          APAC       2 KWANIA COUNTY         1
##   SCOUNTY_NAME PAR_CODE PARISH_NAME PS_CODE             PS_NAME
## 1        ADUKU        1      ADYEDA       1       ADYEDA CENTRE
## 2        ADUKU        1      ADYEDA       2 APORWEGI P.7 SCHOOL
## 3        ADUKU        1      ADYEDA       3        ADYEDA IMALO
## 4        ADUKU        2       ALIRA       1             ALIRA B
## 5        ADUKU        2       ALIRA       2              AKOT A
## 6        ADUKU        2       ALIRA       3               OLEKE
##   NO_OF_FEMALES NO_OF_MALES EC_VOTER_COUNTS ANALYZED_VOTER_COUNT
## 1            43          51             240                  277
## 2           312         251             687                  702
## 3            76          66             287                  321
## 4           404         349             869                  872
## 5           320         296             739                  750
## 6           384         317             819                  826
# what are the column names
names(df1)
##  [1] "SER_NO"               "DIST_CODE"            "DISTRICT_NAME"
##  [4] "EA_CODE"              "EA_NAME"              "SCTY_CODE"
##  [7] "SCOUNTY_NAME"         "PAR_CODE"             "PARISH_NAME"
## [10] "PS_CODE"              "PS_NAME"              "NO_OF_FEMALES"
## [13] "NO_OF_MALES"          "EC_VOTER_COUNTS"      "ANALYZED_VOTER_COUNT"

names(df2)
##  [1] "SER_NO"               "DIST_CODE"            "DISTRICT_NAME"
##  [4] "EA_CODE"              "EA_NAME"              "SCTY_CODE"
##  [7] "SCOUNTY_NAME"         "PAR_CODE"             "PARISH_NAME"
## [10] "PS_CODE"              "PS_NAME"              "NO_OF_FEMALES"
## [13] "NO_OF_MALES"          "EC_VOTER_COUNTS"      "ANALYZED_VOTER_COUNT"

# count the total number of analyzed voter counts
sum(df2$ANALYZED_VOTER_COUNT)
## [1] 15277197
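As a starting point for your own analysis, here is a minimal sketch of one check you might run. It rebuilds the six rows printed by head(df1) as a small data frame, so it runs without the package installed, and compares EC_VOTER_COUNTS with the sum of NO_OF_FEMALES and NO_OF_MALES. The names sample_rows, DIFF, mismatched, by_parish and total_diff are made up for this illustration. Treating the gender sum as the reference total is an assumption on my part, not a description of how the media analysis was actually done.

```r
# Six rows copied from the head(df1) output; the full data set ships
# with the package as UGvoters16.
sample_rows <- data.frame(
  PARISH_NAME     = c("ADYEDA", "ADYEDA", "ADYEDA", "ALIRA", "ALIRA", "ALIRA"),
  PS_NAME         = c("ADYEDA CENTRE", "APORWEGI P.7 SCHOOL", "ADYEDA IMALO",
                      "ALIRA B", "AKOT A", "OLEKE"),
  NO_OF_FEMALES   = c(134, 379, 164, 461, 386, 443),
  NO_OF_MALES     = c(143, 323, 157, 411, 364, 383),
  EC_VOTER_COUNTS = c(277, 703, 322, 872, 750, 826),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: EC total minus the sum of the two gender columns.
sample_rows$DIFF <- with(sample_rows,
                         EC_VOTER_COUNTS - (NO_OF_FEMALES + NO_OF_MALES))

# Polling stations where the two totals disagree.
mismatched <- sample_rows[sample_rows$DIFF != 0, c("PS_NAME", "DIFF")]
print(mismatched)

# Discrepancy summed per parish, then over all six rows.
by_parish <- aggregate(DIFF ~ PARISH_NAME, data = sample_rows, FUN = sum)
print(by_parish)

total_diff <- sum(sample_rows$DIFF)
print(total_diff)
## [1] 2
```

On the full data, the same steps should apply with sample_rows replaced by UGvoters16, assuming the column names printed by names(df1).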
Closing Remarks
With this data now readily available in an R package, data analysts and data journalists can run their own analyses in R, which offers more tools and methods than a spreadsheet.
The post An R Package for the “controversial” counts of registered voters in Uganda appeared first on Data Science Africa.