Simple analysis of a few aspects of the Wikipedia World Cup 2014 squads data

May 29, 2014

[This article was first published on R by Emmanuel Jjunju, and kindly contributed to R-bloggers.]

The data and script for this post can be found on this gist. The data is taken from Wikipedia. The script analyses the data to create charts about the 2014 World Cup squads: box plots of age, the number of home-based and foreign-based players for each country, clubs with more than 4 players at the World Cup, and leagues with more than 10 players at the World Cup.
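As a rough illustration of the kind of steps described above, here is a minimal base R sketch. It is not the script from the gist: the data frame `squads`, its column names (`country`, `age`, `club`, `league`), the helper `keep_above()` and all values are invented placeholders, and the thresholds are lowered so the tiny toy table produces a result.

```r
# Toy stand-in for the squads table. All names and values are invented
# for illustration; the real data comes from the Wikipedia squads page.
squads <- data.frame(
  country = c("A", "A", "A", "B", "B", "B"),
  age     = c(22, 27, 31, 24, 29, 35),
  club    = c("Club X", "Club X", "Club Y", "Club X", "Club Z", "Club Z"),
  league  = c("League 1", "League 1", "League 2",
              "League 1", "League 2", "League 2"),
  stringsAsFactors = FALSE
)

# Box plot of player age, one box per country.
boxplot(age ~ country, data = squads,
        xlab = "Country", ylab = "Age (years)")

# Hypothetical helper: count how often each value occurs and keep only
# the values that occur more than n times, most frequent first.
keep_above <- function(x, n) {
  counts <- sort(table(x), decreasing = TRUE)
  counts[counts > n]
}

# The post uses "more than 4" for clubs and "more than 10" for leagues;
# the toy data is small, so 2 is used for both here.
big_clubs   <- keep_above(squads$club, 2)
big_leagues <- keep_above(squads$league, 2)

print(big_clubs)
print(big_leagues)
```

With the real table, the same helper could be called with 4 for clubs and 10 for leagues, and the resulting counts passed to `barplot()`.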

The data shows that the youngest team is the Netherlands (Dutch) team. Only Mexico, the Netherlands, Spain, England, Italy, Russia, Germany and Iran have more home-based players than foreign-based players; every other team has fewer players based in its home country than abroad. European clubs dominate the list of clubs with the most players at the World Cup (again, not a surprise). The 2014 World Cup appears to be a sort of "European Cup"!
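The home-based versus foreign-based comparison can be sketched in base R as follows. This is an assumption about how such a count might be done, not the author's code: it supposes the table has a `club_country` column giving the country each player's club plays in, and the toy values are invented.

```r
# Toy data, invented for illustration. `club_country` is an assumed
# column holding the country of each player's club.
squads <- data.frame(
  country      = c("A", "A", "A", "B", "B", "B"),
  club_country = c("A", "A", "B", "A", "C", "B"),
  stringsAsFactors = FALSE
)

# A player is home-based when the club plays in the player's own country.
base <- factor(
  ifelse(squads$country == squads$club_country, "home", "foreign"),
  levels = c("home", "foreign")
)

# Cross-tabulate: one row per country, one column per category.
tab <- table(squads$country, base)
print(tab)

# Countries with more home-based than foreign-based players.
more_home <- rownames(tab)[tab[, "home"] > tab[, "foreign"]]
print(more_home)

# Side-by-side bars of home-based and foreign-based counts per country.
barplot(t(tab), beside = TRUE, legend.text = TRUE,
        xlab = "Country", ylab = "Number of players")
```

Fixing the factor levels keeps both columns in the table even when a country has no players in one of the two categories.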

Links to the scripts and input data
