Exploring World Gender Statistics with Shiny

[This article was first published on Shirin's playgRound, and kindly contributed to R-bloggers.]

This week I explored the World Gender Statistics dataset. My Shiny app lets you browse 160 measurements across 56 years.

I prepared the data as follows:

Data.csv

  • Country.Name: the name of the country
  • Country.Code: the country’s code
  • Indicator.Name: the name of the variable that this row represents
  • Indicator.Code: a unique id for the variable
  • 1960 – 2016: one column per year, holding the value of the variable for that year (where available)
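
To make this layout concrete, here is a minimal sketch that reads a two-row stand-in for Data.csv. The numeric values are invented for illustration and are not taken from the dataset. It shows that read.csv() replaces spaces in the header with dots and prefixes the year columns with an X by default, which is why the year labels need cleaning up further down.

```r
# Toy stand-in for Data.csv; the numeric values are invented for illustration.
csv_text <- paste(
  "Country Name,Country Code,Indicator Name,Indicator Code,1960,1961",
  "Aruba,ABW,\"Life expectancy at birth, female (years)\",SP.DYN.LE00.FE.IN,67.1,67.6",
  "Aruba,ABW,\"Life expectancy at birth, male (years)\",SP.DYN.LE00.MA.IN,64.1,64.5",
  sep = "\n"
)

toy <- read.csv(text = csv_text)

# With the default check.names = TRUE, spaces become dots and
# column names that start with a digit gain an "X" prefix.
colnames(toy)
```

Passing check.names = FALSE would keep the original header names, at the cost of having to quote them with backticks when using `$`.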

    # Read the raw export; the year columns are imported as X1960, X1961, ...
    dataset <- read.csv("Data.csv")

    # Inspect the gender-specific indicators (codes containing ".FE" or ".MA").
    # The dots are escaped so that they match literally.
    dataset_subs <- dataset[grepl("\\.FE|\\.MA", dataset$Indicator.Code), ]
    head(dataset_subs)

    dataset_subs$Indicator.Name <- as.character(dataset_subs$Indicator.Name)

    # Split the full table by whether the indicator name mentions "female".
    is_female <- grepl("female", dataset$Indicator.Name)

    # Female rows: strip the gender word and the ".FE" code part so that
    # female and male rows of the same measure share one name and code.
    dataset_fem <- dataset[is_female, ]
    dataset_fem$Indicator.Name <- gsub("female", "", dataset_fem$Indicator.Name)
    dataset_fem$Indicator.Code <- gsub(".FE", "", dataset_fem$Indicator.Code, fixed = TRUE)
    dataset_fem$gender <- "female"

    # Remaining rows: the same clean-up with "male" and ".MA".
    dataset_male <- dataset[!is_female, ]
    dataset_male$Indicator.Name <- gsub("male", "", dataset_male$Indicator.Name)
    dataset_male$Indicator.Code <- gsub(".MA", "", dataset_male$Indicator.Code, fixed = TRUE)
    dataset_male$gender <- "male"

    # Keep only indicators that exist for both genders ...
    dataset_fem <- dataset_fem[dataset_fem$Indicator.Name %in% dataset_male$Indicator.Name, ]
    dataset_male <- dataset_male[dataset_male$Indicator.Name %in% dataset_fem$Indicator.Name, ]

    # ... and only countries that exist in both tables.
    dataset_fem <- dataset_fem[dataset_fem$Country.Code %in% dataset_male$Country.Code, ]
    dataset_male <- dataset_male[dataset_male$Country.Code %in% dataset_fem$Country.Code, ]

    # Sort both tables the same way so that their rows line up.
    library(dplyr)
    dataset_fem <- arrange(dataset_fem, Country.Code)
    dataset_male <- arrange(dataset_male, Country.Code)

    dataset_fem$Country.Code <- as.character(dataset_fem$Country.Code)
    dataset_male$Country.Code <- as.character(dataset_male$Country.Code)

    save(dataset_fem, file = "dataset_fem.RData")
    save(dataset_male, file = "dataset_male.RData")
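
The pairing logic above can be checked on a small example. This is a minimal sketch on a three-row toy table (the table itself is an assumption made for illustration): removing the gender word gives female and male rows an identical name, and indicators without a counterpart drop out.

```r
# Toy indicator table in the style of the dataset; built here for illustration only.
toy <- data.frame(
  Indicator.Name = c("Life expectancy at birth, female (years)",
                     "Life expectancy at birth, male (years)",
                     "Population, total"),
  Indicator.Code = c("SP.DYN.LE00.FE.IN", "SP.DYN.LE00.MA.IN", "SP.POP.TOTL"),
  stringsAsFactors = FALSE
)

is_female <- grepl("female", toy$Indicator.Name)

fem  <- toy[is_female, ]
male <- toy[!is_female, ]

# Strip the gender word so that matching indicators share one name.
fem$Indicator.Name  <- gsub("female", "", fem$Indicator.Name)
male$Indicator.Name <- gsub("male", "", male$Indicator.Name)

# Keep only indicators present for both genders; "Population, total" drops out.
male <- male[male$Indicator.Name %in% fem$Indicator.Name, ]

# fixed = TRUE treats the dot literally; in a regular expression,
# "." would match any character.
clean_code <- gsub(".FE", "", "SP.DYN.LE00.FE.IN", fixed = TRUE)
```

Without fixed = TRUE (or an escaped dot), a pattern such as ".FE" would also match any other character followed by "FE", which is easy to overlook in longer codes.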

    # Sanity check: both tables should cover the same number of indicators.
    length(unique(dataset_fem$Indicator.Name)) == length(unique(dataset_male$Indicator.Name))

    # For each indicator, divide the male values by the female values for every
    # country and year. Columns 1:4 hold the identifiers, columns 5:61 the yearly
    # values. This relies on both tables listing the same countries in the same
    # order within an indicator (both were sorted by Country.Code above).
    indicators <- unique(dataset_fem$Indicator.Name)

    diff_table_bind <- do.call(rbind, lapply(indicators, function(code) {

      print(code)

      fem <- dataset_fem[which(dataset_fem$Indicator.Name == code), ]
      male <- dataset_male[which(dataset_male$Indicator.Name == code), ]

      # Despite its name, diff is a ratio (male / female), not a difference.
      diff <- male[, 5:61] / fem[, 5:61]
      cbind(male[, 1:4], diff)

    }))

    diff_table_bind$Country.Code <- as.character(diff_table_bind$Country.Code)

    # 0 / 0 produces NaN; recode those cells as NA.
    diff_table_bind[diff_table_bind == "NaN"] <- NA

    save(diff_table_bind, file = "diff_table_bind.RData")
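
Dividing male by female values has a few edge cases worth knowing. The following minimal sketch uses invented numbers for three hypothetical countries to show what the division returns when values are zero or missing. The final recoding step is an optional extra and is not part of the script above.

```r
# Invented values for one indicator in three hypothetical countries.
male   <- data.frame(X1960 = c(50, 0, 10), X1961 = c(60, 5, NA))
female <- data.frame(X1960 = c(40, 0, 0),  X1961 = c(60, 10, 20))

# Element-wise division of two equally sized data frames.
ratio <- male / female
ratio

# 0 / 0 gives NaN, 10 / 0 gives Inf, and NA propagates.
nan_cells <- is.nan(ratio$X1960)
inf_cells <- is.infinite(ratio$X1960)

# Optional: recode NaN and Inf alike as NA, column by column.
ratio[] <- lapply(ratio, function(x) replace(x, !is.finite(x), NA))
```

Whether infinite ratios should be kept or treated as missing depends on how they are meant to appear on the map, so this is a design choice rather than a fix.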

    # Names of all measures, saved separately from the ratio table.
    measures <- unique(diff_table_bind$Indicator.Name)
    save(measures, file = "measures.RData")

    # Year labels: take the value columns, remove the "X" prefix that
    # read.csv() added, and drop the last year.
    years <- gsub("X", "", colnames(diff_table_bind)[-c(1:4)])
    years <- years[-length(years)]
    save(years, file = "years.RData")
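
The .RData files written above are presumably read back by the Shiny app, whose code is not shown in this post. Here is a minimal sketch of that round trip, assuming a temporary file and a separate environment (both are choices made for this illustration).

```r
# Hypothetical stand-in for the `years` vector saved above.
years <- as.character(1960:2015)

rdata_file <- file.path(tempdir(), "years.RData")
save(years, file = rdata_file)

# load() restores objects under their original names. Loading into a fresh
# environment avoids overwriting variables in the calling environment.
app_env <- new.env()
loaded_names <- load(rdata_file, envir = app_env)

loaded_names                      # names of the restored objects
identical(app_env$years, years)   # the vector survives the round trip
```

Because load() silently overwrites objects of the same name, saveRDS() and readRDS() are a common alternative when each file holds a single object.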

Map

    library(plyr)     # provides join()
    library(ggplot2)  # provides fortify()
    library(rgdal)    # provides readOGR()
    # Note: rgdal has since been retired from CRAN; sf::st_read() is the usual replacement.

    # Read the country shapefile from the "shapefiles" directory.
    wmap_countries <- readOGR(dsn = "shapefiles", layer = "ne_110m_admin_0_countries")

    # Convert the polygons to a data frame and re-attach the country
    # attributes, matching on the polygon id.
    wmap_countries_df <- fortify(wmap_countries)
    wmap_countries@data$id <- rownames(wmap_countries@data)
    wmap_countries_df_final <- join(wmap_countries_df, wmap_countries@data, by = "id")

    wmap_countries_df_final$gu_a3 <- as.character(wmap_countries_df_final$gu_a3)

    save(wmap_countries_df_final, file = "wmap_countries_df_final.RData")
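
Converting gu_a3 to character suggests that the map table is later matched against Country.Code, although that step is not shown in this post. Here is a minimal sketch of such a match, using base merge() rather than plyr's join() and invented toy data for both tables.

```r
# Toy polygon table in the shape produced by fortify() plus the attribute join;
# coordinates and groups are invented.
map_df <- data.frame(
  long  = c(10, 11, 11, 20, 21, 21),
  lat   = c(50, 50, 51, 40, 40, 41),
  order = 1:6,
  group = c("0.1", "0.1", "0.1", "1.1", "1.1", "1.1"),
  gu_a3 = c("DEU", "DEU", "DEU", "ESP", "ESP", "ESP"),
  stringsAsFactors = FALSE
)

# Toy ratio table for one indicator and one year; values are invented.
ratios <- data.frame(
  Country.Code = c("DEU", "FRA"),
  X2000        = c(0.95, 1.02),
  stringsAsFactors = FALSE
)

# Left join: every polygon vertex is kept, countries without data get NA.
map_data <- merge(map_df, ratios,
                  by.x = "gu_a3", by.y = "Country.Code",
                  all.x = TRUE, sort = FALSE)

# merge() does not guarantee row order, so restore the vertex order
# that polygon drawing depends on.
map_data <- map_data[order(map_data$order), ]
```

Keeping unmatched countries as NA, instead of dropping them, means they can still be drawn on the map in a neutral colour.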

## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS Sierra 10.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] backports_1.0.4 magrittr_1.5    rprojroot_1.1   tools_3.3.2    
##  [5] htmltools_0.3.5 yaml_2.1.14     Rcpp_0.12.8     stringi_1.1.2  
##  [9] rmarkdown_1.3   knitr_1.15.1    stringr_1.1.0   digest_0.6.11  
## [13] evaluate_0.10
