I loved this %>% crosstable

July 28, 2015

(This article was first published on » R, and kindly contributed to R-bloggers)

This is a public thank you for @heatherturner’s contribution. The SciencesPo crosstable can now be used in a chain (%>%), which is handy alongside other packages that have adopted the magrittr operator.

> candidatos %>%
+   filter(desc_cargo == 'DEPUTADO ESTADUAL' |
+          desc_cargo == 'DEPUTADO DISTRITAL' |
+          desc_cargo == 'DEPUTADO FEDERAL' |
+          desc_cargo == 'VEREADOR' |
+          desc_cargo == 'SENADOR') %>%
+   tab(desc_cargo, desc_sexo)

====================================================
                           desc_sexo                
                   -------------------------        
desc_cargo             NA   FEMININO MASCULINO  Total 
----------------------------------------------------
DEPUTADO DISTRITAL      1     826      2457     3284
                    0.03%     25%       75%     100%
DEPUTADO ESTADUAL     122   12595     48325    61042
                    0.20%     21%       79%     100%
DEPUTADO FEDERAL       40    5006     20176    25222
                    0.16%     20%       80%     100%
SENADOR                 4     161      1002     1167
                    0.34%     14%       86%     100%
VEREADOR             9682  376576   1162973  1549231
                    0.62%     24%       75%     100%
----------------------------------------------------
Total                9849  395164   1234933  1639946
                    0.60%     24%       75%     100%
====================================================

Chi-Square Test for Independence

Number of cases in table: 1639946 
Number of factors: 2 
Test for independence of all factors:
    Chisq = 1077.4, df = 8, p-value = 2.956e-227
                    X^2 df P(> X^2)
Likelihood Ratio 1216.0  8        0
Pearson          1077.4  8        0

Phi-Coefficient   : 0.026 
Contingency Coeff.: 0.026 
Cramer's V        : 0.018 
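
The three association measures above can be cross-checked by hand from the Pearson statistic. This is a minimal sketch in base R, assuming the usual textbook definitions (phi = sqrt(X^2/n), contingency coefficient = sqrt(X^2/(X^2 + n)), Cramer's V = sqrt(X^2/(n * min(r - 1, c - 1)))). The variable names are mine and are not part of SciencesPo.

```r
# Values copied from the printed output above
chisq <- 1077.4     # Pearson chi-square statistic
n     <- 1639946    # number of cases in the table
n_row <- 5          # levels of desc_cargo
n_col <- 3          # levels of desc_sexo (NA, FEMININO, MASCULINO)

# Textbook definitions (an assumption about what the package computes)
phi      <- sqrt(chisq / n)
cont     <- sqrt(chisq / (chisq + n))
cramer_v <- sqrt(chisq / (n * min(n_row - 1, n_col - 1)))

round(c(phi = phi, contingency = cont, cramers_v = cramer_v), 3)
```

With more than 1.6 million cases even a tiny association yields a vanishing p-value, so these coefficients say more about the strength of the relationship than the test does.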

# Reproducible example:

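library(SciencesPo)

# One row per applicant: 1835 women followed by 2691 men
gender <- rep(c("female", "male"), c(1835, 2691))
admitted <- rep(c("yes", "no", "yes", "no"), c(557, 1278, 1198, 1493))
# Departments of the female applicants (admitted first, then rejected)
dept <- rep(c("A", "B", "C", "D", "E", "F", "A", "B", "C", "D", "E", "F"),
            c(89, 17, 202, 131, 94, 24, 19, 8, 391, 244, 299, 317))
# Departments of the male applicants (same order)
dept2 <- rep(c("A", "B", "C", "D", "E", "F", "A", "B", "C", "D", "E", "F"),
             c(512, 353, 120, 138, 53, 22, 313, 207, 205, 279, 138, 351))
department <- c(dept, dept2)
ucb <- data.frame(gender, admitted, department)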


> ucb %>% tab(admitted, gender, department)
================================================================
                                 department                       
                  -----------------------------------------       
admitted gender   A      B      C      D      E      F    Total 
----------------------------------------------------------------
no       female     19      8    391    244    299    317   1278
                  1.5%  0.63%    31%    19%  23.4%    25%   100%
         male      313    207    205    279    138    351   1493
                 21.0% 13.86%    14%    19%   9.2%    24%   100%
         -------------------------------------------------------
         Total     332    215    596    523    437    668   2771
                 12.0%  7.76%    22%    19%  15.8%    24%   100%
----------------------------------------------------------------
yes      female     89     17    202    131     94     24    557
                   16%   3.1%    36%    24%  16.9%   4.3%   100%
         male      512    353    120    138     53     22   1198
                   43%  29.5%    10%    12%   4.4%   1.8%   100%
         -------------------------------------------------------
         Total     601    370    322    269    147     46   1755
                   34%  21.1%    18%    15%   8.4%   2.6%   100%
----------------------------------------------------------------
Total    female    108     25    593    375    393    341   1835
                  5.9%   1.4%    32%    20%  21.4%    19%   100%
         male      825    560    325    417    191    373   2691
                 30.7%  20.8%    12%    15%   7.1%    14%   100%
         -------------------------------------------------------
         Total     933    585    918    792    584    714   4526
                 20.6%  12.9%    20%    17%  12.9%    16%   100%
================================================================
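
For readers without SciencesPo installed, the counts in the table above can be cross-checked with base R alone. This is only a sketch using xtabs(), ftable() and prop.table(); it does not reproduce the formatted layout that tab() prints, and the object name counts is my own.

```r
# Rebuild the ucb data frame from the reproducible example
depts    <- c("A", "B", "C", "D", "E", "F")
gender   <- rep(c("female", "male"), c(1835, 2691))
admitted <- rep(c("yes", "no", "yes", "no"), c(557, 1278, 1198, 1493))
department <- c(
  rep(rep(depts, 2), c(89, 17, 202, 131, 94, 24, 19, 8, 391, 244, 299, 317)),
  rep(rep(depts, 2), c(512, 353, 120, 138, 53, 22, 313, 207, 205, 279, 138, 351))
)
ucb <- data.frame(gender, admitted, department)

# Three-way contingency table of raw counts
counts <- xtabs(~ admitted + gender + department, data = ucb)

# Flatten it so admitted and gender label the rows, as in the output above
ftable(counts, row.vars = c("admitted", "gender"))

# Row percentages within each admitted-by-gender combination
round(100 * prop.table(counts, margin = c(1, 2)), 1)
```

The grand-total rows and the "Total" sub-rows are not produced by this sketch; addmargins() could be applied to counts if they are needed.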
