Introducing the CGPfunctions package – March 22, 2018

March 21, 2018
(This article was first published on Chuck Powell, and kindly contributed to R-bloggers)

Overview

This package includes functions that I find useful for teaching
statistics as well as actually practicing the art. They typically are
not “new” methods but rather wrappers around either base R or other
packages and concepts I’m trying to master. It currently contains:

  • Plot2WayANOVA, which as the name implies conducts a two-way ANOVA
    and plots the results using ggplot2
  • neweta, a helper function that appends the results of a
    Type II eta squared calculation onto a classic ANOVA table
  • Mode, which finds the modal value(s) in a vector of data
  • SeeDist, which wraps around ggplot2 to provide visualizations of
    univariate data
  • OurConf, a simulation function that helps you learn about
    confidence intervals

Installation

# Install from CRAN
install.packages("CGPfunctions")

# Or install the development version from GitHub
# (highly recommended, since the package is under rapid development right now)
# install.packages("devtools")
devtools::install_github("ibecav/CGPfunctions")

Usage

library(CGPfunctions) will load the package, which contains five
functions:

SeeDist will give you some plots of the distribution of a variable
using ggplot2

library(CGPfunctions)
SeeDist(mtcars$hp,whatvar="Horsepower",whatplots="d")

#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>    52.0    96.5   123.0   146.7   180.0   335.0

Mode is a helper function that simply returns one or more modal values

Mode(mtcars$hp)
#> [1] 110 175 180
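
For readers curious how modal values can be found, here is a minimal base R sketch of the idea. The helper name find_modes is hypothetical, and this is not necessarily how Mode is implemented in the package. It tabulates the values and keeps every value tied for the highest count, so it can return more than one mode.

```r
# Hypothetical helper (not part of CGPfunctions): return every value
# that is tied for the highest frequency in a numeric vector.
find_modes <- function(x) {
  counts <- table(x)                       # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]
  as.numeric(top)                          # table() stores the values as names
}

find_modes(mtcars$hp)
```

Because table() sorts the distinct values, ties come back in ascending order.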

neweta is a helper function which returns a tibble containing AOV
output similar to summary(MyAOV) but with eta squared computed and
appended as an additional column

MyAOV <- aov(mpg~am*cyl, mtcars)
neweta(MyAOV)
#> # A tibble: 4 x 8
#>   Source       Df `Sum Sq` `Mean Sq` `F value`       p sigstars `eta sq`
#> 1 am            1     37.0     37.0       4.30  0.0480 *          0.0330
#> 2 cyl           1    450.     450.       52.0   0.     ***        0.399 
#> 3 am:cyl        1     29.4     29.4       3.40  0.0760 .          0.0260
#> 4 Residuals    28    242.       8.64     NA    NA      <NA>       0.215
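
To make the eta squared column concrete, here is a rough base R sketch of the arithmetic: each term's sum of squares divided by the total sum of squares of the response. Note the assumption: this sketch uses the sequential (Type I) sums of squares that summary() reports for an aov object, whereas neweta appends a Type II calculation, so the values will not match the table above for an unbalanced design like this one.

```r
# Sketch only: eta squared from the sequential (Type I) ANOVA table in base R.
# neweta uses Type II sums of squares, so its numbers differ for unbalanced data.
fit <- aov(mpg ~ am * cyl, data = mtcars)
aov_table <- summary(fit)[[1]]             # classic ANOVA table as a data frame

ss <- aov_table[["Sum Sq"]]                # sum of squares for each row
ss_total <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

eta_sq <- ss / ss_total                    # proportion of total variation per term
names(eta_sq) <- trimws(rownames(aov_table))
round(eta_sq, 3)
```

With sequential sums of squares the proportions add up to 1, which is not guaranteed for Type II sums of squares in an unbalanced design.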

The Plot2WayANOVA function conducts a classic analysis using existing
R functions and packages in a sane and defensible manner, though not
necessarily in the one and only manner.

Plot2WayANOVA(mpg~am*cyl, mtcars)
#> 
#> Converting am to a factor --- check your results
#> 
#> Converting cyl to a factor --- check your results
#> 
#> You have an unbalanced design. Using Type II sum of squares, eta squared may not sum to 1.0
#> # A tibble: 4 x 8
#>   Source       Df `Sum Sq` `Mean Sq` `F value`       p sigstars `eta sq`
#> 1 am            1     36.8     36.8       4.00  0.0560 .          0.0330
#> 2 cyl           2    456.     228.       24.8   0.     ***        0.405 
#> 3 am:cyl        2     25.4     12.7       1.40  0.269  ""         0.0230
#> 4 Residuals    26    239.       9.19     NA    NA      <NA>       0.212
#> 
#> Table of group means
#> # A tibble: 6 x 9
#> # Groups:   am [2]
#>   am    cyl   TheMean TheSD TheSEM CIMuliplier LowerBound UpperBound     N
#> 1 0     4        22.9 1.45   0.839        4.30       19.3       26.5     3
#> 2 0     6        19.1 1.63   0.816        3.18       16.5       21.7     4
#> 3 0     8        15.0 2.77   0.801        2.20       13.3       16.8    12
#> 4 1     4        28.1 4.48   1.59         2.36       24.3       31.8     8
#> 5 1     6        20.6 0.751  0.433        4.30       18.7       22.4     3
#> 6 1     8        15.4 0.566  0.400       12.7        10.3       20.5     2
#> 
#> Testing Homogeneity of Variance with Brown-Forsythe
#>    *** Possible violation of the assumption ***
#> Levene's Test for Homogeneity of Variance (center = median)
#>       Df F value  Pr(>F)  
#> group  5   2.736 0.04086 *
#>       26                  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Testing Normality Assumption with Shapiro-Wilk
#> 
#>  Shapiro-Wilk normality test
#> 
#> data:  MyAOV_residuals
#> W = 0.96277, p-value = 0.3263
#> 
#> Interaction graph plotted...
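
The columns in the table of group means can be reproduced by hand, which may help when teaching what the bounds mean. The sketch below is base R and is an assumption about the arithmetic rather than the package's own code: for each am by cyl cell it computes the mean, SD, standard error, a t-based multiplier with n - 1 degrees of freedom at the 95% level, and the resulting bounds. The helper name summarise_cell is hypothetical.

```r
# Sketch: per-cell mean, SD, SEM and t-based confidence bounds in base R.
# summarise_cell is a hypothetical helper, not a CGPfunctions function.
summarise_cell <- function(y, conf.level = 0.95) {
  n <- length(y)
  sem <- sd(y) / sqrt(n)                               # standard error of the mean
  mult <- qt(1 - (1 - conf.level) / 2, df = n - 1)     # two-sided t multiplier
  c(mean = mean(y), sd = sd(y), sem = sem, multiplier = mult,
    lower = mean(y) - mult * sem, upper = mean(y) + mult * sem, n = n)
}

# One element per am x cyl combination, named like "0.4" (am = 0, cyl = 4)
cells <- split(mtcars$mpg, list(am = mtcars$am, cyl = mtcars$cyl))
cell_table <- t(sapply(cells, summarise_cell))
round(cell_table, 2)
```

The multiplier grows quickly as a cell gets smaller, which is why the cell with only two cars has such wide bounds.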

OurConf is a simulation function that helps you learn about confidence
intervals

OurConf(samples = 20, n = 15, mu = 100, sigma = 20, conf.level = 0.90)

#> 100 % of the confidence intervals contain Mu = 100 .
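
The idea behind this kind of simulation can be sketched in a few lines of base R. This is not OurConf itself: the function name simulate_coverage is hypothetical, and it assumes normally distributed samples and t-based intervals from t.test. It draws repeated samples, builds an interval from each, and reports the share of intervals that contain mu.

```r
# Sketch: proportion of simulated confidence intervals that contain mu.
# simulate_coverage is a hypothetical helper, not part of CGPfunctions.
simulate_coverage <- function(samples = 20, n = 15, mu = 100,
                              sigma = 20, conf.level = 0.90) {
  covered <- replicate(samples, {
    x <- rnorm(n, mean = mu, sd = sigma)               # one simulated sample
    ci <- t.test(x, conf.level = conf.level)$conf.int  # t-based interval
    ci[1] <= mu && mu <= ci[2]                         # does it contain mu?
  })
  mean(covered)
}

set.seed(2018)                       # arbitrary seed, only for reproducibility
simulate_coverage(samples = 20)      # small runs vary a lot around 0.90
simulate_coverage(samples = 5000)    # large runs settle close to 0.90
```

With only 20 samples, a run in which every interval contains mu is not unusual, which is why a single result of 100 % should not be read as the true coverage rate.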

Credits

Many thanks to Dani Navarro and the book Learning Statistics with R,
whose etaSquared function was the genesis of neweta.

“He who gives up safety for speed deserves neither.”

A shoutout to some other packages I find essential.

  • stringr, for strings.
  • lubridate, for date/times.
  • forcats, for factors.
  • haven, for SPSS, SAS and Stata files.
  • readxl, for .xls and .xlsx files.
  • modelr, for modelling within a pipeline.
  • broom, for turning models into tidy data.
  • ggplot2, for data visualisation.
  • dplyr, for data manipulation.
  • tidyr, for data tidying.
  • readr, for data import.
  • purrr, for functional programming.
  • tibble, for tibbles, a modern re-imagining of data frames.

Leaving Feedback

If you like CGPfunctions, please consider leaving feedback here.

Contributing

Contributions in the form of feedback, comments, code, and bug reports
are most welcome. How to contribute:

  • Issues, bug reports, and wish lists: File a GitHub issue.
  • Contact the maintainer by email: ibecav at gmail.com.

License

This work (blogpost) is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License.
