# Testing out ggQC

February 10, 2019

`R` is a great language for statistical process control largely because there are several packages which make it quite easy, including qcc, SixSigma, and a new one that I want to explore in this post called ggQC. I’ve been using `qcc` for a while and I trust its calculations, so I’ll use that to validate `ggQC`.

`ggQC` was created by Kenith Grey, and I came across it when I read his excellent article on why we use r-bar and a constant to calculate the control limits on a control chart. The advantage of `ggQC` is that it extends `ggplot2`, so you get all of the flexibility that comes with that.

First, let’s load the `tidyverse`, `qcc`, and `ggQC` (you’ll need to install these first if you haven’t already).

```r
library(tidyverse)
library(qcc)
library(ggQC)
```

Now, let’s create an x-bar chart with qcc to use as a baseline.

```r
# Generate sample data: 30 in-control subgroups of 5 measurements each,
# followed by 2 subgroups whose mean is shifted up by 0.006
set.seed(20190117)
example_df <- tibble(values = rnorm(n = 30 * 5, mean = 25, sd = .005),
                     subgroup = rep(1:30, 5),
                     n = rep(1:5, each = 30)) %>%
  add_row(values = rnorm(n = 2 * 5, mean = 25 + .006, sd = .005),
          subgroup = rep(31:32, 5),
          n = rep(1:5, each = 2))
```
```r
# Spread the data and plot the control chart
qcc_chart <- example_df %>%
  spread(key = n, value = values) %>%
  select(-subgroup) %>%
  qcc(type = "xbar", data.name = "Example Data")
```

[Figure: `qcc_chart`]

You can see that as long as the data is formatted with each group on its own row, this package makes it very easy to generate a functional control chart. However, customizing this chart isn’t as easy for those of us who primarily use `ggplot2` instead of base plots.

Let’s take a look at the same thing with `ggQC`. It needs the data in long format, which `example_df` already is, so we can pass it straight to `ggplot2` and add some `ggQC` layers.

```r
ggQC_example <- example_df %>%
  ggplot(aes(x = subgroup, y = values, group = 1)) +
  # Plot the subgroup means as points joined by a line
  # (`fun` replaces the deprecated `fun.y` in ggplot2 >= 3.3.0)
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun = mean, geom = "line") +
  # Add the center line and control limits
  stat_QC(method = "xBar.rBar", auto.label = TRUE, label.digits = 4) +
  # Pad the x-axis for the labels
  # (`expansion()` replaces the deprecated `expand_scale()` in ggplot2 >= 3.3.0)
  scale_x_continuous(expand = expansion(mult = c(.05, .15))) +
  ylab("x-bar") +
  theme_bw()

ggQC_example
```

You can see that both packages compute the UCL and LCL identically.
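As a rough cross-check on those limits, here is a minimal base-R sketch of the usual x-bar/R-bar calculation: the center line is the mean of the subgroup means, and the limits sit `A2` times the average range on either side. The `measurements` matrix is stand-in data invented for illustration (it is not `example_df`), and `A2 = 0.577` is the standard tabulated constant for subgroups of five.

```r
# Hand-computed x-bar chart limits: x-double-bar +/- A2 * R-bar (base R only)
set.seed(20190117)
n_per_group <- 5

# Hypothetical stand-in data: 30 subgroups (rows) of 5 measurements (columns)
measurements <- matrix(rnorm(30 * n_per_group, mean = 25, sd = 0.005),
                       ncol = n_per_group)

subgroup_means  <- rowMeans(measurements)
subgroup_ranges <- apply(measurements, 1, function(x) diff(range(x)))

A2 <- 0.577  # tabulated constant for subgroups of size 5

center_line <- mean(subgroup_means)   # x-double-bar
r_bar       <- mean(subgroup_ranges)  # average within-subgroup range

limits <- c(LCL = center_line - A2 * r_bar,
            CL  = center_line,
            UCL = center_line + A2 * r_bar)
print(round(limits, 4))
```

If the subgroup sizes differ, the constant changes with them, so this sketch only applies to equally sized subgroups.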

I would like for the two stat summaries to be plotted as part of `stat_QC`, but I like that you get more control over them when you plot them on their own and that you see exactly what you’re plotting. While building the chart this way seems a little tedious, it makes it easy to read what’s happening and also easy to add additional layers, like overlaying the individual points:

```r
ggQC_example +
  geom_point(alpha = .25)
```

I like showing the plot like this because it is clearer to an innocent bystander exactly what is being plotted.

One thing that I would like for ggQC to provide is an easy way to make a decent chart in just a few lines of code. For example, something like this:

```r
# Wish-list syntax: stat_quick_QC() does not exist in ggQC
tidy_df %>%
  ggplot(aes(x = Groups, y = Observations)) +
  stat_quick_QC(method = "xBar.rBar", auto.label = TRUE, label.digits = 4)
```
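Until something like that exists, one possible stopgap is to bundle the layers in a small helper of your own, since `ggplot2` lets you add a list of layers with `+`. This is only a sketch: `quick_xbar_layers()` and its `label_digits` argument are hypothetical names rather than part of ggQC, and the `fun` argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)
library(ggQC)

# Hypothetical helper (not a ggQC function): returns the layers used for
# the x-bar chart in this post so they can be added in a single step
quick_xbar_layers <- function(label_digits = 4) {
  list(
    stat_summary(fun = mean, geom = "point"),  # subgroup means
    stat_summary(fun = mean, geom = "line"),   # line joining the means
    stat_QC(method = "xBar.rBar", auto.label = TRUE,
            label.digits = label_digits),      # center line and limits
    ylab("x-bar"),
    theme_bw()
  )
}

# Usage, with the placeholder names from the wish-list example:
# ggplot(tidy_df, aes(x = Groups, y = Observations, group = 1)) +
#   quick_xbar_layers()
```

The trade-off is the one already mentioned: the chart takes fewer lines, but the reader no longer sees each layer spelled out.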

Another thing missing from the ggQC package is a clean way to color the violations. The `stat_qc_violations()` layer is supposed to do this, but I think its four facets make for a messy solution. It is also buggy, as you can see by the two violations that it identifies being plotted in the wrong place on the x-axis.

```r
ggQC_example +
  stat_qc_violations(method = "xBar.rBar")
```

So, let’s try a work-around to color the violations. First we’ll get the violations from the ggQC `QC_Violations` function and then we’ll add them to the `ggQC_example` plot, colored red.

```r
# Keep only the subgroups that QC_Violations() flags
violations <- example_df %>%
  QC_Violations(value = "values", grouping = "subgroup", method = "xBar.rBar") %>%
  filter(Violation == TRUE) %>%
  select(data, Index) %>%
  unique()

# Overlay the flagged subgroup means in red
ggQC_example +
  geom_point(data = violations, aes(x = Index, y = data), color = "red")
```

You can see that `ggQC` doesn’t use the same violation rules as `qcc`. `ggQC` lists its rules in the help documentation for `QC_Violations()`, but I don’t see the rules documented anywhere for the `qcc` package. There are multiple sets of rules that people use, but I like to refer to the Western Electric rules in the NIST Engineering Statistics Handbook, since they’re published by NIST and are publicly available. However, neither package uses these rules exactly, so let’s do the same work-around with the `qcc` package for fun.
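For reference, the Western Electric zone rules as I understand them from the NIST handbook can be sketched in a few lines of base R. A point is flagged if it is beyond 3 sigma, if 2 of the last 3 points are beyond 2 sigma on the same side, if 4 of the last 5 are beyond 1 sigma on the same side, or if 8 in a row fall on the same side of the center line. `western_electric()` and its helpers are hypothetical names, not functions from `qcc` or `ggQC`, and `sigma` here means the standard deviation of the plotted statistic (for an x-bar chart, one third of the distance from the center line to the UCL).

```r
# Sketch of the four Western Electric zone rules (hypothetical helper)
western_electric <- function(x, center, sigma) {
  z <- (x - center) / sigma

  # TRUE at i when at least k of the last m points (ending at i) are hits
  k_of_m <- function(hit, k, m) {
    vapply(seq_along(hit), function(i) {
      sum(hit[max(1, i - m + 1):i]) >= k
    }, logical(1))
  }

  # Apply all four rules to one side of the center line (s = 1 or -1)
  one_side <- function(s) {
    zs <- s * z
    k_of_m(zs > 3, 1, 1) |   # 1 point beyond 3 sigma
      k_of_m(zs > 2, 2, 3) | # 2 of 3 beyond 2 sigma
      k_of_m(zs > 1, 4, 5) | # 4 of 5 beyond 1 sigma
      k_of_m(zs > 0, 8, 8)   # 8 in a row on one side
  }

  one_side(1) | one_side(-1)
}

# Toy example: standardized values with two points beyond 2 sigma
toy_means <- c(0, 2.5, 0, 2.5, 0, 0)
which(western_electric(toy_means, center = 0, sigma = 1))
```

The function returns a logical vector, so its output could be used to filter the subgroup means in the same way as the package-specific violations. The `qcc` work-around follows.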

```r
# Collect the subgroup indices that qcc flagged (beyond limits and
# violating runs), then look up each flagged subgroup's mean
violations <- tibble(Observations = unlist(qcc_chart$violations)) %>%
  left_join(tibble(value = as.numeric(qcc_chart$statistics)) %>%
              mutate(Observations = row_number()),
            by = "Observations")
```
```r
ggQC_example +
  geom_point(data = violations, aes(x = Observations, y = value), color = "red")
```

Overall I like the ggQC package. It’s still an early version and has some things to work out, but I think that it’s the natural progression of control charting in `R`.

This post was generated with version 0.0.31 of `ggQC`.
