R is a great language for statistical process control, largely because several packages make it quite easy, including qcc, SixSigma, and a newer one that I want to explore in this post, ggQC. I’ve been using qcc for a while and I trust its calculations, so I’ll use it to validate ggQC.
ggQC was created by Kenith Grey, and I came across it when I read his excellent article on why we use r-bar and a constant to calculate the control limits on a control chart. The advantage of ggQC is that it extends ggplot2, so you get all of the flexibility that comes with that.
First, let’s load the tidyverse, qcc, and ggQC packages (you’ll need to install these first if you haven’t already).
library(tidyverse)
library(qcc)
library(ggQC)
Now, let’s create an x-bar chart with qcc to use as a baseline.
# Generate sample data
set.seed(20190117)
example_df <- data_frame(values = rnorm(n = 30 * 5, mean = 25, sd = .005),
                         subgroup = rep(1:30, 5),
                         n = rep(1:5, each = 30)) %>%
  add_row(values = rnorm(n = 2 * 5, mean = 25 + .006, sd = .005),
          subgroup = rep(31:32, 5),
          n = rep(1:5, each = 2))
## Warning: `data_frame()` is deprecated, use `tibble()`.
## This warning is displayed once per session.
# Spread the data and plot the control chart
qcc_chart <- example_df %>%
  spread(key = n, value = values) %>%
  select(-subgroup) %>%
  qcc(type = "xbar", data.name = "Example Data")
You can see that as long as the data is formatted with each group on its own row, this package makes it very easy to generate a functional control chart. However, customizing this chart isn’t as easy for those of us who primarily use ggplot2 instead of base plots.
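As a rough illustration of what "each group on its own row" means, here is a minimal base R sketch. The toy measurements and the names long_df and wide_mat are made up for this example; they are not the data used in this post.

```r
# Hypothetical toy data: 3 subgroups with 2 measurements each, in long format
long_df <- data.frame(
  subgroup = rep(1:3, times = 2),
  n        = rep(1:2, each = 3),
  values   = c(25.001, 24.998, 25.003, 24.999, 25.002, 24.997)
)

# Reshape so that each subgroup is a row and each measurement is a column.
# Every (subgroup, n) cell holds exactly one value, so mean() just returns it.
wide_mat <- with(long_df, tapply(values, list(subgroup = subgroup, n = n), mean))

wide_mat            # 3 rows (subgroups) by 2 columns (measurements)
rowMeans(wide_mat)  # the subgroup means that an x-bar chart plots
```

A matrix or data frame shaped like wide_mat is the row-per-subgroup layout described above; the long_df shape is the one that ggplot2 works with.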
Let’s take a look at the same thing with ggQC. First we’ll need to put the data in long format, then we’ll pass that to ggplot2 and add the summary and stat_QC layers.
ggQC_example <- example_df %>%
  ggplot(aes(x = subgroup, y = values, group = 1)) +
  stat_summary(fun.y = mean, geom = "point") +
  stat_summary(fun.y = mean, geom = "line") +
  stat_QC(method = "xBar.rBar", auto.label = TRUE, label.digits = 4) +
  scale_x_continuous(expand = expand_scale(mult = c(.05, .15))) + # Pad the x-axis for the labels
  ylab("x-bar") +
  theme_bw()

ggQC_example
You can see that both packages compute the UCL and LCL identically.
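If you want to sanity-check limits like these by hand, the textbook x-bar/R-bar calculation is the grand mean plus or minus a constant (A2) times the average subgroup range. Here is a minimal sketch in base R, assuming subgroups of size 5, for which the tabulated A2 is 0.577. The data and variable names are made up for illustration, and I am not claiming this is exactly how either package computes its limits internally.

```r
# Hypothetical data: 20 subgroups (rows) of 5 measurements each (columns)
set.seed(1)
wide <- matrix(rnorm(20 * 5, mean = 25, sd = 0.005), nrow = 20, ncol = 5)

xbar <- rowMeans(wide)                              # subgroup means
r    <- apply(wide, 1, function(x) diff(range(x)))  # subgroup ranges

A2 <- 0.577            # tabulated constant for subgroups of size 5

center <- mean(xbar)   # grand mean, the center line
rbar   <- mean(r)      # average range

ucl <- center + A2 * rbar
lcl <- center - A2 * rbar

c(LCL = lcl, CL = center, UCL = ucl)
```

For a different subgroup size you would need to swap in the matching A2 from a control chart constants table.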
I would like for the two stat summaries to be plotted as part of stat_QC, but I like that you get more control over them when you plot them on their own and that you see exactly what you’re plotting. While building the chart this way seems a little tedious, it makes it easy to read what’s happening and also easy to add additional layers, like overlaying the individual points:
ggQC_example + geom_point(alpha = .25)
I like showing the plot like this because it is more clear to an innocent bystander exactly what is being plotted.
One thing that I would like for ggQC to provide is an easy way to make a decent chart in just a few lines of code. For example, something like this:
tidy_df %>%
  ggplot(aes(x = Groups, y = Observations)) +
  stat_quick_QC(method = "xBar.rBar", auto.label = TRUE, label.digits = 4)
Another thing missing from the ggQC package is coloring the violations. It is supposed to color them with the
stat_qc_violations() layer, but I think that is a messy solution with the four facets. It is also buggy, as you can see by the two violations that it identifies being plotted in the wrong place on the x axis.
ggQC_example + stat_qc_violations(method = "xBar.rBar")
So, let’s try a work-around to color the violations. First we’ll get the violations from the ggQC
QC_Violations function and then we’ll add them to the
ggQC_example plot, colored red.
violations <- example_df %>%
  QC_Violations(value = "values", grouping = "subgroup", method = "xBar.rBar") %>%
  filter(Violation == TRUE) %>%
  select(data, Index) %>%
  unique()

ggQC_example +
  geom_point(data = violations, aes(x = Index, y = data), color = "red")
You can see that ggQC doesn’t use the same violation rules as qcc. ggQC lists its rules in the help documentation for QC_Violations(), but I don’t see the rules documented anywhere for the qcc package. There are multiple sets of rules that people use, but I like to refer to the Western Electric rules in the NIST Engineering Statistics Handbook, since they’re published by NIST and are publicly available. However, neither package uses these rules exactly, so let’s do the same work-around with the qcc package for fun.
violations <- data_frame(Observations = qcc_chart$violations) %>%
  map(unlist) %>%
  as_data_frame() %>%
  left_join(as_data_frame(qcc_chart$statistics) %>%
              mutate(Observations = row_number()),
            by = "Observations")
## Warning: `as_data_frame()` is deprecated, use `as_tibble()` (but mind the new semantics).
## This warning is displayed once per session.
ggQC_example + geom_point(data = violations, aes(x = Observations, y = value), color = "red")
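For reference, the Western Electric rules mentioned above are simple enough to sketch in base R. This is my own rough version, not code from either package: weco_violations() is a hypothetical helper, and it assumes you already know the center line and the standard deviation of the plotted statistic. It flags a point when any of these holds on one side of the center line: one point beyond 3 sigma, 2 of the last 3 beyond 2 sigma, 4 of the last 5 beyond 1 sigma, or 8 in a row on the same side.

```r
# Hypothetical helper: flag Western Electric rule violations.
# x      = plotted statistic (e.g. subgroup means)
# center = center line; sigma = standard deviation of the plotted statistic
weco_violations <- function(x, center, sigma) {
  z <- (x - center) / sigma

  # TRUE at position i when at least k of the last m points (ending at i)
  # satisfy `hit`. Positions without a full window of m points are not flagged.
  k_of_last_m <- function(hit, k, m) {
    vapply(seq_along(hit), function(i) {
      i >= m && sum(hit[(i - m + 1):i]) >= k
    }, logical(1))
  }

  # Apply the four rules to one side of the center line
  one_side <- function(z) {
    k_of_last_m(z > 3, 1, 1) |   # 1 point beyond 3 sigma
      k_of_last_m(z > 2, 2, 3) | # 2 of the last 3 beyond 2 sigma
      k_of_last_m(z > 1, 4, 5) | # 4 of the last 5 beyond 1 sigma
      k_of_last_m(z > 0, 8, 8)   # 8 consecutive on the same side
  }

  one_side(z) | one_side(-z)     # check above and below the center line
}

# Made-up, already standardized values (center = 0, sigma = 1)
x <- c(0.2, -0.5, 0.1, 3.4, 0.3, -0.2, 0.4, 0.6, 0.1, 0.7, 0.2, 0.5, 0.3, 0.9)
which(weco_violations(x, center = 0, sigma = 1))
```

In this made-up series, point 4 is flagged for being beyond 3 sigma and point 14 for completing a run of 8 above the center line. The resulting logical vector could be used to build a violations data frame like the ones above.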
Overall I like the ggQC package. It’s still an early version and has some things to work out, but I think that it’s the natural progression of control charting in R.
This post was generated with version 0.0.31 of ggQC.