Subjugation to the Sigmas

August 23, 2011

(This article was first published on The Pith of Performance, and kindly contributed to R-bloggers)

No doubt you've heard about the tyranny of the 9s in reference to computer system availability. You're probably also familiar with the phrase six sigma, either in the context of manufacturing process quality control or the improvement of business processes. As we discovered in the recent Guerrilla Data Analysis Techniques class, the two concepts are related.


Nines  Percent      Downtime/Year        σ Level
4      99.99%       52.596 minutes
5      99.999%      5.2596 minutes       -
6      99.9999%     31.5576 seconds
7      99.99999%    3.15576 seconds      -
8      99.999999%   315.6 milliseconds


In this way, people like to talk about achieving "5 nines" availability or a "six sigma" quality level. These phrases are often bandied about without appreciating:
  1. that nines and sigmas refer to similar criteria.
  2. that high nines and high sigmas are very difficult to achieve consistently.
See the appended Comments below for more details and examples.

The Downtime/Year column of the table can be reproduced with the following R function, which shows how much shorter the permitted downtime per year becomes with each additional 9. Hence the term tyranny.


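downt <- function(nines, tunit=c('s','m','h')) {
  # Pick a single time unit; defaults to 's' when tunit is omitted
  tunit <- match.arg(tunit)
  # Unavailable fraction times the number of seconds in a 365.25-day year
  ds <- 10^(-nines) * 365.25*24*60*60
  if(tunit == 's') { ts <- 1; tu <- "seconds" }
  if(tunit == 'm') { ts <- 60; tu <- "minutes" }
  if(tunit == 'h') { ts <- 3600; tu <- "hours" }
  return(sprintf("Downtime per year at %d nines: %g %s", nines, ds/ts, tu))
}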

> downt(5,'m')
[1] "Downtime per year at 5 nines: 5.2596 minutes"
> downt(8,'s')
[1] "Downtime per year at 8 nines: 0.315576 seconds"
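The same arithmetic can be applied to several nines at once to regenerate the whole downtime column. This is only a minimal sketch: the helper name `downt_secs` is an assumption of this example and is not part of the original function.

```r
# Hypothetical helper (not in the original post): downtime per year in
# seconds, using the same 365.25-day year as downt().
downt_secs <- function(nines) {
  10^(-nines) * 365.25 * 24 * 60 * 60
}

nines <- 4:8
tab <- data.frame(
  nines   = nines,
  seconds = downt_secs(nines),
  minutes = downt_secs(nines) / 60
)
print(tab)

# Ratio of each row's downtime to the previous row's downtime.
ratios <- tab$seconds[-1] / tab$seconds[-length(tab$seconds)]
print(ratios)
```

Each ratio is one tenth: every additional 9 cuts the permitted downtime by a factor of 10.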
The associated σ levels correspond to the area under the Normal (Gaussian) or "bell-shaped" curve within an interval of ±kσ centered on the mean (μ), where k is the σ level and σ is the standard deviation in the usual way.
The corresponding area under the Normal curve can be calculated using the following R function:

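library(NORMT3)   # provides erf()
sigp <- function(sigma) {
  sigma <- as.integer(sigma)
  # Area within sigma standard deviations of the mean.
  # Re() is a no-op for real values and drops a zero imaginary part otherwise.
  apc <- Re(erf(sigma/sqrt(2)))
  return(sprintf("%d-sigma bell area: %10.8f%%; Prob(chance): %e", sigma, apc*100, 1-apc))
}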

> sigp(2)
[1] "2-sigma bell area: 95.44997361%; Prob(chance): 4.550026e-02"
> sigp(5)
[1] "5-sigma bell area: 99.99994267%; Prob(chance): 5.733031e-07"
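If the NORMT3 package is not at hand, the same area can be obtained from base R, because erf(k/√2) equals 2Φ(k) − 1, where Φ is the standard Normal distribution function. The following is a sketch under that identity; the name `sigp_base` is an assumption of this example, not something from the original post.

```r
# Hypothetical base-R variant of sigp(): uses pnorm() from the stats
# package instead of erf() from NORMT3.
sigp_base <- function(k) {
  inside  <- 2 * pnorm(k) - 1   # area within k standard deviations of the mean
  outside <- 2 * pnorm(-k)      # two-sided tail area, computed directly
  c(inside = inside, outside = outside)
}

print(sigp_base(2))
print(sigp_base(5))
```

Computing the tail area directly from `pnorm(-k)` avoids the loss of precision that subtracting a number very close to 1 would cause at large k.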
So, 5σ corresponds to slightly more than 99.9999% of the area under the bell curve, the total area being 100%. It also corresponds closely to six 9s availability. The 2nd number computed by sigp is the probability that the achieved availability was a fluke. A reasonable mnemonic for some of these values is:
  • 3σ corresponds roughly to a probability of 1 in 1,000 (more precisely, about 1 in 370) that four 9s availability occurred by chance.
  • 5σ is roughly a 1 in a million chance (about 1 in 1.7 million), which is like flipping a fair coin and getting 20 heads in a row.
  • 6σ is roughly a 1 in a billion chance (about 1 in 500 million) that it was a fluke.
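The mnemonic values above are order-of-magnitude roundings. A short sketch shows the underlying probabilities, assuming base R's `pnorm` in place of the `erf` function used by sigp; the variable names are illustrative only.

```r
# Two-sided tail probability outside k standard deviations of the mean,
# together with the equivalent "1 in N" odds.
k <- c(3, 5, 6)
p_chance <- 2 * pnorm(-k)
odds <- data.frame(sigma = k, p_chance = p_chance,
                   one_in = round(1 / p_chance))
print(odds)

# Probability of 20 heads in a row with a fair coin, for comparison
# with the 5-sigma row.
p_heads20 <- 0.5^20
print(p_heads20)   # 1 / 1048576
```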
Now you see why these goals are easy to covet but hard to achieve.
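The correspondence between nines and σ levels can also be run in the other direction. The sketch below assumes that the unavailability 10^(-nines) is read as the two-sided tail area of the Normal curve, which is the same convention sigp uses; the helper name `nines_to_sigma` is hypothetical.

```r
# Hypothetical helper: the sigma level k whose two-sided tail area
# equals the unavailability 10^(-nines).
nines_to_sigma <- function(nines) {
  qnorm(10^(-nines) / 2, lower.tail = FALSE)
}

k <- nines_to_sigma(4:8)
print(round(k, 2))

# Round trip: the tail area at each k recovers the unavailability.
print(2 * pnorm(-k))
```

Under this reading, six 9s sits just below 5σ, which is consistent with the remark that the two correspond closely.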
