# R and Tolerance Intervals

April 19, 2010

(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers)

Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which gives a range of values expected to contain a specified proportion of a population, with a stated level of confidence. The R package tolerance can be used to create a variety of tolerance intervals.

The endpoints of the estimated interval are the tolerance limits: the values between which the stated proportion of the population is expected to fall. The function normtol.int from the tolerance package calculates a tolerance interval for data from a normal distribution.
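To make the definition concrete, here is a small base R simulation sketch. It is not taken from the tolerance package: the population parameters (mu, sigma), the seed, and the use of Howe's approximation for the two-sided factor k are all assumptions chosen for illustration. For each simulated sample it builds the interval mean ± k × sd and records the fraction of the true population that the interval actually contains.

```r
# Simulation sketch: how often does a 90% content / 95% confidence
# tolerance interval really contain at least 90% of a normal population?
set.seed(1)                 # arbitrary seed, for reproducibility only

n     <- 25                 # sample size
P     <- 0.90               # proportion of the population to cover
alpha <- 0.05               # 1 - confidence level
mu    <- 100                # hypothetical true mean
sigma <- 5                  # hypothetical true standard deviation

# Howe's approximate two-sided tolerance factor (an assumed method)
k <- sqrt((n - 1) * (1 + 1 / n) * qnorm((1 + P) / 2)^2 /
          qchisq(alpha, df = n - 1))

# For each simulated sample, the fraction of the true population
# that falls inside the sample's tolerance interval
content <- replicate(5000, {
  x     <- rnorm(n, mean = mu, sd = sigma)
  lower <- mean(x) - k * sd(x)
  upper <- mean(x) + k * sd(x)
  pnorm(upper, mu, sigma) - pnorm(lower, mu, sigma)
})

# Share of intervals that cover at least P of the population
covered <- mean(content >= P)
covered
```

If the factor is doing its job, the share of intervals that cover at least 90% of the population should come out close to the 95% confidence level.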

The function arguments include the data itself, supplied as a vector x. The confidence level associated with the tolerance interval is specified through alpha, which is one minus the confidence level, so alpha is 0.05 for 95% confidence. The argument P is the proportion of the population to be covered by the tolerance interval. The side argument determines whether a one-sided (side = 1) or two-sided (side = 2) interval is required.

Consider a simulated set of data from a manufacturing process, stored in R as the vector obs:

    obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51,
            97.93, 96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92,
            97.43, 102.76, 100.03, 107.12, 104.96, 105.32, 87.06, 97.89,
            100.23)

A 95% tolerance interval for 90% of data of this type, based on the 25 observations above, is created with this code (after loading the package with library(tolerance)):

    > normtol.int(x = obs, alpha = 0.05, P = 0.90, side = 2)
      alpha   P    x.bar 2-sided.lower 2-sided.upper
    1  0.05 0.9 100.5192      90.07606      110.9623
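For readers curious about where such limits come from, the sketch below computes a two-sided normal tolerance interval by hand in base R as mean ± k × sd. The formula for k is an assumption on my part, not something stated in the article: it is Howe's approximation with a small-sample correction term G. It gives limits that agree closely with the output shown above, but it should not be read as a description of the package's internals.

```r
# Hand computation of a two-sided normal tolerance interval (base R only).
obs <- c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51,
         97.93, 96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92,
         97.43, 102.76, 100.03, 107.12, 104.96, 105.32, 87.06, 97.89,
         100.23)

n     <- length(obs)        # 25 observations
f     <- n - 1              # degrees of freedom
alpha <- 0.05               # 1 - confidence level
P     <- 0.90               # proportion of the population to cover

z_p   <- qnorm((1 + P) / 2)      # normal quantile for central content P
chi_a <- qchisq(alpha, df = f)   # lower chi-square quantile

# Assumed small-sample correction applied to Howe's factor
G <- (f - 2 - chi_a) / (2 * (n + 1)^2)
k <- z_p * sqrt(f * (1 + 1 / n) / chi_a * (1 + G))

# Lower and upper tolerance limits
limits <- mean(obs) + c(-1, 1) * k * sd(obs)
limits
```

Dropping the (1 + G) term gives the plain Howe factor, which produces a slightly narrower interval.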

The values of alpha and P are as noted above, and the sample mean is reported along with the lower and upper tolerance limits, since we asked for a two-sided interval. The call is easily changed to cover 95% rather than 90% of the data:

    > normtol.int(x = obs, alpha = 0.05, P = 0.95, side = 2)
      alpha    P    x.bar 2-sided.lower 2-sided.upper
    1  0.05 0.95 100.5192      88.07543      112.9630
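The side argument described earlier also allows a one-sided limit, for example an upper bound that 90% of the population is expected to stay below. For normal data, a one-sided factor can be obtained from the noncentral t distribution. The base R sketch below illustrates that calculation. The names k1 and upper1 are my own, and the code is an illustration of the standard textbook result rather than a copy of what normtol.int does with side = 1.

```r
# One-sided upper tolerance limit for normal data (base R only).
obs <- c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51,
         97.93, 96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92,
         97.43, 102.76, 100.03, 107.12, 104.96, 105.32, 87.06, 97.89,
         100.23)

n     <- length(obs)
alpha <- 0.05               # 1 - confidence level
P     <- 0.90               # proportion of the population below the limit

# One-sided factor from the noncentral t distribution with
# noncentrality parameter qnorm(P) * sqrt(n)
k1 <- qt(1 - alpha, df = n - 1, ncp = qnorm(P) * sqrt(n)) / sqrt(n)

# Upper limit: with 95% confidence, at least 90% of the population lies below it
upper1 <- mean(obs) + k1 * sd(obs)
upper1
```

The one-sided factor is smaller than the two-sided one for the same alpha and P, because all of the uncovered proportion is allowed to sit in a single tail.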

The tolerance package can also create intervals for data from other distributions.
