R and Tolerance Intervals

April 19, 2010

[This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers.]

Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval: a range that, with a stated level of confidence, contains at least a specified proportion of the population. The R package tolerance can be used to create a variety of tolerance intervals.

The endpoints of the estimated interval are the tolerance limits: the values between which the stated proportion of the population is expected to fall. The function normtol.int from the tolerance package calculates a tolerance interval for data from a normal distribution.
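To make the definition concrete, the following base R simulation is a minimal sketch of what "95% confidence, 90% coverage" means. It assumes a standard normal population and uses Howe's approximation for the two-sided tolerance factor, which is an illustrative choice here and not necessarily the calculation normtol.int performs. The names k, content and conf_hat are made up for this example.

```r
# Monte Carlo check of a two-sided normal tolerance interval.
# Assumptions: standard normal population, n = 25, Howe's approximate factor.
set.seed(1)

n     <- 25     # sample size
P     <- 0.90   # proportion of the population to cover
alpha <- 0.05   # one minus the confidence level
reps  <- 2000   # number of simulated samples

# Howe's approximation to the two-sided tolerance factor
k <- sqrt((n - 1) * (1 + 1 / n) * qnorm((1 + P) / 2)^2 / qchisq(alpha, n - 1))

# For each simulated sample, compute the share of the true population
# that falls inside mean +/- k * sd
content <- replicate(reps, {
  x  <- rnorm(n)
  lo <- mean(x) - k * sd(x)
  hi <- mean(x) + k * sd(x)
  pnorm(hi) - pnorm(lo)
})

# Fraction of intervals that capture at least P of the population
conf_hat <- mean(content >= P)
conf_hat
```

The value of conf_hat should come out close to 1 - alpha: roughly 95% of the simulated intervals contain at least 90% of the population, which is exactly the guarantee a tolerance interval is meant to give.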

The arguments to normtol.int include the data as a vector, x. The confidence level is set through alpha, which is one minus the confidence level, so alpha is 0.05 for 95% confidence. The argument P is the proportion of the population to be covered by the tolerance interval. The side argument determines whether a one-sided (side = 1) or two-sided (side = 2) interval is required.
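As an illustration of what the one-sided case involves, here is a base R sketch of one-sided tolerance limits for normal data, using the standard factor based on the noncentral t distribution. The sample x is simulated purely for this example, and the variable names k1, lower and upper are hypothetical; the sketch shows the idea behind a one-sided limit and is not a copy of the package's code.

```r
# One-sided normal tolerance limits from the noncentral t distribution.
# The data here are simulated and purely illustrative.
set.seed(42)
x <- rnorm(25, mean = 100, sd = 5)

n     <- length(x)
alpha <- 0.05   # one minus the confidence level
P     <- 0.90   # proportion of the population to cover

# One-sided tolerance factor: a noncentral t quantile scaled by sqrt(n)
k1 <- qt(1 - alpha, df = n - 1, ncp = qnorm(P) * sqrt(n)) / sqrt(n)

# With 95% confidence, at least 90% of the population lies above 'lower';
# likewise, at least 90% lies below 'upper'
lower <- mean(x) - k1 * sd(x)
upper <- mean(x) + k1 * sd(x)

c(k1 = k1, lower = lower, upper = upper)
```

Each one-sided limit is a separate statement: lower and upper should not be combined and read as a two-sided interval at the same P and confidence.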

Consider a simulated set of data from a manufacturing process, stored in R as the vector obs:

obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

A 95% tolerance interval covering 90% of data of this type, based on the 25 observations above, is created with this code:

> library(tolerance)
> normtol.int(x = obs, alpha = 0.05, P = 0.90, side = 2)
  alpha   P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.9 100.5192      90.07606      110.9623
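For readers curious where these limits come from, the interval has the form x.bar ± k * s, where s is the sample standard deviation and k is a tolerance factor. The base R sketch below computes k with Howe's approximation plus a small-sample correction term. That formula is an assumption about the method behind the output, not something stated in this post; it is used here because it reproduces the printed limits to the displayed precision.

```r
# Hand calculation of the two-sided tolerance limits for obs.
# The factor formula (Howe's approximation with a correction term G)
# is an assumption about the method behind the printed output.
obs <- c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

n     <- length(obs)
f     <- n - 1    # degrees of freedom
alpha <- 0.05
P     <- 0.90

z_p   <- qnorm((1 + P) / 2)              # normal quantile for the central P
chi_a <- qchisq(alpha, f)                # lower chi-square quantile
G     <- (f - 2 - chi_a) / (2 * (n + 1)^2)   # small-sample correction

# Tolerance factor
k <- z_p * sqrt(f * (1 + 1 / n) / chi_a * (1 + G))

# Lower and upper tolerance limits
limits <- mean(obs) + c(-1, 1) * k * sd(obs)
limits
```

The factor k works out to a little over 2.2 here, noticeably larger than the normal quantile of about 1.64 that would apply if the mean and standard deviation were known exactly.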

The alpha and P values are as noted above. The average of the data (x.bar) is reported along with the lower and upper tolerance limits, both of which appear because a two-sided interval was requested. The call is easily changed to cover 95% rather than 90% of the data:

> normtol.int(x = obs, alpha = 0.05, P = 0.95, side = 2)
  alpha    P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.95 100.5192      88.07543      112.9630
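The interval widens as P increases because the tolerance factor grows with the proportion to be covered. The factor also depends on the sample size. The sketch below tabulates it for several values of n at P = 0.95 using the plain form of Howe's approximation, an illustrative choice whose values may differ slightly from those the package uses. The helper howe_k is a hypothetical name defined only for this example.

```r
# How the two-sided tolerance factor changes with sample size.
# howe_k is an illustrative helper, not a function from the tolerance package.
howe_k <- function(n, P, alpha) {
  sqrt((n - 1) * (1 + 1 / n) * qnorm((1 + P) / 2)^2 / qchisq(alpha, n - 1))
}

P     <- 0.95
alpha <- 0.05

n_values <- c(10, 25, 50, 100, 1000)
k_values <- sapply(n_values, howe_k, P = P, alpha = alpha)

# Compare each factor with the normal quantile it approaches as n grows
data.frame(n = n_values,
           k = round(k_values, 3),
           z = round(qnorm((1 + P) / 2), 3))
```

The factor shrinks towards the normal quantile as n increases, so small samples pay a substantial penalty in interval width for the uncertainty in the estimated mean and standard deviation.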

The tolerance package can also create tolerance intervals for data from other distributions.
