# Articles by Ralph

### Book Review – Modern Applied Statistics with S by W. N. Venables and B. D. Ripley (Springer 2003)

May 9, 2010 |

Modern Applied Statistics with S (Fourth Edition) is one of the oldest and most popular books on applied statistics using R and S-PLUS. It covers a large number of topics in applied statistics and is certainly not for the faint-hearted. ...

### Using the update function during variable selection

May 9, 2010 |

When fitting statistical models to data with multiple variables, we are often interested in adding or removing terms from our model, and where there are a large number of terms it can be quicker to use the update function to start with a formula from a ... [Read more...]

### Displaying data using level plots

May 3, 2010 |

A level plot is a type of graph that is used to display a surface in two rather than three dimensions – the surface is viewed from above as if we were looking straight down and is an alternative to a contour plot – geographic data is an example of where this ...

### Analysis of Covariance – Extending Simple Linear Regression

April 28, 2010 |

The simple linear regression model considers the relationship between two variables and in many cases more information will be available that can be used to extend the model. For example, there might be a categorical variable (sometimes known as a covariate) that can be used to divide the data set ...

### Summarising data using box and whisker plots

April 25, 2010 |

A box and whisker plot is a type of graphical display that can be used to summarise a set of data based on the five-number summary of the data. The summary statistics used to create a box and whisker plot are the median of the data, the lower and ...

### Simple Linear Regression

April 23, 2010 |

One of the most frequently used techniques in statistics is linear regression, where we investigate the potential relationship between a variable of interest (often called the response variable but there are many other names in use) and a set of one or more variables (known as the independent variables or ...

### Book Review – ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham (Springer 2009)

April 20, 2010 |

This book is written by the author of the ggplot2 package for R, a package whose design is inspired by the grammar of graphics and which can remove some of the effort required to put together impressive graphs. The book is just under 200 pages ... [Read more...]

### R and Tolerance Intervals

April 19, 2010 |

Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which describes the range of values for a distribution with confidence limits calculated to a particular percentile of the distribution. The R package tolerance can be used to create a ... [Read more...]

### Summarising data using scatter plots

April 18, 2010 |

A scatter plot is a graph used to investigate the relationship between two variables in a data set. The x and y axes are used for the values of the two variables and a symbol on the graph represents the combination for each pair of values in the data set. ...

### Working with themes in Lattice Graphics

April 12, 2010 |

The Trellis graphics approach provides facilities for creating effective graphs with a consistent look and feel and one of the good things about the system is the use of themes to define the colour, size and other features of the components that make up a graph. The lattice package in ... [Read more...]

### Summarising data using histograms

April 11, 2010 |

The histogram is a standard type of graphic used to summarise univariate data where the range of values in the data set is divided into regions and a bar (usually vertical) is plotted in each of these regions with height proportional to the frequency of observations in that region. In ... [Read more...]

### Summarising data using dot plots

March 26, 2010 |

A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories. The dot plot can be arranged with the categories either on the vertical or horizontal axis of the display to allow comparison between the different categories as well ... [Read more...]

### Contingency Tables – Fisher’s Exact Test

March 6, 2010 |

A contingency table is used in statistics to provide a tabular summary of categorical data, and the cells in the table are the number of occasions that a particular combination of variables occurs together in a set of data. The relationship between variables in a contingency table is often investigated ... [Read more...]

### Design of Experiments – Block Designs

February 20, 2010 |

In many experiments where the investigator is comparing a set of treatments, there is the possibility of one or more sources of variability in the experimental measurements that can be accounted for during the design stage of the experimentation. For example, we might be investigating four different pieces of machinery ... [Read more...]

### Two-way Analysis of Variance (ANOVA)

February 15, 2010 |

The analysis of variance (ANOVA) model can be extended from making a comparison between multiple groups to take into account additional factors in an experiment. The simplest extension is from one-way to two-way ANOVA where a second factor is included in the model as well as a potential interaction between ...

### One-way ANOVA (cont.)

February 12, 2010 |

In a previous post we considered using R to fit one-way ANOVA models to data. In this post we consider a few additional ways that we can look at the analysis. In the analysis we made use of the linear model function lm and the analysis could be conducted using ...

### One-way Analysis of Variance (ANOVA)

February 3, 2010 |

Analysis of Variance (ANOVA) is a commonly used statistical technique for investigating data by comparing the means of subsets of the data. The base case is the one-way ANOVA, which is an extension of the two-sample t-test for independent groups covering situations where there are more than two groups being ...

### Codecogs – Open-Source library of numerical components

January 8, 2010 |

The Codecogs website provides an open-source library of functions for numerical analysis. One interesting component available on the website is the LaTeX equation editor, which can be used to create graphics files of equations to include on webpages. The webpage describes this component as: A web-based LaTeX equation editor ... [Read more...]

### R Blogs

December 17, 2009 |

There are many blogs on Statistics, R and other related topics scattered around the internet. The R bloggers website provides a central hub where feeds from participating blogs are collated so that they can be viewed from a single website. This resource certainly appears to be a good idea so ... [Read more...]

### The Grammar of Graphics: ggplot2 package

December 14, 2009 |

The grammar of graphics approach to constructing graphs has been implemented in the ggplot2 package in R. The author of the package, Hadley Wickham, has provided a website with many details of using the system to create nice looking graphics. The package removes many of the awkward parts of setting ... [Read more...]