# Articles by Ken Kleinman

### Example 10.7: Fisher vs. Pearson

October 29, 2012 |

In the early days of the discipline of statistics, R.A. Fisher argued with great vehemence against Egon Pearson (and Jerzy Neyman) over the foundational notions supporting statistical inference. The personal invective recorded is somewhat amusing an... [Read more...]

### Example 10.6: Should Poisson regression ever be used? Negative binomial vs. Poisson regression

October 15, 2012 |

In practice, we often find that count data are not well modeled by Poisson regression, though Poisson models are often presented as the natural approach for such data. In contrast, the negative binomial regression model is much more flexible and is therefore likely to fit better, if the data are ...

### Example 10.5: Convert a character-valued categorical variable to numeric

October 8, 2012 |

In some settings it may be necessary to recode a categorical variable with character values into a variable with numeric values. For example, the matching macro we discussed in example 7.35 will only match on numeric variables. One way to conve... [Read more...]

### Example 10.4: Multiple comparisons and confidence limits

October 1, 2012 |

A colleague is a devotee of confidence intervals. To him, CIs have the magical property that they are immune to the multiple comparison problem: in other words, he feels it's OK to look at a bunch of 95% CIs and focus on the ones that appear to exclude the null. ... [Read more...]

### Example 10.3: Enhanced scatterplot with marginal histograms

September 24, 2012 |

Back in example 8.41 we showed how to make a graphic combining a scatterplot with histograms of each variable. A commenter suggested we change the R graphic to allow post-hoc plotting of, for example, lowess lines. In addition, there are further refinements to be made. In this R-only entry, we'll make ... [Read more...]

### Example 10.2: Custom graphic layouts

September 17, 2012 |

In example 10.1 we introduced data from a CPAP machine. In brief, it's hard to tell exactly what's being recorded in the data set, but it seems to be related to the pattern of breathing. Measurements are taken five times a second, leading to on the o... [Read more...]

### Example 10.1: Read a file byte by byte

September 10, 2012 |

More and more makers of electronic devices use standard storage media to record data. Sometimes this is central to the device's function, as in a camera, so that the data must be easy to recover. Other times, it's effectively incidental, and the device maker may not provide easy access to ... [Read more...]

### Third year wrap-up

July 23, 2012 |

July marks the end of three years of blogging for us. By our count, we've posted 121 examples across the first three years. We aim to be helpful and interesting. As always, it's hard to get a sense of our readership. At the time we wrote this, Feedbur...

### Citing R or SAS

July 2, 2012 |

One of us recently read a colleague's first draft of a paper, in which she had written: "All analyses were done in R 2.14.0." We assume we're preaching to the converted here, when we say that the enormous amount of work that goes into R needs to be re...

### Example 9.36: Levene’s test for equal variances

June 25, 2012 |

The assumption of equal variances among the groups in analysis of variance is an expression of the assumption of homoscedasticity for linear models more generally. For ANOVA, this assumption can be tested via Levene's test. The test is a function of the residuals and means within each group, though various ...

### Example 9.35: Discrete randomization and formatted output

June 18, 2012 |

A colleague asked for help with randomly choosing a kid within a family. This is for a trial in which families are recruited at well-child visits, but in each family only one of the children having a well-child visit that day can be in the study. The idea is that ...

### Example 9.34: Bland-Altman type plot

June 5, 2012 |

The Bland-Altman plot is a visual aid for assessing differences between two ways of measuring something. For example, one might compare two scales this way, or two devices for measuring particulate matter. The plot simply displays the difference between the measures against their average. Rather than a statistical test, it ... [Read more...]

### Example 9.33: Multiple imputation, rounding, and bias

May 29, 2012 |

Nick has a paper in the American Statistician warning about bias in multiple imputation arising from rounding data imputed under a normal assumption. One example where you might run afoul of this is if the data are truly dichotomous or count variables, but you model it as normal (either because ...

### Example 9.32: Multiple testing simulation

May 21, 2012 |

In examples 9.30 and 9.31 we explored corrections for multiple testing and then extracting p-values adjusted by the Benjamini and Hochberg (or FDR) procedure. In this post we'll develop a simulation to explore the impact of "strong" and "weak" control of the family-wise error rate offered in multiple comparison corrections. Loosely put, ...

### Example 9.31: Exploring multiple testing procedures

May 14, 2012 |

In example 9.30 we explored the effects of adjusting for multiple testing using the Bonferroni and Benjamini-Hochberg (or false discovery rate, FDR) procedures. At the time we claimed that it would probably be inappropriate to extract the adjusted p-values from the FDR method from their context. In this entry we attempt ...

### Example 9.27: Baseball and shrinkage

April 16, 2012 |

To celebrate the beginning of the professional baseball season here in the US and Canada, we revisit a famous example of using baseball data to demonstrate statistical properties. In 1977, Bradley Efron and Carl Morris published a paper about the Jame... [Read more...]

### Example 9.26: More circular plotting

April 9, 2012 |

SAS's Rick Wicklin showed a simple loess smoother for the temperature data we showed here. Then he came back with a better approach that does away with edge effects. Rick's smoothing was calculated and plotted on a cartesian plane. In this entry we'll explore another option or two for smoothing, ... [Read more...]

### Example 9.25: It’s been a mighty warm winter? (Plot on a circular axis)

April 2, 2012 |

Updated (see below). People here in the northeast US consider this to have been an unusually warm winter. Was it? The University of Dayton and the US Environmental Protection Agency maintain an archive of daily average temperatures that's reasonably current. In the case of Albany, NY (the most similar of ... [Read more...]

### Example 9.24: Changing the parameterization for categorical predictors

March 22, 2012 |

In our book, we discuss the important question of how to assign different parameterizations to categorical variables when fitting models (section 3.1.3). We show code in R for use in the lm() function, as follows: lm(y ~ x, contrasts=list(x,"contr.trea...

### Example 9.23: Demonstrating proportional hazards

March 13, 2012 |

A colleague recently asked after a slide suitable for explaining proportional hazards. In particular, she was concerned that her audience not focus on the time to event or probability of the event. An initial thought was to display the cumulative haz...