Articles by Ken Kleinman

Example 10.7: Fisher vs. Pearson

October 29, 2012 | Ken Kleinman

In the early days of the discipline of statistics, R.A. Fisher argued with great vehemence against Egon Pearson (and Jerzy Neyman) over the foundational notions supporting statistical inference. The personal invective recorded is somewhat amusing an... [Read more...]

Example 10.6: Should Poisson regression ever be used? Negative binomial vs. Poisson regression

October 15, 2012 | Ken Kleinman

In practice, we often find that count data are not well modeled by Poisson regression, though Poisson models are often presented as the natural approach for such data. In contrast, the negative binomial regression model is much more flexible and is therefore likely to fit better, if the data are ...

[Read more...]

Example 10.5: Convert a character-valued categorical variable to numeric

October 8, 2012 | Ken Kleinman

In some settings it may be necessary to recode a categorical variable with character values into a variable with numeric values. For example, the matching macro we discussed in example 7.35 will only match on numeric variables. One way to conve... [Read more...]

Example 10.4: Multiple comparisons and confidence limits

October 1, 2012 | Ken Kleinman

A colleague is a devotee of confidence intervals. To him, CIs have the magical property that they are immune to the multiple comparison problem; in other words, he feels it's OK to look at a bunch of 95% CIs and focus on the ones that appear to exclude the null. ... [Read more...]

Example 10.3: Enhanced scatterplot with marginal histograms

September 24, 2012 | Ken Kleinman

Back in example 8.41 we showed how to make a graphic combining a scatterplot with histograms of each variable. A commenter suggested we change the R graphic to allow post-hoc plotting of, for example, lowess lines. In addition, there are further refinements to be made. In this R-only entry, we'll make ... [Read more...]

Example 10.2: Custom graphic layouts

September 17, 2012 | Ken Kleinman

In example 10.1 we introduced data from a CPAP machine. In brief, it's hard to tell exactly what's being recorded in the data set, but it seems to be related to the pattern of breathing. Measurements are taken five times a second, leading to on the o... [Read more...]

Example 10.1: Read a file byte by byte

September 10, 2012 | Ken Kleinman

More and more makers of electronic devices use standard storage media to record data. Sometimes this is central to the device's function, as in a camera, so that the data must be easy to recover. Other times, it's effectively incidental, and the device maker may not provide easy access to ... [Read more...]

Third year wrap-up

July 23, 2012 | Ken Kleinman

July marks the end of three years of blogging for us. By our count, we've posted 121 examples across the first three years. We aim to be helpful and interesting. As always, it's hard to get a sense of our readership. At the time we wrote this, Feedbur...

[Read more...]

Citing R or SAS

July 2, 2012 | Ken Kleinman

One of us recently read a colleague's first draft of a paper, in which she had written: "All analyses were done in R 2.14.0." We assume we're preaching to the converted here, when we say that the enormous amount of work that goes into R needs to be re...

[Read more...]

Example 9.36: Levene’s test for equal variances

June 25, 2012 | Ken Kleinman

The assumption of equal variances among the groups in analysis of variance is an expression of the assumption of homoscedasticity for linear models more generally. For ANOVA, this assumption can be tested via Levene's test. The test is a function of the residuals and means within each group, though various ...

[Read more...]

Example 9.35: Discrete randomization and formatted output

June 18, 2012 | Ken Kleinman

A colleague asked for help with randomly choosing a kid within a family. This is for a trial in which families are recruited at well-child visits, but in each family only one of the children having a well-child visit that day can be in the study. The idea is that ...

[Read more...]

Example 9.34: Bland-Altman type plot

June 5, 2012 | Ken Kleinman

The Bland-Altman plot is a visual aid for assessing differences between two ways of measuring something. For example, one might compare two scales this way, or two devices for measuring particulate matter. The plot simply displays the difference between the measures against their average. Rather than a statistical test, it ... [Read more...]

Example 9.33: Multiple imputation, rounding, and bias

May 29, 2012 | Ken Kleinman

Nick has a paper in the American Statistician warning about bias in multiple imputation arising from rounding data imputed under a normal assumption. One example where you might run afoul of this is if the data are truly dichotomous or count variables, but you model them as normal (either because ...

[Read more...]

Example 9.32: Multiple testing simulation

May 21, 2012 | Ken Kleinman

In examples 9.30 and 9.31 we explored corrections for multiple testing and then extracting p-values adjusted by the Benjamini and Hochberg (or FDR) procedure. In this post we'll develop a simulation to explore the impact of "strong" and "weak" control of the family-wise error rate offered in multiple comparison corrections. Loosely put, ...

[Read more...]

Example 9.31: Exploring multiple testing procedures

May 14, 2012 | Ken Kleinman

In example 9.30 we explored the effects of adjusting for multiple testing using the Bonferroni and Benjamini-Hochberg (or false discovery rate, FDR) procedures. At the time we claimed that it would probably be inappropriate to extract the adjusted p-values from the FDR method from their context. In this entry we attempt ...

[Read more...]

Example 9.27: Baseball and shrinkage

April 16, 2012 | Ken Kleinman

To celebrate the beginning of the professional baseball season here in the US and Canada, we revisit a famous example of using baseball data to demonstrate statistical properties. In 1977, Bradley Efron and Carl Morris published a paper about the Jame... [Read more...]

Example 9.26: More circular plotting

April 9, 2012 | Ken Kleinman

SAS's Rick Wicklin showed a simple loess smoother for the temperature data we showed here. Then he came back with a better approach that does away with edge effects. Rick's smoothing was calculated and plotted on a Cartesian plane. In this entry we'll explore another option or two for smoothing, ... [Read more...]

Example 9.25: It’s been a mighty warm winter? (Plot on a circular axis)

April 2, 2012 | Ken Kleinman

Updated (see below). People here in the northeast US consider this to have been an unusually warm winter. Was it? The University of Dayton and the US Environmental Protection Agency maintain an archive of daily average temperatures that's reasonably current. In the case of Albany, NY (the most similar of ... [Read more...]

Example 9.24: Changing the parameterization for categorical predictors

March 22, 2012 | Ken Kleinman

In our book, we discuss the important question of how to assign different parameterizations to categorical variables when fitting models (section 3.1.3). We show code in R for use in the lm() function, as follows: lm(y ~ x, contrasts=list(x,"contr.trea...

[Read more...]

Example 9.23: Demonstrating proportional hazards

March 13, 2012 | Ken Kleinman

A colleague recently asked after a slide suitable for explaining proportional hazards. In particular, she was concerned that her audience not focus on the time to event or probability of the event. An initial thought was to display the cumulative haz...

[Read more...]

R-bloggers

R news and tutorials contributed by hundreds of R bloggers
