# Articles by Ken Kleinman

### Example 9.21: The birthday "problem" re-examined

February 23, 2012

The so-called birthday paradox or birthday problem is simply the counterintuitive discovery that the probability of (at least) two people in a group sharing a birthday goes up surprisingly fast as the group size increases. If the group is only 23 peo...
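As a rough illustration of the calculation behind this result (a Python sketch, not the code from the full entry), the probability that at least two of n people share a birthday is one minus the probability that all n birthdays differ. The function name `prob_shared_birthday` is made up for this sketch, which assumes 365 equally likely birthdays and ignores leap years.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely birthdays (an idealization).
    """
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


for n in (10, 22, 23, 40):
    print(n, round(prob_shared_birthday(n), 4))
```

Under these assumptions the probability first passes one half at a group size of 23.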

### RStudio in the cloud, for dummies

February 13, 2012

You can have your own cloud computing version of R, complete with RStudio. Why should you? It's cool! Plus, there's a lot more power out there than you can easily get on your own hardware. And, it's R in a web page. Run it from your tablet. Run i...

### SAS Macro Simplifies SAS and R integration

January 26, 2012

Many of us feel very enthusiastic about R. It's free, it features cutting edge applications, it has a large community of users contributing for mutual benefit, and on and on. There are also many things to like about SAS, including stability, backwards...

### Example 9.19: Demonstrating the central limit theorem

January 11, 2012

A colleague recently asked "why should the average get closer to the mean when we increase the sample size?" We should interpret this question as asking why the standard error of the mean gets smaller as n increases. The central limit theorem shows t...
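A small simulation can make the shrinking standard error concrete. The sketch below is in Python and is not the code from the full entry; the helper `sd_of_sample_means`, the choice of a Uniform(0, 1) population, and the sample sizes are all assumptions made for illustration. It compares the empirical standard deviation of many sample means with the theoretical value sigma / sqrt(n).

```python
import random
import statistics


def sd_of_sample_means(n, reps=2000, seed=1):
    """Empirical standard deviation of `reps` sample means, each from n Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(reps)]
    return statistics.stdev(means)


sigma = (1 / 12) ** 0.5  # population standard deviation of Uniform(0, 1)

for n in (4, 16, 64):
    # Columns: sample size, simulated standard error, theoretical sigma / sqrt(n)
    print(n, round(sd_of_sample_means(n), 4), round(sigma / n ** 0.5, 4))
```

Each fourfold increase in n should roughly halve both columns, which is the sense in which the average "gets closer" to the mean.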

### Example 9.18: Constructing the fastest relay team via enumeration

January 5, 2012

In competitive swimming, the medley relay is a team event in which four different swimmers each swim one of the four strokes: freestyle, breaststroke, backstroke, and butterfly. In general, every swimmer might be able to swim any given stroke. How can w...
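One way to picture the enumeration approach is to try every assignment of swimmers to strokes and keep the one with the smallest total time. The Python sketch below is not the code from the full entry; the swimmer names, the times (in seconds), and the function `fastest_relay` are invented for illustration.

```python
from itertools import permutations

STROKES = ("backstroke", "breaststroke", "butterfly", "freestyle")

# Hypothetical times in seconds for each swimmer on each stroke.
TIMES = {
    "Ana": {"backstroke": 30, "breaststroke": 34, "butterfly": 28, "freestyle": 25},
    "Bea": {"backstroke": 31, "breaststroke": 34, "butterfly": 31, "freestyle": 27},
    "Cat": {"backstroke": 33, "breaststroke": 36, "butterfly": 30, "freestyle": 28},
    "Dee": {"backstroke": 32, "breaststroke": 39, "butterfly": 32, "freestyle": 26},
}


def fastest_relay(times, strokes=STROKES):
    """Enumerate every ordered choice of swimmers and return (best total, assignment)."""
    best_total, best_team = float("inf"), None
    for team in permutations(times, len(strokes)):
        # team[i] swims strokes[i]; no swimmer appears twice in a permutation.
        total = sum(times[swimmer][stroke] for swimmer, stroke in zip(team, strokes))
        if total < best_total:
            best_total, best_team = total, dict(zip(strokes, team))
    return best_total, best_team


total, team = fastest_relay(TIMES)
print(total, team)
```

In this made-up data Ana is at least tied for fastest on every stroke, so simply picking the best swimmer per stroke is not possible, and enumeration settles who should swim what. With many swimmers the number of permutations grows quickly, so brute force is only practical for small squads.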

### Example 9.16: Small multiples

November 29, 2011

Small multiples are one of the great ideas of graphics visionary Edward Tufte (e.g., in Envisioning Information). Briefly, the idea is that if many variations on a theme are presented, differences quickly become apparent. Today we offer general guida...

### Example 9.15: Bar chart with error bars ("Dynamite plot")

November 22, 2011

The "dynamite plot", a bar chart plotting a mean with an error bar, is one of the most reviled types of image among statisticians. Reasons to dislike them are numerous, and are nicely summarized here. (Edward Tufte also suggests they be avoided.) ...

### Example 9.13: Negative binomial regression with proc mcmc

November 8, 2011

In practice, data that derive from counts rarely seem to be fit well by a Poisson model; one more flexible alternative is a negative binomial model. In this SAS-only entry, we discuss how proc mcmc can be used for estimation. An overview of support f...

### Proc report for simple statistics

October 30, 2011

Ken Beath, of Macquarie University, commented on an earlier entry that the best way to generate summary statistics is using proc report. While the best tools might differ, depending on the purpose, we wanted to share Ken's code demonstrating how to re...

### Example 9.11: Employment plot

October 25, 2011

A Facebook friend posted the picture reproduced above; it makes the case that President Obama has been a successful creator of jobs, and also paints GW Bush as a president who lost jobs. Another friend pointed out that to be fair, all of Bush's presi...

### Example 9.8: New stuff in SAS 9.3 – Bayesian random effects models in Proc MCMC

October 4, 2011

Rounding off our reports on major new developments in SAS 9.3, today we'll talk about proc mcmc and the random statement. Stand-alone packages for fitting very general Bayesian models using Markov chain Monte Carlo (MCMC) methods have been available for...

### Example 9.7: New stuff in SAS 9.3 – Frailty models

September 27, 2011

Shared frailty models are a way of allowing correlated observations into proportional hazards models. Briefly, instead of $\lambda_i(t) = \lambda_0(t) e^{x_i \beta}$, we allow $\lambda_{ij}(t) = \lambda_0(t) e^{x_{ij} \beta + \gamma_i}$, where observations $j$ are in clusters $i$, $\gamma_i$ is typically norma...

### Example 9.6: Model comparison plots (Completed)

September 21, 2011

We often work in settings where the data set has a lot of missing data: some missingness in the (many) covariates, some in the main exposure of interest, and still more in the outcome. (Nick describes this as "job security for statisticians".) Some ana...

### Example 9.5: New stuff in SAS 9.3 – proc FMM

September 13, 2011

Finite mixture models (FMMs) can be used in settings where some unmeasured classification separates the observed data into groups with different exposure/outcome relationships. One familiar example of this is a zero-inflated model, where some observat...

### Example 9.4: New stuff in SAS 9.3 – MI FCS

September 6, 2011

We begin the new academic year with a series of entries exploring new capabilities of SAS 9.3, and some functionality we haven't previously written about. We'll begin with multiple imputation. Here, SAS has previously been limited to multivariate norma...

### Taking August off!

July 31, 2011

We'll be back with recharged batteries and lots of new entries in September. Have a great summer*! As usual, please send any questions you have about using SAS or R. *Not valid in the southern hemisphere.

### Really useful R package: sas7bdat

July 25, 2011

For SAS users, one hassle in trying things in R, let alone migrating, is the difficulty of getting data out of SAS and into R. In our book (section 1.2.2) and in a blog entry we've covered getting data out of SAS native data sets. Unfortunately, for ...

### Example 9.2: Transparency and bivariate KDE

July 11, 2011

In Example 9.1, we showed a binning approach to plotting bivariate relationships in a large data set. Here we show more sophisticated approaches: transparent overplotting and formal two-dimensional kernel density estimation. We use the 10,000 simulat...

### A third year of entries!

July 1, 2011

Contrary to previous reports, we started blogging after our book was published, with the conceit that we were adding examples to the book. Today marks the second anniversary of the book's appearance and of the blog. To celebrate, we're turning over o...