Blog Archives

Example 8.41: Scatterplot with marginal histograms

June 20, 2011

The scatterplot is one of the most ubiquitous and useful graphics. It's also very basic. One of its shortcomings is that it can hide important aspects of the marginal distributions of the two variables. To address this weakness, you can add a histo...

Example 8.40: Side-by-side histograms

June 13, 2011

It's often useful to compare histograms for some key variable, stratified by levels of some other variable. There are several ways to display something like this. The simplest may be to plot the two histograms in separate panels.

SAS

In SAS, the most d...

Example 8.39: calculating Cramer’s V

June 3, 2011

Cramer's V is a measure of association for nominal variables. Effectively, it is the Pearson chi-square statistic rescaled to have values between 0 and 1, as follows: V = sqrt(X^2 / (n * (min(r, c) - 1))), where r and c are the numbers of rows and columns in the table, X^2 is the Pearson chi-square, n...
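The rescaling can be sketched in a few lines. This is a minimal illustration in plain Python rather than the SAS or R code from the post; it assumes the standard definition V = sqrt(X^2 / (n * (min(r, c) - 1))) for an r-by-c table of counts, and the function name `cramers_v` and the two example tables are made up for the illustration.

```python
import math

def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts."""
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    n = sum(row_totals)

    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Rescale by n * (min(rows, cols) - 1) so the result lies in [0, 1]
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Hypothetical tables: rows proportional (no association) and perfectly associated
independent = [[10, 20], [20, 40]]
perfect = [[30, 0], [0, 30]]
print(cramers_v(independent), cramers_v(perfect))
```

The first table has proportional rows, so the statistic sits at the bottom of its range; the second puts every observation on the diagonal, which is the strongest association a 2-by-2 table can show.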

Example 8.37: Read sheets from an Excel file

May 11, 2011

Microsoft Excel is an awkward tool for data analysis. However, it is a reasonable environment for recording and transferring data. In our consulting practice, people frequently send us data in .xls (from Excel 97-2003) or .xlsx (from Excel 2007 or 201...

Example 8.35: Grab true (not pseudo) random numbers; passing API URLs to functions or macros

April 19, 2011

Usually, we're content to use a pseudo-random number generator. But sometimes we may want numbers that are actually random; an example might be for randomizing treatment status in a randomized controlled trial. The site Random.org provides truly rando...
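As a rough sketch of the "pass an API URL to a function" idea, the Python below builds a request URL and hands it to a small fetch function. It is not the SAS or R code from the post. The endpoint and the query parameters (`num`, `min`, `max`, `col`, `base`, `format`, `rnd`) are assumptions based on Random.org's plain-text integer interface and should be checked against the site's current documentation and usage guidelines; the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for Random.org's plain-text integer generator
BASE_URL = "https://www.random.org/integers/"

def build_url(num, low, high):
    """Build a request URL for `num` integers between `low` and `high`."""
    params = {
        "num": num, "min": low, "max": high,
        "col": 1, "base": 10, "format": "plain", "rnd": "new",
    }
    return BASE_URL + "?" + urlencode(params)

def fetch_integers(url, timeout=10):
    """Fetch the URL and parse whitespace-separated integers (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [int(token) for token in text.split()]

url = build_url(5, 1, 100)
print(url)
# numbers = fetch_integers(url)  # uncomment to make the actual request
```

Keeping URL construction separate from the request makes it possible to inspect the URL before any network call is made.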

Example 8.33: Merging data sets one-to-many

April 5, 2011

It's often necessary to combine data from two data sets for further analysis. Such merging can be one-to-one, many-to-one, or many-to-many. The most common form is the one-to-one match, which we cover in section 1.5.7. Today we look at a one-to-man...
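To make the one-to-many idea concrete, here is a minimal sketch in plain Python, not the SAS or R approach from the post. The clinic and visit records, the key name `clinic_id`, and the function `merge_one_to_many` are all invented for the illustration. Rows on the "many" side without a match on the "one" side are dropped, which is one of several possible conventions.

```python
def merge_one_to_many(one_rows, many_rows, key):
    """Attach the single matching 'one' row to every 'many' row sharing the key."""
    lookup = {row[key]: row for row in one_rows}  # key values assumed unique here
    merged = []
    for row in many_rows:
        parent = lookup.get(row[key])
        if parent is not None:  # drop 'many' rows with no matching 'one' row
            merged.append({**parent, **row})
    return merged

# Hypothetical data: one row per clinic, several visits per clinic
clinics = [
    {"clinic_id": 1, "city": "Boston"},
    {"clinic_id": 2, "city": "Amherst"},
]
visits = [
    {"clinic_id": 1, "visit": "a"},
    {"clinic_id": 1, "visit": "b"},
    {"clinic_id": 2, "visit": "c"},
    {"clinic_id": 3, "visit": "d"},  # no matching clinic
]

for row in merge_one_to_many(clinics, visits, "clinic_id"):
    print(row)
```

Each clinic row is repeated once per matching visit, which is what distinguishes a one-to-many merge from a one-to-one match.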

Example 8.32: The HistData package, sunflower plots, and getting data from R into SAS

March 29, 2011

This entry is mainly a promotion of the fascinating HistData R package. The package, compiled by the psychologist, statistician, and graphics innovator Michael Friendly, contains a number of small data sets of historical interest. These include data ...

Example 8.31: Choropleth maps

March 22, 2011

In our book, we show a simple example of a map (section 6.4.2) where we read the boundary files as data sets and use SAS and R to plot them. But both SAS and R have complex functionality for using pre-compiled map data. To demonstrate them, we'll sho...

Example 8.30: Compare Poisson and negative binomial count models

March 15, 2011

How similar can a negative binomial distribution get to a Poisson distribution? When confronted with modeling count data, our first instinct is to use Poisson regression. But in practice, count data is often overdispersed. We can fit the overdispersio...
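One way to explore the opening question numerically is to compare the two probability mass functions directly. The sketch below is plain Python, not the post's SAS or R code, and it assumes the mean/size parameterization of the negative binomial, with variance mu + mu^2 / size, so that the distribution approaches the Poisson as `size` grows. The function names and the chosen values of `mu` and `size` are illustrative.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu, computed on the log scale."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def negbin_pmf(k, mu, size):
    """Negative binomial probability of k with mean mu and variance mu + mu^2 / size."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def max_abs_diff(mu, size, k_max=100):
    """Largest pointwise gap between the two pmfs over counts 0..k_max."""
    return max(abs(poisson_pmf(k, mu) - negbin_pmf(k, mu, size))
               for k in range(k_max + 1))

# The gap shrinks as the size parameter grows, i.e. as overdispersion vanishes
for size in (1, 10, 1000):
    print(size, max_abs_diff(3, size))
```

With a small `size` the negative binomial is visibly more spread out than a Poisson with the same mean; with a large `size` the two become hard to tell apart.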

Example 8.29: Risk ratios and odds ratios

March 7, 2011

When can you safely think of an odds ratio as being similar to a risk ratio? Many people find odds ratios hard to interpret, and thus would prefer to have risk ratios. In response to this, you can find several papers that purport to convert an odds rat...
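A small calculation shows when the two measures diverge. This is a minimal Python sketch using the textbook definitions of risk and odds, not code from the post; the risks in the two scenarios are made-up numbers chosen so that the risk ratio is 2 in both.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of outcome probabilities in the exposed and unexposed groups."""
    return p_exposed / p_unexposed

def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds, where odds = p / (1 - p)."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical scenarios with the same risk ratio but different baseline risk
rare = (0.02, 0.01)    # outcome is rare in both groups
common = (0.60, 0.30)  # outcome is common in both groups

for label, (p1, p0) in (("rare", rare), ("common", common)):
    print(label, risk_ratio(p1, p0), odds_ratio(p1, p0))
```

When the outcome is rare, the (1 - p) terms are close to 1 and the odds ratio stays near the risk ratio; when the outcome is common, the odds ratio moves well away from it.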
