Blog Archives

Taking August off!

July 31, 2011

We'll be back with recharged batteries and lots of new entries in September. Have a great summer*! As usual, please send any questions you have about using SAS or R.

*Not valid in the southern hemisphere.

Really useful R package: sas7bdat

July 25, 2011

For SAS users, one hassle in trying things in R, let alone migrating, is the difficulty of getting data out of SAS and into R. In our book (section 1.2.2) and in a blog entry we've covered getting data out of SAS native data sets. Unfortunately, for ...

Example 9.2: Transparency and bivariate KDE

July 11, 2011

In Example 9.1, we showed a binning approach to plotting bivariate relationships in a large data set. Here we show more sophisticated approaches: transparent overplotting and formal two-dimensional kernel density estimation. We use the 10,000 simulat...

A third year of entries!

July 1, 2011

Contrary to previous reports, we started blogging after our book was published, with the conceit that we were adding examples to the book. Today marks the second anniversary of the book's appearance and of the blog. To celebrate, we're turning over o...

Example 8.41: Scatterplot with marginal histograms

June 20, 2011

The scatterplot is one of the most ubiquitous and useful graphics. It's also very basic. One of its shortcomings is that it can hide important aspects of the marginal distributions of the two variables. To address this weakness, you can add a histo...

Example 8.40: Side-by-side histograms

June 13, 2011

It's often useful to compare histograms for some key variable, stratified by levels of some other variable. There are several ways to display something like this. The simplest may be to plot the two histograms in separate panels.

SAS

In SAS, the most d...

Example 8.39: calculating Cramer’s V

June 3, 2011

Cramer's V is a measure of association for nominal variables. Effectively it is the Pearson chi-square statistic rescaled to have values between 0 and 1, as follows:

V = sqrt(X^2 / [n * (min(r, c) - 1)])

where X^2 is the Pearson chi-square, n...
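
As a rough illustration of the rescaling described above, here is a minimal sketch in plain Python rather than SAS or R. It uses the standard definition of Cramer's V, in which the Pearson chi-square is divided by the number of observations times one less than the smaller table dimension. The function name `cramers_v` and the example tables are hypothetical, and the sketch assumes every row and column total is nonzero.

```python
import math

def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts.

    Assumes all row and column totals are nonzero (hypothetical helper).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Rescale by n times (smaller table dimension - 1)
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

perfect = cramers_v([[10, 0], [0, 10]])   # rows fully determine columns
none = cramers_v([[5, 5], [5, 5]])        # rows and columns independent
```

With these made-up tables, the first call sits at the upper end of the range and the second at the lower end, which is the behavior the rescaling is meant to guarantee.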

Example 8.37: Read sheets from an excel file

May 11, 2011

Microsoft Excel is an awkward tool for data analysis. However, it is a reasonable environment for recording and transferring data. In our consulting practice, people frequently send us data in .xls (from Excel 97-2003) or .xlsx (from Excel 2007 or 201...

Example 8.35: Grab true (not pseudo) random numbers; passing API URLs to functions or macros

April 19, 2011

Usually, we're content to use a pseudo-random number generator. But sometimes we may want numbers that are actually random; an example might be for randomizing treatment status in a randomized controlled trial. The site Random.org provides truly rando...

Example 8.33: Merging data sets one-to-many

April 5, 2011

It's often necessary to combine data from two data sets for further analysis. Such merging can be one-to-one, many-to-one, and many-to-many. The most common form is the one-to-one match, which we cover in section 1.5.7. Today we look at a one-to-man...
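
To make the idea of a one-to-many match concrete, here is a minimal sketch in plain Python rather than SAS or R. It is not the approach taken in the full entry; the function name `merge_one_to_many`, the key name `id`, and the subject and visit records are all hypothetical. Each record on the "one" side is looked up by key and copied onto every matching record on the "many" side.

```python
def merge_one_to_many(one_side, many_side, key):
    """Attach the fields of each 'one' record to every matching 'many' record.

    Records on the 'many' side with no match are dropped (an inner join);
    this is an assumption of the sketch, not a rule from the entry.
    """
    lookup = {row[key]: row for row in one_side}
    merged = []
    for row in many_side:
        match = lookup.get(row[key])
        if match is not None:
            # Fields from the 'many' record win if names collide
            merged.append({**match, **row})
    return merged

# Hypothetical data: one row per subject, several visit rows per subject
subjects = [{"id": 1, "sex": "F"}, {"id": 2, "sex": "M"}]
visits = [
    {"id": 1, "visit": 1},
    {"id": 1, "visit": 2},
    {"id": 2, "visit": 1},
]
merged = merge_one_to_many(subjects, visits, "id")
```

The result has one row per visit, with the subject-level field repeated on each of that subject's rows.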
