# Blog Archives

## Augmented support for complex survey designs in R

March 3, 2010

We'll get back to code examples later this week, but in the meantime we wanted to let you know about an R package with updated functionality. The appropriate analysis of sample surveys requires incorporating complex design features, including stratification, clustering, weights, and finite population correction. These can be addressed in SAS and R for many common models. Section...

## Example 7.24: Sampling from a pathological distribution

March 1, 2010

Evans and Rosenthal consider ways to sample from a distribution with density given by:

$$f(y) = c\, e^{-y^4} (1 + |y|)^3$$

where c is a normalizing constant and y is defined on the whole real line. Use of the probability integral transform (section 1.10.8) is not feasible in this setting, given the complexity of inverting the cumulative distribution function. The Metropolis-Hastings algorithm is a Markov...
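As a rough illustration of the idea, here is a minimal random-walk Metropolis sketch in Python for the density above. It is not the code from the original post: the function names, the normal proposal with standard deviation 1, and the starting value of 0 are all assumptions made for this example. Because the acceptance step only uses a ratio of densities, the normalizing constant c cancels and never needs to be computed.

```python
import math
import random


def unnormalized_density(y):
    """Density up to the constant c: exp(-y^4) * (1 + |y|)^3."""
    return math.exp(-y ** 4) * (1 + abs(y)) ** 3


def metropolis_sample(n_draws, start=0.0, step_sd=1.0, seed=None):
    """Random-walk Metropolis sampler (hypothetical helper for illustration).

    Proposes current + Normal(0, step_sd) and accepts with probability
    min(1, f(proposal) / f(current)). The proposal is symmetric, so no
    Hastings correction term is needed.
    """
    rng = random.Random(seed)
    draws = []
    current = start
    for _ in range(n_draws):
        proposal = current + rng.gauss(0.0, step_sd)
        ratio = unnormalized_density(proposal) / unnormalized_density(current)
        if rng.random() < ratio:
            current = proposal  # accept the move
        draws.append(current)   # on rejection the chain repeats its value
    return draws
```

In practice one would typically discard an initial stretch of draws as burn-in and check the acceptance rate before relying on the sample.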

## Example 7.22: the Knapsack problem

January 13, 2010

The website http://rosettacode.org/wiki/Knapsack_Problem describes a fanciful trip by a traveler to Shangri La. They can take as many as they want of three valuable items, as long as they fit in a knapsack. The knapsack will hold no more than 25 weight units, and no more than 25 volume units. The problem is to maximize the...

## Example 7.17: The Smith College diploma problem

November 12, 2009

Smith College is a residential women's liberal arts college in Northampton, MA that is steeped in tradition. One such tradition is to give each student at graduation a diploma at random (or more accurately, in a haphazard fashion). At the end of the ceremony, a diploma circle is formed, and students pass the diplomas that they receive to...

## Example 7.16: assess robustness of permutation test to violations of exchangeability assumption

October 24, 2009

Permutation tests (section 2.4.3) are a form of resampling-based inference that can be used to compare two groups. A simple univariate two-group permutation test requires that the group labels for the observations are exchangeable under the null hypothesis of equal distributions, but allows relaxation of specific distributional assumptions required by parametric procedures such as the t-test (...
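The basic two-group procedure can be sketched in Python as follows. This is an illustrative sketch rather than the post's own SAS or R code: the function name, the choice of the absolute difference in means as the test statistic, and the default number of permutations are assumptions.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=None):
    """Monte Carlo two-sided permutation test for a difference in means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one. Adding 1 to numerator and denominator counts the
    observed labelling itself as one of the permutations.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

The shuffle step is where exchangeability enters: it treats every relabelling as equally likely under the null hypothesis.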

## Example 7.13: Read a file with two lines per observation

September 24, 2009

In example 7.6 we showed how to retrieve the Amazon sales rank of a book. A cron job on one of our machines grabs the sales rank hourly. We’d like to use this data to satisfy our curiosity about when and how often a book sells. A complication is that the result of the cron job is a...

## Example 7.12: Calculate and plot a running average

September 17, 2009

The Law of Large Numbers concerns the stability of the mean as sample sizes increase. This is an important topic in mathematical statistics. The convergence (or lack thereof, for certain distributions) can easily be visualized in SAS and R (see also Horton, Qian and Brown, 2004). Assume that X1, X2, ..., Xn are independent and identically distributed realizations...
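The running average itself is simple to compute. Here is a minimal Python sketch, not the SAS or R code from the post; the function name and the use of standard normal draws in the demonstration are assumptions for illustration.

```python
import random


def running_average(values):
    """Return the mean of the first k values, for k = 1, ..., n."""
    averages = []
    total = 0.0
    for k, value in enumerate(values, start=1):
        total += value
        averages.append(total / k)
    return averages


# Demonstration with standard normal draws (an assumed example distribution):
# the later running averages should settle near the true mean of 0.
rng = random.Random(2009)
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]
averages = running_average(normal_draws)
print(averages[9], averages[99], averages[999])
```

Plotting `averages` against the index k gives the kind of picture the post describes.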

## packages and CRANtastic

August 24, 2009

Additional functionality in R is added through packages, which consist of libraries of bundled functions, datasets, examples and help files that can be downloaded from CRAN (the Comprehensive R Archive Network). The function install.packages() or the w...

## Book excerpts now posted

July 18, 2009

We've posted excerpts from the book on the book website. The excerpts include Chapter 3 (regression and ANOVA) in its entirety. This demonstrates how the entries (the generic descriptions of software functions) and the worked examples reinforce each ...

## Example 7.1: Create a Fibonacci sequence

June 12, 2009

The Fibonacci numbers have many mathematical relationships and have been discovered repeatedly in nature. They are constructed as the sum of the previous two values, initialized with the values 1 and 1. A PDF of this example is available here.

SAS

In SAS, we use the lag function (section 1.4.17,...
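The construction described at the start of this entry (each value is the sum of the previous two, starting from 1 and 1) can be sketched in a few lines of Python. This is an illustration only, not the SAS or R code from the post, and the function name is an assumption.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 1 and 1."""
    if n <= 0:
        return []
    sequence = [1, 1][:n]  # handles n == 1 as well
    while len(sequence) < n:
        sequence.append(sequence[-1] + sequence[-2])
    return sequence


print(fibonacci(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```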