Articles by Ken Kleinman

Example 7.21: Write a function to simulate categorical data

January 8, 2010 | Ken Kleinman

In example 7.20, we showed how to simulate categorical data. But we might anticipate needing to do that frequently. If a SAS function weren't built in and an equivalent R function not available in a package, we could build them from scratch. SAS: The SAS code is particularly tortured, since we must ...

Example 7.20: Simulate categorical data

January 4, 2010 | Ken Kleinman

Both SAS and R provide means of simulating categorical data (see section 1.10.4). Alternatively, it is trivial to write code to do this directly. In this entry, we show how to do it once. In a future entry, we'll demonstrate writing a SAS Macro (section A.8.1) and a function in R (...
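As a rough illustration of doing this directly, here is a minimal R sketch using the inverse-CDF idea: draw a uniform random number and see which slice of the cumulative probabilities it falls into. The probabilities, sample size, and variable names are assumptions made for illustration and are not taken from the post.

```r
# Hypothetical category probabilities and sample size (assumed for illustration)
set.seed(42)
probs <- c(0.1, 0.2, 0.3, 0.4)
n <- 10000

# Direct approach: compare uniform draws to the cumulative probabilities
u <- runif(n)
cum_probs <- cumsum(probs)
x <- findInterval(u, cum_probs) + 1   # categories 1..length(probs)
x <- pmin(x, length(probs))           # guard against floating-point round-off in cumsum

# The built-in route, for comparison
y <- sample(seq_along(probs), n, replace = TRUE, prob = probs)

# Observed proportions should be close to probs in both cases
round(table(x) / n, 2)
round(table(y) / n, 2)
```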

Example 7.19: find the closest pair of observations

December 28, 2009 | Ken Kleinman

Suppose we need to find the closest pair of observations on some variable x. For example, we might be concerned that some data had been accidentally duplicated. We return the IDs of the two closest observations, and their distance from each other. In both languages, we'll first create the data, ...
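One possible way to do this in R, sketched under assumptions: for a single variable, the closest pair must be adjacent once the data are sorted, so it is enough to sort and look at successive differences. The data frame `ds` and its columns are hypothetical, and this is not necessarily the approach taken in the post.

```r
# Hypothetical data: an id and one variable x (assumed for illustration)
set.seed(1)
ds <- data.frame(id = 1:8, x = rnorm(8))

# In one dimension the closest pair is adjacent after sorting
o <- order(ds$x)
gaps <- diff(ds$x[o])          # distances between sorted neighbors
k <- which.min(gaps)           # position of the smallest gap

closest <- list(
  ids = ds$id[o[c(k, k + 1)]], # IDs of the two closest observations
  distance = gaps[k]           # their distance from each other
)
closest
```

For small data, `min(dist(ds$x))` gives the same distance by brute force over all pairs, which is a handy check.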

SAS and R included on R bloggers

December 18, 2009 | Ken Kleinman

The R bloggers site is an aggregator for blogs about R. We're excited to be joining that community and suggest any readers of this blog may also find it useful.

Example 7.18: Displaying missing value categories in a table

December 14, 2009 | Ken Kleinman

When displaying contingency tables (section 2.3.1), there are times when it is useful to either show or hide the missing data category. Both SAS and the typical R command default to displaying the table only for observations where both factors are observed. In this example, we generate some multinomial data (section 1.10.4) ...

Example 7.15: A more complex sales graphic

October 13, 2009 | Ken Kleinman

The plot of Amazon sales rank over time generated in example 7.14 leaves questions. From a software perspective, we'd like to make the plot prettier; we can also embellish it to inform our interpretation of how the rank is calculated. For the latter purpose, we'll create an indicator of whether ...

Example 7.14: A simple graphic of sales

September 29, 2009 | Ken Kleinman

In this example, we show a simple plot of the sales rank data read in as shown in example 7.13. SAS: In SAS, we use the symbol statement (section 5.3) to request small (with the h option) dots (with the v option), and that the dots not be connected (with the i option). (...

Example 7.11: Plot an empirical cumulative distribution function from scratch

August 31, 2009 | Ken Kleinman

In example 7.8, we used built-in functions to produce an empirical CDF plot. But the empirical cumulative distribution function (CDF) is simple to calculate directly, and it might be useful to have more control over its appearance than is afforded by...
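A minimal R sketch of the direct calculation, assuming simulated data with no ties: sort the values, and the empirical CDF at the i-th sorted value is i/n. The variable names and the plotting choices are assumptions, not the post's own code.

```r
# Hypothetical sample (assumed for illustration)
set.seed(7)
x <- rnorm(50)

# Empirical CDF from scratch: i-th sorted value has cumulative proportion i/n
xs <- sort(x)
Fhat <- seq_along(xs) / length(xs)

# A step plot gives full control over appearance
plot(xs, Fhat, type = "s", xlab = "x", ylab = "Empirical CDF", ylim = c(0, 1))

# Agrees with the built-in ecdf() evaluated at the same points
all.equal(Fhat, ecdf(x)(xs))
```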

Example 7.10: Get data from R into SAS

August 13, 2009 | Ken Kleinman

In our previous entry, we described how to generate a dataset from SAS that could be used for analyses in R. Alternatively, someone primarily using R might want to test the new "statistical graphics" procedures available starting with SAS 9.2. Her...

Example 7.9: Get data from SAS into R

August 8, 2009 | Ken Kleinman

Some people use both SAS and R in their daily work. They might be more familiar with SAS as a tool for manipulating data and find R preferable for plotting. While our goal in the book is to enable people to avoid having to switch back and forth, ...

Example 7.8: Plot two empirical cumulative density functions using available tools

August 1, 2009 | Ken Kleinman

The empirical cumulative distribution function (CDF) (section 5.1.16) is a useful way to compare distributions between populations. The Kolmogorov-Smirnov (section 2.4.2) statistic D is the maximum vertical distance between the two curves. As an...
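To make the statistic concrete, here is a small R sketch under assumed simulated data: D is computed as the largest vertical gap between the two empirical CDFs, evaluated at the pooled data points, and the value of x where that gap occurs is reported alongside it. The samples and names are hypothetical.

```r
# Two hypothetical samples (assumed for illustration)
set.seed(11)
x <- rnorm(40)
y <- rnorm(60, mean = 0.5)

Fx <- ecdf(x)
Fy <- ecdf(y)

# The gap can only change at observed values, so evaluate at the pooled data
grid <- sort(c(x, y))
gap <- abs(Fx(grid) - Fy(grid))

D <- max(gap)                  # Kolmogorov-Smirnov statistic
at_x <- grid[which.max(gap)]   # value of x where the maximum gap occurs

c(D = D, at_x = at_x)

# Matches the statistic reported by the built-in test
unname(ks.test(x, y)$statistic)
```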

Book now shipping from Amazon

July 27, 2009 | Ken Kleinman

Amazon now reports that the book is in stock! The current discount is 13%. Or, order from the publisher. If you are an ASA member, you can use the online discount code 634LH to obtain a 15% discount.

Example 7.7: Tabulate binomial probabilities

July 25, 2009 | Ken Kleinman

Suppose we wanted to assess the probability P(X=x) for a binomial random variate with n = 10 and with p = .81, .84, ..., .99. This could be helpful, for example, in various game settings. In SAS, we find the probability that X=x using differences in t...
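In R, one way to build such a table is sketched below, using `dbinom()` with `outer()` to fill a grid of x by p. The object names are assumptions, and the post's own code may differ.

```r
# Values of p and x described in the text
p <- seq(0.81, 0.99, by = 0.03)
x <- 0:10

# Rows are x = 0..10, columns are the values of p
tab <- outer(x, p, function(x, p) dbinom(x, size = 10, prob = p))
dimnames(tab) <- list(x = x, p = p)

round(tab, 3)

# Each column is a full probability distribution, so it sums to 1
colSums(tab)
```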

Example 7.6: Find Amazon sales rank for a book

July 20, 2009 | Ken Kleinman

In honor of Amazon's official release date for the book, we offer this blog entry. Both SAS and R can be used to find the Amazon Sales Rank for a book by downloading the desired web page and ferreting out the appropriate line. This code is likely to br...

Example 7.5: Replicating a prettier jittered scatterplot

July 15, 2009 | Ken Kleinman

The scatterplot in section 7.4 is a plot we could use repeatedly. We demonstrate how to create a macro (SAS, section A.8) and a function (R, section B.5) to do it more easily. SAS: %macro logiplot(x=x, y=y, data=, jitterwidth=.05, smooth=50); data lp1; set...

Example 7.4: A prettier jittered scatterplot

July 2, 2009 | Ken Kleinman

The plot in section 7.3 has some problems. At the very least, the jittered values ought to be between 0 and 1, so the smoothed lines fit better with them. Once again we use the data generated in section 7.2 as an example. For both SAS and R, we use conditioning (section 1.11.2) to make ...

Example 7.3: Simple jittered scatterplot with smoother for dichotomous outcomes with continuous predictors

June 24, 2009 | Ken Kleinman

It's useful to look at scatterplots even when the "y" variable is dichotomous. For example, this can help determine whether categorization or linear assumptions would be more plausible. However, an unmodified scatterplot is less than helpful, since all of the "y" values are either 0 or 1, and are hard to separate ...
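A rough R sketch of the idea, with simulated data assumed for illustration: add a small amount of vertical noise to the 0/1 outcome so the points separate, then overlay a smoother fit to the unjittered values. The jitter amount and the use of `lowess()` are choices made for this sketch, not necessarily those of the post.

```r
# Hypothetical dichotomous outcome with a continuous predictor
set.seed(3)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(x))

# Jitter the outcome vertically by at most 0.05 in either direction
yj <- jitter(y, amount = 0.05)

plot(x, yj, pch = 20, xlab = "x", ylab = "y (jittered)")
lines(lowess(x, y), lwd = 2)   # smoother uses the original 0/1 values
```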

Book now discounted 33% at Amazon!

June 24, 2009 | Ken Kleinman

Our book, SAS and R: Data Management, Statistical Analysis, and Graphics, is discounted by a full third at Amazon. With free shipping! Also, they claim if it is further discounted before it ships, they'll give you the reduced price.

Example 7.2: Simulate data from a logistic regression

June 13, 2009 | Ken Kleinman

It might be useful to be able to simulate data from a logistic regression (section 4.1.1). Our process is to generate the linear predictor, then apply the inverse link, and finally draw from a distribution with this parameter. This approach is useful in that it can easily be applied to other ...
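The three steps described above can be sketched in R as follows. The intercept, slope, and sample size are hypothetical values chosen for illustration; the final `glm()` fit is only a check that the simulated data recover them approximately.

```r
# Hypothetical parameters (assumed for illustration)
set.seed(2009)
n <- 5000
intercept <- -1
beta <- 0.5

x <- rnorm(n)

linpred <- intercept + beta * x          # 1. linear predictor
prob <- plogis(linpred)                  # 2. inverse logit link
y <- rbinom(n, size = 1, prob = prob)    # 3. draw from Bernoulli(prob)

# Check: the fitted coefficients should be near the assumed values
fit <- glm(y ~ x, family = binomial)
coef(fit)
```

Swapping the inverse link and the final distribution (for example `exp()` with `rpois()`) adapts the same pattern to other generalized linear models.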

