Blog Archives

Kaplan-Meier plots using ggplot2 (updated)

April 1, 2014

About 3 years ago I published some code on this blog to draw a Kaplan-Meier plot using ggplot2. Since then, ggplot2 has been updated (from 0.8.9 to 0.9.3.1) and has changed syntactically. Since that post, I have also become comfortable with Git and Github. I have updated the code, edited it for a small error,

Pocketbook costs of software

February 23, 2012

I have always been provided SAS as part of my job, so I never really realized how much it cost. I’ve bought Stata before, and of course R. I recently found out how much a reasonable bundle of SAS modules along with base SAS costs per year per seat, at least under the GSA.

An enhanced Kaplan-Meier plot, updated

September 1, 2011

I’ve updated the R code for the enhanced K-M plot to include additions and improvements by Gil Thomas and Mark Cowley. Thanks, fellows, for the feedback and updates. http://statbandit.wordpress.com/2011/03/08/an-enhanced-kaplan-meier-plot/

RStudio 0.94.92 visited

July 30, 2011

I just updated my RStudio version to the latest, v.0.94.92 (will this asymptotically approach 1, or actually get to 1?). It was nice to see the number of improvements the development team has implemented, based I’m sure on community feedback. The team has, in my experience, been extraordinarily responsive to user feedback, and I’m sure

A word of warning about grep, which and the like

July 13, 2011

I’ve often selected columns or rows of a data frame using grep or which, based on some property. That is inherently sound, but the trouble comes when you wish to remove rows or columns based on that grep or which call, for example with a negative index that would remove columns with a .1 in the name. This is fine
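The excerpt breaks off before the warning itself. One well-known pitfall with this idiom is that grep and which return integer(0) when nothing matches, and negating an empty index drops everything rather than nothing. Whether that is the exact point the post goes on to make is an assumption. A minimal sketch in R, using a made-up data frame `dat`:

```r
# Hypothetical data frame; the column names are made up for illustration.
dat <- data.frame(a = 1:3, b = 4:6, b.1 = 7:9)

# Matching case: drops the column whose name contains ".1".
idx <- grep("\\.1", names(dat))
kept <- dat[, -idx, drop = FALSE]

# Non-matching case: grep() returns integer(0), and -integer(0) is still
# integer(0), so the "negative" index selects no columns at all.
no_match <- grep("\\.2", names(dat))
emptied <- dat[, -no_match, drop = FALSE]

# A safer alternative: logical indexing with grepl() keeps every column
# when nothing matches.
safe <- dat[, !grepl("\\.2", names(dat)), drop = FALSE]

names(kept)
ncol(emptied)
ncol(safe)
```

Logical indexing is only one way around this; checking `length(idx) > 0` before subsetting would work as well.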

SAS, R and categorical variables

July 13, 2011

One of the disappointing problems in SAS (as I need PROC MIXED for some analysis) is recoding categorical variables to have a particular reference category. In R, my usual tool, this is rather easy both to set and to modify using the relevel command available in base R (in the stats package). My understanding
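For readers who have not used it, here is a minimal sketch of relevel on a made-up factor. The variable name `trt` and its levels are illustrative only and do not come from the post:

```r
# Hypothetical treatment factor. Levels default to alphabetical order,
# so "drugA" would be the reference category in a model fit.
trt <- factor(c("placebo", "drugA", "drugB", "placebo", "drugA"))
levels(trt)

# Make "placebo" the reference (first) level; the data values are unchanged.
trt2 <- relevel(trt, ref = "placebo")
levels(trt2)
```

Modelling functions that use treatment contrasts take the first level as the baseline, so releveling before the fit is usually all that is needed.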

An enhanced Kaplan-Meier plot

March 8, 2011

We often see, in publications, a Kaplan-Meier survival plot, with a table of the number of subjects at risk at different time points aligned below the figure. I needed this type of plot (or really, matrices of such plots) for an upcoming publication. Of course, my preferred toolbox was R and the ggplot2 package. There

RStudio: a cut above

March 1, 2011

As most followers of R-bloggers.com and the Twitter #rstats know by now, RStudio is a new open-source IDE for R that was beta-released yesterday. I have started putting it through its paces within my R workflow, and my impressions are more than favorable. I also tried it out on my home Linux server in server

The split-apply-combine paradigm in R

February 25, 2011

Last night at the DC R Users meetup, which was our largest meetup to date, I gave an introductory presentation on data munging, and spent a bit of time on the split-apply-combine paradigm that I use almost daily in my work. I talked mainly about the packages plyr and doBy, which I use a lot
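As a rough illustration of the paradigm itself, here is a sketch in base R rather than plyr or doBy, so that it runs without extra packages. The data frame and column names are made up:

```r
# Made-up data: scores for observations in two groups.
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 2, 4, 6)
)

# Split: one data frame per group.
pieces <- split(df, df$group)

# Apply: compute a one-row summary for each piece.
summaries <- lapply(pieces, function(d) {
  data.frame(group = d$group[1], n = nrow(d), mean_score = mean(d$score))
})

# Combine: stack the per-group summaries into a single data frame.
result <- do.call(rbind, summaries)
result
```

Packages such as plyr wrap these three steps in a single call, which is much of their appeal.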

ggplot2 joy

February 25, 2011

I’ve been working on a long-term (25+yr) longitudinal study of rheumatoid arthritis with my boss. He just walked in and asked if I could create a plot showing the trajectory of pain scores over time for each subject, separated by educational level (4 groups). Having now worked with ggplot2 for a while, and learning more
