Blog Archives

The need for documenting functions

May 22, 2014
By
The need for documenting functions

My current work usually requires me to work on a project until we can submit a research paper, and then move on to a new project. However, 3-6 months down the road, when the reviews for the paper return, it is quite common to have to do some new analyses or re-analyses of the data.

Read more »

Kaplan-Meier plots using ggplots2 (updated)

April 1, 2014
By
Kaplan-Meier plots using ggplots2 (updated)

About 3 years ago I published some code on this blog to draw a Kaplan-Meier plot using ggplot2. Since then, ggplot2 has been updated (from 0.8.9 to 0.9.3.1) and has changed syntactically. Since that post, I have also become comfortable with Git and Github. I have updated the code, edited it for a small error,

Read more »

Pocketbook costs of software

February 23, 2012
By
Pocketbook costs of software

I have always been provided SAS as part of my job, so I never really realized how much it cost. I’ve bought Stata before, and of course R . I recently found out how much a reasonable bundle of SAS modules along with base SAS costs per year per seat, at least under the GSA.

Read more »

An enhanced Kaplan-Meier plot, updated

September 1, 2011
By
An enhanced Kaplan-Meier plot, updated

I’ve updated the R code for the enhanced K-M plot to include additions and improvements by Gil Thomas and Mark Cowley. Thanks fellows for the feedback and updates. http://statbandit.wordpress.com/2011/03/08/an-enhanced-kaplan-meier-plot/

Read more »

RStudio 0.94.92 visited

July 30, 2011
By
RStudio 0.94.92 visited

I just updated my RStudio version to the latest, v.0.94.92 (will this asymptotically approach 1, or actually get to 1?). It was nice to see the number of improvements the development team has implemented, based I’m sure on community feedback. The team has, in my experience, been extraordinarily responsive to user feedback, and I’m sure

Read more »

A word of warning about grep, which and the like

July 13, 2011
By
A word of warning about grep, which and the like

I’ve often selected columns or rows of a data frame using grep or which, based on some property. That is inherently sound, but the trouble comes when you wish to remove rows or columns based on that grep or which call, e.g., which would remove columns with a .1 in the name. This is fine

Read more »

SAS, R and categorical variables

July 13, 2011
By
SAS, R and categorical variables

One of the disappointing problems in SAS (as I need PROC MIXED for some analysis) is to recode categorical variables to have a particular reference category. In R, my usual tool, this is rather easy both to set and to modify using the  relevel command available in base R (in the stats package). My understanding

Read more »

An enhanced Kaplan-Meier plot

March 8, 2011
By
An enhanced Kaplan-Meier plot

We often see, in publications, a Kaplan-Meier survival plot, with a table of the number of subjects at risk at different time points aligned below the figure. I needed this type of plot (or really, matrices of such plots) for an upcoming publication. Of course, my preferred toolbox was R and the ggplot2 package. There

Read more »

RStudio: a cut above

March 1, 2011
By
RStudio: a cut above

As most followers of R-bloggers.com and the Twitter #rstats know by now, RStudio is a new open-source IDE for R that was beta-released yesterday. I have started putting it through its paces within my R workflow, and my impressions are more than favorable. I also tried it out on my home Linux server in server

Read more »

The split-apply-combine paradigm in R

February 25, 2011
By
The split-apply-combine paradigm in R

Last night at the DC R Users meetup, which was our largest meetup to date, I gave an introductory presentation on data munging, and spent a bit of time on the split-apply-combine paradigm that I use almost daily in my work. I talked mainly about the packages plyr and doBy, which I use a lot

Read more »