Articles by Abhijit

Cleaning up tables

May 16, 2018 | Abhijit

This post is re-published from my blog. Context: One of the things I have to do quite often is create tables for papers and presentations. Often the “Table 1” of a paper has descriptives about the study, broken down by subgroups. For presentation purposes, it doesn’t look good (to me, at ... [Read more...]

Tidying messy Excel data (Introduction)

May 7, 2018 | Abhijit

[Re-posted from Abhijit’s blog] Personal expressiveness, or how data is stored in a spreadsheet When you get data from a broad research community, the variability in how that data is formatted and stored is truly astonishing. Of course there are the standardized formats that are output from machines, like ...
[Read more...]

Moving to blogdown

May 1, 2018 | Abhijit

I’ve been in the process of transferring my blog (along with creating a personal website) to blogdown, which is hosted on Github Pages. The new blog, or rather, the continuation of this blog, will be at, and it went live today. I’ll be cross-posting ...
[Read more...]

Surprising result when exploring Rcpp gallery

July 20, 2017 | Abhijit

I’m starting to incorporate more Rcpp in my R work, and so decided to spend some time exploring the Rcpp Gallery. One example by John Merrill caught my eye. He provides a C++ solution to transforming a list of lists into a data frame, and shows impressive speed savings ...
[Read more...]

Quirks about running Rcpp on Windows through RStudio

July 20, 2017 | Abhijit

This is a quick note about some tribulations I had running Rcpp (v. 0.12.12) code through RStudio (v. 1.0.143) on a Windows 7 box running R (v. 3.3.2). I also have RTools v. 3.4 installed. I fully admit that this may very well be specific to my […] [Read more...]

Finding my Dropbox in R

July 5, 2017 | Abhijit

I’ll often keep non-sensitive data on Dropbox so that I can access it on all my machines without gumming up git. I just wrote a small script to find the Dropbox location on each of my computers automatically. The crucial information is available here, from Dropbox. My small snippet ... [Read more...]

pandas “transform” using the tidyverse

April 11, 2017 | Abhijit

Chris Moffitt has a nice blog post on how to use the transform function in pandas. He provides some (fake) data on sales and asks the question of what fraction of each order is from each SKU. Being an R nut and a tidyverse fan, I thought to compare and contrast ... [Read more...]
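As a rough illustration of the question in this excerpt (what fraction of each order comes from each SKU), here is a minimal dplyr sketch. It is not the code from the post: the data frame `sales`, its columns `order`, `sku` and `ext_price`, and all values are invented for illustration. The idea is the grouped counterpart of a pandas-style transform: compute a per-group total and keep it aligned with the original rows.

```r
library(dplyr)

# Hypothetical sales data; column names and values are assumptions,
# not taken from the original post.
sales <- data.frame(
  order     = c(10001, 10001, 10001, 10005, 10005),
  sku       = c("A", "B", "C", "D", "E"),
  ext_price = c(200, 300, 500, 150, 50)
)

# group_by() + mutate() keeps every row and adds the per-order total,
# so each line item can be divided by the total of its own order.
sales_frac <- sales %>%
  group_by(order) %>%
  mutate(
    order_total = sum(ext_price),
    frac        = ext_price / order_total
  ) %>%
  ungroup()

print(sales_frac)
```

Using `mutate()` rather than `summarise()` is the design choice that matters here: `summarise()` would collapse each order to a single row, whereas `mutate()` on a grouped data frame returns one value per original row.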

Copying tables from R to Outlook

February 28, 2017 | Abhijit

I work in an ecosystem that uses Outlook for e-mail. When I have to communicate results with collaborators one of the most frequent tasks I face is to take a tabular output in R (either a summary table or some sort of tabular output) and send it to collaborators in ... [Read more...]

A quick exploration of the ReporteRs package

October 28, 2016 | Abhijit

The package ReporteRs has been getting some play on the interwebs this week, though it’s actually been around for a while. The nice thing about this package is that it allows writing Word and PowerPoint documents in an OS-independent fashion unlike some earlier packages. It also allows the editing ... [Read more...]

Annotated Facets with ggplot2

October 20, 2016 | Abhijit

I was recently asked to do a panel of grouped boxplots of a continuous variable, with each panel representing a categorical grouping variable. This seems easy enough with ggplot2 and the facet_wrap function, but then my collaborator wanted p-values on the graphs! This post is my approach to the ...
[Read more...]

Creating new data with max values for each subject

December 1, 2014 | Abhijit

We have a data set dat with multiple observations per subject. We want to create a subset of this data such that each subject (with ID giving the unique identifier for the subject) contributes the observation where the variable X takes its maximum value for that subject. An R ... [Read more...]
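One possible way to do what this excerpt describes, sketched in base R. This is not necessarily the approach taken in the post. The names `dat`, `ID` and `X` follow the excerpt; the `visit` column and all values are invented for illustration.

```r
# Hypothetical toy data: several observations per subject.
dat <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 3),
  X     = c(5, 9, 2, 4, 7, 1),
  visit = c(1, 2, 3, 1, 2, 1)
)

# Split the row indices by subject, then within each subject keep the
# index of the row where X is largest. which.max() returns the first
# maximum, so ties are resolved in favour of the earliest row.
rows <- split(seq_len(nrow(dat)), dat$ID)
keep <- vapply(rows, function(idx) idx[which.max(dat$X[idx])], integer(1))

# One row per subject, carrying all other columns along.
dat_max <- dat[keep, ]
rownames(dat_max) <- NULL

print(dat_max)
```

This sketch assumes every subject has at least one non-missing value of X; a subject whose X values are all NA would make `which.max()` return an empty result and `vapply()` would stop with an error.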

“LaF”-ing about fixed width formats

November 10, 2014 | Abhijit

If you have ever worked with US government data or other large datasets, it is likely you have faced fixed-width format data. This format has no delimiters in it; the data look like strings of characters. A separate format file defines which columns of data represent which variables. It seems ... [Read more...]

Practical Data Science Cookbook

November 10, 2014 | Abhijit

My friends Sean Murphy, Ben Bengfort, Tony Ojeda and I recently published a book, Practical Data Science Cookbook. All of us are heavily involved in developing the data community in the Washington DC metro area, serving on the Board of Directors of Data Community DC. Sean ...
[Read more...]

The need for documenting functions

May 22, 2014 | Abhijit

My current work usually requires me to work on a project until we can submit a research paper, and then move on to a new project. However, 3-6 months down the road, when the reviews for the paper return, it is quite common to have to do some new analyses ... [Read more...]

Kaplan-Meier plots using ggplots2 (updated)

April 1, 2014 | Abhijit

About 3 years ago I published some code on this blog to draw a Kaplan-Meier plot using ggplot2. Since then, ggplot2 has been updated (from 0.8.9 to and has changed syntactically. Since that post, I have also become comfortable with Git and Github. I have updated the code, edited it for a ... [Read more...]

Pocketbook costs of software

February 23, 2012 | Abhijit

I have always been provided SAS as part of my job, so I never really realized how much it cost. I’ve bought Stata before, and of course R. I recently found out how much a reasonable bundle of SAS modules along with base SAS costs per year per seat, ... [Read more...]

An enhanced Kaplan-Meier plot, updated

September 1, 2011 | Abhijit

I’ve updated the R code for the enhanced K-M plot to include additions and improvements by Gil Thomas and Mark Cowley. Thanks fellows for the feedback and updates. [Read more...]