R vs Python: Survival Analysis with Plotly

June 24, 2015

(This article was first published on Plotly Blog, and kindly contributed to R-bloggers)

We just published a new Survival Analysis tutorial. You can find code, an explanation of methods, and six interactive ggplot2 and Python graphs here.

How We Built It

Survival analysis is a set of statistical methods for analyzing the time until an event occurs: time to death in biological systems, failure time in mechanical systems, and so on. We used the tongue dataset from the KMsurv package and the survival package in R, pandas and the lifelines library in Python, the IPython Notebook to execute and publish code, and rpy2 to run R code in the same document as the Python code.

Plotly is a platform for making and sharing interactive D3.js graphs, with APIs for R, Python, MATLAB, and Excel. You can make graphs and analyze data on Plotly’s free public cloud and within Shiny Apps. For collaboration and sensitive data, you can run Plotly Enterprise on your own servers.

The Plots We Made

For our first plot, made with R, the y-axis represents the probability that a patient is still alive at time t, measured in weeks. We see a steep drop-off within the first 100 weeks, after which the curve flattens. The dotted lines mark the 95% confidence interval. See the code, details, and plot in the IPython Notebook.

Survival vs Time

And now with Python. Click and drag to zoom, or hover your mouse to see data.

Tumor DNA Profile 1 (95% CI)
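
To make the curve and its confidence band more concrete, here is a minimal pure-Python sketch of a Kaplan-Meier estimate with a plain Greenwood 95% interval. It is an illustration under stated assumptions, not the notebook's code: the function name kaplan_meier and the toy data are hypothetical, and the survival package and lifelines may use transformed intervals by default, so their bounds can differ from the ones computed here.

```python
import math


def kaplan_meier(durations, events, z=1.96):
    """Return (time, survival, lower, upper) at each observed event time.

    durations: follow-up time for each subject
    events:    1 if the event (death) was observed, 0 if the subject was censored
    z:         normal quantile for the interval (1.96 gives roughly 95%)
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0  # running sum used by Greenwood's variance formula
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the fraction of at-risk subjects who survive time t
            survival *= 1.0 - deaths / n_at_risk
            if n_at_risk > deaths:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            se = survival * math.sqrt(greenwood_sum)
            lower = max(0.0, survival - z * se)
            upper = min(1.0, survival + z * se)
            curve.append((t, survival, lower, upper))
        # Deaths and censored subjects both leave the risk set
        n_at_risk -= leaving
    return curve


# Hypothetical toy data (weeks, event flag); not taken from the tongue dataset
durations = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 1, 0, 1]

curve = kaplan_meier(durations, events)
for t, s, lo, hi in curve:
    print(f"t={t:>3}  S(t)={s:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

Censored subjects never lower the curve directly; they only shrink the number at risk, which is why each later death produces a larger step down.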

A single dataset often contains several groups. These may represent categories such as treatment groups, different species, or different manufacturing techniques. The type variable in the tongue dataset describes a patient's tumor DNA profile. Below we fit a Kaplan-Meier estimate for each of these groups in R and Python. Here we make the plot with R:

Lifespans of different tumor DNA profile

It looks like DNA Type 2 is potentially more deadly, or more difficult to treat, than Type 1; see the IPython Notebook for more details. And now with Python:

Lifespans of different tumor DNA profile
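
The per-group estimates described above can be sketched in a few lines of plain Python: split the records by group, then fit one Kaplan-Meier curve per group. This is only a sketch under assumptions. The helper km_survival is hypothetical, the column names (type, time, delta) are assumed to follow the tongue dataset, and the records are made up for illustration rather than taken from it.

```python
from collections import defaultdict


def km_survival(durations, events):
    """Kaplan-Meier survival estimate as a list of (time, survival) pairs."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    # Walk through the distinct follow-up times in increasing order
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set
        at_risk -= leaving
    return curve


# Hypothetical records: (type, time in weeks, delta); delta = 1 means death observed
records = [
    (1, 10, 1), (1, 40, 0), (1, 70, 1), (1, 120, 0),
    (2, 5, 1), (2, 15, 1), (2, 30, 1), (2, 60, 0),
]

# Split durations and event flags by the grouping variable
groups = defaultdict(lambda: ([], []))
for profile, time, delta in records:
    groups[profile][0].append(time)
    groups[profile][1].append(delta)

# One survival curve per group
curves = {
    profile: km_survival(times, deltas)
    for profile, (times, deltas) in groups.items()
}

for profile in sorted(curves):
    print(f"type {profile}: {curves[profile]}")
```

In this made-up data the type 2 curve ends lower than the type 1 curve, which is the kind of gap the plots compare visually. A visual gap alone does not establish a statistically significant difference between groups.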

