R vs Python: Survival Analysis with Plotly

[This article was first published on Plotly Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We just published a new Survival Analysis tutorial. You can find code, an explanation of methods, and six interactive ggplot2 and Python graphs here.

How We Built It

Survival analysis is a set of statistical methods for analyzing events over time: time to death in biological systems, failure time in mechanical systems, etc. We used the tongue dataset from the KMsurv package in R, pandas and the lifelines library in Python, the survival package for R, the IPython Notebook to execute and publish code, and rpy2 to execute R code in the same document as the Python code.

Plotly is a platform for making and sharing interactive, D3.js graphs with APIs for R, Python, MATLAB, and Excel. You can make graphs and analyze data on Plotly’s free public cloud and within Shiny Apps. For collaboration and sensitive data, you can run Plotly Enterprise on your own servers.

The Plots We Made

For our first plot, made with R, the y axis represents the probability a patient is still alive at time t weeks. We see a steep drop off within the first 100 weeks, and then observe the curve flattening. The dotted lines represent the 95% confidence intervals. See the code, details, and plot in the IPython Notebook.

Survival vs Time

And now with Python. Click and drag to zoom, or hover your mouse to see data.

Tumor DNA Profile 1 (95% CI)

Many times there are different groups contained in a single dataset. These may represent categories such as treatment groups, different species, or different manufacturing techniques. The type variable in the tongues dataset describes a patients DNA profile. Below we define a Kaplan-Meier estimate for each of these groups in R and Python. Here we make the plot with R:

Lifespans of different tumor DNA profile

It looks like DNA Type 2 is potentially more deadly, or more difficult to treat compared to Type 1. But check out the IPython Notebook for more details. And now with Python:

Lifespans of different tumor DNA profile
To leave a comment for the author, please follow the link and comment on their blog: Plotly Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)