Plotting conditional densities

April 14, 2012

(This article was first published on R snippets, and kindly contributed to R-bloggers)

Recently I read a post on R-bloggers, Comparing all quantiles of two distributions simultaneously. In it the author draws two conditional density plots on one graph. I often use such a plot to visualize conditional densities of scores in binary prediction. After running into trouble several times with scaling the plot so that both densities always fit into the plotting region, I wrote a small snippet that handles it.

Here is the code of the function. It scales both x and y axes appropriately:

# class: binary explained variable
# score: score obtained from prediction model
# main, xlab, col, lty, lwd: passed to plot function
# lx, ly: passed to legend function as x and y
cdp <- function(class, score,
                main = "Conditional density", xlab = "score",
                col = c(2, 4), lty = c(1, 1), lwd = c(1, 1),
                lx = "topleft", ly = NULL) {
    class <- factor(class)
    if (length(levels(class)) != 2) {
        stop("class must have two levels")
    }
    if (!is.numeric(score)) {
        stop("score must be numeric")
    }
    # one density estimate per class level
    cscore <- split(score, class)
    cdensity <- lapply(cscore, density)
    # axis limits that cover both density curves
    xlim <- range(cdensity[[1]]$x, cdensity[[2]]$x)
    ylim <- range(cdensity[[1]]$y, cdensity[[2]]$y)
    plot(cdensity[[1]], main = main, xlab = xlab, col = col[1],
         lty = lty[1], lwd = lwd[1], xlim = xlim, ylim = ylim)
    lines(cdensity[[2]], col = col[2], lty = lty[2], lwd = lwd[2])
    legend(lx, ly, names(cdensity),
           lty = lty, col = col, lwd = lwd)
}
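
The scaling step in the function comes down to taking range() over both density estimates before anything is drawn. Here is a minimal sketch of that idea in isolation. The two simulated score vectors and the names score_a, score_b, d_a and d_b are assumptions made for illustration, not data from this post:

```r
# Simulated scores (illustrative assumption, not data from the post)
set.seed(1)
score_a <- rnorm(200, mean = 0, sd = 1)    # wide group, low peak
score_b <- rnorm(200, mean = 2, sd = 0.3)  # narrow group, high peak

d_a <- density(score_a)
d_b <- density(score_b)

# Joint limits: range() over both curves covers each of them
xlim <- range(d_a$x, d_b$x)
ylim <- range(d_a$y, d_b$y)

# With default limits plot(d_a) would size the axes for d_a only,
# and the taller peak of d_b added by lines() would be cut off
plot(d_a, xlim = xlim, ylim = ylim, main = "Joint limits", col = 2)
lines(d_b, col = 4)
```

Dropping the xlim and ylim arguments from the plot() call shows the clipping problem that motivated the function.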

As an example of its application I compare its results to standard cdplot on a simple classification problem:

data(Participation, package = "Ecdat")
data.set <- Participation
data.set$age2 <- data.set$age^2
glm.model <- glm(lfp ~ ., data = data.set,
                 family = binomial(link = "probit"))
par(mfrow = c(1, 2))
cdp(data.set$lfp, predict(glm.model), main = "cdp")
cdplot(factor(data.set$lfp) ~ predict(glm.model),
       main = "cdplot", xlab = "score", ylab = "lfp")
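
A note on the scores used here: for a glm, predict() returns values on the link scale by default (type = "link"), so the scores plotted are linear predictors rather than probabilities. The following is a small sketch on simulated data of how the two scales relate for a probit model. The names x, y, sim.model, score.link and score.prob are hypothetical and not part of the original example:

```r
# Simulated binary outcome (illustrative assumption, not the Ecdat data)
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = pnorm(0.5 + x))

sim.model <- glm(y ~ x, family = binomial(link = "probit"))

score.link <- predict(sim.model)                     # linear predictor
score.prob <- predict(sim.model, type = "response")  # fitted probabilities

# For a probit link the probabilities are pnorm() of the linear predictor
all.equal(unname(score.prob), unname(pnorm(score.link)))
```

Either vector can serve as the score argument. Because the transformation is monotone, the ordering of observations is the same on both scales, but the shape of the densities differs.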

Here is the resulting plot:
