(This article was first published on

Recently I have read a post on Comparing all quantiles of two distributions simultaneously on R-bloggers. In the post author plots two conditional density plots on one graph. I often use such a plot to visualize conditional densities of scores in binary prediction. After several times I had a problem with appropriate scaling of the plot to make both densities always fit into the plotting region I have written a small snippet that handles it.**R snippets**, and kindly contributed to R-bloggers)Here is the code of the function. It scales both x and y axes appropriately:

# class: binary explained variable

# score: score obtained from prediction model

# main, xlab, col, lty, lwd: passed to plot function

# lx, ly: passed to legend function as x and y

cdp

**<-****function****(**class, score, main

**=**"Conditional density", xlab**=**"score", col

**=**c**(**2, 4**)**, lty**=**c**(**1, 1**)**, lwd**=**c**(**1, 1**)**, lx

**=**"topleft", ly**=****NULL****)****{** class

**<-**factor**(**class**)****if**

**(**length

**(**levels

**(**class

**))**

**!=**2

**)**

**{**

stop

**(**"class must have two levels"**)****}**

**if**

**(!**is.numeric

**(**score

**))**

**{**

stop

**(**"score must be numeric"**)****}**

cscore

**<-**split**(**score, class**)** cdensity

**<-**lapply**(**cscore, density**)** xlim

**<-**range**(**cdensity**[[**1**]]$**x, cdensity**[[**2**]]$**x**)** ylim

**<-**range**(**cdensity**[[**1**]]$**y, cdensity**[[**2**]]$**y**)** plot

**(**cdensity**[[**1**]]**, main**=**main, xlab**=**xlab, col**=**col**[**1**]**, lty

**=**lty**[**1**]**, lwd**=**lwd**[**1**]**, xlim**=**xlim, ylim**=**ylim**)** lines

**(**cdensity**[[**2**]]**, col**=**col**[**2**]**, lty**=**lty**[**2**]**, lwd**=**lwd**[**2**])** legend

**(**lx, ly, names**(**cdensity**)**, lty

**=**lty, col**=**col, lwd**=**lwd**)****}**

As an example of its application I compare its results to standard cdplot on a simple classification problem:

data

**(**Participation, package**=**"Ecdat"**)**data.set

**<-**Participationdata.set

**$**age2**<-**data.set**$**age**^**2glm.model

**<-**glm**(**lfp**~**., data**=**data.set, family**=**binomial**(**link**=**probit**))**par

**(**mfrow**=**c**(**1, 2**))**cdp

**(**data.set**$**lfp, predict**(**glm.model**)**, main**=**"cdp"**)**cdplot

**(**factor**(**data.set**$**lfp**)****~**predict**(**glm.model**)**, main

**=**"cdplot", xlab**=**"score", ylab**=**"lfp"**)**Here is the resulting plot:

To

**leave a comment**for the author, please follow the link and comment on his blog:**R snippets**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...