(This article was first published on **Strenge Jacke! » R**, and kindly contributed to R-bloggers)

Inspired by these two postings, I thought about including a function in my package for simply creating scatter plots.

In my package, there’s a function called `sjp.scatter` for creating scatter plots. To reproduce these examples, first load the package and then attach the sample data set:

library(sjPlot)
data(efc)

The simplest function call is by just providing two variables, one for the x- and one for the y-axis:

sjp.scatter(efc$c160age, efc$e17age)

If you have continuous variables with a larger scale, overplotting (overlaying dots) is rarely a problem. It usually occurs when variables have just a few categories (factor levels). The function automatically estimates the amount of overlaying dots and then jitters them, as in the following example, which also includes a marginal rug plot:

sjp.scatter(efc$e16sex, efc$neg_c_7, efc$c172code, showRug=TRUE)

The same plot, when auto-jittering is turned off, would look like this:

sjp.scatter(efc$e16sex, efc$neg_c_7, efc$c172code, showRug=TRUE, autojitter=FALSE)
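The idea behind auto-jittering can be sketched in base R. This is only an illustration under assumptions, not the package's actual implementation: the helper name `jitter_if_overlapping` and the 15% threshold are hypothetical. The helper measures the share of points that sit on exactly the same (x, y) position as an earlier point and applies `jitter()` only if that share exceeds the threshold.

```r
# Hypothetical helper (not part of the package): jitter x and y only when
# many points share exactly the same coordinates.
jitter_if_overlapping <- function(x, y, threshold = 0.15) {
  # keep complete cases so that duplicated() compares real coordinates
  ok <- complete.cases(x, y)
  x <- x[ok]
  y <- y[ok]
  # share of points whose (x, y) pair already occurred earlier in the data
  overlap <- mean(duplicated(data.frame(x, y)))
  jittered <- overlap > threshold
  if (jittered) {
    x <- jitter(x)
    y <- jitter(y)
  }
  list(x = x, y = y, overlap = overlap, jittered = jittered)
}

set.seed(1)
# two variables with only a few categories: heavy overplotting
few <- jitter_if_overlapping(sample(1:2, 200, replace = TRUE),
                             sample(1:5, 200, replace = TRUE))
# two continuous variables: practically no identical points
many <- jitter_if_overlapping(rnorm(200), rnorm(200))

few$jittered   # TRUE
many$jittered  # FALSE
```

The returned `x` and `y` can be passed straight to `plot()`. Deciding from the data, rather than always jittering, leaves continuous variables untouched.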

You can also add a grouping variable. The scatter plot is then “divided” into as many groups as indicated by the grouping variable. In the next example, two variables (elder’s and carer’s age) are grouped by different dependency levels of the elderly. Additionally, a fitted line for each group is plotted:

sjp.scatter(efc$c160age, efc$e17age, efc$e42dep,
  title="Scatter Plot",
  legendTitle=sji.getVariableLabels(efc)['e42dep'],
  legendLabels=sji.getValueLabels(efc)[['e42dep']],
  axisTitle.x=sji.getVariableLabels(efc)['c160age'],
  axisTitle.y=sji.getVariableLabels(efc)['e17age'],
  showGroupFitLine=TRUE)
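To make explicit what a fitted line per group means, here is a small base R sketch. It rests on assumptions: the text above does not say which fitting method `showGroupFitLine` uses, so a simple linear fit via `lm()` is assumed, and the built-in `mtcars` data stands in for `efc`.

```r
# Stand-in data: mtcars ships with R; cyl plays the role of the grouping variable
dat <- data.frame(x = mtcars$wt, y = mtcars$mpg, grp = factor(mtcars$cyl))

# one colour per group
cols <- seq_along(levels(dat$grp))
plot(dat$x, dat$y, col = cols[as.integer(dat$grp)], pch = 19,
     xlab = "Weight", ylab = "Miles per gallon", main = "Scatter Plot")

# fit a separate linear model within each group and draw its line
fits <- lapply(split(dat, dat$grp), function(d) lm(y ~ x, data = d))
for (i in seq_along(fits)) {
  abline(fits[[i]], col = cols[i])
}

legend("topright", legend = levels(dat$grp), col = cols, pch = 19,
       title = "Cylinders")
```

Each line is estimated only from the observations of its own group, so the slopes may differ between groups.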

If the groups are difficult to distinguish in a single plot area, the graph can be faceted by groups. This is shown in the last example, where a scatter plot is plotted for each group:

sjp.scatter(efc$c160age, efc$e17age, efc$e42dep,
  title="Scatter Plot",
  legendTitle=sji.getVariableLabels(efc)['e42dep'],
  legendLabels=sji.getValueLabels(efc)[['e42dep']],
  axisTitle.x=sji.getVariableLabels(efc)['c160age'],
  axisTitle.y=sji.getVariableLabels(efc)['e17age'],
  showGroupFitLine=TRUE, useFacetGrid=TRUE, showSE=TRUE)
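For readers who want a comparable layout in plain ggplot2, the following sketch combines grouped points, a fit line per group with a confidence band, and one facet per group. It is an approximation under assumptions, not the internals of `sjp.scatter`: ggplot2 must be installed, `mtcars` stands in for `efc`, and a linear fit is assumed.

```r
library(ggplot2)

# Stand-in data: cyl plays the role of the grouping variable
dat <- data.frame(x = mtcars$wt, y = mtcars$mpg, grp = factor(mtcars$cyl))

p <- ggplot(dat, aes(x = x, y = y, colour = grp)) +
  geom_point() +
  # linear fit per group; se = TRUE draws the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  # one panel per level of the grouping variable
  facet_wrap(~ grp) +
  labs(title = "Scatter Plot", x = "Weight", y = "Miles per gallon",
       colour = "Cylinders")

print(p)
```

Dropping the `facet_wrap()` layer puts all groups back into a single plot area.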

A complete overview of the function's options is available in the package help or at inside-r.
