Here you will find daily news and tutorials about R, contributed by over 573 bloggers.
There are many ways to follow us - By e-mail:On Facebook: If you are an R blogger yourself you are invited to add your own R content feed to this site (Non-English R bloggers should add themselves- here)

(This article was first published on SAS and R, and kindly contributed to R-bloggers)

In Example 9.1, we showed a binning approach to plotting bivariate relationships in a large data set. Here we show more sophisticated approaches: transparent overplotting and formal two-dimensional kernel density estimation. We use the 10,000 simulated bivariate normals shown in Example 9.1.

SAS In SAS, transparency can be found in proc sgplot, with results shown above. The options here are fairly self-explanatory.

The image gives a good sense of the overall density, with the darker (overplotted) areas reflecting more observations. Overplotting was the problem we sought to avoid with the binning, but here it becomes an advantage.

Another approach is to use bivariate kernel density estimation. This is perhaps more similar to the binning shown previously, but without the stricture of regular polygons. It also offers some default values for smoothing, though whether or not these are good default values could be debated.

proc kde data=mvnorms; bivar x1 x2 / plots=contour; run;

R

In R, the basic plot() function appears to include transparency, though you must select a suitably pale color to see it. The pch, col, and cex parameters govern the shape, color, and size of the plotted symbols, respectively.

Bivariate kernel density estimation is available in the smoothScatter() function, which is in included in the R distribution as part of the graphics package.

smoothScatter(xvals[,1], xvals[,2])

Related

To leave a comment for the author, please follow the link and comment on their blog: SAS and R.