(This article was first published on **novyden**, and kindly contributed to R-bloggers)

This article walks step by step through the incremental changes that produce a bubble chart with **ggplot2** that looks like this:

We’ll encounter the plot above once more at the very end, after explaining each step with its code changes and looking at the intermediate plots. Without going into the details of what it means (the curious reader can find out here), the dataset behind it is defined as:

It contains 2 data points and 4 attributes: three numerical (*Aster_experience*, *R_experience*, and *coverage*) and one categorical (*product*). Remember that **the data won’t change a bit** while the plot progression unfolds.
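The original definition is not reproduced here, so the following is only a stand-in with the same shape: two rows and the four attributes named above. Every value, including the product names, is a hypothetical placeholder and not the author's data.

```r
# Stand-in data frame with the shape described in the text.
# All values are hypothetical placeholders, not the original data.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),  # categorical
  Aster_experience = c(7, 3),                              # numerical
  R_experience     = c(4, 8),                              # numerical
  coverage         = c(2, 40)                              # numerical
)

str(d)
```

The two *coverage* values are deliberately far apart so that the size problem discussed in the following steps is easy to see.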

The starting plot is a simple scatterplot that uses *Aster_experience* and *R_experience* as the *x* and *y* coordinates (line 3), *coverage* as point size, and *product* as point color (line 4). This type of scatterplot has a special name, the bubble chart:
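A minimal sketch of such a starting plot might look as follows. It is not the original listing, so its line numbers will not match the references in the text, and the data values are hypothetical placeholders.

```r
library(ggplot2)

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

# Bubble chart: position from the two experience columns,
# bubble size from coverage, bubble color from product.
p <- ggplot(d) +
  geom_point(aes(x = Aster_experience, y = R_experience,
                 size = coverage, color = product))

print(p)
```

With the default size scale the point with the smaller *coverage* is drawn very small, which is what the next step addresses.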

An immediate fix is to make the smaller point big enough to see with the help of the *scale_size* function and its *range* argument (line 3), which specifies the minimum and maximum size of the plotting symbol after transformation^{1}. Strangely enough, the sibling function *scale_size_area* has no such argument; it only takes *max_size*:
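As a sketch of this step, assuming the same hypothetical placeholder data and an arbitrarily chosen size range of 5 to 20:

```r
library(ggplot2)

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

p <- ggplot(d) +
  geom_point(aes(x = Aster_experience, y = R_experience,
                 size = coverage, color = product)) +
  # range = c(min, max): the smallest coverage is drawn at size 5 and
  # the largest at size 20 (both numbers are assumptions for illustration).
  scale_size(range = c(5, 20))

print(p)
```

Raising the lower bound of *range* is what keeps the small bubble visible; the upper bound only limits how large the big one gets.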

The next refinement aims at the magic quadrant concept, which fits this data well. In this case the quadrants are “R Experience” vs. “Aster Experience” and whether there is more or less of each. Achieving this effect involves drawing fake axes with *geom_hline* and *geom_vline* (line 3) and customizing the actual axes with the *scale* (lines 5-6) and *theme* (lines 8-12) functions:
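One way to sketch the quadrant layout is shown here. The 0 to 10 axis range, the crossing point at 5, and the "less"/"more" tick labels are all assumptions made for the hypothetical placeholder data, not values taken from the original listing.

```r
library(ggplot2)

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

p <- ggplot(d) +
  geom_point(aes(x = Aster_experience, y = R_experience,
                 size = coverage, color = product)) +
  # Fake axes: two lines crossing in the middle of the plotting area.
  geom_hline(yintercept = 5) +
  geom_vline(xintercept = 5) +
  scale_size(range = c(5, 20)) +
  # Real axes: fixed limits and one "less"/"more" label per half.
  scale_x_continuous("Aster Experience", limits = c(0, 10),
                     breaks = c(2.5, 7.5), labels = c("less", "more")) +
  scale_y_continuous("R Experience", limits = c(0, 10),
                     breaks = c(2.5, 7.5), labels = c("less", "more")) +
  # Remove ticks and grid lines so only the fake axes remain.
  theme(axis.ticks  = element_blank(),
        panel.grid  = element_blank(),
        axis.text.y = element_text(angle = 90, hjust = 0.5))

print(p)
```

Placing the breaks at the center of each half is what makes the labels read as quadrant names and not as numeric ticks.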

As is typical for bubble charts, the points are both colored and labeled, which makes the color legend redundant. We use *geom_text* to label the points (line 5) and *scale_color_manual* to assign new colors and remove the color legend (line 11):
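A sketch of the labeling step, again on hypothetical placeholder data. The two hex colors and the label offset are arbitrary choices, and `guide = "none"` is the current ggplot2 spelling for dropping a legend (older code often used `guide = FALSE`).

```r
library(ggplot2)

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

# Arbitrary colors, named by product so the mapping is explicit.
product_colors <- c("Product A" = "#1b9e77", "Product B" = "#d95f02")

p <- ggplot(d, aes(x = Aster_experience, y = R_experience)) +
  geom_point(aes(size = coverage, color = product)) +
  # Label each bubble with its product name, nudged above the point.
  geom_text(aes(label = product), vjust = -2) +
  scale_size(range = c(5, 20)) +
  # Assign the colors and drop the now-redundant color legend.
  scale_color_manual(values = product_colors, guide = "none") +
  scale_x_continuous(limits = c(0, 10)) +
  scale_y_continuous(limits = c(0, 10))

print(p)
```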

The next step turned out to tackle the most advanced problem encountered while working on the plot. The size legend above looks rather awkward. Ideally, it would match the two points we have in both color and size. It turns out (and rightly so) that the *scale_size* function is responsible for its appearance (line 8). In particular, the *breaks* argument overrides the number of legend entries, and the appearance of the legend, including its colors, is controlled with *guide_legend* and *override.aes*:
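The idea can be sketched like this on hypothetical placeholder data: *breaks* is set to the two *coverage* values so the legend has exactly two entries, and *override.aes* colors each legend key like its bubble. The colors are arbitrary, and ordering them by *coverage* is my assumption about how to line them up with the breaks.

```r
library(ggplot2)

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

product_colors <- c("Product A" = "#1b9e77", "Product B" = "#d95f02")

# Legend keys follow the breaks in ascending order, so pick each
# product's color in ascending order of coverage as well.
key_colors <- unname(product_colors[as.character(d$product[order(d$coverage)])])

p <- ggplot(d, aes(x = Aster_experience, y = R_experience)) +
  geom_point(aes(size = coverage, color = product)) +
  geom_text(aes(label = product), vjust = -2) +
  scale_color_manual(values = product_colors, guide = "none") +
  scale_size(
    range  = c(5, 20),
    breaks = sort(d$coverage),            # one legend entry per data point
    guide  = guide_legend(
      override.aes = list(colour = key_colors)  # color keys like the bubbles
    )
  ) +
  scale_x_continuous(limits = c(0, 10)) +
  scale_y_continuous(limits = c(0, 10))

print(p)
```

Because the size legend only knows about size, its keys would otherwise be drawn in the default color; *override.aes* is the hook that lets the legend borrow the colors from the color scale.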

We finish by cleaning up the plot with the **ggthemes** package and its *theme_tufte* function (line 10):
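A sketch of the final step, assuming the **ggthemes** package is installed and using the same hypothetical placeholder data, colors, and axis settings as assumptions. Since *theme_tufte* is a complete theme, it is added before the *theme* tweaks so that it does not overwrite them.

```r
library(ggplot2)
library(ggthemes)  # assumed to be installed; provides theme_tufte()

# Hypothetical placeholder data with the four attributes from the text.
d <- data.frame(
  product          = factor(c("Product A", "Product B")),
  Aster_experience = c(7, 3),
  R_experience     = c(4, 8),
  coverage         = c(2, 40)
)

product_colors <- c("Product A" = "#1b9e77", "Product B" = "#d95f02")
key_colors <- unname(product_colors[as.character(d$product[order(d$coverage)])])

p <- ggplot(d, aes(x = Aster_experience, y = R_experience)) +
  geom_point(aes(size = coverage, color = product)) +
  geom_hline(yintercept = 5) +
  geom_vline(xintercept = 5) +
  geom_text(aes(label = product), vjust = -2) +
  scale_color_manual(values = product_colors, guide = "none") +
  scale_size(range = c(5, 20), breaks = sort(d$coverage),
             guide = guide_legend(override.aes = list(colour = key_colors))) +
  scale_x_continuous("Aster Experience", limits = c(0, 10),
                     breaks = c(2.5, 7.5), labels = c("less", "more")) +
  scale_y_continuous("R Experience", limits = c(0, 10),
                     breaks = c(2.5, 7.5), labels = c("less", "more")) +
  # Complete theme first, individual tweaks afterwards.
  theme_tufte() +
  theme(axis.ticks  = element_blank(),
        axis.text.y = element_text(angle = 90, hjust = 0.5))

print(p)
```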

As promised, we finished exactly where we started.

**1** Scale size (area or radius).
