Tidy Visualization of Mixture Models in R

[This article was first published on R-posts.com, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

We are excited to announce the release of the plotmm R package (v0.1.0), which is a suite of tidy tools for visualizing mixture model output. plotmm is a substantially updated version of the plotGMM package (Waggoner and Chan). Whereas plotGMM only includes support for visualizing univariate Gaussian mixture models fit via the mixtools package, the new plotmm package supports numerous mixture model specifications from several packages (model objects).

This current stable release is available on CRAN, and was a collaborative effort among Fong Chan, Lu Zhang, and Philip Waggoner.

The package has five core functions:

  • plot_mm(): The main function of the package, plot_mm() allows the user to simply input the name of the fit mixture model, as well as an optional argument to pass the number of components k that were used in the original fit. Note: the function will automatically detect the number of components if k is not supplied. The result is a tidy ggplot of the density of the data with overlaid mixture weight component curves. Importantly, as the grammar of graphics is the basis of visualization in this package, all other tidy-friendly customization options work with any of the plotmm’s functions (e.g., customizing with ggplot2’s functions like labs() or theme_*(); or patchwork’s plot_annotation()).

  • plot_cut_point(): Mixture models are often used to derive cut points of separation between groups in feature space. plot_cut_point() plots the data density with the overlaid cut point (point of greatest separation between component class means) from the fit mixture model.

  • plot_mix_comps(): A helper function allowing for expanded customization of mixture model plots. The function superimposes the shape of the components over a ggplot2 object. plot_mix_comps() is used to render all plots in the main plot_mm() function, and is not bound by package-specific objects, allowing for greater flexibility in plotting models not currently supported by the main plot_mm() function.

  • plot_gmm(): The original function upon which the package was expanded. It is included in plotmm for quicker access to a common mixture model form (univariate Gaussian), as well as to bridge between the original plotGMM package.

  • plot_mix_comps_normal(): Similarly, this function is the original basis of plot_mix_comps(), but for Gaussian mixture models only. It is included in plotmm for bridging between the original plotGMM package.
Supported model objects/packages include:

  1. mixtools
  2. EMCluster
  3. flexmix
Supported specifications include mixtures of:

  1. Univariate Gaussians
  2. Bivariate Gaussians
  3. Gammas
  4. Logistic regressions
  5. Linear regressions (also with repeated measures)
  6. Poisson regressions

Take a look at the Github repo for several demonstrations.

Note that though plotmm includes many updates and expanded functionality beyond plotGMM, it is under active development with support for more model objects and specifications forthcoming. Stay tuned for updates, and always feel free to open an issue ticket to share anything you’d like to see included in future versions of the package. Thanks and enjoy!

To leave a comment for the author, please follow the link and comment on their blog: R-posts.com.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)