How to create appropriate data visualizations using the tidycharts package.
There is a wide range of R packages created for data visualization, but still, something was lacking. There was no simple and easily accessible way to create clean and transparent charts. Until now! tidycharts guarantees that your charts will be appropriate and consistent with each other. Furthermore, you won’t have to worry about whether your charts are transparent and tidy enough, because tidycharts takes care of that for you by following the International Business Communication Standards (IBCS) rules.
What is IBCS exactly?
The IBCS Association is an open, not-for-profit organization that supports promoting, maintaining, and further developing the International Business Communication Standards (IBCS®). Version 1.0 of IBCS was published in 2013 by Rolf Hichert and Jürgen Faisst. Version 1.1 has been available since 2017.
The standards contain practical proposals for the design of business communication. The main goal is to design charts that are conceptually, perceptually, and semantically sound. To achieve this, the IBCS creators proposed the SUCCESS rules, an acronym that stands for Say, Unify, Condense, Check, Express, Simplify, Structure.
You can use the charts later in reports, presentations, and dashboards.
What does the package have to offer?
We implemented chart-generating functions for the most frequently used types of plots. The package returns the charts in .svg format, which brings benefits such as a transparent background and no loss of image quality when zooming. Our package includes:
- column bar charts (basic, aggregated, normalized, referenced, grouped)
- horizontal bar charts (basic, aggregated, normalized, referenced, grouped)
- line plots (basic, with markers, aggregated, normalized, referenced, with chosen points highlighted)
- scatter and bubble plots
Additionally, we added functions that help you customize your plots and make generating plots for reports easier:
- making your own color palette for charts
- showing your charts in a grid, next to each other
For now, the tidycharts package isn’t yet available on CRAN but we encourage you to download it anyway! Simply run the following command:
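A development-version install from GitHub usually looks like the sketch below. The repository path "MI2DataLab/tidycharts" is an assumption, not something stated in this post, so check it before running; the remotes package is used here as one common way to install from GitHub.

```r
# Assumption: the development version lives in the GitHub repository
# "MI2DataLab/tidycharts"; adjust the path if the repository differs.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("MI2DataLab/tidycharts")
```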
Once the package is available on CRAN, installation will be even more effortless: it proceeds like any other R package installation. Just run the following command and wait for the library to load.
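For a CRAN release, the standard R installation pattern would apply. The sketch below assumes the package is published on CRAN under the name tidycharts.

```r
# Standard CRAN installation, assuming the package is published as "tidycharts"
install.packages("tidycharts")

# Load the package into the current session
library(tidycharts)
```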
Let’s say we want to create a bar chart with two series to show product and service sales in different European cities. First, we need to prepare a data frame.
# prepare the data frame
data <- data.frame(
  city = c("Berlin", "Munich", "Cologne", "London", "Vienna", "Paris", "Zurich"),
  Products = c(538, 250, 75, 301, 227, 100, 40),
  Services = c(621, 545, 302, 44, 39, 20, 34)
)
Next, we need to generate the plot.
# generate the chart
barchart <- bar_chart(data, data$city, c("Products", "Services"), c("Products", "Services"))
# show the plot
barchart
This is the final result
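Since the charts come back in .svg format, you may want to write one to a file for use in a report. The snippet below is a minimal sketch using only base R. It assumes the chart object can be treated as a character string of SVG markup, and it uses a small hand-written stand-in string (svg_chart) instead of a real tidycharts result so that it runs on its own.

```r
# Stand-in for a chart object. In practice this would be the value returned by
# bar_chart(), assumed here to behave like a character string of SVG markup.
svg_chart <- paste0(
  '<svg xmlns="http://www.w3.org/2000/svg" width="120" height="60">',
  '<rect width="120" height="60" fill="grey"/>',
  '</svg>'
)

# Write the markup to an .svg file in the session's temporary directory
out_file <- file.path(tempdir(), "barchart.svg")
writeLines(as.character(svg_chart), out_file)

# Confirm that the file was written
file.exists(out_file)
```

The resulting file can be opened in a browser or embedded in a document that accepts SVG images.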
Let’s see one more example using the well-known iris data set.
scatter <- scatter_plot(
  iris, iris$Sepal.Length, iris$Sepal.Width, iris$Species, 1, 0.5,
  c("sepal length", "in cm"), c("sepal width", "in cm"), "Legend"
)
scatter
Customizing your plots
IBCS advises using various shades of grey instead of other colors, but we left the choice of colors up to the users. The grey palette is the default one, but you can change it by calling the set_colors() function.
Let’s see an example.
# before the customization
data_time_series <- data.frame(
  time = month.abb[1:8],
  Poland = round(2 + 0.5 * sin(1:8), 1),
  Germany = round(3 + sin(3:10), 1),
  Slovakia = round(2 + 2 * cos(1:8), 1)
)
column_chart(data_time_series, x = 'time', series = c('Poland', 'Germany', 'Slovakia'), interval = 'months')
Now let’s use the set_colors() function.
# changing the colors
color_df <- data.frame(
  bar_colors = c("rgb(61, 56, 124)", "rgb(0,89,161)", "rgb(0,120,186)", "rgb(0,150,193)", "rgb(0, 178, 184)", "rgb(0,178,184)"),
  text_colors = c("white", "white", "white", "white", "white", "black")
)
set_colors(color_df)
Generating the plot again
column_chart(data_time_series, x = 'time', series = c('Poland', 'Germany', 'Slovakia'), interval = 'months')
Later you can always switch to default options by calling the restore_defaults() function.
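The colors passed to set_colors() above are written as CSS-style "rgb(r, g, b)" strings. If your palette is defined as hex codes or R color names, a small helper can produce those strings for you. The sketch below uses only base R; to_css_rgb() is a hypothetical helper written for this example and is not part of tidycharts.

```r
# Hypothetical helper (not part of tidycharts): convert hex codes or R color
# names into CSS-style "rgb(r, g, b)" strings.
to_css_rgb <- function(colors) {
  channels <- grDevices::col2rgb(colors)  # 3 x n integer matrix: red, green, blue
  sprintf(
    "rgb(%d, %d, %d)",
    channels["red", ], channels["green", ], channels["blue", ]
  )
}

# The same palette as in the example above, written as hex codes
hex_palette <- c("#3D387C", "#0059A1", "#0078BA", "#0096C1", "#00B2B8", "#00B2B8")

color_df <- data.frame(
  bar_colors = to_css_rgb(hex_palette),
  text_colors = c("white", "white", "white", "white", "white", "black")
)

color_df$bar_colors[1]  # "rgb(61, 56, 124)"
```

The resulting color_df has the same two columns as the one built by hand, so it should be usable with set_colors() in the same way.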
Gluing the plots together
When writing reports, you may need to show several plots next to each other. tidycharts has a function perfect for that! Let’s say you want to explore how various body measurements relate to each other across penguin species in the palmerpenguins data set. Scatter plots are an ideal tool for that!
First, let’s load the necessary libraries and drop NA values.
library(palmerpenguins)
library(tidyverse)
# drop rows with missing values in any of the columns plotted below
p <- penguins %>%
  drop_na(bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g)
Next, generate the SVG strings with the scatter plots.
# --- bill length on the x-axis ---
scatter1 <- scatter_plot(
  p, p$bill_length_mm, p$bill_depth_mm, p$species,
  x_names = c("bill length", "in mm"),
  y_names = c("bill depth", "in mm")
)
scatter2 <- scatter_plot(
  p, p$bill_length_mm, p$flipper_length_mm, p$species,
  x_names = c("bill length", "in mm"),
  y_names = c("flipper length", "in mm")
)
scatter3 <- scatter_plot(
  p, p$bill_length_mm, p$body_mass_g, p$species,
  x_names = c("bill length", "in mm"),
  y_names = c("body mass", "in g")
)
Finally, join the plots together and show the plot
join_charts(scatter1, scatter2, scatter3, nrows=1, ncols=3)
We kindly encourage you to try out the tidycharts package and start your journey with data visualizations and exploring the package’s possibilities.
If you are also interested in posts about explainable, fair, and responsible ML, follow #ResponsibleML on Medium.
A way of creating clear, transparent, and unified data visualizations was originally published in ResponsibleML on Medium.