Cheatsheet – Selecting Graphs for Statistical Analysis

June 23, 2016
(This article was first published on R – Journey of Analytics, and kindly contributed to R-bloggers)

One of the first steps in any statistical analysis, whether for hypothesis testing, predictive analytics, or even a Kaggle competition, is checking the relationships between variables to see whether a pattern exists.

Graphs are a fantastic visual way of identifying such relationships.

[Figure: Matplotlib graph]

However, numerous readers kept getting stuck while selecting graphs for categorical variables, and many friends asked whether there is a standard rule for graph selection. With that in mind, below is a cheatsheet for selecting graphs for both quantitative (numeric) and categorical (character: gender, disease type, etc.) variables.
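Before picking a graph, it helps to know which kind of variable you are holding. Here is a minimal Python sketch of that check, assuming a column arrives as a plain list. The function name `variable_kind` and the sample columns are made up for illustration and are not part of any library.

```python
from numbers import Number


def variable_kind(values):
    """Label a column as 'quantitative' or 'categorical' (hypothetical helper)."""
    # Ignore missing entries when deciding.
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot classify an empty column")
    # bool is a subclass of int in Python, so exclude it explicitly:
    # True/False flags behave like categories, not measurements.
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    )
    return "quantitative" if is_numeric else "categorical"


# Made-up sample columns.
ages = [34, 51, None, 62]
gender = ["F", "M", "F", "F"]

print(variable_kind(ages))
print(variable_kind(gender))
```

A type check like this is only a first guess. Numeric codes such as zip codes or disease codes are stored as numbers but should still be treated as categorical.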

 

| No. | Axis 1 | Axis 2 | Chart type |
|-----|--------|--------|------------|
| 1 | Single quant | | Histogram, density plot, box plot |
| 2 | Single categorical | | Bar chart (frequency/count), pie chart (frequency/count/%) |
| 3 | Categorical | Quant | Bar chart, pie chart, frequency table, line chart |
| 4 | Quant | Quant | Scatterplot |
| 5 | Categorical | Categorical | Stacked column chart, combination chart (typical bar chart with trendlines) |
| 6 | 2 categorical | Quant | Stacked or side-by-side bar charts, heat maps. Any basic graph, with color/shape coding for one of the categorical variables. |
| 7 | 1 categorical | 2 quant | Stacked or side-by-side bar charts, scatter plots. Any basic graph, with color/shape coding for one of the quant variables. |
| 8 | 3+ variables of any type | | Please check whether you really need so many variables in a single graph. Side-by-side graphs may be a better option, or graphs with filters (if your programming language supports them). |
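If you would rather look the answer up in code, the cheatsheet can be written as a small lookup keyed on how many quantitative and categorical variables you have. This is only a sketch: the names `CHEATSHEET` and `suggest_chart` are invented here, and the suggestions are abbreviated from the cheatsheet.

```python
from collections import Counter

# (number of quantitative variables, number of categorical variables) -> suggestion
CHEATSHEET = {
    (1, 0): "Histogram, density plot, box plot",
    (0, 1): "Bar chart (frequency/count), pie chart (frequency/count/%)",
    (1, 1): "Bar chart, pie chart, frequency table, line chart",
    (2, 0): "Scatterplot",
    (0, 2): "Stacked column chart, combination chart",
    (1, 2): "Stacked or side-by-side bar charts, heat maps",
    (2, 1): "Stacked or side-by-side bar charts, scatter plots",
}

# Fallback for any other combination of three or more variables.
FALLBACK = "Reconsider the variable count; try side-by-side graphs or filters"


def suggest_chart(kinds):
    """Suggest chart types for a list of 'quantitative'/'categorical' labels."""
    if not kinds:
        raise ValueError("need at least one variable")
    counts = Counter(kinds)
    unknown = set(counts) - {"quantitative", "categorical"}
    if unknown:
        raise ValueError(f"unknown variable kind(s): {sorted(unknown)}")
    key = (counts["quantitative"], counts["categorical"])
    return CHEATSHEET.get(key, FALLBACK)


print(suggest_chart(["quantitative", "quantitative"]))
print(suggest_chart(["categorical", "categorical", "quantitative"]))
print(suggest_chart(["quantitative"] * 4))
```

Keying on counts rather than on axis order keeps the lookup short, at the cost of ignoring which variable goes on which axis.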

These are merely guidelines and are language-agnostic, so you can implement them in the programming language of your choice (R, Python, SAS, MATLAB, etc.). However, if you prefer, code implementations in R and Python are provided in the links below:

  • Charts in R:
  • Charts in Python:
    • This link contains code and images to create stunning graphs (box plots, histograms, heatmaps, bubble charts, etc.) using the Matplotlib library, like the one shown above.
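For a quick taste in Python, here is a rough Matplotlib sketch of three cases from the cheatsheet: a histogram for a single quantitative variable, a bar chart of counts for a single categorical variable, and a scatter plot of two quantitative variables with points colored by a categorical one. The data and the output filename are made up for illustration, and this is not the code from the linked posts.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from collections import Counter

# Made-up illustrative data (not from the article).
ages = [23, 35, 31, 44, 52, 38, 29, 61, 47, 33]
income = [28, 41, 39, 55, 62, 48, 33, 70, 58, 40]  # in thousands
gender = ["F", "M", "F", "M", "F", "F", "M", "M", "F", "M"]

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))

# Single quantitative variable: histogram.
ax1.hist(ages, bins=5)
ax1.set_title("Single quant: histogram")
ax1.set_xlabel("Age")

# Single categorical variable: bar chart of counts.
counts = Counter(gender)
ax2.bar(list(counts.keys()), list(counts.values()))
ax2.set_title("Single categorical: bar chart")
ax2.set_ylabel("Count")

# Two quantitative variables plus one categorical: color-coded scatter plot.
colors = {"F": "tab:orange", "M": "tab:blue"}
for level, color in colors.items():
    xs = [a for a, g in zip(ages, gender) if g == level]
    ys = [i for i, g in zip(income, gender) if g == level]
    ax3.scatter(xs, ys, color=color, label=level)
ax3.set_title("2 quant + 1 categorical: scatter")
ax3.set_xlabel("Age")
ax3.set_ylabel("Income (thousands)")
ax3.legend(title="Gender")

fig.tight_layout()
fig.savefig("cheatsheet_examples.png")  # hypothetical output filename
```

Swapping `ax1.hist` for `ax1.boxplot(ages)` gives the box plot alternative for a single quantitative variable.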

Hope you find this cheatsheet useful! Feel free to share your thoughts and comments. Adieu!
