As I conduct some analysis for a content validation study, I wanted to quickly blog about a fun plot I discovered today: ggpairs, which displays scatterplots and correlations in a grid for a set of variables.
To demonstrate, I’ll return to my Facebook dataset, which I used for some of last year’s R analysis demonstrations. You can find the dataset, a minicodebook, and code on importing into R here. Then use the code from this post to compute the following variables: RRS, CESD, Extraversion, Agree, Consc, EmoSt, Openness. These correspond to measures of rumination, depression, and the Big Five personality traits. We could easily request correlations for these 7 variables. But if I wanted scatterplots plus correlations for all 7, I can easily request it with ggpairs then listing out the columns from my dataset I want included on the plot:
(Note: I also computed the 3 RRS subscales, which is why the column numbers above skip from 112 (RRS) to 116 (CESD). You might need to adjust the column numbers when you run the analysis yourself.)
The results look like this:
Since the grid is the number of variables squared, I wouldn’t recommend this type of plot for a large number of variables.