Understanding R’s `describe()` Function: A Complete Guide to Summary Statistics
Introduction to describe()
The describe() function from R’s psych package (Revelle, 2023) provides a comprehensive statistical summary of your dataset. Unlike R’s base summary() function, it includes additional metrics that are particularly useful for data exploration and assumption checking.
library(psych)
describe(your_data)
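For comparison, here is a minimal sketch of what base R's summary() reports for a single numeric variable. The choice of mtcars$mpg is only for illustration.

```r
# Base R's summary() returns six values for a numeric vector:
# minimum, first quartile, median, mean, third quartile, maximum.
base_summary <- summary(mtcars$mpg)
print(base_summary)

# There is no sd, trimmed mean, MAD, skew, kurtosis, or standard error
# here, which is the gap describe() is meant to fill.
names(base_summary)
```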
Breaking Down the Output Columns
Here’s what each column in the output represents:
| Column | Description | Formula/Calculation | Ideal Use Case |
|---|---|---|---|
| vars | Variable index number | – | Tracking variable order |
| n | Number of non-missing values | length(na.omit(x)) | Data completeness check |
| mean | Arithmetic average | sum(x)/n | Normally distributed data |
| sd | Standard deviation | sqrt(var(x)) | Measuring spread |
| median | 50th percentile | quantile(x, 0.5) | Skewed distributions |
| trimmed | Mean after removing extremes | mean(x, trim=0.1) | Robust central tendency |
| mad | Median absolute deviation (scaled by 1.4826) | 1.4826*median(abs(x-median(x))) | Outlier-resistant spread |
| min | Minimum value | min(x) | Range assessment |
| max | Maximum value | max(x) | Range assessment |
| range | Max – Min | max(x)-min(x) | Total spread |
| skew | Distribution asymmetry | sum((x-mean(x))^3)/(n*sd(x)^3) | Detecting skew direction |
| kurtosis | Tailedness (excess) | sum((x-mean(x))^4)/(n*sd(x)^4)-3 | Outlier propensity |
| se | Standard error | sd(x)/sqrt(n) | Precision of mean estimate |
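As a rough cross-check, the columns can be recomputed for one variable using base R only. This sketch assumes describe()'s defaults as I understand them (trim = 0.1, the type 3 skew and kurtosis formulas, and the scaled MAD that base R's mad() returns); the name by_hand is just for illustration.

```r
# Recompute describe()'s columns for a single variable with base R.
# Assumed defaults: trim = 0.1, type = 3 skew/kurtosis, scaled MAD.
x <- mtcars$mpg
x <- x[!is.na(x)]   # work on non-missing values only
n <- length(x)
m <- mean(x)
s <- sd(x)

by_hand <- c(
  n        = n,
  mean     = m,
  sd       = s,
  median   = median(x),
  trimmed  = mean(x, trim = 0.1),
  mad      = mad(x),   # equals 1.4826 * median(abs(x - median(x)))
  min      = min(x),
  max      = max(x),
  range    = max(x) - min(x),
  skew     = sum((x - m)^3) / (n * s^3),
  kurtosis = sum((x - m)^4) / (n * s^4) - 3,
  se       = s / sqrt(n)
)
round(by_hand, 2)
```

Note that mad() multiplies the raw median absolute deviation by 1.4826 so that it is comparable to the standard deviation for normally distributed data.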
Key Statistics and Their Interpretation
Central Tendency
- Mean vs. Median: A noticeable gap between the two suggests skewness
- Trimmed Mean: Reduces the influence of outliers (the default trim = 0.1 drops the lowest and highest 10% of values)
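A small sketch of how the three measures of central tendency react to a single extreme value. The incomes vector is made-up data, not taken from any real dataset.

```r
# Hypothetical right-skewed sample: nine moderate values and one very large one.
incomes <- c(28, 30, 31, 33, 35, 36, 38, 40, 42, 250)

plain_mean   <- mean(incomes)              # 56.3, pulled up by the 250
middle_value <- median(incomes)            # 35.5, barely affected
trimmed_mean <- mean(incomes, trim = 0.1)  # 35.625, drops 28 and 250

c(mean = plain_mean, median = middle_value, trimmed = trimmed_mean)
```

With ten values and trim = 0.1, exactly one observation is removed from each end before averaging, which is why the trimmed mean lands close to the median here.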
Variability
- SD vs. MAD: Use MAD when outliers are present
- Range: Simple but outlier-sensitive
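To illustrate the difference in outlier sensitivity, here is a sketch that adds one extreme value to a small, well-behaved sample. Both vectors are hypothetical.

```r
# A tidy sample, and the same sample with one extreme value appended
# (imagine a data-entry error).
clean        <- c(10, 11, 12, 13, 14, 15, 16)
with_outlier <- c(clean, 100)

spread <- data.frame(
  data  = c("clean", "with_outlier"),
  sd    = c(sd(clean), sd(with_outlier)),
  mad   = c(mad(clean), mad(with_outlier)),
  range = c(diff(range(clean)), diff(range(with_outlier)))
)
print(spread, digits = 3)
```

In this example the SD and range grow many times over once the outlier is added, while the MAD does not move at all.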
Distribution Shape
- Skewness:
  - >0: Right-tailed
  - <0: Left-tailed
  - ≈0: Roughly symmetric
- Kurtosis (Excess):
  - >0: Heavy-tailed (more outliers than normal)
  - <0: Light-tailed
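The sign conventions above can be checked on a few tiny hand-made vectors. This is a sketch only: skew_type3 and kurt_type3 are hypothetical helper names implementing the moment formulas from the table earlier, not functions exported by psych.

```r
# Moment-based skew and excess kurtosis, following the table's formulas.
# Helper names are made up for this example.
skew_type3 <- function(x) {
  x <- x[!is.na(x)]
  sum((x - mean(x))^3) / (length(x) * sd(x)^3)
}
kurt_type3 <- function(x) {
  x <- x[!is.na(x)]
  sum((x - mean(x))^4) / (length(x) * sd(x)^4) - 3
}

right_tailed <- c(1, 1, 2, 2, 3, 10)                 # one large value on the right
left_tailed  <- -right_tailed                        # mirror image
symmetric    <- 1:5
heavy_tailed <- c(-10, -1, 0, 0, 0, 0, 0, 0, 1, 10)  # extremes on both sides
light_tailed <- 1:10                                 # evenly spread, no extremes

skew_type3(right_tailed)   # positive
skew_type3(left_tailed)    # negative
skew_type3(symmetric)      # zero
kurt_type3(heavy_tailed)   # positive
kurt_type3(light_tailed)   # negative
```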
Practical Examples
Example 1: MPG from mtcars
describe(mtcars$mpg)
Output Interpretation:
  vars  n  mean   sd median trimmed  mad  min  max range skew kurtosis   se
1    1 32 20.09 6.03   19.2   19.70 5.41 10.4 33.9  23.5 0.61    -0.37 1.07
- Right-skewed (mean > median, positive skew)
- Light-tailed (negative kurtosis)
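- SD (6.03) is slightly larger than MAD (5.41): because mad is scaled to be comparable to SD for normal data, the gap hints at mild influence from the larger values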
When to Use Which Statistic
| Scenario | Recommended Statistics |
|---|---|
| Normal Distribution | Mean, SD |
| Skewed Data | Median, IQR, MAD |
| Outlier Detection | MAD, trimmed mean, kurtosis |
| Parametric Testing | Mean, SE |
| Nonparametric Analysis | Median, IQR |
Extending the Functionality
Adding IQR
The default describe() doesn’t show IQR, but you can add it:
library(dplyr)
describe(mtcars) %>% mutate(IQR = apply(mtcars, 2, IQR, na.rm = TRUE))
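An alternative sketch that avoids dplyr: compute the IQRs with base R and, if psych happens to be installed, attach them to the describe() result by variable name. To my knowledge describe() also accepts an IQR = TRUE argument in recent psych versions, but treat that as an assumption to verify against your installed version.

```r
# Base R: one IQR per column, no extra packages needed.
iqr_by_column <- sapply(mtcars, IQR, na.rm = TRUE)
round(iqr_by_column, 2)

# Only runs if psych is available; matches IQRs to rows by variable name.
if (requireNamespace("psych", quietly = TRUE)) {
  desc <- psych::describe(mtcars)
  desc$IQR <- iqr_by_column[rownames(desc)]
  print(desc)
  # Assumed built-in alternative: psych::describe(mtcars, IQR = TRUE)
}
```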
Comparing Groups
Use describeBy() for grouped statistics:
describeBy(mtcars$mpg, group = mtcars$cyl)
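If you only need a handful of grouped statistics, a base R sketch along the same lines might look like this. It is not a replacement for describeBy(), just a way to see what is being computed per group; the selection of statistics is arbitrary.

```r
# Group size, mean, sd, and median of mpg for each cylinder count.
by_cyl <- aggregate(
  mpg ~ cyl,
  data = mtcars,
  FUN  = function(v) c(n = length(v), mean = mean(v), sd = sd(v), median = median(v))
)

# aggregate() returns the statistics as a matrix column; flatten it
# into ordinary columns named mpg.n, mpg.mean, mpg.sd, mpg.median.
by_cyl <- do.call(data.frame, by_cyl)
print(by_cyl, digits = 3)
```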
Conclusion
R’s describe() function provides a powerful starting point for exploratory data analysis. By understanding each statistic it provides, you can:
- Detect data quality issues
- Choose appropriate analysis methods
- Understand your variables’ distributions
- Make informed decisions about data transformations
For formal reporting, consider supplementing these metrics with visualization and statistical tests.
Pro Tip: Always visualize your data alongside these statistics – numbers tell part of the story, but plots reveal the full picture!
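A minimal sketch of what that could look like for the mpg example, using base graphics and the Shapiro-Wilk test as one possible formal check. The plot titles and the choice of test are illustrative, not a recommendation for every dataset.

```r
mpg <- mtcars$mpg

# Histogram: makes the longer right tail behind the positive skew visible.
h <- hist(mpg, main = "Distribution of mpg", xlab = "Miles per gallon")

# Boxplot: marks points more than 1.5 * IQR beyond the quartiles.
b <- boxplot(mpg, horizontal = TRUE, main = "mpg")

# Normal Q-Q plot with a reference line, plus a Shapiro-Wilk test.
qqnorm(mpg)
qqline(mpg)
sw <- shapiro.test(mpg)
sw$p.value
```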
Happy coding!
—
Reference:
Revelle, W. (2023). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University.
Understanding R’s `describe()` Function: A Complete Guide to Summary Statistics was first posted on April 29, 2026 at 6:09 am.