Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
Creating summary tables is a key part of data analysis, allowing you to see trends and patterns in your data. In this post, we’ll explore how to create these tables using tidyquant and dplyr in R. These packages make it easy to manipulate and summarize your data.
< section id="examples" class="level1">Examples
< section id="using-tidyquant-for-summary-tables" class="level2">Using tidyquant for Summary Tables
tidyquant is a versatile package that extends the tidyverse for financial and time series analysis. It simplifies working with data by integrating tidy principles.
< section id="example-calculating-average-price-by-month" class="level3">Example: Calculating Average Price by Month
Here’s an example of how to calculate the average price by month using tidyquant:
# Load necessary libraries
library(tidyquant)
library(dplyr)
# Sample data: Daily stock prices
data <- tibble(
date = seq(as.Date('2023-01-01'), as.Date('2023-06-30'), by = 'day'),
price = runif(181, 100, 200)
)
# Create a summary table with average closing price by month
summary_table <- data |>
mutate(month = floor_date(date, "month")) |>
pivot_table(
.rows = month,
.values = ~ mean(price, na.rm = TRUE)
) |>
setNames(c("date", "avg_price"))
print(summary_table)
# A tibble: 6 × 2 date avg_price <date> <dbl> 1 2023-01-01 149. 2 2023-02-01 162. 3 2023-03-01 151. 4 2023-04-01 151. 5 2023-05-01 145. 6 2023-06-01 149.
In this example:
- tidyquant and tibble are loaded to handle data manipulation.
- We create a sample dataset with daily stock prices.
- The
mutatefunction adds a new columnmonth, which extracts the month from each date. pivot_tablecalculates the average price for each month.- Finally, we rename the columns for clarity.
Using dplyr for Summary Tables
dplyr is a core tidyverse package known for its powerful data manipulation functions. It helps streamline the process of filtering, summarizing, and mutating data.
< section id="example-calculating-average-closing-price-by-month" class="level3">Example: Calculating Average Closing Price by Month
Here’s a similar example using dplyr:
# Load necessary libraries
library(dplyr)
library(lubridate)
# Sample data: Daily stock prices
data <- tibble(
date = seq(as.Date('2023-01-01'), as.Date('2023-06-30'), by = 'day'),
price = runif(181, 100, 200)
)
# Create a summary table with average closing price by month
summary_table <- data %>%
mutate(month = floor_date(date, "month")) %>%
group_by(month) %>%
summarise(avg_close = mean(price))
print(summary_table)
# A tibble: 6 × 2 month avg_close <date> <dbl> 1 2023-01-01 149. 2 2023-02-01 140. 3 2023-03-01 147. 4 2023-04-01 146. 5 2023-05-01 147. 6 2023-06-01 151.
In this dplyr example:
- We load dplyr and lubridate for data manipulation and date handling.
- The dataset creation process is the same.
- The
mutatefunction is used to add amonthcolumn. - We group the data by month using
group_byand then calculate the average closing price for each group usingsummarise.
Your Turn!
Using packages like tidyquant and dplyr simplifies data analysis tasks, making it easier to work with large datasets. These examples show just one way to create summary tables; there are many other functions and methods to explore. Give these examples a try with your own data and see how you can summarize and gain insights from your datasets.
Happy coding!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
