This is the beginning of a series on portfolio volatility, variance, and standard deviation. I realize that it’s a lot more fun to fantasize about analyzing stock returns, which is why television shows and websites constantly update the daily market returns and give them snazzy green and red colors. But good ol’ volatility is quite important in its own right, especially to finance geeks, aspiring finance geeks, and institutional investors. If you are, might become, or might ever work with/for any of those, this series should at least serve as a jumpingoff point.
Briefly, our volatility project will proceed as follows:
Part 1

Portfolio Volatility Intro: byhand, matrix algebra, builtin and compare to SPY

Visualizing Volatility Intro: chart portfolio sd over time – will have to roll apply

Shiny app to test different portfolios
Part 2

Asset Contributions to Volatility: byhand, matrix algebra, builtin and visualize snapshot with bar graph

Chart contributions over time: flag any asset that surpasses a threshold

Shiny app to test different portfolios
Part 3

Minimum Variance Portfolio: find minimum variance portfolio weights

Shiny app to test different portfolios
A quick word of warning: this series begins at the beginning with portfolio standard deviation, builds up to a more compelling data visualization in the next post, and finally a nice Shiny app after that. R users with experience in the world of volatility may wish to skip this post and wait for the visualizations in the next one. That said, I would humbly offer a couple of benefits to the R code that awaits us.
First, volatility is important, possibly more important than returns. I don’t think any investment professional looks back on hours spent pondering volatility as a waste of time. Plus, today we’ll look at a new way to convert daily prices to monthly using the tidyquant
package, and that might offer enough new substance.
Second, as always, we have an eye on making our work reproducible and reusable. This Notebook makes it exceedingly clear how we derive our final data visualizations on portfolio volatility. It’s a good template for other visualization derivations, even if standard deviation is old hat for you.
Okay, without further ado, here’s where we are headed today:

Import prices and calculate returns for 5 assets and construct a portfolio.
 Calculate the standard deviation of monthly portfolio returns using three methods:
 the oldfashioned equation
 matrix algebra
 a builtin function from
performanceAnalytics
 Compare those to the standard deviation of monthly SPY returns.
On to step 1, wherein we import prices and calculate returns for the 5 ETFs to be used in our portfolio. Those are AGG (a US bond fund), DBC (a commodities fund), EFA (a nonUS equities fund), SPY (an S&P500 ETF), VGT (a technology fund).
Let’s import prices and save them to an xts
object.
# A vector of symbol for our ETFs.
symbols < sort(c("SPY","VGT","EFA","DBC","AGG"))
# Pipe them to getSymbols, extract the closing prices, and merge to one xts object.
# Take a look at result before moving on to calculate the returns.
# Notice that we are only grabbing prices from 2013 to present, but that is
# only to keep the loading time shorter for the post.
prices <
getSymbols(symbols, src = 'google', from = "20130101",
auto.assign = TRUE, warnings = FALSE) %>%
map(~Cl(get(.))) %>%
reduce(merge) %>%
`colnames<`(symbols)
Next we want to turn those daily prices into monthly returns. We will pipe the prices
object to the tq_transmute()
function from the tidyquant package, but we can’t do that directly. First we need to transform our xts
object to a tibble
using a call to as_tibble(preserve_row_names = TRUE)
from tidyquant
. There’s probably a more efficient way to set this up (check out the tidyquant articles to learn more) than this whole piped construct, but I derive a certain pleasure from toggling between tibble
and xts
objects.
# We are going to make heavy use of the tidyquant package to get monthly returns.
portfolio_component_monthly_returns_xts <
prices %>%
# Convert to tibble so can stay in the tidyquant/verse.
as_tibble(preserve_row_names = TRUE) %>%
# Add a date column.
mutate(date = ymd(row.names)) %>%
# Remove the row.names column; it's not needed anymore.
select(row.names) %>%
# I like to have the date column as the first column.
select(date, everything()) %>%
# We need to gather into long format in order to use tq_transmute().
gather(asset, return, date) %>%
group_by(asset) %>%
# Use the function from tidyquant; note how easily we could change to
# a different time period like weekly or yearly.
tq_transmute(mutate_fun = periodReturn, period = "monthly") %>%
# Put the results back to wide format.
spread(asset, monthly.returns) %>%
# Convert back to an xts, so we can use the cov() and StdDev() functions.
as_xts(date_col = date)
head(portfolio_component_monthly_returns_xts)
## AGG DBC EFA SPY VGT
## 20130131 0.0050473186 0.021162123 0.02147558 0.02492127 0.008422235
## 20130228 0.0039858683 0.047067088 0.01288572 0.01275885 0.005237826
## 20130328 0.0009022828 0.006634722 0.01305393 0.03337511 0.026615970
## 20130430 0.0073150908 0.038081289 0.05018650 0.01921236 0.005349794
## 20130531 0.0217859064 0.015607156 0.03019051 0.02354709 0.042434166
## 20130628 0.0174136193 0.028228925 0.04611287 0.01847773 0.031675393
Take a quick look at the monthly returns above, to make sure things appear to be in order.
Now, on to constructing a portfolio and calculating volatility. To turn these five ETFs into a portfolio, we need to assign them weights. Let’s first create a weights vector.
weights < c(0.10, 0.10, 0.15, 0.40, 0.15)
Before we use the weights in our calculations, we perform a quick sanity check in the next code chunk. This might not be necessary with five assets as we have today, but it is good practice because if we had 50 assets, it could save us a lot of grief to catch a mistake early.
# Make sure the weights line up with assets.
asset_weights_sanity_check < tibble(weights, symbols)
asset_weights_sanity_check
## # A tibble: 5 x 2
## weights symbols
##
## 1 0.10 AGG
## 2 0.10 DBC
## 3 0.15 EFA
## 4 0.40 SPY
## 5 0.15 VGT
Alright, now on to the fun part, wherein we use the textbook equation for the standard deviation of a multiasset portfolio.
 First, we assign the weights of each asset.
 Then, we isolate and assign returns of each asset.
 Next, we plug those weights and returns into the equation for portfolio standard deviation, which involves the following:
 Take the weight squared of each asset times its variance, and sum those weighted variance terms.
 Then we take the covariance of each asset pair, multiplied by two times the weight of the first asset times the weight of the second asset.
 Sum together the covariance terms and the weighted variance terms. This gives us the portfolio variance.
 Then take the square root to get the standard deviation.
# This code chunk is intentionally verbose, repetitive, and inefficient
# to emphasize how to break down volatility and grind through the equation.
# Let's assign each asset a weight from our weights vector above.
w_asset1 < weights[1]
w_asset2 < weights[2]
w_asset3 < weights[3]
w_asset4 < weights[4]
w_asset5 < weights[5]
# And each asset has a return as well, stored in our
# portfolio_component_monthly_returns_xts object.
asset1 < portfolio_component_monthly_returns_xts[,1]
asset2 < portfolio_component_monthly_returns_xts[,2]
asset3 < portfolio_component_monthly_returns_xts[,3]
asset4 < portfolio_component_monthly_returns_xts[,4]
asset5 < portfolio_component_monthly_returns_xts[,5]
# I am going to label this 'sd_by_hand' to distinguish it from the matrix algebra we use later,
# and a builtin function for the same operation.
sd_by_hand <
# Important, don't forget to take the square root!
sqrt(
# Our weighted variance terms.
(w_asset1^2 * var(asset1)) + (w_asset2^2 * var(asset2)) + (w_asset3^2 * var(asset3)) +
(w_asset4^2 * var(asset4)) + (w_asset5^2 * var(asset5)) +
# Our weighted covariance terms
(2 * w_asset1 * w_asset2 * cov(asset1, asset2)) +
(2 * w_asset1 * w_asset3 * cov(asset1, asset3)) +
(2 * w_asset1 * w_asset4 * cov(asset1, asset4)) +
(2 * w_asset1 * w_asset5 * cov(asset1, asset5)) +
(2 * w_asset2 * w_asset3 * cov(asset2, asset3)) +
(2 * w_asset2 * w_asset4 * cov(asset2, asset4)) +
(2 * w_asset2 * w_asset5 * cov(asset2, asset5)) +
(2 * w_asset3 * w_asset4 * cov(asset3, asset4)) +
(2 * w_asset3 * w_asset5 * cov(asset3, asset5)) +
(2 * w_asset4 * w_asset5 * cov(asset4, asset5))
)
# I want to print the percentage, so multiply by 100 and round.
sd_by_hand_percent < round(sd_by_hand * 100, 2)
Okay, writing that equation out was painful and very copy/pasty, but at least we won’t be forgetting it any time soon. Our result is a monthly portfolio returns standard deviation of 2.22%.
Now, let’s turn to the less verbose matrix algebra path and confirm that we get the same result.
First, we will build a covariance matrix of returns using the cov()
function.
# Build the covariance matrix.
covariance_matrix < cov(portfolio_component_monthly_returns_xts)
covariance_matrix
## AGG DBC EFA SPY VGT
## AGG 8.185794e05 0.0000645834 6.046268e05 1.370992e06 1.412557e06
## DBC 6.458340e05 0.0018089475 4.198803e04 2.931955e04 1.901253e04
## EFA 6.046268e05 0.0004198803 1.251623e03 7.809411e04 9.529822e04
## SPY 1.370992e06 0.0002931955 7.809411e04 8.119639e04 9.003360e04
## VGT 1.412557e06 0.0001901253 9.529822e04 9.003360e04 1.329909e03
Have a look at the covariance matrix.
AGG, the US bond ETF, has a negative covariance with the other ETFs (besides EFA – something to note), and it should make a nice volatility dampener. Interestingly, the covariance between DBC, a commodities ETF, and VGT is quite low, as well. Our painstakingly writtenout equation above is a good reminder of how low covariances affect total portfolio standard deviation.
Back to our calculation: let’s take the square root of the transpose of the weights vector times the covariance matrix times the weights vector. To perform matrix multiplication, we use %*%
.
# If we wrote out the matrix multiplication, we would get the original byhand equation.
sd_matrix_algebra < sqrt(t(weights) %*% covariance_matrix %*% weights)
# I want to print out the percentage, so I'll multiply by 100 and round.
sd_matrix_algebra_percent < round(sd_matrix_algebra * 100, 2)
The byhand calculation is 2.22% and the matrix algebra calculation is 2.22%. Thankfully, these return the same result, so we don’t have to sort through the byhand equation again.
Finally, we can use the builtin StdDev()
function from the performanceAnalytics
package. It takes two arguments: returns and weights.
# Confirm portfolio volatility
portfolio_sd < StdDev(portfolio_component_monthly_returns_xts, weights = weights)
# I want to print out the percentage, so I'll multiply by 100 and round.
portfolio_sd_percent < round(portfolio_sd * 100, 2)
We now have:
 byhand calculation = 2.22%
 matrix algebra calculation = 2.22%
 builtin function calculation = 2.22%
Huzzah! That was quite a lot of work to confirm that the results of three calculations are equal to each other, but there are a few benefits.
First, while it was tedious, we should all be pretty comfortable with calculating portfolio standard deviations in various ways. That might never be useful to us, until the day that for some reason it is (e.g., if during an interview someone asks you to go to a whiteboard and write down the code for standard deviation or whatever equation/model – I think that’s still a thing in interviews).
More importantly, as our work gets more complicated and we build custom functions, we’ll want to rely on the builtin StdDev
function, and we now have confidence in its accuracy. That’s nice, but even more important is now that we have the template above, we can reuse it for other portfolios.
Also, as usual, this is more of a toy example than an actual template for use in industry. If a team relies heavily on prebuilt functions, even those built by the team itself, it’s not a bad idea to have a grinditout sanity check Notebook like this one. It reminds team members what a prebuilt function might be doing underthehood.
Now, let’s turn to a little bit of portfolio theory (or, why we want to build a portfolio instead of putting all of our money into SPY). We believe that by building a portfolio of assets whose covariances of returns are lower than the variance of SPY returns (or, equivalently, lower than the covariance of SPY returns with themselves), we can construct a portfolio whose standard deviation is lower than the standard deviation of SPY. If we believe that standard deviation and volatility are a good proxy for risk, then the portfolio would have a lower risk.
To see if we succeeded, first, isolate the returns of SPY, then find the standard deviation of those returns.
# First get the returns of the S&P 500 isolated
spy_returns < portfolio_component_monthly_returns_xts$SPY
# Now calculated standard deviation
spy_sd < StdDev(spy_returns)
# To confirm the variance of SPY's returns is equal to
# the covariance of SPY's returns with themselves,
# uncomment and run the next two lines of code.
# spy_var < var(spy_returns)
# spy_cov < cov(spy_returns, spy_returns)
# We could also have extracted this value from the SPY column and SPY row of covariance matrix,
# since the covariance of SPY with itself is equal to its variance.
# spy_sd_from_cov_matrix < sqrt(covariance_matrix[4,4])
# Again, I want percent so will multiply by 100 and round.
spy_sd_percent < round(spy_sd * 100, 2)
The standard deviation of monthly SPY returns is 2.85% and that of the portfolio is 2.22%.
Fantastic, our portfolio has lower monthly volatility!
Alright, despite the fact that we have completely ignored returns, we can see the volatility benefits of assets with low or even negative covariances. That’s all for today’s introduction to volatility. Next time, we will move to visualizing these differences in a Notebook, before heading to the world of Shiny.
Rbloggers.com offers daily email updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...