# Calculating Beta in the Capital Asset Pricing Model

**R Views**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Today we will continue our portfolio fun by calculating the CAPM beta of our portfolio returns. That will entail fitting a linear model and, when we get to visualization next time, considering the meaning of our results from the perspective of asset returns.

By way of brief background, the Capital Asset Pricing Model (CAPM) is a model, created by William Sharpe, that estimates the return of an asset based on the return of the market and the asset’s linear relationship to the return of the market. That linear relationship is the stock’s beta coefficient, or just good ol’ beta.

CAPM was introduced back in 1964, garnered a Nobel for its creator, and, like many ephocally important theories, has been widely used, updated, criticized, debunked, revived, re-debunked, etc. Fama and French have written that CAPM “is the centerpiece of MBA investment courses. Indeed, it is often the only asset pricing model taught in these courses…[u]nfortunately, the empirical record of the model is poor.”^{1}

With that, we will forge ahead with our analysis because calculating CAPM betas can serve as a nice template for more complex models in a team’s work and sometimes it’s a good idea to start with a simple model, even if it hasn’t stood up to empirical rigor. Plus, it might have been questioned by future research, but it’s still an iconic model that we should learn and love.

We are going to focus on one particular aspect of CAPM: beta. Beta, as we noted above, is the beta coefficient of an asset that results from regressing the returns of that asset on market returns. It captures the linear relationship between the asset/portfolio and the market. For our purposes, it’s a good vehicle for exploring reproducible flows for modeling or regressing our portfolio returns on the market returns. Even if your team dislikes CAPM in favor of more nuanced models, these code flows can serve as a good base for the building of those more complex models.

We are going to be calculating beta in several ways: by-hand (for illustrative purposes), in the `xts`

world with `PerformanceAnalytics`

, in the tidyverse with `dplyr`

, and in the `tidyquant`

world. These seem to be the most popular paradigms for doing financial time series work, and even within a team there can be differing preferences. I don’t think everyone needs to grind through their work using each paradigm, but I do think it’s helpful to be fluent, or, at least, conversant, in the various worlds. If you’re a tidyverse type of person but need to collaborate with an `xts`

or `tidyquant`

enthusiast, it will help if each of you is familiar with the three universes (though at some point ya just have to choose a code flow and get stuff done).

We will be working with and calculating beta for our usual portfolio consisting of:

+ SPY (S&P500 fund) weighted 25% + EFA (a non-US equities fund) weighted 25% + IJS (a small-cap value fund) weighted 20% + EEM (an emerging-mkts fund) weighted 20% + AGG (a bond fund) weighted 10%

Before we can calculate beta for that portfolio, we need to find portfolio monthly returns, which was covered in this post.

I won’t go through the logic again but the code is here:

library(tidyquant) library(tidyverse) library(timetk) library(tibbletime) library(broom) symbols <- c("SPY","EFA", "IJS", "EEM","AGG") prices <- getSymbols(symbols, src = 'yahoo', from = "2013-01-01", to = "2017-12-31", auto.assign = TRUE, warnings = FALSE) %>% map(~Ad(get(.))) %>% reduce(merge) %>% `colnames<-`(symbols) prices_monthly <- to.monthly(prices, indexAt = "last", OHLC = FALSE) asset_returns_xts <- na.omit(Return.calculate(prices_monthly, method = "log")) w <- c(0.25, 0.25, 0.20, 0.20, 0.10) portfolio_returns_xts_rebalanced_monthly <- Return.portfolio(asset_returns_xts, weights = w, rebalance_on = "months") %>% `colnames<-`("returns") asset_returns_long <- prices %>% to.monthly(indexAt = "last", OHLC = FALSE) %>% tk_tbl(preserve_index = TRUE, rename_index = "date") %>% gather(asset, returns, -date) %>% group_by(asset) %>% mutate(returns = (log(returns) - log(lag(returns)))) %>% na.omit() portfolio_returns_tq_rebalanced_monthly <- asset_returns_long %>% tq_portfolio(assets_col = asset, returns_col = returns, weights = w, col_rename = "returns", rebalance_on = "months")

We will be working with two objects of portfolio returns and one object of our individual asset returns:

+ portfolio_returns_xts_rebalanced_monthly (an xts of monthly returns) + portfolio_returns_tq_rebalanced_monthly (a tibble of monthly returns) + asset_returns_long (a tidy tibble of monthly returns for those 5 assets above)

Let’s get to it.

### CAPM and Market Returns

Our first step is to make a choice about which asset to use as a proxy for the market return, and we will go with the SPY ETF, effectively treating the S&P 500 as the market. That’s going to make our calculations substantively uninteresting because (1) SPY is 25% of our portfolio and (2) we have chosen assets and a time period (2013 – 2017) in which correlations with SPY have been high. It will offer one benefit in the way of a sanity check, which I’ll note below. With those caveats in mind, feel free to choose a different asset for the market return and try to reproduce this work, or construct a different portfolio that does not include SPY.

Let’s calculate our market return for SPY and save it as `market_return_xts`

. Note the start date is “2013-01-01” and the end date is “2017-12-31”, so we will be working with five years of returns.

spy_monthly_xts <- getSymbols("SPY", src = 'yahoo', from = "2013-01-01", to = "2017-12-31", auto.assign = TRUE, warnings = FALSE) %>% map(~Ad(get(.))) %>% reduce(merge) %>% `colnames<-`("SPY") %>% to.monthly(indexAt = "last", OHLC = FALSE) market_returns_xts <- Return.calculate(spy_monthly_xts, method = "log") %>% na.omit()

We will also want a `data.frame`

object of market returns, and will convert the `xts`

object using `tk_tbl(preserve_index = TRUE, rename_index = "date")`

from the `timetk`

package.

market_returns_tidy <- market_returns_xts %>% tk_tbl(preserve_index = TRUE, rename_index = "date") %>% na.omit() %>% select(date, returns = SPY) head(market_returns_tidy) ## # A tibble: 6 x 2 ## date returns ## <date> <dbl> ## 1 2013-02-28 0.01267837 ## 2 2013-03-28 0.03726809 ## 3 2013-04-30 0.01903021 ## 4 2013-05-31 0.02333503 ## 5 2013-06-28 -0.01343411 ## 6 2013-07-31 0.05038580

We have a `market_returns_tidy`

object. Let’s make sure it’s periodicity aligns perfectly with our portfolio returns periodicity

portfolio_returns_tq_rebalanced_monthly %>% mutate(market_returns = market_returns_tidy$returns) %>% head() ## # A tibble: 6 x 3 ## date returns market_returns ## <date> <dbl> <dbl> ## 1 2013-02-28 -0.0008696129 0.01267837 ## 2 2013-03-28 0.0186624381 0.03726809 ## 3 2013-04-30 0.0206248856 0.01903021 ## 4 2013-05-31 -0.0053529694 0.02333503 ## 5 2013-06-28 -0.0229487618 -0.01343411 ## 6 2013-07-31 0.0411705818 0.05038580

Note that if the periodicities did not align, `mutate()`

would have thrown an error in the code chunk above.

### Calculating CAPM Beta

There are several R code flows to calculate portfolio beta but first let’s have a look at the equation.

$${\beta}_{portfolio} = cov(R_p, R_m)/\sigma_m $$

\[{\beta}_{portfolio} = cov(R_p, R_m)/\sigma_m \]

Portfolio beta is equal to the covariance of the portfolio returns and market returns, divided by the variance of market returns.

We can calculate the numerator, or covariance of portfolio and market returns, with `cov(portfolio_returns_xts_rebalanced_monthly, market_returns_tidy$returns)`

and the denominator with `var(market_return$returns)`

.

Our portfolio beta is equal to:

cov(portfolio_returns_xts_rebalanced_monthly,market_returns_tidy$returns)/var(market_returns_tidy$returns) ## [,1] ## returns 0.9010689

That beta is quite near to 1 as we were expecting – after all, SPY is a big part of this portfolio.

We can also calculate portfolio beta by finding the beta of each of our assets and then multiplying by asset weights. That is, another equation for portfolio beta is the weighted sum of the asset betas:

$${\beta}_{portfolio} ={\sum_{i=1}^n}W _i~{\beta}_i $$

\[{\beta}_{portfolio} ={\sum_{i=1}^n}W _i~{\beta}_i \]

To use that method with R, we first find the beta for each of our assets, and this gives us an opportunity to introduce a code flow for running regression analysis.

We need to regress each of our individual asset returns on the market return. We could do that for asset 1 with `lm(asset_return_1 ~ market_returns_tidy$returns)`

, and then again for asset 2 with `lm(asset_return_2 ~ market_returns_tidy$returns)`

, etc. for all five of our assets. But if we had a 50-asset portfolio, that would be impractical. Instead let’s write a code flow and use `map()`

to regress all of our assets and calculate betas with one call.

We will start with our `asset_returns_long`

tidy data frame and will then run `nest(-asset)`

.

beta_assets <- asset_returns_long %>% na.omit() %>% nest(-asset) beta_assets ## # A tibble: 5 x 2 ## asset data ## <chr> <list> ## 1 SPY <tibble [59 x 2]> ## 2 EFA <tibble [59 x 2]> ## 3 IJS <tibble [59 x 2]> ## 4 EEM <tibble [59 x 2]> ## 5 AGG <tibble [59 x 2]>

That `nest(-asset)`

changed our data frame so that there are two columns: one called `asset`

that holds our asset name and one called `data`

that holds a list of returns for each asset. We have now ‘nested’ a list of returns within a column.

Now we can use `map()`

to apply a function to each of those nested lists and store the results in a new column via the `mutate()`

function. The whole piped command is `mutate(model = map(data, ~ lm(returns ~ market_returns_tidy$returns, data = .)))`

beta_assets <- asset_returns_long %>% na.omit() %>% nest(-asset) %>% mutate(model = map(data, ~ lm(returns ~ market_returns_tidy$returns, data = .))) beta_assets ## # A tibble: 5 x 3 ## asset data model ## <chr> <list> <list> ## 1 SPY <tibble [59 x 2]> <S3: lm> ## 2 EFA <tibble [59 x 2]> <S3: lm> ## 3 IJS <tibble [59 x 2]> <S3: lm> ## 4 EEM <tibble [59 x 2]> <S3: lm> ## 5 AGG <tibble [59 x 2]> <S3: lm>

We now have three columns: `asset`

which we had before, `data`

which we had before, and `model`

which we just added. The `model`

column holds the results of the regression `lm(returns ~ market_returns_tidy$returns, data = .)`

that we ran for each of our assets. Those results are a beta and an intercept for each of our assets, but they are not in a great format for presentation to others, or even readability by ourselves.

Let’s tidy up our results with the `tidy()`

function from the `broom`

package. We want to apply that function to our model column and will use the `mutate()`

and `map()`

combination again. The complete call is to `mutate(model = map(model, tidy))`

.

beta_assets <- asset_returns_long %>% na.omit() %>% nest(-asset) %>% mutate(model = map(data, ~ lm(returns ~ market_returns_tidy$returns, data = .))) %>% mutate(model = map(model, tidy)) beta_assets ## # A tibble: 5 x 3 ## asset data model ## <chr> <list> <list> ## 1 SPY <tibble [59 x 2]> <data.frame [2 x 5]> ## 2 EFA <tibble [59 x 2]> <data.frame [2 x 5]> ## 3 IJS <tibble [59 x 2]> <data.frame [2 x 5]> ## 4 EEM <tibble [59 x 2]> <data.frame [2 x 5]> ## 5 AGG <tibble [59 x 2]> <data.frame [2 x 5]>

We are getting close now, but the `model`

column holds nested data frames. Have a look and see that they are nicely formatted data frames:

beta_assets$model ## [[1]] ## term estimate std.error statistic ## 1 (Intercept) 1.806734e-18 1.136381e-18 1.589902e+00 ## 2 market_returns_tidy$returns 1.000000e+00 3.899949e-17 2.564136e+16 ## p.value ## 1 0.1173886 ## 2 0.0000000 ## ## [[2]] ## term estimate std.error statistic ## 1 (Intercept) -0.005427739 0.002908978 -1.865858 ## 2 market_returns_tidy$returns 0.945476441 0.099833320 9.470550 ## p.value ## 1 6.720983e-02 ## 2 2.656258e-13 ## ## [[3]] ## term estimate std.error statistic ## 1 (Intercept) -0.001693293 0.003639218 -0.4652905 ## 2 market_returns_tidy$returns 1.120583127 0.124894444 8.9722416 ## p.value ## 1 6.434963e-01 ## 2 1.713903e-12 ## ## [[4]] ## term estimate std.error statistic ## 1 (Intercept) -0.00811518 0.004785237 -1.695878 ## 2 market_returns_tidy$returns 0.95562574 0.164224722 5.819013 ## p.value ## 1 9.536495e-02 ## 2 2.841106e-07 ## ## [[5]] ## term estimate std.error statistic ## 1 (Intercept) 0.001888304 0.001230331 1.5347933 ## 2 market_returns_tidy$returns -0.005419543 0.042223776 -0.1283529 ## p.value ## 1 0.1303671 ## 2 0.8983215

Still, I don’t like to end up with nested data frames, so let’s `unnest()`

that `model`

column.

beta_assets <- asset_returns_long %>% na.omit() %>% nest(-asset) %>% mutate(model = map(data, ~ lm(returns ~ market_returns_tidy$returns, data = .))) %>% mutate(model = map(model, tidy)) %>% unnest(model) beta_assets ## # A tibble: 10 x 6 ## asset term estimate std.error ## <chr> <chr> <dbl> <dbl> ## 1 SPY (Intercept) 1.806734e-18 1.136381e-18 ## 2 SPY market_returns_tidy$returns 1.000000e+00 3.899949e-17 ## 3 EFA (Intercept) -5.427739e-03 2.908978e-03 ## 4 EFA market_returns_tidy$returns 9.454764e-01 9.983332e-02 ## 5 IJS (Intercept) -1.693293e-03 3.639218e-03 ## 6 IJS market_returns_tidy$returns 1.120583e+00 1.248944e-01 ## 7 EEM (Intercept) -8.115180e-03 4.785237e-03 ## 8 EEM market_returns_tidy$returns 9.556257e-01 1.642247e-01 ## 9 AGG (Intercept) 1.888304e-03 1.230331e-03 ## 10 AGG market_returns_tidy$returns -5.419543e-03 4.222378e-02 ## # ... with 2 more variables: statistic <dbl>, p.value <dbl>

Now that looks human-readable and presentable. We will do one further cleanup and get rid of the intercept results, since we are isolating the betas.

beta_assets <- asset_returns_long %>% na.omit() %>% nest(-asset) %>% mutate(model = map(data, ~ lm(returns ~ market_returns_tidy$returns, data = .))) %>% unnest(model %>% map(tidy)) %>% filter(term == "market_returns_tidy$returns") %>% select(-term) beta_assets ## # A tibble: 5 x 5 ## asset estimate std.error statistic p.value ## <chr> <dbl> <dbl> <dbl> <dbl> ## 1 SPY 1.000000000 3.899949e-17 2.564136e+16 0.000000e+00 ## 2 EFA 0.945476441 9.983332e-02 9.470550e+00 2.656258e-13 ## 3 IJS 1.120583127 1.248944e-01 8.972242e+00 1.713903e-12 ## 4 EEM 0.955625743 1.642247e-01 5.819013e+00 2.841106e-07 ## 5 AGG -0.005419543 4.222378e-02 -1.283529e-01 8.983215e-01

A quick sanity check on those asset betas should reveal that SPY has beta of 1 with itself.

beta_assets %>% select(asset, estimate) %>% filter(asset == "SPY") ## # A tibble: 1 x 2 ## asset estimate ## <chr> <dbl> ## 1 SPY 1

Now let’s see how our combination of these assets leads to a portfolio beta.

Let’s assign portfolio weights as we chose above.

w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

Now we can use those weights to get our portfolio beta, based on the betas of the individual assets.

beta_byhand <- w[1] * beta_assets$estimate[1] + w[2] * beta_assets$estimate[2] + w[3] * beta_assets$estimate[3] + w[4] * beta_assets$estimate[4] + w[5] * beta_assets$estimate[5] beta_byhand ## [1] 0.9010689

That beta is the same as we calculated above using the covariance/variance method, and now we know the the covariance of portfolio returns and market returns divided by the variance of market returns is equal to the weighted estimates we got by regressing each asset’s return on market returns.

### Calculating CAPM Beta in the `xts`

World

We can make things even more efficient, of course, with built-in functions. Let’s go to the `xts`

world and use the built-in `CAPM.beta()`

function from `PerformanceAnalytics`

. That function takes two arguments: the returns for the portfolio (or any asset) whose beta we wish to calculate, and the market returns. Our function will look like `CAPM.beta(portfolio_returns_xts_rebalanced_monthly, mkt_return_xts)`

.

beta_builtin_xts <- CAPM.beta(portfolio_returns_xts_rebalanced_monthly, market_returns_xts) beta_builtin_xts ## [1] 0.9010689

### Calculating CAPM Beta in the Tidyverse

We will run that same function through a `dplyr`

and `tidyquant`

code flow to stay in the tidy world.

First we’ll use `dplyr`

to grab our portfolio beta. We’ll return to this flow later for some visualization, but for now will extract the portfolio beta.

To calculate the beta, we call `do(model = lm(returns ~ market_returns_tidy$returns, data = .))`

. Then we head back to the `broom`

package and use the `tidy()`

function to make our model results a little easier on the eyes.

beta_dplyr_byhand <- portfolio_returns_tq_rebalanced_monthly %>% do(model = lm(returns ~ market_returns_tidy$returns, data = .)) %>% tidy(model) %>% mutate(term = c("alpha", "beta")) beta_dplyr_byhand ## term estimate std.error statistic p.value ## 1 alpha -0.003129799 0.00155617 -2.011219 4.903980e-02 ## 2 beta 0.901068930 0.05340627 16.871969 7.855042e-24

### Calculating CAPM Beta in the Tidyquant World

Let’s use one more flow with built-in functions, this time using `tidyquant`

and the `tq_performance()`

function. This will allow us to apply the `CAPM.beta()`

function from `PerformanceAnalytics`

to a data frame.

beta_builtin_tq <- portfolio_returns_tq_rebalanced_monthly %>% mutate(market_return = market_returns_tidy$returns) %>% na.omit() %>% tq_performance(Ra = returns, Rb = market_return, performance_fun = CAPM.beta) %>% `colnames<-`("beta_tq")

Let’s take a quick look at our four beta calculations.

beta_byhand ## [1] 0.9010689 beta_builtin_xts ## [1] 0.9010689 beta_dplyr_byhand$estimate[2] ## [1] 0.9010689 beta_builtin_tq$beta_tq ## [1] 0.9010689

Consistent results and a beta near 1 as we were expecting, since our portfolio has a 25% allocation to the S&P 500. We’re less concerned with numbers than we are with the various code flows used to get here. Next time we’ll do some visualizing - see you then!

The Capital Asset Pricing Model: Theory and Evidence Eugene F. Fama and Kenneth R. French, The Capital Asset Pricing Model: Theory and Evidence, The Journal of Economic Perspectives, Vol. 18, No. 3 (Summer, 2004), pp. 25-46↩

**leave a comment**for the author, please follow the link and comment on their blog:

**R Views**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.