[This article was first published on Digital Age Economist on Digital Age Economist, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

## Packages used in this post

Disclaimer: I am no financial advisor, have never been and you should not take any of this analysis as investment advice. These thoughts are my own, please dont mail me about your money strategies/problems. I enjoy numbers, scraping and data analysis and that is wat this post is about. Also, do you really want to trust someone who wrote this post at 2am… no you dont

library(tidyverse)
library(glue)
library(rvest)
library(gridExtra)
library(ggpubr)
library(scales)
library(ggthemes)

theme_set(theme_economist())

# Why even do this?

Well, I recently decided to do a quick overview of some of the funds I am invested in as part of a retirement annuity. Its been almost 4 years since I started the RA and thought I would just view the fund’s performance, its asset composition and the fees that have been charged over the period.

If you do not know what an RA is and why you would need one, you can think of it as a way to get back at the tax man while investing over a long period. In South Africa certain funds are what are called Regulation 28 compliant, which means, any money you put into such a fund is tax-exempt – but you just have to leave it there till you retire, which isn’t actually a bad thing thanks the miracle of compounded growth. I think having an RA is a cornerstone of any good financial planning. So enough of sounding like your parents.

OK, so at this point in time, you probably like this:

Well, actually you are kind of right of thinking this. If you invest in a retirement annuity on a regular basis and over the contract period you should have some money for retirement, in theory. However, the market for funds are huge and which one do you pick? Financial advisers could give you good advice, but its also in their interest to invest you in multiple complex funds, as they get management fees in return1.

The easiest way to get a grips with the cost involved when investing in a fund is what is called Total Expense Ratios. These are important, because if the fund is costing you 3.5%, annualised growth of fund is 8% and inflation is 6%, then you not gaining of the advantage of compounded growth, but actually losing -1.5% a year in real terms. An example of a TER is as follows:

Investopedia gives a good explanation of these fees

The total expense ratio (TER) is a measure of the total costs associated with managing and operating an investment fund, such as a mutual fund. These costs consist primarily of management fees and additional expenses, such as trading fees, legal fees, auditor fees and other operational expenses. The total cost of the fund is divided by the fund’s total assets to arrive at a percentage amount, which represents the TER, most often referred to as simply “expense ratio”

So what I set out to do is look at the TERs for South African funds and then compare it with their performance of the last 5 years. To collect the data I scraped all the information from Fundsdata.co.za as they have a wide collection of different funds and categories

I used rvest and combined it with a bit of purrr magic that got me the results I wanted. The data was a bit dirty so I just cleaned one or two things before starting the analysis.

main_pg <- read_html("http://www.fundsdata.co.za/navs/default.htm")

pg_funds <- main_pg %>%
html_nodes("tr.pagelink") %>%
html_nodes("a") %>%
html_text()

pg_links <- main_pg %>%
html_nodes("tr.pagelink") %>%
html_nodes("a") %>%
html_attr("href")

all_funds <- tibble(funds = pg_funds, link = glue("http://www.fundsdata.co.za/navs/{pg_links}"))

link <- "http://www.fundsdata.co.za/navs/ZEGN.htm"
collect_ter <- possibly(function(link){

cat(glue("Now collecting {link}"), "\n")

pg <- read_html(link)

pg_tbl <- html_table(pg, fill = T)[[1]][, -20]

x <- pg_tbl %>%
slice(-c(1:2)) %>%
set_names(make.names(pg_tbl[2,], unique = T)) %>%
tbl_df %>%
slice(-c(n():(n()-3)))
Sys.sleep(10)
x
}, NULL)

all_funds <- all_funds %>%
filter(grepl("South Africa", funds)) %>%
mutate(info = map(link, collect_ter))

#saveRDS(all_funds, "all_funds.rds")

After the collection is done, you have a neat tibble that looks like this:

## # A tibble: 6 x 3
##   funds                                            link           info
##   <chr>                                            <S3: glue>     <list>
## 1 South African - Equity - General Funds           http://www.fu~ <tibble~
## 2 South African - Equity - Industrial Funds        http://www.fu~ <tibble~
## 3 South African - Equity - Large Cap Funds         http://www.fu~ <tibble~
## 4 South African - Equity - Mid and Small Cap Funds http://www.fu~ <tibble~
## 5 South African - Equity - Unclassified Funds      http://www.fu~ <tibble~
## 6 South African - Equity - Resource Funds          http://www.fu~ <tibble~

Next I wanted to calculate the annualised rate of return of the each of the funds just to get an idea of what kind of returns are out there. The data only goes back 5 years, so I am going to work from that.

The formula for CAGR (compounded annual growth rate) is simply:

$$CAGR = (\frac{F}{P})^{1/n} - 1$$

Where:

• F is future value
• P is present value
• n is number of years

Once I have the annualised return, I add one or two extra columns of information. I calculate a sharpe ratio as a measure of risk adjusted returns (assume 5% risk free rate), I also adjust the cagr for inflation2. The website was also kind enough to add an asterisk(*) to the sheet indicating funds which are eligible to the tax break aka Reg28 compliant.

all_funds <- readRDS("data/all_funds.rds")
annual_ret <- function(info){
info %>%
select(Fund.Name, TER, X5.yearsCash.Value, Volatility.Ann...) %>%
set_names(c("fund_name", "TER", "cash_value_5yrs", "volatility")) %>%
mutate_at(vars(-one_of("fund_name")), as.numeric) %>%
mutate(
# growth rate
annualised_ret = ((cash_value_5yrs/100)^(1/5)-1)* 100,
# risk adjusted return
risk_adj_cagr = annualised_ret * (1 - volatility),
# sharpe ration for risk adjusted return
sharpe = (annualised_ret - 5)/volatility,
# inflation adjustment
inflation_adj_cagr = annualised_ret - 5.5) %>%
mutate(reg_28_compliant = grepl("[*]", fund_name))
}

all_funds_ret <- all_funds %>%
mutate(returns = map(info, annual_ret)) %>%
select(funds, returns) %>%
unnest %>%
arrange(-annualised_ret) %>%
filter(!is.na(TER))

all_funds_ret
## # A tibble: 1,182 x 10
##    funds     fund_name       TER cash_value_5yrs volatility annualised_ret
##    <chr>     <chr>         <dbl>           <dbl>      <dbl>          <dbl>
##  1 South Af~ Absa Propert~ 1.72             175.      19.1           11.9
##  2 South Af~ Long Beach F~ 1.53             167.      14.9           10.9
##  3 South Af~ Nedgroup Inv~ 1.36             166.      10.7           10.7
##  4 South Af~ Nedgroup Inv~ 1.93             162.      10.7           10.1
##  5 South Af~ Momentum Fin~ 1.46             161.      11.1            9.94
##  6 South Af~ Fairtree Fle~ 0.900            160.       3.08           9.82
##  7 South Af~ Satrix FINI ~ 0.410            159.      14.1            9.78
##  8 South Af~ Fairtree Equ~ 1.16             158.      12.7            9.58
##  9 South Af~ *Pan-African~ 0.770            158.       3.80           9.51
## 10 South Af~ Sesfikile BC~ 1.28             156.      11.5            9.37
## # ... with 1,172 more rows, and 4 more variables: risk_adj_cagr <dbl>,
## #   sharpe <dbl>, inflation_adj_cagr <dbl>, reg_28_compliant <lgl>

Now that we have all the pieces, lets see what the average TER is for funds in South Africa.

OK doki, so what we see is that on average, the cost of the the non Reg28 funds are cheaper, with the median TER sitting around 1.21%. This means that we will need to choose quite carefully when we decide on which fund to invest in.

Next I wanted to see what the distribution of returns look like between the Reg28 and non-Reg28 funds3.

Its nice to see that although the fees are higher for the Reg28 funds, the returns also seem higher with a bi-modal distribution. This could be a something like a sector effect. I also wanted to check whether there was some statistical significance to this. So I conduct a wilcoxon test. A wilcoxon is a non-parametric test to check whether the true location shift is equal to 0. Here we reject the hypothesis and confirm that Reg28 do indeed have higher returns.

At what cost is the higher returns? Are the Reg28 funds more risky or risk-averse? To answer this question, lets turn to our sharpe ratio. Anything about 1 is considered good and anything about 2 we starting to see some really good returns on low risk investments. There was some outliers, so I removed all more than 3.5 as to focus on the meat of data.

Its by a small margin, but the Reg28 funds seem to be delivering higher return with a lower risk profile.

The last bit of analysis I did was to have a look at the inflation adjusted returns over the last 5 years. The market took quite a beating over the last year, mostly due to some of our presidential factors and also because of some “other” presidential factors in the developed world. So lets look at how the different market sectors did over the last 5 years.

Financial equities did well along with some interest bearing funds. This is not completely out of place. Five years can be considered a short window of investment. Our market equity market is also high concentrated with Naspers driving a lot of the growth in recent years through their investment in a company called Tencent in China.

# Conclusion

This write up was partly to get a grip on how my own RA is doing amid the sea of other options. I also wanted to get an idea of what a reasonable TER is and what a good return on the higher TERs are - i.e are the complex funds delivering higher returns? It would seem so, but even the best funds haven’t beaten the inflation trackers. This is basically because of the short period I am looking at as well as the fact that 2018 sucked for investments in SA (and globally)

The other part of why I wrote this blog, was because I wanted to explore the ggpubr package. Really enjoy some of the features that it has when it comes to quick reporting and visualisation.

1. Again, I am not saying that financial advisers are the devil, I am exaggerating for effect. I have a financial adviser

2. Doing a quick check, in SA the average inflation over the past 5 years was ~5.5%

3. From here on in I used the ggpubr package quite a lot. Really enjoyed it. Do wish it had NSE though

To leave a comment for the author, please follow the link and comment on their blog: Digital Age Economist on Digital Age Economist.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts.(You will not see this message again.)

Click here to close (This popup will not appear again)