[This article was first published on logopt: a journey in R, finance and open source, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

We extend the analysis of part 7 by calculating the final wealth for all tuples of 3 and 4 stocks, this is a simple extension but it also shows the most important problem of the Universal portfolio algorithm, its exponential complexity in the number of stocks.

The first part shows the cumulative distribution of the absolute final wealth for all possible tuples of 2, 3 and 4 stocks.  The results are qualitatively the same as before, a thin tail of good results.  It also a shows increase in absolute wealth as the number of stocks increase.

The code remains quite compact, R is simply wonderful for this type of analysis.

library(logopt)
x <- coredata(nyse.cover.1962.1984)
w <- logopt:::x2w(x)
nDays <- dim(x)
nStocks <- dim(x)
Days <- 1:nDays
iWin <- 1 ; plot(1:10)

FinalWealth <- function(cols) {
xij <- x[,cols]
uc <- universal.cover(xij, 20)
return(uc[length(uc)])
}

TupleSizes <- c(2,3,4)
if (exists(“lws”) == FALSE) {
lws <- list()
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
ws <- combn(1:nStocks, TupleSize, FinalWealth)
lines(ecdf(ws), pch=”.”, col=Colors[i])
lws[[i]] <- ws
}
}

# show the ecdf
if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;
plot(ecdf(lws[]), pch=”.”, col=”red”, main=”Cumulative probability of final wealth”,
xlab=”Final wealth”, ylab=”Cumulative probability”)
lines(ecdf(lws[]), pch=”.”, col=”green”)
lines(ecdf(lws[]), pch=”.”, col=”blue”)
legend(“bottomright”, legend=c(“2 stocks”,”3 stocks”,”4 stocks”), fill=c(“blue”,”green”,”red”))
grid()

# show best absolute wealth and its composition
for (i in 1:length(TupleSizes)) {
TupleSize <- TupleSizes[i]
BestTuple <- which.max(lws[[i]])
BestStocks <- combn(1:nStocks, TupleSize)[,BestTuple]
cat(sprintf(“Max final wealth %.4f for stocks: “, lws[[i]][BestTuple]))
cat(colnames(x)[BestStocks]) ; cat(“\n”)
}

This gives the following output and graph

Max final wealth 78.4742 for stocks: comme kinar
Max final wealth 111.6039 for stocks: comme kinar meico
Max final wealth 138.5075 for stocks: comme espey kinar meico

The highlighted line of code is there for a reason, calculating the final wealth for all 4-tuples takes about   one full day on my machine.  So the code is only run once, then the magic of automatically saving the environment takes care of avoiding recalculating the result.

But why is the running time so long?  The main problem of the Universal Portfolio algorithm is that it needs to evaluate an integral on the simplex, and the curse of dimensionality rears its ugly head.  The general approach for a numerical integration in n dimensions is exponential in the number of dimensions.  Furthermore we calculate the wealth for all possible n-tuples and this increases also exponentially in n when n is much smaller than the total number of stocks concerned.

As usual R code can provide a more vivid representation, the following code generates a plot showing the complexity evolution as the number of stocks increase for 36 stocks and a grid spacing of 0.05.  The code also produces a table for small the small values of n.  The code uses the ::: operator, it was new to me and allows to call package functions that are not part of the exposed interface.

library(logopt)
x <- coredata(nyse.cover.1962.1984)
nStocks <- dim(x)

Sizes <- 1:nStocks
nCs <- 0 * Sizes
nSs <- 0 * Sizes

for (i in 1:length(Sizes)) {
n <- Sizes[i]
nCs[i] <- choose(nStocks, n)
nSs[i] <- logopt:::count.grid.points(n, 20)
}

plot(nCs * nSs, type=”l”, col=”blue”, log=”y”,ylim=range(nCs,nSs,nCs*nSs))
lines(nCs, col=”red”)
lines(nSs, col=”green”)
grid()

cat(sprintf(“n    C(36,n)    S(n,20) C(36,n)*S(n,20)   Ratios to previous value\n”))
for(i in 2:6) {
n <- Sizes[i]
cat(sprintf(“%d %10d %10d %15.0f”,n,nCs[i],nSs[i], nCs[i] * nSs[i]))
if(i > 2) {
cat(sprintf(“%7.1f %7.1f %7.1f”, nCs[i]/nCs[i-1], nSs[i]/nSs[i-1],
(nCs[i] * nSs[i]) / (nCs[i-1] * nSs[i-1])))
}
cat(“\n”)
}

n    C(36,n)    S(n,20) C(36,n)*S(n,20)   Ratios to previous value
2        630         21           13230
3       7140        231         1649340   11.3    11.0   124.7
4      58905       1771       104320755    8.2     7.7    63.2
5     376992      10626      4005916992    6.4     6.0    38.4
6    1947792      53130    103486188960    5.2     5.0    25.8

The exponential complexity makes the Universal Porfolio algorithm impractical for large portfolio, a serious limitation.  Some authors have devised more efficient approaches, that either try to approximate Universal Portfolio at reduced complexity or are inspired by the ideas of Universal but with less complexity and general either no bounds or worse bounds.