**logopt: a journey in R, finance and open source**, and kindly contributed to R-bloggers)

We extend the analysis of part 7 by calculating the final wealth for all tuples of 3 and 4 stocks, this is a simple extension but it also shows the most important problem of the Universal portfolio algorithm, its exponential complexity in the number of stocks.

The first part shows the cumulative distribution of the absolute final wealth for all possible tuples of 2, 3 and 4 stocks. The results are qualitatively the same as before, a thin tail of good results. It also a shows increase in absolute wealth as the number of stocks increase.

The code remains quite compact, R is simply wonderful for this type of analysis.

library(logopt)

x <- coredata(nyse.cover.1962.1984)

w <- logopt:::x2w(x)

nDays <- dim(x)[1]

nStocks <- dim(x)[2]

Days <- 1:nDays

iWin <- 1 ; plot(1:10)

FinalWealth <- function(cols) {

xij <- x[,cols]

uc <- universal.cover(xij, 20)

return(uc[length(uc)])

}

TupleSizes <- c(2,3,4)**if (exists(“lws”) == FALSE)** {

lws <- list()

for (i in 1:length(TupleSizes)) {

TupleSize <- TupleSizes[i]

ws <- combn(1:nStocks, TupleSize, FinalWealth)

lines(ecdf(ws), pch=”.”, col=Colors[i])

lws[[i]] <- ws

}

}

# show the ecdf

if(length(dev.list()) < iWin) { x11() } ; iWin <- iWin + 1 ; dev.set(iWin) ;

plot(ecdf(lws[[3]]), pch=”.”, col=”red”, main=”Cumulative probability of final wealth”,

xlab=”Final wealth”, ylab=”Cumulative probability”)

lines(ecdf(lws[[2]]), pch=”.”, col=”green”)

lines(ecdf(lws[[1]]), pch=”.”, col=”blue”)

legend(“bottomright”, legend=c(“2 stocks”,”3 stocks”,”4 stocks”), fill=c(“blue”,”green”,”red”))

grid()

# show best absolute wealth and its composition

for (i in 1:length(TupleSizes)) {

TupleSize <- TupleSizes[i]

BestTuple <- which.max(lws[[i]])

BestStocks <- combn(1:nStocks, TupleSize)[,BestTuple]

cat(sprintf(“Max final wealth %.4f for stocks: “, lws[[i]][BestTuple]))

cat(colnames(x)[BestStocks]) ; cat(“\n”)

}

This gives the following output and graph

Max final wealth 78.4742 for stocks: comme kinar

Max final wealth 111.6039 for stocks: comme kinar meico

Max final wealth 138.5075 for stocks: comme espey kinar meico

The highlighted line of code is there for a reason, calculating the final wealth for all 4-tuples takes about one full day on my machine. So the code is only run once, then the magic of automatically saving the environment takes care of avoiding recalculating the result.

But why is the running time so long? The main problem of the Universal Portfolio algorithm is that it needs to evaluate an integral on the simplex, and the curse of dimensionality rears its ugly head. The general approach for a numerical integration in n dimensions is exponential in the number of dimensions. Furthermore we calculate the wealth for all possible n-tuples and this increases also exponentially in n when n is much smaller than the total number of stocks concerned.

As usual R code can provide a more vivid representation, the following code generates a plot showing the complexity evolution as the number of stocks increase for 36 stocks and a grid spacing of 0.05. The code also produces a table for small the small values of n. The code uses the ::: operator, it was new to me and allows to call package functions that are not part of the exposed interface.

library(logopt)

x <- coredata(nyse.cover.1962.1984)

nStocks <- dim(x)[2]

Sizes <- 1:nStocks

nCs <- 0 * Sizes

nSs <- 0 * Sizes

for (i in 1:length(Sizes)) {

n <- Sizes[i]

nCs[i] <- choose(nStocks, n)

nSs[i] <- logopt:::count.grid.points(n, 20)

}

plot(nCs * nSs, type=”l”, col=”blue”, log=”y”,ylim=range(nCs,nSs,nCs*nSs))

lines(nCs, col=”red”)

lines(nSs, col=”green”)

grid()

cat(sprintf(“n C(36,n) S(n,20) C(36,n)*S(n,20) Ratios to previous value\n”))

for(i in 2:6) {

n <- Sizes[i]

cat(sprintf(“%d %10d %10d %15.0f”,n,nCs[i],nSs[i], nCs[i] * nSs[i]))

if(i > 2) {

cat(sprintf(“%7.1f %7.1f %7.1f”, nCs[i]/nCs[i-1], nSs[i]/nSs[i-1],

(nCs[i] * nSs[i]) / (nCs[i-1] * nSs[i-1])))

}

cat(“\n”)

}

**leave a comment**for the author, please follow the link and comment on their blog:

**logopt: a journey in R, finance and open source**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...