One of the big differences between a language like Stata and R is R's ability to handle many different types of objects at once, and to combine them or pull them apart. I had a post about objects last year, but I thought I'd sh...

I learned about Lord Rayleigh’s discovery of argon in my 2nd-year analytical chemistry class while reading “Quantitative Chemical Analysis” by Daniel Harris. (William Ramsay was also responsible for this discovery.) This is one of my favourite stories in chemistry; it illustrates how diligence in measurement can lead to an elegant and surprising discovery. I find

Recently, for a research paper, I had some samples that I wanted to compare: not their means (by construction, all of them were centered) but their dispersion, and not their variance but rather their quantiles. Consider the following boxplot-type function, where everything is quantile-related (which is not the case for the standard boxplot, see http://freakonometrics.hypotheses.org/4138,...
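To make the idea of comparing centered samples through their quantiles concrete, here is a minimal sketch in Python. It is not the boxplot-type function from the post. The helper names (`quantile`, `quantile_summary`), the chosen probability levels, and the two toy samples are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def quantile(values: Sequence[float], p: float) -> float:
    """Quantile by linear interpolation between order statistics
    (the "type 7" rule, which is also R's default)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def quantile_summary(
    values: Sequence[float],
    probs: Sequence[float] = (0.05, 0.25, 0.5, 0.75, 0.95),
) -> Dict[float, float]:
    """Summarise a sample by a few quantiles (levels are an assumption)."""
    return {p: quantile(values, p) for p in probs}


# Two hypothetical samples, both centered at zero, with different spread.
narrow = [-2.0, -1.0, 0.0, 1.0, 2.0]
wide = [-4.0, -2.0, 0.0, 2.0, 4.0]

for name, sample in (("narrow", narrow), ("wide", wide)):
    print(name, quantile_summary(sample))
```

Both samples share the same median, so any difference shows up only in the outer quantiles, which is the kind of comparison a quantile-based boxplot is meant to display.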

What effect do predicted correlations have when optimizing trades? Background A concern about optimization that is not one of “The top 7 portfolio optimization problems” is that correlations spike during a crisis, which is when you most want optimization to work. This post looks at a small piece of that question. It wonders if increasing predicted … Continue reading...

Exploring the quality of predictions using random portfolios and optimization. Previously, “Simple tests of predicted returns” showed a few ways to look at expected returns at the asset level. Here we move to the portfolio level. The previous post focused on correlation. Win Vector Blog points out that gauging prediction quality using correlation can be … Continue reading...

Some ways to explore how good a method of predicting returns is. Data and model The universe is 443 large cap US stocks that have data back to the beginning of 2004. The daily (adjusted) close was used. The model that is used as an example is the default signal from the MACD function of … Continue reading...
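For readers unfamiliar with MACD, here is a rough Python sketch of how a MACD line and its signal line are conventionally built from exponential moving averages. The excerpt does not say which package's MACD function is used or what its defaults are, so the 12/26/9 spans, the way each average is seeded with the first observation, and the function names `ema` and `macd` are all assumptions; this is not a reproduction of that function.

```python
from typing import List, Sequence, Tuple


def ema(values: Sequence[float], span: int) -> List[float]:
    """Exponential moving average with smoothing 2 / (span + 1).
    Seeding with the first observation is a simplifying assumption."""
    if span < 1:
        raise ValueError("span must be at least 1")
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    for value in values:
        if not smoothed:
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


def macd(
    prices: Sequence[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float]]:
    """Return (macd_line, signal_line) for a price series.
    The 12/26/9 spans are the conventional choice, assumed here."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    return macd_line, signal_line


# Hypothetical steadily rising closes: the fast average sits above the slow one.
closes = [float(day) for day in range(1, 61)]
macd_line, signal_line = macd(closes)
print(macd_line[-1], signal_line[-1])
```

The signal line is just a smoothed copy of the MACD line, so a prediction built on it lags the underlying price moves by construction.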

One of the topics emphasized in Exploring Data in Engineering, the Sciences and Medicine is the damage outliers can do to traditional data characterizations. Consequently, one of the procedures to be included in the ExploringData package is FindOutliers, described in this post. Given a vector of numeric values, this procedure supports four different methods for identifying possible outliers. Before...
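The excerpt is cut off before the four methods are named, so the sketch below does not reproduce FindOutliers. It only illustrates, in Python, two widely used outlier rules that show why the choice of method matters: the three-sigma rule (mean and standard deviation) and the Hampel identifier (median and scaled MAD). The function names, the threshold of 3, and the toy data are assumptions.

```python
import statistics
from typing import List, Sequence


def three_sigma_outliers(values: Sequence[float], t: float = 3.0) -> List[int]:
    """Indices of points more than t standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > t * sd]


def hampel_outliers(values: Sequence[float], t: float = 3.0) -> List[int]:
    """Indices of points more than t scaled MADs from the median.
    The factor 1.4826 makes the MAD comparable to a standard deviation
    for Gaussian data. If the MAD is zero, every point that differs
    from the median is flagged."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    scaled_mad = 1.4826 * statistics.median(deviations)
    return [i for i, v in enumerate(values) if abs(v - median) > t * scaled_mad]


# Hypothetical measurements with one gross error in the last position.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]

print("three-sigma:", three_sigma_outliers(readings))
print("Hampel:", hampel_outliers(readings))
```

On this toy vector the single bad value inflates the standard deviation enough that the three-sigma rule flags nothing, while the median-based rule flags only the last point. That masking effect is one instance of the damage outliers do to traditional characterizations.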