The standard practice for correcting for population stratification in genetic studies is to use principal components analysis (PCA) to categorize samples along different ethnic axes. Price et al. published on this in 20...

The “Minimum Correlation Algorithm” is a term I stumbled upon at the CSS Analytics blog. It is an interesting risk measure that, in my interpretation, means minimizing the average portfolio correlation with each asset class for a given level of return. One might try to use the correlation matrix instead of the covariance matrix in mean-variance optimization, but this approach,
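To make the quantity concrete, here is a minimal sketch of one way to read "average portfolio correlation with each asset class": compute the portfolio's return series from a set of weights, correlate it with each asset's return series, and average the results. The function names and the sample data are hypothetical, and this is only an illustration of the measure as described above, not the algorithm from the CSS Analytics blog.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_portfolio_correlation(asset_returns, weights):
    """Mean correlation of the weighted portfolio with each individual asset.

    asset_returns: list of per-asset return series, all the same length.
    weights: one weight per asset (assumed to sum to 1).
    """
    periods = len(asset_returns[0])
    # Portfolio return in each period is the weighted sum of asset returns.
    portfolio = [
        sum(w * series[t] for w, series in zip(weights, asset_returns))
        for t in range(periods)
    ]
    correlations = [pearson(portfolio, series) for series in asset_returns]
    return sum(correlations) / len(correlations)


# Hypothetical sample data: three assets, five periods.
sample_returns = [
    [0.01, 0.02, -0.01, 0.03, 0.00],
    [0.00, -0.01, 0.02, -0.02, 0.01],
    [0.02, 0.01, 0.00, 0.01, -0.01],
]
sample_weights = [0.4, 0.3, 0.3]
print(average_portfolio_correlation(sample_returns, sample_weights))
```

An optimizer would then search over weights to make this number as small as possible while meeting a return target; that search is not shown here.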

In the last few posts I introduced the Maximum Loss, Mean-Absolute Deviation, Expected Shortfall (CVaR), and Conditional Drawdown at Risk (CDaR) risk measures. These risk measures can be formulated as linear constraints and thus can be combined to control multiple risk measures during construction of the efficient frontier. Let’s examine efficient frontiers computed

Dan Woods at Forbes interviewed LinkedIn's Daniel Tunkelang about the rise of data science and about building data science teams. When asked how students today should prepare themselves to be data scientists, Tunkelang gave some good advice: "When we built the data science team at LinkedIn a few years ago, we looked for raw talent, assuming that smart people...

So you want to get statistical? Nowadays one way to go is to use R, usually in combination with ggplot2 for generating the plots. These plots and graphs need data, however, and for that we use data sources. There are a lot of data sources availa...

In the Maximum Loss and Mean-Absolute Deviation risk measures post I started the discussion of alternative risk measures we can use to construct the efficient frontier. Two more alternative risk measures I want to discuss are Expected Shortfall (CVaR) and Conditional Drawdown at Risk (CDaR). I will use methods presented in Comparative Analysis of Linear Portfolio Rebalancing
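For readers unfamiliar with the two measures, the following is a minimal sketch of simple empirical estimators: CVaR as the average of the worst (1 - alpha) fraction of per-period losses, and CDaR as the average of the worst (1 - alpha) fraction of drawdowns. It assumes drawdowns are measured on uncompounded cumulative returns starting from zero; the function names and sample returns are hypothetical, and the linear-programming formulation used for portfolio construction is not shown.

```python
import math


def tail_mean(values, alpha):
    """Average of the largest (1 - alpha) fraction of values (at least one)."""
    ordered = sorted(values, reverse=True)
    # round() guards against floating-point noise such as 1.0000000000000009.
    k = max(1, math.ceil(round((1 - alpha) * len(ordered), 10)))
    return sum(ordered[:k]) / k


def drawdowns(returns):
    """Drawdown in each period: running peak of cumulative return minus current level.

    Assumption: cumulative returns are uncompounded (a plain running sum)
    and the starting level of zero counts as the initial peak.
    """
    cumulative, peak, result = 0.0, 0.0, []
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        result.append(peak - cumulative)
    return result


def cvar(returns, alpha=0.95):
    """Empirical expected shortfall: mean loss in the worst (1 - alpha) tail."""
    return tail_mean([-r for r in returns], alpha)


def cdar(returns, alpha=0.95):
    """Empirical conditional drawdown at risk: mean of the worst drawdowns."""
    return tail_mean(drawdowns(returns), alpha)


# Hypothetical sample of ten periodic returns.
sample = [0.02, -0.05, 0.01, -0.03, 0.04, -0.01, 0.02, -0.06, 0.03, 0.01]
print(cvar(sample, alpha=0.8), cdar(sample, alpha=0.8))
```

Both estimators share the same tail-averaging step; they differ only in whether it is applied to single-period losses or to the drawdown path.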

In the New York Times' "Bits" blog today, Quentin Hardy offers recollections on Big Data talks at the recent Web 2.0 Summit. He begins with a definition of Big Data: Big Data is really about ... the benefits we will gain by cleverly sifting through it to find and exploit new patterns and relationships. You see it now in...