The Problem: In the United States, health care costs have been rising for years, even after adjusting for inflation. Like a runaway freight train, this rampant inflation cannot continue indefinitely without crashing. ...

When we backtest a strategy on a portfolio, it is a simple analysis of a single period in time. There are ways to “stress test” a strategy, such as Monte Carlo simulation, random portfolios, or shuffling the returns in a random order. I could never really wrap my head around Monte Carlo and shuffling the returns …
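To make the "shuffling the returns" idea concrete, here is a minimal sketch in Python rather than the post's own code. It assumes per-period simple returns and uses maximum drawdown as the path-dependent statistic; the function names and the sample returns are hypothetical and chosen only for illustration. Reordering returns leaves the final compounded return unchanged, so the shuffle only tells you something about statistics that depend on the order of returns.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def shuffled_drawdowns(returns, n_trials=1000, seed=0):
    """Max drawdown for each of n_trials random reorderings of the returns."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    results = []
    for _ in range(n_trials):
        sample = list(returns)  # copy, so the original order is preserved
        rng.shuffle(sample)
        results.append(max_drawdown(sample))
    return results


# Hypothetical monthly returns, made up for illustration only.
monthly_returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.05,
                   -0.02, 0.00, 0.01, -0.03, 0.04, 0.02]

drawdowns = sorted(shuffled_drawdowns(monthly_returns))
original = max_drawdown(monthly_returns)
median = drawdowns[len(drawdowns) // 2]
print(f"original ordering: {original:.4f}")
print(f"median over shuffles: {median:.4f}, worst: {drawdowns[-1]:.4f}")
```

Comparing the drawdown of the original ordering with the distribution over shuffles gives a rough sense of how much the backtest result depended on the particular sequence of returns. Note that plain shuffling destroys any autocorrelation in the series, which is a limitation of this sketch.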

(by Trevor Hastie) Glmnet_1.8 uploaded to CRAN – This is a major revision, with two additional models included. 1) Multiresponse regression – family="mgaussian". Here we have a matrix of M responses, and we fit a series of linear models in parallel. We use a group-lasso penalty on the set of M coefficients for each variable. This means they are...
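As a rough illustration of what a group-lasso penalty "on the set of M coefficients for each variable" computes, the sketch below contrasts it with an ordinary lasso penalty. This is plain Python, not glmnet's implementation; the function names, the coefficient layout (one row per variable, one column per response), and the numbers are assumptions made for the example.

```python
import math


def group_lasso_penalty(coefs, lam):
    """lam times the sum, over variables, of the L2 norm of that variable's M coefficients."""
    return lam * sum(math.sqrt(sum(b * b for b in row)) for row in coefs)


def lasso_penalty(coefs, lam):
    """lam times the sum of absolute values of every individual coefficient."""
    return lam * sum(abs(b) for row in coefs for b in row)


# Hypothetical coefficients: 3 variables (rows) by M = 2 responses (columns).
coefs = [
    [3.0, 4.0],  # variable 1 is active for both responses
    [0.0, 0.0],  # variable 2 is excluded from every response
    [1.0, 0.0],  # variable 3
]

print(group_lasso_penalty(coefs, lam=1.0))  # 6.0  (5 + 0 + 1)
print(lasso_penalty(coefs, lam=1.0))        # 8.0  (3 + 4 + 0 + 1)
```

Because the group penalty acts on each row's norm as a whole, a group-lasso fit in general tends to keep or drop a variable across all responses at once, whereas the ordinary lasso penalizes each coefficient separately.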

Following the crash of my hard drive right before leaving Kyoto, I bought a cheap Compaq Presario CQ57 and reinstalled Ubuntu 12.04 over the weekend (so as to have a laptop available before leaving for Australia…). It took about one hour to install from the DVD and everything seems to be working out of the box. The

For me, Kaggle has become a social network for data scientists, as stackoverflow.com or github.com is for programmers. If you are a data scientist, machine learner, or statistician, you are better off having a profile there; otherwise you do not exist. Nevertheless, I won’t bet on as rosy a future for data scientists as journalists suggest (sexy job for next

Part II – Solving Big Problems with Oracle R Enterprise. In the first post in this series (see https://blogs.oracle.com/R/entry/solving_big_problems_with_oracle), we showed how you can use R to perform historical rate of return calculations against investment data sourced from a spreadsheet. We demonstrated the calculations against sample data for a small set of accounts. While this worked...

I want to continue with the Factor Attribution theme that I presented in the Factor Attribution post. I have reorganized the code logic into the following 4 functions: factor.rolling.regression – Factor Attribution over a given rolling window; factor.rolling.regression.detail.plot – detailed time-series plot and histogram for each factor; factor.rolling.regression.style.plot – historical style plot for 2 selected factors; factor.rolling.regression.bt.plot

On July 25th, I’ll be presenting at the Seattle R Meetup about implementing Bayesian nonparametrics in R. If you’re not sure what Bayesian nonparametric methods are, they’re a family of methods that allow you to fit traditional statistical models, such as mixture models or latent factor models, without having to fully specify the number of