Monthly Archives: May 2011

S&P 500 High Beta and Low Volatility Indexes and Powershares ETFs

May 5, 2011

There must be a useful insight, concept, or system provided by the new S&P 500 High Beta and Low Volatility Indexes. Now, with the announcement by Powershares of ETFs for these indices (http://www.invescopowershares.com/volatility/), any of the...

Read more »

Build instructions for R on Amazon EC2

May 4, 2011

In this post, I will show:
- How to create an Amazon EC2 micro instance
- How to log in to the EC2 instance using PuTTY
- How to install the R source and build it
- Use R in the … Continue reading →

Read more »

Bank of America Merrill Lynch Bond Returns on St. Louis Fed

May 4, 2011

After all my complaining about proprietary data, the St. Louis Federal Reserve announced today the availability of Bank of America Merrill Lynch Bond Indices on their FRED site. The data is limited in scope and duration, but accessibility especi...

Read more »

Using R for Map-Reduce applications in Hadoop

May 4, 2011

Data Scientist Antonio Piccolboni recently published this comparison of the various languages and interfaces available for programming Big Data analysis tasks in the map-reduce framework. The interfaces he reviewed included:
- Java Hadoop (mature and efficient, but verbose and difficult to program)
- Cascading (brings an SQL-like flavor to Java programming with Hadoop)
- Pipes/C++ (a C++ interface to programming on Hadoop)...

Read more »

R Exercise with USDA Data

May 4, 2011

After the helpful comment by Bradley on my post Commodity Index Estimators: How about the National Agricultural Statistics Service (NASS)? Looks like they have information for prices received back to 1908 for many agricultural goods (http://www.nass.u...

Read more »

PLINK/SEQ for Analyzing Large-Scale Genome Sequencing Data

May 4, 2011

PLINK/SEQ is an open source C/C++ library for analyzing large-scale genome sequencing data. The library can be accessed via the pseq command line tool, or through an R interface. The project is developed independently of PLINK, but its syntax will be f...

Read more »

Whassup with glm()?

May 4, 2011

We're having a problem with starting values in glm(). An intercept-only logistic regression with a very simple starting value (beta=5) blows up....

Read more »

Again with Ledoit-Wolf and factor models

May 4, 2011

We come closer to a definitive answer on the relative merit of Ledoit-Wolf shrinkage versus a statistical factor model for variance matrices.

Previously

This post builds on the post entitled "A test of Ledoit-Wolf versus a factor model". That post depended on some posts previous to it.

New information

Previously we generated random portfolios with … Continue reading...

Read more »

Invisible blogs!

May 4, 2011

Julien just signaled an intermittent disappearance of the posts on the ‘Og, depending on the operating system: Ubuntu 10.10 seems to be working (most of the time!) while Mac and Windows are having problems… This is beyond my abilities. I have contacted WordPress support; maybe they are working on some new feature, maybe I once

Read more »