How to access databases from R

May 5, 2011

From his presentation at the Greater Boston useR Group, R user Jeffrey Breen has shared some useful slides detailing how to bring data from relational databases like MySQL and Oracle into R. In fact, data from just about any relational database is accessible from R by sending an SQL query through the standard ODBC or JDBC interfaces. R packages also offer...

Read more »

sab-R-metrics: Logistic Regression

May 5, 2011

It's been a while since my last sab-R-metrics post, and I have not gotten to the real fun stuff yet. I apologize for the long layoff, and it's likely that these will be sparse for the next couple of weeks. I have had some consulting opportunities come u...

Read more »

Mapping airline flight networks with R

May 5, 2011

Inspired by the Facebook Social Network chart, FlowingData's Nathan Yau also turns to R to create a beautiful chart of the network of all flight connections between major airlines in the US. Like the Facebook chart, the chart reflects the intensity of the connections (here, the number of flights) between pairs of cities. Nathan explains: Brighter lines represent more...

Read more »

Who will be the next President of the US?

May 5, 2011

A lot of weird facts (?) can be found on the internet. For instance, about the height of the winner of Presidential elections in the US: the taller candidate always wins... "Still, being short does, on average, hurt a person's prospects...The tall guy gets th...

Read more »

S&P 500 High Beta and Low Volatility Indexes and Powershares ETFs

May 5, 2011

There must be a useful insight, concept, or system provided by the new S&P 500 High Beta and Low Volatility Indexes. Now with the announcement by Powershares of ETFs for these indices http://www.invescopowershares.com/volatility/, any of the...

Read more »

Build instructions for R on Amazon EC2

May 4, 2011

In this post, I will show: how to create an Amazon EC2 micro instance; how to log in to the EC2 instance using PuTTY; how to install the R source and build it; use R in the …

Read more »

Bank of America Merrill Lynch Bond Returns on St. Louis Fed

May 4, 2011

After all my complaining about proprietary data, the St. Louis Federal Reserve announced today the availability of Bank of America Merrill Lynch Bond Indices on their FRED site. The data is limited in scope and duration, but accessibility especi...

Read more »

Using R for Map-Reduce applications in Hadoop

May 4, 2011

Data Scientist Antonio Piccolboni recently published this comparison of the various languages and interfaces available for programming Big Data analysis tasks in the map-reduce framework. The interfaces he reviewed included: Java Hadoop (mature and efficient, but verbose and difficult to program); Cascading (brings an SQL-like flavor to Java programming with Hadoop); Pipes/C++ (a C++ interface to programming on Hadoop)...

Read more »

R Exercise with USDA Data

May 4, 2011

After the helpful comment by Bradley on my post Commodity Index Estimators, How about the National Agricultural Statistics Service (NASS)? Looks like they have information for prices received back to 1908 for many agricultural goods (http://www.nass.u...

Read more »

PLINK/SEQ for Analyzing Large-Scale Genome Sequencing Data

May 4, 2011

PLINK/SEQ is an open source C/C++ library for analyzing large-scale genome sequencing data. The library can be accessed via the pseq command line tool, or through an R interface. The project is developed independently of PLINK but its syntax will be f...

Read more »

Whassup with glm()?

May 4, 2011

We're having a problem with starting values in glm(). A very simple logistic regression with just an intercept and a very simple starting value (beta=5) blows up....

Read more »

Again with Ledoit-Wolf and factor models

May 4, 2011

We come closer to a definitive answer on the relative merit of Ledoit-Wolf shrinkage versus a statistical factor model for variance matrices. Previously: This post builds on the post entitled "A test of Ledoit-Wolf versus a factor model". That post depended on some posts previous to it. New information: Previously we generated random portfolios with …

Read more »

Invisible blogs!

May 4, 2011

Julien just signaled an intermittent disappearance of the posts on the ‘Og, depending on the operating system: Ubuntu 10.10 seems to be working (most of the time!) while Mac and Windows are having problems… This is beyond my abilities, I have contacted WordPress support, maybe they are working on some new feature, maybe I once

Read more »

Day #35 replacing characters

May 4, 2011

Today I had a meeting with Emmanuel. He is a guy from inside Janssen who is very good with R scripts. He made a lot of great plots which I had to use for our reports. During the meeting we came to the conclusion that all the difficult R-scripting he did,...

Read more »

bigkmeans also works well for ordinary matrix objects: The biganalytics package

May 4, 2011

The bigmemory package is excellent for handling big matrices in R. There are several sister packages provided by "The Bigmemory Project": biganalytics for analysis, bigtabulate for tabulation, bigalgebra for linear algebra functionality, and synchronicity for synchronization via mutexes and interprocess communication and message passing. biganalytics provides a few functions for analysis: linear regression model, generalized linear regression model, and...

Read more »

Extension to mtable function

May 4, 2011

Here are some useful extensions to the "mtable" function in the memisc package.

Read more »

Guide to Getting Started with R: 2011 Update

May 4, 2011

In mid-2009, I wrote a post on getting started with R. A lot has happened in the world of R over the last two years. New books, videos, online documentation, blogs and other resources have emerged. New community structures have emerged. As such I'v...

Read more »

How to learn R

May 3, 2011

Over at R community site inside-R.org, Revolution's Joseph Rickert has published a How-To guide with tips for new users on How to Learn R, with links to resources for R books, blogs and courses. Check it out at the link below. Inside-R: How to Learn R

Read more »

Putting Robust Standard Errors into LaTeX Tables: An Extension of mtable

May 3, 2011

I recently discovered the mtable() command in the memisc library and its use with toLatex() to produce summary output for lm and glm objects in a nicely formatted table. Once you have your linear model objects, all you need is one command...

Read more »

Fun with twitteR: Osama bin Laden tweets

May 3, 2011

I thought it would be fun to play around with the R package twitteR, an R API into Twitter. I decided to take the most prominent news story of the past few days, Osama bin Laden’s death, to see …

Read more »

Running R on an iPhone/iPad with RStudio

May 3, 2011

This topic has been widely discussed on a lot of forums. To make a long story short, running R natively on an iDevice (meaning iPhone/iPad) is disabled by its OS, unless it is jailbroken. The steps for the installation through Cydia are described in this R wiki, or this post. But there are some limitations,

Read more »

CPI and US 10y Treasury Extreme –> System Idea

May 3, 2011

When I see extremes, I feel compelled to explore. The US 10y Treasury yield is at an extreme versus the annualized 3-month CPI rate of change. (Chart from TimelyPortfolio.) Of course, I have to try to build a system around the idea. While this 3 mont...

Read more »