# Monthly Archives: March 2011

## Can one beat a Random Walk– IMPOSSIBLE (you say?)

March 8, 2011
Firstly, apologies for the long absence as I've been busy with a few things.  Secondly, apologies for the horrific use of caps in the title (for the grammar monitors).  Certainly, you'll gain something useful from today's musing, as it's a pr...

## Challenge: Visualizing the US Federal Budget

March 8, 2011
Google today announced a Data Visualization Challenge that is well suited to the graphical capabilities of R. The goal is to visualize the US Federal budget from the point of view of the taxes an individual pays. The data are available from whatwepayfor.com -- their FAQ gives details about the source of the data and the philosophy of making...

## New R IDE

March 8, 2011
I'm always looking for ways to improve my workflow and overall academic efficiency. I've tried a variety of text editors, GUIs, and integrated development environments (IDEs) for R. I have some preferences but I haven't found anything that I'm complete...

## A Short Return to the Age-Earnings Profile

March 8, 2011
Two posts ago I mentioned the age-earnings profile but did not provide a regression of log earnings on age. I also offered, without evidence, that fitting a simple linear regression would be inappropriate. How do I know that? How could …

## Splitting a Dataset Revisited: Keeping Covariates Balanced Between Splits

March 8, 2011
In my previous post I showed you how to randomly split up a dataset into training and testing datasets. (Thanks to all those who emailed me or left comments letting me know that this could be done using other means. As things go with R, it's sometimes ...
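
As a rough illustration of the idea in this excerpt, here is a minimal sketch of a split that keeps one categorical covariate balanced between the training and testing sets. It is not the post's own code: the data frame `dat`, the column `group`, and the 70% `train_frac` are all hypothetical, and the approach shown (sampling the same fraction within each level) is just one simple way to do it in base R.

```r
# Hypothetical data: a categorical covariate we want balanced across the splits.
set.seed(42)
dat <- data.frame(
  id = 1:200,
  group = rep(c("a", "b"), times = c(150, 50)),
  x = rnorm(200)
)

train_frac <- 0.7  # assumed split proportion

# Within each level of `group`, sample the same fraction of row indices.
idx_by_group <- split(seq_len(nrow(dat)), dat$group)
train_idx <- unlist(lapply(idx_by_group, function(idx) {
  idx[sample.int(length(idx), size = round(train_frac * length(idx)))]
}), use.names = FALSE)

train <- dat[train_idx, ]
test <- dat[-train_idx, ]

# The group proportions should now match (or nearly match) in both splits.
print(prop.table(table(train$group)))
print(prop.table(table(test$group)))
```

Sampling within each level, rather than from the whole dataset at once, is what keeps the proportions aligned; a purely random split only matches them on average.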

## Blackbox trading Strategy using Rapidminer and R II

March 8, 2011
It has been a long time since I updated the blog, for lack of time (again) due to new professional and personal challenges. Continuing with the black-box strategy: following recommendations from several readers, and lacking the time to write a good tutorial on the model, I’m going to make available the file with...

## R Studio

March 8, 2011
If you think that R is The EnvironmentForStatisticalAnalysisAndGraphics but you do not think that Vim is The Editor for text files you might want to have a look at R Studio. It works on Windows, MacOS and Linux. I tried it out on my Ubuntu

## Video Tutorial on Instrumental Variables in R

March 8, 2011
Update: I have replaced this video tutorial with a video tutorial on a newer, easier to use IV regression command. Check out that command here. In this video, I show how to use my instrumental variables function in R, ivreg(), along with its companion ...

## IV Regression

March 8, 2011
Here is my code from a previous post that performs IV regression. This may be easier to copy into an R script. I will post a video tutorial using this code shortly.
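
The author's code is not reproduced in this excerpt. For readers unfamiliar with the technique, the following is a minimal sketch of IV estimation done by hand as two-stage least squares using only `lm()`. It is not the author's function; the simulated variables `z`, `u`, `x`, and `y` and the true coefficient of 2 are assumptions made purely for illustration.

```r
# Simulated data (an assumption for illustration): x is endogenous because it
# shares the unobserved confounder u with y; z is a valid instrument.
set.seed(123)
n <- 5000
z <- rnorm(n)
u <- rnorm(n)
x <- 0.8 * z + u + rnorm(n)
y <- 1 + 2 * x + 3 * u + rnorm(n)   # true coefficient on x is 2

# OLS is biased upward here because x is correlated with the error term.
ols <- coef(lm(y ~ x))["x"]

# Stage 1: project the endogenous regressor on the instrument.
x_hat <- fitted(lm(x ~ z))

# Stage 2: regress the outcome on the fitted values.
# Note: standard errors from this second lm() are not valid 2SLS standard errors.
iv <- coef(lm(y ~ x_hat))["x_hat"]

print(c(ols = unname(ols), iv = unname(iv)))
```

With a single instrument, the second-stage coefficient equals the ratio `cov(z, y) / cov(z, x)`, which is a quick sanity check. A dedicated IV routine is still preferable in practice because it reports correct standard errors.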

## Machine Learning Ex3 – multivariate linear regression

March 8, 2011
Exercise 3 is about multivariate linear regression. The first part is about finding a good learning rate (alpha), and the second part is about implementing linear regression using the normal equations instead of the gradient descent algorithm.

Data: as usual, hosted in Google Docs. `mydata = read.csv("http://spreadsheets.google.com/pub?key=0AnypY27pPCJydExfUzdtVXZuUWphM19vdVBidnFFSWc&output=csv", header = TRUE)`, then `tail(mydata, 5)` to show the last 5 rows: area bedrooms price 43 2567 ...
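
The normal-equation approach mentioned in this excerpt can be sketched as follows. This is an illustration under stated assumptions, not the post's code: the data are simulated stand-ins for the housing data (the column names `area`, `bedrooms`, and `price` come from the excerpt, while the values and true coefficients are invented), and the result is cross-checked against `lm()`.

```r
# Simulated data standing in for the exercise's housing data (an assumption;
# the real data come from the spreadsheet linked in the post).
set.seed(1)
n <- 50
area <- runif(n, 800, 4500)
bedrooms <- sample(1:5, n, replace = TRUE)
price <- 50000 + 120 * area + 8000 * bedrooms + rnorm(n, sd = 10000)

# Design matrix with an intercept column.
X <- cbind(intercept = 1, area = area, bedrooms = bedrooms)
y <- price

# Normal equations: solve (X'X) theta = X'y without forming an explicit inverse.
theta <- solve(crossprod(X), crossprod(X, y))

# Cross-check against R's built-in least-squares fit.
fit <- lm(price ~ area + bedrooms)
print(cbind(normal_eq = drop(theta), lm = coef(fit)))
```

Unlike gradient descent, this closed-form solution needs no learning rate and no feature scaling, which is why the two columns printed above should agree to within numerical precision.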