New R IDE

March 8, 2011

I'm always looking for ways to improve my workflow and overall academic efficiency. I've tried a variety of text editors, GUIs, and integrated development environments (IDEs) for R. I have some preferences but I haven't found anything that I'm complete...

Read more »

A Short Return to the Age-Earnings Profile

March 8, 2011

Two posts ago I mentioned the age-earnings profile but did not provide a regression of log earnings on wage. I also offered, without evidence, that fitting a simple linear regression would be inappropriate. How do I know that? How could … Continue reading →

Read more »

Splitting a Dataset Revisited: Keeping Covariates Balanced Between Splits

March 8, 2011

In my previous post I showed you how to randomly split up a dataset into training and testing datasets. (Thanks to all those who emailed me or left comments letting me know that this could be done using other means. As things go with R, it's sometimes ...

Read more »

Blackbox trading Strategy using Rapidminer and R II

March 8, 2011

It has been a long time since I updated the blog, for lack of time (again), due to new professional and personal challenges. Continuing with the Black Box strategy: thanks to recommendations made by several readers, and lacking the time to write a good tutorial on the model, I’m going to make available the file with...

Read more »

R Studio

March 8, 2011

If you think that R is The EnvironmentForStatisticalAnalysisAndGraphics but you do not think that Vim is The Editor for text files you might want to have a look at R Studio. It works on Windows, MacOS and Linux. I tried it out on my Ubuntu

Read more »

In case you missed it: January Roundup

March 8, 2011

Catching up on roundups today. February roundup will follow soon, but in the meantime enjoy this trip down memory lane - DS. In case you missed them, here are some articles from January of particular interest to R users. Revolution Analytics is now offering annual sponsorship grants for local R user groups worldwide. Issue 2 of the R Journal...

Read more »

Video Tutorial on Instrumental Variables in R

March 8, 2011

Update: I have replaced this video tutorial with a video tutorial on a newer, easier to use IV regression command. Check out that command here. In this video, I show how to use my instrumental variables function in R, ivreg(), along with its companion ...

Read more »

IV Regression

March 8, 2011

Here is my code from a previous post that performs IV regression. This may be easier to copy into an R script. I will post a video tutorial using this code shortly.

Read more »

Machine Learning Ex3 – multivariate linear regression

March 8, 2011

Exercise 3 is about multivariate linear regression. The first part is about finding a good learning rate (alpha), and the second part is about implementing linear regression using normal equations instead of the gradient descent algorithm.

Data

As usual, hosted in Google Docs:

mydata = read.csv("http://spreadsheets.google.com/pub?key=0AnypY27pPCJydExfUzdtVXZuUWphM19vdVBidnFFSWc&output=csv", header = TRUE)
# show last 5 rows
tail(mydata, 5)
   area bedrooms price
43 2567 ...

Read more »

An enhanced Kaplan-Meier plot

March 8, 2011

We often see, in publications, a Kaplan-Meier survival plot, with a table of the number of subjects at risk at different time points aligned below the figure. I needed this type of plot (or really, matrices of such plots) for an upcoming publication. Of course, my preferred toolbox was R and the ggplot2 package. There

Read more »

Our Friend the Age-Earnings Profile

March 7, 2011

I like Labor Economics. Partially because it has a nice mix of theory and practical empiricism, but mostly because it seems to be a sub-field with a number of agreed upon stylized facts that grow not out of micro theory … Continue reading →

Read more »

Initial SoilWeb Concept on Paper

March 7, 2011

Read more »

money is coin $ flip

March 7, 2011

Well, sorta. More precisely, money is the sum of coin$flip divided by the number of coin$flip. But we'll get to that later. For now, let me introduce you to a new algorithm written in R. This one is another "quote" -- simple few lines of code -- whose ...

Read more »

Alabama is a foreign country

March 7, 2011

Faculty and students of Iowa State University Department of Statistics published online an analysis of the data on 2009 distributions of the US Stimulus funds, aka the Recovery And Reinvestment Act. (The analysis was published in March last year as part of the Design for America competition, but I only recently came across it.) The analyses and associated charts...

Read more »

Basic Plots in R

March 7, 2011

Here's a tutorial I recorded on producing basic plots in R. I lost the script file I used to create the video to a horrifying black screen of death, but I used the data from the previous post (available here). Hopefully, the video is clear enough that ...

Read more »

Visualizing the Language Used by Academics when Protected by Anonymity

March 7, 2011

Those in the political science discipline probably remember their first encounter with poliscijobrumors.com. For those outside, you have probably never heard of this particular message board, and you would have no reason to. As the URL suggests, the board specializes in rumor, gossip, backbiting, mudslinging, and the occasional lucid thread on the political science

Read more »

Example 8.29: Risk ratios and odds ratios

March 7, 2011

When can you safely think of an odds ratio as being similar to a risk ratio? Many people find odds ratios hard to interpret, and thus would prefer to have risk ratios. In response to this, you can find several papers that purport to convert an odds rat...

Read more »

R Tutorial Series: ANOVA Pairwise Comparison Methods

March 7, 2011

When we have a statistically significant effect in ANOVA and an independent variable of more than two levels, we typically want to make follow-up comparisons. There are numerous methods for making pairwise comparisons and this tutorial will demonstrate...

Read more »

Factor models of variance in finance

March 7, 2011

In “What the hell is a variance matrix?” I talked about the basics of variance matrices and highlighted challenges for estimating them in finance. Here we look more deeply at the most popular estimation technique. Models for variance matrices: The types of variance estimates that are used in finance can be classified as: Sample estimate … Continue reading...

Read more »

R Package Automated Download

March 6, 2011

There are situations where we might want to run R on a standalone machine, and so need to download a (potentially) large number of packages to install on this system. Rather than having to go through the pain of searching through CRAN to find the packages and all the dependencies and manually download them, it would be nice

Read more »

Boxplots & Beyond IV: Beanplots

This post is the last in a series of four on boxplots and some of their extensions. Previous posts in this series have discussed basic boxplots, modified boxplots based on a robust asymmetry measure, and violin plots, an alternative that essentia...

Read more »

Moving from Excel to R

March 5, 2011

This first post of the Backtesting in Excel and R series will provide some resources to help smooth the transition from the familiarity and comfort of Excel to the potentially strange and intimidating world of R. I made my voyage from Excel to R more th...

Read more »

Choropleths Made Easy!

March 5, 2011

Choropleth Maps are very useful to visualize spatial trends. There have been several blog posts providing detailed instructions on how to create a choropleth map in R using map, spplot and ggplot2. However, I believe that the p...

Read more »

Five ways to visualize your pairwise comparisons

March 5, 2011

In data analysis it is often nice to look at all pairwise combinations of continuous variables in scatterplots. Up until recently, I have used the function splom in the package lattice, but ggplot2 has superior aesthetics, I think anyway. Here a fe...

Read more »

Parallel processing in R for Windows

March 4, 2011

The doSMP package (and its companion package, revoIPC), previously bundled only with Revolution R, is now available on CRAN for use with open-source R under the GPL2 license. In short, doSMP makes it easy to do SMP parallel processing on a Windows box with multiple processors. (It works on Mac and Linux too, but it's been relatively easy to...

Read more »