Reproducible blogging

July 10, 2011
By

As a fact-based blog, the posts here very often contain diagrams and data tables. To enable you to reproduce the results and insights, I include the computations as computer code. Most blog posts I write are markdown text combined (or weaved) with computer code written in the R language. I created a small package mdtools that puts the...

Read more »

Now I’m R-Blogging

July 10, 2011
By

Today a lot of great mails arrived in my inbox. In one of them I read: “I’ve just added your feed to the site.” Where did this mail come from? The sender of the email was Tal Galili. He is a researcher in biostatistics at Tel Aviv University and is very active around the internet.

Read more »

Migrating from SPSS/Excel to R

July 10, 2011
By

In this post, I give an outline for those interested in migrating from using SPSS and Excel for data processing/analysis …

Read more »

Heatmap tables done better, in Sweave and latex

July 10, 2011
By

I wrote before about using heatmap tables to combine the strengths of tables and graphics for nominal data. Here is a neat approach using Sweave and LaTeX to produce an effect like the one in the picture. This LaTeX code is self-contained. Just save it as myfile.Rnw, run Sweave("myfile.Rnw") from inside R, and then run pdflatex myfile.tex.

Read more »

The Road to Default: Who’s getting the most screwed?

July 10, 2011
By

Let's take a look at who gets the most screwed (who loses the most money) when bond prices collapse and the United States defaults. Well, until recently only about 55% of Treasuries were held domestically. The rest was externally held by places like Japa...

Read more »

More fun with the Failed States Index (and the State Fragility Index)

July 9, 2011
By

So the other day’s experiment with the Failed States Index and the Polity data didn’t yield the linear trend I had originally expected. After all, the two measure fundamentally distinct things. But perhaps there’s another dataset which will match linearly. The same people who made Polity also put out a dataset called the State Fragility

Read more »

R at Wikimedia

July 9, 2011
By

Last year Wikipedia rolled out a pilot program to use Wikipedia article creation as an assignment in the classroom. Students wrote articles on a topic area and, rather than turning them in to a professor and forgetting about them, uploaded them to Wikipedia and exposed them to readers around the world. 24 schools inside the…

Read more »

The Road to Default: We Crumble Like A Cookie

July 9, 2011
By

What should we expect when the United States defaults, and how will this unsavory process unfold? Well, for one thing, anticipate the downfall with a downgrade in the credit rating. According to a recent Bloomberg article, in the event of a U.S. ...

Read more »

Index decomposition with R

July 9, 2011
By

A few days ago, I finally finished a small package, ida. It enables you to analyse the contributions of underlying factors to the change in an aggregate, using methods based on index number theory. These methods have become popular through, but are not restricted to, investigating the change of CO2 emissions. Here is a chart that shows what the change of population,...

Read more »

R on the cloud

July 9, 2011
By

Just as scientists should never really have to think much about statistics, I feel that, in an ideal world, statisticians would never have to worry about computing. In the real world, though, we have to spend a lot of time building our own tools. It would be great if we could routinely run R with speed and memory limitations...

Read more »

Notes on Engineering Data Analysis (with R and ggplot2)

July 8, 2011
By

Hadley Wickham gave a Google Tech Talk a couple of weeks back titled Engineering Data Analysis (with R and ggplot2). These are my notes. The data analysis cycle is to iteratively transform, visualize and model. Leading into the cycle is data access an...

Read more »

Blog in motion

July 8, 2011
By

In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries

Read more »

The Road to Default: Let’s Look at the Damage with a Rant.

July 8, 2011
By

The following graph shows Real GDP as a percentage of the Gross Federal Debt. FRED is the resource I frequently use for United States financial data, and it serves us well here. What is Gross Federal Debt? Well, it's total government debt out...

Read more »

NYT on the importance of reproducible research

July 8, 2011
By

Yesterday's New York Times includes a great article on the failure of some genetic tests for cancer detection, and the flaws in the research that led to them. The article features quotes from Keith Baggerly of MD Anderson Cancer Center, and includes a photo of him and colleague Kevin Coombes in front of a page of R code: The...

Read more »

The Rebirth

July 8, 2011
By

Hey guys, I know that I have been gone for a while now. I just came back from a month-long Euro adventure, so have no fear: I have plenty of time to devote to blogging now. From this day forward, X.U. Economics will be known as The Dancing Ec...

Read more »

Censoring on one end, “outliers” on the other, what can we do with the middle?

July 8, 2011
By

This post was written by Phil. A medical company is testing a cancer drug. They get 16 genetically identical (or nearly identical) rats that all have the same kind of tumor, give 8 of them the drug and leave 8 untreated…or maybe they give them a placebo, I don’t know; is there a placebo

Read more »

installing R 2.13.1 on Amazon EC2’s “Amazon Linux” AMI #rstats

July 8, 2011
By

Condensed from this post (and comments) on David Chudzicki’s blog, tweaked, and updated for R-2.13.1. Assumes you’re starting with a virgin “Amazon Linux” AMI. I picked “Basic 64-bit Amazon Linux AMI 2011.02.1 Beta” (AMI Id: ami-8e1fece7) because it was marked as free tier eligible on the “Quick Start” tab of AWS’s “Launch Instance” dialog box:

Read more »

The virtues of incoherence?

July 8, 2011
By

Kent Osband writes:

Read more »

R 2.13.1 released

July 8, 2011
By

As announced by the R core team, R 2.13.1 has been released on schedule today. For those who build R themselves, updated source code is available now; pre-built binaries will propagate through the CRAN network over the next couple of days. This update includes some minor bug fixes to graphics, fixes to some edge cases in the class systems,...

Read more »

The R Programming Wikibook

July 8, 2011
By

The R Programming wikibook is an open source community project that "aims to create a cross-disciplinary practical guide to the R programming language." It was launched in June 2011 and is seeking content and contributors. The full call for the R Progr...

Read more »

Analyzing the Failed States Index (with Polity IV)

July 7, 2011
By

So, I decided to sit down and have a little fun with that Failed States Index data I put together. To start, I expect that the dataset will be pretty linearly correlated with the Polity IV data. This makes sense: true democracies aren’t failed states, and failed states tend not to be democratic. To test this,

Read more »

Scary Derivatives and Scary XML in R

July 7, 2011
By

I need some new R skills, and there is no better motivation to learn XML in R than one of the scariest financial datasets out there—the US Department of the Treasury Office of the Comptroller of the Currency (OCC) Quarterly Derivatives Report. I’ll...

Read more »

High-quality R graphics on the Web with SVG

July 7, 2011
By

If you want the graphics you create with R to look their best, in general it's best to go for a vector-based graphics format instead of a raster-based format. Common formats like GIF and JPG are raster-based: the image is composed of pixels, and if you don't choose a high enough resolution, you're likely to lose fine details and/or...

Read more »
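As a rough illustration of the vector-versus-raster point in the excerpt above, here is a minimal sketch that writes a plot to SVG with base R's svg() device. It assumes an R build with cairo support; the file name, plot title, and the use of the built-in cars dataset are choices made for this example and are not taken from the original post.

```r
# Minimal sketch: write a base-graphics plot to an SVG (vector) file.
# Assumption: this R build has cairo support, which svg() relies on.
# The output path and the `cars` example data are illustrative choices.
out <- file.path(tempdir(), "scatter.svg")

svg(out, width = 6, height = 4)   # open the SVG device (size in inches)
plot(cars, main = "Speed vs. stopping distance")
dev.off()                         # close the device so the file is written
```

Because the result is vector-based, it can be scaled in a browser without the pixelation a too-small GIF or JPG would show.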

GLMM Hell

July 7, 2011
By

I have been starting to analyze some data I have of repeated counts of salamanders from 5 plots over 4 years. I am trying to develop a predictive model of salamander nighttime surface activity as a function of weather variables. The repeated counting l...

Read more »

A simple ggplot2 scatterplot revisited

July 7, 2011
By

Rick Wicklin contacted me with a helpful suggestion for improving the data presentation method outlined in my previous post on using ggplot2 to visualize some data. In the previous post I had plotted up a highly correlated set of points, showing the correspondence between maximum daily body temperatures of model snails sitting with the foot

Read more »

Twitter Math Puzzle and Solution

July 7, 2011
By

Yesterday I posted a very simple math puzzle to Twitter that I found in Jonathan Baron’s book, Thinking and Deciding. The puzzle is the following: Show that every number of the form ABC,ABC is divisible by 13. The puzzle comes up in Baron’s book as an example of an “insight problem” in which one goes

Read more »
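One way to see why the puzzle in the excerpt above holds (a short derivation added here, not necessarily the route taken in the full post): writing the same three-digit block twice is the same as multiplying it by 1001, and 1001 has 13 as a factor.

```latex
\[
\overline{ABC{,}ABC}
  = 1000 \cdot \overline{ABC} + \overline{ABC}
  = 1001 \cdot \overline{ABC}
  = 7 \cdot 11 \cdot 13 \cdot \overline{ABC}
\]
```

So every number of this form is divisible by 13, and also by 7 and 11.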

R: calculations involving months

July 7, 2011
By

Ask anyone how much time has elapsed since September last year and they’ll probably start counting on their fingers: “October, November…” and tell you “just over 9 months.” So, when faced as I was today with a data frame (named dates) like this: How to add a 7th column, with the number of months between

Read more »
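For the question raised in the excerpt above, here is a minimal base R sketch of one possible approach. The original data frame is not shown, so the column names start and end, the example dates, and the helper months_between() are all hypothetical; the full post may solve it differently.

```r
# Hypothetical data frame; the real `dates` data frame is not shown in the
# excerpt, so the column names `start` and `end` are assumptions.
dates <- data.frame(
  start = as.Date(c("2010-09-15", "2010-09-30", "2011-01-31")),
  end   = as.Date(c("2011-07-07", "2011-07-07", "2011-02-28"))
)

# Hypothetical helper: number of calendar-month boundaries crossed between
# two Date vectors. The day of the month is deliberately ignored.
months_between <- function(from, to) {
  from <- as.POSIXlt(from)
  to   <- as.POSIXlt(to)
  12 * (to$year - from$year) + (to$mon - from$mon)
}

# Add the month difference as a new column.
dates$months <- months_between(dates$start, dates$end)
```

This counts calendar months only, so 31 January to 28 February counts as one month; a definition based on elapsed days would need a different rule.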