R from source

July 11, 2011

The following are notes for myself. I like to use the bleeding edge version of R:

    svn checkout https://svn.r-project.org/R/trunk/ r-devel
    cd r-devel
    ./tools/rsync-recommended
    ## use the following to update sources:
    svn update
    ## pre-reqs
    sudo apt-get build-dep r-base
    #sudo apt-get install gcc g++ gfortran libreadline-dev libx11-dev xorg-dev
    #sudo apt-get install texlive texinfo
    ./configure
    make
    sudo...

In case you missed it: June Roundup

July 11, 2011

In case you missed them, here are some articles from June of particular interest to R users. Highlights of presentations from the R/Finance 2011 conference. Trulia uses R and statistical models to map local crime. Resources for data mining with R. K-means clustering on large data sets with the RevoScaleR package. Revolution Analytics' CTO David Champagne writes on real-time...

The foundations of Statistics: a simulation-based approach

July 11, 2011

“We have seen that a perfect correlation is perfectly linear, so an imperfect correlation will be ‘imperfectly linear’.” (page 128) This book has been written by two linguists, Shravan Vasishth and Michael Broe, in order to teach statistics “in areas that are traditionally not mathematically demanding” at a deeper level than traditional textbooks “without using

Sir Sun Drop

July 11, 2011

Okay, so one of my best friends, Sir Kris "Wespro" Wesslen, has started a new blog, and I think it's so hilariously decked out with pompous amounts of hilarity that even a blind and brainless mouse would chuckle out of amusement. Please check it out here. ...

XLConnect 0.1-5

July 11, 2011

Mirai Solutions GmbH (http://www.mirai-solutions.com) is pleased to announce the availability of XLConnect 0.1-5. This release adds the following new features: Support for setting/getting cell formulas. See methods set/getCellFormula. Support for setting/getting the force formula recalculation flag on worksheets. See methods …

The Road to Default: Debt Ratio Comparison’s With Previous Episodes

July 11, 2011

In 2009, Carmen M. Reinhart and Kenneth S. Rogoff wrote a book titled "This Time Is Different" about debt and financial crises. One of their charts will provide a benchmark for us in our analysis. This chart can be found on page 121 of the book ...

You can scrap it and write something better but let me keep R ;)

July 11, 2011

Ross Ihaka (via Xi'an -- thanks ;) ) gives a nice example to show why R is basically impossible to optimize:

> f = function() {
>   if (runif(1) > 0.5) {
>     x = 10
>   }
>   ...
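The excerpt above is cut off, so here is a separate, hypothetical illustration in the same spirit (the function `g` and the global `x` are my own assumptions, not Ihaka's code). It sketches why a compiler cannot decide ahead of time whether a name refers to a local or a global variable:

```r
x <- 1  # global binding (assumed for this illustration)

# Hypothetical function: whether `x` is local or global inside g()
# depends on a random draw, so it is only known once the call runs.
g <- function() {
  if (runif(1) > 0.5) {
    x <- 10  # creates a local x, but only on some calls
  }
  x          # returns the local 10 if it was created, else the global 1
}

set.seed(1)
results <- replicate(1000, g())
table(results)  # a mix of 1s and 10s
```

Each call resolves `x` differently at run time, which is the kind of behaviour that makes static optimization hard.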

Example 9.2: Transparency and bivariate KDE

July 11, 2011

In Example 9.1, we showed a binning approach to plotting bivariate relationships in a large data set. Here we show more sophisticated approaches: transparent overplotting and formal two-dimensional kernel density estimation. We use the 10,000 simulat...
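As a rough sketch of the two approaches named above (not the code from the original example), the following uses made-up simulated data. The variables `x` and `y`, the alpha value, and the grid size are all assumptions chosen for illustration; `kde2d` comes from the MASS package that ships with R.

```r
# Simulated data, assumed here only for illustration
set.seed(42)
n <- 10000
x <- rnorm(n)
y <- 0.6 * x + rnorm(n, sd = 0.8)

# Transparent overplotting: regions with many points appear darker
plot(x, y, pch = 16, col = rgb(0, 0, 1, alpha = 0.05))

# Two-dimensional kernel density estimate on a 50 x 50 grid
dens <- MASS::kde2d(x, y, n = 50)

# Overlay the estimated density as contour lines
contour(dens, add = TRUE)
```

The transparency level usually needs tuning to the sample size; smaller alpha values suit larger data sets.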

Tamino’s Method: Regional Temperatures

July 11, 2011

Tamino over at Open Mind has a new post detailing his approach for calculating temperature averages. See his post here. His method is based on the Berkeley method, as he notes, and he uses it primarily for calculating regional or local temperature averages. Read his post for the math details behind the approach. I got

Creating 3D geographical plots in R using RGL

July 11, 2011

I've been playing around with the rgl package in the last week, as part of an ongoing quest to come up with nice-looking (but more importantly, useful) data visualisations. It's a nice little package, and once you've run through the excell...

Testing an S&P 500 prediction

July 10, 2011

If a particular prediction comes true, how surprised should we be?

The prediction

The page that sparked my curiosity tells of a prediction made a year ago that the S&P 500 would beat its historic high by the end of 2011. It says that at the point the prediction was made, the level of the …

Reproducible blogging

July 10, 2011

As this is a fact-based blog, the posts here very often contain diagrams and data tables. To enable you to reproduce the results and insights, I include the computations as computer code. Most blog posts I write are markdown text combined (or weaved) with computer code written in the R language. I created a small package mdtools that puts the...

Now I’m R-Blogging

July 10, 2011

Today a lot of great mails arrived in my inbox. In one of them I read: “I’ve just added your feed to the site.” Where did this mail come from? The sender of the email was Tal Galili. He is a researcher in biostatistics at Tel Aviv University and very active around the internet.

Migrating from SPSS/Excel to R

July 10, 2011

In this post, I give an outline for those interested in migrating from using SPSS and Excel for data processing/analysis …

Heatmap tables done better, in Sweave and latex

July 10, 2011

I wrote before about using heatmap tables to combine the strengths of tables and graphics for nominal data. Here is a neat approach using Sweave and latex to produce an effect like in the picture. This latex code is self-contained. Just save it as myfile.Rnw, run Sweave("myfile.Rnw") from inside R and then pdflatex myfile.tex
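To make the workflow concrete, here is a minimal sketch of the Sweave step. The file contents below are a made-up placeholder, not the heatmap-table code from the post, and running in a temporary directory is my own assumption to keep the example tidy.

```r
# Work in a temporary directory so no files are left behind
old_wd <- setwd(tempdir())

# A tiny placeholder .Rnw file: LaTeX with one embedded R chunk
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<echo=FALSE, results=tex>>=",
  "cat(\"The mean of 1:10 is\", mean(1:10), \"\\n\")",
  "@",
  "\\end{document}"
)
writeLines(rnw, "myfile.Rnw")

# Sweave runs the R chunks and writes myfile.tex to the working directory
utils::Sweave("myfile.Rnw")
tex_exists <- file.exists("myfile.tex")

setwd(old_wd)
```

The last step happens outside R and needs a LaTeX installation: run pdflatex myfile.tex in the directory that holds the generated file.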

The Road to Default: Who’s getting the most screwed?

July 10, 2011

Let's take a look at who gets the most screwed (who loses the most money) when bond prices collapse and the United States defaults. Well, until recently only about 55% of Treasuries were held domestically. The rest was externally held by places like Japa...

More fun with the Failed States Index (and the State Fragility Index)

July 9, 2011

So the other day’s experiment with the Failed States Index and the Polity data didn’t yield the linear trend I had originally expected. After all, the two measure fundamentally distinct things. But perhaps there’s another dataset which will match linearly. The same people who made Polity also put out a dataset called the State Fragility

R at Wikimedia

July 9, 2011

Last year Wikipedia rolled out a pilot program to use Wikipedia article creation as an assignment in the classroom. Students wrote articles on a topic area and, rather than turning them in to a professor and forgetting about them, uploaded them to Wikipedia and exposed them to readers around the world. 24 schools inside the…

The Road to Default: We Crumble Like A Cookie

July 9, 2011

What should we be expecting when the United States defaults and how will this unsavory process unfold? Well, for one thing, anticipate the downfall with a downgrade in the credit rating. According to a recent Bloomberg article, in the event of a U.S. ...

Index decomposition with R

July 9, 2011

A few days ago, I finally finished a small package ida. It enables you to analyse contributions of underlying factors to the change in an aggregate, using methods based on index number theory. These methods became popular for, but are not restricted to, investigating the change of CO2 emissions. Here is a chart that shows what the change of population,...
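To give a feel for what such a decomposition does, here is a small base-R sketch. It does not use the ida package (whose interface is not shown in this excerpt); instead it assumes the additive log-mean Divisia (LMDI) method, one common index-number approach, and the figures are invented for illustration.

```r
# Logarithmic mean of two positive numbers
logmean <- function(a, b) ifelse(a == b, a, (a - b) / (log(a) - log(b)))

# Made-up factor values for two years (not real data)
year0 <- c(population = 100, gdp_per_capita = 20, co2_per_gdp = 0.50)
year1 <- c(population = 110, gdp_per_capita = 24, co2_per_gdp = 0.45)

# The aggregate (emissions) is the product of the three factors
total0 <- prod(year0)
total1 <- prod(year1)

# Additive LMDI: each factor's contribution to the change in the aggregate
contrib <- logmean(total1, total0) * log(year1 / year0)
contrib

# The contributions add up exactly to the total change
sum(contrib)
total1 - total0
```

The appeal of this method is that it leaves no unexplained residual: the factor contributions sum to the change in the aggregate.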

R on the cloud

July 9, 2011

Just as scientists should never really have to think much about statistics, I feel that, in an ideal world, statisticians would never have to worry about computing. In the real world, though, we have to spend a lot of time building our own tools. It would be great if we could routinely run R with speed and memory limitations...

Notes on Engineering Data Analysis (with R and ggplot2)

July 8, 2011

Hadley Wickham gave a Google Tech Talk a couple weeks back titled Engineering Data Analysis (with R and ggplot2). These are my notes. The data analysis cycle is to iteratively transform, visualize and model. Leading into the cycle is data access an...

Blog in motion

July 8, 2011

In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries

The Road to Default: Let’s Look at the Damage with a Rant.

July 8, 2011

The following graph shows Real GDP as a percentage of the Gross Federal Debt. FRED is the resource I frequently use for United States financial data and it serves us well here. What is Gross Federal Debt? Well, it's total government debt out...

NYT on the importance of reproducible research

July 8, 2011

Yesterday's New York Times includes a great article on the failure of some genetic tests for cancer detection, and the flaws in the research that led to them. The article features quotes from Keith Baggerly of MD Anderson Cancer Center, and includes a photo of him and colleague Kevin Coombes in front of a page of R code: The...

The Rebirth

July 8, 2011

Hey guys, I know that I have been gone for a while now. I just came back from a month-long Euro adventure, so have no fear, I have plenty of time to devote to blogging now. From this day forward, X.U. Economics will be known as The Dancing Ec...

Censoring on one end, “outliers” on the other, what can we do with the middle?

July 8, 2011

This post was written by Phil. A medical company is testing a cancer drug. They get 16 genetically identical (or nearly identical) rats that all have the same kind of tumor, give 8 of them the drug and leave 8 untreated…or maybe they give them a placebo, I don’t know; is there a placebo

installing R 2.13.1 on Amazon EC2’s “Amazon Linux” AMI #rstats

July 8, 2011

Condensed from this post (and comments) on David Chudzicki’s blog, tweaked, and updated for R-2.13.1. Assumes you’re starting with a virgin “Amazon Linux” AMI. I picked “Basic 64-bit Amazon Linux AMI 2011.02.1 Beta” (AMI Id: ami-8e1fece7) because it was marked as free tier eligible on the “Quick Start” tab of AWS’s “Launch Instance” dialog box:
