Monthly Archives: June 2010

Linear Modeling in R and the Hubble Bubble

June 22, 2010

Here is a scatter plot with the coordinate labels deliberately omitted (Figure 1). Do you see any trends? How would you model these data? It just so happens that this is arguably the most famous scatter plot in history. One aficionado, writing more than forty years after its publication, commented skeptically: "data points were consequently spread...

Read more »

Reaching escape velocity

June 22, 2010

Sample once from the Uniform(0,1) distribution. Call the resulting value x. Multiply this result by some constant c. Repeat the process, this time sampling from Uniform(0, cx). What happens when the multiplier is 2? How big does the multiplier have to be to force divergence? Try it and see: iters = 200; locations = rep(0, iters)
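The teaser's code is cut off, so here is a minimal sketch of the experiment it describes, not the post's own implementation. The function name `simulate_chain` and its arguments are assumptions made for illustration; only `iters = 200` and the `locations` vector come from the excerpt.

```r
# Sketch of the sampling chain described above (hypothetical helper).
# Step 1 draws from Uniform(0, 1); every later step draws from
# Uniform(0, multiplier * previous value).
simulate_chain <- function(multiplier, iters = 200) {
  locations <- numeric(iters)
  upper <- 1                          # first draw is Uniform(0, 1)
  for (i in seq_len(iters)) {
    locations[i] <- runif(1, min = 0, max = upper)
    upper <- multiplier * locations[i]  # next upper bound
  }
  locations
}

set.seed(1)                           # arbitrary seed, for repeatability only
low  <- simulate_chain(2)             # the multiplier asked about in the post
high <- simulate_chain(3)             # a larger multiplier for comparison
```

Plotting `log(low)` and `log(high)` against the iteration index is one way to see which way each path drifts. As a rough guide, each step multiplies the current value by the multiplier times a Uniform(0,1) draw, and the expected log of such a draw is -1, so the log of the path drifts by log(multiplier) - 1 per step. That suggests typical paths shrink when the multiplier is below e (about 2.718) and grow above it, although any single run of 200 steps can wander.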

Read more »

Analyzing competitive nordic skiing with R

June 22, 2010

Here's another great example of R being used to analyze sports data. Statistician and skier Joran Elias has started a project to analyze and visualize international cross country ski racing results, and he publishes his analysis at the blog Statistical Skier. All of the analyses are done using R (and for data, SQLite via the RSQLite package). As much...

Read more »

Employee productivity as function of number of workers revisited

June 22, 2010

We have a mild obsession with employee productivity and how it declines as companies get bigger. We have previously found that when you treble the number of workers, you halve their individual productivity, which is mildly scary. We revisit the analysis for the...
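To make the "treble the workers, halve the productivity" rule concrete, here is a small sketch that assumes the relationship is a power law, productivity proportional to size^b. The power-law form and the names `b` and `relative_productivity` are assumptions for illustration; the teaser does not say what model the authors fitted.

```r
# Assumed model: productivity(size) is proportional to size^b.
# Requiring that 3x the workers gives 1/2 the productivity means 3^b = 1/2.
b <- -log(2) / log(3)                 # about -0.63

# Productivity per worker relative to a baseline company,
# for a company size_ratio times larger (hypothetical helper).
relative_productivity <- function(size_ratio) size_ratio^b

relative_productivity(3)              # one half, by construction
relative_productivity(9)              # two treblings: one quarter
```

Under this assumption, a tenfold increase in headcount would leave each worker at roughly a quarter of the baseline productivity (`relative_productivity(10)`).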

Read more »

The most violent municipalities in Mexico (2008)

June 21, 2010

The top six most violent municipalities are near the US border. Ciudad Juárez is in a class by itself with 113 homicides per 100,000 people. José Azueta is the municipality where Zihuatanejo is located. Mazatlán, another popular tourist destination, also appears on the list. Lázaro Cárdenas is the largest seaport in Mexico and ever since the...

Read more »

R Layout command.

June 21, 2010

In the previous post I created a chart but could not figure out how to fit the legend in the chart area. Peter Carl pointed me to the layout command, which partitions the display area and allowed the legend to be included. Source code to produce the c...
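The post's source code is truncated here, so the following is only a sketch of the general technique using base R's `layout()`: split the device into a wide plotting panel and a narrow panel that holds nothing but the legend. The data, colours, and panel widths are made up for illustration.

```r
# Hypothetical data: three series over the same x values.
x <- 1:10
series <- cbind(a = x, b = x^2 / 5, c = 3 * sqrt(x))
cols <- c("black", "red", "blue")

op <- par(no.readonly = TRUE)         # remember settings so they can be restored

# Two panels side by side: panel 1 (chart) is four times as wide as panel 2.
n_panels <- layout(matrix(c(1, 2), nrow = 1), widths = c(4, 1))

# Panel 1: the chart itself.
matplot(x, series, type = "l", lty = 1, col = cols, ylab = "value")

# Panel 2: an empty plot with no margins, used only to hold the legend.
par(mar = c(0, 0, 0, 0))
plot.new()
legend("center", legend = colnames(series), col = cols, lty = 1, bty = "n")

par(op)                               # restore the original settings
```

Because the legend lives in its own panel, it cannot overlap the plotted lines; adjusting `widths` trades chart space for legend space.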

Read more »

MMDS 2010

June 21, 2010

The 2010 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2010) finished up this past Friday (June 18th) at Stanford. This was an exceptionally well organized conference: four days of mind-stretching talks on algorithm development and the challenges of working with massive data sets approached from almost every conceivable angle. The approximately 100 attendees were a diverse group...

Read more »

New blog from Rmetrics Foundation

June 21, 2010

The Rmetrics Foundation (the sharp minds behind the Rmetrics suite of packages for financial analysis in R) have just launched a new blog where you can keep up with the latest Rmetrics news. Amongst the recent news: a new eBook about data management of Indian financial market data, and a new interface between Rmetrics and AMPL. You can also...

Read more »