Wanted: Big-data beta testers

July 15, 2010

We're nearing completion of the package of statistical tools for very large data sets that I gave an early preview of at R/Finance 2010. It will be released for Revolution R Enterprise later this year, but we're looking for some R users with big data sets to put the 1.0 version through its paces in the beta program and...

Read more »

R: a “Rock Star” for Business Intelligence

July 15, 2010

TDWI (The Data Warehousing Institute) recently published a comprehensive article about R and the increasing level of activity around it from commercial organizations, including Revolution Analytics. The article opens with: In statistical circles, "R" is the name of an open source programming language for statistical analysis. These days, it might also be shorthand for "rock star." Many of the companies...

Read more »

ggplot2 GSOC progress

July 15, 2010

(Written by Ian Fellows) The RForge build error has been fixed. The package can now be tried with: install.packages("Deducer",,"http://www.rforge.net",type="source")

Read more »

ggplot2 GSOC progress

July 15, 2010

(Written by Ian Fellows) The next installment of the vlog is here. All of ggplot2 has been implemented, including layers, scales, facets, and themes. For some reason rforge.net is having some trouble with my package, so you would need to build it from...

Read more »

Maps, Geocoding, and the R User Conference 2010

July 14, 2010

The R User Conference 2010 is scheduled for July 20-23, 2010. Wanna know where? Although there are more sophisticated methods of mapping with R, the maps package makes mapping activities straightforward. A bit of XML and Google...

Read more »

Creating a Presentation with LaTeX Beamer – Tables

July 14, 2010

Tables of information can be included in a LaTeX beamer presentation in the same way that they would be incorporated into any other LaTeX document. The tabular environment is used and, if necessary, the tables could be numbered but this probably doesn’t make as much sense as labelling and numbering tables within an article or

Read more »

R’s Normal Distribution Functions: rnorm and pals

July 14, 2010

The rnorm() function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands dnorm() (normal density function), pnorm() (cumulative...
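As a rough illustration of how these functions relate, here is a minimal sketch using base R. The mean of 10, standard deviation of 2, and the seed are arbitrary values chosen for the example, and qnorm() (the quantile function) is included for completeness even though the excerpt above does not name it.

```r
# Illustrative sketch of the normal-distribution helpers in base R.
# The parameters (mean = 10, sd = 2) and the seed are assumptions for this example.
set.seed(42)

x <- rnorm(1000, mean = 10, sd = 2)   # simulate 1000 draws from N(10, 2^2)

d <- dnorm(10, mean = 10, sd = 2)     # density at the mean: 1 / (sd * sqrt(2 * pi))
p <- pnorm(12, mean = 10, sd = 2)     # cumulative probability P(X <= 12)
q <- qnorm(p, mean = 10, sd = 2)      # quantile function, the inverse of pnorm

# Collect the results; the sample mean and sd should land near 10 and 2.
c(sample_mean = mean(x), sample_sd = sd(x), density = d, prob = p, quantile = q)
```

Since qnorm() inverts pnorm(), the last value recovers the 12 that was passed to pnorm().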

Read more »

Revolution at useR! 2010

July 14, 2010

Revolution Analytics is a proud sponsor of this year's annual R user conference, useR! 2010, and many members of the Revolution team will be there at Gaithersburg next week. We'll be hosting a booth at the conference where you can come up and meet the team, and see some of the new features being developed for Revolution R in...

Read more »

Homicide in North America

July 14, 2010

I'm surprised by how similar the trends are (excluding the drug war in Mexico). There were big decreases in the homicide rate in all three countries starting in the early nineties, which then slowed down around 2000. The homicide rates for Mexicans ...

Read more »

Short Open Source Q&A with Revolution Analytics

July 14, 2010

I recently e-mailed David Smith of Revolution Analytics with a few questions about their relationship with the R-project, and how they handle R’s source code. David mentioned, and I’m flattered, that my email motivated an additional page on the Revolution website. Beyond this, I have no other relationship with the company. I’d like to thank

Read more »

QQ plot of p-values in R using base graphics

July 14, 2010

Update Tuesday, September 14, 2010: Fixed the ylim issue, now it sets the y axis limit based on the smallest observed p-value. A while back Will showed you how to create QQ plots of p-values in Stata and in R using the now-deprecated sma package. A bi...
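The general idea of a p-value QQ plot in base graphics can be sketched as follows. This is not the post's own code: the simulated p-values, the variable names, and the use of ppoints() for the expected quantiles are all assumptions made for illustration. The y axis limit is derived from the smallest observed p-value, in the spirit of the update described above.

```r
# Minimal sketch of a QQ plot of p-values with base graphics.
# `pvals` is hypothetical data: 500 p-values drawn uniformly, as expected under the null.
set.seed(1)
pvals <- runif(500)

n        <- length(pvals)
observed <- -log10(sort(pvals))   # smallest p-value first, so largest -log10 value first
expected <- -log10(ppoints(n))    # expected uniform quantiles on the same scale

ymax <- max(observed)             # y limit set by the smallest observed p-value

plot(expected, observed,
     xlim = c(0, max(expected)), ylim = c(0, ymax),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, col = "red")         # reference line: observed equals expected
```

Points that rise above the reference line at the right-hand end indicate more small p-values than the uniform null would predict.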

Read more »

Multidimension bridge sampling (CoRe in CiRM [5])

July 13, 2010

Since Bayes factor approximation is one of my areas of interest, I was intrigued by Xiao-Li Meng’s comments during my poster in Benidorm that I was using the “wrong” bridge sampling estimator when trying to bridge two models of different dimensions, based on the completion (for and missing from the first model) When revising the

Read more »

House Data: 41k finance summaries from 2200 candidates

July 13, 2010

I’d like to announce a new project by Offensive Politics called House Data, launching today. House Data is a large-scale extract of the FEC Form 3 summary of receipts and disbursements (pdf warning) of every US House campaign from mid-2001 onward. The traditional source for campaign finance summaries is the Candidate Summary File, which is a

Read more »

Norman Nie on Internet Evolution radio

July 13, 2010

Revolution CEO Norman Nie just recorded a live podcast with Terry Sweeney of Internet Evolution Radio. In the 30-minute interview, Norman talked about the history of R, his time with SPSS, development plans for Revolution R, and how predictive analytics is impacting businesses, the Web, and even political opinion. You can hear the recorded interview at the link below....

Read more »

Area Plots with Intensity Coloring

July 13, 2010

I am not sure apeescape’s ggplot2 area plot with intensity colouring is really the best way of presenting the information, but it had me intrigued enough to replicate it using base R graphics. The key technique is to draw a gradient...

Read more »

Hierarchical Visualizations in R and the Javascript InfoVis Toolkit

July 12, 2010

I love R. It is really a great language and platform for statistical work and graphing. But every technology has its limits, and other tools can meet different needs. So in this post, I will start with R and move on to the JavaScr...

Read more »

A quantum leap (CoRe in CiRM [4])

July 12, 2010

Today, as I was trying to install SpatialEpi to use the Scotland lip cancer data in the last chapter of Bayesian Core, I realised my version of R, R Version 2.6.1, was hopelessly out of date! As I am also using Hardy Heron, a somewhat antiquated version of Ubuntu on my Mac, upgrading R took

Read more »

Charting the World Cup

July 12, 2010

Now that Spain has won the World Cup, it's interesting to go back and look at some metrics from the matches and see if we can tease out what characteristics made for a winning Cup team this time around. Fortunately, the Guardian's Data Blog has made a wealth of World Cup statistics available, with data on every player of...

Read more »

Example 8.2: Digits of Pi, redux

July 12, 2010

In example 8.1, we considered some simple tests for the randomness of the digits of Pi. Here we develop a different test and implement it. If each digit appears in each place with equal and independent probability, then the places between recurrences...

Read more »

Launch R document from Smultron / Fraise

July 12, 2010

A short code to launch R documents from Smultron / Fraise

Read more »

A robust Hotelling test…

July 12, 2010

Recently I was in need of testing a mean vector. I wrote a few lines of code in R and had it done perfectly. The Hotelling test is one of the least interesting tests to me. I never really figured out why… At that time I had some time to search more about it. One of the

Read more »

World Cup 2010 Statistics Plotted with R

July 11, 2010

Opta agreed to let the UK Guardian Data Blog publish 2010 World Cup Team and Player statistics. The data is available in a Google Docs spreadsheet. There are two tabs on this spreadsheet: one is PLAYERS, the other is TEAM st...

Read more »

using R + ess-remote with screen in emacs

July 11, 2010

Dear list, I brought up this issue before but a good solution never arose: being able to use screen on a remote server (so if something goes wrong on my side I can always resume that R session) inside of emacs in order to utilize ESS. The closest thing I found to a good work

Read more »

Area Plots with Intensity Coloring ~ el nino SST anomalies w/ ggplot2

July 10, 2010

I see many economic indicator graphs that show emphasis by shading in the area under the curve (with time on the x-axis). The shade is stronger at higher values (example). I did this in R below (ggplot2). This was a little more difficult than I’d expected. The color gradients are good to color each individual points

Read more »

CoRe in CiRM [3]

July 10, 2010

Still drudging along preparing the new edition of Bayesian Core. I am almost done with the normal chapter, where I also changed the Monte Carlo section to include specific tools (bridge) for evidence/Bayes factor approximation. Jean-Michel has now moved to the new hierarchical model chapter and analysed longitudinal  datasets that will constitute the core of

Read more »

World Government Data Store API (R and Ruby)

July 10, 2010

The UK Guardian Data Blog has great visualizations on the topics of the day, along with specific references to data sets and online resources in use. You can find out more about the origins and plans of this and related data sites in t...

Read more »