A true data-doodler – Christophe Ladroue (R ddply and plyr on Triathlon Results)

October 12, 2011

To me, this post by Christophe Ladroue personifies what data doodlers do. They take a dataset that is of interest to them (in his case, his triathlon results) and then they manipulate the numbers to see what insights can be drawn. Most bloggers only sho...

Typos in Introduction to Monte Carlo Methods with R

October 12, 2011

The two translators of our book in Japanese, Kazue & Motohiro Ishida, contacted me about some R code mistakes in the book. The translation is nearly done and they checked every piece of code in the book, an endeavour for which I am very grateful! Here are the two issues they have noticed (after incorporating

Bay Area R Users group has 1300 members

October 12, 2011

Impressive. You are not alone!

Percentage of Organic Farming Operations by State

October 12, 2011

With data from the USDA on certified organic farms for 2008, I created a map using the Geo Map function from the googleVis API package available in R. I’ve copied and pasted the image below as WordPress.com sites don’t support …

Slides and replay for "Introduction to R for SAS and SPSS users"

October 12, 2011

If you missed last week's webinar from Bob Muenchen, "Introduction to R for SAS and SPSS users", you missed a great overview of the R Project and how it compares to commercial statistical software. Bob's slides are below, and you can download the slides and replay from the Revolution Analytics website. Bob pointed out a couple of really useful...

What does it mean to be a Data Scientist?

October 12, 2011

Check out this talk by John Rauser of Amazon at the 2011 Strata Conference. It is an excellent intro to the field.

Multiply Imputing an Outcome Variable

October 12, 2011

Some scholars suggest that multiply imputing an outcome variable is incorrect. I use intuition and simulation to argue that multiply imputing outcomes can drastically improve estimates, even in the case of non-ignorable missingness.

Simulating data following a given covariance structure

October 12, 2011

Every year there are at least a couple of occasions when I have to simulate multivariate data that follow a given covariance matrix. For example, let’s say that we want to create an example of the effect of collinearity when …
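
One common way to do this in base R, sketched here as an assumption rather than as the post's own method, is to multiply independent standard normal draws by the Cholesky factor of the target covariance matrix. The matrix `sigma`, the sample size `n`, and the seed are illustrative values that do not come from the post.

```r
# Sketch: simulate multivariate normal data with a chosen covariance matrix.
# sigma, n and the seed are illustrative assumptions, not from the post.
set.seed(42)
sigma <- matrix(c(1.0, 0.8,
                  0.8, 1.0), nrow = 2, byrow = TRUE)
n <- 10000

# Independent standard normal draws, one column per variable
z <- matrix(rnorm(n * ncol(sigma)), ncol = ncol(sigma))

# chol() returns the upper-triangular factor U with t(U) %*% U == sigma,
# so the rows of z %*% U have covariance sigma
x <- z %*% chol(sigma)

# The sample covariance should be close to sigma
round(cov(x), 2)
```

With a strong off-diagonal value such as 0.8, the two simulated columns are highly collinear, which is the kind of setup the excerpt mentions.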

Generosity of Asian Central Banks

October 12, 2011

The only thing that separates the United States from Europe and the notorious PIIGS is the generosity of Asian Central Banks who have been consistently quantitatively easing since 1998 (Join the Reserves). Without this generos...

Tricks I learned today #1: as.integer() on factor levels

October 12, 2011

I normally work with fully numerical data, not categorical data. When reading a file with read.csv(), R seems to recognize such categories and marks the column as a factor with levels. This is useful indeed. However, I wanted to make a PCA biplot on this data, so wa...
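
The excerpt is cut off before the trick itself, so the following is only a sketch of how `as.integer()` behaves on a factor in general. The vector `f` is made-up data. Calling `as.integer()` directly returns the internal level codes, while going through `as.character()` first recovers the numbers printed in the labels.

```r
# Sketch with made-up data: as.integer() on a factor returns level codes.
f <- factor(c("10", "20", "10", "30"))

# Internal integer codes, one per level (levels are sorted: "10", "20", "30")
codes <- as.integer(f)

# To get the numbers shown in the labels, convert to character first
values <- as.integer(as.character(f))

codes   # 1 2 1 3
values  # 10 20 10 30
```

For a truly categorical column, the level codes are often what a numeric method needs as input, but they carry an arbitrary ordering that should be kept in mind when reading the result.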

What’s there to like about R?

October 12, 2011

Update 10/11/2011: There’s a good discussion on Reddit. Update 10/12/2011: Note manipulate package and highlight data.table package. The R statistical computing platform is a rising star that’s been gaining popularity and attention, but it gets no respect in the hood. It’s telling that a popular guide to R is called The R Inferno, and that advocacy pieces...

Why it doesn’t make sense to chew people out for not reading the help page

October 12, 2011

Karl Broman writes: Barry Rowlingson gave an interesting talk at UseR 2011, “Why R-help must die!” He suggested the Q-and-A type sites Stack Overflow (on programming) and Cross Validated (on statistics), both part of Stack Exchange. I haven’t used R-help recently but I do occasionally send people there. Just to see what was going on...

Identifying Records in Data Frame A That Are Not Contained In Data Frame B – A Comparison

October 12, 2011

Yesterday I posted my first question on Stack Overflow and apparently did a lot of things wrong, as I managed to get my question closed within hours: http://stackoverflow.com/questions/7728462/identify-records-in-data-frame-a-not-contained-in-data-frame-b I had collected 9 different solutions to the problem and made the mistake of putting them all within the original question space. So people complained and told me …
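
The nine solutions are not reproduced in this excerpt. As a minimal base R sketch of the underlying task (not necessarily one of the nine), each row can be collapsed into a single key string and the keys compared with `%in%`. The data frames `a` and `b` are made-up examples with identical columns.

```r
# Sketch with made-up data: rows of `a` that do not appear in `b`.
a <- data.frame(id = c(1, 2, 3, 4), grp = c("x", "y", "z", "w"))
b <- data.frame(id = c(2, 4), grp = c("y", "w"))

# Collapse every row into one string; "\r" is used as a separator that is
# unlikely to occur in the data itself
key_a <- do.call(paste, c(a, sep = "\r"))
key_b <- do.call(paste, c(b, sep = "\r"))

# Keep the rows of `a` whose key is absent from `b`
a_not_in_b <- a[!key_a %in% key_b, , drop = FALSE]
a_not_in_b
```

This assumes both data frames have the same columns in the same order; otherwise the keys are not comparable.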

Yet Another One.. Animation with saveHTML / saveVideo from Package ANIMATION

October 12, 2011

...some more playing with saveHTML, as.raster() and rasterImage(), producing a "flickering screen".

Online

October 12, 2011

Hello world, I decided to start blogging a bit to throw my weird R code examples at you ;-) Hope you’ll like it! Greetz, Janko

R related books: Traditional vs online publishing

October 12, 2011

How many R related books have been published so far? Who is the most popular publisher? How many other manuals, tutorials and books have been published online? Let's find out. A few years ago I used the publication list on r-project.org as an argument ...

Model decision tree in R, score in Base SAS

October 11, 2011

This code creates a decision tree model in R using party::ctree() and prepares the model for export from R to Base SAS, so SAS can score new records. SAS Enterprise Miner and PMML are not required, and Base SAS …

Le Monde puzzle [#743]

October 11, 2011

As Le Monde weekend has yet again changed its format (with so many more advertisements for luxurious items that I sometimes wonder whether or not this is the weekend edition of Le Monde!), it took me a while to locate the mathematical puzzle. The good news is there now is a science&techno leaflet with, at

Where to find data to use with R

October 11, 2011

(Contributing blogger Joe Rickert has put together a fantastic list of data sources suitable for use with R. If you're looking for data to use in the Applications of R Contest -- entries close October 31 -- this is a great resource for you -- Ed.) Hardly a day goes by without someone or something reminding me that we...

Setting plots side by side

October 11, 2011

This is simple example code to display side-by-side lattice plots or ggplot2 plots, using the mtcars dataset that comes with any R installation. We will display a scatterplot of miles per US gallon (mpg) on car weight (wt) next to …
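
For the lattice case, a minimal sketch might look like the following. The excerpt is cut off before naming the second plot, so the choice of `mpg ~ hp` for the right-hand panel is an assumption made purely for illustration. The `split` and `more` arguments of `print()` for trellis objects place each plot in a cell of a 2-by-1 grid.

```r
# Sketch: two lattice plots side by side on one page.
# The second plot (mpg vs. hp) is an assumption; the excerpt does not name it.
library(lattice)

p1 <- xyplot(mpg ~ wt, data = mtcars, main = "mpg vs. weight")
p2 <- xyplot(mpg ~ hp, data = mtcars, main = "mpg vs. horsepower")

# split = c(column, row, number of columns, number of rows)
print(p1, split = c(1, 1, 2, 1), more = TRUE)  # left cell, keep page open
print(p2, split = c(2, 1, 2, 1))               # right cell, finish page
```

Building the plot objects first and printing them afterwards keeps the layout decision separate from the plot definitions.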

R Bloggers widget in R Graph Gallery

October 11, 2011

Following the last post about the partnership with R Bloggers, Tal and I have added a small widget to the gallery main page to present links to recent posts on R Bloggers. It uses the WordPress API to grab information about the RSS feed generated by R Bl...

The Work of the 1 Percent and the 0.1 Percent

October 10, 2011

The Occupy Wall Street movement chants "We are the 99 percent, you are the 1 percent." It's a catchy refrain, and there are many excellent reasons to put the focus on Wall Street in the struggle for economic and political justice in the US. But even singling out one percent of the US means we

Top 50 Statistics blogs

October 10, 2011

TheBestColleges.org has just published their list of the "Top 50 Statistics Blogs of 2011", and I'm pleased to say that not only did our own Revolutions blog make the list, but it's in fine company with some truly excellent blogs. Several of my personal favourites made the list, including: Guardian columnist Ben Goldacre's Bad Science blog; The Dataists, a blog...

Upgrading R (and packages)

October 10, 2011

I tend not to upgrade R very often (running from 6 months to 1 year behind in version numbers) because I have to reinstall all packages: a real pain. A quick search shows that people have managed to come up with good …
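
The solutions the author found are not shown in this excerpt. One simple approach, sketched here under that caveat, is to save the names of the installed packages before upgrading and reinstall whatever is missing afterwards. The helper names `save_package_list()` and `missing_packages()` and the file path in the usage comments are hypothetical.

```r
# Sketch: carry the list of installed packages across an R upgrade.
# Both helper functions and the file path are hypothetical names.

# Run before upgrading: store the names of all installed packages
save_package_list <- function(path) {
  pkgs <- rownames(installed.packages())
  saveRDS(pkgs, path)
  invisible(pkgs)
}

# Run after upgrading: which previously installed packages are now missing?
missing_packages <- function(path) {
  setdiff(readRDS(path), rownames(installed.packages()))
}

# Usage (not run here, since it touches the file system and the network):
# save_package_list("~/installed_packages.rds")
# ... upgrade R ...
# to_install <- missing_packages("~/installed_packages.rds")
# if (length(to_install) > 0) install.packages(to_install)
```

Packages that were installed from sources other than the configured repositories would still need to be reinstalled by hand.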

An exercise in plyr and ggplot2 using triathlon results

October 10, 2011

I ran my last triathlon for this year a couple of weeks ago, in the beautiful town of Stratford-upon-Avon. The results were online the day after so I decided to have a look at my fellow competitors’ times, which gave …

Artist view of crimes in London

October 10, 2011

At first sight, one could think this picture is a scale model of some narrow mountains, like Bryce Canyon… Actually it represents crimes in East London, a cardboard artwork by the London artist Abigail Reynolds, called Mount Fear. Here is what can be read on the artist’s webpage: The terrain of Mount Fear is generated

single-column data frame

October 10, 2011

This is a trivial but very useful tip:

    > x=data.frame(a=1:4, c=5)
    > x
      a c
    1 1 5
    2 2 5
    3 3 5
    4 4 5
    > x
      a c
    1 1 5
    > x
    1 2 3 4
    > x
      a
    1 1
    2 2
    3 3
    4 4

where you can see that: to avoid a become a vector, rather than a...
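
The subsetting expressions in the console session above did not survive in this excerpt. A minimal sketch, assuming the tip is about the `drop` argument of `[`, shows how selecting a single column returns a vector by default and a one-column data frame with `drop = FALSE`.

```r
# Sketch: keep a single-column selection as a data frame.
# Assumes the tip concerns `drop = FALSE`; the excerpt lost the index syntax.
x <- data.frame(a = 1:4, c = 5)

# Default behaviour: one selected column collapses to a plain vector
as_vector <- x[, 1]

# drop = FALSE keeps the data frame structure, including the column name
as_frame <- x[, 1, drop = FALSE]

str(as_vector)
str(as_frame)
```

This matters in functions that later call `nrow()`, `names()` or `merge()` on the result, all of which expect a data frame.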

k-mean clustering + heatmap

October 10, 2011

If you want more info about clustering, I have another post about "Clustering analysis and its implementation in R". Here is the link: http://onetipperday.blogspot.com/2012/04/clustering-analysis-2.html

Several R functions in this...
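
The functions the post goes on to list are cut off here. As a rough base R sketch of the combination named in the title, the following clusters the rows of a scaled matrix with `kmeans()` and draws a heatmap with rows grouped by cluster. The dataset (`mtcars`), the number of clusters, and the colors are all illustrative assumptions.

```r
# Sketch: k-means clustering followed by a heatmap ordered by cluster.
# mtcars, centers = 3 and the colors are illustrative assumptions.
set.seed(1)
m <- scale(as.matrix(mtcars))          # standardize each column

km <- kmeans(m, centers = 3, nstart = 10)

# Order rows so that members of the same cluster sit next to each other
ord <- order(km$cluster)
cluster_cols <- c("tomato", "steelblue", "darkgreen")[km$cluster[ord]]

# Rowv = NA and Colv = NA switch off the hierarchical reordering, so the
# k-means ordering is what the heatmap shows
heatmap(m[ord, ], Rowv = NA, Colv = NA, scale = "none",
        RowSideColors = cluster_cols)
```

The side color bar marks which cluster each row belongs to, which makes the block structure easier to read.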

Reading HTML pages in R for text processing

October 10, 2011

I was talking with one of my colleagues about doing some text analysis (which, by the way, I have never done before), for which the first issue is to get text into R. Not any text, but files that can be accessed …
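
The approach the post takes is not visible in this excerpt. A very rough base R sketch of the general idea is to read the page as lines of text and strip the markup with regular expressions. The helper `html_to_text()` is a hypothetical name, and the sample page is made up. Regex stripping is crude and a proper HTML parser would be more robust for real pages.

```r
# Sketch: turn HTML source into plain text with base R only.
# html_to_text() is a hypothetical helper; regex stripping is a crude shortcut.
html_to_text <- function(lines) {
  html <- paste(lines, collapse = " ")
  text <- gsub("<[^>]+>", " ", html)   # replace every tag with a space
  text <- gsub("\\s+", " ", text)      # collapse runs of whitespace
  trimws(text)
}

# Made-up page; for a real page the lines could come from
# readLines("http://...") or readLines("some_file.html")
page <- c("<html><body>",
          "<h1>Title</h1>",
          "<p>Some <b>bold</b> text.</p>",
          "</body></html>")

html_to_text(page)
```

The resulting string can then be split into words or sentences for whatever text processing comes next.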
