New R User Group in Raleigh-Durham

August 18, 2010

New local R user groups keep popping up on a regular basis, which is great to see. The latest one is deep in SAS territory: it's the Raleigh-Durham-Chapel Hill R Users Group. They don't have any meetings scheduled just yet (but when they do, their gracious meetup hosts are Carrboro Creative Coworking). So if you're in the...

Read more »

R be dragons

August 18, 2010

Hic sunt dracones used to be placed on maps as a way to denote dangerous or otherwise unexplored territory. We might as well write it all over R-related material used in introductory classes, because students seem to be really (…)

Read more »

Distributions in R

August 18, 2010

One of the R language's most powerful features is its ability to deal with random distributions: not just generating random numbers from various distributions (based on a very powerful pseudo-random number generator), but also calculating densities, probabilities, and quantiles. John Cook provides a handy reference chart listing all of the distributions supported by standard R (reproduced below -- and...

Read more »
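
In R, the four operations mentioned above follow a naming pattern: a prefix of d, p, q, or r (density, probability, quantile, random draw) plus the distribution name, as in dnorm, pnorm, qnorm, and rnorm. As a rough illustration only, and not taken from the post or from John Cook's chart, here is the same set of four operations for the standard normal, sketched in Python with the standard library's statistics.NormalDist. The variable names and the seed value are assumptions made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Density at a point (the role dnorm(0) plays in R).
density = std_normal.pdf(0.0)

# Cumulative probability P(X <= 1.96) (the role of pnorm(1.96)).
probability = std_normal.cdf(1.96)

# Quantile: the x with P(X <= x) = 0.975 (the role of qnorm(0.975)).
quantile = std_normal.inv_cdf(0.975)

# Five pseudo-random draws (the role of rnorm(5)); the seed is arbitrary
# and only makes the draws repeatable within Python.
draws = std_normal.samples(5, seed=42)

print(density, probability, quantile)
print(draws)
```

The quantile and cumulative probability are inverses of each other, which is why 1.96 and 0.975 appear as a pair here.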

Bookshelf remodelling

August 18, 2010

I found time to read Gelman and Hill’s “Data Analysis Using Regression and Multilevel / Hierarchical Models”… Now, please do yourself a favour and get it (the paperback version, of course). Even for experienced or intermediate readers (myself) this will be a treat for your eyes and neurons. PS: (Confession) I didn’t like the Bayesian ...

Read more »

Twifficiency Scores

August 18, 2010

Neil Kodner wrote a great post this morning about yesterday’s Twifficiency scores outbreak. He grabbed all the auto-tweeted scores he could find and plotted their distribution. I was struck by the asymmetry of the resulting distribution, which you can see below: Thankfully, Neil handed me the raw data for his plot, so I was able

Read more »

state-by-state pendulum

August 17, 2010

By popular demand (!), my state-by-state pendulum (pendula?) for 2010 is up (big PDF), just in time for the election.  550px wide JPG version is inline, below. This follows the same formatting I used in the 2007 edition. We start with the 2PP ALP vote shares recorded at the last election (incorporating changes from electoral

Read more »

Ed Burnette on Software Patents

August 17, 2010

Ed Burnette makes a point that hits home, with regard to software patents, and how engineers and programmers of modern companies are now being asked to write them: Unfortunately, the joke is on all of us. It’s on our economy, as we let patents choke down innovation and increase fear, uncertainty, and doubt in an

Read more »

Programming Language Popularity: StackOverflow and Ohloh

August 17, 2010

In the following example, programming language popularity is measured based upon two data sets.  The first is the number of  contributors associated with a language on ohloh.net.  The second is tag usage at stackoverflow.c...

Read more »

Unit Testing in R: The Bare Minimum

August 17, 2010

Introduction: This week I decided to start unit testing my R code, so I taught myself the bare minimum about the RUnit and testthat packages to be able to use them. Here’s what I found necessary to get started writing tests with both packages. RUnit Basic Example: I’m going to assume that you’ve got a

Read more »

Animated Heatmap of WikiLeaks Report Intensity in Afghanistan

August 17, 2010

Visualisation of Activity in Afghanistan using the Wikileaks data from Mike Dewar on Vimeo. The latest visualization of the WikiLeaks data compiled by our group is an animation of the intensity of report observations in Afghanistan over the six-year period in the WikiLeaks data. Team member Mike Dewar did the vast majority of work for

Read more »

New R User Group in Singapore

August 17, 2010

There's yet another R user group starting, this time in Singapore. Their first meetup is next week, on Wednesday August 25. If you're in Singapore, come along and meet your fellow R users! meetup.com: R User Group - SG

Read more »

Deducer: R and ggplot2 GUI

August 16, 2010

Last year I introduced you to R Commander, a nice graphical user interface (GUI) for R for those of you who are still hesitant to leave the clicky-box style of research à la SPSS for the far superior reproducible research using R. As most of you know...

Read more »

IEOR Tools Tutorial: Learning XML with R

August 16, 2010

I have been using a lot of R lately in my work.  R (main site) is an open source statistical computing platform.  Saying R is only used for statistics does not do it justice.  I am finding it to be a really powerful statistical and optim...

Read more »

Goals per Game in MLS

August 16, 2010

I promised something related to Major League Soccer and here it is.  Caveat:  It’s not much.  Why so sparse?  (1) The data is a bit messy due to teams folding, expansion, name changes, etc.  (2)  I was backpacking all weekend and didn’t have time to work on this side project.  Yes, I have a real

Read more »

Rose plot using Deducer's ggplot2 plot builder

August 16, 2010

The (excellent!) LearnR blog had a post today about making a rose plot in ggplot2. Following today’s announcement by Ian Fellows of the release of the new version of Deducer (0.4), which offers strong support for ggplot2 through a GUI plot builder, Ian also sent an e-mail where he shows how to create a rose plot using the new...

Read more »

In case you missed it: July Roundup

August 16, 2010

In case you missed them, here are some articles from July of particular interest to R users. We reviewed the updates to Hadley Wickham's ggplot2 and plyr packages. We linked to an article about R co-creator Ross Ihaka in New Zealand's Sunday Star Times. We noted that the presentations from the R/Finance 2010 conference are available for download. We...

Read more »

Charting the performance of cricket all-rounders – IT Botham

August 16, 2010

Cricket is a sport that generates a large volume of performance data and corresponding debate about the relative qualities of various players over their careers and in relation to their contemporaries. The cricinfo website has an extensive database of statistics for professional cricketers that can be searched to access the information in various formats. As an

Read more »

GillespieSSA 0.5-4 is released

August 16, 2010

I just uploaded GillespieSSA 0.5-4 to CRAN. It should be just a matter of days before it has propagated itself across all mirrors. This release consists of minor revisions with no (intended) changes in functionality. The main change (and it is …

Read more »

GPU Computing with R

August 16, 2010

Statistics is computationally intensive. Routine statistical tasks such as data extraction, graphical summary, and technical interpretation all require heavy use of modern computing machinery. Obviously, these tasks can benefit greatly from a paralle...

Read more »

ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)

August 16, 2010

Ian Fellows, a hard-working contributor to the R community (and a cool guy), has announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so). This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality. Following is the e-mail he sent out with...

Read more »

Intraday volatility of OMX Baltic stocks

August 16, 2010

Usually, intraday volatility exhibits a “smile”: it is high at open and close and lower during the trading day (charts: DJI index, 5 min. intervals, CET time; MOS stock, 5 s. intervals, CET time). Because many readers of this blog are trading Nasdaq OMX Baltic stocks, it is worth sharing my findings about volatility in

Read more »

Gone Guerrill_ R on Our Data

August 16, 2010

Here's a summary of some things we learnt about applying R to computer performance and capacity planning data in the GDAT Class last week. Neural nets pkg nnet applied to CPU performance data in the Ripley and Venables book (see Section 8.10). How to do stacked plots that Jim calls "spark plots." Jim told...

Read more »

Project Euler Problem #21

August 16, 2010

This is a solution for problem 21 on the Project Euler website. It consists of finding the sum of all the amicable numbers under 10000. This was pretty easy to solve, but the solution could probably be improved quite a bit. Solution #1 in R is as follo...

Read more »
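
For reference, two different numbers a and b are amicable when the sum of the proper divisors of a equals b and the sum of the proper divisors of b equals a. The sketch below is an independent illustration in Python, not the R solution from the post, which is cut off in the excerpt above. It assumes a sieve-style table of divisor sums plus a trial-division fallback, and the function names are made up for this example.

```python
def proper_divisor_sums(limit):
    """Return a list s where s[n] is the sum of proper divisors of n, for n < limit."""
    sums = [0] * limit
    # Add each divisor d to every multiple of d that is larger than d itself.
    for d in range(1, limit // 2 + 1):
        for multiple in range(2 * d, limit, d):
            sums[multiple] += d
    return sums


def divisor_sum(n):
    """Sum of proper divisors of n by trial division (for values outside the table)."""
    if n < 2:
        return 0
    total = 1
    i = 2
    while i * i <= n:
        if n % i == 0:
            total += i
            if i != n // i:
                total += n // i
        i += 1
    return total


def amicable_sum(limit):
    """Sum of all amicable numbers strictly below limit."""
    table = proper_divisor_sums(limit)
    total = 0
    for a in range(2, limit):
        b = table[a]
        if b == a:
            continue  # perfect numbers are excluded: a and b must differ
        partner_sum = table[b] if b < limit else divisor_sum(b)
        if partner_sum == a:
            total += a
    return total


print(amicable_sum(10000))
```

The table costs roughly limit * log(limit) additions, which is cheaper than running trial division on every number separately. The fallback only matters when a number's partner lies at or above the limit.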

Consultants’ Chart in ggplot2

August 16, 2010

Excel Charts Blog posted a video tutorial on how to create a circumplex, rose, or doughnut chart in Excel. Apparently this type of chart is very popular in the consulting industry, hence the “Consultants’ Chart”. It is very easy to make this chart in Excel 2010, but it involves countless clicks and

Read more »

A quick analysis of the trends in the number of weddings in France (1975–2010)

August 15, 2010

I’m currently planning my wedding, and my fiancée and I were discussing whether more or fewer couples are getting married over time. It turns out that this information is quite easy to get via INSEE, a French institute that (…)

Read more »

Downloading DNA sequences into R

August 15, 2010

A while ago, a friend of mine needed to download a number of different DNA sequences from Genbank, the online repository for the vast majority of DNA sequences read from all organisms by labs all over the world. This is not a problem. The "ape" package in R has a nifty function, read.GenBank(), that downloads the...

Read more »
