state-by-state pendulum

August 17, 2010

By popular demand (!), my state-by-state pendulum (pendula?) for 2010 is up (big PDF), just in time for the election. A 550px-wide JPG version is inline below. This follows the same formatting I used in the 2007 edition. We start with the 2PP ALP vote shares recorded at the last election (incorporating changes from electoral

Ed Burnette on Software Patents

August 17, 2010

Ed Burnette makes a point that hits home, with regard to software patents, and how engineers and programmers of modern companies are now being asked to write them: Unfortunately, the joke is on all of us. It’s on our economy, as we let patents choke down innovation and increase fear, uncertainty, and doubt in an

Programming Language Popularity: StackOverflow and Ohloh

August 17, 2010

In the following example, programming language popularity is measured based upon two data sets.  The first is the number of  contributors associated with a language on ohloh.net.  The second is tag usage at stackoverflow.c...

Unit Testing in R: The Bare Minimum

August 17, 2010

Introduction This week I decided to start unit testing my R code, so I taught myself the bare minimum about the RUnit and testthat packages to be able to use them. Here’s what I found necessary to get started writing tests with both packages. RUnit Basic Example I’m going to assume that you’ve got a

Animated Heatmap of WikiLeaks Report Intensity in Afghanistan

August 17, 2010

Visualisation of Activity in Afghanistan using the Wikileaks data from Mike Dewar on Vimeo. The latest visualization of the WikiLeaks data compiled by our group is an animation of the intensity of report observations in Afghanistan over the six-year period in the WikiLeaks data. Team member Mike Dewar did the vast majority of work for

New R User Group in Singapore

August 17, 2010

There's yet another R user group starting, this time in Singapore. Their first meetup is next week, on Wednesday August 25. If you're in Singapore, come along and meet your fellow R users! meetup.com: R User Group - SG

Deducer: R and ggplot2 GUI

August 16, 2010

Last year I introduced you to R Commander, a nice graphical user interface (GUI) for R for those of you who are still hesitant to leave the clicky-box style of research à la SPSS for the far superior reproducible research using R. As most of you know...

IEOR Tools Tutorial: Learning XML with R

August 16, 2010

I have been using a lot of R lately in my work.  R (main site) is an open source statistical computing platform.  Saying R is only used for statistics does not do it justice.  I am finding it to be a really powerful statistical and optim...

Goals per Game in MLS

August 16, 2010

I promised something related to Major League Soccer and here it is.  Caveat:  It’s not much.  Why so sparse?  (1) The data is a bit messy due to teams folding, expansion, name changes, etc.  (2)  I was backpacking all weekend and didn’t have time to work on this side project.  Yes, I have a real

Rose plot using Deducer's ggplot2 plot builder

August 16, 2010

The (excellent!) LearnR blog had a post today about making a rose plot in ggplot2. Following today’s announcement by Ian Fellows of the release of the new version of Deducer (0.4), which offers strong ggplot2 support through a GUI plot builder, Ian also sent an e-mail where he shows how to create a rose plot using the new...

In case you missed it: July Roundup

August 16, 2010

In case you missed them, here are some articles from July of particular interest to R users. We reviewed the updates to Hadley Wickham's ggplot2 and plyr packages. We linked to an article about R co-creator Ross Ihaka in New Zealand's Sunday Star Times. We noted that the presentations from the R/Finance 2010 conference are available for download. We...

Charting the performance of cricket all-rounders – IT Botham

August 16, 2010

Cricket is a sport that generates a large volume of performance data and corresponding debate about the relative qualities of various players over their careers and in relation to their contemporaries. The cricinfo website has an extensive database of statistics for professional cricketers that can be searched to access the information in various formats. As an

GillespieSSA 0.5-4 is released

August 16, 2010

I just uploaded GillespieSSA 0.5-4 to CRAN. It should be just a matter of days before it has propagated across all mirrors. This release consists of minor revisions with no (intended) changes in functionality. The main change (and it is …

GPU Computing with R

August 16, 2010

Statistics is computationally intensive. Routine statistical tasks such as data extraction, graphical summary, and technical interpretation all require heavy use of modern computing machinery. Obviously, these tasks can benefit greatly from a paralle...

ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)

August 16, 2010

Ian Fellows, a hard-working contributor to the R community (and a cool guy), today announced the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so). This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality. Following is the e-mail he sent out with...

Intraday volatility of OMX Baltic stocks

August 16, 2010

Usually, intraday volatility exhibits a “smile” – it is high at open and close and it is lower during the trading day. DJI index, 5 min. intervals, CET time: MOS stock, 5 s. intervals, CET time: Because many readers of this blog are trading Nasdaq OMX Baltic stocks, it is worth sharing my findings about volatility in

Gone Guerrill_ R on Our Data

August 16, 2010

Here's a summary of some things we learnt about applying R to computer performance and capacity planning data in the GDAT Class last week. Neural nets pkg nnet applied to CPU performance data in the Ripley and Venables book (see Section 8.10). How to do stacked plots that Jim calls "spark plots." Jim told...

Project Euler Problem #21

August 16, 2010

This is a solution for problem 21 on the Project Euler website. It consists of finding the sum of all the amicable numbers under 10000. This was pretty easy to solve, but the solution could probably be improved quite a bit. Solution #1 in R is as follo...
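The excerpt stops before the author's R solution, so that code is not reproduced here. As a rough illustration of the problem itself, the following is a minimal sketch in Python (not R); the function names `sum_proper_divisors` and `amicable_sum` are hypothetical and chosen only for this example. It uses plain trial division, which is not necessarily the approach the original post takes.

```python
def sum_proper_divisors(n):
    """Sum of the divisors of n that are strictly smaller than n."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    i = 2
    while i * i <= n:
        if n % i == 0:
            total += i
            partner = n // i
            if partner != i:  # avoid counting a square root twice
                total += partner
        i += 1
    return total


def amicable_sum(limit):
    """Sum of all amicable numbers strictly below limit.

    a is amicable if d(a) = b, d(b) = a and a != b,
    where d is the sum of proper divisors.
    """
    total = 0
    for a in range(2, limit):
        b = sum_proper_divisors(a)
        if b != a and sum_proper_divisors(b) == a:
            total += a
    return total


if __name__ == "__main__":
    print(amicable_sum(10000))
```

Trial division up to the square root keeps the sketch short; for much larger limits, a sieve that accumulates divisor sums for all numbers at once would likely be faster.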

Consultants’ Chart in ggplot2

August 16, 2010

Excel Charts Blog posted a video tutorial on how to create a circumplex, rose, or doughnut chart in Excel. Apparently this type of chart is very popular in the consulting industry, hence the “Consultants’ Chart”. It is very easy to make this chart in Excel 2010, but it involves a countless number of clicks and

A quick analysis of the trends in the number of weddings in France (1975–2010)

August 15, 2010

I’m currently planning my wedding, and my fiancée and I were discussing whether more or fewer couples are getting married over time. It turns out that this information is quite easy to get via INSEE, a French institute that (…)

Downloading DNA sequences into R

August 15, 2010

A while ago, a friend of mine needed to download a number of different DNA sequences from Genbank, the online repository for the vast majority of DNA sequences read from all organisms by labs all over the world. This is not a problem. The "ape" package in R has a nifty function, read.GenBank(), that downloads the...

Two Surprising Things about R

August 14, 2010

I see that it’s been over a year since my last post!  I have a backlog of blog post ideas, but something else always seems to have higher priority.   Today, though, I have some interesting (and useful) things to say about R, which I discovered in the last few days, and which shouldn’t take long

Hard drive occupation prediction with R – The linear regression

In some environments, disk space usage can be pretty predictable. In this post, we will see how to do a linear regression to estimate when free space will reach zero, and how to assess the quality of such a regression, all using R - the statistical soft...
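The post's own R code is not part of this excerpt. To give a feel for the idea, here is a minimal sketch in Python (not R) of an ordinary least-squares fit of free space against time, the day on which the fitted line crosses zero, and R² as one possible quality measure. The function names and the sample data are hypothetical and invented for illustration; the original post may use different data and different diagnostics.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def day_free_space_hits_zero(slope, intercept):
    """x value where the fitted line reaches zero, or None if space is not shrinking."""
    if slope >= 0:
        return None
    return -intercept / slope


if __name__ == "__main__":
    # Hypothetical measurements: day index and free space in GB.
    days = [0, 1, 2, 3, 4]
    free_gb = [100, 90, 80, 70, 60]
    slope, intercept = fit_line(days, free_gb)
    print(day_free_space_hits_zero(slope, intercept))
    print(r_squared(days, free_gb, slope, intercept))
```

An R² close to 1 suggests the linear trend describes the measurements well; a low value is a hint that the extrapolated date should not be trusted.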

Auto-completion in Notepad++ for R Script

August 14, 2010

Auto-completion is a handy feature in a text editor. Notepad++ does not support auto-completion for the R language, so I spent a couple of hours creating an XML file to support R: Put it under ‘plugins/APIs’ in the installation directory of Notepad++ (you can see several other XML files there supporting different languages such as

Introducing visualVaR.com

August 13, 2010

After a month of on-again, off-again coding, I’ve finally completed a web site geared towards calculating the Value at Risk of the average investor’s portfolio. The site is visualvar.com. The big idea was to combine the statistical and visualization tools of R (especially ggplot2) with the web interface of Drupal. While I’m
