Grey’s Anatomy Network of Sexual Relations

March 25, 2011

This all began with an introductory presentation about social network analysis to a group of medical students. What better way to grab their attention than with attractive, fake doctors having sex on television? Naturally this led to the dense network …

MCMC with errors

March 25, 2011

I received this email last week from Ian Langmore, a postdoc at Columbia: I’m looking for literature on a subject and can’t find it: I have a Metropolis sampler where the acceptance probability is evaluated with some error. This error is not simply error in evaluation of the target density. It occurs due to the

Day #11 Easter “egg”

March 25, 2011

This is not my daily blog post, but something I found while searching for different plots. Just copy this code and paste it into your Rserve (or whatever you use to enter R commands): install.packages("onion") require(onion) data(bunny) p3d(bunny,theta=3,phi=1...

Radiation levels at Fukushima

March 24, 2011

From BWR The above graph is derived from data scraped from TEPCO press releases. Every hour or so for the first few days of the crisis, a TEPCO van would record radiation (probably Beta/Gamma, but the translation is unclear) at …

No simulation is complete without a gif

March 24, 2011

I promise this is my last post on the now week-and-a-half-old π day! Building on the last post, I figured I could show how convergence actually works in the estimation algorithm. If you’ll recall, we plotted …

Predicting R models with PMML: Revolution R Enterprise and ADAPA

March 24, 2011

The recently announced Revolution Analytics / Zementis partnership goes a long way towards demonstrating how R fits into big-league production environments. A frequent complaint against R is that although R is a fine prototyping tool, it is not able to handle production environments. Well, that’s just not true. In fact, it is straightforward to build a model in R, translate...

R Still On Top

March 24, 2011

According to the Google Ngram corpus, R is still the top rated statistical software package. Ok, I’m just kidding. That plot is worthless. All the data are from books published between the years 1890 and 2008, and none of those software packages wou...

Silver Is A Weighted Coin

March 24, 2011

Editorial note: there is an error in the code, explained below the code. When you flip a quarter, you normally assume the coin is fair and that there is a 50% chance of getting either heads or tails. Option pricing assumes the world of trading is f...

Generate MP3 waveforms with Ruby and R

March 24, 2011

I blame Rully for this. If it wasn’t for him I wouldn’t have been obsessed with this and spent a good few hours at night figuring it out last week. It all started when Rully mentioned that he knew how many beeps there are in the Singapore MRT (subway system) ‘doors closing’ warning. There are

Day #11 R graphs as nodes

March 24, 2011

Today my company supervisor isn’t at work, so he gave me (and the other students) a task list. I have to check the availability of the following R scripts, that is, whether or not they work as they should in KNIME. While doing these tasks, at the main tim...

The Many Uses of Q-Q Plots

My last four posts have dealt with boxplots and some useful variations on that theme.  Just after I finished the series, Tal Galili, who maintains the R-bloggers website, pointed me to a variant I hadn’t seen before.  It's called a bee...

Yeah Sure, Maybe, Well … Okay

March 23, 2011

Whoever wrote the book on statistics probably avoided getting a proper education in literature. At least that's my null hypothesis. The cryptic and awkward presentation of probabilities common amongst the Frequentists (no, not the Latin American Socia...

Typos sorted, at last!

March 23, 2011

After posting so many entries about typos in my books (making you wonder how there could be any text left!) and postponing their classification for so long, I decided on Saturday afternoon to collect those entries into a comprehensive pdf document that should be more useful for readers. I incidentally noticed that my book web-page

jStat: Advanced Statistics using Javascript

March 23, 2011

While 'R' is getting enterprise ready, it's no longer the only open source option for advanced statistical programming. jStat.js is the new kid on the block. Things in favor of jStat: based on Javascript and jQuery (future is assured); light-weight; ability to ...

basic ggplot2 network graphs – ver2

March 23, 2011

Last week I posted a simple function to plot networks using the ggplot2 package. Here is version 2. I still need to work on figuring out efficient vertex placement. Changes in version 2: you have one of three options: use an igraph object, a matrix, or a da...

The Popularity of Data Analysis Software (R vs SAS vs SPSS, etc.)

March 23, 2011

Robert Muenchen, the author of R for SAS and SPSS Users (A great book I’m proud to have on my shelf), has published this week an article in which he compares the popularity/market-share of many of the common statistical packages including R, SAS, SPSS and many others. The full article is available on r4stats.com at: “The Popularity of Data Analysis...

RcppArmadillo 0.2.17

March 23, 2011

Another release (1.1.90) by Conrad Sanderson for his wonderful Armadillo templated C++ library for linear algebra appeared yesterday. Consequently, a new release 0.2.17 of RcppArmadillo, our Rcpp-based integration into R is now on CRAN mirrors. The...

The Register profiles Revolution Analytics

March 23, 2011

Tech news site The Register has just published an in-depth profile of Revolution Analytics. It was great meeting the author Dan Olds at Revolution HQ a couple of weeks ago, and sharing with him why we think the R language is the way forward for data science: modern, applied, large-scale statistical analysis. He captures that sentiment perfectly in the...

Graphical Display of R Package Dependencies

March 23, 2011

In some work that I am currently involved in, we have to decide which GUI engine we should use. As an obvious starter, we decided to have a look at what other people are using in their packages. While CRAN helpfully displays all the R packages that are available, it doesn’t (I don’t think) give

Downloading S&P 500 Data to R

March 23, 2011

The cornerstone of your analysis and quantitative trading algorithms is data. There are lots of different ways to get it in R (depending on what your investment instruments are). Today I am going to download data from finance.yahoo which are stock ...

Getting into shape for the sport of data science: Screencast of talk by Jeremy Howard at Melbourne R Users

March 23, 2011

Jeremy Howard gave a talk at the Melbourne R User Group on 16th March 2011. Jeremy provided tips on how to successfully compete in data mining competitions. He showed how he combines R with other tools to build predictive models. …

Applied R: Manual for the quantitative social scientist

March 23, 2011

Applied R for the quantitative social scientist is a manual on R written specifically as an introduction for the quantitative social scientist. In my opinion, R-Project is a magnificent statistical program, ready to be accepted and implemented in the social sciences. The flexibility of this program and the way data are handled gives the user a sense of closeness...

sab-R-metrics Sidetrack: Bubble Plots

March 22, 2011

While I had mentioned in my last post that I will cover logistic regression in my next post, I decided that a quick interlude in working with bubble plots would be fun. Bubble plots have become pretty popular recently, especially with all of the Visualization Challenges I've seen around the internet (by the way, I...

R again in Google Summer of Code

March 22, 2011

I'm a big fan of the Google Summer of Code.  It brings great projects together with a learning opportunity for students.  Once again the R Project was selected to be part of the Google Summer of Code in 2011.  Some other notable mathemat...

Where the heck has JD been?

March 22, 2011

It’s been pointed out to me that I haven’t had any blog posts in a while. It’s true. I’m fairly slack. But in the last few months I’ve changed jobs (same firm, new role), written an R abstraction on top of Hadoop, been to China, and managed to stay married. While that sounds pretty awesome,

Code: extended model support for mtable

March 22, 2011

I finally got around to organizing and packaging my complete set of extended model support for mtable in Martin Elff’s memisc library. Here is a list of the models supported: coxph, survreg – Cox proportional hazards models and parametric survival …

More on R-Studio

March 22, 2011

Here's a link to keyboard shortcuts for the RStudio IDE. RStudio has replaced Emacs, Aquamacs, Tinn-R, Bluefish, and even Komodo Edit as my preferred R IDE/editor. http://gettinggeneticsdone.blogspot.com/2011/03/rstudio-keyboard-shortcut-reference-pdf.h...
