R subplot() with multiple lines

September 11, 2011
By

I have recently used the subplot() function of the TeachingDemos library for R: I wanted to create a simple embedded chart with multiple lines on it. The trick was to create a simple function that prepares the whole plot and pass it to the subplot() function to execute as shown below: > x > x() > plot(1:10)...

Read more »
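A minimal sketch of the approach this entry describes, assuming the TeachingDemos package is installed. The helper name `draw_inset`, the data, and the inset coordinates are hypothetical; only `plot()` and `subplot()` come from the excerpt.

```r
# Hypothetical helper: draws the complete inset chart, including its extra lines.
draw_inset <- function() {
  plot(1:10, type = "l", col = "blue", xlab = "", ylab = "")
  lines(10:1, col = "red")  # second line on the same embedded chart
}

# Draw to a null PDF device so the sketch runs without a display;
# drop pdf(NULL) and dev.off() for interactive use.
pdf(NULL)
plot(1:100, type = "l", main = "Main chart")

# subplot() evaluates the plotting call inside the given region,
# expressed in user coordinates of the main chart (two opposite corners).
if (requireNamespace("TeachingDemos", quietly = TRUE)) {
  TeachingDemos::subplot(draw_inset(), x = c(10, 45), y = c(60, 95))
}
invisible(dev.off())
```

Wrapping all the drawing calls in one function is what lets several lines end up in the same embedded chart.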

Alternately coloured line environment with fancyvrb

September 11, 2011
By

Recently, while typing up an R tutorial, I used the LaTeX fancyvrb package to create two environments—one coloured blue for R commands, and one coloured red to display R output. This worked well for large blocks of each type. Then I decided I wan...

Read more »

A shortcut function for install.packages() and library()

September 10, 2011
By

I enjoy trying different kinds of R packages. Since I have more than one computer (one at home, one at the office, and a laptop), it is troublesome to check whether I have installed new packages on each computer. Therefore I wrote a function to load and install packages at once. If the package does

Read more »
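The idea in this entry might be sketched as follows. This is not the author's function: the name `load_or_install` and the default CRAN mirror are assumptions made for illustration.

```r
# Hypothetical helper: install any missing packages, then attach them all.
load_or_install <- function(pkgs, repos = "https://cloud.r-project.org") {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE when the package is not installed
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg, repos = repos)
    }
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Usage with packages that ship with R, so nothing is downloaded here
load_or_install(c("stats", "utils"))
```

Running the same call on every machine means a package is installed only where it is missing.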

Visualizing Bayesian Updating

September 10, 2011
By

One of the most straightforward examples of how we use Bayes to update our beliefs as we acquire more information can be seen with a simple Bernoulli process. That is, a process which has only two possible outcomes. Probably the most commonly thought of example is that of a coin toss. The outcome of tossing

Read more »
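As a rough illustration of updating on a Bernoulli process, the sketch below uses the standard Beta-Binomial conjugate update. The uniform Beta(1, 1) prior, the simulated tosses, and all variable names are assumptions, not taken from the post.

```r
set.seed(1)
flips <- rbinom(50, size = 1, prob = 0.7)  # simulated tosses; 1 = heads

# Assumed prior: Beta(1, 1), i.e. uniform on the probability of heads
prior_a <- 1
prior_b <- 1

# After each toss the posterior is Beta(a + heads so far, b + tails so far)
post_a <- prior_a + cumsum(flips)
post_b <- prior_b + cumsum(1 - flips)
post_mean <- post_a / (post_a + post_b)

# Posterior densities after 5, 20 and 50 tosses
stages <- c(5, 20, 50)
theta <- seq(0, 1, length.out = 201)
dens <- sapply(stages, function(n) dbeta(theta, post_a[n], post_b[n]))
matplot(theta, dens, type = "l", lty = 1,
        xlab = "P(heads)", ylab = "Posterior density")
```

The curves should narrow as more tosses are observed, which is the behaviour such a visualization is meant to show.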

Polynomial Interpolation with R

September 10, 2011
By

As a first step to produce some useable code for spline interpolation/approximation in R, I set out to first do polynomial interpolation to see how I get along. It's not that there is no spline interpolation software for R, but I find it a bit limited. splinefun, for example, can do only 1-dimensional interpolation. interp{akima} can do bicubic splines...

Read more »
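One simple way to do one-dimensional polynomial interpolation is to solve the Vandermonde system for the coefficients. The sketch below is only an illustration with made-up data points; it is not the code from the post, and the Vandermonde approach becomes numerically unstable for many points.

```r
# Made-up data points to interpolate
x <- c(0, 1, 2, 3)
y <- c(1, 2, 0, 5)

# Vandermonde matrix: column j holds x^(j - 1)
V <- outer(x, 0:(length(x) - 1), `^`)

# Coefficients of the unique cubic through the four points
coefs <- solve(V, y)

# Evaluate the interpolating polynomial at arbitrary points
poly_eval <- function(t) {
  sapply(t, function(ti) sum(coefs * ti^(0:(length(coefs) - 1))))
}

poly_eval(x)    # reproduces y up to rounding error
poly_eval(1.5)  # value between the nodes
```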

Getting data from the Infochimps Geo API in R

September 10, 2011
By

I am very intrigued by the Infochimps Geo API, so I wanted to play around with it a little bit and pull the data into R. I’ll start by getting data from the American Community Survey Topline API for a 10km area around where I live. First some setup code here. It imports a couple of libraries

Read more »

Unlocking Big Data with R

September 9, 2011
By

I have an article out this week on ReadWriteHack: Unlocking Big Data with R. My thanks to the folks at ReadWriteWeb for giving us the opportunity to showcase some of the many real-world Big Data applications of R. Here are some additional links about the applications mentioned in the article: New York Times: Destruction of the Haiti earthquake; 2010...

Read more »

Revolution Newsletter: September 2011

September 9, 2011
By

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full September edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Using Revolution R with Hadoop: Revolution Analytics has released three open-source R packages, making it...

Read more »

My take on an R introduction talk

September 9, 2011
By

Here is a short intro R talk I gave today, for what it's worth: "R Introduction" (presentation from schamber).

Read more »

Looking to hire data scientists

September 9, 2011
By

We are a global management consulting firm and are looking for data scientists in our team in New York/Washington DC and Gurgaon/Chennai (India). There are full-time and internship (New York) opportunities. There are multiple positions i...

Read more »

Le Monde puzzle [#739]

September 9, 2011
By

The weekend puzzle in Le Monde this week is again about a clock. Now, the clock has one hand and x ticks where a lamp is either on or off. The hand moves from tick to tick and each time the lights go on or off depending on whether or not both neighbours were in

Read more »

I’m Starting a New Position at the University of Virginia

September 8, 2011
By

I just accepted an offer for a faculty position at the University of Virginia in the Center for Public Health Genomics / Department of Public Health Sciences. Starting in October I will be developing and directing a new centralized bioinformatics core ...

Read more »

Faster (recursive) function calls: Another quick Rcpp case study

September 8, 2011
By

There was another question recently on StackOverflow that I had meant to discuss in a follow-up post here. User deltanovember asked about slow recursive functions and used the very classic Fibonacci number as an example. To recap, Fibonacci number a...

Read more »
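For context, the slow baseline in plain R looks roughly like the sketch below. The function name `fib_r` is hypothetical, and the Rcpp version discussed in the post is not reproduced here.

```r
# Naive recursive Fibonacci: each call spawns two more calls,
# so the running time grows exponentially with n.
fib_r <- function(n) {
  if (n < 2) return(n)
  fib_r(n - 1) + fib_r(n - 2)
}

sapply(0:10, fib_r)  # 0 1 1 2 3 5 8 13 21 34 55

# Timing a moderately sized input shows why a compiled version is attractive
system.time(fib_r(20))
```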

The effectiveness of links shared on Facebook, Twitter, and YouTube

September 8, 2011
By

The bitly blog has posted a really interesting analysis of the effectiveness of links shared via the social-media services Facebook, Twitter and YouTube. Here, effectiveness is measured by the "half-life" of a link: the amount of time it takes for that link to generate half the clicks it will ever attract. They summarize their results in this ggplot2 density...

Read more »
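Under the definition quoted in this entry, a link's half-life is the time by which half of its eventual clicks have arrived. A small sketch with simulated click times follows; the exponential model and all numbers are assumptions, not bitly's data.

```r
set.seed(42)

# Simulated click timestamps, in hours since the link was shared
click_times <- rexp(1000, rate = 1 / 3)

# Half-life: the time by which half of all clicks have occurred
sorted_times <- sort(click_times)
half_life <- sorted_times[ceiling(length(sorted_times) / 2)]

# Share of clicks that arrived by the half-life (0.5 by construction)
mean(click_times <= half_life)

# A density of click times, in the spirit of the plot the entry mentions
plot(density(click_times), main = "Simulated click times", xlab = "Hours")
abline(v = half_life, lty = 2)
```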

In case you missed it: August Roundup

September 8, 2011
By

In case you missed them, here are some articles from August of particular interest to R users. A contest to showcase applications of R for businesses is offering $20,000 in prizes from Revolution Analytics. Three new open-source packages integrating R and Hadoop will be introduced by Revolution Analytics' CTO David Champagne in a webinar on September 21. Dirk Eddelbuettel...

Read more »

Interacting with bioinformatics webservers using R

September 8, 2011
By

In an ideal world, all bioinformatics tools would be made available via the Web as a web service with an API, as well as a standalone package to download for local use. This is rarely the case and sometimes, even where one or the other is available, factors such as cost come into play. So

Read more »

A brief history of S&P 500 beta

September 8, 2011
By

Data: The data are daily returns starting at the beginning of 2007. There are 477 stocks for which there are full and seemingly reliable data. Estimation: The betas are all estimated on one year of data. The times that identify the betas mark the point at which the estimate would become available. So the betas …

Read more »
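The estimation scheme described in this entry, one year of daily returns per estimate, dated at the end of the window, can be sketched with simulated data. The 250-day window, the simulated returns, and the helper name `beta_at` are assumptions; the post's actual estimator may differ.

```r
set.seed(7)
n_days <- 600
window <- 250  # assumed number of trading days in one year

# Simulated daily returns: a market index and one stock with true beta 1.3
market <- rnorm(n_days, mean = 0, sd = 0.01)
stock  <- 1.3 * market + rnorm(n_days, mean = 0, sd = 0.01)

# Beta over the window ending on day `end`: cov(stock, market) / var(market)
beta_at <- function(end) {
  idx <- (end - window + 1):end
  cov(stock[idx], market[idx]) / var(market[idx])
}

# Each estimate is dated at the day it would first become available
ends  <- window:n_days
betas <- sapply(ends, beta_at)

plot(ends, betas, type = "l", xlab = "Day", ylab = "Estimated beta")
```

The covariance ratio equals the slope of a least-squares regression of stock returns on market returns over the same window.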

Multiple plots with subplot in R

September 8, 2011
By

I'm in the middle of creating a poster and wanted to compress the content by transforming some of the charts into subplots of other charts. I did a little survey and found that there is a TeachingDemos library on CRAN that fits my needs. Well, the parameterization of the functions is a bit tricky but after a few tries...

Read more »

Shared and reproducible computing with OpenCPU

September 7, 2011
By

While looking for an online computing provider, I bumped into OpenCPU.org: OpenCPU is a new initiative to make innovations in statistics, visualization and data-science more widely applicable. I guess the idea of online analysis and visualization, and of an online cloud R computing platform, isn’t really new anymore, but the real incentive is the

Read more »

Analyzing big data in R: two presentations from useR! 2011

September 7, 2011
By

At last month's useR! 2011 conference at Warwick University, there were two talks on the RevoScaleR package for big data statistics in R. The first was a keynote presentation from Revolution Analytics' Chief Scientist, Lee Edlefsen. Here is the overview of his talk, Scalable Data Analysis in R: For the past several decades the rising tide of technology --...

Read more »

Information Transmission in a Social Network: Dissecting the Spread of a Quora Post

September 7, 2011
By

tl;dr See this movie visualization for a case study on how a post propagates through Quora. How does information spread through a network? Much of Quora’s appeal, after all, lies in its social graph — and when you’ve got a network of users, all broadcasting their activities to their neighbors, information can cascade in multiple

Read more »

Hey! I made you some Wiener processes!

September 7, 2011
By

Check them out. Here are thirty homoskedastic ones: > homo.wiener for (j in 1:30) {  for (i in 2:length(homo.wiener)) {          homo.wiener for (j in 1:30) {        plot( homo.wiener,           type = "l", col = rgb(.1,....

Read more »
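The code in this excerpt is cut off, so the following is an independent sketch of the same idea rather than a reconstruction: thirty discrete approximations of a Wiener process with constant step variance. The path length, colour, and variable names are assumptions.

```r
set.seed(123)
n_steps <- 500
n_paths <- 30

# Each path starts at 0 and accumulates independent N(0, 1) increments,
# so the step variance is the same everywhere (homoskedastic).
paths <- replicate(n_paths, cumsum(c(0, rnorm(n_steps - 1))))

# One semi-transparent line per path (columns of the matrix)
matplot(paths, type = "l", lty = 1, col = rgb(0.1, 0.1, 0.1, 0.3),
        xlab = "Step", ylab = "Value")
```

Using `cumsum()` on the increments avoids the explicit double loop over paths and steps.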

Link to StatDNA Guest Post

September 7, 2011
By

The post is officially up on the StatDNA blog. Go check it out. As I said in my previous post, this is a very rough and preliminary model. This is why my work was not any sort of formal entry, just some fun with some great data. I used a Vector Genera...

Read more »

R is a cool sound editor!

September 7, 2011
By

The capabilities of R are definitely endless! After my previous posts about some easy image editing in R (they are here, and here), now is the time to explore if R is capable of sound editing! Just for fun, here I created a function that receives a phone number (or another sequence of numbers), and returns the equivalent melody...

Read more »

A simple example for writing parallel code

September 7, 2011
By

Today, programmers have to deal with multi-core and multi-computer technologies. Several people claim that software developers are far behind hardware technologies. My two favorite posts for this statement are "Editor’s Desk: Software Lags Behind Hardware, But That’s a Good Thing" and "A Hacker’s Craic - Why is software so far behind hardware?". Parallel computing is not that

Read more »

Google Spreadsheets API: Listing Individual Spreadsheet Sheets in R

September 7, 2011
By

In Using Google Spreadsheets as a Database Source for R, I described a simple Google function for pulling data into R from a Google Visualization/Chart tools API query language query applied to a Google spreadsheet, given the spreadsheet key and worksheet ID. But how do you get a list of sheets in a spreadsheet, without opening

Read more »

2011 Perth City to Surf Stats

September 6, 2011
By

Like every year, August sees thousands taking part in the Perth City to Surf, and with that comes the chance for some stats. Why? Curiosity more than anything, and to convince myself that my time in the 12km run …

Read more »

Fortune: Data Science is the hot new job

September 6, 2011
By

An article in the September 5 issue of Fortune Magazine notes that despite the economy, companies are scrambling to hire data scientists: Data scientists have been a fixture at online companies like Google (GOOG) and Amazon (AMZN) for years. But these days organizations as diverse as Wal-Mart (WMT) and Foursquare are hiring computer science experts who can analyze all...

Read more »