RStudio: New free IDE for R

February 28, 2011
By

Just saw the announcement of the availability of RStudio, a new (free & open source) integrated development environment for R that works on Windows, Mac, and Linux. Judging from the screenshots, it looks like RStudio supports syntax highlighting for Sweave & easy PDF creation from Sweave code, which is something I haven't seen anywhere else (on Windows at least)....

Read more »

Running a Regression in R

February 28, 2011
By

I created another video tutorial on R. This time, I discuss R's lm() command and how to use it for a variety of standard applications. Here is the code that goes with the video: Enjoy!

Read more »
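
For readers who have not used lm() before, here is a minimal sketch of the kind of call the entry above is about. It is not the code from the video; the built-in mtcars dataset, the chosen predictors, and the new_car values are assumptions picked purely for illustration.

```r
# Minimal sketch: fit a linear regression with lm() on a built-in dataset.
# mtcars and the variables below are illustrative assumptions,
# not the code that accompanies the video.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table, R-squared, and residual summary
print(summary(fit))

# Extract pieces of the fit programmatically
coefs <- coef(fit)              # named vector: (Intercept), wt, hp
r2 <- summary(fit)$r.squared

# Predict for a new, hypothetical car
new_car <- data.frame(wt = 3.0, hp = 110)
pred <- predict(fit, newdata = new_car)
```

The formula interface (response ~ predictors) is the part that carries over to most other modelling functions in R, so it is worth getting comfortable with here first.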

Plug for RStudio: powerful, free, and easy to use interactive development environment for R

February 28, 2011
By

As a longtime SAS user, one obstacle for me in using R professionally has been figuring out a process for saving and testing code across several work sessions and integrating code composition and execution. There are a coup...

Read more »

RStudio, new open-source IDE for R

February 28, 2011
By

RStudio is a new open-source IDE for R which we’re excited to announce the availability of today. RStudio has interesting features for both new and experienced R developers including code completion, execute from source, searchable history, and support for authoring Sweave documents. RStudio runs on all major desktop platforms (Windows, Mac OS X, Ubuntu, or

Read more »

Example 8.28: should we buy snowstorm insurance?

February 28, 2011
By

It's been a long winter so far in New England, with many a snow storm. In this entry, we consider a simulation to complement the analytic solution for a probability problem concerning snow. Consider a company that buys a policy to insure its revenue ...

Read more »

R Tutorial Series: Two-Way ANOVA with Unequal Sample Sizes

February 28, 2011
By

When the sample sizes within the levels of our independent variables are not equal, we have to handle our ANOVA differently than in the typical two-way case. This tutorial will demonstrate how to conduct a two-way ANOVA in R when the sample sizes withi...

Read more »

What $480M of Gross Revenue Looks Like to Groupon

February 28, 2011
By

On Saturday, the Wall St. Journal posted details of an internal Groupon memo that reported $760 million in revenue last year. The WSJ article came just as I was finishing up a visualization of some data I had collected on …

Read more »

Visualizing Soccer League Standings

February 27, 2011
By

I feel ashamed for this boring title, but hope that the entry can make up for it. This visualization did inspire me, as a comment did point to my Tour de France visualizations. As with all visualizations, we need data first – this sounds trivial, but is sometimes a frustrating show-stopper. After I found the

Read more »

About the RStudio Project

February 27, 2011
By

We started the RStudio project because we were excited and inspired by R. The creators of R provided a flexible and powerful foundation for statistical computing; then made it free and open so that it could be improved collaboratively and its benefits could be shared by the widest possible audience. It’s better for everyone if the

Read more »

Welcome to our Weblog

February 27, 2011
By

Welcome to the RStudio weblog! We’ll use the weblog to talk about both the product and its features as well as broader issues that concern the R community.

Read more »

John Chambers, the inventor of S, added reference classes to R…

February 26, 2011
By

John Chambers, the inventor of S, added reference classes to R 2.12, and oh boy are they fun to look at! What you see in the picture above is a “Hello World” web application for R. It’s written using the Rack R package (not unlike Ruby’s Rac...

Read more »

More Chicago Mayoral Analysis

February 26, 2011
By

I perform a precincts-votes analysis on the returns from the Chicago Democratic Mayoral primary of 2011.

Read more »

The split-apply-combine paradigm in R

February 25, 2011
By

Last night at the DC R Users meetup, which was our largest meetup to date, I gave an introductory presentation on data munging, and spent a bit of time on the split-apply-combine paradigm that I use almost daily in my work. I talked mainly about the packages plyr and doBy, which I use a lot

Read more »
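
As a rough illustration of what split-apply-combine means in practice, the sketch below walks through the three steps with base R only. The talk itself focused on plyr and doBy; base R is swapped in here so the example runs without extra packages, and the iris dataset and the column names n and mean_sepal_length are assumptions chosen for illustration.

```r
# Split: one data frame per species
pieces <- split(iris, iris$Species)

# Apply: summarise each piece independently
summaries <- lapply(pieces, function(d) {
  data.frame(
    Species = as.character(d$Species[1]),
    n = nrow(d),
    mean_sepal_length = mean(d$Sepal.Length),
    stringsAsFactors = FALSE
  )
})

# Combine: stack the per-group results back into one data frame
result <- do.call(rbind, summaries)
print(result)
```

Packages such as plyr wrap these three steps into a single call, which is most of their appeal; the underlying idea is the same.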

ggplot2 joy

February 25, 2011
By

I’ve been working on a long-term (25+yr) longitudinal study of rheumatoid arthritis with my boss. He just walked in and asked if I could create a plot showing the trajectory of pain scores over time for each subject, separated by educational level (4 groups). Having now worked with ggplot2 for a while, and learning more

Read more »

R 2.12.2 is available

February 25, 2011
By

As previously announced, R 2.12.2 is available for download today. Browsing through the various mirrors (using the Download R tool on inside-R.org), it looks like the Windows version is already available on many mirrors; the Mac and Linux versions will follow soon (and of course, sources are available now). The complete list of changes is in the announcement on...

Read more »

Tutorial on Distributions in R

February 25, 2011
By

Here's a video tutorial I put together to go over how to generate a random sample from one of the commonly known parametric distributions in R. Along the way, I also discuss how some of the properties of estimators are reflected in the computations I pe...

Read more »
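
A short sketch of the idea in the entry above, for readers who want to try it before watching: R's r* functions draw random samples from standard parametric distributions. The particular distributions, parameter values, and seed are assumptions chosen for illustration and are not taken from the video.

```r
set.seed(42)  # arbitrary seed so the run is reproducible

# Draw random samples from a few common parametric distributions
x_norm <- rnorm(1000, mean = 10, sd = 2)
x_exp <- rexp(1000, rate = 0.5)
x_binom <- rbinom(1000, size = 20, prob = 0.3)

# The sample mean as an estimator: its spread shrinks as n grows
means_small <- replicate(500, mean(rnorm(10, mean = 10, sd = 2)))
means_large <- replicate(500, mean(rnorm(1000, mean = 10, sd = 2)))

sd_small <- sd(means_small)  # theory suggests roughly 2 / sqrt(10)
sd_large <- sd(means_large)  # theory suggests roughly 2 / sqrt(1000)
```

Comparing sd_small with sd_large is one simple way to see an estimator's sampling variability fall as the sample size increases.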

Mapping the 2011 Chicago Mayoral Democratic Primary

February 25, 2011
By

Mapping the Chicago Democratic Mayoral 2011 primary with Ruby, R, and ggplot2

Read more »

Setting up a parallel computing cluster for R with OpenSSH and doSNOW

February 25, 2011
By

Responding to yesterday's post which included an aside on using parallel processing for by-group computations in R, reader Christian Gunning mused about the possibility of using doSNOW on his network, with OpenSSH to manage the authentication: I sit on a fast campus network and have at least 10 remote cores available that I could farm out for big jobs....

Read more »

Example 8.27: using regular expressions to read data with variable number of words in a field

February 25, 2011
By

A more or less anonymous reader commented on our last post, where we were reading data from a file with a varying number of fields. The format of the file was:

1 Las Vegas, NV --- 53.3 --- --- 1
2 Sacramento, CA --- 42.3 --- --- 2

The complication in the...

Read more »
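
To make the problem in the entry above concrete, here is a hedged sketch of one way a regular expression can pull apart records whose city name has a varying number of words. It assumes the sample data consists of the two records shown in the code; the column names (id, city, state, value) and the choice of the second trailing field as the numeric value are hypothetical, and this is not necessarily the approach the full post takes.

```r
# Two sample records, assumed to be one per line in the original file
lines <- c("1 Las Vegas, NV --- 53.3 --- --- 1",
           "2 Sacramento, CA --- 42.3 --- --- 2")

# id, then everything up to the comma (the city), then a two-letter state,
# then the remaining whitespace-separated fields
pattern <- "^([0-9]+) (.+), ([A-Z]{2}) (.*)$"
matches <- regmatches(lines, regexec(pattern, lines))

parsed <- do.call(rbind, lapply(matches, function(m) {
  rest <- strsplit(m[5], " +")[[1]]   # trailing fields, "---" marks a gap
  data.frame(
    id = as.integer(m[2]),
    city = m[3],
    state = m[4],
    value = as.numeric(rest[2]),      # hypothetical: the numeric field
    stringsAsFactors = FALSE
  )
}))
print(parsed)
```

Anchoring on the comma and the two-letter state code is what lets the city capture group absorb one word or several.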

snow and ssh — secure inter-machine parallelism with R

February 24, 2011
By

I just threw a post up on Revolutions, which got a lot longer than I planned. And got me thinking. And reading (see refs in previous post). And trying. Turns out that it was way easier than I thought! The problem: From the blog post: "OpenSSH is now available on all platforms. A sensible solution...

Read more »

MT4 -> Multi-R sessions for tick-analysis

February 24, 2011
By

The Shared-Memory between multiple R sessions mentioned in my previous post got me thinking … quite some potential indeed. As a result, I investigated further using (calling) multiple R sessions from the same MT4 script. Specifically, I wanted to have a clearer understanding of the time required to perform lightning fast & dead slow processing,

Read more »

How to read and write Stata data (.dta) files into R

February 24, 2011
By

Here's an R tutorial where I explain how to read Stata data files into R (even if you don't own the program Stata). I also offer some other basic tips. Of note, you can also write Stata .dta files from R (if your coauthors or journals insist on having ...

Read more »
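
A minimal sketch of the round trip described in the entry above, assuming the foreign package (normally shipped with R as a recommended package) is the tool in use; the tutorial itself may rely on something else. The temporary file and the mtcars dataset are stand-ins for a real .dta file. Note that read.dta() only understands older Stata file format versions, so files saved by recent Stata releases may need a different reader.

```r
library(foreign)  # provides read.dta() and write.dta()

# Write a data frame to a temporary .dta file, then read it back
path <- tempfile(fileext = ".dta")
write.dta(mtcars, path)
dat <- read.dta(path)

# Quick check that the data came back intact
str(dat)
```

Row names are not stored in the .dta file, so anything meaningful kept there should be copied into a regular column before writing.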

when Nuns or Hells Angels get in a plane

February 24, 2011
By

Today, at lunch, Matthieu told us a nice story (or call it a paradox if you like) about the probability of finding your seat empty when you get on a plane. Assume that you are in the line to get on the airplane, you are the ...

Read more »

Packages for By-Group Processing in R

February 24, 2011
By

Analyst and BI expert Steve Miller takes a look at the facilities in R for doing "by-group" processing of data. The task consisted of: ... read several text files, merge the results, reshape the intermediate data, calculate some new variables, take care of missing values, attend to meta data, execute a few predictive models and graph the results. Then...

Read more »

Split a Data Frame into Testing and Training Sets in R

February 24, 2011
By

I recently analyzed some data trying to find a model that would explain body fat distribution as predicted by several blood biomarkers. I had more predictors than samples (p>n), and I didn't have a clue which variables, interactions, or quadratic terms made biological sense to put into a model. I then turned to a few data mining procedures that I...

Read more »
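
The basic move behind the title above can be sketched in a few lines: draw a random subset of row indices for training and use the remainder for testing. The iris dataset, the 70/30 ratio, and the seed are assumptions for illustration; the full post may use a different split or a helper function.

```r
set.seed(1)  # arbitrary seed so the split is reproducible

n <- nrow(iris)
train_idx <- sample(seq_len(n), size = round(0.7 * n))

train <- iris[train_idx, ]   # rows used to fit the model
test <- iris[-train_idx, ]   # held-out rows used to evaluate it

c(train = nrow(train), test = nrow(test))
```

Negative indexing with the same index vector guarantees the two sets do not overlap and together cover every row.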
