Blog Archives

10 R packages every data scientist should know about

February 18, 2013

The yhat blog lists 10 R packages they wish they'd known about earlier. Drew Conway calls them "10 reasons to always start your analysis in R". They're all very useful R packages that every data scientist should be aware of. They are: sqldf (for selecting from data frames using SQL), forecast (for easy forecasting of time series), plyr (data...

Read more »

Video: Data Mining with R

February 15, 2013

Yesterday's Introduction to R for Data Mining webinar was a record setter, with more than 2000 registrants and more than 700 attending the live session presented by Joe Rickert. If you missed it, I've embedded the video replay below, and Joe's slides (with links to many useful resources) are also available. During the webinar, Joe demoed several examples of...

Read more »

Make a Valentine’s Heart with R

February 14, 2013

If you haven't sent your loved one a Valentine's Day greeting yet, it's not too late! Thanks to Guillermo Santos, who pointed out an R script from Berkeley's Concepts in Computing with Data course, I created the following Valentine's Day card for my husband: If you want to make one for your loved one, you can use the R...

Read more »

In case you missed it: January 2013 Roundup

February 13, 2013

In case you missed them, here are some articles from January of particular interest to R users. Anthony Damico created an amusing and useful flowchart for finding resources for learning R, especially for survey analysis. All R users: please be counted for the 2013 Rexer Data Miner Survey (R was the #1 software reported in the last survey). Relatedly,...

Read more »

Did an Excel error bring down the London Whale?

February 11, 2013

When JP Morgan Chase announced it had lost more than 2 billion dollars on the capital markets back in May 2012, many pointed to the actions of rogue trader Bruno Iksil as the cause. But was the "London Whale" — the nickname he was given by other traders for his outsized positions — the victim not of hubris, but...

Read more »

Keep up with new R questions on StackOverflow with @StackOverflowR

February 8, 2013

Last time I checked on the number of R questions on StackOverflow, back in June 2011, there were 5000. Today, there are 23,649. (For comparison, there are 15,649 questions about Matlab and 971 questions on SAS.) If you use Twitter, thanks to Trey Causey there's now an easy way to keep up with new R questions posted to StackOverflow....

Read more »

Analyze web traffic data with Google Analytics and R

February 7, 2013

If you run an e-commerce site, blog or other web property, there's a good chance you use Google Analytics to monitor traffic, look at visitor sources, and measure conversions. And while Google Analytics is quite powerful at looking at historic activity on your site, it lacks much in the way of predictive analytics. That's where R shines of course,...

Read more »

Make building R packages easier with devtools

February 6, 2013

If you're writing any significant amount of R code, you might want to start thinking about bundling it up into packages. An R package combines functions, data, documentation and unit tests, and is a convenient and reliable system to manage and version collections of R content that could otherwise become unwieldy. And if you want to share your code...

Read more »

Learn about R through data mining

February 5, 2013

If you're in San Francisco for this week's DeveloperWeek conference, our own Joe Rickert will also be giving a presentation on Wednesday at 2:10PM on Predictive Modeling with Big Data in R, which will feature several demos of data mining massive data sets using Revolution R Enterprise. Incidentally, the whole team at Revolution Analytics was proud to receive the Top...

Read more »

Visualizing networks in R: arc diagrams and hive plots

February 4, 2013

Arc diagrams are an alternate way of representing two-dimensional graphs. Rather than scattering the nodes across the page connected by straight edges, you can instead arrange the nodes along a one-dimensional axis, and replace the straight edges with arcs between the nodes. While an arc diagram might not give as good a sense of the connections between the nodes...

Read more »