Blog Archives

In case you missed it: August 2014 Roundup

September 5, 2014

In case you missed them, here are some articles from August of particular interest to R users: R is the most popular software in the KDNuggets poll for the 4th year running. The frequency of R user group meetings continues to rise, and there are now 147 R user groups worldwide. A video interview with David Smith, Chief Community...

Read more »

Hortonworks Seminar Series: The Modern Data Architecture

September 3, 2014

As more companies explore the benefits that Hadoop may provide, the opportunities to better understand the technology are myriad and unequal. As a provider of in-Hadoop analytics, Revolution Analytics is participating in the coming Hortonworks seminar series. We will be on site to discuss how to deploy R-based analytics within Hadoop clusters using Revolution R Enterprise. The seminar series...

Read more »

R tops KDNuggets data analysis software poll for 4th consecutive year

August 29, 2014

KDNuggets asked its readers the question "What programming/statistics languages you used for an analytics / data mining / data science work in 2014?" and once again, R was the #1 response. (R was also the #1 response in similar polls in 2013, 2012 and 2011.) The top 5 selections of the 719 respondents were: R (352 respondents), SAS (262)...

Read more »

DataScienceLA interviews David Smith

August 27, 2014

While I was in LA for the useR! 2014 conference last month, I had the great pleasure of being among the participants in the DataScienceLA interview series hosted by Eduardo Ariño de la Rubia. Eduardo is both an R user and an excellent interviewer: his preparation and knowledge of R have resulted in a fascinating interview series for any...

Read more »

Because it’s Friday: A 3-minute movie in 4095 bytes

August 22, 2014

This entire movie (images, music, everything) is generated from a Windows PC executable of just 4,095 bytes. That's not a typo: we're talking bytes, not megabytes or gigabytes. Less than 4 KB in total creates this entire scene. For comparison, a medium-quality video file of this exact same scene in AVI format comes in at over 64 MB:...

Read more »

Entering the field as a data scientist with certification

August 22, 2014

By Neera Talbert, VP Services, and Ben Wiley, R Programmer at Revolution Analytics

By now, everyone should be familiar with the data scientist boom. Simply logging onto LinkedIn reveals a seemingly infinite number of people with words and phrases like “Data Scientist”, “Big Data Specialist”, and “Analytics” in their title. A few weeks ago, an article floated around the...

Read more »

How to integrate R with your calendar

August 20, 2014

Hilary Parker has contributed a lovely article to Significance, the magazine of the American Statistical Association and the Royal Statistical Society, on using R to set your Google calendar to mark the time of sunsets. Hilary details the process in the article, but the basic idea is to use the sunrise.set function from the StreamMetabolism package to calculate sunset...

Read more »

Data Cleaning is a critical part of the Data Science process

August 18, 2014

A New York Times article yesterday discovers the 80-20 rule: that 80% of a typical data science project is sourcing, cleaning and preparing the data, while the remaining 20% is actual data analysis. The article gives short shrift to this important task by calling it "janitorial work", but whether you call it data munging, data wrangling or anything else,...

Read more »

Search for CRAN, GitHub and BioConductor packages at Rdocumentation.org

August 15, 2014

If you're looking for just the right package to solve your R problem, you could always browse through the list of available packages on CRAN. But with almost 6000 entries, that's not going to be the most efficient process. And even then, many very useful packages aren't found on CRAN: there are more than 800 packages hosted on BioConductor...

Read more »

Table comparing the statistical capabilities of software packages

August 13, 2014

A statistical consultant known only as "Stanford PhD" has put together a table comparing the statistical capabilities of the software packages R, Matlab, SAS, Stata and SPSS. For each of 57 methods (including techniques like "ridge regression", "survival analysis", "optimization") the author ranks the capabilities of each software package as "Yes" (fully supported), "Limited" or "Experimental". Here are the...

Read more »