Big Data

Applications of R at Google

July 13, 2012 | David Smith

At a talk I saw at the useR!2012 conference last month, Googler Karl Millar estimated that there are at least 200 active R users at Google, plus another 300+ occasional users participating in Google's internal R support list. But what are all these Google employees doing with R? A post from the ... [Read more...]

The role of Statistics in the Higgs Boson discovery

July 3, 2012 | David Smith

News is starting to leak that the Large Hadron Collider may have accomplished its primary mission of confirming the existence of the hypothesised and heretofore elusive subatomic particle, the Higgs Boson. And sure, billions of Euros worth of state-of-the-art high-energy machinery and an army of experimental and theoretical physicists probably ... [Read more...]

A big list of the things R can do

July 2, 2012 | David Smith

R is an incredibly comprehensive statistics package. Even if you just look at the standard R distribution (the base and recommended packages), R can do pretty much everything you need for data manipulation, visualization, and statistical analysis. And for everything else, there's more than 5000 packages on CRAN and other repositories, ... [Read more...]

R is a Big Data open-source technology to watch

June 19, 2012 | David Smith

[...] recently published its list of 9 open-source technologies to watch. Hadoop is first on the list, and second up is the R Project: R is an open source programming language and software environment designed for statistical computing and visualization. R was designed by Ross Ihaka and Robert Gentleman at ... [Read more...]

More on birthday probabilities

June 15, 2012 | David Smith

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year: Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. ... [Read more...]

Data distillation with Hadoop and R

June 11, 2012 | David Smith

We're definitely in the age of Big Data: today, there are many more sources of data readily available to us to analyze than there were even a couple of years ago. But what about extracting useful information from novel data streams that are often noisy and minutely transactional ... aye, there's ... [Read more...]

Data Mining with R

June 8, 2012 | David Smith

Earlier this week, Revolution Analytics' Joe Rickert gave a webinar Introduction to R for Data Mining. You can watch the replay below: If you're already familiar with R and the basics of data mining, you might want to skip ahead to the 13-minute mark where Joe's live demo begins. There ... [Read more...]

Announcing Revolution R Enterprise 6.0

June 5, 2012 | David Smith

Revolution Analytics is proud to announce the latest update to our enhanced, production-grade distribution of R, Revolution R Enterprise. This update expands the range of supported computation platforms, adds new Big Data predictive models, and updates to the latest stable release of open source R (2.14.2), which improves performance of the ... [Read more...]

Applications of R in Government

June 4, 2012 | David Smith

Following the announcement of the US Government Big Data Initiative, I was asked to write a small article about applications of R in government. The article has just appeared in Government Security News (and I believe will appear in their daily newsletter tomorrow). In the article, I highlighted several R ... [Read more...]

Facebook-class social network analysis with R and Hadoop

May 25, 2012 | David Smith

In computing, social networks are traditionally represented as graphs: a collection of nodes (people), pairs of which may be connected by edges (friend relationships). Visually, the social networks can then be represented like this: Social network analysis often amounts to calculating the statistics on a graph like this: the number ... [Read more...]

R is to SAS as Java is to COBOL

May 18, 2012 | David Smith

An interview with Revolution Analytics CEO Dave Rich was published this week by BeyeNetwork. During the interview, Dave was asked about how the statistical modeling platforms have changed over the decades: People have been doing statistical modeling and predictive analytics for 50 years now, SAS and SPSS have been around since ... [Read more...]

Orbitz: R has become the data-mining tool of choice

May 17, 2012 | David Smith

Sameer Chopra, vice president of Advanced Analytics at Orbitz Worldwide, wrote recently in Analytics magazine about the changing landscape of processes, software and systems for statistical modelers. In a section on "Big Data and Open Source Analytics", Chopra lays out the reasons why the R language "has become the data-mining ... [Read more...]

Multiple Sclerosis Tweet-Chat: Review

May 14, 2012 | David Smith

We had a great Twitter conversation last Thursday on the use of big-data analytics, Revolution R Enterprise, and IBM Netezza in the search for a cure for MS. Many thanks to the other panelists: Murali Ramanathan (SUNY Buffalo), Tim Coetzee (National MS Society) and moderator Shawn Dolley (IBM) for fielding ... [Read more...]

Thursday: Tweet-chat on Multiple Sclerosis research

May 7, 2012 | David Smith

The story about the great work that SUNY Buffalo has been doing to find a cure for Multiple Sclerosis with Revolution R Enterprise and IBM Netezza has generated a lot of attention, with stories in Forbes, InformationWeek and eWeek (amongst others). To continue the discussion, IBM has put together a ... [Read more...]

Big Data Analytics with R and Hadoop

May 3, 2012 | David Smith

The open-source RHadoop project makes it easier to extract data from Hadoop for analysis with R, and to run R within the nodes of the Hadoop cluster -- essentially, to transform Hadoop into a massively-parallel statistical computing cluster based on R. In yesterday's webinar (the replay of which is embedded ... [Read more...]

Yes, you need more than just R for Big Data Analytics

May 2, 2012 | David Smith

Douglas Merrill, former CIO/VP of Engineering at Google, writes in Forbes about using the R language for data analysis: Most folks with math-oriented graduate degrees will have written something in R, a non-commercial option for your big data analysis. So, great graduates from great graduate schools know great tools. ... [Read more...]

Google BigQuery and the GitHub Data Challenge

May 1, 2012 | David Smith

GitHub has made data on its code repositories, developer updates, forks etc. from the public GitHub timeline available for analysis, and is offering prizes for the most interesting visualization of the data. Sounds like a great challenge for R programmers! The R language is currently the 26th most popular on ... [Read more...]
