333 search results for "hadoop"

Experience with Oracle R Enterprise in the Oracle micro-processor tools environment

August 17, 2012

Ryan Rosario on Parallel programming in R

August 17, 2012

Earlier this year data scientist Ryan Rosario gave a talk on parallel computing with R to the Los Angeles R User Group, and he recently made the slides from the talk available online. They're a great resource for anyone looking to make use of multi-processor systems or a Hadoop-based architecture to speed up computations with big data. Ryan's talk was...

How Williams Sonoma uses R to target customers online

August 16, 2012

If you live in the US, you've probably visited a Williams Sonoma store for gourmet food or quality cookware for the kitchen. And if you've shopped at Pottery Barn or West Elm stores for furniture, those chains are part of the Williams Sonoma stable as well. All three brands have major online stores, all supported by a sophisticated marketing...

In case you missed it: July 2012 Roundup

August 10, 2012

In case you missed them, here are some articles from July of particular interest to R users. The Environmental Performance Index website uses R to rank countries by measures like environmental health and ecosystem vitality. A log-linear regression in R predicted the gold-winning Olympic 100m sprint time to be 9.68 seconds (it was actually 9.63 seconds). Some R-related talks...

Adventures at My First JSM (Joint Statistical Meetings) #JSM2012

August 6, 2012

During the past few decades that I have been in graduate school (no, not literally) I have boycotted JSM on the notion that “I am not a statistician.” Ok, I am a renegade statistician, a statistician by training. JSM 2012 was held in San Diego, CA, one of the best places to spend a week during the summer. This...

Surveys continue to rank R #1 for Data Mining

August 3, 2012

KDnuggets recently posted its annual poll on data mining software, and the R language retains its #1 ranking as the most commonly-used software for data mining: R is now used by 52.5% of poll respondents, compared with 45% last year. Donnie Berkholz provides an analysis of the year-on-year trends for Redmonk. He provides the chart below, and notes "the...

Data Parallelism Using Oracle R Enterprise

August 2, 2012

Modern computer processors are adequately optimized for many statistical calculations, but large data operations may require hours or days to return a result. Oracle R Enterprise (ORE), a set of R packages designed to process large data computations in Oracle Database, can run many R operations in parallel, significantly reducing processing time. ORE supports parallelism through the transparency layer,...

Edge Prediction in a Social Graph: My Solution to Facebook’s User Recommendation Contest on Kaggle

July 31, 2012

A couple weeks ago, Facebook launched a link prediction contest on Kaggle, with the goal of recommending missing edges in a social graph. I love investigating social networks, so I dug around a little, and since I did well enough to score one of the coveted prizes, I’ll share my approach here. (For some background, the contest provided...

Big data, big analytics, big opportunity

July 30, 2012

Data, data, every where / Nor any byte to think. The world today is awash with data. Corporations, governments, and individuals are busy generating petabytes of data on culture, economy, environment, religion, and society. While data has become abundant and ubiquitous, data analysts needed to turn raw data into knowledge are in fact in short...

Community Detection in Networks with R

I mainly post this visualization because I think it’s pretty. It reminds me a little of the work by the famous Dutch painter Mondrian. The complete matrix can be found here. The plot is a heatmap of an adjacency matrix generated by a weighted dir...
