336 search results for "hadoop"

According to Microsoft, the fourth paradigm of science is data

December 16, 2009

In scientific discovery, the first three paradigms were experimental, theoretical and (more recently) computational science. A new book of essays published by Microsoft (and available for free download -- kudos, MS!) argues that a fourth paradigm of scientific discovery is at hand: the analysis of massive data sets. The book is dedicated to the late Microsoft researcher Dr Jim...

Read more »

R, REvolution named in top analytic trends for 2010

December 7, 2009

Author and enterprise software executive Nenshad Bardoliwalla lists his Top 10 Trends for 2010 in Analytics, Business Intelligence, and Performance Management at the website Enterprise Irregulars. If you've been following the Business Intelligence space (and who hasn't, right?) you'll recognize some familiar themes: predictive analytics, Web 2.0, Software-as-a-Service, risk, IBM. What's interesting about this list is that it raises...

Read more »

In case you missed it: November roundup

December 4, 2009

In case you missed them, here are some articles from last month of particular interest to R users. This post demonstrated reader Paul Bleicher's code for visualizing a time series as a heat-map calendar. This post and followup showed (with thanks to Drew Conway) how to use R to perform social network analysis on live data from Twitter. This...

Read more »

Massively parallel database for analytics

July 22, 2009

This is by far the best description of why traditional parallel databases (like Teradata, Greenplum et al.) are an evolutionary dead end. But much more than a theoretical discussion, they have built a solution which they call HadoopDB. It is based on Hadoop, PostgreSQL, and Hive and is completely open source. Alternative, column-based backends to PostgreSQL...

Read more »

What I’ll be presenting at O’Reilly Money Tech 2009

October 21, 2008

(April 2009 update: Unfortunately, the Money Tech conference was indefinitely postponed, but fortunately I will be presenting a version of this talk in July at OSCON 2009.) I’ve been invited to speak at O’Reilly’s Money Tech conference this coming February 4-6 in New York City and thought I’d share the abstract for my talk here. I’ll

Read more »