346 search results for "hadoop"

Thoughts on Making Data Work

June 9, 2010

I really enjoyed all four talks at today's online conference, Making Data Work. (Disclosure: Revolution sponsored this conference.) I thought the four speakers together gave a great overview of issues related to the processing, analysis, and visualization of big data. Mike Driscoll started off with a useful categorization for data size. "Small Data" (<10Gb) fits in the memory of...

Read more »

Data preparation for Social Network Analysis using R and Gephi

June 2, 2010

I want to share my experience in generating the data for social network analysis using R and analyzing it using Gephi... WHICH DATA STRUCTURE TO USE FOR LARGE GRAPHS? I quickly realized that using edge lists and adjacency matrix gets difficult as the g...

Read more »

The Next Big Thing: SAS and SPSS!…wait, what?

April 15, 2010

Thanks to the R Bloggers aggregator I came across Yihui Xie’s post on a piece currently making the rounds about statistical analysis platforms. In The Next Big Thing, AnnMaria De Mars makes the argument that R—as a statistical computing platform—is not well suited for what she views as the next big things in data

Read more »

Lessons Learned from EC2

March 24, 2010

A week or so ago I had my first experience using someone else’s cluster on Amazon EC2. EC2 is the Amazon Elastic Compute Cloud. Users set up a virtual computing platform that runs on Amazon’s servers “in the cloud.” Amazon EC2 is not just another cluster. EC2 allows the user to create a disk image containing an operating system...

Read more »

My Experience at ACM Data Mining Camp #DMcamp

March 21, 2010

My parents and I made plans to visit San Jose and Saratoga on my grandmother’s birthday, March 19, since that is where she grew up. I randomly saw someone tweet about the ACM Data Mining Camp unconference that happened to be the next day, March 20, only a couple of miles from our hotel in Santa Clara. This was...

Read more »

Open Source is Opening Data to Predictive Analytics

March 9, 2010

This article by REvolution Computing CEO Norman Nie is crossposted from the Future of Open Source Forum. The R Project: despite there being over 2 million users of this open-source language for statistical data analysis, you might not have heard of it ... yet. You might have seen this feature in the New York Times last year, and you...

Read more »

More on the Economist’s special report on big data

March 4, 2010

I totally missed this the other day, but there's much more to that special report on the data deluge in The Economist. (Thanks to readers SB and DN for pointing this out.) There's a total of nine articles in the report (you can find them all in the Related Items box on this page), including a section on business...

Read more »

RProtoBuf 0.1-0

February 3, 2010

Romain uploaded our first release of RProtoBuf to CRAN yesterday. RProtoBuf provides bindings for GNU R to the Google Protobuf implementation. Google Protobuf is (and I quote) a way of encoding structured data in an efficient yet extensible format th...

Read more »

What to Expect?

January 22, 2010

In 2007, I was introduced to Twitter via the written qualifying exam towards my Ph.D. At first, I did not know what to do with it. After a good year or so (maybe even sooner) passed, I began to follow some very interesting people that share the same interests as me. It has transformed my academic experience. It is...

Read more »