"Anyone planning to work with Big Data ought to learn Hadoop and R"

Posted on October 25, 2011 by David Smith in R bloggers

[This article was first published on Revolutions, and kindly contributed to R-bloggers].

Dan Woods at Forbes interviewed LinkedIn's Daniel Tunkelang about the rise of data science and about building data science teams. When asked how students today should prepare themselves to be data scientists, Tunkelang gave some good advice:

When we built the data science team at LinkedIn a few years ago, we looked for raw talent, assuming that smart people could pick up the needed technical skills on the job. Now that the field has matured, it’s a good idea to learn some of those technical skills in school. Anyone planning to work with big data ought to learn Hadoop and R, the two open-source tools most used by data scientists. It’s also a good idea to take courses in statistics and machine learning. Beyond that, find every opportunity to work with real data sets. Struggling with the warts of real data is a key part of a data scientist’s job — in fact, some would say that the struggle is our “day job.”

(Emphasis mine.) Any student thinking about working with Hadoop and R should check out the RHadoop project, a collection of R packages that make it easy to write map-reduce jobs for Hadoop data stores in the R language.

Forbes: LinkedIn's Daniel Tunkelang On “What Is a Data Scientist?”
