A prerequisite for being a Data Scientist

December 7, 2011

(This article was first published on Doodling with Data, and kindly contributed to R-bloggers)

So what should be in the toolkit of anyone who calls themselves a data scientist?

A fundamental skill is the ability to manipulate data. A data scientist should be familiar and comfortable with a number of platforms and scripting tools to get the job done. What is difficult in Excel might be trivial in R. And when R struggles, you should switch to Unix tools (or use a programming language such as Python) to get that portion of the data munging done. Along the way, you pick up a lot of tips and tricks. For example: how do you read a big data file in R?
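To make the "switch tools when one struggles" point concrete, here is a minimal Python sketch of one such trick: streaming a file row by row instead of loading it all into memory. The column names `region` and `sales` and the helper `sum_by_key` are made up for illustration, and a small in-memory string stands in for a large file on disk.

```python
import csv
import io
from collections import defaultdict


def sum_by_key(lines, key_col, value_col):
    """Stream rows one at a time and total value_col for each key_col.

    `lines` can be any iterable of text lines, such as an open file handle,
    so only one row is held in memory at a time.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)


# Hypothetical sample data; in practice this would be open("big_file.csv", newline="").
sample = io.StringIO("region,sales\neast,10\nwest,5\neast,2.5\n")
result = sum_by_key(sample, "region", "sales")
print(result)
```

The much smaller aggregated result could then be handed to R for the analysis itself, which is the kind of division of labor described above.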

The goal is to get the job done. Familiarity with a wide variety of tools, and expertise in some of them, is the hallmark of any good would-be data scientist.

