A Rough Guide to Data Science

This article was first published on Pivotal P.O.V. » R, and kindly contributed to R-bloggers.

Visualization by speedoflife via Flickr.

If Big Data was last year’s buzzword, Data Science may reach the same level of hype this year. There’s no shortage of discussion about the high demand for data scientists, the term’s usefulness as a designation, and even declarations of its “sexiness” as a career. And as with many terms that reach a critical mass on social media, data science is a concept more widely discussed than understood. What is data science? What sets the practice apart enough to justify a new term? And how does someone become a data scientist?

The definition of data science varies among practitioners, but it is widely understood as the application of statistical analysis and software engineering to transform vast amounts of data into useful insight. Beyond this, the data scientist iterates on models to further explore questions posed by the data, and then uses techniques such as visualization to communicate the insights and stories revealed by the process.

In a useful new document, “A Practical Introduction to Data Science Skills”, Google’s Michael Manoochehri offers a syllabus for those wanting to learn more about data science, its role in organizations and society, and the common skills, platforms, and frameworks used by practitioners. Manoochehri is the author of the forthcoming book Data Just Right, which aims to disambiguate the role of big data within the modern enterprise, and explore how organizations can not only adapt to this paradigm shift, but embrace it.

And while expert data scientists are in command of numerous mathematical and programming skills, Manoochehri offers some entry points and potential projects for the curious. Many of the “short term skills” he identifies are common among reasonably technical users (proficiency in Python and JavaScript, familiarity with UNIX and SQL), along with data science-specific learning tasks such as gaining a basic understanding of R and running a Hadoop instance locally. While the long-term skills may be more imposing to neophytes, there are plenty of free tools, tutorials, and datasets to learn from, and even entry-level skills can be useful for non-profits and municipalities that lack such expertise.

