Posts Tagged ‘Data Science’

Machine Learning for Hackers

February 16, 2012

"Machine Learning for Hackers" is a new book from O'Reilly Media by Drew Conway and John Myles White. A "hacker", here, is "someone who likes to solve problems and experiment with new technologies", and "Machine Learning" is usually thought of as a black-box, algorithmic approach to producing predictions or classifications from data. This book, however, takes a pleasingly statistical...

Analytic applications are built by data scientists

February 1, 2012

Ventana Research analyst David Menninger was on the judging panel for the Applications of R in Business contest. In a post on the Ventana Research blog, he offers his perspectives on the contest, noting that R, as a statistical package, includes many algorithms for predictive analytics, including regression, clustering, classification, text mining and other techniques. The contest submissions supported...

Norman Nie on two big problems with Big Data

December 23, 2011

Revolution Analytics CEO Norman Nie sat down with Cassimir Medford from Business Agility to talk about the problems businesses face today with respect to Big Data. The two big problems identified: finding adequately trained personnel and locating the right tools. Norman traces the problem of finding skilled practitioners to work with Big Data to the US educational system: The...

EMC survey differentiates BI and Data Science

December 15, 2011

EMC last week published the results of a survey of 462 IT decision makers who self-identified as either a data scientist or business intelligence professional (plus 35 invitees who were attendees at the EMC Data Scientist Summit and/or Kaggle competitors). There's a nice summary of the conclusions at the EMC blog (where data scientists are described as "The New...

"Anyone planning to work with Big Data ought to learn Hadoop and R"

October 25, 2011

Dan Woods at Forbes interviewed LinkedIn's Daniel Tunkelang about the rise of data science and about building data science teams. When asked how students today should prepare themselves to be data scientists, Tunkelang gives some good advice: When we built the data science team at LinkedIn a few years ago, we looked for raw talent, assuming that smart people...

Data Science: a literature review

September 28, 2011

Just what is Data Science, anyway? Here's one take: Ever since the term "Data Scientist" was coined by DJ Patil and Jeff Hammerbacher in 2009, there's been a vigorous debate on what the term actually means. More than 80% of statisticians consider themselves data scientists, but Data Science is more than just Statistics. (My own take is that Data...

The effectiveness of links shared on Facebook, Twitter, and YouTube

September 8, 2011

The bitly blog has posted a really interesting analysis of the effectiveness of links shared via the social-media services Facebook, Twitter and YouTube. Here, effectiveness is measured by the "half-life" of a link: the amount of time it takes for that link to generate half the clicks it will ever attract. They summarize their results in this ggplot2 density...

Fortune: Data Science is the hot new job

September 6, 2011

An article in the September 5 issue of Fortune Magazine notes that despite the economy, companies are scrambling to hire data scientists: Data scientists have been a fixture at online companies like Google (GOOG) and Amazon (AMZN) for years. But these days organizations as diverse as Wal-Mart (WMT) and Foursquare are hiring computer science experts who can analyze all...

How to access 100M time series in R in under 60 seconds

August 25, 2011

DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now with the new package rdatamarket, it's trivially easy to import those time series into R for charting,...

Statisticians at JSM consider themselves "Data Scientists"

August 4, 2011

At the JSM 2011 conference in Miami earlier this week, we conducted an informal poll of attendees on their attitudes with respect to Big Data, statistical software, and data science. JSM is the largest gathering of statisticians in North America, and attendees were invited to complete a survey after logging into the Wi-Fi network. Of the 190 respondents to...
