Data Science

Yes, you need more than just R for Big Data Analytics

May 2, 2012 | David Smith

Douglas Merrill, former CIO/VP of Engineering at Google, writes in Forbes about using the R language for data analysis: Most folks with math-oriented graduate degrees will have written something in R, a non-commercial option for your big data analysis. So, great graduates from great graduate schools know great tools. ... [Read more...]

Incompetence borne of excessive cleverness

April 29, 2012 | Derek-Jones

I have just got back from the 24 hour Data Science Global Hackathon; I was an on-site participant at Hub Westminster in London (thanks to Carlos and his team for doing such a great job looking after us all {around 50 turned up from the 100 who registered; the percentage was similar in ... [Read more...]

The race for speed at the data layer

April 6, 2012 | David Smith

The competition amongst database vendors to create the fastest, most powerful "data layer" — the hardware and software to provide storage for Big Data with high-performance data processing — is clearly heating up. The Netezza appliance has been so successful that IBM has been racing to keep up with demand. SAP is ... [Read more...]

Compete in the Data Science Hackathon, April 28

April 5, 2012 | David Smith

At noon GMT on April 28, data scientists around the world will compete in the world's first one-day International Data Science Hackathon, organized by Data Science London. Participants will receive a data set at the beginning of the event, and work in teams of 3-5 over the ... [Read more...]

Data Science Undefined

April 4, 2012 | Capehart

One of the favorite bar room discussions of statisticians, machine learners, and computer scientists is – what is data science? (And I don’t care whether it happens in a bar or not, it’s a “bar room” discussion by virtue of... [Read more...]

Radical Education Reform? Think Bigger.

April 2, 2012 | Capehart

“My job is to teach you how to think.” –Hugh Young A few days ago John Naughton published an article summarizing his manifesto on how to reform computer science education. I agree computer science education is in need of drastic... [Read more...]

Missing Data Club

April 1, 2012 | Capehart

Welcome to Missing Data Club. There are only three rules. Rule #1 is: There is no missing data. Rule #2 is: THERE IS NO MISSING DATA! Rule #3: If you’ve never built a model using missing data – you must do it... [Read more...]

A Crash Course in git for Data Scientists

March 10, 2012 | Capehart

I really like git. It’s the first versioning tool I’ve ever used so I have nothing else to compare it to, but in the world of statistical model building where iteration is constant (and almost never a strict linear progression)... [Read more...]

Analytic applications are built by data scientists

February 1, 2012 | David Smith

Ventana Research analyst David Menninger was on the judging panel for the Applications of R in Business contest. In a post on the Ventana Research blog, he offers his perspectives on the contest, noting that R, as a statistical package, includes many algorithms for predictive analytics, including regression, clustering, classification, ... [Read more...]

Norman Nie on two big problems with Big Data

December 23, 2011 | David Smith

Revolution Analytics CEO Norman Nie sat down with Cassimir Medford from Business Agility to talk about the problems businesses today face with respect to Big Data. The two big problems identified: finding adequately trained personnel and locating the right tools. Norman traces the problem of finding skilled practitioners to work ... [Read more...]

EMC survey differentiates BI and Data Science

December 15, 2011 | David Smith

EMC last week published the results of a survey of 462 IT decision makers who self-identified as either a data scientist or business intelligence professional (plus 35 invitees who were attendees at the EMC Data Scientist Summit and/or Kaggle competitors). There's a nice summary of the conclusions at the EMC blog, (... [Read more...]

Data Science: a literature review

September 28, 2011 | David Smith

Just what is Data Science, anyway? Here's one take: Ever since the term "Data Scientist" was coined by DJ Patil and Jeff Hammerbacher in 2009, there's been a vigorous debate on what the term actually means. More than 80% of statisticians consider themselves data scientists, but Data Science is more than just ... [Read more...]

Fortune: Data Science is the hot new job

September 6, 2011 | David Smith

An article in the September 5 issue of Fortune Magazine notes that despite the economy, companies are scrambling to hire data scientists: Data scientists have been a fixture at online companies like Google (GOOG) and Amazon (AMZN) for years. But these days organizations as diverse as Wal-Mart (WMT) and Foursquare are ... [Read more...]

How to access 100M time series in R in under 60 seconds

August 25, 2011 | David Smith

DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now with the new package rdatamarket, it's trivially easy to import ... [Read more...]

Statisticians at JSM consider themselves "Data Scientists"

August 4, 2011 | David Smith

At the JSM 2011 conference in Miami earlier this week, we conducted an informal poll of attendees on their attitudes with respect to Big Data, statistical software, and data science. JSM is the largest gathering of statisticians in North America, and attendees were invited to complete a survey after logging into ... [Read more...]

Growth in data-related jobs

July 19, 2011 | David Smith

At one job-search site, you can take a look at trends in the keywords used in job postings. As you might expect, job postings containing terms related to making sense of data are on the rise. Here's the growth in job postings mentioning big data: And here's ... [Read more...]
