Posts Tagged ‘Data Science’

Slides and replay for "The Rise of Data Science"

November 2, 2012

I had a great time presenting my new webinar yesterday. Thanks to everyone who attended "The Rise of Data Science in the Age of Big Data Analytics", and especially to those who submitted questions. Sorry I didn't have time to get to them all, but feel free to ask here in the comments. There's been some discussion recently about whether...

Read more »

More data apps spawned by Sandy

October 31, 2012

As the clean-up continues on the eastern seaboard, I wanted to follow up on Monday's post on tracking Hurricane Sandy with Open Data with a couple of other R-based data applications spawned by the storm. Josef Fruehwald created an R script to tap into local weather sensors to keep track of air pressure, wind speed and rainfall near his...

Read more »

Montreal R User Group meetup Nov. 14th

October 29, 2012

After a bit of a summer lull, the Montreal R User Group is meeting up again! We’re trying out a new venue this time. Notman House is the home of the web in Montreal. They hold hackathons and other tech user group meetups, and they are all-around great people in an all-around great...

Read more »

Two Talks on Data Science, Big Data and R

October 23, 2012
By

On Thursday next week (November 1), I'll be giving a new webinar on the topic of Big Data, Data Science and R. Titled "The Rise of Data Science in the Age of Big Data Analytics: Why Data Distillation and Machine Learning Aren’t Enough", this is a provocative look at why data scientists cannot be replaced by technology, and why...

Read more »

Revolution Analytics receives Top Innovator award for Data Science Technology

August 23, 2012

A big thank-you to all the R users out there who voted for Revolution R Enterprise in the DataWeek Awards. We're so pleased to be recognized by the voters and the DataWeek judging panel with the Top Innovator Award for Data Science Technology. We're looking forward to the awards ceremony next week at DataWeek SF (in San Francisco, September 24-27)....

Read more »

Ryan Rosario on Parallel programming in R

August 17, 2012
By

Earlier this year, data scientist Ryan Rosario gave a talk on parallel computing with R to the Los Angeles R User Group, and he recently made the slides from the talk available online. They're a great resource for anyone looking to make use of multi-processor systems or a Hadoop-based architecture to speed computations with big data. Ryan's talk was...

Read more »

Success does not require understanding

July 23, 2012
By

I took part in the second Data Science London Hackathon last weekend (also my second hackathon) and it was a very different experience compared to the first hackathon. Once again Carlos and his team really looked after us. The data was released 24 hours before the competition started and even though I had spent less

Read more »

Modeling Trick: Impact Coding of Categorical Variables with Many Levels

July 23, 2012

One of the shortcomings of regression (both linear and logistic) is that it doesn’t handle categorical variables with a very large number of possible values (for example, postal codes). You can get around this, of course, by going to another modeling technique, such as Naive Bayes; however, you lose some of the advantages of regression...

Read more »

Coke vs Soda vs Pop : Linguistic trends analyzed with Twitter and R

July 19, 2012

Growing up in Australia, for me a carbonated drink like Pepsi or Fanta or lemonade was always just a "soft drink". (Also, 'lemonade' in Australia was something different to 'lemonade' in the US; it's something close to 7-Up.) So when I moved to Seattle, it was surprising to me that all such things were called "pop". And then I...

Read more »

Preparing public data for analysis with R

July 18, 2012

In most data science applications, preparing the data is at least half the job. Finding where the data lives, figuring out how to access it, finding the right records, filtering, cleaning and transforming the data ... all of this has to be done before the statistical analysis can even begin. Fortunately, the R language has many tools for data...

Read more »