How to build a world-beating predictive model using R

Many modern data analysis problems in both industry and academia involve building a model that can predict the future based on historical variables. The 2009 KDD Cup was an international data mining competition devoted to this type of problem, where contestants attempted to predict the behaviour of mobile phone customers using an extensive database of historical information. The University of Melbourne team managed to win one part of this challenge, using R almost exclusively. In this talk I’ll give some background to the area and the specific problem, and discuss how we went about building our models. The talk will be fairly accessible, and deal with many of the practical issues encountered in this type of work.

Presentation

SURF Meet Up Group

Posted in RUG Sydney | 6 Comments

Taking R to the Limit: Large Datasets; Predictive modeling with PMML and ADAPA

During the first part of our meeting, Ryan Rosario presented on the topic of large datasets in R. Video, slides and code of the talk “Taking R to the Limit: Large Datasets” by Ryan Rosario at the Los Angeles area R Users Group in August 2010 are below.

Video

Slides

Slides are also available for PDF download here.
R code is available here.
More information about the talk can be found here.

During the second part, Trividesh Jena presented on creating models in R for use with the Zementis ADAPA product in the cloud. The video of his talk is below.

Posted in RUG Los Angeles

useR! 2010 conference videos

Videos of the invited talks at the useR! 2010 conference follow (courtesy of Kate Mullen and NIST). This site also aims to collect materials (video, slides, R code) from local R users group (RUG) meetings and various other R talks and to bring them to the larger R community. See more videos here, and if you’d like to contribute materials, see more information here.

Welcome by NIST and
Frank E. Harrell Jr: Information Allergy

Mark S. Handcock: Statistical Modeling of Networks in R and
Panel Discussion: Challenges Bringing R into Commercial Environments

Luke Tierney: Some possible directions for the R engine and
Diethelm Würtz: The Hull, the Feasible Set, and the Risk Surface: A Review of the Portfolio Modeling Infrastructure in R/Rmetrics

Friedrich Leisch: Reproducible Statistical Research in Practice and
Uwe Ligges: Prospects and Challenges for CRAN – with a glance on 64-bit Windows binaries

Audio of Richard M. Stallman: Free Software in Ethics and in Practice can also be found here (the main talk) and here (the Q&A). More details about the talk and the Q&A session are available in the original blog post.

Posted in useR! conference | 7 Comments

How Google and Facebook are using R

This is an older (2009) video from the kickoff meeting of the San Francisco Bay Area R Users Group, a panel discussion held as part of the Predictive Analytics World conference. Video courtesy of Ron Fredericks of LectureMaker (click on the image below to see the video on LectureMaker’s site).

Posted in RUG San Francisco Bay Area | 1 Comment

Use Rapache: It Works!

A half-hour lecture by Jeffrey Horner on rApache.

Posted in RUG San Francisco Bay Area

useR! 2010 – Local R User Group Panel

Posted in group introductions | 1 Comment

Advanced debugging techniques in R

Posted in RUG New York

Map-reduce in R with Amazon EMR

Posted in RUG Chicago

Taking R to the Limit: Parallelization

Video, slides and code of the talk “Taking R to the Limit: Parallelization” by Ryan Rosario at the Los Angeles area R Users Group in July 2010 are below.

Slides:

R code: here.

Video:

If you have a question for the speaker, please leave a comment below. This site also encourages discussion among everyone interested in the talks, so please use the comments for that as well.

Posted in RUG Los Angeles | 1 Comment

RUG Introduction: Los Angeles area R Users Group

A nice group of people from academia and industry who meet about once a month at UCLA. Attendance is usually 30-40 and gradually increasing, and the group has about 300 registered members. If you’d like to join, visit the group’s website: http://www.meetup.com/LAarea-R-usergroup/. There you’ll find descriptions of past meetings, and you’ll be able to RSVP for upcoming ones. The materials of the talks (video, slides, code) will be uploaded right here (http://www.r-bloggers.com/RUG), but if you are in the Los Angeles area, don’t miss the discussions after the talks and the networking opportunities offered by the meetings. All events are free to attend; all you need to do is sign up on the website and RSVP for the meetings. If you have questions, please contact the organizers (Szilard Pafka and Jan de Leeuw) via the group’s website.

Posted in group introductions, RUG Los Angeles