Blog Archives

How Orbitz uses Hadoop and R to optimize hotel search

December 21, 2010

Positional bias — the tendency for users to preferentially select results in the first few positions of a search — is a big issue for all kinds of search engines. But for online travel site Orbitz the stakes are higher than for a traditional Web search engine: if a customer chooses the first-listed hotel in a search for accommodations,...

In case you missed it: November Roundup

December 17, 2010

In case you missed them, here are some articles from November of particular interest to R users. Dirk Eddelbuettel and Romain Francois went to Google to talk about integrating R (using Rcpp, for example), and we gave a review of the video presentation. R co-creator Ross Ihaka wins a Lifetime Achievement Award in Open Source. Revolution has job openings...

Programming languages, ranked by popularity

December 17, 2010

In a presentation to the Chicago R User Group last night, Drew Conway used his new Infochimps package in R to assess the relative popularity of programming languages. Drew used the word.stats function in the Infochimps package to count the frequency of common computer languages mentioned in Twitter messages, and displayed the results in this bar chart: It's not...

R 2.12.1 is out

December 16, 2010

As promised, the latest patch to R is out with the release of R 2.12.1, as announced today by the R Core Team. If you build R yourself, sources are available now at your local CRAN mirror, and binaries for Windows, Mac and Linux will be available in the next few days. There are a few new features: The...

Data Driven Journalism

December 15, 2010

Last night at the Bay Area UseR Group meeting, Peter Aldhous, San Francisco Bureau Chief of New Scientist Magazine, gave an inspiring presentation about Data Driven Journalism. Even though the newspaper industry's business model is faltering, there's a beacon of light: journalists can be the driving force behind bringing out the meaning in the huge data sets that...

Facebook’s Social Network Graph

December 14, 2010

Paul Butler, an intern on Facebook’s data infrastructure engineering team, was interested in visualizing the "locality of friendship". Luckily, he has some great data to work with: Facebook's social network of the friendships between its 500 million members. But visualizing that much data can be a challenge in its own right -- it takes skill to draw meaning from...

Machine Learning and Data Mining with R

December 13, 2010

The San Francisco Bay Area ACM runs several courses on data mining and machine learning with R. Machine Learning 101 deals primarily with supervised learning problems, and Machine Learning 102 covers unsupervised learning and fault detection. Machine Learning 101 & 102 were most recently presented by Mike Bowles & Tricia Hoffman in September, and the lecture notes and class...

An R interface to the Google Prediction API

December 10, 2010

At the New York R User Group* last night, 100 R users heard Ni Wang and Max Lin explain how "R is one of the important tools used by analysts and engineers at Google for analyzing data". During the talk, Lin revealed that Google plans to make "R more integrated with internal machine learning algorithms and infrastructure", and...

Choosing colors for your charts with RColorBrewer

December 9, 2010

If you're creating a bar chart in R, how do you decide what colors the bars should be? Or if you're creating an image plot, what range of colors should you use? The colors you choose not only affect the viewer's interpretation of the graphic, they also determine its aesthetic appeal. That's where the RColorBrewer package...
