Blog Archives

Insert and Remove Performance of Boost’s flat_set vs. std::set

January 16, 2014

The standard way to represent an ordered set of numbers is with a balanced binary tree. This offers a good mix of performance properties as the number of elements gets large. In particular, it offers O(log(n)) insertion and deletion, and O(log(n)) lookup of an element. Finding the ith element of the set takes more time, O(n) operations

Read more »

The OpenStreetMap Package Opens Up

April 14, 2013

A new version of the OpenStreetMap package is now up on CRAN, and should propagate to all the mirrors in the next few days. The primary purpose of the package is to provide high resolution map/satellite imagery for use in your R plots. The package supports base graphics and ggplot2, as well as transformations between spatial coordinate

Read more »

Interview by DecisionStats

April 5, 2013

Ajay Ohri interviewed me on his popular DecisionStats blog. Topics discussed ranged widely from Fellows Statistics, to Deducer, to statnet, to Poker A.I., to Big Data.

Read more »

Quickly Profiling Compiled Code within R on the Mac

April 3, 2013

This is a quick note on profiling your compiled code on the Mac. It is important not to guess when figuring out where the bottlenecks in your code are, and for this reason, the R manual has several suggestions on how to profile compiled code running within R. All of the methods are platform dependent, with Linux requiring command line tools

Read more »

Great Infographic

March 4, 2013

This is a really great exposition on an infographic. Note that the design elements and "chart junk" serve to better connect and communicate the data to the viewer. The choice not to go with pie charts for the first set of plots is a good one. The drawbacks of polar representations of proportions are very

Read more »

Climate: Misspecified

December 4, 2012

I'm usually quite a big fan of the content syndicated on R-Bloggers (as this post is), but I came across a post yesterday that was as statistically misguided as it was provocative. In this post, entitled "The Surprisingly Weak Case for Global Warming," the author (Matt Asher) claims that the trend toward hotter average global temperatures over the last

Read more »

How the Democrats may have won the House, but lost the seats

November 14, 2012

The 2012 election is over and in the books. A few very close races remain to be officially decided, but for the most part everything has settled down over the last week. By all accounts it was a very good night for the Democrats, with wins in the presidency, senate and state houses. They also performed

Read more »

wordcloud makes words less cloudy

September 11, 2012

An update to the wordcloud package (2.2) has been released to CRAN. It includes a number of improvements to the basic wordcloud, notably that you may now pass it text and Corpus objects directly, as in: #install.packages(c("wordcloud","tm"),repos="http://cran.r-project.org") library(wordcloud) library(tm) wordcloud("May our children and our children's children to a thousand generations, continue to enjoy the

Read more »

R for Dummies

August 20, 2012

The book R for Dummies was released recently, and was just reviewed by Dirk Eddelbuettel in the Journal of Statistical Software. Dirk is an R luminary, creating such fantastic works as Rcpp. R for Dummies seems to have beaten Dirk's natural disinclination to like anything with "for Dummies" appended to it, receiving a pretty positive review. Here is the last bit: "R

Read more »

Deducer.org reaches 250,000 page views and continues to grow

April 20, 2012

It is difficult for R package authors to know how much (if at all) their packages are being used. CRAN does not calculate or make public download statistics (though this might change in the relatively near future), so authors can't tell if 10 or 10,000 people are using their work. Deducer is in much the same boat.

Read more »