I have been forever annoyed at how long it takes to plot data on a large shapefile. And this is a domain where it doesn’t matter whether you’re working with MapInfo or R. Just zooming the figure takes ages. But a …

Did you notice that the file generated by write.table() in R is missing a tab (\t) in the top-left corner when row.names=TRUE (the default)? I found the solution here: http://stackoverflow.com/questions/2478352/write-table-in-r-screws-up-header-when-has-r...
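A minimal sketch of the usual fix, assuming the goal is a tab-separated file whose header lines up with the data columns: passing col.names = NA to write.table() writes an empty header cell above the row names. The data frame and temporary file below are made up for illustration.

```r
# Toy data frame with explicit row names (hypothetical example data).
df <- data.frame(a = 1:2, b = c("x", "y"), row.names = c("r1", "r2"))

# Default behaviour: the header has one field fewer than the data rows.
out_default <- tempfile(fileext = ".tsv")
write.table(df, file = out_default, sep = "\t", quote = FALSE)
header_default <- readLines(out_default, n = 1)

# With col.names = NA an empty field is written above the row names,
# so the header starts with a tab and has as many fields as the rows.
out_fixed <- tempfile(fileext = ".tsv")
write.table(df, file = out_fixed, sep = "\t", quote = FALSE, col.names = NA)
header_fixed <- readLines(out_fixed, n = 1)

# The fixed file reads back cleanly with the first column as row names.
df_back <- read.delim(out_fixed, row.names = 1)
```

Note that col.names = NA is only accepted when row names are being written, which is the default.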

Naomi Robbins is running a graph makeover challenge over at her Forbes blog and this is my entry for the B2B/B2C Traffic Sources one (click for larger version): And, here’s the R source for how to generate it: library(ggplot2) df = read.csv("b2bb2c.csv") ggplot(data=df,aes(x=Site,y=Percentage,fill=Site)) + geom_bar(stat="identity") + facet_grid(Venue ~ .) + coord_flip() + opts(legend.position

I’ve been writing lately on what to do when people who make decisions in an organization say they want data-driven capabilities but then ignore or attack the results of data-driven analysis for not saying what they think the data ought to say. Some of the most productive things you can do in that situation include

In the course of working through my MODIS LST project and reviewing the steps that Imhoff and Zhang took, as well as the data preparations other researchers have taken (Neteler), the issue of MODIS quality control bits came up. Every MODIS HDF file comes with multiple SDS or multiple layers of data. For
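As a rough illustration of what working with quality control bits involves, the sketch below pulls a bit field out of an integer QC value using integer arithmetic. The function name extract_bits and the two-bit field positions are assumptions made for this example; the actual bit layout depends on the MODIS product and should be taken from its user guide.

```r
# Extract `width` bits starting at bit position `start` (bit 0 is the
# least significant bit) from non-negative integer QC values.
# extract_bits is a hypothetical helper, not part of any MODIS package.
extract_bits <- function(qc, start, width) {
  (qc %/% 2^start) %% 2^width
}

# Example QC byte 65 = binary 01000001 (made-up value).
qc <- 65
low_field  <- extract_bits(qc, start = 0, width = 2)  # bits 0-1
mid_field  <- extract_bits(qc, start = 2, width = 2)  # bits 2-3
high_field <- extract_bits(qc, start = 6, width = 2)  # bits 6-7
```

Because the arithmetic is vectorised, the same call works on a whole vector or matrix of QC values read from a layer.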

This brief tutorial illustrates how to combine S4 object-oriented capabilities with function closures in order to develop classes with built-in methods. Thanks to Hadley Wickham for the great contribution of material and tutorials made available on the web and to Bill Venables and Stefano Iacus for their kind reviews. Regular …

Graphs can provide an excellent way to emphasize a point and to quickly and efficiently show important information. Sadly, poor graphs can be a good way to waste space in an article, take up time in a presentation, and waste a lot of ink all while providing little to no information. Excel has made it

As you probably know, I am one of the strongest proponents of the Shiny package for developing interactive web applications. Amongst the latest news from RStudio is that what was planned to be commercial software will now be free and Open Source (AGPLv3 license). To celebrate this momentous announcement, I have produced an Earthquake app.

Last month's release of Revolution R Enterprise 6.1 added the capability to fit decision and regression trees on large data sets (using a new parallel external memory algorithm included in the RevoScaleR package). It also introduced the possibility of applying this and the other big-data statistical methods of RevoScaleR to data files distributed in Hadoop's HDFS file system*,...

I found the following post regarding the anomalous metal object observed in a Curiosity Rover photo to be fascinating - specifically, the clever ways that some programmers used for filtering the image for the object. The following answer on mathematica.stackexchange.com was especially illuminating for its use of a multivariate distribution to...

I'm usually quite a big fan of the content syndicated on R-Bloggers (as this post is), but I came across a post yesterday that was as statistically misguided as it was provocative. In this post, entitled "The Surprisingly Weak Case for Global Warming," the author (Matt Asher) claims that the trend toward hotter average global temperatures over the last

Last month we released Shiny, our new R package for creating interactive web applications. The response from the community has been extremely encouraging–we’ve received a lot of great feedback that has helped us to make significant improvements to the framework already! Shiny 0.2.3 on CRAN Starting with Shiny 0.2.3, you can install the latest stable

R, which was largely predominant in the academic world, has started picking up a lot in businesses as well. At least that is what I am witnessing among my colleagues. A lot of people have started experimenting with R, choosing the path to enlightenment. ...

igraph is a library for "complex network research". While it integrates very well with R and provides a lot of convenient functions, huge graphs put a quick end to all the joy. The good news is: not all functions in igraph have bad performanc...

Here. Indeed, I’d much rather be a legend than a myth. I just want to clarify one thing. Walter Hickey writes: collaborated on this presentation where they take a hard look at what’s wrong with the recent trends of data visualization and infographics. The takeaway is that while there have The post An...

I had a bit of a play with Shiny over the weekend, using the Ergast Motor Racing Data API and the magical Shiny library for R, which makes building interactive, browser-based applications around R a breeze. As this is just a quick heads-up/review post, I’ll largely limit myself to a few screenshots. When I

In celebration of my achieving 10,000 “reputation” on Stack Overflow, I’m re-posting one of my questions from there that was (as I had expected) deleted after being live for about 5 hours. In that time, I never really got a satisfactory answer, so if anyone wants to offer one in the comments, that would be

Lattice plots are a great way of displaying multivariate data in R. Deepayan Sarkar, the author of lattice, has written a fantastic book about Multivariate Data Visualization with R. However, I often have to refer back to the help pages to remind myself how to set and change the legend and how to ensure that...
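As a reminder of the basic mechanism, here is a minimal sketch of setting a legend in a lattice plot through the auto.key argument. The built-in iris data set and the particular legend options are chosen only for illustration and are not taken from the post.

```r
library(lattice)

# Scatter plot grouped by species; auto.key builds a legend that matches
# the grouping. Here it is placed above the panel in three columns with
# a title.
p <- xyplot(Sepal.Length ~ Petal.Length,
            data = iris,
            groups = Species,
            auto.key = list(space = "top", columns = 3, title = "Species"))

# Lattice plots are objects; they are drawn when printed.
print(p)
```

For finer control, the key argument accepts a fully specified legend, at the cost of having to keep symbols and colours in sync with the plot by hand.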

The cube, 41063625 (345^3), can be permuted to produce two other cubes: 56623104 (384^3) and 66430125 (405^3). In fact, 41063625 is the smallest cube which has exactly three permutations of its digits which are also cube. Find the smallest cube for which exactly five permutations of its digits are cube. Read...
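One possible approach, sketched here rather than taken from the post: two numbers are digit permutations of each other exactly when their sorted digit strings match, so cubes can be grouped by that string and the group sizes counted. The function name and the max_root default are assumptions; 9999 is chosen because 9999^3 is the largest 12-digit cube, so every group of up to 12 digits is complete, and all values stay within exact double precision.

```r
# Smallest cube whose digits can be permuted into exactly k cubes
# (counting the cube itself). Returns NA if no such cube has a root
# of at most max_root. For the counts to be exact, max_root should be
# the largest root of some digit length (e.g. 9999 for 12 digits).
smallest_cube_with_permutations <- function(k, max_root = 9999) {
  cubes <- seq_len(max_root)^3

  # Decimal digits of each cube, without scientific notation.
  digits <- sprintf("%.0f", cubes)

  # Signature: the digits in sorted order. Cubes that are permutations
  # of one another share a signature.
  keys <- vapply(strsplit(digits, ""),
                 function(d) paste(sort(d), collapse = ""),
                 character(1))

  counts <- table(keys)
  hits <- names(counts)[counts == k]
  if (length(hits) == 0) return(NA_real_)

  min(cubes[keys %in% hits])
}

three <- smallest_cube_with_permutations(3)
five  <- smallest_cube_with_permutations(5)
```

With k = 3 this reproduces the 41063625 quoted above, which is a convenient sanity check before trusting the result for k = 5.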

New York Times columnist Charles Blow needed a chart to accompany his op-ed piece Lincoln, Liberty and Two Americas (about one-party control in state legislatures). So he turned to resident graphic editor Kevin Quealy, who found the source data and used R to create the chart below: If you'd like to create similar charts yourself, Kevin provides a useful...

A nice, but not very well known, interface to R is TeXmacs. (I have to say that I am not totally objective, since I wrote the interface between R and TeXmacs…) Here’s a sample window: In the following few posts I’d like to explain how to use this interface. Installation First, install TeXmacs. Best is

Over the weekend, we updated all of the pbdR packages currently available on CRAN. The updates include tons of internal housecleaning as well as many new features. Notably, pbdBASE_0.1-1 and pbdDMAT_0.1-1 were released, which contain lm.fit() methods. This function in particular has been available on my GitHub for over a month, but didn't make its way to the...