Boxplots are a good way to get some insight into your data, and while R provides a fine ‘boxplot’ function, it doesn’t label the outliers in the graph. However, with a little code you can add labels yourself: The numbers plotted next to ...
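As a rough illustration of the idea (not the original post's code), one way to label outliers in base R is to use the values that boxplot() returns in its $out component and pass them to text(). The data and the observation names below are made up for the example.

```r
# Simulated data with two planted outliers (illustrative only)
set.seed(42)
y <- c(rnorm(50), 6, -5)
names(y) <- paste0("obs", seq_along(y))  # hypothetical labels

# boxplot() invisibly returns what it drew; $out holds the outlier values
bp <- boxplot(y, main = "Boxplot with labelled outliers")

# Recover the positions of the outlying observations in the original vector
out_idx <- which(y %in% bp$out)

# Write each label just to the right of its point (a single box sits at x = 1)
if (length(out_idx) > 0) {
  text(x = 1, y = y[out_idx], labels = names(y)[out_idx], pos = 4, cex = 0.8)
}
```

Matching on values with %in% is fine for continuous data like this; with heavily tied data it may be safer to compute the whisker limits from bp$stats and compare against those instead.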

I began to think about a nice way of plotting campaign expenditures in a paper I'm working on. I thought this would be something like the following: simple but meaningful even when there are outliers in both tails. Though I like the venerable Tukey boxplot and scatter plots, I had already used them the last time …

This week my research group discussed Adrian Raftery’s recent paper on “Use and Communication of Probabilistic Forecasts” which provides a fascinating but brief survey of some of his work on modelling and communicating uncertain futures. Coincidentally, today I was also sent a copy of David Spiegelhalter’s paper on “Visualizing Uncertainty About the Future”. Both are

Before we start: yes, we’ve been here before. There was the Biostars question “Calculating Time From Submission To Publication / Degree Of Burden In Submitting A Paper.” That gave rise to Pierre’s excellent blog post and code + data on Figshare. So why are we here again? 1. It’s been a couple of years. 2.

ggvis 0.4 is now available on CRAN. You can install it with:
    install.packages("ggvis")
The major features of this release are:
- Boxplots, with layer_boxplots():
    chickwts %>% ggvis(~feed, ~weight) %>% layer_boxplots()
- Better stability when errors occur.
- Better handling of empty data and malformed data.
- More consistent handling of data in compute pipeline functions.
Because of these changes,

Last week, a student asked me about multiple tests. More precisely, she ran an experiment over, say, 20 weeks, with the same cohort of, say, 100 patients. And we observe some
    size=100
    nb=20
    set.seed(1)
    X=matrix(rnorm(size*nb),size,nb)
(here, I just generate some fake data). I can visualize some trajectories, over the 20 weeks,
    library(RColorBrewer)
    cl1=brewer.pal(12,"Set3")
    cl2=brewer.pal(8,"Set2")
    cl=c(cl1,cl2)...
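The excerpt stops before the plotting call, so the following is only a sketch of how such trajectories could be drawn, using base graphics and a built-in palette instead of RColorBrewer. The choice of matplot(), the number of patients shown, and the colours are assumptions, not the author's code.

```r
# Same fake data as in the excerpt: 100 patients observed over 20 weeks
size <- 100
nb <- 20
set.seed(1)
X <- matrix(rnorm(size * nb), size, nb)

# Assumption: show only the first 10 patients to keep the plot readable
n_show <- 10
cols <- rainbow(n_show)

# matplot() draws one line per column, so transpose to get one line per patient
matplot(1:nb, t(X[1:n_show, ]), type = "l", lty = 1, col = cols,
        xlab = "Week", ylab = "Observed value",
        main = "Simulated trajectories (first 10 patients)")
```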

We leave the Jolly Roger behind this year and turn our piRate spyglass towards the digital seas and take a look at piRated movies as seen through the lens of TorrentFreak. The seasoned seadogs who pilot that ship have been doing a weekly “Top 10 Pirated Movies of the Week” post since early 2013, and...

by Joseph Rickert. While preparing for the DataWeek R Bootcamp that I conducted this week I came across the following gem. This code, based directly on a Max Kuhn presentation of a couple years back, compares the efficacy of two machine learning models on a training data set.
    #-----------------------------------------
    # SET UP THE PARAMETER SPACE SEARCH GRID
    ctrl <-...

Rick Wicklin (@RickWicklin) made a recent post to the SAS blog on An exploratory technique for visualizing the distributions of 100 variables. It’s a very succinct tutorial on both the power of boxplots and how to make them in SAS (of course). I’m not one to let R be “out-boxed”, so I threw together a