My R package’s worldmap of downloads!

June 18, 2013

(This article was first published on Research Side Effects » R, and kindly contributed to R-bloggers)

Last week, a colleague draw my attention on this new log files from the Rstudio cloud CRAN mirror, through a post from Tal Galili. This CRAN mirror is a little different, as it uses Amazon CloudFront to deliver the downloads rapidly from a server near you, wherever that is. But what’s really great about it, is the availability of those log files, that have been recording every package download since October 2012, daily! As R is primarily intended for statisticians, it didn’t take long before we start playing with the data. Below is my bit to this effort, a function to plot the world map of one’s package downloads from the Rstudio “0-Cloud” CRAN mirror. It relies on Tal Galili’s functions for downloading and formating the data. Of course, the Rstudio CRAN mirror is only 1 mirror among all the CRAN mirrors around the world, and is not representative due to its link with the Rstudio IDE. However, in research, there is quite a delay between one’s hard work (i.e. implementing a package) and the reward (i.e. publication). Encouragements such as download stats are welcomed!

pkgDNLs.worldmapcolor <- function(pkgname, rmvdupips=TRUE, date.start=Sys.Date()-8, date.stop=Sys.Date()-1, shp.file.repos){

if(date.stop>=Sys.Date() | date.start>=Sys.Date()){
stop("date not available yet")

cat('downloading the RStudio CRAN data...\n')
dnldata_folder <- download_RStudio_CRAN_data(START = date.start, END = date.stop)
cat('data downloaded!\n')
dnldata <- read_RStudio_CRAN_data(dnldata_folder)

data <- dnldata[which(dnldata$package == pkgname),]
data <- data[!duplicated(data$ip_id),]

counts <-$country))
names(counts) <- c("country", "count")

# you need to download a shapefile of the world map from Natural Earth (for instance)
# and unzip it in the 'shp.file.repos' repository
world<-readShapePoly(fn=paste(shp.file.repos, "ne_110m_admin_0_countries", sep="/"))
ISO_full <- as.character([email protected]$iso_a2)
ISO_full[146] <- "SOM"  # The iso identifier for the Republic of Somaliland is missing
ISO_full[89]  <- "KV" # as for the Republic of Kosovo
ISO_full[39]  <- "CYP" # as for Cyprus

colcode <- numeric(length(ISO_full))
names(colcode) <- ISO_full
dnl_places <- names(colcode[which(names(colcode) %in% as.character(counts$country))])
rownames(counts) <- counts$country
colcode[dnl_places] <- counts[dnl_places, "count"]

[email protected]$id <- rownames([email protected])
world.points <- fortify(world, by="id")
names(colcode) <- rownames([email protected])
world.points$dnls <- colcode[world.points$id] <-  ggplot(data=world.points) +
geom_polygon(aes(x = long, y = lat, group=group, fill=dnls), color="black") +
coord_equal() + #theme_minimal() +
scale_fill_gradient(low="white", high="#56B1F7", name="Downloads") +
labs(title=paste(pkgname, " downloads from the '0-Cloud' CRAN mirror by country\nfrom ", date.start, " to ", date.stop,"\n(Total downloads: ", sum(counts$count), ")", sep=""))

wm <- pkgDNLs.worldmapcolor(pkgname="timeROC",  shp.file.rep="~/shapefileRepository")

Here is an exemple of the result on a friend‘s package: timeROC timeROC worldmap

To leave a comment for the author, please follow the link and comment on their blog: Research Side Effects » R. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)