My R package’s worldmap of downloads!

[This article was first published on Research Side Effects » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week, a colleague draw my attention on this new log files from the Rstudio cloud CRAN mirror, through a post from Tal Galili. This CRAN mirror is a little different, as it uses Amazon CloudFront to deliver the downloads rapidly from a server near you, wherever that is. But what’s really great about it, is the availability of those log files, that have been recording every package download since October 2012, daily! As R is primarily intended for statisticians, it didn’t take long before we start playing with the data. Below is my bit to this effort, a function to plot the world map of one’s package downloads from the Rstudio “0-Cloud” CRAN mirror. It relies on Tal Galili’s functions for downloading and formating the data. Of course, the Rstudio CRAN mirror is only 1 mirror among all the CRAN mirrors around the world, and is not representative due to its link with the Rstudio IDE. However, in research, there is quite a delay between one’s hard work (i.e. implementing a package) and the reward (i.e. publication). Encouragements such as download stats are welcomed!

pkgDNLs.worldmapcolor <- function(pkgname, rmvdupips=TRUE, date.start=Sys.Date()-8, date.stop=Sys.Date()-1, shp.file.repos){

if(date.stop>=Sys.Date() | date.start>=Sys.Date()){
stop("date not available yet")

cat('downloading the RStudio CRAN data...\n')
dnldata_folder <- download_RStudio_CRAN_data(START = date.start, END = date.stop)
cat('data downloaded!\n')
dnldata <- read_RStudio_CRAN_data(dnldata_folder)

data <- dnldata[which(dnldata$package == pkgname),]
data <- data[!duplicated(data$ip_id),]

counts <-$country))
names(counts) <- c("country", "count")

# you need to download a shapefile of the world map from Natural Earth (for instance)
# and unzip it in the 'shp.file.repos' repository
world<-readShapePoly(fn=paste(shp.file.repos, "ne_110m_admin_0_countries", sep="/"))
ISO_full <- as.character(world@data$iso_a2)
ISO_full[146] <- "SOM"  # The iso identifier for the Republic of Somaliland is missing
ISO_full[89]  <- "KV" # as for the Republic of Kosovo
ISO_full[39]  <- "CYP" # as for Cyprus

colcode <- numeric(length(ISO_full))
names(colcode) <- ISO_full
dnl_places <- names(colcode[which(names(colcode) %in% as.character(counts$country))])
rownames(counts) <- counts$country
colcode[dnl_places] <- counts[dnl_places, "count"]

world@data$id <- rownames(world@data)
world.points <- fortify(world, by="id")
names(colcode) <- rownames(world@data)
world.points$dnls <- colcode[world.points$id] <-  ggplot(data=world.points) +
geom_polygon(aes(x = long, y = lat, group=group, fill=dnls), color="black") +
coord_equal() + #theme_minimal() +
scale_fill_gradient(low="white", high="#56B1F7", name="Downloads") +
labs(title=paste(pkgname, " downloads from the '0-Cloud' CRAN mirror by country\nfrom ", date.start, " to ", date.stop,"\n(Total downloads: ", sum(counts$count), ")", sep=""))

wm <- pkgDNLs.worldmapcolor(pkgname="timeROC",  shp.file.rep="~/shapefileRepository")

Here is an exemple of the result on a friend‘s package: timeROC timeROC worldmap

To leave a comment for the author, please follow the link and comment on their blog: Research Side Effects » R. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)