Time to Accept It: publishing in the Journal of Statistical Software

June 30, 2014
By

(This article was first published on The Geokook. » R, and kindly contributed to R-bloggers)

When I was considering submitting my paper on psd to J. Stat. Soft. (JSS), I kept noticing that the time from “Submitted” to “Accepted” was nearly two years in many cases.  I ultimately decided that was much too long of a review process, no matter what the impact factor might be (and in two years time, would I even care?).  Tonight I had the sudden urge to put together a dataset of times to publication.

Fortunately the JSS website is structured such that it only took a few minutes playing with XML scraping (*shudder*) to write the (R) code to reproduce the full dataset.  I then ran a changepoint (published in JSS!) analysis to see when shifts in mean time have occurred.  Here are the results:

 

Top: The number of days for a paper to go from 'Submitted' to 'Accepted'.  Middle: In log2(time), with lines for one month, one year, and two years. Bottom frame: changepoint analyses.

Top: The number of days for a paper to go from ‘Submitted’ to ‘Accepted’ as a function of the cumulative issue index (each paper is an “issue” in JSS).
Middle: In log2(time), with lines for one month, one year, and two years.
Bottom frame: changepoint analyses.

Pretty interesting stuff, but kind of depressing: the average time it takes to publish is about 1.5 years, with a standard deviation of 206 days.  There are many cases where the paper review is <1 year, but those tend to be in the ‘past’ (prior to volume 45, issue 1).

Of course, these results largely reflect an increase in academic impact (JSS is becoming more impactful), which simultaneously increases the number of submissions for the editors to deal with.  So, these data should be normalized by something.  By what, exactly, I don’t know.

And, finally, I can’t imagine how the authors of the paper that went through a 1400+ day review process felt — or are they still feeling the sting?

Here’s my session info:

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] changepoint_1.1.5 zoo_1.7-11 plyr_1.8.1 XML_3.98-1.1

loaded via a namespace (and not attached):
[1] grid_3.1.0 lattice_0.20-29 Rcpp_0.11.1 tools_3.1.0
And here's the R-code needed to reproduce the dataset and figure:
library(XML)
library(plyr)
library(changepoint)

#Current Volume:
cvol <- 58
# set to 'TRUE' if you want to
# reproduce the dataset with each
# run (not recommended)
redo <- FALSE

jstat.xml <- function(vol=1, tolist=TRUE){
 src <- "http://www.jstatsoft.org/"
 vsrc <- sprintf("%sv%i",src,vol)
 message(vsrc)
 X <- xmlParse(vsrc)
 if (tolist) X <- xmlToList(X)
 return(X)
}

jstat.DTP <- function(vol=1, no.authors=FALSE, no.title=FALSE){
 # Get article data
 xl <- jstat.xml(vol)
 Artic <- xl$body[[4]]$div$ul # article data

 # Vol,Issue
 issues <- ldply(Artic, function(x) return(x[[5]][[1]]))$V1
 issues2 <- ldply(strsplit(issues,split=","),
 .fun=function(x){gsub("\n Vol. ","",gsub("Issue ","",x))})

 # Accepted
 dates <- ldply(Artic, function(x) return(x[[6]][[1]]))$V1
 dates2 <- ldply(strsplit(dates, split=","),
 .fun=function(x){as.Date(gsub("Accepted ","",gsub("Submitted ","",x)))})

 Dat <- data.frame(Volume=issues2$V1, Issue=issues2$V2, Date=issues2$V3,
 Submitted=dates2$V1, Accepted=dates2$V2,
 Days.to.pub=as.numeric(difftime(dates2$V2, dates2$V1, units="days")),
 Author=NA, Title=NA)

 if (!no.authors){
 # Authors
 Dat$Author <- ldply(Artic, function(x) return(x[[3]][[1]]))$V1
 }

 if (!no.title){
 # Title
 Dat$Title <- ldply(Artic, function(x) return(x[[1]][[1]]))$V1
 }

 return(Dat)
}

# Shakedown
#jstat.DTP(58)
#ldply(57:58, jstat.DTP)

if (!exists("Alldata") | redo){
 Alldata <- ldply(seq_len(cvol), jstat.DTP)
 save(Alldata, file="JStatSoft_DtP.rda")
} else {
 load("JStatSoft_DtP.rda")
}

Alldata.s <- arrange(Alldata, Days.to.pub)
Cpt <- suppressWarnings(cpt.mean(Alldata$Days.to.pub, method="SegNeigh", Q=4))
niss <- length(Cpt@data.se[email protected])

summary(Cpt)
sd(Alldata$Days.to.pub)

PLT <- function(){
 layout(matrix(1:3))
 par(las=1, cex=0.8)

 with(Alldata, {
 par(mar=c(0.5,5,2,1))
 plot(Days.to.pub,
 xlim=c(0,niss), #xaxs="i",
 type="l", col="grey", ylab="Days", xaxt="n",
 ylim=c(0,1500),yaxs="i")
 points(Days.to.pub, pch=3)
 mtext("Time to 'Accepted': J. Stat. Soft.", font=2, line=0.5)
 axis(1, labels=FALSE)
 par(mar=c(2,5,0.,1))
 plot(log2(Days.to.pub),
 xlim=c(0,niss), #xaxs="i",
 pch=3, ylab="log2 Days")
 yt <- log2(356*c(1/12,1:2))
 abline(h=yt, col="red", lty=2)
 })

 par(mar=c(4,5,1,1))
 plot(Cpt,
 xlim=c(0,niss), #xaxs="i",
 xlab="Issue index", ylab="Days",
 ylim=c(-499,1500),yaxs="i")
 mtext("Changes in mean", font=2, line=-2, adj=0.1, cex=0.8)

 Dm <- lbls <- round(param.est(Cpt)$mean)
 lbls[1] <- paste("section mean:",lbls[1])
 Lc <- cpts(Cpt)
 yt <- -75-max(Dm)+Dm
 xt <- c(0,Lc)+diff(c(0,Lc,niss))/2

 text(xt, yt, lbls, col="red", cex=0.9)
}

PLT()

To leave a comment for the author, please follow the link and comment on his blog: The Geokook. » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.