Think academic journals look the same? Well, some do…

February 8, 2011

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

We saw yesterday that finding an optimal strategy to publish is
not that simple. And actually, it can be even more difficult when
the journal rejects the paper (not because it is incorrect, but
because “it does not fit” with the standards, the quality of the
journal, the audience, the editor’s mood, or whatever). The author
then has basically two options:

  • forget about the article and move on to something else (e.g. start a
    blog where he/she will be the author and the editor)
  • pretend that the article is worth publishing and then try to find
    another journal with similar interests

But this last choice is not that easy, since sometimes the author thinks
that this journal was indeed the one that should publish it (e.g. all
the articles on the subject have been published in that journal).
So I was wondering whether there are clusters of journals, i.e. journals
that publish almost the same kind of articles (so that the next time one
of my papers is rejected by the editor, I can just go for some journal in
the same cluster).
So what I did is extremely simple: I looked at article titles and looked
for correlations between word frequencies (I could have done that with
keywords, but I am not a big fan of those). I looked at 35 journals (that
are somehow related to my areas of interest) and at the titles of all
articles published over the last 20 years. Then I kept the top 1000
words, and I removed standard short words (“a”, “the”, “is”, etc).
Actually, my top words look like

"models" "model" "data" "estimation" "analysis" "time" 
"processes" "risk" "random" "stochastic" "regression"
"market" "approach" "optimal" "based" "information"
"evidence" "linear" "games" "bayesian" "theory" "effects"
"distribution" "multivariate" "tests" "markets" "markov"
"equilibrium" "dynamic" "process" "distributions"
"application" "stock" "likelihood"

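The post does not show how the word counts were built, so here is a minimal sketch of one way to do it in base R. Everything in it is an assumption for illustration: the toy `titles` data frame, the short `stopwords` list, the `n_top` cutoff and the name `word_counts` are hypothetical, and the real dataset would of course have 35 journals and many more titles.

```r
# Hypothetical toy data: one row per article, with its journal and title
titles <- data.frame(
  journal = c("Journal A", "Journal A", "Journal B", "Journal B"),
  title   = c("A model of the stock market",
              "Risk and information in markets",
              "Estimation of Markov processes",
              "The Markov model is a stochastic process"),
  stringsAsFactors = FALSE
)

# Lower-case the titles, replace non-letters by spaces, split into words
words <- strsplit(tolower(gsub("[^[:alpha:] ]", " ", titles$title)), " +")

# Illustrative stop-word list (an assumption, not the author's list)
stopwords <- c("a", "the", "is", "of", "and", "in")

# Long table of (journal, word) pairs, dropping stop words and empty strings
long <- data.frame(
  journal = rep(titles$journal, lengths(words)),
  word    = unlist(words),
  stringsAsFactors = FALSE
)
long <- long[!(long$word %in% stopwords) & nchar(long$word) > 0, ]

# Keep the n_top most frequent words over all journals
n_top <- 5
top_words <- names(sort(table(long$word), decreasing = TRUE))[seq_len(n_top)]

# Journal-by-word count matrix (rows: journals, columns: top words)
word_counts <- unclass(table(long$journal,
                             factor(long$word, levels = top_words)))
```

A matrix like `word_counts` (journals in rows, words in columns) is the kind of object a PCA or a clustering routine can then take as input; dividing each row by its total would turn counts into frequencies.
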
Then, I ran a principal component analysis
on my dataset (containing 960 variables – here words – and 35 observations – here journal names).

library(FactoMineR)  # provides PCA() and plot.PCA()
res.pca = PCA(MATRICE, scale.unit=TRUE, ncp=5)
plot.PCA(res.pca, axes=c(1, 2), choix="ind")
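
The `PCA()` and `plot.PCA()` calls above come from the FactoMineR package. For readers who prefer base R, a roughly comparable sketch uses `prcomp()` with centering and scaling. The toy matrix `word_counts` and its journal and word labels are hypothetical, and the scores differ from FactoMineR's by a constant factor, since the two functions do not standardize with the same divisor.

```r
# Hypothetical journal-by-word count matrix (rows: journals, columns: words)
word_counts <- matrix(
  c(9, 7, 1, 0,
    8, 6, 0, 1,
    1, 0, 9, 8,
    0, 2, 7, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Finance 1", "Finance 2", "Proba 1", "Proba 2"),
                  c("market", "risk", "markov", "process"))
)

# PCA on standardized variables (each word centered and scaled to unit variance)
pca <- prcomp(word_counts, center = TRUE, scale. = TRUE)

# Coordinates of the journals on the first two principal axes
scores <- pca$x[, 1:2]

# Share of the total variance carried by each axis
explained <- pca$sdev^2 / sum(pca$sdev^2)
```

Calling `plot(scores)` followed by `text(scores, labels = rownames(scores))` gives a projection of the journals on the first two axes; on this toy example the first axis separates the two hypothetical finance journals from the two probability ones.
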

The projection of the journals on the first two axes looks like this:

Here, we can clearly observe some clusters: in the upper left, Journal of
Finance and Journal of Banking and Finance (say financial journals); in
the upper right, Biometrika, Biometrics, Computational Statistics and Data
Analysis and Journal of Econometrics (JASA is not far away, i.e. applied
statistics journals). And below, on the right, Stochastic Processes and
their Applications, Annals of Applied Probability, Journal of Applied
Probability, Annals of Probability, Proceedings of AMS and Topology and
(i.e. more theoretical journals).
Note that the projection is rather robust: if I consider only my first 200
words, the graph is the same.

In order to go further in the interpretation, we can also plot
variables, i.e. words from titles,

where we cannot distinguish anything. So if I just look at my top 30,
here they are,

On the top left we see market(s), risk or information; on the top right
analysis, effects, models or tests; while below we see Markov or
process(es). And we can observe interesting facts: in finance and in
statistics, we talk about dynamics, while in theoretical (mathematical)
journals it is about processes.

But the goal was to find clusters, i.e. classes of journals that publish
papers with similar titles.

cah = hclust(DISTANCE)
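
The post does not say how `DISTANCE` was computed. One possibility, given the focus on correlations between word frequencies, is to use one minus the correlation between journals; this is an assumption, not necessarily the author's choice. A minimal sketch on a hypothetical toy matrix `word_counts`:

```r
# Hypothetical journal-by-word count matrix (rows: journals, columns: words)
word_counts <- matrix(
  c(9, 7, 1, 0,
    8, 6, 0, 1,
    1, 0, 9, 8,
    0, 2, 7, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Finance 1", "Finance 2", "Proba 1", "Proba 2"),
                  c("market", "risk", "markov", "process"))
)

# Assumed distance: 1 - correlation between the word profiles of two journals
# (cor() works on columns, hence the transpose)
journal_dist <- as.dist(1 - cor(t(word_counts)))

# Agglomerative hierarchical clustering (hclust uses complete linkage by default)
tree <- hclust(journal_dist)

# Cut the tree into two classes of journals
groups <- cutree(tree, k = 2)
```

`plot(tree)` draws the dendrogram, and `groups` gives the class of each journal. Other distances (e.g. `dist()` on standardized counts) or linkage methods would be equally defensible and may change the classes.
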

Here we have

If some classes are rather natural (Journal of Applied Proba. and Advances in Applied Proba., or Economic Theory, Journal of Economic Theory and Journal of Mathematical Economics), some strong correlations are not simple to understand (e.g. Insurance: Mathematics and Economics and Management Science, or Annals of Statistics and the Journal of Multivariate Analysis).
Again, it might be possible to spend hours on the graphs, but if I want
– someday – to submit something to one of those journals, I guess I have
to stop here, and move to something else…
