(This article was first published on **Freakonometrics - Tag - R-english**, and kindly contributed to R-bloggers)

We saw yesterday that finding an optimal strategy to publish is not that simple. And actually, it can be even more difficult when the journal rejects the paper (not because it is incorrect, but because “it does not fit” with the standards, the *quality* of the journal, the audience, the editor’s mood, or whatever). The author then has basically two choices:

- forget about the article and move on to something else (e.g. start a blog where he/she will be the author *and* the editor)
- pretend that the article is worth publishing and then try to find another journal with similar interests

But this last choice is not that easy, since sometimes the author thinks that this journal was indeed the one that should publish it (e.g. all the articles on the subject have been published in that journal).

So I was wondering if there were clusters of journals, i.e. journals that publish *almost* the same kind of articles (so that the next time one of my papers is rejected by the editor, I can just go for some journal in the same cluster).

So what I did is extremely simple: I looked at article *titles* and looked for correlations between word frequencies (I could have done that with keywords, but I am not a big fan of those). I looked at 35 journals (that are somehow related to my areas of interest) and at the titles of all articles published over the last 20 years. Then I kept the top 1000 words, and I removed standard short words (“*a*”, “*the*”, “*is*”, etc). Actually, my top words look like

    "models" "model" "data" "estimation" "analysis" "time"
    "processes" "risk" "random" "stochastic" "regression"
    "market" "approach" "optimal" "based" "information"
    "evidence" "linear" "games" "bayesian" "theory" "effects"
    "distribution" "multivariate" "tests" "markets" "markov"
    "equilibrium" "dynamic" "process" "distributions"
    "application" "stock" "likelihood"

Then, I ran a *principal component analysis* on my dataset (containing 960 variables, here *words*, and 35 observations, here *journal names*).
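The post does not show how the dataset was assembled, so the following is only a minimal sketch of one way to get a matrix with one row per journal and one column per word, using base R. The titles, the journal names, the stop-word list and the cut-off `top_n` are all invented for illustration, and raw counts are used here although frequencies would work the same way.

```r
# Toy titles, invented purely for illustration (not the data from the post)
titles <- data.frame(
  journal = c("Journal A", "Journal A", "Journal B", "Journal B"),
  title   = c("Estimation of risk models",
              "A model of the stock market",
              "Markov processes and random walks",
              "The theory of stochastic processes"),
  stringsAsFactors = FALSE
)

# A tiny stop-word list; a real one would be much longer
stop_words <- c("a", "the", "is", "of", "and")

# Lower-case each title and split it on anything that is not a letter
tokens <- strsplit(tolower(titles$title), "[^a-z]+")

# One row per (journal, word) occurrence
long <- data.frame(
  journal = rep(titles$journal, lengths(tokens)),
  word    = unlist(tokens),
  stringsAsFactors = FALSE
)
long <- long[nchar(long$word) > 0 & !(long$word %in% stop_words), ]

# Journals in rows, words in columns, counts in the cells
counts <- unclass(table(long$journal, long$word))

# Keep only the most frequent words overall (top_n is a hypothetical cut-off)
top_n <- 5
ranked <- names(sort(colSums(counts), decreasing = TRUE))
keep <- ranked[seq_len(min(top_n, length(ranked)))]
word_matrix <- counts[, keep, drop = FALSE]
word_matrix
```

With real data, `word_matrix` would play the role of the matrix passed to the PCA.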

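    library("FactoMineR")

    # MATRICE: 35 journals in rows, 960 words in columns
    # scale.unit=TRUE standardises each word; ncp=5 keeps five components
    res.pca = PCA(MATRICE, scale.unit=TRUE, ncp=5, graph=FALSE)

    # Plot the individuals (journals) on the first two axes
    plot.PCA(res.pca, axes=c(1, 2), choix="ind")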
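For readers without FactoMineR, the same kind of projection can be sketched with `prcomp` from base R. This is a substitute technique, not the code used in the post: the matrix `X` below is simulated, and the coordinates may differ from FactoMineR's in sign and by a constant factor.

```r
set.seed(1)

# Simulated stand-in for the journals-by-words matrix (35 rows, 50 columns)
X <- matrix(rpois(35 * 50, lambda = 3), nrow = 35,
            dimnames = list(paste0("journal_", 1:35), paste0("word_", 1:50)))

# Centre and scale every column, the counterpart of scale.unit=TRUE
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Coordinates of the journals on the first two principal axes
scores <- pca$x[, 1:2]

# Share of the total variance carried by each axis
explained <- pca$sdev^2 / sum(pca$sdev^2)

# Label each journal at its position in the first plane
plot(scores, type = "n", xlab = "Dim 1", ylab = "Dim 2")
text(scores, labels = rownames(X), cex = 0.7)
```

Looking at the first entries of `explained` is a quick way to judge how much of the structure the first plane actually captures.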

The projection of the journals on the first two axes looks like this:

Here, we can clearly observe some clusters: on the top-left, *Journal of Finance* and *Journal of Banking and Finance* (say, financial journals); on the top-right, *Biometrika*, *Biometrics*, *Computational Statistics and Data Analysis* and *Journal of Econometrics* (*JASA* is not far away, i.e. applied statistics journals). And below, on the right, *Stochastic Processes and their Applications*, *Annals of Applied Probability*, *Journal of Applied Probability*, *Annals of Probability*, *Proceedings of AMS* and *Topology and Applications* (i.e. more theoretical journals).

Note that the projection is rather robust: if I consider only my first 200 words, the graph is the same.
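One hedged way to put a number on that kind of robustness is to compare the distances between journals in the first plane, once with all words and once with only the most frequent ones. The sketch below uses simulated data with three made-up families of journals. The helper `pca_scores`, the cut-off `top_k` and the use of base R `prcomp` are assumptions for illustration, not what was done in the post.

```r
set.seed(42)
n_journals <- 30
n_words    <- 300
group <- rep(1:3, each = 10)   # three hypothetical families of journals

# Each family has its own expected count for every word
rates <- matrix(rgamma(3 * n_words, shape = 2, rate = 0.2), nrow = 3)
X <- matrix(rpois(n_journals * n_words, lambda = rates[group, ]),
            nrow = n_journals)

# Drop constant columns, which cannot be standardised
X <- X[, apply(X, 2, sd) > 0, drop = FALSE]

# Helper: journal coordinates on the first two axes of a scaled PCA
pca_scores <- function(M) prcomp(M, center = TRUE, scale. = TRUE)$x[, 1:2]

# Keep only the top_k most frequent words
top_k <- 100
most_frequent <- order(colSums(X), decreasing = TRUE)[seq_len(top_k)]

full    <- pca_scores(X)
reduced <- pca_scores(X[, most_frequent])

# Axes can flip or rotate between runs, so compare pairwise distances instead
agreement <- cor(as.vector(dist(full)), as.vector(dist(reduced)))
agreement
```

A value of `agreement` close to 1 means the two maps place the journals in nearly the same relative positions.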

To go further in the interpretation, we can also plot the variables, i.e. the words from the titles, where we cannot distinguish anything. So if I just look at my top 30, here they are:

On the top left we see *market(s)*, *risk* or *information*; on the top right *analysis*, *effects*, *models* or *tests*; while below we see *Markov* or *process(es)*. And we can observe interesting facts: in finance and in statistics, we talk about *dynamics*, while in theoretical (mathematical) journals it is about *processes*.

But the goal was to find clusters, i.e. classes of journals that publish papers with similar titles. Here we have

If some classes are rather natural (*Journal of Applied Proba.* and *Advances in Applied Proba.*, or *Economic Theory*, *Journal of Economic Theory* and *Journal of Mathematical Economics*), some strong correlations are not simple to understand (e.g. *Insurance: Mathematics and Economics* and *Management Science*, or *Annals of Statistics* and the *Journal of Multivariate Analysis*).
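The post does not say how these classes were obtained. One common approach, shown here only as a sketch on simulated data, is to compute the correlation between journals across words and feed `1 - correlation` to a hierarchical clustering. The journal names, the three families and the choice of average linkage are all assumptions.

```r
set.seed(7)
group <- rep(c("finance", "statistics", "probability"), each = 4)
journals <- paste0(group, "_", rep(1:4, times = 3))  # hypothetical names
n_words <- 200

# Each family of journals has its own expected count for every word
rates <- matrix(rgamma(3 * n_words, shape = 2, rate = 0.2), nrow = 3,
                dimnames = list(unique(group), NULL))
X <- matrix(rpois(length(group) * n_words, lambda = rates[group, ]),
            nrow = length(group), dimnames = list(journals, NULL))

# Correlation between journals, computed across words
journal_cor <- cor(t(X))

# Turn the correlation into a dissimilarity and cluster hierarchically
d    <- as.dist(1 - journal_cor)
tree <- hclust(d, method = "average")
classes <- cutree(tree, k = 3)

# Dendrogram, then a cross-table of simulated families against classes
plot(tree, main = "Journals clustered by title vocabulary", xlab = "", sub = "")
table(group, classes)
```

Reading off the largest off-diagonal entries of `journal_cor` is a simple way to spot the pairs of journals with unexpectedly similar vocabularies.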

Again, it might be possible to spend hours on the graphs, but if I want, someday, to submit something to one of those journals, I guess I have to stop here, and move on to something else…
