What Would Cohen Have Titled “The Earth is Round (p < .05)” in 2014?

June 25, 2014

(This article was first published on TRinker's R Blog » R, and kindly contributed to R-bloggers)

Bibliometrics is not my area of expertise, but as a researcher I still find it interesting. I sometimes think about how Google has changed the way we title articles. Gone are the days of witty, snappy titles. Title selection is still an art form, but of a different kind: generally, researchers try to construct titles from the most searchable keywords. While trying to title an article today, I came upon an Internet article entitled Heading for Success: Or How Not to Title Your Paper.

According to the article, to increase citation rates, a title should:

  1. Contain no ? or !
  2. Possibly contain a :
  3. Be between 31 and 40 characters long
  4. Avoid humor and puns

On reading:

…some authors are tempted to spice them up with a touch of humour, which may be a pun, a play on words, or an amusing metaphor. This, however, is a risky strategy.

my mind went to the classic Jacob Cohen (1994) paper entitled The Earth is Round (p < .05). In 1994 the world was different; Google didn't exist yet. I ask, “What if Cohen had to title his classic paper in 2014?” What would it look like?
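As a rough check, the three mechanical criteria from the list above (punctuation, the optional colon, and length) can be tested in base R; the humor criterion cannot be checked this way. The helper name `check_title` is made up for this sketch and is not part of any package.

```r
## Hypothetical helper (not from the cited article): tests a title
## against the mechanical criteria only
check_title <- function(title) {
    n <- nchar(title)
    c(
        no_question_or_exclaim = !grepl("[?!]", title),
        has_colon              = grepl(":", title, fixed = TRUE),  # allowed, not required
        length_31_to_40        = n >= 31 && n <= 40
    )
}

cohen_title <- "The Earth is Round (p < .05)"
nchar(cohen_title)
check_title(cohen_title)
```

By this count Cohen's title is 28 characters long, so it avoids ? and ! but falls just short of the 31 to 40 character range.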

Keywords: Mining “The Earth is Round (p < .05)”

I set to work by grabbing the paper's content and converting it to plain text. Then I teased out the most frequent terms after stemming and removing stopwords. Here's the script I used:

library(qdap); library(RCurl); library(wordcloud); library(ggplot2)

cohen_url <- "https://raw.githubusercontent.com/trinker/cohen_title/master/data/Cohen1994.txt"
cohen <- getURL(cohen_url, ssl.verifypeer = FALSE)

## remove reference section and title
cohen <- substring(strsplit(cohen, "REFERENCES")[[c(1, 1)]], 34)

## convert format so we can eliminate strange characters
cohen <- iconv(cohen, "", "ASCII", "byte")

## replacement parts
bads <- c("-", "<e2><80><9c>", "<e2><80><9d>", "<e2><80><98>", 
    "<e2><80><99>", "<e2><80><9b>", "<ef><bc><87>", "<e2><80><a6>", 
    "<e2><80><93>", "<e2><80><94>", "<c3><a1>", "<c3><a9>", 
    "<c2><bd>", "<ef><ac><81>", "<c2><a7>", "<ef><ac><82>", 
    "<ef><ac><81>", "<c2><a2>", "/j")

goods <- c(" ", " ", " ", "'", "'", "'", "'", "...", " ", 
    " ", "a", "e", "half", "fi", " | ", "ff", "ff", " ", "ff")

## sub the bad for the good
cohen <- mgsub(bads, goods, clean(cohen))

## Stem it
cohen_stem <- stemmer(cohen)

## Find top words
(cohen_top_20 <- freq_terms(cohen_stem, top = 20, stopwords = Top200Words))
##    WORD         FREQ
## 1  test           21
## 2  signiffc       19
## 3  research       18
## 4  probabl        17
## 5  size           17
## 6  data           15
## 7  h              15
## 8  effect         14
## 9  p              14
## 10 statist        14
## 11 given          13
## 12 hypothesi      13
## 13 analysi        11
## 14 articl         11
## 15 nhst           11
## 16 null           11
## 17 psycholog      11
## 18 conffdenc      10
## 19 correl         10
## 20 psychologist   10
## 21 result         10
## 22 theori         10

[Figure: plot1]

with(cohen_top_20, wordcloud(WORD, FREQ))
mtext("Content Cloud: The Earth is Round (p < .05)", col="blue")

[Figure: plot2, word cloud of the top terms, titled "Content Cloud: The Earth is Round (p < .05)"]
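Both the frequency table and the word cloud rest on simple term counts. A minimal base R sketch of that counting step follows; it skips stemming and uses a tiny invented stopword list in place of qdap's `Top200Words`. The sample sentence and the `top_terms` name are assumptions for illustration, not the author's method.

```r
## Hypothetical helper: count terms after lowercasing and dropping stopwords
top_terms <- function(text, stopwords, top = 3) {
    words <- unlist(strsplit(tolower(text), "[^a-z]+"))
    words <- words[nzchar(words) & !words %in% stopwords]
    tab <- table(words)
    ## plain named integer vector, sorted from most to least frequent
    counts <- sort(setNames(as.integer(tab), names(tab)), decreasing = TRUE)
    ## keep every term tied with the term at the cutoff position
    cutoff <- counts[[min(top, length(counts))]]
    counts <- counts[counts >= cutoff]
    data.frame(WORD = names(counts), FREQ = unname(counts),
        stringsAsFactors = FALSE)
}

## Invented sample text and stopword list
sample_text <- "The test of the null hypothesis is a test. The null is rarely true; the test says little."
stops <- c("the", "of", "is", "a")

top_terms(sample_text, stops, top = 2)
top_terms(sample_text, stops, top = 3)
```

With `top = 2` this returns "test" (3) and "null" (2). With `top = 3` it returns all seven remaining terms, because five of them tie at a count of 1. Keeping ties at the cutoff is one plausible reason the "top 20" table above lists 22 rows.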

What Would Cohen Have Titled “The Earth is Round (p < .05)”?

So what would Cohen have titled “The Earth is Round (p < .05)” in 2014? Looking at the results… I don't know. It's fun to speculate. Maybe someone could suggest a title in the comments, but as for me, I still like “The Earth is Round (p < .05)”.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997-1003. doi:10.1037/0003-066X.49.12.997
