# Love is all around: Popular words in pop hits

May 25, 2017
By

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Data scientist Giora Simchoni recently published a fantastic analysis of the history of pop songs on the Billboard Hot 100 using the R language. Giora used the rvest package in R to scrape data from the Ultimate Music Database site for the 350,000 chart entries (and 35,000 unique songs) since 1940, and used those data to create and visualize several measures of song popularity over time.

A novel measure that Giora calculates is "area under the song curve": the sum of all the ranks above 100 for every week the song is in the Hot 100. By that measure, the most popular (and also longest-charting) song of all time is Radioactive by Imagine Dragons:

It's turns out that calculating this "song integral" is pretty simple in R when you use the tidyverse:

```calculateSongIntegral <- function(positions) {
sum(100 - positions)
}

billboard %>%
filter(EntryDate >= date_decimal(1960)) %>%
group_by(Artist, Title) %>%
summarise(positions = list(ThisWeekPosition)) %>%
mutate(integral = map_dbl(positions, calculateSongIntegral)) %>%
group_by(Artist, Title) %>%
tally(integral) %>%
arrange(-n)
```

Another fascinating chart included in Giora's post is this analysis of the most frequent words to appear in song titles, by decade. He used the tidytext package to extract individual words from song titles and then rank them by frequency of use:

So it seems as though Love Is All Around (#41, October 1994) after all! For more analysis of the Billboard Hot 100 data, including Top-10 rankings for various measures of song popularity and the associated R code, check out Giora's post linked below.

Sex, Drugs and Data: Billboard Bananas

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...