Funky music in funky months: Does my taste in music change over the year?

April 28, 2013

(This article was first published on Rcrastinate, and kindly contributed to R-bloggers)

I already introduced some stuff I did with the last.fm API. But did you ever wonder if your taste in music changes over the year? Sunny music in the sunny months and dark music in the darker months? Well, I did. And I want to check it out with the RLastFM package and some additional functions.

First, we load the package and assign an API key to the global variable api.key. You have to get your own API key from last.fm to test this stuff.

library(RLastFM)
api.key <- "<your key here>"  # replace with your own last.fm API key

I define a function which only calls another one from the RLastFM package. This isn't really necessary.


get.chart.list <- function (user, key = api.key) {
  user.getWeeklyChartList(user, key)
}

The following function changes a chart list to make intervals larger than weeks (in the beginning, this function only supplied months, hence the name).


make.months <- function (week.list, range = 4) {
  max.w <- nrow(week.list)
  start.weeks.i <- seq(1, max.w, range)
  end.weeks.i <- seq(range, max.w, range)
  if (length(start.weeks.i) > length(end.weeks.i)) {
    end.weeks.i <- c(end.weeks.i, max.w)
    cat("start weeks:", length(start.weeks.i), "\n", " end weeks:", length(end.weeks.i), "\n")
  }
  start.weeks <- week.list[start.weeks.i, "from"]
  end.weeks <- week.list[end.weeks.i, "to"]
  data.frame(from = start.weeks, to = end.weeks)
}
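
To see how the grouping works, here is a small sketch of the index arithmetic inside make.months with made-up numbers (10 weeks, range = 4). Nothing in it touches the last.fm API; it only mirrors the seq() calls from the function.

```r
# Toy illustration of the index logic in make.months (values are made up)
max.w <- 10   # pretend the chart list has 10 weeks
range <- 4    # group them into blocks of 4 weeks

start.weeks.i <- seq(1, max.w, range)      # first week of each block
end.weeks.i   <- seq(range, max.w, range)  # last week of each full block

# The last block is incomplete (weeks 9-10), so it has a start but no end;
# make.months closes it with the final week of the list.
if (length(start.weeks.i) > length(end.weeks.i)) {
  end.weeks.i <- c(end.weeks.i, max.w)
}

print(data.frame(from.week = start.weeks.i, to.week = end.weeks.i))
```

Note that seq(range, max.w, range) stops with an error when range is larger than the number of weeks, so very short chart lists would need an extra check.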


Now, this next function is important and a little complex. It uses the previous functions and
- gets a chart list (a list of weeks in which a user was registered on last.fm)
- changes the chart list to allow for larger intervals than weeks (please see the note under the function)
- iterates through this chart list and extracts the user's charts for each interval via get.top.artists.week (a helper that is not defined in this post)
- tries one more time if last.fm fails
- saves the information in the list 'charts'
- prints the number of errors it caught from last.fm and returns 'charts'

make.time.charts <- function (user, chart.list = get.chart.list(user, key = key),
                              range = 1, wait.for.rec.sec = 3, key = api.key) {
  chart.list <- make.months(chart.list, range)
  charts <- list()
  errors <- 0
  for (row.i in seq_len(nrow(chart.list))) {
    if (row.i %% 10 == 0) cat(row.i, "of", nrow(chart.list), "\n")
    start.week <- chart.list[row.i, "from"]
    end.week <- chart.list[row.i, "to"]
    week.chart <- try(get.top.artists.week(user, start.week, end.week, key))
    if (inherits(week.chart, "try-error")) {
      cat("Trying again in", wait.for.rec.sec ,"seconds...\n")
      Sys.sleep(wait.for.rec.sec)
      week.chart <- try(get.top.artists.week(user, start.week, end.week, key))
      if (inherits(week.chart, "try-error")) {
        errors <- errors + 1
        charts[[as.character(start.week)]] <- "error"
      }
      else {
        week.chart$start <- start.week
        week.chart$end <- end.week
        charts[[as.character(start.week)]] <- week.chart
      }
    }
    else {
      week.chart$start <- start.week
      week.chart$end <- end.week
      charts[[as.character(start.week)]] <- week.chart
    }
  }
  cat("Caught", errors, "errors.\n")
  charts
}

One more comment: I don't know if the parameter 'range' really works for values other than 1. Maybe the last.fm API only supplies calls for adjacent weeks. If you want, you can test this.
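
The retry logic can be tried out without network access. The sketch below is not the author's code: since get.top.artists.week is not defined in this post, a hypothetical stand-in called flaky.fetch fails on its first call and succeeds on the second. The helper name fetch.with.retry and the returned values are also invented for illustration.

```r
# Hypothetical stand-in for an API call: fails the first time, then succeeds
calls <- 0
flaky.fetch <- function () {
  calls <<- calls + 1
  if (calls == 1) stop("simulated last.fm failure")
  list(artist = c("Radiohead", "Prince"), playcount = c(12, 3))
}

# Try once, wait, try once more; give back "error" if both attempts fail
fetch.with.retry <- function (fetch, wait.sec = 0) {
  result <- try(fetch(), silent = TRUE)
  if (inherits(result, "try-error")) {
    Sys.sleep(wait.sec)
    result <- try(fetch(), silent = TRUE)
    if (inherits(result, "try-error")) return("error")
  }
  result
}

chart <- fetch.with.retry(flaky.fetch)
print(chart$artist)

# A call that always fails ends up as the string "error"
always.fails <- fetch.with.retry(function () stop("still down"))
print(always.fails)
```

Using inherits() instead of comparing class() to a string keeps the check valid for objects that carry more than one class.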


The next function cleans a chart list. It drops all leading weeks before the first week in which plays are recorded.


clear.charts <- function (charts) {
  el.lens <- sapply(charts, FUN = function (x) {
    if (length(x) > 1) {
      length(x$artist)
    }
    else { NA }
  } )
  first.with.something <- which(el.lens > 0)[1]
  charts[first.with.something:length(charts)]
}
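
A toy run shows what the trimming does. The chart list below is invented: an empty leading week, a failed call stored as "error", and then weeks with and without plays. The function body is repeated (in slightly condensed form) so that the snippet runs on its own.

```r
# Same logic as clear.charts above, repeated so this snippet is self-contained
clear.charts <- function (charts) {
  el.lens <- sapply(charts, FUN = function (x) {
    if (length(x) > 1) length(x$artist) else NA
  })
  first.with.something <- which(el.lens > 0)[1]
  charts[first.with.something:length(charts)]
}

# Invented chart list, keyed by made-up week start timestamps
empty.week <- list(artist = character(0), playcount = numeric(0))
toy.charts <- list(
  "100" = c(empty.week, start = 100, end = 200),
  "200" = "error",
  "300" = list(artist = "Prince", playcount = 5, start = 300, end = 400),
  "400" = c(empty.week, start = 400, end = 500),
  "500" = list(artist = "Radiohead", playcount = 2, start = 500, end = 600)
)

trimmed <- clear.charts(toy.charts)
print(names(trimmed))
```

Only the entries before the first week with plays are dropped; the empty week "400" in the middle stays in the list.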


Now for the calls! Let's get the weekly charts for me ('swolf2008').


week.charts <- make.time.charts(user="swolf2008", range=1, wait.for.rec.sec=0)

And clean it:

clear.week.charts <- clear.charts(week.charts)

Now for the fun part: We build a function 'get.artist.month.play.rate' which
- takes an artist
- iterates through a chart list
- gets the month the current week is in (time conversion took some time for me to learn... *phew*)
- gets the plays and the play ratio of the artist for this week
- writes everything in a new row of a dataframe
- returns a table of mean play ratios (in percent) per month, aggregated over all years of the respective user's scrobbling history

get.artist.month.play.rate <- function (artist, week.charts) {
  result.df <- data.frame()
  for (week in week.charts) {
    # Failed weeks are stored as the single string "error"; skip them
    if (length(week) > 1) {
      # 'start' is a Unix timestamp; $mon counts from 0, hence the + 1
      month <- as.POSIXlt(as.numeric(week[["start"]]),
                          tz = "GMT", origin = "1970-01-01")$mon + 1
      plays <- week$playcount[which(week$artist == artist)]
      all.plays <- sum(week$playcount)
      # Artist not played in this week
      if (length(plays) == 0) plays <- 0
      result.df <- rbind(result.df, data.frame(month = month,
                                               plays = plays,
                                               all.plays = all.plays))
    }
  }
  # Share of the artist in all plays of the week, in percent
  result.df$perc.plays <- result.df$plays / result.df$all.plays * 100
  # Mean weekly share per calendar month, across all years
  tapply(result.df$perc.plays, result.df$month, FUN = function (x) {
    mean(x, na.rm = TRUE)
  })
}
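
Since the time conversion is the fiddly part, here is a minimal sketch of it in isolation, followed by the per-month averaging on made-up numbers. It assumes, as the function above does, that week starts are Unix timestamps in seconds; the date and the toy percentages are invented.

```r
# Build a Unix timestamp for a known date (seconds since 1970-01-01, GMT)
ts <- as.numeric(as.POSIXct("2013-04-28", tz = "GMT"))

# POSIXlt stores months as 0-11, so add 1 to get the calendar month
month <- as.POSIXlt(ts, tz = "GMT", origin = "1970-01-01")$mon + 1
print(month)

# Made-up weekly percentages, averaged per month as in the function above
toy <- data.frame(month = c(1, 1, 2), perc.plays = c(2, 4, 10))
by.month <- tapply(toy$perc.plays, toy$month, FUN = mean, na.rm = TRUE)
print(by.month)
```

The result of tapply is named by month number, which is why the plotting code later converts it with as.numeric().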

Let's do this with three artists:

pj <- get.artist.month.play.rate("PJ Harvey", clear.week.charts)
rh <- get.artist.month.play.rate("Radiohead", clear.week.charts)
pr <- get.artist.month.play.rate("Prince", clear.week.charts)
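
One thing worth checking before plotting: tapply only returns entries for months that actually occur in the data, so a short scrobbling history could yield fewer than 12 values and shift the curve against the month labels. A minimal sketch of one way to guard against that, using a hypothetical helper pad.months and made-up numbers:

```r
# Hypothetical helper: expand a per-month result to all 12 months,
# filling months without data with a chosen value (0 here)
pad.months <- function (rates, fill = 0) {
  full <- setNames(rep(fill, 12), 1:12)
  full[names(rates)] <- rates
  full
}

# Made-up result covering only January, February and July
toy.rates <- tapply(c(2, 4, 10, 1), c(1, 1, 2, 7), FUN = mean)
padded <- pad.months(toy.rates)
print(padded)
```

Whether 0 or NA is the better fill value depends on how gaps should appear in the plot; 0 is only an assumption here.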

And plot it.

library(plotrix)  # provides radial.plot()

# Upper panel for the plot, lower panel for the legend
layout(matrix(c(1,2), nrow = 2), heights = c(0.9, 0.1))
# month.abb is R's built-in vector "Jan", ..., "Dec"
radial.plot(lengths = as.numeric(pj), start = pi/2, clockwise = TRUE,
            labels = month.abb,
            rp.type = "p", lwd = 3, show.grid.labels = FALSE, line.col = "green",
            radial.lim = c(0,5))
radial.plot(lengths = as.numeric(rh), start = pi/2, clockwise = TRUE,
            labels = month.abb,
            rp.type = "p", lwd = 3, show.grid.labels = FALSE, line.col = "blue",
            add = TRUE)
radial.plot(lengths = as.numeric(pr), start = pi/2, clockwise = TRUE,
            labels = month.abb,
            rp.type = "p", lwd = 3, show.grid.labels = FALSE, line.col = "red",
            add = TRUE)
par(mar = c(0,0,0,0))
plot.new()
legend(x = "center", bty = "n", lwd = 3, lty = "solid",
       col = c("green", "blue", "red"), ncol = 3,
       legend = c("PJ Harvey", "Radiohead", "Prince"))

What we get is a radial plot showing how many plays (as a percentage of all plays) an artist gets in a specific month. And indeed, it looks like I'm listening to Radiohead more in "darker" months like February and March - but also in August.


PJ Harvey seems to be evenly distributed throughout the year.

Prince seems to be reserved for sunny, funky months like August and July (and January).

I wonder which artist I'm listening to in December. Obviously it's none of these three.

These functions could be extended in several directions:
- Plotting of tags ("dark sad postrock shit" vs. "funny sunny funky tunes") instead of artists
- Plotting of more than one user or even the whole database of last.fm
