R-bloggers

What’s In The Box: Wrapped but not streamed 2025

[This article was first published on Rstats – quantixed, and kindly contributed to R-bloggers.]

I’m a music fan who is anti-streaming and instead I stubbornly maintain a large music collection. At this time of year, streamers receive a round-up of their year’s listening in a “wrapped” report. Not wanting to miss out, I set about rendering my own annual round-up using R!

If you’d like to see a pick of my favourite albums of 2025, jump here. If you’re here for the R coding, read on!

I am aware that the audience for this post is small. Us non-streamers are a dying breed, and the number of people who took an XML snapshot of their iTunes/Music library last year to run this code may well be zero.

To determine the number of plays in 2025, we need a current library snapshot and one that is a year old. It is then a case of loading both into R, subtracting the play count of last year’s snapshot from the current, and then disposing of all tracks with a difference of 0.

library(XML)
library(dplyr)
library(ggplot2)

# file paths for last year and this year XML files
lastyear <- "Data/Library_20241210.xml"
thisyear <- "Data/Library_20251203.xml"

# function to read xml file and convert to data frame
read_iTunesXML <- function(path) {
  df <- plyr::ldply(lapply(readKeyValueDB(path)$Tracks, data.frame))
  df$Play.Count <- replace(df$Play.Count,is.na(df$Play.Count),0)
  return(df)
}

# read in the music libraries
library_thisyear <- read_iTunesXML(thisyear)
library_lastyear <- read_iTunesXML(lastyear)

# merge the two libraries to get the correct play counts for this year
# we only need Persistent.ID and Play.Count from last year
sublibrary_lastyear <- library_lastyear[, c("Persistent.ID", "Play.Count")]
# merge
music_library <- merge(library_thisyear, sublibrary_lastyear, by = "Persistent.ID",
                         all.x = TRUE, sort = FALSE)
music_library$Play.Count.x <- replace(music_library$Play.Count.x,is.na(music_library$Play.Count.x),0)
music_library$Play.Count.y <- replace(music_library$Play.Count.y,is.na(music_library$Play.Count.y),0)
music_library$Play.Count <- music_library$Play.Count.x - music_library$Play.Count.y
# now get the tracks that were listened to in the last year
music_library <- music_library[music_library$Play.Count > 0, ]

# keep relevant columns only
music_library <- music_library %>%
  select(Name, Artist, Album, Genre, Year,
         Play.Count, Total.Time, Bit.Rate, Date.Added)
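The merge-and-subtract logic above can be checked on a toy example. This is a minimal sketch using made-up tracks; the `toy_*` names and their contents are hypothetical and only stand in for the two library snapshots.

```r
# Toy snapshots (made-up data, not from the real library)
toy_lastyear <- data.frame(
  Persistent.ID = c("A1", "B2"),
  Play.Count = c(5, 2)
)
toy_thisyear <- data.frame(
  Persistent.ID = c("A1", "B2", "C3"),
  Name = c("Old favourite", "Unplayed this year", "New purchase"),
  Play.Count = c(9, 2, 3)
)

# Left join on the ID so that tracks added this year are kept
toy <- merge(toy_thisyear, toy_lastyear, by = "Persistent.ID",
             all.x = TRUE, sort = FALSE)
# Tracks missing from last year's snapshot had zero plays back then
toy$Play.Count.y[is.na(toy$Play.Count.y)] <- 0
# Plays since the last snapshot
toy$Play.Count <- toy$Play.Count.x - toy$Play.Count.y
# Drop anything that was not played in the interval
toy <- toy[toy$Play.Count > 0, ]
toy[, c("Persistent.ID", "Name", "Play.Count")]
```

Track A1 ends up with 4 plays, C3 (new this year) with 3, and B2 drops out because its count did not change. The `> 0` filter would also drop any track whose play count went down between snapshots, for example after a reset.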

We now have a data frame called music_library with the number of plays for each track in the last year. Let’s get some basic facts about what was played.

## some basic facts about the library

# number of tracks listened to
nrow(music_library)
# total number of different artists listened to
length(unique(music_library$Artist))
# total play count
sum(music_library$Play.Count)
# total time that music has been played, in days
(sum(music_library$Total.Time * music_library$Play.Count) / 1000) / (60 * 60 * 24)

This gave me:

> # number of tracks listened to
> nrow(music_library)
[1] 22785
> # total number of different artists listened to
> length(unique(music_library$Artist))
[1] 2611
> # total play count
> sum(music_library$Play.Count)
[1] 26423
> # total time that music has been played, in days
> (sum(music_library$Total.Time * music_library$Play.Count) / 1000) / (60 * 60 * 24)
[1] 76.9794

Damn! 77 days of music listening in the past year! That’s >20% of my life listening to music…
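As a quick check on that percentage, here is the arithmetic as a short sketch. It assumes the dates in the two file names (2024-12-10 and 2025-12-03) are the dates the snapshots were taken, which the post does not state explicitly.

```r
# Figures reported in the console output above
days_played <- 76.9794
total_plays <- 26423
tracks_played <- 22785

# Days between the two snapshots (assumed from the file names)
interval_days <- as.numeric(as.Date("2025-12-03") - as.Date("2024-12-10"))
interval_days                                    # 358

# Share of the interval spent listening, as a percentage
round(100 * days_played / interval_days, 1)      # 21.5
# Average listening per day, in hours
round(days_played * 24 / interval_days, 1)       # 5.2
# Average plays per track that was played at least once
round(total_plays / tracks_played, 2)            # 1.16
```

Using a full 365 days instead gives about 21.1%, so the ">20%" claim holds either way.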

# histogram of Years
music_library %>%
  ggplot(aes(x = Year)) +
  geom_histogram(binwidth = 1) +
  lims(x = c(1950,2026)) +
  labs(x = "Year", y = "Tracks") +
  theme_classic()
ggsave("Output/Plots/yearHistogram.png", width = 6, height = 4, units = "in", dpi = 300, bg = "white")

# same thing for library_thisyear
library_thisyear %>%
  ggplot(aes(x = Year)) +
  geom_histogram(binwidth = 1) +
  lims(x = c(1950,2026)) +
  labs(x = "Year", y = "Tracks") +
  theme_classic()
ggsave("Output/Plots/yearHistogram_thisyear.png", width = 6, height = 4, units = "in", dpi = 300, bg = "white")

I listened to mainly new music (released in 2024 and 2025) in the last year, with a smattering of tracks from earlier years. Comparing with the library, the bias towards new tracks being played this year is quite stark.
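One way to put a number on that bias, rather than eyeballing two histograms, is to compare the share of recent releases in each data frame. The sketch below uses made-up release years, and `share_since()` is a hypothetical helper rather than anything from the original analysis.

```r
# Hypothetical helper: fraction of tracks released in or after from_year
share_since <- function(df, from_year) {
  mean(df$Year >= from_year, na.rm = TRUE)
}

# Made-up data standing in for library_thisyear and music_library
toy_library <- data.frame(
  Year = c(1994, 1998, 2007, 2012, 2020, 2021, 2024, 2024, 2025, 2025)
)
toy_played <- data.frame(Year = c(1998, 2024, 2024, 2025, 2025))

share_since(toy_library, 2024)  # 0.4: 4 of 10 library tracks
share_since(toy_played, 2024)   # 0.8: 4 of 5 played tracks
```

On the real data, the same two calls with `library_thisyear` and `music_library` would give the proportions behind the two plots.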

So what were the most played artists, albums and genres (before we get to tracks)?

# find the most played artists, albums, and genres
artist <- music_library %>%
  group_by(Artist) %>%
  summarise(Total_Plays = sum(Play.Count),
            n = n(),
            Mean_Plays = mean(Play.Count),
            Median_Plays = median(Play.Count),
            .groups = "keep") %>%
  filter(n > 5) %>%
  arrange(desc(Mean_Plays))

album <- music_library %>%
  group_by(Artist, Album) %>%
  summarise(Total_Plays = sum(Play.Count),
            n = n(),
            Mean_Plays = mean(Play.Count),
            Median_Plays = median(Play.Count),
            Year = median(Year),
            .groups = "keep") %>%
  filter(n > 2) %>%
  arrange(desc(Mean_Plays))

genre <- music_library %>%
  group_by(Genre) %>%
  summarise(Total_Plays = sum(Play.Count),
            n = n(),
            Mean_Plays = mean(Play.Count),
            Median_Plays = median(Play.Count),
            .groups = "keep") %>%
  filter(n > 15) %>%
  arrange(desc(Mean_Plays))

We now have three data frames, one for each grouping, and I’m sorting by mean plays rather than total plays. This is because the number of tracks for each artist, album or genre is different. Everything gets played once when I add it to the library, so if I add 3000 indie tracks and they get 3000 plays in total, this needs to be ranked lower than 300 reggae tracks played a total of 600 times.
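The indie versus reggae example works out as follows. This is a toy data frame built from the numbers in the paragraph above, not real output.

```r
# Toy genre summary using the numbers from the text
genre_toy <- data.frame(
  Genre = c("Indie", "Reggae"),
  n = c(3000, 300),
  Total_Plays = c(3000, 600)
)
# Mean plays per track corrects for the different number of tracks
genre_toy$Mean_Plays <- genre_toy$Total_Plays / genre_toy$n

# Ranking by total plays puts Indie first...
genre_toy[order(-genre_toy$Total_Plays), ]
# ...while ranking by mean plays puts Reggae first (2 vs 1 play per track)
genre_toy[order(-genre_toy$Mean_Plays), ]
```

The minimum-track filters in the code above (`n > 5`, `n > 2`, `n > 15`) guard against the opposite problem, where a group with only a handful of tracks tops the mean ranking.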

> head(artist)
# A tibble: 6 × 5
# Groups:   Artist [6]
  Artist          Total_Plays     n Mean_Plays Median_Plays
1 200 Stab Wounds          96     9      10.7          12  
2 EYES                     97    10       9.7           9  
3 Miynt                    67    11       6.09          6  
4 Bnny                    145    25       5.8           5  
5 Turnstile               166    29       5.72          4  
6 Polygon Window           77    14       5.5           5.5

> head(album)
# A tibble: 6 × 7
# Groups:   Artist, Album [6]
  Artist          Album                          Total_Plays     n Mean_Plays Median_Plays  Year
1 200 Stab Wounds Manual Manic Procedures                 96     9      10.7          12    2024
2 EYES            SPINNER                                 97    10       9.7           9    2025
3 Turnstile       NEVER ENOUGH                           119    14       8.5           9.5  2025
4 Grouper         Dragging a Dead Deer Up a Hill          96    12       8             8    2007
5 Bnny            Everything                              88    14       6.29          5    2021
6 Miynt           Rain Money Dogs                         67    11       6.09          6    2025

> head(genre)
# A tibble: 6 × 5
# Groups:   Genre [6]
  Genre     Total_Plays     n Mean_Plays Median_Plays
1 Afrobeat           36    16       2.25            1
2 Hard rock          39    23       1.70            2
3 Dream Pop         628   381       1.65            1
4 Ambient           638   397       1.61            1
5 Reggae            122    76       1.61            1
6 Grunge            134    86       1.56            1

Finally, let’s look at tracks. We can simply sort the music_library data frame by play count, but (having done this) it’s a bit boring because I am an album listener. The tracks in the top ten are mainly from my most played album. So let’s only take the top played track per artist/album to make it more interesting.

# order music_library by Play.Count
# take the first occurrence of artist album combination
music_library_unique <- music_library %>%
  arrange(desc(Play.Count)) %>%
  distinct(Artist, Album, .keep_all = TRUE)

and this gives us:

> # echo the top 10 most played tracks to the console
> music_library_unique %>%
+   select(Name, Artist, Album, Year, Play.Count) %>%
+   head(10)
                                Name          Artist                          Album Year Play.Count
1                I'll See You Around      Silver Sun                       Neo Wave 1998         21
2                  Hands of Eternity 200 Stab Wounds        Manual Manic Procedures 2024         16
3  Moving Day For the Overton Window            EYES                        SPINNER 2025         16
4                         Stay Hated      BENCHPRESS                     Stay Hated 2012         13
5                             August            Bnny                     Everything 2021         12
6                          SUNSHOWER       Turnstile                   NEVER ENOUGH 2025         10
7                              Orion        Mastodon                Medium Rarities 2020          9
8                      Wind and Snow         Grouper Dragging a Dead Deer Up a Hill 2007          8
9                            blazing    helen island                    last liasse 2024          8
10                      Sudden Storm     Ezra Furman             Goodbye Small Head 2025          8

So there you have it. I listened to I’ll See You Around by Silver Sun 21 times. It’s because I tend to play this before doing a running race (along with Stay Hated or latterly Moving Day For the Overton Window). Highly recommended!
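For anyone adapting the "top track per album" step without dplyr, the same idea can be sketched in base R: sort by play count, then keep the first row for each artist/album pair. The data here are made up and the object names are hypothetical.

```r
# Made-up tracks: two albums by Band A, one by Band B
toy_tracks <- data.frame(
  Name = c("Track 1", "Track 2", "Track 3", "Track 4"),
  Artist = c("Band A", "Band A", "Band A", "Band B"),
  Album = c("LP One", "LP One", "LP Two", "Debut"),
  Play.Count = c(3, 7, 2, 5)
)

# Sort by play count, highest first (the role of arrange(desc(Play.Count)))
sorted <- toy_tracks[order(-toy_tracks$Play.Count), ]
# Keep the first row per Artist/Album pair
# (the role of distinct(Artist, Album, .keep_all = TRUE))
top_per_album <- sorted[!duplicated(sorted[, c("Artist", "Album")]), ]
top_per_album
```

This keeps Track 2, Track 4 and Track 3, one per album. In this base R version, if two tracks on an album tie for most plays, the one that appeared first in the original data frame is kept.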

Hopefully this has given you some ideas of how to make your own “wrapped” using R.

Albums of 2025

I enjoy compiling my albums of the year. No idea whether readers find it useful, but I have benefitted from album recommendations so I’m keen to pass them on. Plus, I’d like to give a shoutout and a link to the artists whose work I’ve enjoyed this year. If you like something here, consider buying some merch or a release from one of the artists! Previous selections can be found here (2024, 2023, 2021).

SPINNER – EYES

Genre: Hardcore [link]

Muscular hardcore from Denmark. I freely admit to only investigating this after seeing it on a list and liking the cover and band logo. It turned out to be one of my favourite albums released this year.

NEVER ENOUGH – Turnstile

Genre: Hardcore

I was looking forward to this release from Turnstile and it didn’t disappoint. They played a great set at Glastonbury and I enjoyed their Tiny Desk Concert. I even watched the visual album of this record. The addition of Meg Mills as a second guitar crunching away in the mix has improved their sound a lot.

Rain Money Dogs – Miynt

Genre: lo-fi [link]

Bedroom pop from Sweden. Miynt is the alias of Fredrika Ribbing, who released this gorgeous retro record (where retro is early 2000s and I feel old) this year.

Shrunken Elvis – Shrunken Elvis

Genre: post-rock [link]

An album of instrumental music of pure, simple guitar vibes. I read about this on the wonderful Tonearm website via Mastodon.

Goodbye Small Head – Ezra Furman

Genre: indie rock [link]

The opening track on this album, Grand Mal, is jaw-droppingly good. The vocal is pure indie rock but it’s phrased like a rap track with a lyric which centres on experience of epilepsy. The album is full of fragile tunes that are varied in style.

Phonetics On and On – Horsegirl

Genre: indie rock [link]

I seemed to have this album on pre-order for months, so much so, I had forgotten about it when it finally came out. The album reminds me of Betti-Cola by Cub with basic instrumentation and simple melodies.

Public Works and Utilities – Warrington-Runcorn New Town Development Plan

Genre: ambient, electronic [link]

I love the whole concept of WRNTDP for reasons that I find hard to describe. He played locally and I caught his performance. It was mainly tracks from this album, which are more dance-oriented than the previous releases, which were more ambient in style.

Bugland – No Joy

Genre: shoegaze [link]

I have been a big fan of Montreal’s No Joy since Wait To Pleasure, which is a modern shoegaze classic. This album is more psychedelic rock in style and, again, I watched her play locally. She played a great (and very loud) set, one of the highlights of the year.

More – Pulp

Genre: rock

I wasn’t sure whether to include this one. The album was a bit patchy but in places it was Pulp back at their best. Spike Island is a great tune. I saw them play in Birmingham and they were fantastic. I admired the fact that they had truly got back together and released new material rather than just cashing in and reissuing their old stuff (they did that too, of course). I appreciated what they did this year, 30 years after I saw them play the main stage at Glastonbury.

The Bad Fire – Mogwai

Genre: post-rock [link]

I like what Mogwai do even though they keep doing it record after record.

Touch – Tortoise

Genre: post-rock [link]

Another album that I had on pre-order for a long time. The band has undergone several changes in line-up and sound but in places, like on Axial Seamount, they sounded like the Tortoise of Millions Now Living Will Never Die.

McCartney, It’ll Be OK – UNIVERSITY

Genre: noise rock [link]

I had to check this band out because they are from my home town. I loved this record, which is kind of punk, almost screamo. The guy on drums is a powerhouse. They released an EP later in the year called YES, which has the most unhinged drumming on it that I’ve heard in a long time. The song titles alone tell me that they’re a bunch of mates having a great time and not taking themselves too seriously. I got big nostalgia vibes from this record, a fact that would horrify the youngsters in the band I’m sure!

Till the Morning – Brian D’Addario

Genre: power pop [link]

Without a new album from The Lemon Twigs, I made do with this release from Brian D’Addario. It’s in the soft rock style. I’m amazed at how mature the songwriting is, and at the retro production. Something for the music nerds out there.

Dim Probs – Gruff Rhys

Genre: lo-fi [link]

There are several Super Furry Animals-related releases on my list. The first is this album of acoustic mellowness from Gruff Rhys. I always enjoy his records but this is his best in a while.

Beneath Strawberry Moons – Gulp

Genre: psychedelic folk [link]

An album of spaced-out folk with dreamy vocals and dubby bass from Guto Pryce (ex-SFA). There’s a lounge, almost latin vibe to some of the instrumentation. A summer album I guess.

The Pattern Speaks – SKLOSS

Genre: psychedelic rock [link]

This is heavy sludge rock freak out territory with motorik drums and mountains of fuzz guitars. Wonderful stuff. A recommendation from Steve Russell I’m sure.

Sinister Grift – Panda Bear

Genre: psychedelic [link]

I like the layered vocals that I guess are a trademark of Panda Bear. There are strong melodies and sunny vibes aplenty on this record.

Pink Silence – Cloth

Genre: indie rock [link]

I find this album hard to categorise. In places it has an 80s pop vibe to it but at its core it’s an indie rock album.

Very Human Features – The Bug Club

Genre: Indie pop [link]

A recommendation from Sally Lowell. I enjoyed On The Intricate Inner Workings of the System and their earlier Pure Particles album. This release is kooky, humorous and reminiscent of The Lovely Eggs. They remind me of listening to John Peel’s radio show…

Pando – Das Koolies

Genre: Electronica [link]

The other SFA-related album on my list. I bought their earlier EPs and wasn’t completely bowled over. However, I enjoyed this album. They get some great sounds going and the production is spot on.

Reissues

I enjoyed these reissues, which came out this year:

Listmakers’ remorse

Oh, it’s the final Bandcamp Friday of the year tomorrow and no doubt I’ll get hold of something that I wish I could add to this list. But having put these together for a few years, I’ve realised it’s best not to obsess over it and just hit publish. Yes, I didn’t put the Geese album on my list even though it’s on everyone else’s. I’m sure there’ll be great albums I forgot too. So be it! That was 2025.

The post title comes from “What’s In The Box (See Whatcha Got)” by The Boo Radleys from their “C’mon Kids” album. I seem to remember that this song is about what’s in your record box, although I couldn’t find any confirmation on the web.
