The Walking Dead in The Walking Dead

[This article was first published on R – mildlyscientific, and kindly contributed to R-bloggers].

My better half and I have been binge-watching The Walking Dead recently, and I figured the show would be a nice topic for a blog post. Of course, one involving some form of data analysis. I am currently very much interested in text analysis, so I tried to hunt down the episode transcripts. I actually found them here. I was super psyched to find them in a speaker: line format, which would make it easy to put together a dataset of all lines of all characters throughout the show. I wrote some code for it (it can be found on my github), but was quite disappointed when I realized that only the first season was in this format. For seasons 2-7 there is no chance to recover who said what. So the “coolest” thing I could do was count the number of lines of each character in season 1:

So Rick, the central character of season 1, talks the most. Surprise!
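For the curious, here is a minimal sketch of how such a line count could be put together in base R. It is not the code from my github; the transcript excerpt is made up, and the assumption is that every spoken line starts with the speaker's name followed by a colon.

```r
# Hypothetical transcript excerpt (invented lines, not taken from the show).
transcript <- c(
  "Rick: Hello?",
  "Rick: Is anybody there?",
  "Shane: Stay close.",
  "[A car door slams]",
  "Morgan: Who are you?"
)

# Assumed format: "Speaker: text". Anything else (stage directions etc.) is dropped.
pattern <- "^([A-Za-z .'-]+):[[:space:]]*(.*)$"
dialogue <- transcript[grepl(pattern, transcript)]

# Split every dialogue line into speaker and spoken text.
lines_df <- data.frame(
  character = sub(pattern, "\\1", dialogue),
  line = sub(pattern, "\\2", dialogue),
  stringsAsFactors = FALSE
)

# Count lines per character, most talkative first.
line_counts <- sort(table(lines_df$character), decreasing = TRUE)
print(line_counts)
```

The regular expression is deliberately strict, so stage directions in brackets are skipped rather than counted as a line for somebody.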

I didn’t want to give up on finding some more interesting data from the show and stumbled upon the cast lists per episode on IMDb. Not that exciting either, but it’s something. The chart below shows the percentage of episodes each character has appeared in.

Again, no surprise here. The core group of survivors appears in most episodes, with the important characters of single seasons ranking below them.
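The percentages in such a chart boil down to a simple count. A rough sketch, assuming the cast lists have already been collected into a table with one row per episode and character (the data frame and its column names are hypothetical, and the rows are toy data):

```r
# Hypothetical cast table: one row per (episode, character) appearance.
cast <- data.frame(
  episode = c("S01E01", "S01E01", "S01E02", "S01E02",
              "S01E03", "S01E03", "S01E03"),
  character = c("Rick", "Morgan", "Rick", "Glenn",
                "Rick", "Glenn", "Daryl"),
  stringsAsFactors = FALSE
)

# Guard against a character being credited twice in the same episode.
cast <- unique(cast)
n_episodes <- length(unique(cast$episode))

# Share of episodes each character appears in, as a percentage.
appearances <- table(cast$character)
pct_episodes <- sort(100 * appearances / n_episodes, decreasing = TRUE)
print(round(pct_episodes, 1))
```

Dividing by the number of distinct episodes in the table assumes that every episode has at least one credited character, which should be safe for cast lists.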

Looking at the character lists a bit more closely, I noticed that they also include actors who played walkers in the episode. I have no idea how complete these lists are, and they definitely do not include all walkers. However, they might still give us a general idea of how many walkers appear in each episode. In total, there are 437 walkers appearing in all character lists. Below you find the breakdown per episode, per season.

One can notice two things here. First, the number of walkers per episode has declined significantly over the seasons. While the beginning of the show was all about surviving attacks by walkers, it evolved into a show about how people start turning against each other. Also, fighting against super evil assholes (I am talking to you, Negan!). The zombies have become a minor inconvenience, with the occasional drama of herds. The Walking Dead is not about the walking dead anymore.

A second thing you might notice is that the beginning and end of a season feature more walkers than the episodes in between. This is most prominent in season two. I don’t know why, though.
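Counting walkers per episode could look roughly like the sketch below. The big assumption is that walker roles can be recognised by the word "walker" in the character name; the cast table, its columns, and the rows are again made up for illustration.

```r
# Hypothetical cast table with season and episode numbers (toy data).
cast <- data.frame(
  season = c(1, 1, 1, 1, 2, 2),
  episode = c(1, 1, 1, 2, 1, 1),
  character = c("Rick Grimes", "Walker", "Bicycle Walker",
                "Walker (uncredited)", "Daryl Dixon", "Walker"),
  stringsAsFactors = FALSE
)

# Assumption: a role is a walker if its name contains "walker".
is_walker <- grepl("walker", cast$character, ignore.case = TRUE)

# Number of walker roles per season and episode.
walkers <- aggregate(
  list(n_walkers = is_walker),
  by = list(season = cast$season, episode = cast$episode),
  FUN = sum
)
walkers <- walkers[order(walkers$season, walkers$episode), ]
print(walkers)

# Total number of walker roles across all cast lists.
total_walkers <- sum(walkers$n_walkers)
```

Summing a logical vector counts the TRUE values, so episodes without any listed walker still show up with a count of zero instead of disappearing from the result.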

This post was certainly not the most exciting one, but it taught me a valuable lesson. Not every kind of data you need to pursue your analysis ideas is available, so you have to make the best of what you can find or have at your disposal. If you want to redo any of the above analyses, you can find my code on github.

