
James Bond: Do you expect me to talk?
Auric Goldfinger: No, Mr. Bond, I expect you to die!

## James Bond

I’m a big James Bond fan, so naturally I went to watch the new Bond movie, Spectre, which – spoiler alert! – is pretty bad. It also got me reminiscing about the good Bond films of the past. My personal candidate for worst Bond film is Die Another Day, but what does the “objective” opinion say on this hotly debated topic? Does my taste conform to the Internet’s taste?

The Economist newspaper did some data analysis when the last Bond film (Skyfall) came out, stacking up the different Bond actors on kills, martinis drunk, and love conquests (“Booze, bonks and bodies”). They updated the data for the UK release of Spectre, with Daniel Craig climbing sharply in the rankings, mainly thanks to his many kills. The Economist also recently compared the box office opening weekends of the Bond films.

As an economist, I’d have to argue that box office success is one objective measure of film quality; if the movie is bad, people don’t go to the cinemas to watch it, and after all, the market is always right, right?

Others might argue that a critical review scale can assess the quality of a film. This is always debatable: who decides what goes into a quality scale, and how things are grouped? Nevertheless, Metacritic and other ratings are used a lot in different industries, because they’re the best you can get.

## Data

I decided to pull some data on the Bond movies. Luckily, Wikipedia has a nice article listing all Bond films with their respective budgets, box office returns, and several critic scales. I’ll use the excellent and easy-to-use rvest package to pull the data.

rvest really is that easy to use. It was the first time I used it, and I have to say, I like it a lot. It’ll make pulling data from web pages much, much easier for me in the future!
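The scraping code itself is not reproduced here, so what follows is only a minimal sketch of the rvest workflow, assuming rvest 1.0 or later. It parses a tiny inline HTML table (a toy stand-in for the Wikipedia article, so the example runs offline) and converts it to a data frame. For the live page you would pass the article URL to `read_html()` instead, and the table index would depend on the page layout.

```r
library(rvest)

# Toy stand-in for the Wikipedia article, so the sketch runs without network access.
page <- read_html('
  <table class="wikitable">
    <tr><th>Title</th><th>Year</th></tr>
    <tr><td>Dr. No</td><td>1962</td></tr>
    <tr><td>Goldfinger</td><td>1964</td></tr>
  </table>')

# html_elements() selects every <table>; html_table() returns one data frame per table.
tables <- html_table(html_elements(page, "table"))

# Pick the table of interest by position (an assumption; inspect the list on a real page).
films <- tables[[1]]
films
```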

The only problem is that the table in the wiki article has a lot of extra information (like footnotes) that we now need to clean to get a nice, usable dataframe. It’s relatively straightforward: I’ve written a couple of custom functions that mainly do some regexp cleaning. These are the f. functions in the code below. If you’re interested in the details, you can check out the full code, including the function code, on GitHub.
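To give an idea of what such regexp cleaning can look like, here is a small sketch in base R. The helper names `f.strip_footnotes` and `f.to_number` are hypothetical and are not the functions from the repository; the input strings are made-up cells of the kind a scraped wiki table tends to contain.

```r
# Hypothetical helper: drop bracketed footnote markers such as [5] or [a].
f.strip_footnotes <- function(x) {
  x <- gsub("\\[[^]]*\\]", "", x)
  trimws(x)
}

# Hypothetical helper: strip footnotes, then keep only digits and the decimal point.
f.to_number <- function(x) {
  x <- f.strip_footnotes(x)
  as.numeric(gsub("[^0-9.]", "", x))
}

f.strip_footnotes("Dr. No[5]")   # made-up cell with a footnote marker
f.to_number("$12.5[a]")          # made-up currency cell
```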

The make.names command was new to me, and I am very impressed. It makes setting up proper R-usable variable names a breeze, and in this case it also helped with uniqueness: the wiki article uses multi-cell formatting to identify columns, which got lost in my transformations. It’s a bother that people don’t conform to clean data standards everywhere!
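A quick illustration of what `make.names()` does, using illustrative header strings rather than the ones from the article: invalid characters become dots, and with `unique = TRUE` repeated names get a numeric suffix.

```r
# Illustrative headers (assumed, not copied from the wiki table).
raw_names <- c("Title", "Box office", "Box office", "Budget (millions)")

# Spaces and parentheses are replaced by dots; the duplicate gets ".1" appended.
clean_names <- make.names(raw_names, unique = TRUE)
clean_names
```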

I then pull and clean the second table as well, which has information on several awards and critic ratings, and merge both for the final dataset to use:
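The merge step could look roughly like this with base R's `merge()`. The two toy data frames and their column names are assumptions; the second one carries the lead actor instead of ratings, purely so that no rating values have to be invented here.

```r
# Toy versions of the two cleaned tables (column names are assumptions).
box_office <- data.frame(
  Title = c("Dr. No", "Goldfinger"),
  Year  = c(1962, 1964),
  stringsAsFactors = FALSE
)
extra <- data.frame(
  Title = c("Goldfinger", "Dr. No"),   # row order differs on purpose
  Actor = c("Sean Connery", "Sean Connery"),
  stringsAsFactors = FALSE
)

# Join on the film title; all = TRUE keeps films that appear in only one table.
bond <- merge(box_office, extra, by = "Title", all = TRUE)
bond
```

Using `all = TRUE` makes mismatched titles show up as rows with `NA`s instead of silently disappearing, which is a handy check after regexp cleaning.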

The Rotten Tomatoes rating is the one I will be using, by virtue of the fact that it’s available for all films.

## Graphing

With all the data pulled, let’s take a look at it more closely.

A little messy, but you can already get the general idea. There was a sharp decline in box office earnings in the late 1980s, and the older Bond movies (with Connery as Bond in particular) have better ratings.

We’ll clean this graph a bit more, and also add the Bond actors. For that, we need to generate a dataset of Bond actors and their time of service.
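One way to set up such an actor dataset is sketched below. The column names are assumptions, and the years are the release years of each actor's first and last official film up to Spectre, so Connery appears twice because Lazenby's single film interrupted his run.

```r
# One row per continuous stint as Bond (column names are assumptions).
actors <- data.frame(
  actor = c("Sean Connery", "George Lazenby", "Sean Connery", "Roger Moore",
            "Timothy Dalton", "Pierce Brosnan", "Daniel Craig"),
  from  = c(1962, 1969, 1971, 1973, 1987, 1995, 2006),  # first film of the stint
  to    = c(1967, 1969, 1971, 1985, 1989, 2002, 2015),  # last film of the stint
  stringsAsFactors = FALSE
)

# Length of each stint in years (0 means a single film).
actors$span <- actors$to - actors$from
actors
```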

With this information, we can build a large ggplot graph with all the information in one place!
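The plotting code is not shown here, so this is only a sketch of how such a combined graph might be layered in ggplot2. The function name `plot_bond` and the column names (`year`, `box_office`, `rating` for films; `actor`, `from`, `to` for actors) are assumptions, not the names used in the original analysis.

```r
library(ggplot2)

# Hypothetical helper: shaded bands for actor stints, points for films.
plot_bond <- function(films, actors) {
  ggplot() +
    # Widen each band by half a year so single-film stints remain visible.
    geom_rect(data = actors,
              aes(xmin = from - 0.5, xmax = to + 0.5,
                  ymin = -Inf, ymax = Inf, fill = actor),
              alpha = 0.2) +
    # One point per film: position shows box office, size shows the rating.
    geom_point(data = films,
               aes(x = year, y = box_office, size = rating)) +
    labs(x = "Release year", y = "Box office",
         fill = "Bond actor", size = "Rotten Tomatoes (%)") +
    theme_minimal()
}
```

Calling `plot_bond(films, actors)` with data frames that have those columns returns a ggplot object that can be printed or extended with further layers.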

## Results

It turns out that my personal least favourite Bond isn’t that bad by the scales provided above: it was relatively successful at the box office, and also wasn’t rated that badly.

The honour of the worst-rated Bond film goes to A View to a Kill (Roger Moore), and the film with the worst box office results is Licence to Kill (Timothy Dalton). Perhaps it’s the naming policy? It seems you do not make a killing with Bond movies if they have the word “kill” in the title – these two are the only ones.

Roger Moore presided over a continuous decline in popularity in the 1980s, and Timothy Dalton could not stop that trend. This led to a long pause in Bond films until the franchise was resurrected in 1995 with Pierce Brosnan. Also interesting is the fact that with Brosnan, Bond film budgets noticeably increased in size. Only Moonraker comes close to being as expensive as the later Bonds, and that was set in space.

The early Connery Bond films were the most profitable – easily topping the current Craig films, by up to a factor of 22 (Dr. No yielded 64 times its costs, versus Quantum of Solace which made 2.8). Given the quality of Spectre, I fully expect the current Bond to not be a big success (ratings are already very bad in general). This would lead to a new Bond coming up, if history repeats itself – and Craig has already said he no longer wants to play Bond.
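For reference, the comparison is simple arithmetic: the return multiple is box office divided by budget, and the "factor" is the ratio of two such multiples. The helper name below is hypothetical, and the inputs are the rounded multiples quoted above, so the result (just under 23) differs slightly from the 22 in the text, which presumably rests on the unrounded figures.

```r
# Hypothetical helper: how many times a film earned back its budget.
return_multiple <- function(box_office, budget) box_office / budget

# Rounded multiples as quoted in the text.
dr_no_multiple   <- 64
quantum_multiple <- 2.8

# How much more profitable Dr. No was, relative to Quantum of Solace.
profit_factor <- dr_no_multiple / quantum_multiple
round(profit_factor, 1)
```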

Code and data for this analysis are available on GitHub, as always.

James Bond movies was originally published by Kirill Pomogajko at Opiate for the masses on November 14, 2015.