(This article was first published on **jacobsimmering.com**, and kindly contributed to R-bloggers)

# TV Show Cancellations: Myths and Models

TV shows are amazing ways to waste time and, on occasion, the story is so good
that you actually start to care. The problem is that some shows get cancelled
before they jump the shark. Classic examples are shows like Firefly or Arrested
Development. With the increasing serialization of TV shows, having the show
be cancelled early means the story will potentially be unresolved or have a
rushed conclusion.

As a result, a cottage industry of decision rules for predicting cancellation
has emerged over the past few years. A lot of rules have come up, such as “If it
is good and it is on Fox, they will cancel it” or “If it airs on Friday, it will
not return for next season.” But how accurate are these rules?

The most sophisticated method of tracking viewership I have seen is the
Renew/Cancel Index by TV By The Numbers. Their insight is that a network is
unlikely to cancel many of its shows in a given season. New series are typically
just as costly to make but are not a known quantity, and many are cancelled
during the course of their first season. For every \( S \) new series that a
network wants on the air, it has to order \( cS \) series, where \( c > 1 \),
because some of those orders will not last. As such, a network is loath to
cancel too many shows, given the added cost of a new project and of a possible
replacement for that project. Like the old joke, you don't have to outrun the
bear, you just have to outrun the slowest guy in the group.

The Renew/Cancel Index produced by TV By The Numbers takes this to heart.
Instead of just looking at the number of viewers or share, they divide the
number of viewers of the \( i^{th} \) show on network \( N \) by the average
number of viewers of all shows on \( N \) during that week. Now it doesn't matter
that a show that would be a good performer for NBC would be a dud on Fox,
because each show is normalized to the mean for its own network.
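As a minimal sketch of that normalization, the snippet below divides each show's viewers by its network's average for one week. The data frame, show names, and viewer counts are made up for illustration and are not the TV By The Numbers data.

```r
# Hypothetical viewership for a single week (millions of viewers); illustration only
viewers <- data.frame(
  network = c("NBC", "NBC", "NBC", "FOX", "FOX", "FOX"),
  show    = c("A", "B", "C", "D", "E", "F"),
  viewers = c(4, 6, 8, 6, 9, 12)
)

# ave() returns the group mean for every row, so this is viewers / network average
viewers$index <- viewers$viewers / ave(viewers$viewers, viewers$network)

viewers
```

Shows B and D both draw 6 million viewers, but B sits exactly at its network's average (index 1) while D falls well below its own (index of about 0.67).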

If this index holds up as accurate, it provides an interesting way to test
those two major “TV Myths.” From the TV By The Numbers website, I pulled 303
renewal decisions for the networks ABC, CBS, FOX and NBC from the 2009-2010 to
the 2012-2013 seasons.

## Data Overview

When we look at the density of the index values split by network, we see that
this normalization makes them roughly equal. Each network has some outstanding
shows with index values over 1.5 and all have a few duds around 0.5.

```
ggplot(bigFour, aes(x = index, color = network)) + geom_density() + scale_x_continuous("Renew/Cancel Index")
```

```
bigFour$adjusted <- bigFour$index + bigFour$friday * 0.2
ggplot(bigFour, aes(x = adjusted, color = network)) + stat_ecdf() + geom_vline(aes(xintercept = 0.75)) +
geom_vline(aes(xintercept = 0.9)) + scale_x_continuous("Adjusted Renewal/Cancel Index") +
scale_y_continuous("Cumulative Density")
```

The “slow runners” on each network are also made clear. The people at
TV By The Numbers use the following breaks for their predictions:

- Index values over 0.90: likely renewal
- Index values over 0.75 but under 0.90: toss-up
- Index values under 0.75: probable cancellation

Shows that air on Fridays are given a slight bump of about 0.2 index units
to account for the Friday Night Death Slot. The two break points (0.75 and 0.90)
are the vertical lines in the ECDF plot.
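The breaks and the Friday bump can be sketched as a small helper. The function name `classify_index`, the labels, and the handling of values that fall exactly on 0.75 or 0.90 are assumptions made for illustration; the source only gives the three ranges.

```r
# Hypothetical helper: apply the Friday bump, then bin the adjusted index
classify_index <- function(index, friday = 0, bump = 0.2) {
  adjusted <- index + friday * bump
  # cut() uses right-closed intervals by default, so exactly 0.75 counts as
  # "likely cancelled" and exactly 0.90 as "toss-up" (an assumption)
  cut(adjusted,
      breaks = c(-Inf, 0.75, 0.90, Inf),
      labels = c("likely cancelled", "toss-up", "likely renewed"))
}

# The same raw index of 0.60 lands in different bins on and off Friday
classify_index(c(0.60, 0.60, 0.80, 1.10), friday = c(0, 1, 0, 0))
```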

So how well does the index work?

```
bigFour$adjusted <- bigFour$index + bigFour$friday * 0.2
ggplot(bigFour, aes(x = adjusted, y = status)) + geom_point() + geom_vline(aes(xintercept = 0.75)) +
geom_vline(aes(xintercept = 0.9)) + scale_x_continuous("Adjusted Renewal/Cancel Index") +
scale_y_continuous("Renewal Status") + geom_smooth(method = "loess", alpha = 0.25)
```

It captured the expected sigmoidal shape of the response curve! That is
something! Looking at their suggested breaks, they nicely land at about 30% and
about 70% probability of renewal. Not too shabby for simple rules. Now, if we
accept this index as a valid measure of cancellation risk, let's ask the hard
questions.

## Mythbusting: Does FOX Hate TV?

The cancellation of Firefly and Arrested Development in the 2000s led to the
idea that Fox cancels everything good. Is that true? Does FOX really hate TV?
Or, more accurately, does FOX have a higher requirement for show performance
than the other networks? Let's test that with a model. Let \( Y_{i, N} \) be the
renewal status of the \( i^{th} \) show on network \( N \) and \( I_{i, N} \) be
the renew/cancel index for that show. I looked at three models: one with no
network effect (M1), one with an intercept that varies by network (M2), and one
with both intercept and slope varying by network (M3).
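Written out, the three models can be sketched as below. This assumes the identity-link (Gaussian) form that the fitted output in this post reports, with \( I_{i, N} \) taken as the Friday-adjusted index. The symbols \( \beta_0 \), \( \beta_1 \), \( \alpha_N \) and \( \gamma_N \) are labels introduced here rather than the author's notation, and FOX is the reference network, so \( \alpha_{FOX} = \gamma_{FOX} = 0 \).

\[
\begin{aligned}
\text{M1:}\quad & E[Y_{i, N}] = \beta_0 + \beta_1 I_{i, N} \\
\text{M2:}\quad & E[Y_{i, N}] = \beta_0 + \alpha_N + \beta_1 I_{i, N} \\
\text{M3:}\quad & E[Y_{i, N}] = \beta_0 + \alpha_N + (\beta_1 + \gamma_N) I_{i, N}
\end{aligned}
\]

If FOX really demanded more from its shows, the other networks' \( \alpha_N \) (and possibly \( \gamma_N \)) would come out clearly positive.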

```
# Order the network factor so that FOX is the reference level
bigFour$reference <- ifelse(bigFour$network == "FOX", 1, ifelse(bigFour$network ==
"ABC", 2, ifelse(bigFour$network == "CBS", 3, 4)))
bigFour$network <- reorder(bigFour$network, bigFour$reference)
# Note: with no family argument, glm() fits a Gaussian (linear probability) model,
# not a logistic regression
m1 <- glm(status ~ adjusted, data = bigFour)
m2 <- glm(status ~ adjusted + network, data = bigFour)
m3 <- glm(status ~ adjusted + network + adjusted * network, data = bigFour)
summary(m1)
```

```
##
## Call:
## glm(formula = status ~ adjusted, data = bigFour)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1782 -0.3110 -0.0048 0.3421 0.8238
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.3152 0.0673 -4.68 4.3e-06 ***
## adjusted 0.9635 0.0673 14.31 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.1426)
##
## Null deviance: 70.833 on 293 degrees of freedom
## Residual deviance: 41.629 on 292 degrees of freedom
## (9 observations deleted due to missingness)
## AIC: 265.6
##
## Number of Fisher Scoring iterations: 2
```
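To get a feel for what these coefficients mean, the sketch below plugs the two TV By The Numbers break points into the fitted line, using the estimates copied from the printout above. The variable names are made up for illustration.

```r
# Coefficients copied from the summary(m1) printout
b0 <- -0.3152   # intercept
b1 <- 0.9635    # slope on the Friday-adjusted index

# The two break points used by TV By The Numbers
breaks <- c(toss_up_low = 0.75, toss_up_high = 0.90)

# Fitted renewal probability under the linear model at each break
fitted_at_breaks <- b0 + b1 * breaks
round(fitted_at_breaks, 2)
```

This gives roughly 0.41 and 0.55, a narrower spread than the 30% and 70% read off the loess curve earlier, which suggests that a straight line flattens the sigmoidal shape through this range.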

```
summary(m2)
```

```
##
## Call:
## glm(formula = status ~ adjusted + network, data = bigFour)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0951 -0.3131 -0.0147 0.3393 0.8944
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.26429 0.07965 -3.32 0.001 **
## adjusted 0.95144 0.06759 14.08 <2e-16 ***
## networkABC -0.05954 0.06359 -0.94 0.350
## networkCBS 0.00875 0.06236 0.14 0.888
## networkNBC -0.11536 0.06578 -1.75 0.081 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.1416)
##
## Null deviance: 70.833 on 293 degrees of freedom
## Residual deviance: 40.919 on 289 degrees of freedom
## (9 observations deleted due to missingness)
## AIC: 266.6
##
## Number of Fisher Scoring iterations: 2
```

```
summary(m3)
```

```
##
## Call:
## glm(formula = status ~ adjusted + network + adjusted * network,
## data = bigFour)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1760 -0.3031 -0.0172 0.3297 0.9497
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.2709 0.1477 -1.83 0.068 .
## adjusted 0.9584 0.1472 6.51 3.4e-10 ***
## networkABC -0.0740 0.1876 -0.39 0.693
## networkCBS 0.1497 0.2016 0.74 0.459
## networkNBC -0.2308 0.2083 -1.11 0.269
## adjusted:networkABC 0.0167 0.1901 0.09 0.930
## adjusted:networkCBS -0.1415 0.1972 -0.72 0.474
## adjusted:networkNBC 0.1240 0.2099 0.59 0.555
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.1421)
##
## Null deviance: 70.833 on 293 degrees of freedom
## Residual deviance: 40.655 on 286 degrees of freedom
## (9 observations deleted due to missingness)
## AIC: 270.7
##
## Number of Fisher Scoring iterations: 2
```
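The individual network terms can also be judged jointly. The sketch below computes an F test for the nested models by hand from the residual deviances and degrees of freedom printed above, which should correspond to what `anova(m1, m2, test = "F")` reports for Gaussian fits. The helper name `f_test` is hypothetical.

```r
# Residual deviances and residual degrees of freedom copied from the printouts
dev <- c(m1 = 41.629, m2 = 40.919, m3 = 40.655)
df  <- c(m1 = 292,    m2 = 289,    m3 = 286)

# F test of a smaller model against a larger model that nests it
f_test <- function(small, large) {
  num_df <- df[[small]] - df[[large]]
  f <- ((dev[[small]] - dev[[large]]) / num_df) / (dev[[large]] / df[[large]])
  p <- pf(f, num_df, df[[large]], lower.tail = FALSE)
  c(F = f, p = p)
}

f_test("m1", "m2")  # adds network-specific intercepts
f_test("m1", "m3")  # adds network-specific intercepts and slopes
```

Both p-values come out above 0.05, in line with M1 also having the lowest AIC of the three models (265.6).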

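We see in all three models that the show's Friday-adjusted Renew/Cancel Index is
a powerful predictor: each additional index unit is associated with an increase
of roughly 0.95 in the fitted renewal probability. Because `glm()` was called
without `family = binomial`, these are Gaussian linear-probability fits, so the
coefficient is a slope on the probability scale rather than an odds ratio. The
network effects were not significant in any of the models.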
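An odds ratio comes from a logistic fit, which needs `family = binomial`; the printouts above report the Gaussian family. Here is a minimal sketch of the logistic version on simulated data. The data frame `sim` and its generating parameters are invented stand-ins for `bigFour`, so the numbers it produces say nothing about the real shows.

```r
set.seed(1)
n <- 300

# Simulated stand-in for bigFour (not the author's data); FOX is the reference level
sim <- data.frame(
  adjusted = runif(n, 0.3, 1.8),
  network  = factor(sample(c("FOX", "ABC", "CBS", "NBC"), n, replace = TRUE),
                    levels = c("FOX", "ABC", "CBS", "NBC"))
)
# Renewal depends on the index only, with no network effect built in
sim$status <- rbinom(n, 1, plogis(-5 + 6 * sim$adjusted))

# Logistic versions of M1 and M2
l1 <- glm(status ~ adjusted, family = binomial, data = sim)
l2 <- glm(status ~ adjusted + network, family = binomial, data = sim)

# Odds ratio for a 0.1-unit increase in the adjusted index
or_per_tenth <- exp(0.1 * coef(l1)["adjusted"])
or_per_tenth

# Likelihood-ratio test for the network terms as a group
lrt <- anova(l1, l2, test = "Chisq")
lrt
```

The odds ratio is reported per 0.1 index units because a full unit spans most of the observed range of the index.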

Looking at this visually…

```
bigFour$fox <- ifelse(bigFour$network == "FOX", "FOX", "Others")
ggplot(bigFour, aes(x = adjusted, y = status, lty = fox)) + geom_smooth(alpha = 0.25,
method = "loess") + scale_x_continuous("Adjusted Index") + scale_y_continuous("Renewal Probability")
```

This is interesting. Fox is slightly less likely to cancel low-performing shows
(see Fringe) than other networks. However, Fox is more likely to cull shows with
more middling performance. The higher renewal rates for FOX at the lower index
values might be the result of shows like Fringe having favorable economics
(syndication rights). However, it is fairly clear that FOX's reputation as a
killer of all that is good on TV is (at least since 2010) unfounded.

## Mythbusting: Friday Night Death Slot

One of the most interesting questions regarding the Friday Night Death Slot is
whether airing on Friday night kills the shows (due to the lack of viewers on
Friday evening) or whether being moved to Friday night is the result of a lack
of viewership on other nights. That is an interesting question, but it can't be
addressed with the data I have. I can, however, answer whether Friday night is
actually a death slot or a safe harbor.

Because Friday nights are going to have lower viewer numbers regardless of what
show airs and because the networks want to have programming on Friday night,
shows airing on Friday night might be renewed at viewership numbers that
would have resulted in cancellation with extreme prejudice on any other night.
This is because making a new series, plus a midseason replacement in case the
new project is a flop, is no cheaper and often more expensive than airing the
known quantity.

```
m4 <- glm(status ~ index + friday, data = bigFour)
m4
```

```
##
## Call: glm(formula = status ~ index + friday, data = bigFour)
##
## Coefficients:
## (Intercept) index friday
## -0.300 0.953 0.134
##
## Degrees of Freedom: 293 Total (i.e. Null); 291 Residual
## (9 observations deleted due to missingness)
## Null Deviance: 70.8
## Residual Deviance: 41.5 AIC: 267
```
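The Friday coefficient is on the renewal-probability scale, while the TV By The Numbers bump is expressed in index units. One way to put the two on the same footing, sketched here from the printed coefficients, is to divide the Friday coefficient by the index slope. The variable names are made up for illustration.

```r
# Coefficients copied from the m4 printout above
b_index  <- 0.953
b_friday <- 0.134

# Index units that produce the same change in fitted renewal probability
# as airing on Friday
friday_bump <- b_friday / b_index
round(friday_bump, 2)
```

That works out to about 0.14 index units, somewhat below the 0.2 bump that TV By The Numbers applies.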

Looking at this model, shows on Friday don't get a huge break relative to
shows airing on other days of the week: the Friday coefficient works out to a
boost of about 0.14 index units, compared to the 0.2 used by TV By The Numbers.
Here is the same thing visually:

```
ggplot(bigFour, aes(x = index, y = status, color = as.factor(friday))) + geom_smooth(span = 0.75,
method = "loess")
```

The shows on Friday all have lower viewer counts but are more likely to be
renewed at those lower counts than shows on other nights. In this sense, Friday
night time slots may offer a chance for shows with cult followings, like Fringe
or Chuck, to find an economically viable timeslot and audience. At the very
least, the networks do not hold Friday shows to Thursday standards.

Hopefully this year I won't have to worry about getting my six seasons and a
movie, but if I do, I can rest easy knowing that the network doesn't matter and
Friday offers a refuge for cult shows.
