Analyzing Jeopardy in R – Part 2
[This article was first published on Statistics et al., and kindly contributed to R-bloggers.]
My previous Jeopardy analyzer was built using a base of about 30 daily Coryat scores. This one has more than 1600 scores that were either recorded directly, emailed to me, or scraped from the forum at jboard.tv. Here we look at the consistency of tournament effects for different at-home players, and at some long-term trends.
After recording 90 days over 2017, it’s become apparent that I’m not getting any better at Jeopardy just by playing the game at home, as shown in Figure 1.*
The average Coryat is fitted using a loess smoother (local regression, via the loess() function in base R’s stats package). It’s a more flexible model than the previous one, and simpler to code and manipulate. The smooth line in Figure 1 shows that the Coryat score at the beginning of the year is roughly the same as it is at the end of the year. The two tournaments I recorded, the College Championship and the Tournament of Champions, are not used in the smoother; the curve is linearly interpolated across them. To measure the effect of these tournaments, I compare the linearly interpolated values to the mean score in each tournament. These results are shown in Table 1; the estimate of the tournament effect from my previous analysis gives roughly the same value. According to these results, I get an average of 3128 more points in a College Championship game than in a normal game, and 1726 fewer points in a Tournament of Champions game than normal.
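| Player | n | Regular Coryat | College | Champions | Teacher’s |
|--------|-----|----------------|---------------|-----------|-----------|
| Jack | 90 | 15570 | +3128 | -1726 | NA |
| A | 480 | 30637 | +5443 | +1866 | +2354 |
| B | 707 | 34192 | +4405 | -86 | +2090 |
| C | 55 | 35207 | -15466 (n=1) | -370 | -3829 |
| D | 27 | 16419 | NA | NA | NA |
| E | 45 | 23682 | NA | NA | NA |
| F | 119 | 18795 | -3329 | -2044 | NA |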
Table 1 – Coryats and Tournament Effects of Seven Players
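For readers who want to try the tournament-effect calculation on their own scores, here is a minimal sketch of the idea described above. It is not the code behind Figure 1 or Table 1: the data are simulated, and the column names (day, score, tournament), the tournament window (days 30 to 39), the artificial +3000 boost and the span value are all assumptions made for illustration.

```r
# Simulated stand-in data: one Coryat score per day (NOT real scores).
set.seed(1)
days     <- 1:90
scores   <- round(15500 + rnorm(90, sd = 4000))
is_tourn <- days %in% 30:39                    # hypothetical tournament days
scores[is_tourn] <- scores[is_tourn] + 3000    # artificial tournament boost

d <- data.frame(day = days, score = scores, tournament = is_tourn)

# Fit the smoother on regular (non-tournament) games only.
fit <- loess(score ~ day, data = d[!d$tournament, ], span = 0.75)

# Smoothed values on the regular days ...
reg_day <- d$day[!d$tournament]
reg_hat <- predict(fit, newdata = data.frame(day = reg_day))

# ... linearly interpolated across the tournament days.
baseline <- approx(x = reg_day, y = reg_hat, xout = d$day[d$tournament])$y

# Tournament effect: mean tournament score minus mean interpolated baseline.
effect <- mean(d$score[d$tournament]) - mean(baseline)
effect
```

With real data, the tournament flag would come from the air dates of each tournament, and the span would probably need tuning by eye against the plotted scores.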
Another trend I was interested in was more short-term. Looking at the chart of day-to-day scores, it almost seems like the scores are oscillating, with a high score followed by a low score and vice versa. If this is a real effect, we should be able to see it as a negative autocorrelation between one day’s score and the next. Figure 8 shows a scatter plot with one day’s score on the x-axis and the previous day’s score on the y-axis. The estimated Pearson correlation coefficient is almost exactly zero. Furthermore, this lack of correlation is not an artifact of some nonlinear effect, because the same lack of pattern shows up when we compare the ranks of the scores from one day to the next and take the Spearman correlation coefficient of these ranked values instead. In short, there are no day-to-day balancing effects and no hot or cold streaks. It’s all just regression to the mean.
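The day-to-day check described above can be sketched in a few lines of base R. The scores here are simulated as independent draws, so this only illustrates the mechanics (pairing each day with the previous one, then taking the Pearson and Spearman correlations); the numbers it produces say nothing about any real player.

```r
# Simulated, independent daily scores (an assumption for illustration only).
set.seed(2)
scores <- round(15500 + rnorm(90, sd = 4000))

n         <- length(scores)
today     <- scores[-1]   # days 2..n
yesterday <- scores[-n]   # days 1..(n-1)

# Pearson: linear association between consecutive days.
pearson <- cor.test(today, yesterday, method = "pearson")

# Spearman: the same correlation computed on ranks, which picks up
# monotone but nonlinear relationships. exact = FALSE avoids tie warnings.
spearman <- cor.test(today, yesterday, method = "spearman", exact = FALSE)

c(pearson_r    = unname(pearson$estimate),
  pearson_p    = pearson$p.value,
  spearman_rho = unname(spearman$estimate),
  spearman_p   = spearman$p.value)
```

A clearly negative coefficient with a small p-value would support the oscillation idea; values near zero, as reported for my scores, do not.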
Even if the correlation did show up as statistically significant, it doesn’t seem to be particularly meaningful. My hypothesis was that topics were chosen from day to day such that viewers at home would be more likely to have a category or two that they could excel in every couple of days, and to avoid long stretches where home viewers may feel frustrated. Another effect of such a negative correlation would be that champions who stay on a long time truly would be outstanding players in multiple fields rather than specialists. However, I was unable to find any evidence of such a topic-shuffling strategy in my own scores, and I would consider myself a fairly typical at-home player in my degree of specialization.**
Now let’s try these analyses with some other players and see if the same sort of trends appear, as shown in Figures 2-7 and Table 1. According to the figures, a whole range of trends appear. The only common one seems to be quick improvement at the beginning of tracking Coryat scores. From the table, we see that the Tournament of Champions tends to vex people. The other tournaments not so much. One note about player C’s College Championship effect is that it represents only a single measurement, which you can see by the triangle on their chart on day 53.
Figure 2 
Figure 3 
Figure 4 
Figure 5 
Figure 6 
Figure 7 
What about that streak or balance hypothesis? Does the near-zero correlation hold for other players as well? Only for player F did a statistically significant correlation appear (p = 0.005, before multiple testing adjustments), and even then that could be an artifact of gradual improvement (a similar check on the model residuals could adjust for improvement, but we’re p-hacking at this point). Figures 8 to 10 show the scatterplots for me and two of the other players to show how weak such a relationship is, if there is one at all.
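| Player | Pearson r | Spearman r |
|--------|-----------|------------|
| Jack | 0.004 | 0.006 |
| A | 0.045 | 0.036 |
| B | 0.112 | 0.100 |
| C | 0.013 | 0.019 |
| D | 0.062 | 0.091 |
| E | 0.242 | 0.206 |
| F | 0.252 | 0.309 |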
Table 2
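Two of the caveats above can be made concrete. The sketch below uses a simulated player who improves steadily (the trend, noise level and number of days are assumptions, not player F’s data) to show how the lag-1 correlation could be computed on the raw scores and again on the residuals from a loess fit, which removes gradual improvement. It also applies a Bonferroni adjustment to the reported p = 0.005 across the seven players tested; Bonferroni is just one possible adjustment, chosen here for simplicity.

```r
# Simulated player with steady improvement and independent daily noise.
set.seed(3)
day   <- 1:119
score <- 15000 + 40 * day + rnorm(119, sd = 3000)

# Lag-1 correlation: each value paired with the one before it.
lag1 <- function(x) cor(x[-1], x[-length(x)])

# On raw scores, a trend alone can induce correlation between neighbours.
raw_r <- lag1(score)

# On residuals from the smoother, the gradual trend has been removed.
fit     <- loess(score ~ day)
resid_r <- lag1(residuals(fit))

c(raw = raw_r, residual = resid_r)

# Bonferroni adjustment of p = 0.005 for seven tests: 7 * 0.005 = 0.035.
p_adj <- p.adjust(0.005, method = "bonferroni", n = 7)
p_adj
```

Under this adjustment the result for player F would still fall below 0.05, though only just, which fits the cautious reading given above.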
Another hypothesis I wanted to test was about the nature of Coryat scores. The smoother model in this analysis relies on the scores having some linear relationship to underlying skill, rather than something nonlinear like an exponential one. That is, someone who averages 15,000 would be just as much better than someone whose average is 12,000 as that second person would be better than someone else whose average is 9,000. Or, in other words, every additional point of average Coryat represents the same amount of latent skill improvement.
This sort of relationship may seem a given, but it isn’t in a lot of games and sports. Consider bowling, either 5- or 10-pin. Scores in bowling tend to compound because consecutive strikes are worth more than individual, unconnected ones. The distribution of a typical bowler’s scores is right (positively) skewed, meaning that there are more unusually high scores than unusually low ones. (For extremely good bowlers, the opposite is true because they will typically play close to perfection.)
So do Coryat scores follow a similar set of patterns? Consider the histograms in the right half of Figures 8-10. Player A is very strong, and their scores exhibit the negative skew that we would expect of someone consistently at their peak. Player F’s scores are approximately symmetric about 15000. My scores are positively skewed, implying that I’m consistently bad, but occasionally get lucky.
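The histograms give a visual impression of skew; a sample skewness coefficient puts a number on it. Base R has no built-in skewness function, so the sketch below defines a small helper (the name skewness and the moment-based formula are my choice for this example, not something taken from the analysis code) and tries it on simulated data with a long right tail and a long left tail.

```r
# Moment-based sample skewness: third central moment divided by the
# second central moment raised to the power 3/2.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Simulated examples (not real Coryat scores).
set.seed(4)
right_tail <- 12000 + 3000 * rexp(500)   # mostly low, occasionally very high
left_tail  <- 32000 - 3000 * rexp(500)   # mostly high, occasionally very low

c(right_tail = skewness(right_tail),   # positive: long right tail
  left_tail  = skewness(left_tail),    # negative: long left tail
  symmetric  = skewness(c(1, 2, 3)))   # zero for a symmetric sample
```

Applied to a vector of daily Coryat scores, a clearly positive value would match the "consistently bad, occasionally lucky" pattern, and a clearly negative value the "consistently near peak" pattern.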
Figure 8 
Figure 9 
Figure 10 
In this link, I have included the updated R code necessary to do these analyses with your own at-home scores, as well as a sample dataset of the scores of a couple of people who gave me explicit permission to share their scores.
I would be thrilled to have more data from more players, in order to further analyze the at-home experience of Jeopardy. I could use this data to better answer the questions posed in this post, as well as answer queries from other players.
In the future, I would like to compare the difficulty of Jeopardy! vs Double Jeopardy!, and to see how dependent a typical score is on a few categories. Both questions could be answered with data of enough volume and resolution. Please feel free to add your own analysis questions in the comment section, or send them to my email ([email protected]) or Twitter (@jack_davis_sfu).
Thanks for reading!
* I did improve from 16/50 to 30/50 on the annual online test, but that could mostly be attributed to bad luck on the first test and good luck on the second.
**Specifically, I nail the science questions and do reasonably well on the academic questions, but I’m left silent when it comes to Americana and Opera.
Link to the first Jeopardy analysis: http://www.statsetal.com/2017/03/analyzingjeopardyinrcollege.html