Tipster Season

March 16, 2019
(This article was first published on Analysis of AFL, and kindly contributed to R-bloggers)

So the AFL men's season is approaching, which means that soon everyone's Twitter feed, Facebook and email will get clogged up with various tipsters: people saying they won 60% of the time last season, and that you should therefore pay them money and follow their tips!

But how can you assess the accuracy of a tipster? Very few would allow a full interrogation of their model. This means that if you decide to follow them, you are doing so based only on their output, and that could be an issue.

Let's take a simple thought experiment. Say we have a biased coin, and for this experiment we deliberately set the probability of heads to 0.55.

library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0       ✔ purrr   0.3.0  
## ✔ tibble  2.0.1       ✔ dplyr   0.8.0.1
## ✔ tidyr   0.8.3       ✔ stringr 1.4.0  
## ✔ readr   1.3.1       ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
flips <- 10000
pHeads <- 0.55
set.seed(10032019)
# Simulate biased coin flips: 1 = heads, 0 = tails
coinflips <- sample(x = c(0, 1), prob = c(1 - pHeads, pHeads), size = flips, replace = TRUE)

# Running proportion of heads after each flip
count_heads <- cumsum(coinflips)
flip_number <- 1:flips
runProp <- count_heads / flip_number

flip_data <- data.frame(run = flip_number, prop = runProp)
ggplot(flip_data, aes(x = run, y = prop)) +
  geom_path() +
  # Zoom in on the first 500 flips without dropping any rows
  coord_cartesian(xlim = c(1, 500), ylim = c(0.45, 1.0)) +
  # Reference lines: 0.58 (apparent rate), 0.55 (true rate), flip 80
  geom_hline(yintercept = 0.58) + geom_vline(xintercept = 80) +
  geom_hline(yintercept = 0.55) +
  ggtitle("Running Proportion Heads of a biased Coin") +
  ylab("Proportion of Heads") + xlab("Flip Number")

Keep in mind this is a simulation in which I set the probability of a head to 0.55. For reference I added some lines. The horizontal reference line geom_hline(yintercept=0.58) sits at 0.58, and we can see that even after 200 bets we are still hovering around 0.58. Of course that's what a tipster would say: that their win rate is 0.58. But we know from the simulation that this isn't true long term. The other reference line I added is geom_vline(xintercept=80), where performance looks even slightly higher (about 0.6). Again, we know this is a simulation and that the long-term rate is 0.55. How long is an AFL season? How many games do you realistically expect people to have bet on in a season?
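To put rough numbers on how noisy a win rate is over a season-sized sample, here is a minimal sketch using base R only. It assumes independent bets with a fixed true win probability, exactly as in the coin simulation. The bet counts in n_bets and the names se, needed_wins and p_at_least_58 are illustrative choices, not part of the original analysis.

```r
# Sketch: how noisy is an observed win rate after n bets?
# Assumption: independent bets with a fixed true win probability.
p_true <- 0.55                      # true long-run rate from the simulation
n_bets <- c(80, 200, 500, 10000)    # illustrative sample sizes

# Standard error of the observed proportion after n bets
se <- sqrt(p_true * (1 - p_true) / n_bets)

# Smallest number of wins that gives an observed rate of at least 0.58
# (integer arithmetic avoids floating-point surprises in ceiling())
needed_wins <- ceiling(58 * n_bets / 100)

# Chance of reaching that many wins by luck alone: P(X >= needed_wins)
# where X ~ Binomial(n, p_true)
p_at_least_58 <- pbinom(needed_wins - 1, size = n_bets, prob = p_true,
                        lower.tail = FALSE)

spread <- data.frame(n_bets, se = round(se, 3),
                     p_at_least_58 = signif(p_at_least_58, 2))
print(spread)
```

At 200 bets the standard error is roughly 0.035, so an observed 0.58 sits within one standard error of the true 0.55. Under these assumptions, a win rate like that over a single season tells you very little about the long-run rate.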

Another thing to think about is the long-term performance, which in this case we know. Think about the way averages work. If after 250 bets the proportion of heads is 0.58, what proportion of heads is needed in the next 250 bets to bring the running proportion down to 0.55? Maybe have a think about that next time before you pay for a service.
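For anyone who wants to check their answer, the arithmetic can be sketched in a few lines of R. The numbers come straight from the question above; the variable names are just illustrative.

```r
# Worked version of the question, using the numbers from the text
n_first <- 250      # bets so far
prop_first <- 0.58  # running proportion of heads so far
n_next <- 250       # bets still to come
target <- 0.55      # running proportion we want after all 500 bets

heads_first <- prop_first * n_first                  # heads so far
heads_needed_total <- target * (n_first + n_next)    # heads needed over 500
heads_needed_next <- heads_needed_total - heads_first
prop_needed_next <- heads_needed_next / n_next
print(round(prop_needed_next, 2))

# If the next 250 bets simply come in at the true rate of 0.55,
# the running proportion after 500 bets is still above 0.55
expected_after_500 <- (heads_first + 0.55 * n_next) / (n_first + n_next)
print(round(expected_after_500, 3))
```

The next 250 bets would need to come in at 0.52, which is below the true rate of 0.55. If they land at 0.55 instead, the running proportion after 500 bets is 0.565: a lucky early run gets diluted rather than cancelled out.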
