a knapsack riddle?

[This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

gear

The [then current now past] riddle of the week is a sort of multiarmed bandits optimisation. Of sorts. Or rather a generalised knapsack problem. The question is about optimising the allocation of 100 undistinguishable units to 10 distinct boxes against a similarly endowed adversary, when the loss function is

and the distribution q of the adversary is unknown. As usual (!), the phrasing of the riddle is somewhat ambiguous but I am under the impression that the game is played sequentially, hence that one can learn about the distribution of the adversary, at least when assuming this adversary keeps the same distribution q at all times.

Thus given such a q, one aims at maximising the expected loss

\sum_{i=1}^{10} i\{2P(Y_i<x_i)+P(Y_i=x_i)\} = \sum_{i=1}^{10} i \{F_i(x_i)+F_i(x_i-1)\}

based solely on the marginals of q. Under the constraint that the sum of the allocations is 100. It seems rather natural to assume a Multinomial distribution on y, with probability vector p. Hence Binomial marginals.

If we knew y, the optimum would be obtained as taking as many x[i] equal to y[i]+1 as possible for the largest values of the index i and cancelling all other terms. However, this is not necessarily the best option when, by cancelling one x[i], all remaining lowest terms could then be filled. The other point is that, once a positive loss is achieved, there is plenty of freedom in allocating the remaining bits. But this is not the problem, only the distribution matters. Still, here is an R code to optimise x given y:

loz=function(x,y){ 
   return(sum(2*(1:10)*(x>y)+(1:10)*(x==y)))}
#distribution of dominating allocations
T=1e2
armz=matrix(0,T,100)
for (l in 1:T){
# initialisation with default probability
 y=rmultinom(n=1,size=100,prob=1:10) #adversary
 x=y+1 #impossible allocation
 i=1 #get rid of lesser values
 while (sum(x)>100){x[i]=0;i=i+1}
 x[i-1]=100-sum(x) #correct sum
#random search
 maxloz=loss=0
 newloss=loz(x,y)
 t=1
 while ((t<100)||(newloss>loss)){
#exchange two groups
  i=j=sample(1:10,1,prob=(x>0))
  while (j==i)
   j=sample(1:10,1,prob=(x<(y+1))) 
#optimise over amount of move 
  kaz=min(x[i],y[j]+1-x[j]) 
  gain=rep(0,kaz) 
  for (k in 1:kaz){ 
   newx=x newx[i]=x[i]-k;newx[j]=x[j]+k
   gain[k]=loz(newx,y) 
   if (gain[k]>maxloz){
    maxloz=gain[k];argz=newx}}
  gain=gain-loss
#soft simulated annealing
  mov=1 #choice for a move
  if ((max(gain)>0)&(kaz>1))
   mov=sample(1:kaz,1,prob=gain*(gain>0))
  gain=gain+loss;loss=newloss
  newloss=gain[mov]
  x[i]=x[i]-mov;x[j]=x[j]+mov
  t=t+1
  if (maxloz>loz(y,argz)) break()
  }
#creating a distribution on each of the 100 objects
 reprz=rep(1,x[1])
 for (m in 2:10) reprz=c(reprz,rep(m,x[m]))
 armz[l,]=reprz
 }

In the riddle, after each round, the posterior on the Multinomial distribution of the adversary can be updated from the latest input. The next step is to play the algorithm against itself, assuming each adversary follows the same track. Or wait for The Riddler to provide the winning solution. Which is what happened to me as a frantic schedule in Warwick over the past two days (and multiple activities in the Swiss Alps before) preventing me from procrastinating this way! Still the plane ride back to France and the even longer late (in both senses) night train slog to Paris let me experiment with R codes… Here is for instance a representation of the table of allocations of the 100 units to the ten classes [after re-ordering]:

riddler0902This makes for a distribution to sample from and try to compete against the adversary. After building this scheme, I went back to The Riddler to check for solutions. And found that I had misinterpreted the riddle since all competitors entered a single value of y instead of a distribution! Like the winning (3,5,8,10,13,1,26,30,2,2). Still, using this repartition as the probability of the adversary in the above quote, using the resulting distribution of x led to a 96.8% success rate:

game=function(y,armz){
 w=0
 v=rep(1,100)
 x=rep(0,10)
 for (i in 1:1e3){
  for (u in 1:100) v[u]=sample(armz[u,],1)
  for (b in 1:10) x[b]=sum(v==b)
  w=w+(loz(x,y)>loz(y,x))}
 return(w/1e3)}

Checking against other top finishers also led to dominance, albeit less clearly:

> cy=c(3,5,8,10,13,1,26,30,2,2)
> game(cy,armz)
[1] 0.966
> sy=c(3,6,7,9,11,2,27,31,2,2)
> game(sy,armz)
[1] 0.91
> ny=c(5,7,9,11,15,21,25,2,2,3)
> game(ny,armz)
[1] 0.609
> jy=c(1,8,2,2,13,18,20,32,2,2)
> game(jy,armz)
[1] 0.977
> my=c(0,0,1,11,11,16,26,31,2,2)
> game(my,armz)
[1] 0.959

I wish I had time to test this code against the 1388 solutions proposed there, on Github, which is much less than the 4 10¹² different allocations!, as it would show how resistant the distribution can be to different strategies.


Filed under: Books, pictures, R, Statistics, Travel Tagged: allocations, Colonel Blotto game, github, optimisation, R, simulated annealing, The Riddler

To leave a comment for the author, please follow the link and comment on their blog: R – Xi'an's Og.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)