**Freakonometrics - Tag - R-english**, and kindly contributed to R-bloggers)

Following the previous post, two additional remarks. Following a comment by @cosi, I have investigated quickly a binomial fit to the distribution of the number of people not getting wet, with a fixed number of players on the field. It looks like it should be a binomial distribution with a fixed probability (2/3) and with size parameter affine in the number of players. @guigui suggested some connexion with with "** birds on a wire**" problem (see e.g. http://www.cut-the-knot.org/)

n=p=rep(NA,20) for(i in 1:20){ NSim=10000 N=Vectorize(NOTWET)(n=rep(3+2*i,NSim)) n[i]=mean(N)/(1-var(N)/mean(N)) p[i]=1-var(N)/mean(N) } plot(seq(5,43,by=2),n,col="red",type="b")

for the implied size parameter above, and below the implied probability parameter.

plot(seq(5,43,by=2),p,col="blue",type="b")

(as functions of the number of players). I'd be glad to get more details on that 2/3 probability.

Now, let us investigate another question sent by email: "Where should you hide if you don't want to get wet ?" A first idea could be the following: given that some players are already on the field, where should I go if I do not want to get wet ? Below are some simulations for 7 or 25 players (already on the field). The red area is the area so that I will become someone's target (perhaps even the target of two players...). The green area is the safe zone.

(with 7 players above, and 25 below)

It looks like, on the border, it might be safer than in the middle of the field. But we have to confirm that intuition... or at least see if that intuition is valid.Based on what was done the other day, it is possible to look *where* people that got wet were located (instead of counting dry players as done in the previous function). So here, we simply look where non wet players were standing

NOTWET=function(n,p){ x=runif(n) y=runif(n) (d=as.matrix(dist(cbind(x,y), method = "euclidean",upper=TRUE))) diag(d)=999999 dmin=apply(d,2,which.min) whonotwet=( (1:n) %notin% names(table(dmin)) ) #plot(x[-whonotwet],y[-whonotwet],pch=19,col="blue",type="p") #points(x[whonotwet],y[whonotwet],pch=19,col="red") M=matrix(NA,p,p);u=seq(0,1,by=1/p) for(i in 1:p){ for(j in 1:p){ M[i,j]=sum((x[whonotwet]>=u[i])&(x[whonotwet]<u[i+1])& (y[whonotwet]>=u[j])&(y[whonotwet]<u[j+1])) }} return(M)}

based on function

"%notin%" <- function(x, y) x[!x %in% y]

On a given grid, we count people playing the game that ended dry (with might avoid boundary bias on nonparametric smooth estimator of distribution, as we'll see later on). For instance with 11 players,

M11=matrix(0,25,25); for(s in 1:100000){ M11=M11+NOTWET(11,25) }

Then we can plot the distribution, on the field,

COL=rev(heat.colors(101)); p=25 u=seq(0,1,by=1/p) plot(0:1,0:1,col="white",xlab="",ylab="") for(i in 1:p){ for(j in 1:p){ polygon(c(u[i],u[i],u[i+1],u[i+1]),

c(u[j],u[j+1],u[j+1],u[j]),border=NA, col=COL[trunc(M11[i,j])/max(M11)*100+1]) }}

Red means a lot of non-wet people (i.e safer zones). Graphs below are with 7 and 11 players respectively (from the left to the right)

But would it be completely different with a field shaped as a disk ?

NOTWET=function(n){ x=(runif(n*20)*2-1)*1 y=(runif(n*20)*2-1)*1 I=which((x^2+y^2<1)) x=x[I];y=y[I] x=x[1:n];y=y[1:n] (d=as.matrix(dist(cbind(x,y), method = "euclidean",upper=TRUE))) diag(d)=999999 dmin=apply(d,2,which.min) whonotwet=( (1:n) %notin% names(table(dmin)) ) return(cbind(x[whonotwet],y[whonotwet])) } M=t(c(0,0)) for(s in 1:10000){ M=rbind(M11,NOTWET(25)) } M=M[-1,] library(ks) HP=matrix(c(.001,0,0,.001),2,2) K=kde(x=M11, H=HP) image(K$eval.points[[1]],K$eval.points[[2]],K$estimate2, col=rev(heat.colors(101)),xlim=c(-1,1),ylim=c(-1,1))

NOTWET2=function(n){ x=(runif(n*20)*2-1)*1 y=(runif(n*20)*2-1)*1 I=which((x^2+y^2<1)) x=x[I];y=y[I] x=x[1:n];y=y[1:n] (d=as.matrix(dist(cbind(x,y),

method = "euclidean",upper=TRUE))) diag(d)=999999 dmin=apply(d,2,which.min) notwet=n-length(table(dmin)) return(notwet)} NOTWET=function(n){ x=runif(n) y=runif(n) (d=as.matrix(dist(cbind(x,y),

method = "euclidean",upper=TRUE))) diag(d)=999999 dmin=apply(d,2,which.min) notwet=n-length(table(dmin)) return(notwet)} NSim=100000 Nsquare=Vectorize(NOTWET)(n=rep(25,NSim)) Ndisk=Vectorize(NOTWET2)(n=rep(25,NSim)) Tsq=table(Nsquare) Tdk=table(Ndisk) plot(as.numeric(names(Tsq)),Tsq/NSim,

type="b",col="red") lines(as.numeric(names(Tdk)),Tdk/NSim,

type="b",pch=4,col="blue")

But so far, it was still simple... I wonder what it might become if we consider a non-convex place, with walls, where player might hide.... Next time, a post on indoor paint-ball !

**leave a comment**for the author, please follow the link and comment on his blog:

**Freakonometrics - Tag - R-english**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...