# a Simpson paradox of sorts

**R – Xi'an's Og**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

**T**he riddle from The Riddler this week is about finding an undirected graph with N nodes and no isolated node such that the number of nodes with more connections than the average of their neighbours is maximal. A representation of a connected graph is through a matrix X of zeros and ones, on which one can spot the nodes satisfying the above condition as the positive entries of the vector (X**1**)^2-(X^2**1**), if **1** denotes the vector of ones. I thus wrote an R code aiming at optimising this target

targe <- function(F){ sum(F%*%F%*%rep(1,N)/(F%*%rep(1,N))^2<1)}

by mere simulated annealing:

rate <- function(N){ # generate matrix F # 1. no single F=matrix(0,N,N) F[sample(2:N,1),1]=1 F[1,]=F[,1] for (i in 2:(N-1)){ if (sum(F[,i])==0) F[sample((i+1):N,1),i]=1 F[i,]=F[,i]} if (sum(F[,N])==0) F[sample(1:(N-1),1),N]=1 F[N,]=F[,N] # 2. more connections F[lower.tri(F)]=F[lower.tri(F)]+ sample(0:1,N*(N-1)/2,rep=TRUE,prob=c(N,1)) F[F>1]=1 F[upper.tri(F)]=t(F)[upper.tri(t(F))] #simulated annealing T=1e4 temp=N targo=targe(F) for (t in 1:T){ #1. local proposal nod=sample(1:N,2) prop=F prop[nod[1],nod[2]]=prop[nod[2],nod[1]]= 1-prop[nod[1],nod[2]] while (min(prop%*%rep(1,N))==0){ nod=sample(1:N,2) prop=F prop[nod[1],nod[2]]=prop[nod[2],nod[1]]= 1-prop[nod[1],nod[2]]} target=targe(prop) if (log(runif(1))*temp1]=1 prop[upper.tri(prop)]=t(prop)[upper.tri(t(prop))] target=targe(prop) if (log(runif(1))*temp This code returns quite consistently (modulo the simulated annealing uncertainty, which grows with N) the answer N-2 as the number of entries above average! Which is rather surprising in a Simpson-like manner since all entries but two are above average. (Incidentally, I found out that Edward Simpson recently wrote a paper in Significance about the Simpson-Yule paradox and him being a member of the Bletchley Park Enigma team. I must have missed out the connection with the Simpson paradox when reading the paper in the first place…)

Filed under: Books, Kids, pictures, R Tagged: Bletchley Park, Edward Simpson, Enigma code machine, graph, mathematical puzzle, Significance, Simpson's paradox, simulated annealing, The Riddler, Yule

Toleave a commentfor the author, please follow the link and comment on their blog:R – Xi'an's Og.

R-bloggers.com offersdaily e-mail updatesabout R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.