R Exercises – Interview Questions (Stats and Simulation)

For Data Science positions that require some knowledge of statistics and programming, it is common to be asked questions like the ones below.

Question 1

Suppose an urn contains 40 red, 25 green and 35 blue balls. Balls are drawn from the urn one by one, at random and without replacement. Let \(N\) denote the draw at which the first blue ball appears, and let \(S\) denote the number of green balls drawn up to the \(N\)-th draw (i.e. until the first blue ball appears). Estimate \(E[N|S=2]\) by generating \(10000\) i.i.d. copies of \((S,N)\).

Solution 1

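urn <- c(rep("red", 40), rep("green", 25), rep("blue", 35))
sim <- c()  # values of N from the runs where S == 2
for (i in 1:10000) {
  s <- sample(urn, 100, replace = FALSE)  # one random ordering of the whole urn
  N <- min(which(s == "blue"))  # draw at which the first blue ball appears
  green_balls <- sum(s[1:N] == "green")  # S: green balls drawn before the first blue
  green_balls[!is.finite(green_balls)] <- 0  # defensive guard; the sum is always finite here
  if (green_balls == 2) {
    sim <- c(sim, N)
  }
}
mean(sim)  # estimate of E[N | S = 2]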

[1] 4.792257
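The same estimate can also be written without an explicit loop. The sketch below is one possible variant, not the original solution: the helper name `simulate_draw`, the seed, and the use of `replicate` are all assumptions made for illustration.

```r
# Sketch: simulate one (S, N) pair per call. `simulate_draw` is a hypothetical helper.
simulate_draw <- function(urn) {
  s <- sample(urn)                 # random permutation = drawing without replacement
  n <- match("blue", s)            # position of the first blue ball
  c(S = sum(s[seq_len(n)] == "green"), N = n)
}

set.seed(1)                        # arbitrary seed, only for reproducibility
urn <- rep(c("red", "green", "blue"), times = c(40, 25, 35))
draws <- replicate(10000, simulate_draw(urn))  # 2 x 10000 matrix with rows "S" and "N"
est <- mean(draws["N", draws["S", ] == 2])     # average N over the runs with S == 2
est
```

As a rough sanity check, if the balls were drawn with replacement, the number of red balls before the first blue, given exactly two greens, would be negative binomial with mean 2, so \(E[N|S=2]\) would be \(2+2+1=5\). The simulated value of about 4.79 is close to that.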

Question 2

Suppose that claims are made to an insurance company according to a Poisson process with rate 10 per day. The amount of a claim is a random variable that has an exponential distribution with mean \(\$1000\). The insurance company receives payments continuously in time at a constant rate of \(\$11000\) per day. Starting with an initial capital of \(\$25000\), use \(10000\) simulations to estimate the probability that the firm’s capital is always positive throughout its first \(365\) days.

Solution 2

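sim <- c()  # TRUE for each run in which the capital stays positive
for(i in 1:10000) {
  capital <- 25000
  positive <- TRUE
  for (d in 1:365) {
    claims <- rpois(1, 10)  # number of claims on day d
    capital <- capital + 11000 - sum(rexp(claims, rate = 1/1000))  # premiums in, claims out
    if (capital <= 0) { positive <- FALSE; break }  # capital is checked once, at the end of each day
  }
  sim <- c(sim, positive)
}
mean(sim)  # estimated probability that the capital is always positive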

[1] 0.9644
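Because the capital only drops at claim times and grows linearly in between, it is enough to inspect it right after each claim. The sketch below follows that idea in continuous time; it is an alternative under stated assumptions, not the original solution. The function name `survives`, its argument names and the seed are hypothetical. It relies on the standard fact that, given the number of events of a Poisson process on an interval, the event times are distributed as sorted uniform draws on that interval.

```r
# Sketch: TRUE if the capital stays positive over [0, horizon].
# `survives` and its arguments are illustrative names, not from the original post.
survives <- function(capital0 = 25000, premium = 11000, rate = 10,
                     mean_claim = 1000, horizon = 365) {
  n <- rpois(1, rate * horizon)             # total number of claims in the period
  times <- sort(runif(n, 0, horizon))       # claim arrival times, in days
  claims <- rexp(n, rate = 1 / mean_claim)  # claim amounts
  capital <- capital0 + premium * times - cumsum(claims)  # capital just after each claim
  all(capital > 0)
}

set.seed(1)                                 # arbitrary seed, only for reproducibility
p_hat <- mean(replicate(10000, survives()))
p_hat
```

This check is stricter than looking at the capital once per day, since a dip below zero that recovers before the end of the day also counts as failure, so the estimate may come out somewhat lower.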

