[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers].

In interviews for Data Science positions, it is common to be asked questions about Statistics and Probability. Below are some potential interview questions with indicative solutions.

Question 1: Assuming that X follows the Normal Distribution with Mean=0.545 and Standard Deviation=0.155 find the probability that X exceeds 0.395.

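Answer 1: We standardize X and write Φ for the cumulative distribution function of the standard Normal distribution:

$$Pr(X>0.395) = 1-Pr(X \leq 0.395) = 1- Pr\left(\frac{X-0.545}{0.155} \leq \frac{0.395-0.545}{0.155}\right)=$$

$$=1-\Phi\left(\frac{0.395-0.545}{0.155}\right)=1-\Phi(-0.96774) = 1-0.166587 = 0.833$$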

We can also provide the solution in R.

1-pnorm(0.395, mean=0.545, sd=0.155)

[1] 0.8334134
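As a cross-check of the standardization step in the derivation above, the following sketch computes the same upper-tail probability in three equivalent ways. The variable names (`mu`, `sigma`, `x`, `z`) are illustrative and not part of the original post; `lower.tail = FALSE` is a standard argument of `pnorm` that returns the upper tail directly.

```r
# Parameters from Question 1
mu    <- 0.545   # mean
sigma <- 0.155   # standard deviation
x     <- 0.395   # threshold

# Standardized value used in the hand derivation
z <- (x - mu) / sigma

# Three equivalent ways to get Pr(X > x)
p_complement <- 1 - pnorm(x, mean = mu, sd = sigma)               # 1 - CDF
p_upper_tail <- pnorm(x, mean = mu, sd = sigma, lower.tail = FALSE) # upper tail directly
p_standard   <- pnorm(z, lower.tail = FALSE)                      # standard Normal at z

print(round(z, 5))
print(c(p_complement, p_upper_tail, p_standard))
```

All three values should agree with the 0.8334134 shown above. Asking for the upper tail directly avoids the subtraction from 1, which can matter numerically when the tail probability is very small.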



Question 2: The probability that a patient recovers from a rare blood disease is 0.4.  If 15 patients are known to have contracted the disease, what is the probability that exactly 5 fail to recover?

Answer 2: Each patient recovers with probability 0.4 and fails to recover with probability 0.6. Let X be the number of patients, out of 15, who fail to recover. Assuming the patients recover independently of each other, X follows the Binomial distribution with n=15 and p=0.6. The number of ways to choose which 5 of the 15 patients fail to recover is: $${15\choose 5} = 3003$$.

So the probability we would like to calculate is $$Pr(X=5) = 3003 \times 0.6^5 \times 0.4^{10} = 0.024486$$

We can also provide the solution in R.

dbinom(5,15,0.6)

[1] 0.02448564
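The hand calculation can be reproduced term by term, and the same event can be counted from the recovery side: exactly 5 failures out of 15 is the same as exactly 10 recoveries. The sketch below assumes independent patients, and its variable names are illustrative only.

```r
# Parameters from Question 2
n      <- 15    # number of patients
p_fail <- 0.6   # probability that a patient fails to recover
k_fail <- 5     # number of failures we are interested in

# Binomial formula written out by hand
p_manual <- choose(n, k_fail) * p_fail^k_fail * (1 - p_fail)^(n - k_fail)

# Same probability, counting failures (success probability 0.6)
p_failures <- dbinom(k_fail, size = n, prob = p_fail)

# Same probability, counting recoveries (success probability 0.4)
p_recoveries <- dbinom(n - k_fail, size = n, prob = 1 - p_fail)

print(c(p_manual, p_failures, p_recoveries))
```

The three numbers should coincide, which is a quick way to confirm that the "success" probability passed to `dbinom` matches the event being counted.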


Question 3: A secretary makes 2 errors per page on average. What is the probability that on the next 2 pages she makes not more than 3 errors?

Answer 3: We can assume that the number of mistakes per page follows the Poisson distribution with parameter λ=2. If the mistakes on different pages are independent, the number of mistakes on 2 pages follows a Poisson distribution with parameter λ=4. The probability of making no more than 3 errors is:

$$Pr(X \leq 3) = \sum_{k=0}^{3} \frac{4^k e^{-4}}{k!}=0.43347$$

We can also provide the solution in R.

ppois(3,4)

[1] 0.4334701
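The cumulative probability returned by `ppois` is the sum of the Poisson probabilities for k = 0, 1, 2, 3. A minimal sketch of that equivalence follows, assuming errors on the two pages are independent so that the rates add; the variable names are illustrative.

```r
# Parameters from Question 3
lambda_per_page <- 2
pages           <- 2
lambda          <- lambda_per_page * pages   # rate for 2 pages: 4

# Cumulative probability Pr(X <= 3) from the built-in CDF
p_cdf <- ppois(3, lambda)

# Same value as a sum of point probabilities for k = 0, 1, 2, 3
p_sum <- sum(dpois(0:3, lambda))

# Same value from the Poisson formula written out by hand
k        <- 0:3
p_manual <- exp(-lambda) * sum(lambda^k / factorial(k))

print(c(p_cdf, p_sum, p_manual))
```

Note that the term for k = 0 (no errors at all) is part of the event "no more than 3 errors", so it has to be included in the sum.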



Question 4: A homeowner plants 6 flower bulbs selected at random from a box containing 5 tulips and 4 roses.  What is the probability that he planted 2 roses and 4 tulips?

Answer 4: The probability is given by: $$\frac{ {4 \choose 2} \times {5 \choose 4} }{9 \choose 6 }=0.3571429$$

We can also provide the solution in R.

dhyper(x=4,  m=5, n=4, k=6)
[1] 0.3571429
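In `dhyper`, `x` is the number of items of the first type drawn, `m` is how many of that type are in the box, `n` is how many of the other type, and `k` is the number drawn. The call above treats tulips as the first type. The sketch below checks it against the counting formula and against the mirrored call that treats roses as the first type; the variable names are illustrative.

```r
# Counting formula: choose 2 of 4 roses and 4 of 5 tulips, out of all 6-bulb draws from 9
p_manual <- choose(4, 2) * choose(5, 4) / choose(9, 6)

# Tulips as the first type: 4 tulips drawn, 5 tulips and 4 roses in the box
p_tulips <- dhyper(x = 4, m = 5, n = 4, k = 6)

# Roses as the first type: 2 roses drawn, 4 roses and 5 tulips in the box
p_roses <- dhyper(x = 2, m = 4, n = 5, k = 6)

print(c(p_manual, p_tulips, p_roses))
```

Both parameterizations describe the same event, so they should return the same probability as the formula.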


