
“If at first you don’t succeed, try two or more times so that your failure is statistically significant”

## Problem Statement

An early-stage startup in Germany has been working on a redesign of its landing page. The team believes the new design will increase the number of people who click through and join the site. They have been testing the changes for a few weeks and now want to measure the impact: is the observed increase statistically significant, or could it be due to random chance?

## Aims and Objectives

1. Analyze the conversion rates for each of the four groups, formed by combining the new or old landing page design with the new or old images.
2. Determine whether the observed increases can be explained by randomness.
3. Recommend which version of the website the company should use.

## Data Summary

```
## # A tibble: 6 x 3
##   treatment new_images converted
##   <chr>     <chr>          <dbl>
## 1 yes       yes                0
## 2 yes       yes                0
## 3 yes       yes                0
## 4 yes       no                 0
## 5 no        yes                0
## 6 yes       no                 0
```
| Data summary | Value |
|---|---|
| Name | df |
| Number of rows | 40484 |
| Number of columns | 3 |
| Column type frequency: | |
| character | 2 |
| numeric | 1 |
| Group variables | None |

Variable type: character

| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| treatment | 0 | 1 | 2 | 3 | 0 | 2 | 0 |
| new_images | 0 | 1 | 2 | 3 | 0 | 2 | 0 |

Variable type: numeric

| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| converted | 0 | 1 | 0.11 | 0.32 | 0 | 0 | 0 | 0 | 1 | ▇▁▁▁▁ |

• `treatment` – “yes” if the user saw the new version of the landing page, “no” otherwise.
• `new_images` – “yes” if the page used a new set of images, “no” otherwise.
• `converted` – 1 if the user joined the site, 0 otherwise.

There are 40484 users who visited the site, and the dataset has no missing values. The two “yes”/“no” columns split these users into four groups:

1. `Group A`: users with “yes” in both columns, i.e. the new version of the website with the new set of images.
2. `Group B`: users with “yes” in column one and “no” in column two, i.e. the new version of the website with the old set of images.
3. `Group C`: users with “no” in column one and “yes” in column two, i.e. the old version of the website with the new set of images.
4. `Group D`: the control group, users with “no” in both columns, i.e. the old version of the website with the old set of images.
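
The grouping described above can be sketched in base R. This is a minimal illustration on a few made-up rows, not the author's code: the data frame `visits`, the lookup `group_names`, and the column `group_letter` are hypothetical names, and only the three column names come from the data summary.

```r
# Toy rows only (hypothetical), mirroring the three columns in the data summary.
visits <- data.frame(
  treatment  = c("yes", "yes", "no", "no"),
  new_images = c("yes", "no", "yes", "no"),
  converted  = c(0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Count missing values per column (the post reports none in the real data).
print(colSums(is.na(visits)))

# Combine the two flags into a single label such as "yes-no".
visits$group <- paste(visits$treatment, visits$new_images, sep = "-")

# Map each label to the group letters used in the text.
group_names <- c("yes-yes" = "A", "yes-no" = "B", "no-yes" = "C", "no-no" = "D")
visits$group_letter <- unname(group_names[visits$group])

print(visits)
```

The `treatment-new_images` label order matches the row names used in the tables of the Analysis section, so `yes-no` corresponds to Group B and `no-no` to the control Group D.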

### Null Hypothesis

Any increase in conversions is due to chance: there is no statistically significant difference in conversion rate between the four groups.

### Alternative Hypothesis

There is a statistically significant difference in conversion rate between at least two of the four groups.
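
Stated formally, and writing $p_A, p_B, p_C, p_D$ for the true conversion rates of Groups A to D (notation introduced here only for illustration), the test compares a null hypothesis of equal rates against the alternative that at least one pair of rates differs:

$$
H_0:\; p_A = p_B = p_C = p_D
\qquad\text{versus}\qquad
H_1:\; p_i \neq p_j \ \text{ for at least one pair } i \neq j .
$$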

## A/B Test

A/B testing, also called bucket testing, is a statistical methodology for comparing two versions of a web page or mobile app to see which one converts more users. The version with the highest conversion rate wins, provided the difference is larger than chance alone would explain. Here the same idea is applied to four variants to determine which combination of landing page design and images performs best.

## Analysis

### Absolute counts

```
##          converted
## group         0     1   Sum
##   no-no    9037  1084 10121
##   no-yes   8982  1139 10121
##   yes-no   8906  1215 10121
##   yes-yes  8970  1151 10121
##   Sum     35895  4589 40484
```

### Conversion Rate (relative proportion)

```
##          converted
## group         0     1   Sum
##   no-no   0.893 0.107 1.000
##   no-yes  0.887 0.113 1.000
##   yes-no  0.880 0.120 1.000
##   yes-yes 0.886 0.114 1.000
```
• `Group A` (new landing page design and new images) has a conversion rate of 11.4%.
• `Group B` (new landing page design and old images) has a conversion rate of 12.0%.
• `Group C` (old landing page design and new images) has a conversion rate of 11.3%.
• `Group D` (old landing page design and old images) has a conversion rate of 10.7%.

The highest conversion rate belongs to Group B, i.e. the new landing page design with the old images, and the lowest to Group D, i.e. the old landing page design with the old images.
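
The two tables above can be rebuilt from the published counts alone. The following is a sketch under that assumption, not the author's original code (which presumably tabulated the raw data frame); the object names `counts`, `abs_tab`, `rate_tab`, and `best` are illustrative.

```r
# Counts copied from the absolute table above: rows are treatment-new_images
# groups, columns are converted = 0 / 1.
counts <- as.table(matrix(
  c(9037, 1084,
    8982, 1139,
    8906, 1215,
    8970, 1151),
  ncol = 2, byrow = TRUE,
  dimnames = list(group     = c("no-no", "no-yes", "yes-no", "yes-yes"),
                  converted = c("0", "1"))
))

# Absolute counts with row and column totals.
abs_tab <- addmargins(counts)
print(abs_tab)

# Row-wise proportions: the "1" column is the conversion rate of each group.
rate_tab <- round(prop.table(counts, margin = 1), 3)
print(addmargins(rate_tab, margin = 2))

# Group with the highest conversion rate.
best <- names(which.max(rate_tab[, "1"]))
print(best)
```

Using `margin = 1` in `prop.table()` normalises within each group, which is what makes the `1` column readable as a per-group conversion rate.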

### Pearson’s chi-squared test of proportions

```
##
##  4-sample test for equality of proportions without continuity
##  correction
##
## data:  prop
## X-squared = 8.5261, df = 3, p-value = 0.0363
## alternative hypothesis: two.sided
## sample estimates:
##    prop 1    prop 2    prop 3    prop 4
## 0.8928960 0.8874617 0.8799526 0.8862761
```

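Pearson’s chi-squared test of proportions gives a p-value of 0.0363, which is below the 0.05 significance level. The null hypothesis of equal proportions across the four groups is therefore rejected: at least one group differs from the others, and the variation in conversion rate is unlikely to be due to chance alone. Note that the sample estimates printed above are the proportions of users who did not convert (the `0` column of the table), which does not change the test statistic. The test on its own does not identify which groups differ.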
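
The test can be reproduced from the published counts with `prop.test()` from base R's `stats` package. This is a sketch rather than the author's exact call (the original output shows a table object named `prop` being passed in); the names `converted`, `visitors`, `res`, and `alpha` are illustrative.

```r
# Converted users and total visitors per group, copied from the counts table.
converted <- c("no-no" = 1084, "no-yes" = 1139, "yes-no" = 1215, "yes-yes" = 1151)
visitors  <- rep(10121, 4)

# Test for equality of the four conversion rates.
res <- prop.test(x = converted, n = visitors)
print(res)

# Decision at the conventional 5% level (an assumed threshold).
alpha <- 0.05
reject_null <- res$p.value < alpha
print(reject_null)
```

Because the converted counts are passed as successes here, the sample estimates are conversion rates rather than the non-conversion proportions shown in the output above. The chi-squared statistic, degrees of freedom, and p-value are the same either way.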

## Recommendations

The redesign was successful: the two highest conversion rates came from the new landing page design. The company should adopt the new design but retain the old images (Group B), the combination with the highest conversion rate at 12.0%.
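
Since the four-sample test only shows that some difference exists, one possible follow-up before committing to a variant is a set of pairwise comparisons with a multiple-testing adjustment. The sketch below uses `pairwise.prop.test()` from base R's `stats` package with the published counts; it is not part of the original analysis, and the Holm adjustment and the names `converted`, `visitors`, and `pw` are assumptions made for illustration.

```r
# Converted users per group (letters as defined in the Data Summary) and
# visitors per group, copied from the counts table.
converted <- c(A = 1151, B = 1215, C = 1139, D = 1084)
visitors  <- rep(10121, 4)

# All pairwise two-sample proportion tests, with Holm-adjusted p-values.
pw <- pairwise.prop.test(x = converted, n = visitors, p.adjust.method = "holm")
print(pw)
```

The adjusted p-value for each pair of groups appears in the lower triangle of `pw$p.value`; pairs below the chosen significance level are the ones where the difference in conversion rate is unlikely to be chance.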
