(This article was first published on **Florian Teschner**, and kindly contributed to R-bloggers)

Like most bloggers, I check my analytics stats on a regular basis. What I do not really look at is social shares. For most blog posts, traffic follows a clear pattern: two days of increased traffic, followed by a steady decrease to the base traffic volume.

The amplitude varies massively depending on how interesting/viral the post was. In the long term, however, it is not this virality that drives the majority of the traffic but the search ranking of individual posts, as 80% of the base traffic comes through organic search results.

There is plenty of discussion on the value of a share/like with very different take-aways.

I thought it would be great to analyse whether social shares actually increase my traffic.

In addition, I realized that it is pretty easy to get the number of Facebook and LinkedIn shares for each of my blog posts from r-bloggers. (Please find the script at the end of the post.)

For the analysis, I scraped the social share numbers and merged them with my Google Analytics stats.

In order to compare the traffic between blog posts, I normalise the total traffic per post by the number of days the post has been online.
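The normalisation can be sketched as follows. This is a minimal Python illustration, not the script used for the post; the function name `normalized_page_views`, the post titles, the view counts and the dates are all made up for the example.

```python
from datetime import date


def normalized_page_views(total_views, published, as_of):
    """Return the average number of page views per day online."""
    days_online = (as_of - published).days
    if days_online <= 0:
        raise ValueError("post must have been online for at least one day")
    return total_views / days_online


# Hypothetical posts: (title, total page views, publication date).
posts = [
    ("post-a", 1200, date(2016, 1, 1)),
    ("post-b", 300, date(2016, 3, 1)),
]
as_of = date(2016, 3, 31)  # hypothetical day on which the stats were pulled

normalized = {
    title: normalized_page_views(views, published, as_of)
    for title, views, published in posts
}

for title, value in normalized.items():
    print(f"{title}: {value:.2f} views per day")
```

Dividing by days online puts an old post with many accumulated views and a recent post with few views on the same per-day scale.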

To get a first picture of the data we can plot the normalized traffic over the number of shares (both by blog post).

Next, we can analyse how the variables are correlated. I included the total pageviews as a reference.

We see that LinkedIn and Facebook shares are positively correlated with r=.56 and that the correlation between shares and normalized pageviews is the same.

When looking at the total page views, we see that they have a much lower correlation with shares from both Facebook and LinkedIn.
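For readers who want to reproduce this kind of comparison, the Pearson correlation coefficient can be computed as in the sketch below. It is a plain-Python illustration under assumed inputs: the three lists are invented numbers, not the data behind the values reported above, and `pearson_r` is a helper name chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-post values, one entry per blog post.
facebook_shares = [10, 25, 3, 40, 18]
linkedin_shares = [2, 6, 1, 7, 5]
normalized_views = [2.1, 3.4, 1.2, 6.0, 2.9]

print("facebook vs linkedin:", round(pearson_r(facebook_shares, linkedin_shares), 2))
print("facebook vs views:   ", round(pearson_r(facebook_shares, normalized_views), 2))
print("linkedin vs views:   ", round(pearson_r(linkedin_shares, normalized_views), 2))
```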

To determine the “value” of social media to drive traffic volume, we can regress the normalized traffic numbers on the number of social shares.

Dependent variable: normalizedPageViews

| | (1) | (2) | (3) |
|---|---|---|---|
| shares | 0.043\*\* | 0.043\*\* | 0.026 |
| | (0.017) | (0.017) | (0.022) |
| linkedIn1 | 0.138 | 0.138 | 0.101 |
| | (0.126) | (0.126) | (0.129) |
| DaysOnline | | | -0.008 |
| | | | (0.006) |
| Constant | 1.462 | 1.462 | 5.744 |
| | (1.114) | (1.114) | (3.571) |
| Observations | 33 | 33 | 33 |
| R² | 0.341 | 0.341 | 0.375 |
| Adjusted R² | 0.297 | 0.297 | 0.310 |
| Residual Std. Error | 5.042 (df = 30) | 5.042 (df = 30) | 4.994 (df = 29) |
| F Statistic | 7.755\*\*\* (df = 2; 30) | 7.755\*\*\* (df = 2; 30) | 5.802\*\*\* (df = 3; 29) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01. Standard errors in parentheses.

As in the correlation plot, we see that highly shared posts have more daily visits. If you take the estimates at face value, you could also say that a LinkedIn share is worth about three times a Facebook share in terms of additional traffic (0.138 vs. 0.043), although the LinkedIn coefficient is not statistically significant.
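To make the face-value reading concrete, the point estimates from column (1) of the regression table can be plugged into the fitted equation. The sketch below assumes, as the text does, that both predictors are counts of shares; the example post with 100 Facebook and 10 LinkedIn shares is hypothetical, and standard errors are ignored.

```python
# Point estimates copied from column (1) of the regression table.
INTERCEPT = 1.462
COEF_SHARES = 0.043    # Facebook shares
COEF_LINKEDIN = 0.138  # LinkedIn shares (not statistically significant)


def predicted_daily_views(shares, linkedin):
    """Predicted normalized page views for a post, taken at face value."""
    return INTERCEPT + COEF_SHARES * shares + COEF_LINKEDIN * linkedin


# How many Facebook shares one LinkedIn share is "worth" on these estimates.
ratio = COEF_LINKEDIN / COEF_SHARES
print(f"LinkedIn / Facebook coefficient ratio: {ratio:.1f}")

# Hypothetical post: 100 Facebook shares and 10 LinkedIn shares.
print(f"Predicted daily views: {predicted_daily_views(100, 10):.2f}")
```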

In the last regression I also included a variable (DaysOnline) to capture my learning effect. The longer a post has been online, the lower its daily traffic (e.g. posts published when I started blogging).
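A regression of this shape can be fitted with ordinary least squares. The following is a self-contained Python sketch that solves the normal equations directly; it is not the script used for the post, the helper names `solve_linear_system` and `fit_ols` are chosen for this example, and the data are synthetic values generated from known coefficients so that the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(predictors, response):
    """Return [intercept, b1, b2, ...] minimising the squared error."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data: (facebook shares, linkedin shares) per post, with the
# response generated exactly from y = 1 + 0.05 * facebook + 0.2 * linkedin.
predictors = [(10, 2), (25, 6), (3, 1), (40, 7), (18, 5)]
response = [1 + 0.05 * fb + 0.2 * li for fb, li in predictors]

coefs = fit_ols(predictors, response)
print([round(c, 3) for c in coefs])
```

Because the synthetic response contains no noise, the fit recovers the generating coefficients. With real data, a statistics package that also reports standard errors and p-values, like the table above, is the better choice.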

While writing this post, I realized that the script can also be used to analyse the following on r-bloggers:

- which topics are highly shared
- which authors are popular
- how social sharing changed over time

Have fun!
