Hillary Clinton’s Biggest 2016 Rival: Herself

[This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers].

In a recent post I noted that despite Bernie Sanders doing better in many important indicators, Obama 2008 received 3x more media coverage than Sanders 2016.

Reasonably, a reader of my blog noted that not all coverage was equal, that a presidential hopeful might be happier having no coverage than negative coverage. So I decided to do some textual analysis of the headlines comparing Sanders and Clinton in 2016 and Obama and Clinton in 2008.

I looked at 4200 headlines mentioning either Obama in 2007/08, Sanders in 2015/16, or Clinton in 2007/08 or 2015/16, scraped from major news sources: Google News, Yahoo News, Fox, New York Times, Huffington Post, and NPR (from January 1st, 2007 to January 2008 and from January 1st, 2015 to January 2016).
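The selection described above can be sketched as a small filter that assigns each headline to a candidate and campaign. This is only an illustration: the sample rows, the `CANDIDATES` table, the function names, and the choice to end each window on January 31st are all assumptions, since the post gives only the month in which each window closes.

```python
from datetime import date

# Hypothetical sample rows: (publication date, headline text).
HEADLINES = [
    (date(2007, 6, 3), "Obama and Clinton spar at debate"),
    (date(2015, 9, 14), "Sanders draws huge crowd in Iowa"),
    (date(2015, 3, 10), "Clinton defends use of private email server"),
    (date(2011, 5, 2), "Obama announces raid"),  # outside both windows
]

# Candidate name -> campaign labels whose headlines should be kept.
CANDIDATES = {
    "obama": {"2008"},
    "sanders": {"2016"},
    "clinton": {"2008", "2016"},
}

def campaign_for(day):
    """Return '2008' or '2016' if the date falls in a study window, else None."""
    # Assumption: each window runs through the end of January.
    if date(2007, 1, 1) <= day <= date(2008, 1, 31):
        return "2008"
    if date(2015, 1, 1) <= day <= date(2016, 1, 31):
        return "2016"
    return None

def group_headlines(rows):
    """Map 'candidate campaign' labels to the headlines that mention them."""
    groups = {}
    for day, text in rows:
        campaign = campaign_for(day)
        if campaign is None:
            continue
        lowered = text.lower()
        for name, campaigns in CANDIDATES.items():
            if name in lowered and campaign in campaigns:
                groups.setdefault(f"{name} {campaign}", []).append(text)
    return groups

groups = group_headlines(HEADLINES)
```

A headline that names two candidates, such as the first sample row, lands in both groups, which matches the "mentioning either" wording above.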

First I constructed word clouds for the Clinton and Sanders race.
Figure 1: Hillary Clinton’s 2015/2016 headline word cloud. Excluding “hillary” and “clinton” as terms when constructing the cloud.
Figure 2: Bernie Sanders headline word cloud. Excluding “bernie” when constructing the cloud.
Comparing Figure 1 and Figure 2, there appear to be some pretty significant differences. First off, the most frequent term in Figure 2 is “Clinton”, followed by a lot of general terms. “Black” appears for the black vote, since there is some concern that Bernie can’t win it, perhaps combined with some high-profile black political activists endorsing him.

Figure 1, though, is a world of difference. Almost every major word relates to a scandal: email and emails, Benghazi, private server, and foundation. These reference the email scandal, in which Clinton set up a potentially illegal private server to house her official emails while Secretary of State; Benghazi, the affair in which diplomats died as a result of terrorist action that many have blamed on Hillary Clinton; and the alleged unethical misuse of Clinton Foundation funds as a slush fund for the Clinton family’s luxurious tastes. Interestingly, “Bruni”, as in Frank Bruni, a New York Times reporter who has taken some heat for his critical reporting on Hillary Clinton, also appears in the cloud.
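A word cloud is essentially a picture of term frequencies, so the counts behind clouds like Figures 1 and 2 can be sketched in a few lines. The headlines, the tiny stop-word list, and the `term_counts` helper below are hypothetical; the post does not say which tools or stop words were actually used.

```python
import re
from collections import Counter

# Hypothetical headlines; the real analysis used thousands of scraped ones.
headlines = [
    "Hillary Clinton emails released by State Department",
    "Clinton private server draws new questions",
    "New emails raise questions for Hillary Clinton",
]

# Small illustrative stop-word list (an assumption, not the author's list).
STOP_WORDS = {"by", "for", "the", "a", "of", "new"}

def term_counts(texts, excluded):
    """Count lowercase word tokens, skipping stop words and excluded names."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z']+", text.lower()):
            if token not in STOP_WORDS and token not in excluded:
                counts[token] += 1
    return counts

# Drop the candidate's own name, as the figure captions describe.
counts = term_counts(headlines, excluded={"hillary", "clinton"})
print(counts.most_common(3))
```

Excluding the candidate's own name matters because it would otherwise dominate every cloud and hide the terms that actually distinguish the coverage.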

But is this really so bad? How do these word clouds compare with those of 2007/2008?

Figure 3: The word cloud from 2007/2008 for Hillary Clinton excluding “hillary” and “clinton”.
Figure 4: The word cloud from 2007/2008 for Barack Obama excluding “obama”.
From Figures 2, 3, and 4 we can see a significant and substantive difference from Figure 1. In those figures the most newsworthy thing to report is the rivalry in the primary race. All other issues are dwarfed. In Figure 1, scandals and criticism of Hillary Clinton abound. Looking at these word clouds, I suspect that the Clinton camp would be happier with the news coverage they had in the 2008 campaign than with the coverage they are getting now.

But are these word-frequency graphs really a reasonable assessment of the media? What of the overall tone of these many articles?
Figure 5: Sentiment analysis of the news coverage of Clinton 2008 and 2016 and Obama 2008 and Sanders 2016. Scales have been standardized so that a positive rating indicates higher likelihood of emotion being displayed and negative rating indicates lower likelihood of emotion being displayed.
From Figure 5 we can see that headlines mentioning Sanders score the highest on anticipation, joy, surprise, trust, and positivism. They also score the lowest on anger, fear, sadness, and negativity. Headlines mentioning Clinton in 2016 and 2008, by contrast, score the highest on anger, disgust, fear, sadness, and negativity, and the lowest on anticipation, joy, trust, and positivism.
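Standardized emotion scores of the kind plotted in Figure 5 can be sketched as a two-step calculation: count lexicon hits per headline for each emotion, then z-score each emotion across the candidate groups. The post does not name the lexicon it used. The ten categories in Figure 5 match those of the NRC Word-Emotion Association Lexicon, but that is an assumption, and the toy `LEXICON`, the sample headlines, and both helper functions below are hypothetical.

```python
from statistics import mean, pstdev

# Toy word -> emotions lexicon (hypothetical entries for illustration only).
LEXICON = {
    "scandal": {"anger", "negative"},
    "attack": {"anger", "fear", "negative"},
    "surge": {"surprise", "positive"},
    "win": {"joy", "positive"},
    "hope": {"anticipation", "joy", "positive"},
}

# Hypothetical headlines grouped by candidate and campaign.
GROUPS = {
    "Clinton 2016": ["Email scandal grows", "Rivals attack Clinton"],
    "Sanders 2016": ["Sanders surge continues", "Supporters hope for a win"],
}

def emotion_rates(headlines):
    """Average number of lexicon hits per headline, for each emotion."""
    totals = {}
    for text in headlines:
        for word in text.lower().split():
            for emotion in LEXICON.get(word, ()):
                totals[emotion] = totals.get(emotion, 0) + 1
    return {emotion: n / len(headlines) for emotion, n in totals.items()}

def standardize(rates_by_group):
    """Z-score each emotion across groups so the scales are comparable."""
    emotions = {e for rates in rates_by_group.values() for e in rates}
    scores = {group: {} for group in rates_by_group}
    for emotion in sorted(emotions):
        values = [rates.get(emotion, 0.0) for rates in rates_by_group.values()]
        mu, sigma = mean(values), pstdev(values)
        for group, rates in rates_by_group.items():
            # An emotion with no variation across groups gets a score of 0.
            z = (rates.get(emotion, 0.0) - mu) / sigma if sigma else 0.0
            scores[group][emotion] = z
    return scores

rates = {group: emotion_rates(texts) for group, texts in GROUPS.items()}
scores = standardize(rates)
```

After standardization, a positive score means a group's headlines carry more of that emotion than the average group and a negative score means less, which is how the ratings in Figure 5 are read.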

Compared with 2008, Clinton 2016 articles appear to have less anger, anticipation, joy, trust, and fear, while also having more disgust, sadness, surprise, and negativism, as well as slightly more positivism. Overall, the prospects as gauged from the emotions engendered by the media appear to be pretty bleak for Hillary Clinton.

It is interesting to note that articles about Sanders generally score similarly to articles about Obama in direction, except that Sanders outperforms Obama with higher anticipation, joy, trust, and positivism, and with lower anger, fear, sadness, and negativism. In only one indicator does Obama do better than Sanders, and that is disgust. The largest emotional difference between Obama 2008 and Sanders 2016 is that Obama articles scored the lowest on surprise while Sanders articles scored the highest.

Overall, we must conclude that, at least in terms of the emotional tone of articles if not the amount of coverage, Sanders is doing significantly better than Hillary and even better than Obama was at this time in the 2008 presidential race.
