In this post, we consider some fairly recent studies conducted by folks at the Washington Post and the Pew Research Center that investigate the relationship between political ideology — as estimated by voting behavior/DW-Nominate scores (Poole and Rosenthal 1985) — and social media usage among lawmakers in the US Congress.
- More conservative and more liberal lawmakers tend to have more Facebook followers than moderate lawmakers (Hughes and Lam 2017).
- Political ideology scores derived from the news sources lawmakers share via Twitter (e.g., articles from nytimes.com, foxnews.com, etc.) strongly correlate with DW-Nominate scores based on voting behavior (Eady et al. 2018).
- Moderate members of Congress are more likely to share local (as opposed to national) news sources (via Facebook) than more conservative/liberal members of Congress (Van Kessel and Hughes 2018).
So, here we demonstrate an R-based/Twitter-based framework for replicating/approximating some of these findings (albeit with less methodological rigor), with a focus on the 115th US Senate. Results presented here nicely align with previous findings.
library(Rvoteview) # devtools::install_github("voteview/Rvoteview")
library(tidyverse)
library(ggthemes)
library(ggrepel) # devtools::install_github("slowkow/ggrepel")
library(DT)
library(ggridges)
Congressional data sources
DW-Nominate scores for every lawmaker in the history of the US Congress, as well as the details of every Congressional roll call, are made available by the folks at VoteView in a variety of formats, including via the R package Rvoteview (Poole and Rosenthal 1985; Boche et al. 2018). The package ships with a host of search functionality; here, we use the member_search function to acquire Senator details & DW-Nominate scores for the 115th US Senate.
sen115 <- Rvoteview::member_search(chamber = 'Senate', congress = 115)
The plot below summarizes political ideologies in the 115th Senate as estimated by DW-Nominate D1 & D2 scores; labeled are some of the more ideologically extreme/well-known/moderate Senators. Focusing on D1, then, Elizabeth Warren votes most progressively and Rand Paul the most conservatively.
sens <- c('Flake', 'Warren', 'Collins', 'Paul', 'Manchin', 'Merkley', 'Harris',
          'Murkowski', 'Udall', 'Jones', 'Shelby', 'Sanders', 'Cruz', 'Rubio')

sen115 %>%
  ggplot(aes(x = nominate_dim1, y = nominate_dim2, label = bioname)) +
  annotate("path",
           x = cos(seq(0, 2*pi, length.out = 300)),
           y = sin(seq(0, 2*pi, length.out = 300)),
           color = 'gray', size = .25) +
  geom_point(aes(color = as.factor(party_code)), size = 2, shape = 17) +
  geom_text_repel(
    data = subset(sen115, bioname %in% toupper(sens)),
    nudge_y = 0.025,
    segment.color = "grey50",
    direction = "y",
    hjust = 0,
    size = 2) +
  scale_color_stata() +
  theme_fivethirtyeight() +
  theme(legend.position = 'none',
        plot.title = element_text(size = 12),
        axis.title = element_text()) +
  xlab('DW-Nominate D1') + ylab('DW-Nominate D2') +
  labs(title = "DW-Nominate Plot for the 115th US Senate")
For additional details about the 115th Congress, we access a collection of resources made available at CivilServiceUSA, which includes information regarding age, race, and religion, as well as Twitter & Facebook handles (and a host of other variables).
library(jsonlite)

sen_url <- 'https://raw.githubusercontent.com/CivilServiceUSA/us-senate/master/us-senate/data/us-senate.json'

senate_dets <- jsonlite::fromJSON(url(sen_url)) %>%
  # Swap in the handle used for Jeff Flake in the Twitter query
  mutate(twitter_handle = ifelse(twitter_handle == 'SenJeffFlake', 'JeffFlake', twitter_handle)) %>%
  # Lowercase handles so they can be matched against the Twitter data later
  mutate(twitter_handle = tolower(twitter_handle)) %>%
  rename(bioguide_id = bioguide) %>%
  # Add party codes and DW-Nominate D1 scores from VoteView
  left_join(sen115 %>%
              filter(congress == 115) %>%
              select(bioguide_id, party_code, nominate_dim1),
            by = 'bioguide_id') %>%
  # Treat Independents as Democrats
  mutate(party = ifelse(party == 'independent', 'democrat', party))
The table below summarizes some of the details/info from this data set for a sample of Senators in the 115th Congress.
set.seed(199)
senate_dets %>%
  select(last_name, twitter_handle, date_of_birth, class, religion) %>%
  sample_n(5) %>%
  DT::datatable(options = list(pageLength = 5, dom = 't', scrollX = TRUE),
                rownames = FALSE, width = "100%", escape = FALSE)
Scraping tweets via rtweet
With Twitter handles in tow, we can now gather some tweets. There are different paradigms for working with/scraping tweets using R; here, we provide a simple walk-through using the rtweet package, which has a lovely online vignette.
The rtweet::get_timeline function is a super simple function for gathering the n most recent tweets for a given user (or set of users) based on Twitter handles; below we gather the 2,000 most recent tweets for each Senator.
senate_tweets <- rtweet::get_timeline(senate_dets$twitter_handle, n = 2000)
I am not exactly sure about query limits; the above query returns ~200K tweets quickly and problem-free. Example output from the Twitter scrape:
set.seed(999)
senate_tweets %>%
  select(created_at, screen_name, text) %>% # followers_count,
  sample_n(5) %>%
  DT::datatable(options = list(pageLength = 5, dom = 't', scrollX = TRUE),
                rownames = FALSE, width = "100%", escape = FALSE)
The plot below summarizes the number of tweets returned from our Twitter query by date of creation. So, most tweets have been generated in the last couple of years; older tweets are presumably tweets from less prolific Senate tweeters.
library(scales)

senate_tweets %>%
  # Keep only the date portion of the timestamp
  mutate(created_at = as.Date(gsub(' .*$', '', created_at))) %>%
  group_by(created_at) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = created_at, group = 1)) +
  geom_line(aes(y = n), size = .5, color = 'steelblue') +
  theme_fivethirtyeight() +
  theme(plot.title = element_text(size = 12)) +
  labs(title = "Senator tweets by date") +
  scale_x_date(labels = scales::date_format("%m-%Y"))
Twitter followers & political ideology
First, then, we take a quick look at the relationship between political ideology scores and number of Twitter followers. The results from our call to Twitter include the number of followers for each US Senator; so, we simply need to join the Twitter data with the DW-Nominate D1 scores obtained via VoteView.
senate_summary <- senate_tweets %>%
  group_by(screen_name) %>%
  # One follower count per Senator
  summarize(followers = mean(followers_count)) %>%
  rename(twitter_handle = screen_name) %>%
  mutate(twitter_handle = tolower(twitter_handle)) %>%
  left_join(senate_dets %>%
              select(bioguide_id, twitter_handle, party, party_code, nominate_dim1),
            by = 'twitter_handle') %>%
  # Drop handles with no matching Senator details
  filter(complete.cases(.))
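Because the join above is followed by a complete-cases filter, any handle that fails to match is dropped silently. A minimal sketch of how one might check for mismatches with dplyr::anti_join is given below. The two small tables and their handles (sen_a, sen_b, etc.) are made up for illustration; they stand in for the tweet summary and the Senator details.

```r
library(dplyr)

# Hypothetical toy data; handles and scores are made up for illustration.
tweet_summary <- tibble(
  twitter_handle = c("sen_a", "sen_b", "sen_c"),
  followers = c(120000, 45000, 980000)
)

senator_details <- tibble(
  twitter_handle = c("sen_a", "sen_b", "sen_d"),
  nominate_dim1 = c(-0.45, 0.12, 0.61)
)

# Handles that returned tweets but have no matching Senator record;
# these are the rows a complete-cases filter would drop after a left join.
unmatched_tweets <- anti_join(tweet_summary, senator_details, by = "twitter_handle")

# Senators whose handle returned no tweets at all.
unmatched_senators <- anti_join(senator_details, tweet_summary, by = "twitter_handle")

print(unmatched_tweets$twitter_handle)
print(unmatched_senators$twitter_handle)
```

Running the same two anti-joins on the real tables would show which Senators, if any, are missing from the summary before any plotting is done.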
A portion of our summary table is presented below:
For illustrative purposes, we treat the New England Independents who caucus with Democrats (ie, King-ME and Sanders-VT) as Democrats in the figure below.
senate_summary %>%
  ggplot(aes(nominate_dim1, log(followers), color = as.factor(party))) +
  geom_point() +
  # geom_smooth(method="lm", se=T) +
  ggthemes::scale_color_stata() +
  ggthemes::theme_fivethirtyeight() +
  theme(legend.position = "none",
        plot.title = element_text(size = 12),
        axis.title = element_text()) +
  xlab('DW-Nominate D1') + ylab('log (Twitter Followers)') +
  labs(title = "DW-Nominate scores & log (Twitter followers)")
So, as Hughes and Lam (2017) have previously demonstrated in the case of Facebook followers, more conservative and more liberal lawmakers in the Senate tend to have stronger Twitter followings in comparison to their more moderate colleagues. (Note that we do not control for constituency size, ie, state populations.)
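The U-shaped pattern described above is read off the plot by eye; one way to put a rough number on it is sketched below. This is not part of the original analysis. The data are simulated (no real Senators or follower counts), and the column names simply mirror those in senate_summary. The sketch fits a quadratic term and, as a simpler check, correlates log followers with distance from the ideological center.

```r
set.seed(42)

# Simulated stand-in for senate_summary: values are synthetic, not real Senators.
n <- 100
sim <- data.frame(nominate_dim1 = runif(n, min = -0.8, max = 0.9))
sim$log_followers <- 11 + 3 * sim$nominate_dim1^2 + rnorm(n, sd = 0.5)

# Quadratic fit: a positive squared term means followings grow toward both extremes.
fit <- lm(log_followers ~ nominate_dim1 + I(nominate_dim1^2), data = sim)
quad_coef <- coef(fit)[["I(nominate_dim1^2)"]]

# Simpler check: correlate log followers with distance from the ideological center.
extremity_cor <- cor(abs(sim$nominate_dim1), sim$log_followers)

print(round(quad_coef, 2))
print(round(extremity_cor, 2))
```

With the real data, a covariate such as log state population could be added to the model formula to address the constituency-size caveat noted above.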
So, a bit of a copycat post (for R users) demonstrating some super neat methods developed by folks at Pew Research and the Washington Post. The rtweet package is quite lovely, and facilitates a very clean interaction with Twitter’s APIs. Lots of fun to be had applying social media methodologies/analyses to the investigation of political ideology. Per usual, results presented here should be taken with a grain of salt, as our data set is relatively small. See references for more methodologically thorough approaches.
Postscript: News media ideologies
Quickly. If we flip the VSM we used to estimate the tweet-based ideology of US Senators on its head, such that each news source is represented as a vector of shared tweets by Senator, we can get an estimate of the political ideology of the news sources included in our Tweet data set. (Using more/less the same code from above.)
The plot below summarizes a two-dimensional solution. D1 seems to intuitively capture the liberal-conservative leanings of news sources. A national-local distinction seems to underlie variation along D2. See this Pew Research viz for a slightly different approach with ~comparable results (at least along D1).
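The "flipped" vector space idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the code behind the plot described above: the share counts, Senator names, and source names are all hypothetical, and the choice of classical MDS (stats::cmdscale) on distances between length-normalized vectors is one reasonable way to get a two-dimensional solution, not necessarily the one used here.

```r
# Hypothetical share counts: rows are Senators, columns are news sources.
shares <- matrix(
  c(9, 7, 0, 1, 2,
    8, 6, 1, 0, 3,
    1, 0, 9, 8, 2,
    0, 1, 7, 9, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("senator_a", "senator_b", "senator_c", "senator_d"),
    c("source_1", "source_2", "source_3", "source_4", "source_5")
  )
)

# Flip the matrix so each news source is a vector of shares by Senator.
by_source <- t(shares)

# Scale each source vector to unit length. Euclidean distance between unit
# vectors is a monotone function of cosine similarity (d^2 = 2 - 2 * cos).
unit_rows <- by_source / sqrt(rowSums(by_source^2))

# Classical MDS gives a two-dimensional layout of the news sources.
coords <- cmdscale(dist(unit_rows), k = 2)
colnames(coords) <- c("D1", "D2")

print(round(coords, 2))
```

In this toy example, sources shared mostly by the same Senators land on the same side of D1, which is the kind of structure the liberal-conservative reading of D1 relies on.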
Boche, Adam, Jeffrey B Lewis, Aaron Rudkin, and Luke Sonnet. 2018. “The New Voteview.com: Preserving and Continuing Keith Poole’s Infrastructure for Scholars, Students and Observers of Congress.” Public Choice. Springer, 1–16.
Eady, Gregory, Jan Zilinsky, Jonathan Nagler, and Joshua Tucker. 2018. “Trying to Understand How Jeff Flake Is Leaning? We Analyzed His Twitter Feed — and Were Surprised.” Washington Post, October. https://www.washingtonpost.com/news/monkey-cage/wp/2018/10/05/trying-to-understand-how-jeff-flake-is-leaning-we-analyzed-his-twitter-feed-and-were-surprised/?utm_term=.34e5b2a28490.
Hughes, Adam, and Onyi Lam. 2017. “Highly Ideological Members of Congress Have More Facebook Followers Than Moderates Do.” Pew Research Center, August. http://www.pewresearch.org/fact-tank/2017/08/21/highly-ideological-members-of-congress-have-more-facebook-followers-than-moderates-do/.
Poole, Keith T, and Howard Rosenthal. 1985. “A Spatial Model for Legislative Roll Call Analysis.” American Journal of Political Science. JSTOR, 357–84.
Van Kessel, Patrick, and Adam Hughes. 2018. “Moderates in Congress Go Local on Facebook More Than the Most Ideological Members.” Pew Research Center, July. http://www.pewresearch.org/fact-tank/2018/07/25/moderates-in-congress-go-local-on-facebook-more-than-the-most-ideological-members/.