Modeling political ideology in the 116th House
A final thought
A brief (and copycat) go at modeling roll call voting behavior in the US House of Representatives using (1) constituency demographics, (2) House member party affiliation, and (3) House member characteristics. This post is based directly ...
A super simple post that summarizes R-based methods for visual summary & collage-building using image attachments on Twitter. In the process, a bit of a photo homage to Congresswoman Xochitl Torres Small in her first year representing New Mexico’s 2nd district.
House & presidential election returns
Trump margins in 2016
Flipped House Districts from 2016 to 2018
The 31 House Democrats in Trump-supportive districts
A quick geographical perspective
The 13 House Democrats in solid Trump districts
Voting patterns in presidential elections
Voting patterns for the 31 Trump-House Dem districts
The 5 House Democrats that should probably vote against impeachment
New Mexico demographics via tidycensus
2016 Presidential Election
Presidential elections in New Mexico historically
New Mexico as bellwether?
Congressional delegation historically
New Mexico State Government
In this post, we piece together a brief political history of New Mexico using a host of data sources, including Wikipedia, the US Census, the ...
A very brief introduction
Some open source data sets
Extracting referring expressions to 45
Party-level stance towards 45
House Rep stance & 2016 presidential vote margins
Prevalence of 45 reference
In this post, we investigate how (& how often) members of the 116th House of Representatives refer to the 45th president of the United ...
Google n-gram data
Procrustes, PCA & visualizing semantic change
Detecting semantic change
I have developed a GitHub guide that demonstrates a simple workflow for sampling Google n-gram data and building historical word embeddings with the aim of investigating lexical semantic change. Here, we build on this workflow, ...
A brief intro to nmelectiondatr
CD NM-02: an overview
A look at Pearce-Xochitl precincts
Straight- & split-ticket voting in NM-02
In this post, we consider some different precinct-level perspectives on Xochitl Torres Small’s surprising win over Yvette Herrell in New Mexico’s 2nd Congressional District (NM-02) in the 2018 general ...
Building a historical, genre-based corpus
Building a Naive Bayes classifier
Model assessment & confusion matrix
In this short post, we outline a Naive Bayes (NB) approach to genre-based text classification. First, we introduce & describe a corpus derived from Google News’ RSS feed, which includes source and genre information. We then ...
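As a rough illustration of the Naive Bayes idea mentioned above, here is a minimal sketch in Python (not the R used in the post itself). It assumes a multinomial model with add-one smoothing; the function names, the toy headlines, and the genre labels are all hypothetical and are not drawn from the post's corpus.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs):
    """Collect per-genre document and token counts from (tokens, genre) pairs."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict_nb(model, tokens):
    """Return the genre with the highest log-posterior (add-one smoothing)."""
    class_counts, token_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        total = sum(token_counts[label].values())
        score = math.log(n_label / n_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are skipped
                smoothed = (token_counts[label][tok] + 1) / (total + len(vocab))
                score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data: (tokens, genre)
train = [
    ("senate vote bill".split(), "politics"),
    ("election campaign vote".split(), "politics"),
    ("team score game".split(), "sports"),
    ("coach game season".split(), "sports"),
]
model = train_nb(train)
prediction = predict_nb(model, "vote on the bill".split())
```

Working in log space avoids numerical underflow when many token probabilities are multiplied together.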
Congressional data sources
Scraping tweets via rtweet
Twitter followers & political ideology
Shared tweets as ideology
Postscript: News media ideologies
In this post, we consider some fairly recent studies conducted by folks at the Washington Post and the Pew Research Center that investigate the relationship between political ideology — as ...
Ideal points estimation
Legislators in political space
Legislation in 2D political space
Political space and marijuana
A quick geographical perspective
Political ideology in NMSL53
This is the second in a series of posts investigating voting patterns in New Mexico’s 53rd State Legislature (NMSL53). In this post, we ...
NMSL53: an overview
Attendance & party loyalty
Health care-related roll calls
Roll call details
Incorporating census data
Postscript: Visualizing congressional composition
In this post, we introduce a new R data package, nmlegisdatr, that makes available roll call data for New Mexico’s 53rd (2017-18) State Legislature (NMSL53). While ...
Concreteness ratings and the lexvarsdatr package
Context & concreteness scores
This post considers a super-clever study presented in Snefjella and Kuperman (2015), in which the authors investigate the relationship between psychological distance and geographical distance using geolocated tweets. General idea/hypothesis:
The more we perceive an event/entity ...
Age distribution profiles
This post demonstrates a simple workflow for building census-based, historical socio-demographic profiles using the R package tidycensus. The goal is to outline a reproducible method for quick visual exploration of trend data made available via the American Community Survey (ACS).
We focus ...
From text to map
Corpus search and context
LSA, MDS, and semantic space
In this post, we demonstrate some different methodologies for exploring the geographical information found in text. First, we address some of the practical issues of extracting places/place-names from an annotated corpus, and demonstrate how to (1) ...
This post outlines a fairly simple workflow from annotated corpus to topic model, with a focus on the exploratory utility of topic models. We first consider some text structures relevant to topic modeling in R, and then demonstrate some approaches to ...
Language data and the census
Languages in the US
Linguistic diversity as entropy
Locating linguistic diversity
This post investigates linguistic diversity in the United States utilizing data made available by the US Census. We consider census language classifications, and introduce a simple methodology for quantifying linguistic diversity using entropy ...
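To make the entropy idea concrete, here is a minimal sketch in Python (not the R used in the post itself). It assumes linguistic diversity is measured as the Shannon entropy of a community's distribution of speakers across languages; the function name and the speaker counts are hypothetical, not census figures.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Zero counts contribute nothing to the entropy and are skipped.
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


# Hypothetical speaker counts for two communities
monolingual = {"English": 1000}
mixed = {"English": 500, "Spanish": 250, "Navajo": 250}

h_mono = shannon_entropy(monolingual.values())
h_mixed = shannon_entropy(mixed.values())
```

Under this measure a community where everyone speaks one language scores zero, and the score rises as speakers are spread more evenly across more languages.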
Defining potential keyphrases
Corpus search for potential keyphrases
Selecting descriptive keyphrases with the tf-idf statistic
Postscript: State of the Union Addresses
This post outlines a simple framework for identifying and extracting keyphrases from component texts of a corpus. We first consider some functional characteristics of descriptive keyphrases, as ...
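The tf-idf selection step named in the headings above can be sketched as follows, in Python rather than the R used in the post itself. This assumes one common variant (relative term frequency times the natural log of N over document frequency); the post's exact weighting may differ, and the function name and candidate phrases are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score phrases per document.

    docs maps a document id to its list of candidate keyphrases.
    Returns {doc_id: {phrase: tf-idf score}}.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each phrase occur?
    doc_freq = Counter()
    for phrases in docs.values():
        doc_freq.update(set(phrases))

    scores = {}
    for doc_id, phrases in docs.items():
        counts = Counter(phrases)
        total = len(phrases)
        scores[doc_id] = {
            phrase: (count / total) * math.log(n_docs / doc_freq[phrase])
            for phrase, count in counts.items()
        }
    return scores


# Hypothetical candidate keyphrases extracted from two texts
docs = {
    "a": ["health care", "tax cut", "health care"],
    "b": ["tax cut", "border wall"],
}
scores = tf_idf(docs)
top_a = max(scores["a"], key=scores["a"].get)
```

A phrase that appears in every document gets a score of zero, so the highest-scoring phrases are those that are frequent in one text but rare across the corpus.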
KWIC & BOW
Summary and shiny
This post demonstrates the use of a simple collection of functions from my R-package corpuslingr. Functions streamline two sets of corpus linguistics tasks:
annotated corpus search of grammatical constructions and complex lexical patterns in context, and
detailed summary and ...
New Mexico & the US
A simple model
Some final notes
In this post we investigate Spanish language maintenance within Hispanic communities in the US utilizing data from the US Census. Spanish language maintenance refers to the rate at which Hispanics within a given community ...