# Tractatus Logico (Phylo)sophicus

March 1, 2019

Over the Christmas holidays, I read "Maths Meets Myths: Quantitative Approaches to Ancient Narratives," from the Springer Understanding Complex Systems collection.

The authors present their application of "hard" science techniques to datasets coming from the humanities: mostly large corpora of texts, legends and myths.

One paper in particular uses bioinformatics and phylogenetics to study the spread of a popular folk tale: Little Red Riding Hood. The story that I knew from Perrault and Grimm has patterns that are also found in African and East Asian tales.

### The Tractatus Logico-Philosophicus viewed as a phylogenetic tree

Inspired by this, I've had a look at Wittgenstein's Tractatus Logico-Philosophicus (available on Project Gutenberg), which is presented as hierarchically numbered statements and sub-statements.

We start by scraping the book into a dataframe with one row per statement:

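``````
library(rvest)

# `page` is assumed to already hold the parsed HTML of the book,
# i.e. the result of calling read_html() on the page being scraped.
root <- page %>% html_node("#root")

# Build a data frame with one row per numbered statement
df <- data.frame()
for (item in root %>% html_nodes("li")) {
  label <- item %>% html_attr("data-name")     # statement number
  content <- item %>% html_text(trim = TRUE)   # statement text

  temp <- data.frame(label, content)
  df <- rbind(df, temp)
}
``````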
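To see what the loop above expects, here is a minimal sketch run on an inline HTML fragment. The markup is a made-up stand-in for the real page (only the `#root` container and the `data-name` attribute come from the code above), and `toy_df` is a hypothetical name.

```r
library(rvest)

# Hypothetical markup mimicking the structure the scraper relies on:
# a #root container holding <li> items tagged with a data-name attribute
html <- '<div id="root"><ul>
  <li data-name="1">The world is all that is the case.</li>
  <li data-name="1.1">The world is the totality of facts, not of things.</li>
</ul></div>'

page <- read_html(html)
items <- page %>% html_node("#root") %>% html_nodes("li")

# One row per statement: its number and its text
toy_df <- data.frame(
  label = html_attr(items, "data-name"),
  content = html_text(items, trim = TRUE)
)
print(toy_df)
```

This version pulls both columns out of the node set in one go rather than growing the data frame row by row, which should give the same table.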

We then run a cluster analysis based on the distances between the rows of `df`, hoping that the hierarchical numbering of statements will yield something interesting.

We adopt the `single` linkage method, which the `hclust` documentation describes like so:

The single linkage method (which is closely related to the minimal spanning tree) adopts a ‘friends of friends’ clustering strategy

``````
clusters <- hclust(dist(df), method = "single")
``````
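To get a feel for what single linkage does with numbered statements, here is a small sketch on a handful of made-up labels. It assumes each label is read as a decimal number, which is my simplification and not necessarily what happens with the full `df`; `labels`, `positions`, `toy_clusters` and `groups` are hypothetical names.

```r
# Hypothetical statement numbers, stand-ins for the `label` column of `df`
labels <- c("1", "1.1", "1.11", "1.2", "2", "2.01", "2.1", "3")

# Assumption: treat each label as a decimal number,
# so that "1.11" sits right next to "1.1"
positions <- as.numeric(labels)
names(positions) <- labels

# Pairwise distances between statements, then single-linkage clustering
toy_clusters <- hclust(dist(positions), method = "single")

# Cut the tree into three groups and see which statements end up together
groups <- cutree(toy_clusters, k = 3)
print(groups)
```

With these labels, the statements numbered under 1, 2 and 3 land in three separate groups: each statement is chained to its nearest neighbour, and only the big jumps between top-level numbers split the tree.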

### Dendrograms galore

From these clusters, we can represent the book as dendrograms, which are used in phylogenetics to represent evolutionary splits and genetic relationships in a tree.

``````
plot(clusters, labels = clusters$labels)
``````

``````
d <- as.dendrogram(clusters)
plot(d, horiz = TRUE, type = "triangle")
``````

``````
library(ape)
plot(as.phylo(clusters), type = "fan")
``````

The diagrams above show how our clusters have correctly grouped together the hierarchical statements of the Tractatus.

Building on Mike Bostock's Tree of Life, itself helped by Jason Davies' work parsing the Newick text format (a standard for tree representations) in JavaScript, I re-implemented the above with `d3-jetpack` and ES6: https://bl.ocks.org/basilesimon/66db4338c15099f6e8d62f236db2ef2d.
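For the curious, the Newick format is just nested parentheses in plain text, and `ape` can write it out from a clustering. The sketch below uses a few made-up numeric positions rather than the real book; `positions`, `toy_clusters` and `newick` are hypothetical names, and this is not necessarily how the data for the JavaScript version was produced.

```r
library(ape)

# Hypothetical numeric positions standing in for the statement labels
positions <- c("1" = 1, "1.1" = 1.1, "1.2" = 1.2, "2" = 2, "2.1" = 2.1)
toy_clusters <- hclust(dist(positions), method = "single")

# Convert the clustering to a phylo object and serialise it as Newick text;
# with no file argument, write.tree() returns the string
newick <- write.tree(as.phylo(toy_clusters))
cat(newick, "\n")
```

The resulting string, ending in a semicolon, is the kind of input a Newick parser on the JavaScript side would read.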

I love how simple the result looks and how little we end up knowing about the book itself. The only thing I'll leave you with is the final, chapter-seven put-down of this book about language, facts and truths of the world:

What we cannot speak about we must pass over in silence.

Precisely what I didn't do in this blog post about phylogenetics and a book I never finished.
