# Tractatus Logico (Phylo)sophicus

[This article was first published on **Blog - BS**, and kindly contributed to R-bloggers.]

Over the Christmas holidays, I read “Maths Meets Myths: Quantitative Approaches to Ancient Narratives,” from the Springer *Understanding Complex Systems* collection.

The authors apply “hard” science techniques to datasets from the humanities, mostly large corpora of texts, legends and myths.

One paper in particular uses bioinformatics and phylogenetics to study the spread of a popular folk tale: Little Red Riding Hood. The story that I knew from Perrault and Grimm has patterns that are also found in African and East Asian tales.

### The Tractatus Logico-Philosophicus viewed as a phylogenetic tree

Inspired by this, I've had a look at Wittgenstein's Tractatus Logico-Philosophicus (available on Project Gutenberg), which is presented as hierarchically numbered statements and sub-statements.

We start by scraping the book into a dataframe with one row per statement:

    library(rvest)

    # Download the page and select the element with id "root"
    page <- read_html("http://www.tractatuslogico-philosophicus.com/")
    root <- page %>% html_node("#root")

    # Build one row per list item: its number (label) and its text (content)
    df <- data.frame()
    for (item in root %>% html_nodes("li")) {
      label <- item %>% html_attr("data-name")
      content <- item %>% html_text(trim = TRUE)
      temp <- data.frame(label, content)
      df <- rbind(df, temp)
    }

We then run a cluster analysis based on the distances between the rows of `df`, hoping that the hierarchical numbering of statements will yield something interesting.
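To get a feel for what such a distance looks like on hierarchical numbering, here is a minimal sketch on a handful of made-up labels (not the scraped data). It assumes the labels can be read as decimal numbers, which is my assumption rather than something the scraping step guarantees; `toy_labels`, `toy_values` and `toy_dist` are illustrative names only.

```r
# Hypothetical statement labels, mimicking the Tractatus numbering
toy_labels <- c("1", "1.1", "1.11", "2", "2.01")

# Read the labels as numbers and keep the original strings as names
toy_values <- setNames(as.numeric(toy_labels), toy_labels)

# dist() compares rows (here: one value per statement), Euclidean by default
toy_dist <- dist(toy_values)

# Inspect the pairwise distances as a square matrix
round(as.matrix(toy_dist), 2)
```

Statements that share a prefix end up very close to each other, while a jump from one top-level proposition to the next shows up as a much larger distance.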

We adopt the `single` method, described like so in the `hclust` documentation:

> The *single linkage* method (which is closely related to the minimal spanning tree) adopts a ‘friends of friends’ clustering strategy.

    clusters <- hclust(dist(df), method = "single")
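The ‘friends of friends’ idea is easier to see on a tiny example. The sketch below uses four made-up points on a line (nothing to do with the book) and compares single linkage with complete linkage, which I picked only as a contrast.

```r
# Four made-up points on a line: three close together, one far away
pts <- c(a = 1, b = 2, c = 3, d = 10)
d <- dist(pts)

single   <- hclust(d, method = "single")
complete <- hclust(d, method = "complete")

# Merge heights: single linkage joins two clusters at the distance of their
# closest pair, complete linkage at the distance of their farthest pair
single$height
complete$height
```

With single linkage the heights should come out as 1, 1 and 7: `c` joins `a` and `b` at height 1 because it is a neighbour of `b`, even though it sits 2 away from `a`. Complete linkage gives 1, 2 and 9 for the same points.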

### Dendrograms galore

From these clusters, we can represent the book as dendrograms, which are used in phylogenetics to represent evolutionary splits and genetic relationships in a tree.

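    # Classic dendrogram
    plot(clusters, labels = clusters$labels)

    # Horizontal dendrogram with triangular branches
    d <- as.dendrogram(clusters)
    plot(d, horiz = TRUE, type = "triangle")

    # Fan-shaped tree, as commonly drawn in phylogenetics
    library(ape)
    plot(as.phylo(clusters), type = "fan")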

The diagrams above show how our clusters have correctly grouped together the hierarchical statements of the *Tractatus*.
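One way to check this kind of grouping without squinting at a plot is to cut the tree and compare the groups with the leading digit of each label. This is only a sketch on made-up labels, assuming again that they can be read as numbers; I have not run it against the scraped book.

```r
# Made-up labels from two top-level propositions (not the scraped data)
toy_labels <- c("1", "1.1", "1.11", "1.12", "2", "2.01", "2.011")
toy_values <- setNames(as.numeric(toy_labels), toy_labels)

toy_clusters <- hclust(dist(toy_values), method = "single")

# Cut the tree into two groups and compare them with the leading digit
groups  <- cutree(toy_clusters, k = 2)
leading <- substr(toy_labels, 1, 1)
table(leading, groups)
```

If the clustering follows the numbering, every leading digit lands in exactly one group, so the table has a single non-zero cell per row.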

Starting from Mike Bostock's Tree of Life, and helped by Jason Davies' work on parsing the Newick text format (a standard for tree representations) in JavaScript, I re-implemented the above with `d3-jetpack` and ES6: https://bl.ocks.org/basilesimon/66db4338c15099f6e8d62f236db2ef2d.

The resulting chart is at the top of this page.
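For the curious, the Newick format is just nested parentheses. The sketch below shows one way to build such a string from an `hclust` object in base R. `hclust_to_newick` is a hypothetical helper written for this illustration, it keeps the topology only (no branch lengths) and it assumes the clustering has labels. In practice `ape::write.tree()` on the `as.phylo()` result is the more usual route.

```r
# Hypothetical helper: turn an hclust object into a Newick string
# (topology only, no branch lengths; assumes hc$labels is set)
hclust_to_newick <- function(hc) {
  subtree <- function(i) {
    # Negative entries in hc$merge are single observations,
    # positive entries point to an earlier merge step
    if (i < 0) return(hc$labels[-i])
    paste0("(", subtree(hc$merge[i, 1]), ",", subtree(hc$merge[i, 2]), ")")
  }
  # The last merge step is the root of the tree
  paste0(subtree(nrow(hc$merge)), ";")
}

# Three made-up observations, two close together and one far away
toy_values   <- c(a = 1, b = 2, c = 10)
toy_clusters <- hclust(dist(toy_values), method = "single")

newick <- hclust_to_newick(toy_clusters)
newick
```

The string ends with a semicolon and contains one pair of parentheses per merge, with `a` and `b` nested together and `c` attached at the root.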

I love how simple the result looks and how little we end up knowing about the book itself. The only thing I'll let you in on is the final, chapter-seven put-down of this book about language, facts and truths of the world:

> What we cannot speak about we must pass over in silence.

Precisely what I didn't do in this blog about phylogenetics and a book I never finished.
