Text Analysis with R for Students of Literature

(This article was first published on Pachá (Batteries Included), and kindly contributed to R-bloggers)

About the book

I obtained a copy of this book by Matthew Jockers through my university’s Springer access. You can also get a copy from Amazon.

This book is short and to the point. I strongly recommend it to anyone interested in text mining and natural language processing.

What do I like most about this book? You can download the exercises from the book’s website. I downloaded the zip file, extracted the folder, and created an RStudio project in that folder, and that was it. I could then follow the explanations without having to transcribe the code from the PDF. Amazing!

Table of contents

I couldn’t find the full table of contents on the web, so I am writing it out here:

Part I: Microanalysis
    R Basics
    First Foray into Text Analysis with R
    Accessing and Comparing Word Frequency Data
    Token Distribution Analysis
    Correlation

Part II: Mesoanalysis
    Measures of Lexical Variety
    Hapax Richness
    Do It KWIC
    Do It KWIC (Better)

Part III: Macroanalysis
    Clustering
    Classification
    Topic Modeling

