Blog Archives

Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda

January 10, 2019

In this post we will return to the Pitchfork music review data, parts of which I've analyzed in previous posts. Our goal here will be to use text mining and natural language processing (NLP) to understand linguistic signals of album quality. This type of analysis helps us understand what Pitchfork reviewers appreciate or dislike, and gives us a sense...

Multilevel Modeling Solves the Multiple Comparison Problem: An Example with R

October 31, 2018

Making multiple comparisons of group-level means is a tricky problem in statistical inference. A standard practice is to adjust the threshold for statistical significance according to the number of pairwise tests performed. For example, according to the widely known Bonferroni method, if we have 3 different groups for which we want to compare the means of a given variable, we would...

Differences in Word Use Across Music Genres in Pitchfork Album Reviews

September 22, 2018

In this post we will return to the data on Pitchfork music reviews, parts of which I've analyzed previously. The goal of this post will be to gain an understanding of distinctive words in the reviews of albums of different musical genres. This type of analysis helps us understand the musical aspects that distinguish written descriptions of the music...

Sentiment Use Across the Course of Pitchfork Music Reviews: A Tidy Text Analysis with R

June 6, 2018

In this post, we'll return to the Kaggle data containing information on Pitchfork music reviews. In a previous post, I used this dataset to cluster music genres. In the current post, I will use R and the tidytext package (and philosophy) to examine the text of the music reviews. Specifically, the goal of the analysis described in this post...

Anscombe’s Quartet: 1980’s Edition

January 7, 2018

In this post, I'll describe a fun visualization of Anscombe's quartet I whipped up recently. If you aren't familiar with Anscombe's quartet, here's a brief description from its Wikipedia entry: "Anscombe's quartet comprises four datasets that have nearly identical simple descriptive statistics, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973...

Clustering Music Genres with R

December 7, 2017

In a number of upcoming posts, I'll be analyzing an interesting dataset I found on Kaggle. The dataset contains information on 18,393 music reviews from the Pitchfork website. The data cover reviews posted between January 1999 and January 2016. I downloaded the data and did an extensive data munging exercise to turn the data into a tidy dataset for...

Sensographics and Mapping Consumer Perceptions Using PCA and FactoMineR

September 10, 2017

In the last post, we focused on the preparation of a tidy dataset describing consumer perceptions of beverages. In this post, I'll describe some analyses I've been doing of these data, in order to better understand how consumers perceive the beverage category. This type of analysis is often used in sensographics: companies that produce food products (chocolate, sauces, etc.)...

Showing Some Respect for Data Munging

August 1, 2017

In this post, I'd like to focus on data munging, i.e., the process of acquiring and arranging data (typically in a tidy manner) prior to data analysis. It's common knowledge that data scientists spend an enormous amount of time munging data, but data analysis, modeling, and visualization get most of the attention at presentations, on blogs and in the...

Analyzing Accupedo step count data in R: Part 2 – Adding weather data

March 27, 2017

In my last set of posts, I wrote about analyzing data from the Accupedo step counter app I have on my phone. In this post, I'll talk about some additional analysis I've done by merging the step counter data with weather data from another source. The website www.wunderground.com has freely available weather data for most parts of the world....

Analyzing Accupedo step count data in R

January 4, 2017

Accupedo is a great (and free!) step counting app that I’ve been using for a while now to keep track of how much I walk every day. The app measures the number of steps you take, and has some nice visualizations that allow you to see how many steps you’ve walked in the past days, weeks, months and years....
