Is ‘Yeah’ Josh and Chuck’s favorite word?

[This article was first published on Shirin's playgRound, and kindly contributed to R-bloggers].

Text mining and sentiment analysis of a Stuff You Should Know Podcast

Stuff You Should Know (or SYSK) is one of the many great podcasts from How Stuff Works. The two SYSK hosts Josh and Chuck have taught me so many fascinating things over the years, and today I want to use one of their podcasts to learn a little bit about text analysis in R.

Initially, I wanted to explore all SYSK podcasts. Unfortunately, I could find a transcript for only one episode, Earwax: Live With It, posted on March 19, 2015.

The complete R code can be found at the end of this post or as an R Markdown file on GitHub.

The podcast transcript

I copied the episode transcript from its web page and saved it as a tab-delimited text file. The file can be downloaded from GitHub.

Separating Josh and Chuck

Of course, I wouldn’t actually want to separate Josh and Chuck. But for comparison’s sake in this analysis, I am creating two separate files for lines of dialogue spoken by either Josh or Chuck. I am also keeping the combination of both for background information.
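As a minimal sketch of this step, assuming every line of dialogue starts with the speaker's name followed by a colon and a space, the split could look like this. The toy transcript and the helper speaker_lines() are made up for illustration; the code I actually used is at the end of the post.

```r
# Toy transcript in the same "Speaker: text" layout as the real file
raw <- c(
  "Josh: Hey, and welcome to the podcast.",
  "",
  "Chuck: He just shrugged.",
  "",
  "Josh: Yeah, it's Stuff You Should Know."
)

# Hypothetical helper: keep only the lines that start with a given
# speaker's name and drop the "Name: " prefix
speaker_lines <- function(transcript, speaker) {
  prefix <- paste0("^", speaker, ": ")
  sub(prefix, "", grep(prefix, transcript, value = TRUE))
}

lines_Josh  <- speaker_lines(raw, "Josh")
lines_Chuck <- speaker_lines(raw, "Chuck")
lines_bg    <- c(lines_Josh, lines_Chuck)  # both hosts combined as background

lines_Chuck
```

Empty lines and anything without a matching prefix are simply dropped, which is fine for a transcript with only two speakers.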

How emotional are Josh and Chuck?

Sentiment analysis

The first thing I want to explore is a sentiment analysis of the lines spoken by Josh and Chuck. Sentiment analysis categorizes text data into positive and negative sentiments and gives information about the emotional state or attitude of the speaker or the contents of a text.

I am using the package syuzhet for sentiment analysis.

NRC sentiments

Saif Mohammad’s NRC Emotion Lexicon is a collection of words that were manually categorized based on their association with the emotions anger, anticipation, disgust, fear, joy, sadness, surprise, and trust, and with positive and negative sentiments.

What sentiment did the podcast have?

Before I go ahead with the sentiment analysis I want to get an idea of the podcast’s words’ association with the NRC categories.

As the words and their associated emotions/sentiments show, sentiment analysis is not perfect. Most words fit their category intuitively (e.g. gross, fungus and spider in disgust), but a few strike me as really strange (why would waffle be associated with anger?). Still, the majority of categorisations make sense, so let’s go ahead with the sentiment analysis.

Does the podcast’s sentiment change over time?

Sentiment analysis for each line of dialogue produces a matrix with one column per sentiment/emotion and one row per line. Each cell counts the words in that line that are associated with the given category; if no word in the line is associated with a category, its value is 0. The rows keep the order of the original input text, so here they represent the order in which the lines were spoken in the podcast.
Because the plot would get too big with all categories, I split the data into positive and negative sentiments and emotions.
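To make the shape of that matrix concrete, here is a small base R sketch, assuming cells hold the number of matching words per line. The mini-lexicon toy_lexicon and the helper count_categories() are invented for illustration and are not the real NRC data or syuzhet's implementation.

```r
# Hypothetical mini-lexicon: each word maps to one category (NOT the NRC data)
toy_lexicon <- c(gross = "disgust", fungus = "disgust",
                 happy = "joy", clean = "joy", scary = "fear")

dialogue <- c("That fungus is gross",
              "I am happy my ears are clean",
              "Yeah")

categories <- sort(unique(toy_lexicon))

# For one line: count how many of its words fall into each category
count_categories <- function(line) {
  words <- strsplit(tolower(line), "[^a-z]+")[[1]]
  hits <- toy_lexicon[words[words %in% names(toy_lexicon)]]
  table(factor(hits, levels = categories))
}

# One row per line of dialogue (in spoken order), one column per category
sentiment_matrix <- t(sapply(dialogue, count_categories))
rownames(sentiment_matrix) <- seq_along(dialogue)
sentiment_matrix
```

A line without any lexicon word, like the lone "Yeah", ends up as a row of zeros.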

For analysing positive and negative sentiments, syuzhet implements four different methods, each of which uses a slightly different scale. But all of them assign negative values to indicate negative sentiment and positive values to indicate positive sentiments.

All of the methods rely on a precomputed lexicon of word-sentiment score associations. The emotional or sentiment valence is then computed based on the scores of the words from each line of dialogue.
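The idea behind lexicon-based scoring can be sketched in a few lines of base R. The scores in toy_scores and the helper line_valence() are made up for illustration; syuzhet's own lexicons and scoring details differ.

```r
# Hypothetical word scores (NOT one of syuzhet's lexicons)
toy_scores <- c(great = 1, fascinating = 1, love = 0.75,
                gross = -1, painful = -0.75)

# Valence of one line = sum of the scores of its words;
# words missing from the lexicon contribute nothing
line_valence <- function(line, scores = toy_scores) {
  words <- strsplit(tolower(line), "[^a-z]+")[[1]]
  sum(scores[words], na.rm = TRUE)
}

dialogue <- c("Earwax is gross but fascinating",
              "That sounds painful",
              "I love this great podcast")

valence <- vapply(dialogue, line_valence, numeric(1), USE.NAMES = FALSE)
valence
```

Negative totals indicate negative sentiment and positive totals positive sentiment; a line with one negative and one positive word of equal weight cancels out to 0.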

The upper two plots show on the x-axis the progression of dialogue over time with each point being a line of dialogue. The sentiment score on the y-axis shows the intensity of the sentiment or emotions in the respective line of dialogue, i.e. the more words in a line were associated with the given category, the higher the line’s sentiment score.

At first glance, the podcast seems more positive than negative, but we can get a better overview by scoring only the positive and negative sentiments.

In the third graph we can see quite well that the trend goes towards positive scores, meaning the podcast is overall upbeat. While there are different peaks in both positive and negative directions in Chuck’s and Josh’s lines, there is no overall bias for one being more positive (or negative) than the other.

Finally, I am looking at the sentiment percentage values to get an idea about the percentage of positive versus negative scores along the podcast’s trajectory. Here, the podcast was divided into 20 bins and the mean sentiment valence calculated for each. This last plot shows a clear trend of increasing positivity towards the end of the podcast in Chuck’s lines. Josh on the other hand doesn’t change very much over the progression of the podcast. Interesting…
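The binning itself is simple to sketch in base R. The valence scores below are random numbers used purely for illustration, and this is only an approximation of the idea, not necessarily how syuzhet computes its percentage values.

```r
set.seed(1)            # made-up valence scores, for illustration only
valence <- rnorm(200)  # one score per line of dialogue, in spoken order

n_bins <- 20

# Assign each line to one of n_bins consecutive chunks of the podcast
bin <- cut(seq_along(valence), breaks = n_bins, labels = FALSE)

# Mean valence per chunk: one value per bin, from start to end of the episode
binned_mean <- tapply(valence, bin, mean)
length(binned_mean)
```

Plotting binned_mean against the bin number then shows the trend along the episode without the noise of individual lines.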

Quantitative text analysis

Building a corpus

In text analysis, a corpus refers to a collection of documents. Here, I am using the tm package to create my corpus from the character vectors of Josh’s and Chuck’s lines. SnowballC is used for word stemming.

Before I can analyse the text data meaningfully, however, I have to do some pre-processing:

  1. Removing punctuation

    Here, I am removing all punctuation marks. Before I do that, I transform all hyphens to spaces, because the text includes some hyphenated words that would otherwise be fused together if I simply removed the hyphen with the removePunctuation function.

  2. Transforming to lower case

    R character string processing is case-sensitive, so everything will be converted to lower case.

  3. Stripping numbers

    Numbers are usually not very meaningful for text analysis (if we are not specifically interested in dates, for example), so they are removed as well.

  4. Removing stopwords

    Stopwords are collections of very common words which by themselves don’t tell us very much about the content of a text (e.g. and, but, the, however, etc.). The tm package includes a list of stopwords for the English language.

  5. Stripping whitespace

    I’m also removing superfluous whitespace around words.

  6. Stemming

    Finally, I’m stemming the words in the corpus. This means that words with a common root are shortened to this root. Even though stemming algorithms are not perfect, they allow us to compare conjugated words from the same origin.
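Steps 1 to 5 above can be sketched for a single line in base R. This is only an illustration of the order of operations, not the tm pipeline I used: clean_line() and the tiny toy_stopwords list are made up, and stemming (step 6) is left out because it needs a stemmer such as the one in SnowballC.

```r
# A tiny hand-picked stopword list for illustration (tm ships a much longer one)
toy_stopwords <- c("the", "and", "but", "a", "in", "is", "it", "of")

clean_line <- function(line, stopwords = toy_stopwords) {
  line <- gsub("-", " ", line, fixed = TRUE)    # 1. hyphens become spaces ...
  line <- gsub("[[:punct:]]", "", line)         #    ... then drop punctuation
  line <- tolower(line)                         # 2. lower case
  line <- gsub("[[:digit:]]", "", line)         # 3. strip numbers
  words <- strsplit(line, "[[:space:]]+")[[1]]
  words <- words[!words %in% c(stopwords, "")]  # 4. remove stopwords
  paste(words, collapse = " ")                  # 5. single spaces only
}

clean_line("The cotton-swab is, in 9 of 10 cases, a BAD idea!")
# returns "cotton swab cases bad idea"
```

Replacing the hyphen before removing punctuation is what keeps "cotton" and "swab" as two words instead of "cottonswab".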

Creating the Document Term Matrix

The document term matrix (DTM) lists the number of occurrences of each word per document in the corpus. Here, each document in the corpus represents one line of dialogue from the original transcript.

By restricting the DTM to words with a minimum number of letters and an occurrence in at least a minimum number of lines of dialogue (cutoff), we exclude less specific terms.
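As a rough base R sketch of what a DTM and this kind of filtering look like (tm does this for you; the toy documents and the two thresholds below are invented and are not the settings I used):

```r
# Four toy "lines of dialogue", already pre-processed
docs <- c("ear wax ear", "yeah ear", "yeah wax gland", "ok yeah")

tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# Rows are documents (lines of dialogue), columns are terms, cells are counts
dtm <- t(sapply(tokens, function(words) table(factor(words, levels = vocab))))

min_letters <- 3  # hypothetical thresholds, for illustration only
min_docs <- 2

# Keep terms that are long enough and occur in enough different lines
keep <- nchar(colnames(dtm)) >= min_letters & colSums(dtm > 0) >= min_docs
dtm_filtered <- dtm[, keep, drop = FALSE]
dtm_filtered
```

Here "ok" is dropped for being too short, and "gland" because it appears in only one line.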

Is “Yeah” Josh and Chuck’s favorite word?

Most frequent words

By summing up the occurrences of each word over all documents we get the word count frequencies.
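With a count matrix in hand, this is a one-liner. The small matrix below is made up for illustration, with documents in rows and terms in columns.

```r
# Toy document term matrix: 3 lines of dialogue x 3 terms
dtm <- rbind(c(ear = 2, wax = 1, yeah = 0),
             c(ear = 1, wax = 0, yeah = 1),
             c(ear = 0, wax = 1, yeah = 3))

# Total count of each word over all documents, most frequent first
word_freq <- sort(colSums(dtm), decreasing = TRUE)
word_freq
```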

When accounting for stem words and the cutoff I set for the DTM to evaluate, Josh and Chuck spoke roughly the same number of words (Josh: 880, Chuck: 865) and had almost the same number of dialogue lines (Josh: 202, Chuck: 193). So, good job on neither one dominating the discussion. 😉

The left-hand plot shows the number of words spoken per line of dialogue. The background barplot shows the mean number of words spoken per line; the boxplot shows all individual data points (each point represents one line of dialogue and its corresponding word count). While the total numbers of words and of dialogue lines were basically the same, Chuck’s lines deviated more strongly from the median, with a few very long lines. Josh, on the other hand, seems to have spoken lines of more consistent length.

The right-hand plot shows the most common words and how often they were used overall (red) and by Josh and Chuck separately (green and blue). The most frequent words include (not surprisingly) “ear” and “earwax”, but funnily also “yeah”. To be honest, while listening to the podcast I never noticed “yeah” being said exceptionally often, but I guess the data doesn’t lie…

Wordclouds are another way to visualize the frequency of words. The frequency is indicated both by the size of the words (bigger words are more frequent than smaller words) and their color.

Word association

Word associations stronger than 60% were plotted in a heatmap to find the words that most often co-occurred in the same line of dialogue.
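One common way to measure such associations is the correlation between the count columns of two terms across all lines. The sketch below assumes that this is what the 60% cutoff refers to, and it uses a made-up matrix in base R rather than the tm functions.

```r
# Toy document term matrix: 5 lines of dialogue x 4 terms
dtm <- rbind(c(cotton = 1, swab = 1, ear = 1, yeah = 0),
             c(cotton = 0, swab = 0, ear = 1, yeah = 1),
             c(cotton = 1, swab = 1, ear = 0, yeah = 1),
             c(cotton = 0, swab = 0, ear = 2, yeah = 0),
             c(cotton = 0, swab = 0, ear = 0, yeah = 1))

# Pairwise correlation of term counts across lines
assoc <- cor(dtm)

# Keep each pair once (upper triangle) if its correlation exceeds 0.6
strong <- which(assoc > 0.6 & upper.tri(assoc), arr.ind = TRUE)
data.frame(term1 = rownames(assoc)[strong[, "row"]],
           term2 = colnames(assoc)[strong[, "col"]],
           correlation = assoc[strong])
```

In this toy example only cotton and swab pass the cutoff, because they always appear together.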

Among the most conspicuous associations were those between cotton and swab in Josh’s lines (this was probably a hyphenated word to begin with: cotton-swab) and between secret and gland in Chuck’s lines (probably secretory gland).

Hierarchical clustering

Hierarchical clustering can be used to classify words by sorting them into clusters according to similarity of occurrence (i.e. their frequency or count).
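A minimal sketch with base R's hclust(), assuming words are clustered simply by their total counts; the counts below are invented, and the distance and linkage settings are R's defaults rather than necessarily the ones I used.

```r
# Made-up total word counts, for illustration only
word_counts <- c(ear = 60, earwax = 55, yeah = 50,
                 gland = 6, swab = 5, cotton = 4)

# Euclidean distances between counts, then complete-linkage clustering
clusters <- hclust(dist(word_counts))

# Cut the tree into two groups: frequent vs. rare words
groups <- cutree(clusters, k = 2)
groups
```

Calling plot(clusters) draws the corresponding dendrogram.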

We already knew that the words “earwax”, “ear” and “yeah” were the most common, so they were clustered accordingly.

Knowledge for the masses

Shorter words are more frequent than longer words

Most words have 4 or 5 letters; only a handful are longer than 7 letters. We don’t have words with fewer than 3 letters because they were excluded at the beginning, when building the DTM.

As we can see above, there is only a very small correlation between the length of words and how often they are used. As expected from what we intuitively know, the most common words tend to be shorter, while long words are used only occasionally because they are often more specific terms. And there is no real difference between Josh and Chuck when it comes to the length (and complexity?) of the words they use. In general, the words they use are rather on the short side, which makes sense, as a big part of what makes their podcast so great is that they convey information in a down-to-earth, understandable way.
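Checking such a relationship takes only a couple of lines. The word counts below are invented for illustration, and the choice of a rank correlation is my assumption for this sketch.

```r
# Made-up word counts, for illustration only
word_freq <- c(ear = 60, yeah = 50, wax = 20,
               gland = 6, cerumen = 3, secretory = 1)

word_len <- nchar(names(word_freq))

# Rank correlation between word length and frequency of use
length_freq_cor <- cor(word_len, word_freq, method = "spearman")
length_freq_cor
```

A negative value means that longer words tend to be used less often.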

The frequency plot of all the letters in the alphabet shows that vowels are more common than consonants.

The plot above shows how often each letter occurs at each position within a word, across all the words used by Josh and Chuck. For example, the letter j occurred only once, at position one (so 100% of js are at this position). The other letters are more evenly distributed, but because few words were longer than 6 letters, there are fewer letters at positions 7 to 10.
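The underlying table can be sketched in base R like this; the four words are just a toy example.

```r
words <- c("ear", "wax", "yeah", "josh")

# One row per letter occurrence, with its position inside the word
letters_df <- do.call(rbind, lapply(words, function(w) {
  chars <- strsplit(w, "")[[1]]
  data.frame(letter = chars, position = seq_along(chars))
}))

counts <- table(letters_df$letter, letters_df$position)

# Row-wise percentages: for each letter, the share found at each position
pct <- 100 * prop.table(counts, margin = 1)
round(pct, 1)
```

In this toy example j occurs only once, at position 1, so its row shows 100 in the first column.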

Conclusion

This little excursion into text analysis gave an interesting, different look at a podcast one would normally “evaluate” intuitively while listening. This cold, hard look through the data lens can highlight aspects that would probably be overlooked otherwise; for example, I never noticed that Josh and Chuck used the word yeah that much!

It would be very interesting to broaden the analysis to more episodes, maybe by trying speech-to-text conversion tools, to see if the yeah-thing was just a fluke of this episode or whether it’s a recurrent thing.


R code

# setting my custom theme of choice
library(ggplot2)
library(ggrepel)

my_theme <- function(base_size = 12, base_family = "sans"){
  theme_grey(base_size = base_size, base_family = base_family) +
  theme(
    axis.text = element_text(size = 12),
    axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
    axis.title = element_text(size = 14),
    panel.grid.major = element_line(color = "grey"),
    panel.grid.minor = element_blank(),
    panel.background = element_rect(fill = "aliceblue"),
    strip.background = element_rect(fill = "lightgrey", color = "grey", size = 1),
    strip.text = element_text(face = "bold", size = 12, color = "navy"),
    legend.position = "bottom",
    legend.background = element_blank(),
    panel.margin = unit(.5, "lines"),
    panel.border = element_rect(color = "grey", fill = NA, size = 0.5)
  )
}

# reading lines of transcript
raw <- readLines("sysk_earwax.transcript.txt")
head(raw)

## [1] "Josh: Hey, and welcome to the podcast. I'm Josh Clark. There is Charles W. \"Chuck\" Bryant, there is Jeri. Yeah, it's Stuff You Should Know."
## [2] ""                                                                                                                                             
## [3] "Chuck: He just shrugged."                                                                                                                     
## [4] ""                                                                                                                                             
## [5] "Josh: Yeah, like \"eh, what are we going to do? That's what we are.\""                                                                        
## [6] ""
# extracting lines beginning with Josh:/ Chuck: from transcript by looping over the names
names <- c("Josh", "Chuck")

for (name in names){
  # grep lines from Josh or Chuck
  assign(paste("lines", name, sep = "_"), as.character(raw[grep(paste0("^", name, ":"), raw)]))
}

# and removing the beginning of each line that indicates who's speaking
lines_Josh <- gsub("^Josh: ", "", lines_Josh)
lines_Chuck <- gsub("^Chuck: ", "", lines_Chuck)

# Combining Josh's and Chuck's lines
lines_bg <- c(lines_Josh, lines_Chuck)

names <- c("Josh", "Chuck", "bg")

# get NRC sentiments for
# a) each line of dialogue and
# b) each word spoken in the podcast
library(syuzhet)

for (name in names){
  # look up the character vector lines_Josh, lines_Chuck or lines_bg once
  lines_name <- get(paste("lines", name, sep = "_"))

  # a) one row per line of dialogue
  assign(paste("get_nrc_sentiment", name, sep = "_"),
         data.frame(line_number = seq_along(lines_name),
                    get_nrc_sentiment(lines_name)))

  # b) one row per word; note that the variable get_tokens shares its name
  # with syuzhet's get_tokens() function, which R still finds when it is called
  get_tokens <- get_tokens(lines_name)
  assign(paste("get_nrc_sentiments_tokens", name, sep = "_"),
         data.frame(word = get_tokens,
                    get_nrc_sentiment(get_tokens)))
}

# gather word sentiments for plotting
library(tidyr)

get_nrc_sentiments_tokens_bg_gather <- get_nrc_sentiments_tokens_bg %>%
  gather(word, sentiment, anger:positive)
colnames(get_nrc_sentiments_tokens_bg_gather)[2:3] <- c("sentiment", "value")
get_nrc_sentiments_tokens_bg_gather$name <- "bg"
get_nrc_sentiments_tokens_bg_gather <- get_nrc_sentiments_tokens_bg_gather[which(get_nrc_sentiments_tokens_bg_gather$value != 0), ]
get_nrc_sentiments_tokens_bg_gather <- get_nrc_sentiments_tokens_bg_gather[!duplicated(get_nrc_sentiments_tokens_bg_gather$word), ]

ggplot(data = get_nrc_sentiments_tokens_bg_gather, aes(x = value, y = value, color = word, label = word)) +
  geom_line(size = 1) +
  facet_wrap(~ sentiment, ncol = 2) +
  my_theme() +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    axis.title = element_blank(),
    legend.position = "none",
    panel.grid.major = element_blank()) +
  geom_text_repel(segment.color = "aliceblue", segment.alpha = 0) +
  labs(title = "Sentiment categories of words used in podcast")

<span class="n">get_nrc_sentiment_Josh_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">get_nrc_sentiment_Josh</span><span class="w"> </span><span class="o">%>%</span><span class="w"> 
  </span><span class="n">gather</span><span class="p">(</span><span class="n">sentiment</span><span class="p">,</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">anger</span><span class="o">:</span><span class="n">positive</span><span class="p">)</span><span class="w">
</span><span class="n">get_nrc_sentiment_Josh_gather</span><span class="o">$</span><span class="n">name</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"Josh"</span><span class="w">

</span><span class="n">get_nrc_sentiment_Chuck_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">get_nrc_sentiment_Chuck</span><span class="w"> </span><span class="o">%>%</span><span class="w"> 
  </span><span class="n">gather</span><span class="p">(</span><span class="n">sentiment</span><span class="p">,</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">anger</span><span class="o">:</span><span class="n">positive</span><span class="p">)</span><span class="w">
</span><span class="n">get_nrc_sentiment_Chuck_gather</span><span class="o">$</span><span class="n">name</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"Chuck"</span><span class="w">

</span><span class="n">get_nrc_sentiment_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rbind</span><span class="p">(</span><span class="n">get_nrc_sentiment_Josh_gather</span><span class="p">,</span><span class="w"> </span><span class="n">get_nrc_sentiment_Chuck_gather</span><span class="p">)</span><span class="w">
</span>
<span class="c1"># split into positive and negative emotions/sentiments
</span><span class="n">get_nrc_sentiment_gather_pos</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">get_nrc_sentiment_gather</span><span class="p">[</span><span class="n">which</span><span class="p">(</span><span class="n">get_nrc_sentiment_gather</span><span class="o">$</span><span class="n">sentiment</span><span class="w"> </span><span class="o">%in%</span><span class="w"> 
                                                                 </span><span class="nf">c</span><span class="p">(</span><span class="s2">"anticipation"</span><span class="p">,</span><span class="w"> </span><span class="s2">"joy"</span><span class="p">,</span><span class="w"> </span><span class="s2">"positive"</span><span class="p">,</span><span class="w"> </span><span class="s2">"surprise"</span><span class="p">,</span><span class="w"> </span><span class="s2">"trust"</span><span class="p">)),</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="n">get_nrc_sentiment_gather_neg</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">get_nrc_sentiment_gather</span><span class="p">[</span><span class="n">which</span><span class="p">(</span><span class="n">get_nrc_sentiment_gather</span><span class="o">$</span><span class="n">sentiment</span><span class="w"> </span><span class="o">%in%</span><span class="w"> 
                                                                 </span><span class="nf">c</span><span class="p">(</span><span class="s2">"anger"</span><span class="p">,</span><span class="w"> </span><span class="s2">"disgust"</span><span class="p">,</span><span class="w"> </span><span class="s2">"fear"</span><span class="p">,</span><span class="w"> </span><span class="s2">"negative"</span><span class="p">,</span><span class="w"> </span><span class="s2">"sadness"</span><span class="p">)),</span><span class="w"> </span><span class="p">]</span><span class="w">

</span><span class="n">library</span><span class="p">(</span><span class="n">RColorBrewer</span><span class="p">)</span><span class="w">

</span><span class="n">p</span><span class="m">1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_nrc_sentiment_gather_pos</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">line_number</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">brewer.pal</span><span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="s2">"Set1"</span><span class="p">)[</span><span class="m">3</span><span class="p">])</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">facet_grid</span><span class="p">(</span><span class="n">name</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">sentiment</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">my_theme</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">theme</span><span class="p">(</span><span class="w">
    </span><span class="n">axis.text.x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">element_text</span><span class="p">(</span><span class="n">angle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">vjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.5</span><span class="p">),</span><span class="w">
    </span><span class="n">legend.position</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"none"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Dialogue line number"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiment valence"</span><span class="p">,</span><span class="w">
       </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiments during podcast progression (positive sentiments)"</span><span class="p">)</span><span class="w">

</span><span class="n">p</span><span class="m">2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_nrc_sentiment_gather_neg</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">line_number</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">brewer.pal</span><span class="p">(</span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="s2">"Set1"</span><span class="p">)[</span><span class="m">1</span><span class="p">])</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">facet_grid</span><span class="p">(</span><span class="n">name</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">sentiment</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">my_theme</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">theme</span><span class="p">(</span><span class="w">
    </span><span class="n">axis.text.x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">element_text</span><span class="p">(</span><span class="n">angle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">vjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.5</span><span class="p">),</span><span class="w">
    </span><span class="n">legend.position</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"none"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Dialogue line number"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiment valence"</span><span class="p">,</span><span class="w">
       </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiments during podcast progression (negative sentiments)"</span><span class="p">)</span><span class="w">
</span>
<span class="c1"># get sentiment scores
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">name</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">names</span><span class="p">){</span><span class="w">
  </span><span class="n">sentiment</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">line_number</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">))),</span><span class="w"> 
                              </span><span class="n">syuzhet</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"syuzhet"</span><span class="p">),</span><span class="w">
                              </span><span class="n">bing</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"bing"</span><span class="p">),</span><span class="w">
                              </span><span class="n">afinn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"afinn"</span><span class="p">),</span><span class="w">
                              </span><span class="n">nrc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"nrc"</span><span class="p">))</span><span class="w">
  </span><span class="n">assign</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"sentiment"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">),</span><span class="w"> </span><span class="n">sentiment</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
<span class="c1"># gather for plotting
</span><span class="n">sentiment_Josh_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sentiment_Josh</span><span class="w"> </span><span class="o">%>%</span><span class="w"> 
  </span><span class="n">gather</span><span class="p">(</span><span class="n">algorithm</span><span class="p">,</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">syuzhet</span><span class="o">:</span><span class="n">nrc</span><span class="p">)</span><span class="w">
</span><span class="n">sentiment_Josh_gather</span><span class="o">$</span><span class="n">name</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"Josh"</span><span class="w">

</span><span class="n">sentiment_Chuck_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sentiment_Chuck</span><span class="w"> </span><span class="o">%>%</span><span class="w"> 
  </span><span class="n">gather</span><span class="p">(</span><span class="n">algorithm</span><span class="p">,</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">syuzhet</span><span class="o">:</span><span class="n">nrc</span><span class="p">)</span><span class="w">
</span><span class="n">sentiment_Chuck_gather</span><span class="o">$</span><span class="n">name</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"Chuck"</span><span class="w">

</span><span class="n">sentiment_gather</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rbind</span><span class="p">(</span><span class="n">sentiment_Josh_gather</span><span class="p">,</span><span class="w"> </span><span class="n">sentiment_Chuck_gather</span><span class="p">)</span><span class="w">
</span>
<span class="n">p</span><span class="m">3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sentiment_gather</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">line_number</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">algorithm</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">geom_hline</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">yintercept</span><span class="o">=</span><span class="m">0</span><span class="p">),</span><span class="w"> </span><span class="n">linetype</span><span class="o">=</span><span class="s2">"dashed"</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">facet_grid</span><span class="p">(</span><span class="n">name</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">algorithm</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">my_theme</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">theme</span><span class="p">(</span><span class="w">
    </span><span class="n">axis.text.x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">element_text</span><span class="p">(</span><span class="n">angle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">vjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">hjust</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.5</span><span class="p">),</span><span class="w">
    </span><span class="n">legend.position</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"none"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Dialogue line number"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiment valence"</span><span class="p">,</span><span class="w">
       </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sentiment during podcast progression according to different lexicons"</span><span class="p">)</span><span class="w">
</span>
<span class="c1"># get sentiment percent values
</span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">name</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">names</span><span class="p">){</span><span class="w">
  </span><span class="n">sentiment_percent_vals</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">bin</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="m">20</span><span class="p">,</span><span class="w">
                              </span><span class="n">syuzhet</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_percentage_values</span><span class="p">(</span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"syuzhet"</span><span class="p">),</span><span class="w"> </span><span class="n">bins</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">20</span><span class="p">),</span><span class="w">
                              </span><span class="n">bing</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_percentage_values</span><span class="p">(</span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</span><span class="p">)),</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"bing"</span><span class="p">),</span><span class="w"> </span><span class="n">bins</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">20</span><span class="p">),</span><span class="w">
                              </span><span class="n">afinn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_percentage_values</span><span class="p">(</span><span class="n">get_sentiment</span><span class="p">(</span><span class="n">get</span><span class="p">(</span><span class="n">paste</span><span class="p">(</span><span class="s2">"lines"</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"_"</...
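
The `get_percentage_values(..., bins = 20)` calls above squeeze each speaker's per-line sentiment scores onto a common 20-point axis, presumably so that Josh and Chuck can be compared even when they speak a different number of lines. Here is a minimal base-R sketch of that binning idea, assuming each bin is simply the mean of an equal-width slice of the input. `bin_means` and `toy_scores` are hypothetical names for illustration and not part of syuzhet, whose implementation may differ in detail.

```r
# Hypothetical helper: a base-R sketch of percentage-based binning.
# Assumption: each bin's value is the mean of an equal-width slice of the input.
bin_means <- function(values, bins = 20) {
  stopifnot(is.numeric(values), length(values) >= bins)
  # cut() assigns each position to one of `bins` equal-width intervals
  groups <- cut(seq_along(values), breaks = bins, labels = FALSE)
  # average the values that fall into each interval
  as.numeric(tapply(values, groups, mean))
}

# Toy per-line sentiment scores (made up for illustration):
# 20 positive lines followed by 20 negative lines
toy_scores <- c(rep(1, 20), rep(-1, 20))
binned <- bin_means(toy_scores, bins = 4)
print(binned)  # expected values: 1, 1, -1, -1
```

Averaging within bins smooths out single-line spikes, so the binned curve shows the overall drift of sentiment across the episode rather than line-by-line noise.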

To leave a comment for the author, please follow the link and comment on their blog: Shirin's playgRound.
