Blog Archives

Gender Roles with Text Mining and N-grams

April 14, 2017

Today is the one-year anniversary of the janeaustenr package’s appearance on CRAN, its cranniversary, if you will. I think it’s time for more Jane Austen here on my blog. I saw this paper by Matthew Jockers and Gabi Kirilloff a number of months ago and the ideas in it have been knocking around in...


How Do You Discover R Packages?

March 19, 2017

As I mentioned in my last blog post, I am contributing to a session at useR! 2017 this coming July that will focus on discovering and learning about R packages. This is an increasingly important issue for R users as we all decide which of the 10,000+...


Scraping CRAN with rvest

March 5, 2017

I am one of the organizers for a session at useR! 2017 this coming July that will focus on discovering and learning about R packages. How do R users find packages that meet their needs? Can we make this process easier? As somebody who is relatively new...


What Programming Languages Are Used Most on Weekends?

February 6, 2017

Note: Cross-posted with the Stack Overflow blog. Check out the code for this analysis on Kaggle. For me, the weekends are mostly about spending time with my family, reading for leisure, and working on the open-source projects I am involved in. These w...


Women in the 2016 Stack Overflow Survey

January 22, 2017

Note: Cross-posted with the Stack Overflow blog. The 2017 Stack Overflow Developer Survey opened last week, and we on the Data Team are looking forward to analyzing the survey results to better understand our developer community. I am particularly inte...


Text Mining in R: A Tidy Approach

January 13, 2017

I spoke on approaching text mining tasks using tidy data principles at rstudio::conf yesterday. I was so happy to have the opportunity to speak, and the conference has been a great experience. If you want to catch up on what has been going on at rstudio::conf, Karl Broman put together a GitHub repo of slides and Sharon Machlis...


Reddit Responds to the Election

December 5, 2016

It’s been about a month since the U.S. presidential election, with Donald Trump’s victory over Hillary Clinton coming as a surprise to most. Reddit user Jason Baumgartner collected and published every submission and comment posted to Reddit on the ...


Measuring Gobbledygook

November 24, 2016

In learning more about text mining over the past several months, one aspect of text that I’ve been interested in is readability. A text’s readability measures how hard or easy it is for a reader to read and understand what a text is saying; it depends on how sentences are written, what words are chosen, and so forth....


Mapping Election Results in Utah

November 10, 2016

My adopted home state of Utah has been a weird place this election cycle. For the unfamiliar, Utah is extremely conservative when it comes to politics; it is one of the reddest of the red states and has backed the Republican candidate for president for...


Tidy Text Mining with R

October 27, 2016

I am so pleased to announce that tidytext 0.1.2 is now available on CRAN. This release of tidytext, a package for text mining using tidy data principles by Dave Robinson and me, includes some bug fixes and performance improvements, as well as some new ...

