Extract Top Reddit Posts of #rstats in 3 lines of R Code

[This article was first published on r-bloggers on Programming with R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This post is kept (literally) minimal to demonstrate how simple is this hack using R (of course could be simple in other languages too). This is also to establish a point that R has got use-cases beyond statistics and data-mining.

Objective

rstats subreddit is one of the popular sources of R-related information / discussion on the internet. We’re trying to extract the top posts of rstats subreddit.

Data Format

Lucky for us, Reddit offers a json file for every subreddit (also post) and we’ll use that here.

subreddit url: "https://www.reddit.com/r/rstats/"
subreddit json: "https://www.reddit.com/r/rstats/.json"

jsonlite @ Action

The package that will help us in this endeavor is jsonlite (by Jeroen Ooms and Team) for parsing json files and feeds. It’s got a sweet function that fromJSON() that parses a json file and stores the result in a list object. Ultimately, we can find the required information – title, url of the subreddit in there.

library(jsonlite)

reddit <- fromJSON("https://www.reddit.com/r/rstats/.json")

(top10 <- reddit$data$children$data[1:10,c("title","url")])
##                                                                                                                                                            title
## 1                                                                                         Beginners Cookbook for Interactive Visualization in R with highcharter
## 2                                                                                                                                           Chaos game animation
## 3                                                                  Knitting an R Notebook to pdf- how do you wrap r code so it doesn't overflow across the page?
## 4  Taking a basic college stat class that does a lot of its work in R this fall... what would be a good way to learn/prepare myself going in with no experience?
## 5                                                                                                            Dealing with February heatmap, and 2 value heatmaps
## 6                                                                                                Logging into your Rstudio Environment from Any Browser: Amazing
## 7                                                                                                                    Creating a figure showing ratios on a line?
## 8                                                                                                                 Using expressions in cowplot plot_grid labels?
## 9                                                                                                              help needed with r -integration package with SPSS
## 10                                                                                                                                        Significance test in R
##                                                                                                         url
## 1  https://www.programmingwithr.com/beginners-cookbook-for-interactive-visualization-in-r-with-highcharter/
## 2                                     https://www.reddit.com/r/rstats/comments/cn4ym1/chaos_game_animation/
## 3          https://www.reddit.com/r/rstats/comments/cn6pok/knitting_an_r_notebook_to_pdf_how_do_you_wrap_r/
## 4        https://www.reddit.com/r/rstats/comments/cmuy42/taking_a_basic_college_stat_class_that_does_a_lot/
## 5       https://www.reddit.com/r/rstats/comments/cmylmk/dealing_with_february_heatmap_and_2_value_heatmaps/
## 6                                                                   https://jagg19.github.io/2019/08/aws-r/
## 7               https://www.reddit.com/r/rstats/comments/cmsqkr/creating_a_figure_showing_ratios_on_a_line/
## 8            https://www.reddit.com/r/rstats/comments/cmsvc0/using_expressions_in_cowplot_plot_grid_labels/
## 9         https://www.reddit.com/r/rstats/comments/cmtruo/help_needed_with_r_integration_package_with_spss/
## 10                                  https://www.reddit.com/r/rstats/comments/cmx3v0/significance_test_in_r/

3-lines

  • Load the library
  • Retrieve and Parse the json file
  • Extract the relevant information for the list object

Summary

This post while is primarily intended to demonstrate the simplicity of R and jsonlite for json parsing, it can also be used to automate such a script to email or send notification about top 10 rstats subreddit post at a scheduled interval.

Read more

Beginners Cookbook for Interactive Visualization in R with highcharter

If you like our posts, Please share it with your Friends and Network. Use our RSS Feed, to subscribe for latest update from programmingwithr.com

To leave a comment for the author, please follow the link and comment on their blog: r-bloggers on Programming with R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)