Getting NYS Home Heating Oil Prices with {rvest}

[This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers.]

Introduction

If you live in New York and rely on heating oil to keep your home warm during the colder months, you know how important it is to keep track of heating oil prices. Fortunately, with a bit of R code, you can easily access the latest heating oil prices in New York.

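The code uses the {xml2} and {rvest} packages to download and parse the page, the {tibble}, {purrr}, {dplyr}, and {lubridate} packages to clean and manipulate the data, and the {timetk} package to plot the time series. Here’s a breakdown of what the code does:

The page at the EIA URL is read with read_html from xml2, and the table holding the price series is extracted with html_node from rvest using an XPath expression.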

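The extracted node is converted to a data frame with html_table from rvest, turned into a tibble with as_tibble, and given column names with set_names from purrr. It is then cleaned and transformed using the dplyr functions select, mutate, and arrange, with ymd from lubridate parsing the period column as a date.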
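The cleaning steps can be tried without a network connection. The sketch below runs the same chain of functions on a small inline HTML table; the table contents are made-up demo values, not EIA data, and the object name demo_tbl is only illustrative.

```r
# Inline HTML standing in for the downloaded page (values are made up)
html <- '
<table>
  <tr><th>Series Name</th><th>Period</th><th>Frequency</th><th>Value</th><th>Units</th></tr>
  <tr><td>Demo series</td><td>20230109</td><td>W</td><td>4.10</td><td>$/gal</td></tr>
  <tr><td>Demo series</td><td>20230102</td><td>W</td><td>4.25</td><td>$/gal</td></tr>
</table>'

demo_tbl <- xml2::read_html(html) |>
    rvest::html_node(xpath = "//table") |>
    # Parse the <table> node into a data frame
    rvest::html_table() |>
    tibble::as_tibble() |>
    # Replace the scraped headers with snake_case names
    purrr::set_names('series_name','period','frequency','value','units') |>
    # Reorder the columns
    dplyr::select(period, frequency, value, units, series_name) |>
    # Parse the yyyymmdd period into a Date
    dplyr::mutate(period = lubridate::ymd(period)) |>
    # Sort oldest first
    dplyr::arrange(period)

demo_tbl
```

After the pipeline, period is a Date column and the rows are sorted from oldest to newest.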

Finally, the resulting time series data is plotted using plot_time_series from the timetk package.
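By default plot_time_series draws an interactive plot with a smoother. As a minimal sketch, assuming you want a static ggplot2 object instead (for a report, say), the .interactive, .smooth, and .title arguments can be set explicitly. The prices below are made-up demo values, not EIA data.

```r
# Eight weeks of made-up prices, only to have something to plot
demo_prices <- tibble::tibble(
    period = seq(as.Date("2023-01-02"), by = "week", length.out = 8),
    value  = c(4.25, 4.10, 4.05, 4.12, 4.00, 3.95, 3.98, 3.90)
)

p <- demo_prices |>
    timetk::plot_time_series(
        .date_var      = period
        , .value       = value
        , .smooth      = FALSE   # skip the smoother on such a short series
        , .interactive = FALSE   # return a static ggplot instead of a plotly widget
        , .title       = "Demo weekly prices (made-up values)"
    )

p
```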

To run this code, you will need to have these packages installed on your machine. You can install them using the install.packages function in R. Here’s how you can install the packages:

install.packages("dplyr")
install.packages("xml2")
install.packages("rvest")
install.packages("tibble")
install.packages("purrr")
install.packages("lubridate")
install.packages("timetk")
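If some of these packages are already installed, a small helper can skip them. This is only a sketch; install_if_missing is a hypothetical helper name, not a function from any package.

```r
needed <- c("dplyr", "xml2", "rvest", "tibble", "purrr", "lubridate", "timetk")

# Hypothetical helper: install only the packages that cannot be loaded yet
install_if_missing <- function(pkgs) {
    # requireNamespace() returns TRUE when a package is available
    available  <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    to_install <- pkgs[!available]
    if (length(to_install) > 0) {
        install.packages(to_install)
    }
    # Return (invisibly) the names of the packages that were missing
    invisible(to_install)
}
```

Calling install_if_missing(needed) then installs whatever is missing and does nothing when everything is already present.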

Once you have installed the packages, you can copy and paste the code into your R console or RStudio and run it to get the latest heating oil prices in New York.

The code below provides a simple way to access and visualize heating oil prices in New York using R. By keeping track of these prices, you can make more informed decisions about when to buy heating oil and how much to purchase, which can help you save money on your heating bills.

Example

Now let’s run it!

# EIA page for the weekly New York heating oil price series
url  <- "https://www.eia.gov/opendata/qb.php?sdid=PET.W_EPD2F_PRS_SNY_DPG.W"
page <- xml2::read_html(url)

# Locate the data table by its absolute XPath on the page
node <- rvest::html_node(
    x = page
    , xpath = "/html/body/div[1]/section/div/div/div[2]/div[1]/table"
)

# Convert the table to a tibble, rename and reorder the columns,
# parse the period as a date, and sort oldest first
ny_tbl <- node |>
    rvest::html_table() |>
    tibble::as_tibble() |>
    purrr::set_names('series_name','period','frequency','value','units') |>
    dplyr::select(period, frequency, value, units, series_name) |>
    dplyr::mutate(period = lubridate::ymd(period)) |>
    dplyr::arrange(period)

# Plot the price series over time
ny_tbl |>
    timetk::plot_time_series(.date_var = period, .value = value)

Voila!
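One caveat: an absolute XPath like the one above depends on the exact layout of the page, so it may stop matching if the page changes. When nothing matches, html_node returns a missing node rather than raising an error. A minimal sketch of a guard follows; find_table is a hypothetical helper name and the inline HTML is a stand-in for the real page.

```r
# Hypothetical helper: fail with a clear message when the XPath matches nothing
find_table <- function(page, xpath) {
    node <- rvest::html_node(page, xpath = xpath)
    if (inherits(node, "xml_missing")) {
        stop("No table found at the given XPath; the page layout may have changed.")
    }
    node
}

# Stand-in page with a single table
demo_page <- xml2::read_html(
    "<html><body><table><tr><th>a</th></tr><tr><td>1</td></tr></table></body></html>"
)

# Matches the table
found <- find_table(demo_page, "//table")

# Matches nothing, so the helper stops; the message is captured here for display
not_found <- tryCatch(
    find_table(demo_page, "//table[@id='missing']")
    , error = function(e) conditionMessage(e)
)
not_found
```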
