Domestic data science – energy use

[This article was first published on R – scottishsnow, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I wrote previously about my home electricity use. We’ve an open home energy monitor logging our import of electricity. Our house isn’t typical for the UK, we don’t have a gas supply and our space and water heating is via an air source heat pump (which runs on electricity).

In my previous blog post I mentioned we were considering having Solar PV installed. We’ve recently done this, but this blog isn’t about solar. We’ve also been thinking of switching energy suppliers. Our current supplier, Ecotricity, charges a flat rate for kWh used, with an additional standing charge. However the wholesale price of electricity is not static for energy suppliers, it varies through the day based on the amount of energy available and how much consumers need/want to use (supply – demand). Typically there is a spike in energy costs at tea time when most folk get home from work and put on the kettle for a brew and cook their tea. Octopus is another green energy supplier, but their agile tariff varies their price of electricity through the day and even between days.

If you’d like to switch to Octopus you can use this referral code to give us both £50 credit.

This blog post compares how much our electricity cost from Ecotricity with how much it might cost from Octopus. Octopus publish their prices for the previous year, I’ve two years of their data which have a 30 time step. I’ve a little more than 12 months of my own consumption data at a 10 second time step. Ecotricity charge a flat rate, but it did change in Sept, I’ve used their earlier (lower) rate for my comparison.

For data prep, I’ve removed the year from the Octopus data to compare my single year of observations to multiple years. The time period for which we have complete overlaps is from 2019-01-31 to 2019-12-20.

Code for the analysis is at the bottom of the post.

First, how do prices vary through the day? We can see that for most of the day Octopus is much cheaper, it’s only the tea time spike for which they charge more. So in order for us to save money, the bulk of our use needs to avoid the tea time spike.

price

How does our use vary through the day? I’ve adjusted our 10 data to 30 minute windows to help with comparison. We heat our water at midday and also a top up at about 9 pm, hence those spikes. Our heating comes on at around 4 or 5 in the morning, hence the heavy use then. Our tea time use is inline with these other heavy periods.

use

What if we multiply my energy use by the cost for each hour? As we’d expect Octopus is cheaper for the majority of hours (points below the 1:1 line), but some time periods is more expensive.

hourly_cost

We can aggregate these costs to a daily time step, and include the standing charge. The Octopus standing charge is ~ 10 p a day less than Ecotricity, so price gaps start to become quite big. At their 2019 prices, Octopus is cheaper on every day of the year. This is especially so in winter when we use a lot of electricity during off-peak periods (for space heating).

daily_cost

What’s the total difference over the year (2019-01-31 to 2019-12-20)? The following figures are without us altering our electricity consumption patterns. A 30% reduction in our annual bill is huge. Unsurprisingly I’ve requested a switch of supplier.

  • Ecotricity costs £ 950 for the time period.
  • Octopus 2018 costs £ 779 for the time period.
  • Octopus 2019 costs £ 628 for the time period.

Finally, if we can eliminate the 4pm to 7pm peak, how much could we save? If we can cut electricity use/import during this expensive period we should be able to reduce our bills even further. 50-60p a day doesn’t seem a lot, but over a year it’ll add up. Over the time period from 2019-01-31 to 2019-12-20 our 4pm to 7pm Octopus cost would have been £232 at 2018 prices and £198 at 2019 prices.

peak

Analysis and graphics code below. I should probably have manipulated my data to let me plot the hourly and daily scatter plots with facets instead of patchwork. For some reason wordpress is not displaying pipes (%>%) correctly, I’ve contacted wordpress.com to see how to fix it.

# Packages
library(tidyverse)
library(readxl)
library(lubridate)
library(patchwork)

# data
tarrif_2018 = read_excel("agile_rates.2018-12-20.xlsx", sheet = "South Scotland") %>%
   select(date_2018 = date, from_2018 = from, rate_2018 = unit_rate_excl_vat) %>%
   mutate(date_2018 = date_2018 + years(1),
          date_2018 = str_sub(date_2018, 1, 10))

tarrif_2019 = read_excel("agile_rates_2019.xlsx", sheet = "South Scotland") %>%
   select(date_2019 = date, from_2019 = from, rate_2019 = unit_rate_excl_vat) %>%
   mutate(date_2019 = str_sub(date_2019, 1, 10))

f = list.files(".", pattern = "elec*")
use = lapply(f, function(i){
   read_csv(i, col_names = F) %>%
      mutate(datetime = dmy_hms(X1)) %>%
      select(datetime, Watts = X2)
})
use = do.call("rbind.data.frame", use)

use = use[!duplicated(use$datetime), ]

# Hourly price
tarrif_2018 %>%
   mutate(year = 2018) %>%
   select(year, from = from_2018, rate = rate_2018) %>%
   bind_rows(tarrif_2019 %>%
                mutate(year = 2019) %>%
                select(year, from = from_2019, rate = rate_2019)) %>%
   group_by(year, from) %>%
   summarise(rate_med = median(rate),
             rate_25 = quantile(rate, .25),
             rate_75 = quantile(rate, .75)) %>%
   mutate(from = str_sub(from, 1, 5)) %>%
   ggplot(aes(from, rate_med, colour = as.factor(year))) +
   geom_pointrange(aes(ymin = rate_25, ymax = rate_75)) +
   geom_hline(yintercept = 17.74, size = 1.2) +
   annotate("text", x = 5, y = 18.5, angle = 270,
            label = "Ecotricity 2019") +
   coord_flip() +
   scale_color_brewer(type = "qual", palette = "Dark2") +
   labs(title = "How does price vary through the day?",
        subtitle = "Octopus median and interquartile range",
        x = "Time of day",
        y = "Price (p/kWh)",
        colour = "Octopus") +
   theme_minimal() +
   theme(text = element_text(size = 15),
         plot.margin = margin(5, 10, 2, 2, "pt"))

# Chunk use to 30 min intervals
use_30 = use %>%
   mutate(mi = minute(datetime),
          mi = if_else(mi %
   group_by(date, hr, mi) %>%
   summarise(kWh = sum(kWh)) %>%
   mutate(from = paste(hr, mi, "00", sep = ":")) %>%
   ungroup()

<span id="mce_SELREST_start" style="overflow:hidden;line-height:0;"></span># Hourly use
use_30 %>%
   group_by(from) %>%
   summarise(kWh_med = median(kWh),
             kWh_25 = quantile(kWh, .25),
             kWh_75 = quantile(kWh, .75)) %>%
   mutate(from = str_sub(from, 1, 5)) %>%
   ggplot(aes(from, kWh_med)) +
   geom_pointrange(aes(ymin = kWh_25, ymax = kWh_75)) +
   annotate("rect",
            xmin = "16:00", xmax = "19:00",
            ymin=0, ymax=Inf, alpha=0.2, fill="red") +
   coord_flip() +
   labs(title = "How does our use vary through the day?",
        subtitle = "Highlighted area is expensive Octopus time",
        x = "Time of day",
        y = "Use (kWh)") +
   theme_minimal() +
   theme(text = element_text(size = 15),
         plot.margin = margin(5, 10, 2, 2, "pt"))

# Join price and use
cost = use_30 %>%
   left_join(tarrif_2018, by = c(date = "date_2018",
                                 from = "from_2018")) %>%
   left_join(tarrif_2019, by = c(date = "date_2019",
                                 from = "from_2019")) %>%
   filter(!is.na(rate_2018) &
             !is.na(rate_2019)) %>%
   mutate(ecotricity = kWh * .1774,
          octopus_2018 = kWh * rate_2018 / 100,
          octopus_2019 = kWh * rate_2019 / 100)

# Hourly cost
x = cost %>%
   select(from, ecotricity, octopus_2018, octopus_2019) %>%
   gather(supplier, cost, -from) %>%
   group_by(from, supplier) %>%
   summarise(cost_med = median(cost)) %>%
   ungroup() %>%
   mutate(from = str_sub(from, 1, 5)) %>%
   spread(supplier, cost_med) %>%
   mutate(time = seq(0, 23.5, by = 0.5))

ggplot(x, aes(x = ecotricity, y = octopus_2018, colour = time)) +
   geom_point() +
   geom_abline(slope = 1) +
   scale_colour_viridis_c() +
   labs(title = "Hourly cost between suppliers",
        subtitle = "Octopus 2018",
        x = "Ecotricity (£)",
        y = "Octopus (£)",
        colour = "Hour") +
   theme_minimal() +
   theme(text = element_text(size = 15),
         plot.margin = margin(5, 10, 2, 2, "pt"),
         legend.position="none") +
   ggplot(x, aes(x = ecotricity, y = octopus_2019, colour = time)) +
   geom_point() +
   geom_abline(slope = 1) +
   scale_colour_viridis_c() +
   labs(subtitle = "Octopus 2019",
        x = "Ecotricity (£)",
        y = "Octopus (£)",
        colour = "Hour") +
   theme_minimal() +
   theme(text = element_text(size = 15),
         plot.margin = margin(5, 10, 2, 2, "pt"))

# Daily cost
cost_daily = cost %>%
   select(date, ecotricity, octopus_2018, octopus_2019) %>%
   group_by(date) %>%
   summarise(ecotricity = sum(ecotricity),
             octopus_2018 = sum(octopus_2018),
             octopus_2019 = sum(octopus_2019)) %>%
   mutate(ecotricity = ecotricity + .2959,
          octopus_2018 = octopus_2018 + .21,
          octopus_2019 = octopus_2019 + .21,
          jul = yday(as.Date(date)))

ggplot(cost_daily, aes(ecotricity, octopus_2018)) +
   geom_point(aes(colour = jul), alpha = 0.8) +
   scale_colour_viridis_c() +
   geom_abline(slope = 1) +
   labs(title = "Daily electricity cost",
        subtitle = "Octopus 2018",
        x = "Ecotricity (£)",
        y = "Octopus (£)",
        colour = "Day of year") +
   theme_minimal() +
   theme(text = element_text(size = 15),
         legend.position="none") +
   ggplot(cost_daily, aes(ecotricity, octopus_2019)) +
   geom_point(aes(colour = jul), alpha = 0.8) +
   scale_colour_viridis_c() +
   geom_abline(slope = 1) +
   labs(subtitle = "Octopus 2019",
        x = "Ecotricity (£)",
        y = "Octopus (£)",
        colour = "Day of year") +
   theme_minimal() +
   theme(text = element_text(size = 15))

# Peak use
x = cost %>%
   select(date, from, octopus_2018, octopus_2019) %>%
   filter(from %in% c("16:00:00",
                      "16:30:00",
                      "17:00:00",
                      "17:30:00",
                      "18:00:00",
                      "18:30:00",
                      "19:00:00")) %>%
   group_by(date) %>%
   summarise(octopus_2018 = sum(octopus_2018),
             octopus_2019 = sum(octopus_2019))

x %>%
   gather(year, cost, -date) %>%
   ggplot(aes(year, cost)) +
   geom_boxplot(fill = "yellow") +
   coord_flip() +
   labs(title = "How much will we spend during peak price?",
        subtitle = "Daily spread of cost between 4pm and 7pm",
        x = "",
        y = "Cost (£/day)") +
   theme_minimal() +
   theme(text = element_text(size = 15))

To leave a comment for the author, please follow the link and comment on their blog: R – scottishsnow.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)