EV Charging Stations Analysis

[This article was first published on Andy Pickering, and kindly contributed to R-bloggers].

Introduction

Recently I’ve been interested in analyzing trends in electric vehicle (EV) charging stations, using data from the Alternative Fuels Data Center’s Alternative Fuel Stations database. In this first post I’ll go over retrieving the data via an API, getting it into a tidy format, and some initial analysis and visualization.

Data

I’ll retrieve the EV station data using the AFDC API. The documentation for the AFDC fuel-stations API can be found at: https://developer.nrel.gov/docs/transportation/alt-fuel-stations-v1/all/#station-count-record-fields

< details open="" class="code-fold">< summary>Code
# API key is stored in my .Renviron file
api_key <- Sys.getenv("AFDC_KEY")

# base url for AFDC alternative fuel stations API
target <- "https://developer.nrel.gov/api/alt-fuel-stations/v1"

# Return data for all electric stations in Colorado
api_path <- ".json?fuel_type=ELEC&state=CO&limit=all"

complete_api_path <- paste0(target, api_path, "&api_key=", api_key)

response <- httr::GET(url = complete_api_path)

# warn (rather than just print) if the request did not succeed
if (response$status_code != 200) {
  warning(paste("API call returned status code", response$status_code))
}

response$status_code
[1] 200
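
As an alternative to pasting the URL together by hand, the query parameters can be passed as a named list so that httr takes care of the separators and encoding. This is only a sketch: `build_afdc_url` is a hypothetical helper that is not part of the original post, and `"DEMO_KEY"` is a placeholder standing in for the key stored in `.Renviron`.

```r
library(httr)

# Hypothetical helper (not from the original post): build the request URL
# from a named list of query parameters instead of a hand-written string.
build_afdc_url <- function(api_key, state = "CO", fuel_type = "ELEC") {
  modify_url(
    "https://developer.nrel.gov/api/alt-fuel-stations/v1.json",
    query = list(
      fuel_type = fuel_type,
      state     = state,
      limit     = "all",
      api_key   = api_key
    )
  )
}

# "DEMO_KEY" is a placeholder; in practice use Sys.getenv("AFDC_KEY")
url <- build_afdc_url(api_key = "DEMO_KEY")
print(url)

# With a real key, the request could then be made and checked like this:
# response <- GET(url)
# stop_for_status(response)  # raises an R error on any 4xx/5xx status
```

Building the URL in a function keeps the state and fuel type easy to change, and `stop_for_status()` halts the script on a failed request instead of continuing with an empty response.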
< details open="" class="code-fold">< summary>Code
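# parse the JSON body; setting the encoding explicitly avoids a guessing message
ev_dat <- jsonlite::fromJSON(httr::content(response, as = "text", encoding = "UTF-8"))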

class(ev_dat)
[1] "list"
< details open="" class="code-fold">< summary>Code
names(ev_dat)
[1] "station_locator_url" "total_results"       "station_counts"     
[4] "fuel_stations"      
< details open="" class="code-fold">< summary>Code
ev_dat$total_results
[1] 2303
< details open="" class="code-fold">< summary>Code
ev_dat$station_counts$fuels$ELEC
$total
[1] 5695

$stations
$stations$total
[1] 2303

Finally, the data we want to analyze is in the fuel_stations data frame.

< details open="" class="code-fold">< summary>Code
ev <- ev_dat$fuel_stations
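
To illustrate why the response is a list while `fuel_stations` is a data frame, here is a small sketch using a made-up JSON payload. The field names mimic the shape of the API response, but the values are invented and the real response has many more fields.

```r
library(jsonlite)

# Made-up payload that mimics the shape of the API response
toy_json <- '{
  "total_results": 2,
  "fuel_stations": [
    {"id": 1, "station_name": "Station A", "ev_level2_evse_num": 2},
    {"id": 2, "station_name": "Station B", "ev_level2_evse_num": null}
  ]
}'

toy <- fromJSON(toy_json)

class(toy)                # the top-level JSON object becomes a list
class(toy$fuel_stations)  # the array of station objects becomes a data frame
toy$fuel_stations         # JSON null values show up as NA
```

By default, `fromJSON()` simplifies an array of similar objects into a data frame with one row per object, which is why the station records can be used directly with dplyr.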

Filter out non-EV data columns

The returned data contains many non-electric fields that we don’t need (they will all be NA since we requested the electric fuel type only), so I’ll remove those fields from the data frame to clean things up a bit, using the starts_with function from the dplyr package (Wickham et al. 2023). I’ll also keep only stations whose status code is 'E', convert the date column to a date type, and add a variable for the year opened, since I want to look at how many stations were opened over time.

< details open="" class="code-fold">< summary>Code
library(dplyr)

# drop fields that belong to non-electric fuel types
# (starts_with() accepts a character vector of prefixes)
ev <- ev %>%
  select(-starts_with(c("lng", "cng", "lpg", "hy", "ng", "e85", "bd", "rd"))) %>%
  # keep only stations with status code 'E' (currently available)
  filter(status_code == "E")


# change date field to date type and add a year opened variable
ev$open_date <- lubridate::ymd(ev$open_date)
ev$open_year <- lubridate::year(ev$open_date)

#colnames(ev)
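
The effect of these cleaning steps is easier to see on a tiny example. The sketch below uses a made-up data frame (the column names are chosen to resemble API fields, and the values are invented) and applies the same three operations: dropping columns by prefix, filtering on status code, and deriving the year opened.

```r
library(dplyr)

# Made-up station records; values are invented for illustration
toy <- tibble(
  id                 = 1:3,
  status_code        = c("E", "P", "E"),
  open_date          = c("2015-06-01", "2020-01-15", NA),
  cng_psi            = NA,
  lng_vehicle_class  = NA,
  ev_level2_evse_num = c(2L, 4L, 1L)
)

toy_clean <- toy %>%
  select(-starts_with(c("cng", "lng"))) %>%  # drop non-EV columns by prefix
  filter(status_code == "E") %>%             # keep stations with status 'E'
  mutate(
    open_date = lubridate::ymd(open_date),   # character -> Date
    open_year = lubridate::year(open_date)   # NA dates give NA years
  )

toy_clean
```

Stations with a missing `open_date` are kept at this stage and simply get an `NA` year, which is why later summaries filter on `!is.na(open_year)`.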

Analysis

Station Openings Over Time

How many stations opened each year?

First I’d like to look at how many EV stations opened over time, so I’ll make a new data frame summarizing the number of stations opened by year.

< details open="" class="code-fold">< summary>Code
ev_opened <- ev %>% 
  count(open_year,name = "nopened")  %>% 
  filter(!is.na(open_year))
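
For readers less familiar with `count()`, the sketch below shows on made-up opening years that it is shorthand for grouping by a variable and counting rows. The toy values are invented, and `by_count` and `by_group` are illustrative names only.

```r
library(dplyr)

# Made-up opening years, including one missing value
toy <- tibble(open_year = c(2019, 2020, 2020, NA, 2021, 2021, 2021))

# count() tallies rows per year; name = sets the name of the count column
by_count <- toy %>%
  count(open_year, name = "nopened") %>%
  filter(!is.na(open_year))

# the same summary written out with group_by() and summarise()
by_group <- toy %>%
  filter(!is.na(open_year)) %>%
  group_by(open_year) %>%
  summarise(nopened = n())

by_count
identical(by_count$nopened, by_group$nopened)
```

Note that `count()` keeps `NA` as its own group, so the explicit `filter(!is.na(open_year))` is what removes stations without an opening date from the summary.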
< details open="" class="code-fold">< summary>Code
library(ggplot2)

# bar chart of stations opened per year, with the count printed above each bar
ev_opened %>% ggplot(aes(open_year, nopened)) + 
  geom_col() +
  xlab("Year Opened") +
  ylab("# Stations Opened") +
  ggtitle('EV Stations Opened in Colorado Each Year') +
  theme_grey(base_size = 15) +
  geom_text(aes(label = nopened), vjust = 0)
Figure 1: Number of EV charging stations opened in Colorado each year

Cumulative sum of stations opened over time

We can also look at the cumulative sum of stations opened over time:

< details open="" class="code-fold">< summary>Code
ev_opened %>% ggplot(aes(open_year,cumsum(nopened))) +
  geom_line(linewidth = 1.5) +
  xlab("Year") +
  ylab("# Stations") +
  ggtitle("Cumulative sum of EV stations opened in CO") +
  theme_grey(base_size = 15)
Figure 2: Cumulative sum of EV stations opened in CO
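
One thing to keep in mind is that a cumulative sum depends on row order, so it only makes sense when the rows are sorted by year. The summary from `count()` is already sorted, but as a defensive sketch the running total can be computed explicitly after sorting. The toy values below are invented and deliberately out of order.

```r
library(dplyr)

# Made-up yearly counts, deliberately not sorted by year
toy_opened <- tibble(
  open_year = c(2012, 2010, 2011),
  nopened   = c(9, 1, 4)
)

toy_cum <- toy_opened %>%
  arrange(open_year) %>%                  # sort first so the running total is chronological
  mutate(cum_opened = cumsum(nopened))    # running total of stations opened

toy_cum
```

Storing the running total as its own column (here the hypothetical `cum_opened`) also makes it available for tables or labels, not just for the plot.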

Station openings by level/charger type

Next I want to dig a little deeper and break down the station openings by charger type and/or level. I’d expect to see more Level 2 chargers in earlier years, and an increase in DC fast charging stations in more recent years. I’ll make a new data frame with the number of chargers opened by year, grouped by charging level (Level 1, Level 2, or DC fast).

< details open="" class="code-fold">< summary>Code
ev_opened_level <- ev %>% 
  select(id,open_date,
         open_year,
         ev_dc_fast_num,
         ev_level2_evse_num,ev_level1_evse_num) %>%
  group_by(open_year) %>%
  summarize(n_DC = sum(ev_dc_fast_num,na.rm = TRUE), 
            n_L2 = sum(ev_level2_evse_num,na.rm = TRUE),
            n_L1 = sum(ev_level1_evse_num,na.rm = TRUE) ) %>% 
  filter(!is.na(open_year))

head(ev_opened_level)
# A tibble: 6 × 4
  open_year  n_DC  n_L2  n_L1
      <dbl> <int> <int> <int>
1      2010     1    21    18
2      2011     1    22     0
3      2012     9    42     0
4      2013    20    36    28
5      2014    24    63     0
6      2015    29   124     0

To make plotting easier, I’ll pivot the data frame from wide to long format so I can group by charging level:

< details open="" class="code-fold">< summary>Code
ev_opened_level_long <- ev_opened_level %>% 
  tidyr::pivot_longer(cols = c('n_DC','n_L2','n_L1'),
                      names_to = "Level",
                      names_prefix = "n_",
                      values_to = "n_opened")

head(ev_opened_level_long)
# A tibble: 6 × 3
  open_year Level n_opened
      <dbl> <chr>    <int>
1      2010 DC           1
2      2010 L2          21
3      2010 L1          18
4      2011 DC           1
5      2011 L2          22
6      2011 L1           0
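
The long format is also convenient for per-year calculations across levels. As a rough sketch that goes slightly beyond the original analysis, the code below rebuilds the six rows printed above and computes each level's share of the chargers opened in a given year. The `share` column and the `shares` name are illustrative additions.

```r
library(dplyr)
library(tidyr)

# The first six rows of ev_opened_level, copied from the output shown above
ev_opened_level <- tibble(
  open_year = 2010:2015,
  n_DC = c(1L, 1L, 9L, 20L, 24L, 29L),
  n_L2 = c(21L, 22L, 42L, 36L, 63L, 124L),
  n_L1 = c(18L, 0L, 0L, 28L, 0L, 0L)
)

shares <- ev_opened_level %>%
  pivot_longer(cols = starts_with("n_"),
               names_to = "Level",
               names_prefix = "n_",
               values_to = "n_opened") %>%
  group_by(open_year) %>%
  mutate(share = n_opened / sum(n_opened)) %>%  # fraction of that year's chargers
  ungroup()

head(shares)
```

Because each year has one row per level, grouping by `open_year` is enough to normalize the counts within a year.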

Now I can go ahead and plot the number of chargers opened over time, by level.

< details open="" class="code-fold">< summary>Code
g <- ev_opened_level_long %>% 
  ggplot(aes(open_year, n_opened, group = Level)) +
  geom_line(aes(col = Level), linewidth = 1.5) +
  geom_point(aes(col = Level)) +
  xlab("Year Opened") +
  ylab("# Chargers Opened") +
  ggtitle("Number of Chargers Opened Per Year By Level")
  
plotly::ggplotly(g)
Figure 3: Number of Chargers Opened Per Year By Level

Session Info

< details open="" class="code-fold">< summary>Code
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Denver
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] dplyr_1.1.3    ggplot2_3.4.4  jsonlite_1.8.7 httr_1.4.7    

loaded via a namespace (and not attached):
 [1] gtable_0.3.4      compiler_4.3.1    renv_1.0.3        tidyselect_1.2.0 
 [5] tidyr_1.3.0       scales_1.2.1      yaml_2.3.7        fastmap_1.1.1    
 [9] R6_2.5.1          labeling_0.4.3    generics_0.1.3    curl_5.1.0       
[13] knitr_1.44        htmlwidgets_1.6.2 tibble_3.2.1      munsell_0.5.0    
[17] lubridate_1.9.3   pillar_1.9.0      rlang_1.1.1       utf8_1.2.4       
[21] xfun_0.40         lazyeval_0.2.2    viridisLite_0.4.2 plotly_4.10.3    
[25] timechange_0.2.0  cli_3.6.1         withr_2.5.1       magrittr_2.0.3   
[29] crosstalk_1.2.0   digest_0.6.33     grid_4.3.1        rstudioapi_0.15.0
[33] lifecycle_1.0.3   vctrs_0.6.4       data.table_1.14.8 evaluate_0.22    
[37] glue_1.6.2        farver_2.1.1      fansi_1.0.5       colorspace_2.1-0 
[41] purrr_1.0.2       rmarkdown_2.25    ellipsis_0.3.2    tools_4.3.1      
[45] pkgconfig_2.0.3   htmltools_0.5.6.1

References

Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” https://arxiv.org/abs/1403.2805.
Wickham, Hadley. 2023. “Httr: Tools for Working with URLs and HTTP.” https://CRAN.R-project.org/package=httr.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. “Dplyr: A Grammar of Data Manipulation.” https://CRAN.R-project.org/package=dplyr.