Wind data in R, part I: more than one way to peel a potato

[This article was first published on The Manipulative Gerbil: Playing around with Energy Data, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last month, Christian Graul re-published “bReeze: Functions for Wind Resource Assessment” on CRAN. This is wonderful news: it gives R users a chance at the functionality which Python has in windpowerlib. I admit I am a bit late to the party–the original version was published way back in 2018, but there’s no better time than the present to write a new blog post to look at how R users can better understand wind potential tied to specific locations. I’m going to divide this post into three parts:
  1. A minimal introduction to wind power
  2. A quick attempt at downloading the data from NASA POWER, my favourite data source, and calculating the wind energy output
  3. An attempt to understand how bReeze would do it , and to compare the results with my reverse engineering approach
  4. A generalist round-up on wind power
If you already understand the basics of how a wind turbine generates electricity, then you can just forego the first part and start here . In the main though, I am going to try to reverse engineer a now-available CRAN package and try to show how we can come up with generally similar results writing our own functions. In the next installment, I will explore reasons for the differences in output between our own results and those obtained by the bReeze package when trying to understand the total electricity generated by a wind turbine.

Some introductory remarks on wind power

If you think about it, wind power is really just solar power in another form: since solar energy is unevenly distributed across the globe, we have an unequal distribution of thermal energy. This in turn drives the transfer of heat from one part of the earth to another, something we call wind. The earth’s rotation about its axis is another helping factor here, but in a nutshell this is what happens. So how does a wind turbine turn that into electricity?
If you have a volume of air of moving through the “swept area” (i.e., the area of its rotor blades as they rotate) of a wind turbine, let’s call it A and that wind has a density ρ while moving at a speed v , then it’s quick enough to calculate how much kinetic energy that wind flow is carrying:

The factor of 50%, as well as the cubic term in wind speed is a leftover from how kinetic energy is defined in physics. Essentially, this is energy which the wind has just by virtue of moving, and we can use it to make a turbine go around. Importantly, if you want the wind to keep flowing even after you extract some of it to drive a wind turbine, then you will eventually hit the “Betz Limit”–Wikipedia actually has a fairly good treatment of the calculus behind this–which is roughly ~50% of the wind energy. This is important because the design of wind turbines takes this limit into account–the wind turbines used for electricity will work under a specific regime of conditions.
For our purposes, we are only interested in a few salient facts about any given wind turbine. First, we want to know what the height of the hub (location of the nozzle) of the turbine is. Since a wind turbine might typically have a hub height at 100 m or more, a practical way of relating the wind speed at the hub height to the wind speed as measured at something more reasonable–say, 10 m elevation–is needed. The way to do this is quick and easy although it depends on the surrounding terrain:

Here, \alpha is the “Hellman exponent” where the turbine is located. Below, I use a simplified approach and make it 1/7 for land-based and 1/9 for offshore wind turbines. Note that the Hellman exponent is dependent on the “surface roughness” based on the location of the turbine. Most people would probably do the calculation using the surface roughness, but I am just trying to simplify this function. One important thing to remember is that the increase in wind speed as you go higher happens more rapidly in areas with jagged features–like a city–than in areas which are flat. Next, we want to know how much electricity is generated by a wind turbine if we know the location-specific wind speed at a given time. Wind turbine manufacturers provide this data in what is called the “power curve” of the wind turbine. Here’s what the power curve of the Enercon E126 looks like.
Figure 1:

Before moving on to the section where I talk about using NASA POWER data to model what the output of the Enercon E126 would be, I just wanted to tie in our original equation above which quantifies the amount of kinetic energy of the wind and the electricity generated by a turbine. Let’s take the same wind speeds from the power curve, and plug them into the expression for the kinetic energy in the wind. We can calculate the ratio of electricity output from the wind turbines

Our little attempt: hourly wind speeds

The first thing I’m going to do here is to get an idea of what the wind turbine we are looking at will do. After installing and loading “bReeze”, I am going to find the power curve associated with the Enercon E126 wind turbine. This is a turbine which has a hub height of 135 m. It has a “cut-in” wind speed, below which it does not generate electricity, of 3 m/s and a “cut-out” wind speed, higher than which it is turned off, of 25 m/S.
Incidentally, this is what the lovely Enercon E-126 looks like.

#We want the correct power curve here
pow_curve = bReeze::pc("Enercon_E126_7.5MW.pow")
pow_curve_df = data.frame(cbind(pow_curve$v, pow_curve$P))
colnames(pow_curve_df) = c("speed", "power")

In the next code block, I write a quick function which takes in a set of parameters: windspeed; the cut-in and cut-out windspeeds; the surface roughness at the location; the hub height; and a power curve. The returned value will be a vector containing the power output of the wind turbine expressed in kW.
calculate_wind_outputs <-function(windspeeds, measured_height, hub_height, power_curve, cutin_speed, cutout_speed, terrain)
  #First, we calculate what the wind speed would be at the hub height of the given turbine
  #We can use a simple linear adjustment and just increase the speeds by the correct amount
  if(terrain == "onshore")
    hellman_exponent = 1/7
  else if(terrain == "offshore")
    hellman_exponent = 1/9
  adjustment_factor = (hub_height/measured_height)^hellman_exponent
  adjusted_wind_speeds = windspeeds*adjustment_factor
  #Let's create a vector which has the same length as the wind speeds
  power_output = adjusted_wind_speeds
  for(i in 1:length(power_output))
    if(adjusted_wind_speeds[i] > cutin_speed & adjusted_wind_speeds[i] < cutout_speed)
      for(j in 1:nrow(power_curve))
        #Just make sure that this is working
        if(adjusted_wind_speeds[i] >= power_curve$speed[j] & adjusted_wind_speeds[i] < power_curve$speed[j+1])
          #We want the power output to match the power curve so long as the wind speed is the same as or greater than
          # the power curve's predicting variable and below the one above
          power_output[i] = power_curve$power[j]
    #You get no power output if your wind speed is either below the cut-in or above the cut-out speed
    else if(adjusted_wind_speeds[i] < cutin_speed | adjusted_wind_speeds[i] >= cutout_speed)
      power_output[i] = 0 
  #In keeping with good practice--a function should return only one variable

How do I know that this worked? If you take the same default values from the power curve, plug them into our function, and then plot the result, you should get the same plot as in Figure 1. A quick way to do that is shown below.
Another thing worth looking into is the power coefficient of the wind. This is defined as the ratio of “energy in the wind/electric energy generated by turbine” and in the next code block, I plot this value. You should find that the maximum value is just below the Betz Limit and coincides with a wind speed which is right at the plateau of the “rated power” of the wind turbine. An important side note–you can generally find the diameter of a wind turbine by looking up the manufacturer’s website, like I did. You might want to look out for the fact that wind turbines from the same manufacturer sometimes have confusingly similar names and also slightly different specifications (for whatever reason, this seems to happen when a single manufacturer markets products to both the North American and European markets). In this case, it shouldn’t be an issue.
#The power coefficient of the wind is the ratio of 
# wind power electricity to the power in the wind 
area_of_turbine = pi*(0.5*127)^2
wind_sppeds = pow_curve_df$speed
density_of_air = 1.225

#Keep in mind that the expression for kinetic energy was in J, not kWh
#So we first multiply by 1/1000
#Notice though that we are assuming both that the energy is constant
#throughout the hour and therefore that the wind turbine keeps going 
kinetic_energy_in_wind = 0.001*0.5*area_of_turbine*density_of_air*wind_sppeds^3
power_coefficient = pow_curve_df$power/kinetic_energy_in_wind

Something to notice is that the maximum power coefficient (~50%) comes in at around 10 m/s (wind speed at hub height), which is when the turbine is only generating 3.75 MW of power, or about 50% of its capacity. The thing to note here is where the sweet spot of 100% capacity kicks in, which for our case is between 16 m/s and 25 m/s. I will explain a little more about this in the final section of this blog post.

Finally, let’s use some real-world data and our function to give us a sort of back-of-the-envelope calculation of what electricity generated from a single E-126 turbine would look like. I wanted to make the next bit comparable to the default values in the bReeze user guide, so I borrowed some coordinates from roughly the area where the developers of that package live. These are used below to download a year’s worth of data.
Note: my operating assumption below is that the wind data which comes bundled with bReeze is from Neubuerg in Germany, and hence the choice of coordinates. You might also want to look up more on nasapower , my favourite of all time R package.
#The first thing we do is to download the hourly wind data from NASA POWER
#We want to get the wind speeds at both 10m and 50m
hourly_wind_data = nasapower::get_power(community = "re", pars = c("WS10M", "WS50M"), temporal_api = "HOURLY", 
                                        lonlat = c(11.1421496, 48.7716291), dates = c("2008-12-31", "2010-01-01"))

#As usual, we want to make sure we have a dataframe, which is the easiest way to work with this kind of thing in R
#The original data download is a tibble, with some metadata which we can discard here
hourly_wind_data = data.frame(hourly_wind_data)

#Put it through the works for the Enercon E-126
#I'm going to use the windspeeds from the 50m hub height, for fun
#Please note that as a decent member of the human race, I only use metric/SI units if at all possible
hourly_wind_energy = calculate_wind_outputs(windspeeds = hourly_wind_data$WS50M, measured_height = 50, hub_height = 135, cutin_speed = 3, cutout_speed = 25, power_curve = pow_curve_df, terrain = "onshore")
#What kind of capacity factor does this give us?
#Convert to MWh
#Then take into account that the total possible energy generated is 7.5 MW * 8760 hours in a year
capacity_factor_rough = 0.001*sum(hourly_wind_energy)/(7.5*8760)

Working through the code below, we find that the annual capacity factor for our selected turbine at our randomly selected is about 17%. Does bReeze give a much different answer?

How bReeze would do this, and a quick comparison

At its heart, bReeze is a package which does something really quite similar–it takes some data on wind speeds, defines the mast/hub height of a turbine and its power curve, and gives us values for annual power output. True to form for an R package however, it also lends itself to some statistical analyses of the wind profiles of specific locations. One thing about bReeze which makes sense when you think about it is that, being aimed at specialists, users really need to understand a lot of the moving parts of wind power before they can meaningfully use this particular R package.
First, let’s load some of the data bundled with bReeze and takea look at what it contains
#How to calculate it in bReeze
#We already did this using our manual method but we will leave out the re-naming of the columns
# and the conversion to a DF--bReeze uses these objects as they are
pow_curve = bReeze::pc("Enercon_E126_7.5MW.pow")
#Let's also get the wind speed data which is bunlded with bReeze
data("winddata", package = "bReeze")
data_from_breeze = winddata

One thing you will notice is that this data contains much more than just the wind speeds,but also the wind directions. You will see that these are given in units of degrees, which can be easily converted to cardinal directions, for example using this conversion here . Another interesting point, which will come up when we move on to the Weibull distribution (the next blog post, I hope!) is that this data comes not only with average wind speeds within a time slot, but also the standard deviation; so we know not only how windy it is, but can infer things like turbulence in the wind and other measures of variability. Lasly, the data bundled with bReeze covers 10-minute time intervals covering 270 days, from May, 2009 to the end of January, 2010. This might seem odd, but probably chimes with the real-world availability of actual windspeed measurements using ground-based equipment (which may or may not be more reliable than NASA’s satellite-based readings).
#To calculate the annual energy output from bReeze, we would simply run the following code
annual_energy_from_breeze = bReeze::aep(profile = our_wind_profile, hub.h = 135, pc = pow_curve)
#You can see that we are keeping that hub height and power curve from last time
# but we also now need to create a wind profile, shown below

#Our first sub-task is going to be to extract the wind speeds and standard deviations measured at 40 m
measured_40_speed = data_from_breeze$v1_40m_avg
measured_40_std = data_from_breeze$v1_40m_std
direction_from_breeze_set = data_from_breeze$dir1_40m_avg
#We combine the average wind speed, standard deviation of that wind speed and
# its direction into a "set"
set40m = bReeze::set(height = 40, v.avg = measured_40_speed, v.std = measured_40_std, dir.avg = direction_from_breeze_set)
#The next thing to do is to build a "mast" object, which combines a set of wind data
# together with timestamps
#Note that for this to work, we need to format the timestampes correctly
#For the sake of completeness, I am going to give the timestaps a CET timezone
timestamp_from_breeze = bReeze::timestamp(data_from_breeze$date_time, tz = "CET")
mast_for_breeze = bReeze::mast(timestamp = timestamp_from_breeze, set40m)
#So now we have a mast object which ties together a set of wind speed data to date time elements
# We build a "wind profile" which is correctly formatted to use with our function in bReeze
our_wind_profile = bReeze::windprofile(mast = mast_for_breeze, v.set = 1, dir.set = 1)
#Note that I have chosen to use the first (and only) wind speed and direction sets
#Now let's go back to our earlier line of code
annual_energy_from_breeze = bReeze::aep(profile = our_wind_profile, hub.h = 135, pc = pow_curve)
#We can now access the capacity factor--18%

If you run this code, you get an increase of the capacity factor from about 17.3% to 18.6%. At first glance this might seem unimportant, but it is in fact a significant relative increase. In my next blog post, I am going to explore two different sources for this discrepancy: the actual source of the data and the statistical “Weibull distribution” which bReeze uses to model the wind speeds throughout the year. That’s for a later discussion though!

Some concluding notes on choosing wind turbines

In a previous blog post , I described how to use R to graphically map the output of solar energy across specific regions, tying geography with solar insolation and electricity output. Similarly, electricity from the wind is closely tied to geography. How this ties into a specific wind turbine design is to consider where the rated power–when the output matches the “nameplate” or maximum performance of the turbine–and how that compares with the wind profile of the selected location. So if you have a location where the wind is not likely to ever be within the range of rated wind speed (when it first reaches the rated power, in our case 16 m/s at hub height) and the cut-out speed (25 m/s in our case), then you’ve probably not found the right wind turbine yet. At another level, you would also want to consider the expected power coefficient as well.
One really important aspect of this which I’ve left out is the question of wind “shading”–just like with solar panels having the potential to cast a shadow over panels in the following row, wind turbines can extract too much energy from other turbines which sit “down wind” from there, thus “shading”. Additional factors also play in to the layout and siting of a wind farm, including for example the distance between turbines and the distance to a power interconnection.
I am going to leave out wind farm design optimization from this post, but you might be interested in some of the reading notes below.

Data and some reading

  1. The Canadian Wind Energy Atlas–although I think their geographical resolution is a bit over the top. You will need an account setup.
  2. …not to be confused with the Canadian Wind Turbine Database , where you can download an Excel spreadsheet letting you know exactly how many MW of capacity are installed, and where. I really like this dataset.
  3. Some specialist reading on optimization of wind farm layouts
  4. Another specialist paper on the same topic, this time from NREL in the US
  5. Unless otherwise attributed, all my images are taken from sites using a Creative Commons license.
To leave a comment for the author, please follow the link and comment on their blog: The Manipulative Gerbil: Playing around with Energy Data. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)