This is a short post for my students in the CUNY MS Data Analytics program on sketching curves in R.
Suppose we want to find the derivative of $f(x) = \frac{-4x^5 + 3x^2 - 5}{x^2}$. In addition to computing the derivative analytically, it might be interesting to graph this function to see what it looks like. When graphing a function, I like to generate a sequence of x values and then pass it to the function. Since R is vectorized, there is no need to write a loop: for vectors (aka sequences), arithmetic operators work element-wise. In other words, they behave like the higher-order function map.
    # Define f, evaluate it on the integers from -10 to 10, and draw a line plot
    f <- function(x) (-4*x^5 + 3*x^2 - 5) / x^2
    xs <- -10:10
    plot(xs, f(xs), type='l')
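To make the vectorization point concrete, here is a minimal sketch that compares the vectorized call with an explicit element-by-element application. The use of sapply and the variable names ys_vectorized and ys_mapped are illustrative choices, not something the plot above depends on.

```r
# Same function as in the post.
f <- function(x) (-4*x^5 + 3*x^2 - 5) / x^2

xs <- -10:10

# Vectorized call: the arithmetic operators act on every element at once.
ys_vectorized <- f(xs)

# Explicit element-by-element application, shown only for comparison.
ys_mapped <- sapply(xs, f)

# Both approaches should agree, including the non-finite value at x = 0.
isTRUE(all.equal(ys_vectorized, ys_mapped))
```

The vectorized form is shorter and avoids the per-element function call overhead, which is why it is the usual idiom in R.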
This gives us the graph below. You may notice that there is a big gap between -1 and 1. Why is this? The short answer is that $f$ is undefined at $x = 0$. While this is correct, the slightly longer answer is that integer spacing of the x values is too coarse. The line breaks at the non-finite value at 0, and the nearest neighboring points are -1 and 1, so the gap is 2 units wide and is not representative of the actual function. This is important to remember, as an inappropriate scale can easily lead to misleading results.
Let’s try again with a spacing of 0.1. What’s the best way to do this? If we want to keep using the syntactic sugar of the : operator, which only steps by 1, then we need to scale the interval ourselves. For our example the scaling is easy. For the more general case, what is the best way to model the scaling? For now, here are two equivalent alternatives for our example.
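    # Option 1: build an integer range, then scale it down by 10
    xs <- (-100:100) / 10
    # Option 2: let seq generate the steps directly
    xs <- seq(-10, 10, by=0.1)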
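Note that in the first form the parentheses are not strictly required: in R, unary minus binds more tightly than :, which in turn binds more tightly than /, so -100:100 / 10 produces the same sequence. They are still worth keeping, because the precedence of : is easy to misremember and it does matter with + and -. For example, 1:n-1 means (1:n) - 1, not 1:(n-1).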
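When in doubt about how R parses a range expression, it is easy to check directly. The following sketch compares the parenthesized and unparenthesized forms; the names a, b and n are only for illustration.

```r
# Unary minus binds tighter than `:`, and `:` binds tighter than `/`.
a <- (-100:100) / 10
b <- -100:100 / 10
identical(a, b)   # TRUE if both expressions parse the same way

# Binary minus binds more loosely than `:`, so here the parentheses matter.
n <- 5
with_default_precedence <- 1:n-1     # parsed as (1:n) - 1
with_parentheses        <- 1:(n-1)   # the range from 1 to n - 1

with_default_precedence
with_parentheses
```

Writing the parentheses explicitly costs nothing and removes the need to remember the precedence table.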
Working with this interval, we get a more precise representation of the function. However, I still have this uneasy feeling that I don’t really know what this function looks like near 0. Let’s “zoom” into 0 by increasing the resolution by another order of magnitude. At a spacing of 0.01, this function looks very different from what we started with.
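One way to see why the picture changes so much with resolution is to evaluate f at the sample point closest to 0 for each spacing. This is a rough sketch under that framing; the vector name steps is an illustrative choice.

```r
f <- function(x) (-4*x^5 + 3*x^2 - 5) / x^2

# The smallest positive x value sampled at each spacing is the spacing itself.
steps <- c(1, 0.1, 0.01)

# Value of f at the sample nearest to 0 for each spacing.
nearest_values <- f(steps)

data.frame(spacing = steps, f_nearest_zero = nearest_values)
```

Each tenfold refinement makes the value nearest to 0 roughly a hundred times larger in magnitude, since the term -5/x^2 dominates there. That value sets the vertical range of the plot, which flattens everything else.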
- Write a function that takes an integer sequence and scales it to a given precision. For example, given the sequence -5:5, write a function s such that s(-5:5, 0.1) returns the sequence c(-5.0, -4.9, -4.8, …, 4.9, 5.0). Do not use the seq function in your answer.
- Reproduce the graph of f within the domain [-4, 4] and precision 0.2 using the function above to generate the x values.
Composing the derivative
What do these graphs tell us about the derivative? It appears mostly well-behaved except when $x = 0$. It’s straightforward to use the product rule to compute the derivative. Here we let $g(x) = -4x^5 + 3x^2 - 5$ and $h(x) = x^{-2}$, so that $f(x) = g(x)h(x)$. Why use the product rule instead of the quotient rule? It’s really a matter of style. Personally, it’s easier for me to remember fewer rules.
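For reference, here is the product rule worked through for this function. It assumes the decomposition into the polynomial numerator $g$ and the factor $h(x) = x^{-2}$, matching the R definitions used in this post.

```latex
\begin{aligned}
f(x)  &= g(x)\,h(x), \qquad g(x) = -4x^5 + 3x^2 - 5, \quad h(x) = x^{-2} \\
f'(x) &= g'(x)\,h(x) + g(x)\,h'(x) \\
      &= (-20x^4 + 6x)\,x^{-2} + (-4x^5 + 3x^2 - 5)\,(-2x^{-3}) \\
      &= -20x^2 + 6x^{-1} + 8x^2 - 6x^{-1} + 10x^{-3} \\
      &= -12x^2 + 10x^{-3}
\end{aligned}
```

The $6x^{-1}$ terms cancel, which leaves a polynomial part and a single term that blows up at 0.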
Returning to the original motivation for this discussion, the question is whether these curves can shed any light on the behavior of the derivative for this function. Now that we’ve deconstructed $f$ into $g$ and $h$, what do these two functions look like?
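    # The two factors of f: the polynomial numerator g and the reciprocal square h
    g <- function(x) -4*x^5 + 3*x^2 - 5
    h <- function(x) x^-2
    xs <- seq(-2, 2, by=0.02)
    plot(xs, g(xs), type='l')       # g in black
    lines(xs, h(xs), col='blue')    # h in blue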
As expected, $g$ looks like a classic odd-order polynomial while $h$ behaves according to a negative exponent. What might be surprising is that $h$ is positive for all $x \neq 0$. The function $f$ goes to negative infinity at 0 because $g$ is slightly negative at 0 ($g(0) = -5$), which is not obvious when first glancing at this graph.
What else does this graph tell us? It is useful to remember that the original function is the product of these two functions. A first observation is that around , begins to grow at a rate much faster than shrinks. Similarly, when it appears that grows very fast. Graphically then, it seems that the derivative of is going to be dominant when
Let’s write functions to represent the first derivatives of $g$ and $h$ and overlay them as dotted lines onto the graph.[1]
    # First derivatives of g and h, overlaid as dotted lines (lty=3)
    g1 <- function(x) -20*x^4 + 6*x
    h1 <- function(x) -2 * x^-3
    lines(xs, g1(xs), lty=3)
    lines(xs, h1(xs), col='blue', lty=3)
Now things are getting interesting. It takes a bit more effort to picture what the derivative of f looks like given these four curves. From a graphical perspective the product rule tells us to sum the product of the dotted black line and the solid blue line with the product of the dotted blue line and the solid black line.
To make it easier, here are the two products (g’h in orange and h’g in brown) along with the sum, which of course is f’ (in black).
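The sum of the two products can also be checked numerically, without drawing anything. The sketch below compares the product-rule expression against a central-difference approximation of f'. The helper num_deriv, its step size eps and the test points are assumptions made for this illustration.

```r
g  <- function(x) -4*x^5 + 3*x^2 - 5
h  <- function(x) x^-2
g1 <- function(x) -20*x^4 + 6*x     # derivative of g
h1 <- function(x) -2 * x^-3         # derivative of h
f  <- function(x) g(x) * h(x)

# Product rule: f' = g'h + gh'
f1 <- function(x) g1(x) * h(x) + g(x) * h1(x)

# Central-difference approximation of a derivative (illustrative helper).
num_deriv <- function(fn, x, eps = 1e-6) (fn(x + eps) - fn(x - eps)) / (2 * eps)

# Test points chosen away from 0, where f is undefined.
xs <- c(-2, -1, -0.5, 0.5, 1, 2)

# Largest disagreement between the two; it should be very small.
max(abs(f1(xs) - num_deriv(f, xs)))
```

A check like this is a cheap way to catch a sign error in a hand-computed derivative before spending time on the plot.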
Decomposing a function into smaller functions can be a useful exercise when looking to assess the relative impact of the constituent functions. Working from the opposite direction, it can also help in function approximation. Usually it is easier to build up a complex function from smaller functions rather than starting with a complicated function. I will explore this idea in a future post.
- Reproduce the last graph
The derivative and constants
Here is another example of using graphs to help illuminate the behavior of functions. Let’s look at why a constant has a derivative of 0. Consider the function . The derivative of this function is . What about the function ? Intuitively we wouldn’t expect the derivative to be any different. In fact, since the derivative is linear, the derivative is simply . We can apply this same logic to itself and deconstruct this function into and . Hence, any constant added to a function has no effect on the derivative.
Graphically, it is easy to see that the derivative of is the same as since the shift is merely a linear combination of the and . Why is the derivative of (a constant function in x) 0? A constant value is telling us that the function is independent of . Consequently, any change in has no effect on the constant function. Therefore the derivative of the function with respect to should be 0 everywhere. This observation also helps illustrate partial derivatives in multivariate calculus and why certain terms become 0.
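A quick numerical check makes the vertical-shift argument concrete. The function p, the constant k and the helper num_deriv below are hypothetical choices for this sketch, not the example used above.

```r
# Hypothetical example function and a vertically shifted copy of it.
p <- function(x) x^2
k <- 3
p_shifted <- function(x) p(x) + k

# Central-difference approximation of a derivative (illustrative helper).
num_deriv <- function(fn, x, eps = 1e-6) (fn(x + eps) - fn(x - eps)) / (2 * eps)

xs <- seq(-2, 2, by = 0.5)

# The two derivative estimates should agree up to rounding error.
max(abs(num_deriv(p, xs) - num_deriv(p_shifted, xs)))

# The constant on its own has a derivative of 0 everywhere.
constant_fn <- function(x) rep(k, length(x))
num_deriv(constant_fn, xs)
```

The constant cancels in the difference fn(x + eps) - fn(x - eps), which is the numerical counterpart of the argument that a constant does not depend on x.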
While vertical shifts have no effect on the derivative, horizontal shifts do. Why is this? Simply put, it is because a horizontal shift modifies each value, so this change is dependent on . A student asked whether the derivative is constant if you account for the horizontal shift. How do we answer this? Let’s define a modified function . If we take the partial derivative of with respect to we get . Indeed, if we think about how the chain rule works, each instance of (x-n) will be preserved such that .
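The same kind of check works for a horizontal shift. In this sketch the function p, its derivative p1, the shift n and the helper num_deriv are all hypothetical choices. The point is only that the derivative of the shifted function is the original derivative evaluated at (x-n).

```r
# Hypothetical example function, its derivative, and a horizontal shift.
p  <- function(x) x^2
p1 <- function(x) 2*x
n  <- 1.5
q  <- function(x) p(x - n)

# Central-difference approximation of a derivative (illustrative helper).
num_deriv <- function(fn, x, eps = 1e-6) (fn(x + eps) - fn(x - eps)) / (2 * eps)

xs <- seq(-2, 2, by = 0.5)

# Derivative of the shifted function matches p1 evaluated at (x - n).
error_shifted_derivative <- max(abs(num_deriv(q, xs) - p1(xs - n)))

# It does not match p1 evaluated at x, so the shift does change the derivative.
error_unshifted_derivative <- max(abs(num_deriv(q, xs) - p1(xs)))

c(error_shifted_derivative, error_unshifted_derivative)
```

The first value should be near 0 and the second clearly is not, which is the sense in which each instance of (x-n) is preserved by the chain rule.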
[1] See this handy reference for plot styles