
Section 2.5 The Chain Rule

We’ve been building up intuition and rules for differentiating various functions and combinations of functions. We can find derivatives of scaled functions, sums of functions, differences of functions, products of functions, and quotients of functions.
In this section, we’ll look at our last operation between functions: composition.

Subsection Composition and Decomposition

An important part of finding derivatives of products and quotients is identifying the component functions that are being multiplied/divided (often labeled \(u(x)\) or just \(u\) and \(v(x)\) or just \(v\)). From there, we find the derivatives of each of the component functions, and then use the formula from the Product Rule or Quotient Rule to put the pieces together.
Thinking about derivatives of composed functions will be the same: we’ll need to identify what functions are being composed inside of other functions, and use those pieces in some formulaic way to represent the derivative. On that note, let’s remind ourselves and practice working with composition (and decomposition) of functions.

Activity 2.5.1. Composition (and Decomposition) Pictionary.

This activity will involve a second group, or at least a partner. We’ll go through the first part of this activity, and then connect with a second group/person to finish the second part.
(a)
Build two functions, calling them \(f(x)\) and \(g(x)\text{.}\) Pick whatever kinds of functions you’d like. This activity works best when the functions sit in a sweet spot between “simple” and “complicated,” but don’t overthink it.
(b)
Compose \(g(x)\) inside of \(f(x)\) to create \((f\circ g)(x)\text{,}\) which we can also write as \(f\left(g(x)\right)\text{.}\)
(c)
Write your composed \(f\left(g(x)\right)\) function on a separate sheet of paper. Do not leave any indication of what your chosen \(f(x)\) and \(g(x)\) are. Just write your composed function by itself.
Now, pass this composed \(f\left(g(x)\right)\) to your partner/a second group.
(d)
You should have received a new function from some other person/group. It is different than yours, but also labeled \(f\left(g(x)\right)\) (with different choices of \(f(x)\) and \(g(x)\)).
Identify a possibility for \(f(x)\text{,}\) the outside function in this composition, as well as a possibility for \(g(x)\text{,}\) the inside function in this composition. You can check your answer by composing these!
(e)
Write a different pair of possibilities for \(f(x)\) and \(g(x)\) that still will give you the same composed function, \(f\left(g(x)\right)\text{.}\)
(f)
Check with your partner/the second group: did you identify the pair of functions that they originally used?
Did whoever you passed your composed function to correctly identify your functions?
A big thing to notice here is that when we pick the functions that we think were composed, there’s not a single correct answer. This is pretty different from, say, using the Quotient Rule: in a quotient, we have a natural division (literally) between the pieces. Here, the choice of an “inside” function and an “outside” function is much more subjective.
We will build up our intuition to find a good balance for how we pick these.
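To see concretely that a decomposition is not unique, here is a minimal sketch in Python. The composed function \(h(x)=(3x+1)^2\) and the names `h`, `f1`, `g1`, `f2`, `g2` are made up for illustration and are not part of the activity. It checks that two different inside/outside pairs rebuild the same composed function.

```python
# Hypothetical composed function, chosen only for illustration: h(x) = (3x + 1)^2.
def h(x):
    return (3 * x + 1) ** 2

# Decomposition 1: inside g(x) = 3x + 1, outside f(u) = u^2.
def g1(x):
    return 3 * x + 1

def f1(u):
    return u ** 2

# Decomposition 2: inside g(x) = 3x, outside f(u) = (u + 1)^2.
def g2(x):
    return 3 * x

def f2(u):
    return (u + 1) ** 2

# Both pairs give back the same composed function at every sample input.
sample_inputs = [-2, -1, 0, 1, 5]
for x in sample_inputs:
    assert f1(g1(x)) == h(x)
    assert f2(g2(x)) == h(x)
```

Either pair is a legitimate answer to “what are \(f\) and \(g\)?”, which is why the choice of inside function is a judgment call.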

Subsection The Chain Rule, Intuitively

Before we build the Chain Rule for differentiating composed functions, we should talk about some notation. Earlier (in Notation for Derivatives) we talked about the derivative notation, \(\dydx\text{.}\) One of the things we mentioned is that while we know that the derivative is an instantaneous rate of change, this notation is helpful to tell us what is changing with regard to what.
In \(\dydx\text{,}\) we are calculating how much the \(y\)-variable changes when \(x\) increases. If we talked about \(\dd{f}{t}\text{,}\) then we are discussing how much \(f\) changes for an increase in \(t\text{,}\) whatever these variables represent.

Activity 2.5.2. Gears and Chains.

Let’s think about some gears. We’ve got three gears, all different sizes. But the gears are linked together, and a small motor works to spin one of the gears. Since the gears are linked, when one gear spins, they all spin. But since they are different sizes, they complete a different number of revolutions: the smaller ones spin more times than the larger ones, since they have a smaller circumference.
For our purpose, let’s say that Gear A is being driven by the motor.
(a)
Let’s try to quantify how much “faster” Gear B is spinning compared to Gear A. How many revolutions does Gear B complete in the time it takes Gear A to complete one revolution?
(b)
Now quantify the speed of Gear C compared to its neighbor, Gear B. How many revolutions does Gear C complete in the time it takes Gear B to complete one revolution?
(c)
Use the above relative “speeds” to compare Gear C and Gear A: how many revolutions does Gear C complete in the time it takes Gear A to complete one revolution?
More importantly, how do we find this?
(d)
Now, let’s translate this into some derivative notation: we’ve really been finding rates at which one thing changes (the speed of the gear spinning) relative to another’s.
Call the speed of Gear B compared to Gear A: \(\dd{B}{A}\text{.}\) Now, call the speed of Gear C compared to Gear B: \(\dd{C}{B}\text{.}\) Come up with a formula to find \(\dd{C}{A}\text{.}\)
So what we need to do now is somehow translate this intuitive idea of multiplying rates of change to build a strategy for thinking about derivatives of composed functions.
We can think of these linked gears as functions: Gear C changes based on what is happening with Gear B, which changes based on Gear A. We can translate Gear A to be an input variable, like \(x\text{.}\) Then Gear B is a function based on that: we can call it \(g(x)\text{.}\) Then Gear C is a function that takes in the position of Gear B (the function \(g(x)\)), and so we can think of it as \(f(g(x))\text{.}\)
To build the derivative rule for composite functions, we need to find how the “outside” function changes as the “inside” function changes (\(\dd{C}{B}\) in this case) and multiply that by how the “inside” function changes as the input variable changes (\(\dd{B}{A}\) here).
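The idea of multiplying linked rates can be sketched numerically. The tooth counts below are assumed values chosen only for illustration (the activity’s actual gear sizes are not used here), and `revolutions_per_revolution` is a hypothetical helper name. For meshed gears, a gear with fewer teeth completes proportionally more revolutions.

```python
from fractions import Fraction

# Assumed tooth counts, for illustration only.
teeth = {"A": 40, "B": 20, "C": 10}

def revolutions_per_revolution(driven, driver):
    """Revolutions the driven gear completes during one revolution of the driver."""
    return Fraction(teeth[driver], teeth[driven])

dB_dA = revolutions_per_revolution("B", "A")  # speed of B relative to A
dC_dB = revolutions_per_revolution("C", "B")  # speed of C relative to B

# Linking the two rates: multiply them to compare C directly with A.
dC_dA = dC_dB * dB_dA

# The product agrees with comparing C to A directly.
assert dC_dA == revolutions_per_revolution("C", "A")
```

With these assumed sizes, Gear B turns twice per turn of Gear A, Gear C turns twice per turn of Gear B, and so Gear C turns four times per turn of Gear A.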

Subsection Doing is Different than Knowing

It is lovely to know that the Chain Rule is really just linking the two rates of change together to connect a function with an input variable through a middle processing function. That’s great!
But doing the Chain Rule is different than just knowing it, so let’s walk through a first example. Let’s find the following derivative:
\begin{equation*} \ddx{\sin(x^2)} \end{equation*}
We’ll call the “inside” function \(u=x^2\text{,}\) so we can really write the whole function (normally, we’re calling this \(y\)) as \(y=\sin(u)\text{.}\)
\begin{align*} \ddx{\underbrace{\sin(\overbrace{x^2}^{u})}_{y}}\amp =\dd{y}{u} \cdot \dd{u}{x}\\ \amp = \frac{d}{du}\left(\sin(u)\right) \cdot \frac{d}{dx}\left(x^2\right) \end{align*}
What we can notice, here, is that \(\sin(u)\) is just a function of some variable \(u\text{,}\) and we want to find \(\dd{y}{u}\text{,}\) the rate at which \(y=\sin(u)\) changes with regard to its input variable. This might feel a bit strange, since \(u\) isn’t just an input variable: it means something, since we have that \(u=x^2\text{.}\) This is fine! The extra \(\dd{u}{x}\) that we multiply will take care of linking this derivative to the input variable \(x\text{.}\)
\begin{align*} \ddx{\underbrace{\sin(\overbrace{x^2}^{u})}_{y}}\amp = \frac{d}{du}\left(\sin(u)\right) \cdot \frac{d}{dx}\left(x^2\right)\\ \amp = \cos(u) \cdot 2x\\ \amp = \cos(x^2)\cdot 2x \\ \amp = 2x\cos(x^2) \end{align*}
After we finished differentiating \(\frac{d}{du}\left(\sin(u)\right)\text{,}\) you’ll notice that we used the fact that \(u=x^2\) to write our combination of derivatives (the derivative of the “outside” function and the derivative of the “inside” function) in terms of the same input variable again.
The last line, rewriting \(\cos(x^2)\cdot 2x \) as \(2x\cos(x^2)\text{,}\) is just for aesthetics.
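One way to sanity-check a Chain Rule result is to compare it with a numerical slope. The sketch below is not part of the derivation: `central_difference` is a helper defined here, and the step size `h` and the sample points are arbitrary choices. It compares \(2x\cos(x^2)\) against an approximate slope of \(\sin(x^2)\).

```python
import math

def y(x):
    # The composed function from the example: sin(u) with u = x^2.
    return math.sin(x ** 2)

def dy_dx(x):
    # Chain Rule result from the example: cos(u) * 2x with u = x^2.
    return 2 * x * math.cos(x ** 2)

def central_difference(func, x, h=1e-6):
    # Approximate slope of func at x using points just left and right of x.
    return (func(x + h) - func(x - h)) / (2 * h)

for x in [-1.5, 0.0, 0.7, 2.0]:
    approx = central_difference(y, x)
    exact = dy_dx(x)
    assert math.isclose(approx, exact, rel_tol=1e-6, abs_tol=1e-6)
```

Agreement at a handful of points does not prove the formula, but a mismatch is a quick signal that a factor such as \(\dd{u}{x}\) was dropped.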
Now you’re ready to try some more examples! In each, focus on identifying a natural selection for the “inside” function, \(u\text{.}\)

Example 2.5.2.

Use the Chain Rule to differentiate the following:
(a)
\(\Ddx{\sqrt{x^2+4}}\)
Hint.
Notice that \(x^2+4\) is composed under the square root. Use \(u=x^2+4\text{.}\)
(b)
\(\Ddx{e^{\tan(x)}}\)
Hint.
Try letting \(u=\tan(x)\text{,}\) since it’s composed inside the exponent of the exponential function.
(c)
\(\Ddx{\sin^5(x)}\)
Hint.
You could think about this as \(\Ddx{\sin(x)\sin(x)\sin(x)\sin(x)\sin(x)}\) and try to use a very annoying product rule, but it might be easier to think about this as \(\Ddx{(\sin(x))^5}\text{.}\)
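After working these by hand, you can test a candidate answer numerically. The sketch below assumes one possible set of answers obtained by following the hints: \(\frac{x}{\sqrt{x^2+4}}\text{,}\) \(e^{\tan(x)}\sec^2(x)\text{,}\) and \(5\sin^4(x)\cos(x)\text{.}\) The helper names, step size, and sample points are illustrative choices, not part of the example.

```python
import math

def central_difference(func, x, h=1e-6):
    # Approximate slope of func at x using points just left and right of x.
    return (func(x + h) - func(x - h)) / (2 * h)

# Each entry pairs a function with a candidate derivative found with the Chain Rule.
candidates = {
    "a": (lambda x: math.sqrt(x ** 2 + 4),
          lambda x: x / math.sqrt(x ** 2 + 4)),
    "b": (lambda x: math.exp(math.tan(x)),
          lambda x: math.exp(math.tan(x)) / math.cos(x) ** 2),  # sec^2(x) = 1 / cos^2(x)
    "c": (lambda x: math.sin(x) ** 5,
          lambda x: 5 * math.sin(x) ** 4 * math.cos(x)),
}

def check(label, x):
    func, derivative = candidates[label]
    return math.isclose(central_difference(func, x), derivative(x),
                        rel_tol=1e-5, abs_tol=1e-6)

# Sample points avoid odd multiples of pi/2, where tan(x) is undefined.
results = {label: all(check(label, x) for x in [-1.0, 0.3, 1.2])
           for label in candidates}
```

To test your own answer, replace the second lambda in an entry with your formula and see whether `results` still reports `True` for that part.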

Subsection Generalizing the Derivative of the Exponential

Earlier, we looked at the specific derivative for \(f(x)=e^x\) (Theorem 2.3.9), but we haven’t talked about derivatives of other exponential functions. What about things like \(y=2^x\) or \(y=\left(\frac{1}{2}\right)^x\text{?}\) We can use a nice fact about exponentials and logarithms. We’ll think more about log functions later (starting in Section 3.2), but we can think a bit about them now.
A big fact to recall is that a logarithm is a way of finding an exponent with a specific property. If we want to find the exponent that we would need to put on the number \(e\) to give us \(9\) as an answer, we could use \(\ln(9)\text{.}\)
\begin{equation*} e^{\ln(9)} = 9 \end{equation*}
This is just because logs are defined in this circular way: they are, by definition, the exponent you would need to put on the base (here, \(e\)) to output whatever number is inside the log.
This means that if we want to think about the number \(2\text{,}\) written in a different way, we can think of \(e^{\ln(2)}\text{.}\)
Ok, but why would we ever use this? This seems like a ridiculous way to write a number as basic as 2!
Consider the following:
\begin{equation*} 2^x = \left(\underbrace{e^{\ln(2)}}_{=2}\right)^x \end{equation*}
But we also might notice that we can rewrite this using an exponent rule! We know that in general: \(\left(a^b\right)^c = a^{b\cdot c}\text{.}\) Let’s rewrite this exponential function:
\begin{align*} 2^x \amp = \left(e^{\ln(2)}\right)^x\\ \amp =e^{\ln(2)\cdot x} \end{align*}
Remember, \(\ln(2)\) is just a number: it’s specifically the number you have to put in the exponent on \(e\) to get 2. So this is just a coefficient on \(x\text{.}\) We can differentiate and use the Chain Rule!
\begin{align*} \ddx{2^x} \amp = \ddx{e^{\ln(2)\cdot x}}\\ \amp = e^{\ln(2)\cdot x}\cdot \ln(2) \end{align*}
Now we can remember that \(e^{\ln(2)\cdot x}\) is really \(\left(e^{\ln(2)}\right)^x\) which is just \(2^x\text{.}\)
So we get \(\Ddx{2^x} = 2^x\ln(2)\text{.}\) Nothing here was special to the number 2: the same steps work for any positive base \(b\text{,}\) giving \(\Ddx{b^x} = b^x\ln(b)\text{.}\)
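The rewriting trick and the resulting derivative can both be checked numerically. This is a minimal sketch: the bases \(2\) and \(\frac{1}{2}\) come from the functions mentioned at the start of this subsection, while the helper names, step size, and sample points are assumptions made for illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    # Approximate slope of func at x using points just left and right of x.
    return (func(x + h) - func(x - h)) / (2 * h)

def exponential_derivative(base, x):
    # The pattern found above for base 2, written for a general positive base.
    return base ** x * math.log(base)

for base in [2.0, 0.5]:
    # Rewriting step: base^x is the same as e^(ln(base) * x).
    assert math.isclose(math.exp(math.log(base) * 1.7), base ** 1.7)

    # Derivative step: compare the formula with an approximate slope.
    for x in [-1.0, 0.0, 2.5]:
        approx = central_difference(lambda t: base ** t, x)
        assert math.isclose(approx, exponential_derivative(base, x), rel_tol=1e-6)
```

For the base \(\frac{1}{2}\text{,}\) the factor \(\ln\left(\frac{1}{2}\right)\) is negative, which matches the fact that \(y=\left(\frac{1}{2}\right)^x\) is decreasing.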

Practice Problems

1.

For functions \(f(x)\) and \(g(x)\text{,}\) explain how to use the Chain Rule to find \(\ddx{f(g(x))}\text{.}\)

2.

For functions \(f(x)\) and \(g(x)\text{,}\) explain how to use the Chain Rule to find \(\ddx{g(f(x))}\text{.}\)

3.

Explain the differences in the two derivatives above.

4.

For functions \(f(x)\) and \(g(x)\text{,}\) explain how to use the Chain Rule to find \(\ddx{f(g(f(g(f(g(x))))))}\text{.}\)

5.

Consider the following table of values of \(f\text{,}\) \(g\text{,}\) \(f'\) and \(g'\text{.}\)
Table 2.5.4.
\(x\) \(f(x)\) \(f'(x)\) \(g(x)\) \(g'(x)\)
\(-1\) \(2\) \(3\) \(0\) \(-2\)
\(0\) \(1\) \(6\) \(-1\) \(-4\)
\(1\) \(0\) \(-4\) \(1\) \(\frac{7}{3}\)
Find the following derivatives, using the table.
(a)
\(\ddx{f(g(x))}\bigg\vert_{x=-1}\)
(b)
\(\ddx{f(f(x))}\bigg\vert_{x=0}\)
(c)
\(\ddx{g(f(x))}\bigg\vert_{x=1}\)
(d)
\(\ddx{g(g(x))}\bigg\vert_{x=-1}\)

7.

Let’s introduce a new kind of function: hyperbolic trigonometric functions. For now, let’s just define \(y=\sinh(x)\) and \(y=\cosh(x)\) (the hyperbolic sine and hyperbolic cosine functions) this way:
\begin{align*} \sinh(x) \amp = \frac{e^x-e^{-x}}{2}\\ \cosh(x) \amp = \frac{e^x+e^{-x}}{2} \end{align*}
(a)
Show that \(\ddx{\sinh(x)} = \cosh(x)\text{.}\)
(b)
Show that \(\ddx{\cosh(x)} = \sinh(x)\text{.}\)
(c)
Let \(\tanh(x) = \dfrac{\sinh(x)}{\cosh(x)}\text{.}\) Find \(\ddx{\tanh(x)}\text{.}\)
(e)
Find \(\ddx{\sqrt{\cosh(x)}}\text{.}\)

8.

Find the following derivatives.
(a)
\(\Ddx{\dfrac{\sin(e^x)}{\sqrt{x}}}\)
(c)
\(\Ddx{\tan\left(\dfrac{x+1}{x^2+1}\right)}\)
(d)
\(\Ddx{(x+5)^2e^{\tan(x)}}\)