integration by substitution

Integration by substitution

Indefinite integrals

Let \[F^{\prime}(x)=f(x)\], so that \[F(x)\] is an antiderivative of \[f(x)\], and let \[g\] be differentiable. Then \[\int f(g(x))\cdot g^{\prime}(x)\,dx=F(g(x))+C\].

Our goal is to show that \[F(g(x))+C\] is indeed an antiderivative of \[f(g(x))\cdot g^{\prime}(x)\]. To prove this, we compute the derivative of \[F(g(x))+C\] using the chain rule: \[\frac{d}{dx}\biggl[ F(g(x))+C \biggr]=F^{\prime}(g(x))\cdot g^{\prime}(x)\]. Substituting \[f\] for \[F^{\prime}\], we get \[f(g(x))\cdot g^{\prime}(x)\].

Since \[\frac{d}{dx}\biggl[ F(g(x))+C \biggr]=f(g(x))\cdot g^{\prime}(x)\], \[F(g(x))+C\] is by definition an antiderivative of \[f(g(x))\cdot g^{\prime}(x)\], which proves the equality.

Many textbooks and other sources write something along the lines of \[\int f(g(x))\cdot g^{\prime}(x)\,dx=\int f(u)\,du\]. Read literally, however, this is mathematically incorrect: as in a table of integrals, the variable of integration, be it \[x\], \[u\], \[t\] or anything else, is a dummy variable. Seen that way, the antiderivatives on the left and right are plainly not equal. What the books mean instead is that if you substitute \[g(x)\] for \[u\] after taking the antiderivative on the right, you get the antiderivative on the left. A way to state this precisely is to write \[\int f(g(x))\cdot g^{\prime}(x)\,dx=\int f(u)\,du\bigg|_{u=g(x)}\], that is, \[\int f(u)\,du\] evaluated at \[u=g(x)\].
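
As a small illustration of the evaluated notation, here is a sketch with an assumed integrand that is not taken from the text above: \[f(u)=\cos u\] and \[g(x)=x^{2}\], so that \[g^{\prime}(x)=2x\] and \[F(u)=\sin u\].

\begin{align*} \int \cos(x^{2})\cdot 2x\,dx&=\int \cos u\,du\bigg|_{u=x^{2}}\\ &=\left( \sin u+C \right)\bigg|_{u=x^{2}}\\ &=\sin(x^{2})+C\\ \end{align*}

The antiderivative is taken in \[u\] first, and only afterwards is \[u\] replaced by \[g(x)\].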

Example

To put this into practice, we integrate \[x^{3}\sqrt{x^{4}+16}\].

To solve \[\int x^{3}\sqrt{x^{4}+16}\,dx\], we first try to turn the integrand, \[x^{3}\sqrt{x^{4}+16}\], into the form \[f(g(x))\cdot g^{\prime}(x)\]. We define \[f(u)=\sqrt{u}\] (\[u\] is a placeholder variable here and has nothing to do with the \[u\]-substitution we do later) and \[g(x)=x^{4}+16\].

Notice that \[g^{\prime}(x)=4x^{3}\], which makes \[f(g(x))\cdot g^{\prime}(x)=\sqrt{x^{4}+16}\cdot 4x^{3}\]. To fix the fact that we've got \[4x^{3}\] and not \[x^{3}\], we write \[\int x^{3}\sqrt{x^{4}+16}\,dx=\frac{1}{4}\int 4x^{3}\sqrt{x^{4}+16}\,dx\]. This is allowed because a constant factor can be moved outside an integral (the constant multiple rule).

Now that the integrand has this form, we can substitute everything into the equation \[\int f(g(x))\cdot g^{\prime}(x)\,dx=F(g(x))+C\]. Since \[F^{\prime}(u)=f(u)=\sqrt{u}\], we can take \[F(u)=\frac{u^{\frac{3}{2}}}{\frac{3}{2}}=\frac{2}{3}u^{\frac{3}{2}}\]. Substituting the rest, we get \[\int \sqrt{x^{4}+16}\cdot 4x^{3}\,dx=\frac{2}{3}(x^{4}+16)^{\frac{3}{2}}+C\]. Accounting for the \[\frac{1}{4}\], we get our final answer as:

\begin{align*} \int \sqrt{x^{4}+16}\cdot x^{3}\,dx&=\frac{1}{4}\int \sqrt{x^{4}+16}\cdot 4x^{3}\,dx\\ &=\frac{1}{4}\cdot \frac{2}{3}\left( x^{4}+16 \right)^{\frac{3}{2}}+C\\ &=\frac{1}{6}\left( x^{4}+16 \right)^{\frac{3}{2}}+C\\ \end{align*}

This process is quite tedious, so the \[u\]-substitution method arises as another way to apply the theorem above. It works because, letting \[u=g(x)\], we have \[\int f(u)\cdot u^{\prime}\,dx=\int f(u)\cdot \frac{du}{dx}\,dx=\int f(u)\,du\].

Similarly, we substitute for the most complicated part of the integrand, \[u=x^{4}+16\] (\[u\] here is the equivalent of \[g(x)\]). Then \[u^{\prime}=\frac{du}{dx}=4x^{3}\], which implies that \[du=\frac{du}{dx}\cdot dx=4x^{3}\cdot dx\]. Then, \[\int x^{3}\sqrt{x^{4}+16}\,dx=\int \sqrt{x^{4}+16}\left( x^{3}\cdot dx \right)=\frac{1}{4}\int \sqrt{x^{4}+16}\left( 4x^{3}\cdot dx \right)\] (yes, these all represent the same thing). Substituting \[u\] and \[du\], we get \[\frac{1}{4}\int\sqrt{u}\,du=\frac{1}{4}\cdot \frac{2}{3}u^{\frac{3}{2}}+C=\frac{1}{6}\left( x^{4}+16 \right)^{\frac{3}{2}}+C\].

Note that the reason this convoluted notation works has something to do with differential forms.

Both of these methods are precisely the statement "I have \[\int f(g(x))\cdot g^{\prime}(x)\,dx\], therefore it must be \[F(g(x))+C\]".
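
Whichever method is used, the answer can be sanity-checked by differentiating it and comparing with the integrand. The following is a minimal numerical sketch using only the Python standard library; the helper names, the sample points and the step size `h` are illustrative assumptions, and a small error is evidence rather than proof.

```python
import math

def integrand(x):
    # The function being integrated in the example: x^3 * sqrt(x^4 + 16)
    return x**3 * math.sqrt(x**4 + 16)

def antiderivative(x):
    # Candidate antiderivative from the example (with C = 0): (1/6) * (x^4 + 16)^(3/2)
    return (x**4 + 16) ** 1.5 / 6

def central_difference(func, x, h=1e-5):
    # Symmetric difference quotient, a numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

def max_relative_error(points):
    # Largest gap between the estimated derivative and the integrand,
    # scaled by the size of the integrand (or by 1 when it is small)
    return max(
        abs(central_difference(antiderivative, x) - integrand(x))
        / max(1.0, abs(integrand(x)))
        for x in points
    )

# Arbitrary sample points chosen for illustration
sample_points = [-2.0, -0.5, 0.0, 1.0, 3.0]
print(max_relative_error(sample_points))
```

The printed value should be tiny compared with 1, which is consistent with \[\frac{1}{6}\left( x^{4}+16 \right)^{\frac{3}{2}}\] being an antiderivative of \[x^{3}\sqrt{x^{4}+16}\].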

Definite integrals

Let \[g:\left[ a,b \right]\to I\] be a differentiable function with a continuous derivative, where \[I\] is an interval and \[I\subset\mathbb{R}\]. Suppose that \[f:I\to\mathbb{R}\] is a continuous function. Then, \[\int_{a}^{b}f(g(x))\cdot g^{\prime}(x)\,dx=\int_{g(a)}^{g(b)}f(u)\,du\].

Since \[f\] is continuous, it has an antiderivative \[F\]. The composite function \[F(g(x))\] (or \[(F\circ g)(x)\]) is then defined. Since \[g\] is differentiable, similar to our previous proof, combining the chain rule and \[F^{\prime}(x)=f(x)\] gives us \[\frac{d}{dx}\biggl[ F(g(x)) \biggr]=\left( F\circ g \right)^{\prime}(x)=F^{\prime}(g(x))\cdot g^{\prime}(x)=f(g(x))\cdot g^{\prime}(x)\].

Applying the second fundamental theorem of calculus,

\begin{align*} \int_{a}^{b}f(g(x))\cdot g^{\prime}(x)\,dx&=\int_{a}^{b}(F\circ g)^{\prime}(x)\,dx\\ &=\left( \left( F\circ g \right)(b)+C \right)-\left( \left( F\circ g \right)(a)+C \right)\\ &=F(g(b))-F(g(a))\\ &=\int_{g(a)}^{g(b)}f(u)\,du\\ \end{align*}

where \[u\] is a dummy variable (unrelated to our following \[u\]-substitution, we simply use it to avoid ambiguity as \[x\] has been used).

If we were to use the standard \[u\]-notation, then it would just be a slightly altered version, \[\int_{a}^{b}f(g(x))\cdot g^{\prime}(x)\,dx=\int_{u=g(a)}^{u=g(b)}f(u)\,du\].
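
The change of limits can also be checked numerically. The sketch below compares both sides of the theorem using composite Simpson's rule; the test functions \[f(u)=\cos u\] and \[g(x)=x^{2}\] on \[[0,2]\] are assumptions chosen for illustration and do not come from the text, and `simpson` is a hand-written helper rather than a library routine.

```python
import math

def simpson(func, lower, upper, intervals=1000):
    # Composite Simpson's rule; `intervals` must be even.
    # The step h is negative when upper < lower, so reversed limits also work.
    h = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3

# Assumed test case: f(u) = cos(u), g(x) = x^2 on [a, b] = [0, 2]
f = math.cos
g = lambda x: x * x
g_prime = lambda x: 2 * x
a, b = 0.0, 2.0

left_side = simpson(lambda x: f(g(x)) * g_prime(x), a, b)  # integral in x from a to b
right_side = simpson(f, g(a), g(b))                        # integral in u from g(a) to g(b)
exact = math.sin(g(b)) - math.sin(g(a))                    # F(g(b)) - F(g(a)) with F = sin

print(left_side, right_side, exact)
```

All three printed numbers should agree to many decimal places, since the integral in \[x\] over \[[a,b]\] and the integral in \[u\] over \[[g(a),g(b)]\] both equal \[F(g(b))-F(g(a))\].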

Example

Evaluate the integral \[\int_{0}^{1}e^{-2x}\,dx\].

As usual, we will use the more rigorous method first. Define \[f(u)=e^{u}\] and \[g(x)=-2x\], so that \[g^{\prime}(x)=-2\]. We also note that \[g(0)=0\] and \[g(1)=-2\]. Substituting what we have just defined, \[f(g(x))\cdot g^{\prime}(x)=e^{-2x}\cdot (-2)\]. As before, to compensate for the extra factor of \[-2\], we write \[\int_{0}^{1}e^{-2x}\,dx=-\frac{1}{2}\int_{0}^{1} e^{-2x}\cdot (-2)\,dx\].

Now that that's done, based on what we've proven above,

\begin{align*} \int_{0}^{1}e^{-2x}\,dx&=-\frac{1}{2}\int_{0}^{1}e^{-2x}\cdot (-2)\,dx\\ &=-\frac{1}{2}(F(g(1))-F(g(0)))\\ \end{align*}

Since \[f(u)=e^{u}\], it follows that \[F(u)=e^{u}+C\], so

\begin{align*} -\frac{1}{2}(F(g(1))-F(g(0)))&=-\frac{1}{2}\left( \left( e^{-2}+C \right)-\left( e^{0}+C \right) \right)\\ &=-\frac{1}{2}\left( e^{-2}-e^{0} \right)\\ &=\frac{1}{2}\left( 1-e^{-2} \right)\\ \end{align*}

Again, this entire process can be simplified with \[u\]-substitution. First, pick \[f(u)=e^{u}\] and \[u=-2x\]; then \[u^{\prime}=\frac{du}{dx}=-2\implies du=-2\cdot dx\]. We note that \[u(0)=0\] and \[u(1)=-2\]. Then we carry out the integration,

\begin{align*} \int_{x=0}^{x=1}e^{-2x}\,dx&=-\frac{1}{2}\int_{x=0}^{x=1}e^{-2x}\left( -2\cdot dx \right)\\ &=-\frac{1}{2}\int_{u=0}^{u=-2}e^{u}\,du\\ &=\frac{1}{2}\int_{u=-2}^{u=0}e^{u}\,du\\ &=\frac{1}{2}\biggl[ e^{u} \biggr]_{u=-2}^{u=0}\\ &=\frac{1}{2}\left( 1-e^{-2} \right)\\ \end{align*}
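
The value \[\frac{1}{2}\left( 1-e^{-2} \right)\] can be cross-checked numerically. This is a rough sketch using a midpoint Riemann sum from the standard library only; `midpoint_rule` and the number of pieces are illustrative choices, not part of the method above.

```python
import math

def midpoint_rule(func, lower, upper, pieces=20000):
    # Midpoint Riemann sum; a negative width handles upper < lower
    width = (upper - lower) / pieces
    return width * sum(func(lower + (i + 0.5) * width) for i in range(pieces))

# Original integral in x, from 0 to 1
in_x = midpoint_rule(lambda x: math.exp(-2 * x), 0.0, 1.0)

# After u = -2x: factor -1/2 in front, limits u = 0 to u = -2
in_u = -0.5 * midpoint_rule(math.exp, 0.0, -2.0)

# Closed form obtained in the example
closed_form = 0.5 * (1 - math.exp(-2))

print(in_x, in_u, closed_form)
```

The three values should match closely. Note that the \[u\]-integral runs from \[0\] down to \[-2\]; the negative step in the sum supplies the same sign that swapping the limits does in the derivation.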
