Integration by Parts
In the last few lectures, we’ve delved deeply into the details of finding a good candidate for an inverse of differentiation.
We have seen that, whereas differentiation essentially involves computing the slope of the tangent line to a function’s graph, its inverse involves computing the area under the graph.
What we know
In the previous lectures, we defined the Riemann integral of a bounded function of a real variable \(f:[a,b]\to\mathbb R\). Note that the existence of the Riemann integral is not guaranteed for all functions. However, we saw that if \(f:[a,b]\to\mathbb R\) is continuous, then it is Riemann-integrable, although continuity is not strictly necessary. In fact, by Lebesgue’s criterion, a bounded function is Riemann-integrable if and only if its set of discontinuities in \([a,b]\) has measure zero; in particular, at most countably many discontinuities are harmless. We’ve denoted the integral of \(f\) as
\[\int_{a}^b f(x)\operatorname dx,\]and we’ve seen that it satisfies many desirable properties, such as being \(\mathbb R\)-linear. Moreover, the set of Riemann-integrable functions forms an \(\mathbb R\)-algebra, meaning that it is closed under sums and products of functions, as well as multiplication by scalars.
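As a quick illustration of linearity, here is a numerical sanity check (a sketch, not part of any proof; it assumes NumPy and SciPy are available):

```python
# Numerical illustration of R-linearity of the Riemann integral:
#   int_a^b (alpha*f + beta*g) dx == alpha*int_a^b f dx + beta*int_a^b g dx.
import numpy as np
from scipy.integrate import quad

f, g = np.sin, np.exp        # any two integrable functions will do
a, b = 0.0, 1.0
alpha, beta = 2.0, -3.0

lhs, _ = quad(lambda x: alpha * f(x) + beta * g(x), a, b)
rhs = alpha * quad(f, a, b)[0] + beta * quad(g, a, b)[0]
print(lhs, rhs)              # the two values agree up to numerical error
```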
Most importantly, we have seen that integrating a function \(f:[a,b]\to\mathbb R\) essentially amounts to computing a primitive, also known as an antiderivative, that is, a continuous function \(F:[a,b]\to\mathbb R\), differentiable on \((a,b)\) and such that \(F^\prime = f\). If such a function exists and \(f\) is Riemann-integrable, then the fundamental theorem of calculus tells us that
\[\int_{a}^b f(x)\operatorname dx=F(b)-F(a).\]Recall that a primitive exists if \(f\) is continuous, and an obvious choice is given by
\[F(x)=\int_{a}^x f(t)\operatorname dt.\]This has many useful consequences, one of which we have already seen in the Integration by Substitution formula. This formula enables us to compute integrals of functions of the form \(f(\varphi(x))\cdot\varphi^\prime(x)\), where \(\varphi:[a,b]\to I\) is a continuously differentiable function and \(f:I\to\mathbb R\) is continuous:
\[\int_{a}^{b}f(\varphi(x))\cdot\varphi^\prime(x)\operatorname dx = \int_{\varphi(a)}^{\varphi(b)}f(u)\operatorname du.\]
The Integration by Parts formula
What if I asked you to compute the integral of a seemingly simple function such as \(\log x\)? More specifically, say that I asked you to compute the following integral
\[\int_1^e\log x\operatorname dx.\]Well, \(\log x\) is definitely not among the functions listed in our table of easy integrals. Then you might think to try and solve this integral by substitution. Let’s try that. One reasonable substitution that might come to mind is to define \(u(x) = \log x\), that is \(x = e^{u}\), so that \(\operatorname dx = e^{u}\operatorname du\). Then, recalling our recipe for taking integrals of composite functions (that is, integration by substitution), we have that
\[\int_1^e\log x\operatorname dx = \int_0^{1}ue^{u}\operatorname du.\]Does that make sense?
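The substitution itself is sound; if you are suspicious, here is a quick numerical check (a sketch, assuming SciPy is available):

```python
# Numerically compare both sides of the substitution
#   int_1^e log(x) dx  ==  int_0^1 u*e^u du.
import numpy as np
from scipy.integrate import quad

lhs, _ = quad(np.log, 1.0, np.e)
rhs, _ = quad(lambda u: u * np.exp(u), 0.0, 1.0)
print(lhs, rhs)  # the two values agree up to numerical error
```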
It doesn’t seem we’ve made much progress, does it? Maybe you could try other substitutions, but I can assure you they wouldn’t lead anywhere easily, unless you really, really know your good old Gamma functions.
It seems that substitution alone cannot transform the integral of \(\log x\) into a form that can be easily integrated. And that is bad, as this is such a simple function that its integral is ubiquitous in essentially any field of science: from the definition of entropy in physics, to surprisal in information theory and information content in machine learning.
What shall we do then? Should we give up mathematics and go do something else?
We’re in luck today, as here enters the star of the show:
In our last lecture, we’ve seen how substitution is essentially the integration counterpart of the chain rule, that is, of the rule for differentiating composite functions. You should recall that when we talked about derivatives we had two main pieces of machinery to differentiate functions:
- the chain rule: \((f\circ g)^\prime(x)=f^\prime(g(x))\cdot g^\prime(x)\);
- Leibniz rule: \((f\cdot g)^\prime(x)=(f^\prime\cdot g)(x)+(f\cdot g^\prime)(x)\).
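If you like, you can let a computer algebra system confirm both rules on a concrete pair of functions (a quick sketch, assuming SymPy is available; the choice of \(\sin\) and \(\exp\) is arbitrary):

```python
# Symbolic check of the chain rule and the Leibniz (product) rule
# on a concrete pair of functions.
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x)
g = sp.exp(x)

# Chain rule: (f o g)'(x) == f'(g(x)) * g'(x)
lhs_chain = sp.diff(f.subs(x, g), x)
rhs_chain = sp.diff(f, x).subs(x, g) * sp.diff(g, x)
print(sp.simplify(lhs_chain - rhs_chain))      # 0

# Leibniz rule: (f*g)'(x) == f'(x)*g(x) + f(x)*g'(x)
lhs_leibniz = sp.diff(f * g, x)
rhs_leibniz = sp.diff(f, x) * g + f * sp.diff(g, x)
print(sp.simplify(lhs_leibniz - rhs_leibniz))  # 0
```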
Integration by parts is then a way of using Leibniz rule to simplify the computation of certain integrals that cannot be solved by more elementary methods, such as the one we have just seen.
Let $F, G$ be continuously differentiable functions on $[a,b]\subset\mathbb R$, and set $f := F^\prime$ and $g := G^\prime$, which are Riemann-integrable functions on $[a,b]$ (being continuous). Then
\[\int_{a}^b F(x)g(x)\operatorname dx=\left[F(x)G(x)\right]_{a}^b-\int_{a}^b f(x)G(x)\operatorname dx.\]Although the formula in the theorem may seem daunting at first, a useful mnemonic trick is to start with the Leibniz formula for the derivative of a product
\[
(f\cdot g)^\prime(x)=(f^\prime\cdot g)(x) + (f\cdot g^\prime)(x),
\]
then formally integrate it. After all, recall that, thanks to the Fundamental Theorem of Calculus, to compute an integral it is sufficient to look for a primitive function.
\[
\begin{split}
\int(f^\prime\cdot g)(x)\operatorname dx + \int (f\cdot g^\prime)(x)\operatorname dx &= \int (f\cdot g)^\prime(x)\operatorname dx\\
&=f(x)\cdot g(x) + C.
\end{split}
\]
Finally, rearranging the formula above brings it into the form of the theorem.
\[
\int f^\prime (x)g(x)\operatorname dx = f(x)g(x)-\int f(x)g^\prime(x)\operatorname dx +C.
\]
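Before proving the theorem, we can let SymPy confirm this identity on a concrete pair of functions (a quick sketch, assuming SymPy is available; the choice \(f(x)=x^2\), \(g(x)=\sin x\) is arbitrary):

```python
# Check  int f'(x) g(x) dx == f(x) g(x) - int f(x) g'(x) dx  (up to a constant).
import sympy as sp

x = sp.symbols('x')
f = x**2
g = sp.sin(x)

lhs = sp.integrate(sp.diff(f, x) * g, x)
rhs = f * g - sp.integrate(f * sp.diff(g, x), x)
# Two antiderivatives of the same function differ by a constant,
# so the difference must have zero derivative:
print(sp.simplify(sp.diff(lhs - rhs, x)))  # 0
```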
Let’s quickly see how the Integration by Parts (IbP) theorem is proved.
Let $H(x) = (F\cdot G)(x)$. Since $F$ and $G$ are continuously differentiable functions on $[a,b]$, so is $H$. Using Leibniz rule, we have
\begin{equation} H^\prime (x) = F^\prime(x)\cdot G(x) + F(x)\cdot G^\prime(x) = f(x)\cdot G(x) + F(x)\cdot g(x),\label{E.1} \end{equation}
so $H^\prime (x)$ is Riemann-integrable on $[a,b]$. By applying the Fundamental Theorem of Calculus to the integral of $H^\prime(x)$ we get
\begin{equation} \int_{a}^b H^\prime(x)\operatorname dx = \mathscr H(b)-\mathscr H(a), \label{E.2} \end{equation}
where $\mathscr H$ is any primitive function of $H^\prime$. However, we know one obvious primitive for $H^\prime$, which is $H$ itself. Finally, taking $\mathscr H = H$, integrating \eqref{E.1} and putting it together with \eqref{E.2}, we get
\[H(b) - H(a) = \int_{a}^b f(x)\cdot G(x)\operatorname dx + \int_{a}^b F(x)\cdot g(x)\operatorname dx,\]which proves the theorem after rearranging the equation.
Great! So now we have a new method of computing integrals at our disposal! But how can we use it in practice? Is it really useful? Here is an informal, human-readable summary of what the theorem is really telling us.
If you are given the integral of the product of two functions, that is
\[\int_{a}^b f(x)g(x)\operatorname dx\]proceed as follows:
1. Choose the function that you can integrate easily (let’s call it \(f\), with a primitive \(F\)) and the function that you can easily differentiate (let’s call it \(g\)).
2. Integrate \(f\) to \(F\) and compute
\[(g\cdot F)(b)-(g\cdot F)(a).\]
3. Differentiate \(g\) and compute
\[\int_{a}^b F(x)g^\prime(x)\operatorname dx.\]If you made a smart choice in 1., this integral should be easier to compute than the original one.
4. Subtract the result of 3. from the result of 2. to obtain the integral you started with (a numerical sketch of this recipe follows the list).
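Here is that recipe spelled out in code, as a numerical sketch (assuming SciPy is available; the function name `by_parts` is mine, not standard):

```python
# Numerical rendition of the recipe: given a primitive F of f, a function g
# and its derivative dg, compare  int_a^b f*g dx  with
#   [g*F]_a^b - int_a^b F*dg dx.
import numpy as np
from scipy.integrate import quad

def by_parts(F, g, dg, a, b):
    """Right-hand side of the IbP formula, computed numerically."""
    boundary = g(b) * F(b) - g(a) * F(a)               # step 2
    remainder, _ = quad(lambda x: F(x) * dg(x), a, b)  # step 3
    return boundary - remainder                        # step 4

# Example: f(x) = cos(x) with primitive F(x) = sin(x), and g(x) = x.
a, b = 0.0, np.pi / 2
direct, _ = quad(lambda x: np.cos(x) * x, a, b)
print(direct, by_parts(np.sin, lambda x: x, lambda x: 1.0, a, b))
# Both print pi/2 - 1, up to numerical error.
```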
You will have noticed that I said to choose the function that is easiest to integrate for \(f\) and the one that is easiest to differentiate for \(g\). What does that mean in practice? In truth, only experience will tell you which choice is the smart one, but we still have one last mnemonic trick that might help with choosing, a sort of guideline.
A common strategy is to choose \(g\) and \(f\) according to the order of preference specified by the LIATE acronym \[ g \longrightarrow \text{L. I. A. T. E.} \longleftarrow f \] that is, pick for \(g\) the factor whose type appears earliest in the list, and for \(f\) the one whose type appears latest, where the initials stand for
- Logarithmic functions
- Inverse trigonometric functions
- Algebraic functions
- Trigonometric functions
- Exponential functions
We’re now ready to revisit our motivating example and see how Integration by Parts can help us.
Let’s try to compute
\[\int_{1}^e \log x\operatorname dx.\]Where is the product in this integral? Let’s make an apparently silly choice and set \(f(x)=1\) and \(g(x)=\log(x)\). Clearly, \((f\cdot g)(x)=\log x\), while
\[\int 1\operatorname dx = x+C\](recalling that primitive functions are only defined up to additive constants \(C\in\mathbb R\)), and
\[g^\prime (x) = \frac{\operatorname d}{\operatorname dx}\log x = \frac{1}{x}.\]Using the notation from our previous recipe, we can write that \(F(x)=x\) (we can safely disregard the integration constant as we are computing definite integrals), and \(g^\prime(x)=x^{-1}\). Additionally, notice that the integral from step 3. in our recipe is now extremely easy to compute and is simply
\[\int_{1}^e F(x)\cdot g^\prime(x)\operatorname dx=\int_{1}^e 1\operatorname dx=\left[x\right]_{1}^e = e-1\]Putting everything together, we get
\[\begin{split} \int_{1}^e \log x \operatorname dx &= \left[F(x)g(x)\right]_{1}^e - \int_{1}^e F(x)g^\prime(x)\operatorname dx\\ &= \left[x\log x\right]_{1}^e-\int_{1}^e 1\operatorname dx\\ &= e - \left(e - 1\right) = 1. \end{split}\]However, this is not the only way in which the IbP formula is useful! Let’s look at another example that requires a little more ingenuity!
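Before we do, here is a one-line symbolic confirmation of the result we just obtained (a sketch, assuming SymPy is available):

```python
# Symbolic confirmation that int_1^e log(x) dx = 1.
import sympy as sp

x = sp.symbols('x')
print(sp.integrate(sp.log(x), (x, 1, sp.E)))  # 1
```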
Consider now \(h(x)=\cos^2(x)\). This function is continuous across the entire real line, and we’ll try to find its primitive \(H(x)\). We’ll start by noting that \(h(x) = 1-\sin^2(x)\). This immediately tells us that
\begin{equation} \int \cos^2(x)\operatorname dx = \int(1-\sin^2(x))\operatorname dx = x - \int\sin^2x\operatorname dx. \label{ex2.1} \end{equation}
Moreover, we know from our table of elementary integrals that
\[\int \sin x\operatorname dx=-\cos x+C,\]while \((\sin x)^\prime=\cos(x)\). If we set \(f(x)=g(x)=\sin(x)\), we have \(f(x)g(x)=\sin^2(x)\), and we will apply the IbP formula to this product.
\begin{equation} \int\sin^2x\operatorname dx = \int(f\cdot g)(x)\operatorname dx=-\cos(x)\sin(x) + C+\int \cos^2(x)\operatorname dx. \label{ex2.2} \end{equation}
Substituting equation \eqref{ex2.2} into equation \eqref{ex2.1}, we finally get
\[\int\cos^2x\operatorname dx = x + \cos(x)\sin(x) + C - \int\cos^2(x)\operatorname dx.\]Bringing all the integrals to the left-hand side, we conclude that
\[\int\cos^2x\operatorname dx = \frac{x+\sin x\cos x}{2} + C.\]To conclude this lecture, here are a few rules of thumb for solving integrals using Integration by Parts.
- Choose \(f\) to be the largest factor of the integrand that you can easily integrate, either directly or by using the substitution method. You will often need to rewrite the integral to identify this largest factor, bearing in mind that \(f(x)=1\cdot f(x)\). This is especially useful when you cannot integrate any obvious factor within the integrand.
- If you can integrate all factors in the integrand, then choose \(g\) first, picking the factor whose derivative simplifies, either by changing form or by eventually becoming a constant.
- Sometimes you have to use Integration by Parts more than once when evaluating an integral. In this case, try to stay with the same function-type choice for all Integrations by Parts.
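One more quick check before the exercises: differentiating the primitive of \(\cos^2 x\) we found above should give back the integrand (a sketch, assuming SymPy is available):

```python
# Differentiate the claimed primitive of cos^2(x) and compare with cos^2(x).
import sympy as sp

x = sp.symbols('x')
H = (x + sp.sin(x) * sp.cos(x)) / 2
print(sp.simplify(sp.diff(H, x) - sp.cos(x)**2))  # 0
```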
As always, the best way to get to grips with a new concept is to get your hands dirty, so here are some fun exercises to help you practise!
Consider \(f(x)=x\log(x)\). Can you integrate \(f(x)\) over \(I=[0,1]\subset\mathbb R\)? If so, compute its integral over \(I\).
Evaluate the following integrals
\[\begin{align} &\int_{0}^\pi x^2\cos(4x)\operatorname dx, \\ &\int 6\arctan\left(\frac{8}{x}\right) \operatorname dx, \\ &\int (4x^3-9x^2+7x+3)e^{-x}\operatorname dx \end{align}\]Let \(f\) be a real, continuously differentiable function on $[a,b]\subset\mathbb R$, such that $f(a)=f(b)=0$ and \[ \int_{a}^b f^2(x)\operatorname dx = 1. \] Prove that \[ \int_{a}^b xf(x)f^\prime(x)\operatorname dx = -\frac{1}{2}. \]
Define $f(x)$ to be the function \[ f(x) := \int_{x}^{x+1}\sin(t^2)\operatorname d t. \]
- Prove that \(|f(x)|<1/x \text{ if } x>0.\)
- Prove that $2f(x) = \cos(x^2)-\cos((x+1)^2)+r(x)$, where $|r(x)|<c/x$ and $c\in\mathbb R$ is a constant.
Hint: for part 1, start with the substitution $u=t^2$, and then integrate by parts to show that \[ f(x) = \frac{\cos (x^2)}{2x}-\frac{\cos((x+1)^2)}{2(x+1)}-\int_{x^2}^{(x+1)^2}\frac{\cos u}{4u^{3/2}}\operatorname du, \] and then replace $\cos u$ by $-1$.
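If you want to see the bound from part 1 in action before proving it, here is a small numerical experiment (a sanity check at a few sample points, not a proof; it assumes SciPy is available):

```python
# Numerically evaluate f(x) = int_x^{x+1} sin(t^2) dt and compare |f(x)|
# with the claimed bound 1/x at a few sample points.
import numpy as np
from scipy.integrate import quad

def f(x):
    value, _ = quad(lambda t: np.sin(t**2), x, x + 1.0)
    return value

for x in [1.0, 2.0, 5.0, 10.0]:
    print(x, abs(f(x)), 1.0 / x, abs(f(x)) < 1.0 / x)  # bound holds
```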
There is a useful trick for computing the previous $\log x$ integral. First, you should convince yourself that
\[ \log(x)=\left.\frac{\operatorname d}{\operatorname ds}x^s\right|_{s=0}. \]
Then we have
\[\begin{split} \int_1^e\log x\operatorname dx=\int_1^e\left.\frac{\operatorname d}{\operatorname ds}x^s\right|_{s=0}\operatorname dx&=\left.\frac{\operatorname d}{\operatorname ds}\int_1^ex^s\operatorname dx\right|_{s=0} \\ &=\left.\frac{\operatorname d}{\operatorname ds}\left[\frac{x^{s+1}}{s+1}\right]_1^e\right|_{s=0} \\ &=\left.\frac{\operatorname d}{\operatorname ds}\frac{e^{s+1}-1}{s+1}\right|_{s=0} \\ &=\left.\frac{se^{s+1}+1}{(s+1)^2}\right|_{s=0}=1. \end{split}\]You might have spotted something suspicious here: the exchange of the order of differentiation and integration must be justified! To show that this is a legitimate move, one could either invoke the Leibniz integral rule, or take out the big guns, in the form of the dominated convergence theorem. In any case, these considerations go well beyond the scope of this lecture, so you will just have to trust me for the moment that we can indeed exchange the order of integration and differentiation in this case!
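If you’d like to convince yourself that the exchange is harmless on this particular example, SymPy will happily carry out both orders (a sketch, assuming SymPy is available):

```python
# Compare both orders of differentiation and integration for int_1^e x^s dx.
import sympy as sp

x, s = sp.symbols('x s', positive=True)

# Integrate first, then differentiate at s = 0:
inner = sp.integrate(x**s, (x, 1, sp.E))          # (e^(s+1) - 1)/(s + 1)
print(sp.simplify(sp.diff(inner, s).subs(s, 0)))  # 1

# Differentiate first (d/ds x^s = x^s log x), set s = 0, then integrate:
print(sp.integrate(sp.diff(x**s, s).subs(s, 0), (x, 1, sp.E)))  # 1
```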