# Areas under Curves

In this lecture, we devise a way to calculate the area under a curve. How do we do this? Have you seen Spaceship Earth at Epcot? From afar it looks like a smooth sphere, but if you come close, you will find that it is made of a whole bunch of triangles (actually tetrahedra).

Spaceship Earth

This may give us a clue as to how to tackle our problem. Perhaps we too can approximate the given curved region with simple geometric figures whose areas we know how to calculate. Considering that our region is a rectangle except for the curved top, the best candidates are rectangles. By the way, this is in no way a new idea. The ancient Greeks already knew that they could approximate complex curved regions or solids by simple geometric objects such as triangles, rectangles, disks, etc.

There are three convenient ways to approximate the area under a curve by rectangles. They are called the left-end point method, the midpoint method, and the right-end point method, respectively. Let us first discuss the left-end point method. We want to approximate the area under a curve given by $y=f(x)$ on the interval $[a,b]$. Partition $[a,b]$ into $n$ equal subintervals
$$a=x_0<x_1<\cdots<x_n=b,$$
where $\Delta x_k:=\ell([x_{k-1},x_k])=\frac{b-a}{n}$ is the length of the $k$-th subinterval and $x_k=x_0+k\frac{b-a}{n}$, $k=1,2,\ldots,n$. On each subinterval $[x_{k-1},x_k]$ we consider the rectangle with base $\Delta x_k=\frac{b-a}{n}$ and height $f(x_{k-1})$, i.e. the value of $f$ at the left-end point of $[x_{k-1},x_k]$. The area $A$ of the region is then approximated by adding the areas of these rectangles:
\begin{align*}
A&\approx f(x_0)\Delta x_1+f(x_1)\Delta x_2+\cdots+f(x_{n-1})\Delta x_n\\
&=\sum_{k=0}^{n-1}f(x_k)\Delta x_{k+1}\\
&=\sum_{k=0}^{n-1}f\left(x_0+k\frac{(b-a)}{n}\right)\frac{b-a}{n}.
\end{align*}
But we are not satisfied with just an approximation. Can we find the exact area $A$? The following figures give us a clue. They show the area under the curve $y=x^2$ on the unit interval $[0,1]$ being approximated by the left-end point method with $n=4$, $n=10$, $n=20$, $n=50$, and $n=100$, respectively.

Approximation by the Left-end point method with n=4

Approximation by the Left-end point method with n=10

Approximation by the Left-end point method with n=20

Approximation by the Left-end point method with n=50

Approximation by the Left-end point method with n=100

It is clear from the above figures that the more rectangles we use, the better the approximation we get (or equivalently, the smaller the error becomes). So if we imagine that we can somehow increase the number of rectangles to infinity, the error will be gone and we will obtain the exact area $A$. How do we achieve this? Simple: by taking the limit
\begin{equation}
\label{eq:left-endpt}
A=\lim_{n\to\infty}\sum_{k=0}^{n-1}f\left(x_0+k\frac{(b-a)}{n}\right)\frac{b-a}{n}.
\end{equation}
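
To see the limit at work numerically, here is a minimal Python sketch of the left-end point sum. The names `left_riemann_sum` and `square` are ours, chosen only for illustration. It evaluates the sum for $y=x^2$ on $[0,1]$ with an increasing number of rectangles; the printed values should creep up toward the exact area $\frac{1}{3}$.

```python
def left_riemann_sum(f, a, b, n):
    """Left-end point approximation of the area under f on [a, b] with n rectangles."""
    dx = (b - a) / n  # common width of the rectangles
    # heights are f(x_0), f(x_1), ..., f(x_{n-1}) with x_k = a + k*dx
    return sum(f(a + k * dx) for k in range(n)) * dx


def square(x):
    return x * x


for n in (4, 10, 20, 50, 100, 10_000):
    print(n, left_riemann_sum(square, 0.0, 1.0, n))
```

A finite sum never reaches the limit, of course; the code only illustrates that the error shrinks as $n$ grows.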

Before we discuss an example, let me list three useful sums.
\begin{align}
\label{eq:sum1}
1+2+3+\cdots+n&=\sum_{k=1}^nk=\frac{n(n+1)}{2},\\
\label{eq:sum2}
1^2+2^2+3^2+\cdots+n^2&=\sum_{k=1}^nk^2=\frac{n(n+1)(2n+1)}{6},\\
\label{eq:sum3}
1^3+2^3+3^3+\cdots+n^3&=\sum_{k=1}^nk^3=\left[\frac{n(n+1)}{2}\right]^2.
\end{align}
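
If you would like some reassurance before using these formulas, the following short Python sketch compares each closed form with the sum computed term by term. The function name `check_sum_formulas` is hypothetical. A check over finitely many $n$ is not a proof (induction gives one), but it catches misremembered formulas quickly.

```python
def check_sum_formulas(n):
    """Return True if the three closed forms agree with direct summation up to n."""
    ks = range(1, n + 1)
    ok1 = sum(ks) == n * (n + 1) // 2
    ok2 = sum(k**2 for k in ks) == n * (n + 1) * (2 * n + 1) // 6
    ok3 = sum(k**3 for k in ks) == (n * (n + 1) // 2) ** 2
    return ok1 and ok2 and ok3


print(all(check_sum_formulas(n) for n in range(1, 200)))
```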

Example. Let $f(x)=x^2$ and $a=1$, $b=2$. Find the area under the curve $y=f(x)$ on the interval $[1,2]$.

Solution.
\begin{align*}
A&=\lim_{n\to\infty}\sum_{k=0}^{n-1}f\left(1+\frac{k}{n}\right)\frac{1}{n}\\
&=\lim_{n\to\infty}\sum_{k=0}^{n-1}\left(1+\frac{k}{n}\right)^2\frac{1}{n}\\
&=\lim_{n\to\infty}\left\{\frac{1}{n}\sum_{k=0}^{n-1} 1+\frac{2}{n^2}\sum_{k=0}^{n-1}k+\frac{1}{n^3}\sum_{k=0}^{n-1}k^2\right\}\\
&=\lim_{n\to\infty}\left\{\frac{n}{n}+\frac{2}{n^2}\frac{(n-1)n}{2}+\frac{1}{n^3}\frac{(n-1)n(2n-1)}{6}\right\}\\
&=\lim_{n\to\infty}\left\{1+\frac{n-1}{n}+\frac{(n-1)(2n-1)}{6n^2}\right\}\\
&=1+1+\frac{1}{3}=\frac{7}{3}.
\end{align*}
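
The algebra in this solution can be double-checked with exact rational arithmetic. The sketch below is only an illustration, and the function names are ours. It computes the left-end point sum for $f(x)=x^2$ on $[1,2]$ directly and compares it with the simplified expression $1+\frac{n-1}{n}+\frac{(n-1)(2n-1)}{6n^2}$ obtained from the sum formulas.

```python
from fractions import Fraction


def left_sum_exact(n):
    """Left-end point sum for f(x) = x^2 on [1, 2], computed term by term."""
    return sum((1 + Fraction(k, n)) ** 2 for k in range(n)) * Fraction(1, n)


def closed_form(n):
    """The same sum after applying the formulas for the sums of 1, k and k^2."""
    return 1 + Fraction(n - 1, n) + Fraction((n - 1) * (2 * n - 1), 6 * n * n)


for n in (1, 10, 100, 1000):
    # the gap to 7/3 shrinks as n grows
    print(n, closed_form(n), float(Fraction(7, 3) - closed_form(n)))
```

Because `Fraction` avoids rounding, the two functions agree exactly for every $n$, and the gap to $\frac{7}{3}$ works out to $\frac{3}{2n}-\frac{1}{6n^2}$, which tends to zero.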

In a similar manner, we can also calculate the area $A$ under the curve $y=f(x)$ on the closed interval $[a,b]$ by the midpoint method

\begin{equation}
\label{eq:midpt}
\begin{aligned}
A&=\lim_{n\to\infty}\sum_{k=0}^{n-1}f\left(\frac{x_k+x_{k+1}}{2}\right)\frac{b-a}{n}\\
&=\lim_{n\to\infty}\sum_{k=0}^{n-1}f\left(x_0+\frac{(2k+1)}{2}\frac{(b-a)}{n}\right)\frac{b-a}{n},
\end{aligned}
\end{equation}

and by the right-end point method
\begin{equation}
\label{eq:right-endpt}
A=\lim_{n\to\infty}\sum_{k=1}^{n}f\left(x_0+k\frac{(b-a)}{n}\right)\frac{b-a}{n}.
\end{equation}

For calculating the exact area under a curve, you can use any of the left-end point, midpoint, and right-end point methods. But what if you are only interested in approximating the area under a curve? Assuming that you use the same number of rectangles, is there a difference between the methods? In fact, there is. In the above example, we found the exact area $\frac{7}{3}$ of the region under $y=x^2$ on the interval $[1,2]$. With $n=100$, the left-end point method gives 2.31835, the midpoint method gives 2.333325, and the right-end point method gives 2.34835. Let us now compare the errors from the left-end point, midpoint, and right-end point methods, respectively.
\begin{align*}
E_{\mathrm{left}}&=\left|\frac{7}{3}-2.31835\right|\approx 0.014983333,\\
E_{\mathrm{mid}}&=\left|\frac{7}{3}-2.333325\right|\approx 0.0000083333333,\\
E_{\mathrm{right}}&=\left|\frac{7}{3}-2.34835\right|\approx 0.015016667.
\end{align*}
Clearly, the midpoint method gives the best approximation among the three methods when the same number of rectangles is used. This is typically the case for smooth functions. If the function is increasing on the closed interval, the left-end point method underestimates while the right-end point method overestimates. If the function is decreasing on the closed interval, the left-end point method overestimates while the right-end point method underestimates. The midpoint method samples each subinterval at its center, so on each rectangle the overestimate on one side of the midpoint largely cancels the underestimate on the other side.
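
The three approximations can be computed with one small Python function. This is a sketch under our own naming: the helper `riemann_sum` and its `offset` parameter are not standard; `offset` simply selects where in each subinterval the height is sampled. For $y=x^2$ on $[1,2]$ with $n=100$ it should reproduce the values quoted above, up to floating-point rounding.

```python
def riemann_sum(f, a, b, n, offset):
    """Rectangle approximation of the area under f on [a, b] with n rectangles.

    offset = 0.0 samples the left-end point, 0.5 the midpoint,
    and 1.0 the right-end point of each subinterval.
    """
    dx = (b - a) / n
    return sum(f(a + (k + offset) * dx) for k in range(n)) * dx


def square(x):
    return x * x


exact = 7.0 / 3.0
for name, offset in (("left", 0.0), ("mid", 0.5), ("right", 1.0)):
    approx = riemann_sum(square, 1.0, 2.0, 100, offset)
    print(name, approx, abs(exact - approx))
```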

# Antiderivatives

A function $F(x)$ is called an antiderivative of a function $f(x)$ if it is a solution of the differential equation $\frac{d}{dx}F(x)=f(x)$. For instance, $F(x)=\frac{1}{2}x^2$ is an antiderivative of $f(x)=x$. A function $f(x)$ has more than one (actually infinitely many) antiderivatives, but on an interval they all differ by a constant. Indeed, one can easily check that for any constant $C$, $\frac{1}{2}x^2+C$ is an antiderivative of $x$, since the derivative of any constant is zero. Conversely, if $F(x)$ and $G(x)$ are both antiderivatives of $f(x)$, they differ only by a constant. This can be easily seen: since $F(x)$ and $G(x)$ are both antiderivatives of $f(x)$,
$$\frac{d}{dx}F(x)=\frac{d}{dx}G(x)=f(x),$$
and so
$$\frac{d}{dx}(F(x)-G(x))=0.$$
Hence, $F(x)=G(x)+C$, where $C$ is a constant.

If $F(x)$ is an antiderivative of a function $f(x)$,
$$F(x)+C,$$
where $C$ is an arbitrary constant, is called the indefinite integral of $f(x)$ and is denoted by $\int f(x)dx$, i.e.
$$\int f(x)dx=F(x)+C.$$
Example. Find the indefinite integral of each of the following functions.

1. $f(x)=\sin x$.

Solution. Let $F(x)$ be an antiderivative of $f(x)$. Then $F'(x)=f(x)$. If we let $y=F(x)$, then
$$\frac{dy}{dx}=\sin x.$$
We know one such function $y$ which satisfies the equation $\frac{dy}{dx}=\sin x$. It is $y=-\cos x$. So, the indefinite integral of $\sin x$ is
$$\int \sin xdx=-\cos x+C,$$
where $C$ is an arbitrary constant.

2. $f(x)=\frac{1}{x}$.

Solution. An antiderivative of $\frac{1}{x}$ is $\ln x$. Note that $f(x)=\frac{1}{x}$ is defined for $x<0$ while $\ln x$ is not, so in fact the antiderivative should be written as $\ln|x|$ instead of $\ln x$. Hence, the indefinite integral is
$$\int\frac{1}{x}dx=\ln |x|+C,$$
where $C$ is an arbitrary constant.

3. $f(x)=x^n$, where $n\ne -1$.

Solution. $F(x)=\frac{1}{n+1}x^{n+1}$ is an antiderivative of $f(x)=x^n$. Hence, the indefinite integral of $x^n$ is
$$\int x^n dx=\frac{1}{n+1}x^{n+1}+C,$$
where $C$ is an arbitrary constant.

The most general solution of the differential equation
$$\frac{dy}{dx}=f(x)$$
is the indefinite integral
$$y=\int f(x)dx=F(x)+C,$$
where $C$ is an arbitrary constant. With an additional condition, the arbitrary constant $C$ may be determined. Such a condition is called an initial condition.

Example. Solve the differential equation $\frac{dy}{dx}=\sin x$ with the condition $y(0)=1$.

Solution. In the previous example, we found that the general solution is given by
$$y(x)=-\cos x+C,$$
where $C$ is an arbitrary constant. The condition $y(0)=1$ implies that $-\cos 0+C=1$ i.e. $-1+C=1$. So, we obtain $C=2$. Therefore, the solution we seek is
$$y(x)=-\cos x+2.$$
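
The step that fixes the constant is always the same: if $y(x)=F(x)+C$ and $y(x_0)=y_0$, then $C=y_0-F(x_0)$. Here is a minimal Python sketch of that step for this example. The helper name `solve_constant` is hypothetical, and the derivative is only checked numerically with a central difference.

```python
import math


def solve_constant(F, x0, y0):
    """Return C such that y(x) = F(x) + C satisfies y(x0) = y0."""
    return y0 - F(x0)


def F(x):
    # an antiderivative of sin x
    return -math.cos(x)


C = solve_constant(F, 0.0, 1.0)


def y(x):
    return F(x) + C


# check the initial condition and, numerically, the differential equation
h = 1e-6
x = 0.7
print(C, y(0.0), (y(x + h) - y(x - h)) / (2 * h), math.sin(x))
```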

Indefinite integrals satisfy the following properties:
\begin{align*}
\int (f(x)+g(x))dx&=\int f(x)dx+\int g(x)dx,\\
\int c f(x)dx&=c\int f(x)dx,
\end{align*}
where $c$ is a constant. For this reason we say that indefinite integrals are linear. If you have studied linear algebra, you know that a linear map is a map from a vector space to another vector space which preserves the vector space operations (vector addition and scalar multiplication). It turns out that functions may be treated as vectors and the indefinite integral $\int f(x)dx$ may be considered as a linear map. This sort of abstract treatment is important in advanced mathematics, physics and engineering. The linearity of indefinite integrals can be used to find the indefinite integral of a complicated function.

Example. Find the indefinite integral of
$$f(x)=1-x^3+12x^5.$$

Solution.
\begin{align*}
\int (1-x^3+12x^5)dx&=\int dx-\int x^3dx+12\int x^5dx\\
&=x-\frac{x^4}{4}+\frac{12}{6}x^6+C\\
&=x-\frac{x^4}{4}+2x^6+C,
\end{align*}
where $C$ is an arbitrary constant.
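
Linearity together with the power rule makes integrating a polynomial entirely mechanical, which we can sketch in a few lines of Python. The representation is our own assumption: a polynomial is stored as a list `coeffs` in which `coeffs[k]` is the coefficient of $x^k$, and the arbitrary constant $C$ is taken to be $0$.

```python
from fractions import Fraction


def integrate_polynomial(coeffs):
    """Antiderivative (with C = 0) of the polynomial sum(coeffs[k] * x**k).

    Each term c*x^k becomes c/(k+1) * x^(k+1); the new constant term is 0.
    """
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]


# f(x) = 1 - x^3 + 12 x^5
antiderivative = integrate_polynomial([1, 0, 0, -1, 0, 12])
print(antiderivative)  # coefficients of x - x^4/4 + 2 x^6
```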

Some Important Formulas

\begin{align}
\int x^ndx&=\frac{x^{n+1}}{n+1}+C\ (n\ne -1\ \mbox{is a rational number})\\
\int \sin kxdx&=-\frac{\cos kx}{k}+C\ (k\ne 0\ \mbox{is a constant})\\
\int \cos kx dx&=\frac{\sin kx}{k}+C\ (k\ne 0\ \mbox{is a constant})\\
\int \sec^2 xdx&=\tan x+C\\
\int \csc^2 xdx&=-\cot x+C\\
\int \sec x\tan xdx&=\sec x+C\\
\int \csc x\cot xdx&=-\csc x+C
\end{align}
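
Any integration formula can be checked by differentiating its right-hand side. The Python sketch below does this numerically with a central difference for the power rule $\int x^n dx=\frac{x^{n+1}}{n+1}+C$ and the trigonometric formulas. The sample values $n=2.5$, $k=3$ and the test point $x=0.7$ are arbitrary choices of ours, and agreement at one point is only a sanity check, not a proof.

```python
import math


def derivative(F, x, h=1e-6):
    """Central-difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


n = 2.5  # sample exponent, n != -1
k = 3.0  # sample constant, k != 0

# (antiderivative F, integrand f) pairs
pairs = [
    (lambda x: x ** (n + 1) / (n + 1), lambda x: x ** n),
    (lambda x: -math.cos(k * x) / k, lambda x: math.sin(k * x)),
    (lambda x: math.sin(k * x) / k, lambda x: math.cos(k * x)),
    (lambda x: math.tan(x), lambda x: 1 / math.cos(x) ** 2),                 # sec^2 x
    (lambda x: -1 / math.tan(x), lambda x: 1 / math.sin(x) ** 2),            # csc^2 x
    (lambda x: 1 / math.cos(x), lambda x: math.tan(x) / math.cos(x)),        # sec x tan x
    (lambda x: -1 / math.sin(x), lambda x: 1 / (math.sin(x) * math.tan(x))),  # csc x cot x
]


def max_error(x):
    """Largest gap between F'(x) and f(x) over all pairs."""
    return max(abs(derivative(F, x) - f(x)) for F, f in pairs)


print(max_error(0.7))
```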

Example. [Initial Value Problem] A balloon ascending at a constant speed of 12 ft/sec is at a height of 80 ft above the ground when a package is dropped. How long does it take for the package to reach the ground?

Solution. In order to answer the question we need $h(t)$, the position (height) function of the falling package. We don’t have it yet. If we knew the velocity $v(t)$ of the falling package, we would be able to find $h(t)$ by solving the differential equation

\begin{equation}
\label{eq:height}
\frac{dh(t)}{dt}=v(t).
\end{equation}

However, we don’t have it either. Instead, what is known is the acceleration $a(t)$ of the freely falling package which is constant
$$a(t)=-9.8\,\mathrm{m/sec^2}\approx-32\,\mathrm{ft/sec^2}.$$
The velocity $v(t)$ can be found by solving the differential equation

\begin{equation}
\label{eq:velocity}
\frac{dv(t)}{dt}=a(t)
\end{equation}

with $a(t)=-32\mathrm{ft/sec^2}$. The solution of \eqref{eq:velocity} is the indefinite integral
\begin{align*}
v(t)&=\int -32dt\\
&=-32t+C_1,
\end{align*}
where $C_1$ is a constant. At the time the package was dropped from the balloon, the balloon was ascending at the rate $12\mathrm{ft/sec}$. So the package’s initial velocity is $v(0)=12\mathrm{ft/sec}$. Using this we find $C_1=12$ and so
$$v(t)=-32t+12.$$
Now, we are ready to find $h(t)$. The solution of the differential equation \eqref{eq:height} with $v(t)=-32t+12$ is the indefinite integral
\begin{align*}
h(t)&=\int (-32t+12)dt\\
&=-16t^2+12t+C_2,
\end{align*}
where $C_2$ is a constant. At the time the package was dropped from the balloon the height was 80ft, i.e. $h(0)=80\mathrm{ft}$. Using this we find $C_2=80$. Hence, we have
$$h(t)=-16t^2+12t+80.$$
Setting $h(t)=0$, we obtain the quadratic equation
$$-16t^2+12t+80=0$$
whose solutions are $t_1\approx -1.89$ and $t_2\approx 2.64$. Since the package is dropped at $t=0$, we discard the negative solution. Therefore, it takes about 2.64 seconds for the package to reach the ground.
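
The last step is just the quadratic formula, which the following Python sketch carries out. The function name `landing_time` and its parameters are our own; it assumes, as in the solution, constant acceleration $-32\,\mathrm{ft/sec^2}$, initial velocity $12\,\mathrm{ft/sec}$, and initial height $80\,\mathrm{ft}$.

```python
import math


def landing_time(v0, h0, g=32.0):
    """Positive root of h(t) = -(g/2) t^2 + v0 t + h0 (feet and seconds)."""
    a, b, c = -g / 2, v0, h0
    disc = b * b - 4 * a * c
    roots = ((-b + math.sqrt(disc)) / (2 * a), (-b - math.sqrt(disc)) / (2 * a))
    # the package is dropped at t = 0, so only the positive root is physical
    return max(roots)


t = landing_time(12.0, 80.0)
print(round(t, 2))  # about 2.64
```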