Category Archives: Mathematical Physics

Self-Adjoint Differential Equations III: Real Eigenvalues, Gram-Schmidt Orthogonalization

In an earlier post (here), I mentioned that the eigenvalues of a Hermitian operator are real and that the eigenfunctions of a Hermitian operator are orthogonal. Let us now prove these claims.

Let $\mathcal{L}$ be a Hermitian operator and let $u_i$, $u_j$ be eigenfunctions of $\mathcal{L}$ with eigenvalues $\lambda_i$, $\lambda_j$, respectively. Then
\begin{align*}
\mathcal{L}u_i+\lambda_iwu_i=0\ \ \ \ \ (1)\\
\mathcal{L}u_j+\lambda_jwu_j=0\ \ \ \ \ (2)
\end{align*}
Taking the complex conjugate of (2) ($\mathcal{L}$ and $w$ are assumed real), we obtain
$$\mathcal{L}u_j^\ast+\lambda_j^\ast wu_j^\ast=0\ \ \ \ \ (3)$$
Multiply (1) by $u_j^\ast$ and (3) by $u_i$:
\begin{align*}
u_j^\ast\mathcal{L}u_i+u_j^\ast\lambda_iwu_i=0\ \ \ \ \ (4)\\
u_i\mathcal{L}u_j^\ast+u_i\lambda_j^\ast wu_j^\ast=0\ \ \ \ \ (5)
\end{align*}
Subtracting (5) from (4), we obtain
$$u_j^\ast\mathcal{L}u_i-u_i\mathcal{L}u_j^\ast=(\lambda_j^\ast-\lambda_i)wu_iu_j^\ast\ \ \ \ \ (6)$$
Integrating (6), we get
$$\int_a^bu_j^\ast\mathcal{L}u_i dx-\int_a^b u_i\mathcal{L}u_j^\ast dx=(\lambda_j^\ast-\lambda_i)\int_a^b u_iu_j^\ast wdx\ \ \ \ \ (7)$$
Since $\mathcal{L}$ is Hermitian, the LHS of (7) vanishes. Thus
$$(\lambda_j^\ast-\lambda_i)\int_a^bu_iu_j^\ast wdx=0$$ If $i=j$, then $\int_a^bu_iu_i^\ast wdx=\int_a^b|u_i|^2wdx>0$, so $\lambda_i^\ast=\lambda_i$, i.e. $\lambda_i$ is real.

If $i\ne j$ and if $\lambda_i\ne\lambda_j$, then $\int_a^bu_iu_j^\ast wdx=0$, i.e. $u_i$ and $u_j$ are orthogonal, where we consider
$$\langle u_i,u_j\rangle=\int_a^b u_iu_j^\ast wdx\ \ \ \ \ (8)$$ as an inner product. The inner product (8) is called the Hermitian product weighted by $w(x)$.

What if $i\ne j$ but $\lambda_i=\lambda_j$? Then $\int_a^bu_iu_j^\ast wdx$ may not vanish, i.e. $u_i$ and $u_j$ may not be orthogonal. Such a case is called degenerate. For example, consider the differential equation
$$\frac{d^2}{dx^2}y(x)+n^2y(x)=0$$
One can easily see that $y_1(x)=\cos nx$ and $y_2(x)=\sin nx$ both satisfy the differential equation. So $\cos nx$ and $\sin nx$ are eigenfunctions that correspond to the same eigenvalue $n^2$. That is, $\cos nx$ and $\sin nx$ are degenerate eigenfunctions. They are however orthogonal because
$$\int_0^{2\pi}\cos nx\sin nxdx=0$$

It is good to know the following formulas:
$$
\int_{x_0}^{x_0+2\pi}\sin mx\sin nxdx=C_n\delta_{nm},
$$
where
$$C_n=\left\{\begin{array}{ccc}
\pi & \mbox{if} & n\ne 0\\
0 & \mbox{if} & n=0
\end{array}\right.$$
$$
\int_{x_0}^{x_0+2\pi}\cos mx\cos nxdx=D_n\delta_{nm},
$$
where
$$D_n=\left\{\begin{array}{ccc}
\pi & \mbox{if} & n\ne 0\\
2\pi & \mbox{if} & n=0
\end{array}\right.$$
$$\int_{x_0}^{x_0+2\pi}\sin mx\cos nxdx=0$$
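These formulas can be checked by hand with the product-to-sum identities, or spot-checked in Maxima. The following short session (mine, with $x_0=0$ and a few concrete values of $m$ and $n$) is a sanity check, not a proof:

(%i1) integrate(sin(3*x)*sin(3*x), x, 0, 2*%pi);   /* C_3, expect %pi */
(%o1) %pi
(%i2) integrate(sin(3*x)*sin(5*x), x, 0, 2*%pi);   /* m # n, expect 0 */
(%o2) 0
(%i3) integrate(cos(0*x)*cos(0*x), x, 0, 2*%pi);   /* D_0, expect 2*%pi */
(%o3) 2*%pi
(%i4) integrate(sin(3*x)*cos(5*x), x, 0, 2*%pi);   /* expect 0 */
(%o4) 0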

Orthogonal functions can be used to expand a function, for example, as a Fourier series expansion. Certain classes of functions (sectionally continuous or piecewise continuous) may be represented by a series of orthogonal functions to any desired degree of accuracy. This property is referred to as completeness.

Example. Square Wave

Consider the square wave
$$f(x)=\left\{\begin{array}{ccc}
\frac{h}{2} & \mbox{if} & 0<x<\pi,\\
-\frac{h}{2} & \mbox{if} & -\pi<x<0
\end{array}\right.$$

Note that the function may be expanded in any of a variety of orthogonal eigenfunctions such as Legendre, Hermite, Chebyshev, etc. The choice of eigenfunctions is made on the basis of convenience. For example, we may choose to use $\cos nx$, $\sin nx$. The eigenfunction series can then be written as a Fourier series
$$f(x)=\frac{a_0}{2}+\sum_{n=1}^\infty(a_n\cos nx+b_n\sin nx)\ \ \ \ \ (9)$$
where
\begin{align*}
a_n&=\frac{1}{\pi}\int_{-\pi}^\pi f(t)\cos nt dt,\\
b_n&=\frac{1}{\pi}\int_{-\pi}^\pi f(t)\sin nt dt,\ n=0,1,2,\cdots
\end{align*}
Using the coefficient formulas above, we find that
\begin{align*}
a_n&=0,\\
b_n&=\frac{h}{n\pi}(1-\cos n\pi)\\
&=\frac{h}{n\pi}(1-(-1)^n)\\
&=\left\{\begin{array}{ccc}
0 & \mbox{if} & n=\mbox{even}\\
\frac{2h}{n\pi} & \mbox{if} & n=\mbox{odd}
\end{array}\right.
\end{align*}
Hence the Fourier series expansion of $f(x)$ is given by
$$f(x)=\frac{2h}{\pi}\sum_{n=0}^\infty\frac{\sin(2n+1)x}{2n+1}$$
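As a sanity check, the coefficients can be computed directly in Maxima. The sketch below (mine) takes $h=1$ for concreteness and evaluates $b_n$ for the first few $n$:

(%i1) b(n) := (1/%pi)*(integrate(-(1/2)*sin(n*x), x, -%pi, 0)
               + integrate((1/2)*sin(n*x), x, 0, %pi))$
(%i2) makelist(b(n), n, 1, 6);
(%o2) [2/%pi, 0, 2/(3*%pi), 0, 2/(5*%pi), 0]

in agreement with $b_n=\frac{2h}{n\pi}$ for odd $n$ and $b_n=0$ for even $n$.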

If $N$ linearly independent eigenfunctions correspond to the same eigenvalue, the eigenvalue is said to be $N$-fold degenerate. In the above example, for each $n$ there are two possible solutions $\cos nx$, $\sin nx$. We may say the eigenfunctions are degenerate or the eigenvalue is degenerate.

Gram-Schmidt Orthogonalization

Consider an eigenfunction $\varphi_i$ and write
$$\int_a^b \varphi_i^2wdx=N_i^2$$
Since $\mathcal{L}u(x)+\lambda w(x)u(x)=0$ is linear, $\mu_i\varphi_i$ is also a solution for any constant $\mu_i$. If we set $\psi_i=\frac{\varphi_i}{N_i}$, then
$$\int_a^b\psi_i^2wdx=1$$
and
$$\int_a^b\psi_i(x)\psi_j(x)w(x)dx=\delta_{ij}$$

Suppose that $\{u_n(x)\}_{n=0}^\infty$ are eigenfunctions that are not mutually orthogonal. The Gram-Schmidt orthogonalization process allows us to construct orthonormal eigenfunctions $\{\varphi_n\}_{n=0}^\infty$ from $\{u_n(x)\}_{n=0}^\infty$.
Let us start with $n=0$. Let $\psi_0(x)=u_0(x)$. Then the normalized eigenfunction is denoted by $\varphi_0(x)$.
$$\varphi_0(x)=\frac{\psi_0(x)}{\left[\int\psi_0^2(x)wdx\right]^{1/2}}$$
For $n=1$, let
\begin{align*}
\psi_1&=u_1(x)-\langle u_1(x),\varphi_0(x)\rangle\varphi_0(x)\\
&=u_1(x)-\int u_1\varphi_0wdx\varphi_0(x)\end{align*}
and
$$\varphi_1(x)=\frac{\psi_1}{\left[\int\psi_1^2wdx\right]^{1/2}}$$

Figure 1

Figure 1 clearly shows how we can obtain $\psi_1$ using a vector projection. Note that the angle $\theta$ that appears in Figure 1 is for intuition only and does not depict a real angle.

Figure 2

For $n=2$, as one can see from Figure 2, $\psi_2$, which is orthogonal to both $\varphi_0$ and $\varphi_1$, can be obtained as
\begin{align*}
\psi_2&:=u_2(x)-\langle u_2(x),\varphi_0(x)\rangle\varphi_0(x)-\langle u_2(x),\varphi_1(x)\rangle\varphi_1(x)\\
&=u_2(x)-\int u_2(x)\varphi_0(x)w(x)dx\varphi_0(x)-\int u_2(x)\varphi_1(x)w(x)dx\varphi_1(x)
\end{align*}
The normalized eigenfunction $\varphi_2$ is then
$$\varphi_2:=\frac{\psi_2}{\left[\int\psi_2^2wdx\right]^{1/2}}$$
Now we can see a clear pattern: $\psi_n$, which is orthogonal to $\varphi_0,\cdots,\varphi_{n-1}$, is given by
\begin{align*}
\psi_n&:=u_n(x)-\sum_{j=0}^{n-1}\langle u_n(x),\varphi_j(x)\rangle\varphi_j(x)\\
&=u_n(x)-\sum_{j=0}^{n-1}\left[\int u_n(x)\varphi_j(x)w(x)dx\right]\varphi_j(x)
\end{align*}
with its normalization
$$\varphi_n:=\frac{\psi_n}{\left[\int\psi_n^2wdx\right]^{1/2}}$$

Example. Find an orthonormal set from
$$u_n(x)=x^n,\ n=0,1,2,\cdots$$
with $-1\leq x\leq 1$ and $w(x)=1$.

Solution. By Gram-Schmidt orthonormalization, we obtain
\begin{align*}
\varphi_0(x)&=\frac{1}{\sqrt{2}},\\
\varphi_1(x)&=\sqrt{\frac{3}{2}}x,\\
\varphi_2(x)&=\sqrt{\frac{5}{2}}\cdot\frac{1}{2}(3x^2-1),\\ \varphi_3(x)&=\sqrt{\frac{7}{2}}\cdot\frac{1}{2}(5x^3-3x),\\
\vdots\\
\varphi_n(x)&=\sqrt{\frac{2n+1}{2}}P_n(x),
\end{align*}
where $P_n(x)$ is the $n$th-order Legendre polynomial.
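For readers who want to experiment, here is a minimal Maxima sketch of the process (my own code, not from the reference) for the interval $[-1,1]$ with $w(x)=1$:

(%i1) ip(f, g) := integrate(f*g, x, -1, 1)$   /* inner product with w(x) = 1 */
(%i2) gramschmidt(us) := block([phis: [], psi],
        for u in us do (
          psi: u,
          /* subtract the projections onto the previous orthonormal functions */
          for phi in phis do psi: psi - ip(u, phi)*phi,
          /* normalize and append */
          phis: endcons(ratsimp(psi/sqrt(ip(psi, psi))), phis)),
        phis)$
(%i3) gramschmidt([1, x, x^2, x^3]);

The result is $\left[\frac{1}{\sqrt{2}},\ \sqrt{\frac{3}{2}}x,\ \sqrt{\frac{5}{2}}\cdot\frac{1}{2}(3x^2-1),\ \sqrt{\frac{7}{2}}\cdot\frac{1}{2}(5x^3-3x)\right]$, up to how Maxima chooses to display the radicals.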

References:

G. Arfken, Mathematical Methods for Physicists, 3rd Edition, Academic Press 1985

Self-Adjoint Differential Equations II: Hermitian Operators

Let $\mathcal{L}$ be a second-order self-adjoint differential operator. Then $\mathcal{L}u(x)$ may be written as
$$\mathcal{L}u(x)=\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right]+q(x)u(x)\ \ \ \ \ (1)$$ as we discussed here. Multiply (1) by $v^\ast$ ($v^\ast$ is the complex conjugate of $v$) and integrate
\begin{align*}
\int_a^bv^\ast\mathcal{L}udx&=\int_a^bv^\ast\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right]dx+\int_a^bv^\ast qudx\\
&=\int_a^bv^\ast d\left[p(x)\frac{du(x)}{dx}\right]+\int_a^bv^\ast qudx\\
&=v^\ast p\frac{du}{dx}|_a^b-\int_a^b {v^\ast}^\prime pu'dx+\int_a^bv^\ast qudx
\end{align*}
We may impose
$$v^\ast p\frac{du}{dx}|_a^b=0\ \ \ \ \ (2)$$
as a boundary condition.
\begin{align*}
-\int_a^b {v^\ast}^\prime pu'dx&=-\int_a^b {v^\ast}^\prime pdu\\
&=-{v^\ast}^\prime pu|_a^b+\int_a^b u(p{v^\ast}^\prime)'dx
\end{align*}
We may also impose
$$-{v^\ast}^\prime pu|_a^b=0\ \ \ \ \ (3)$$
as a boundary condition. Then
\begin{align*}
\int_a^bv^\ast\mathcal{L}udx&=\int_a^b u(p{v^\ast}^\prime)'dx+\int_a^bv^\ast qudx\\
&=\int_a^b u\mathcal{L}v^\ast dx
\end{align*}

Definition. A self-adjoint operator $\mathcal{L}$ is called a Hermitian operator with respect to the functions $u(x)$ and $v(x)$ if

$$\int_a^bv^\ast\mathcal{L}udx=\int_a^b u\mathcal{L}v^\ast dx\ \ \ \ \ (4)$$

That is, a self-adjoint operator $\mathcal{L}$ which satisfies the boundary conditions (2) and (3) is a Hermitian operator.

Hermitian Operators in Quantum Mechanics

In quantum mechanics, the differential operators need be neither second-order nor real. For example, the momentum operator is given by $\hat p=-i\hbar\frac{d}{dx}$. Therefore we need an extended notion of Hermitian operators in quantum mechanics.

Definition. The operator $\mathcal{L}$ is Hermitian if
$$\int \psi_1^\ast\mathcal{L}\psi_2 d\tau=\int(\mathcal{L}\psi_1)^\ast\psi_2 d\tau\ \ \ \ \ (5)$$
Note that (5) coincides with (4) if $\mathcal{L}$ is real. In terms of Dirac’s braket notation (5) can be written as
$$\langle\psi_1|\mathcal{L}\psi_2\rangle=\langle\mathcal{L}\psi_1|\psi_2\rangle$$
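For example, let us verify (5) for the momentum operator $\hat p=-i\hbar\frac{d}{dx}$, assuming as usual that the wave functions vanish at the boundary (or at infinity). Integrating by parts,
$$\int\psi_1^\ast\left(-i\hbar\frac{d\psi_2}{dx}\right)dx=-i\hbar\psi_1^\ast\psi_2\Big|_{\mbox{bdry}}+\int i\hbar\frac{d\psi_1^\ast}{dx}\psi_2dx=\int\left(-i\hbar\frac{d\psi_1}{dx}\right)^\ast\psi_2dx,$$
so $\hat p$ is Hermitian. Note that the bare derivative $\frac{d}{dx}$ by itself is not Hermitian (integration by parts produces a sign change); the factor $-i$ is precisely what compensates for it.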

The adjoint operator $A^\dagger$ of an operator $A$ is defined by
$$\int \psi_1^\ast A^\dagger \psi_2 d\tau=\int(A\psi_1)^\ast\psi_2 d\tau\ \ \ \ \ (6)$$ Again in terms of Dirac’s braket notation (6) can be written as
$$\langle\psi_1|A^\dagger\psi_2\rangle=\langle A\psi_1|\psi_2\rangle$$
If $A=A^\dagger$ then $A$ is said to be self-adjoint. Clearly, self-adjoint operators are Hermitian operators. However, the converse need not be true. Although we will not delve into this any deeper here, the difference is that Hermitian operators are always assumed to be bounded while self-adjoint operators are not necessarily bounded. That is, bounded self-adjoint operators are Hermitian operators. Physicists don't usually distinguish self-adjoint operators from Hermitian operators, and by Hermitian operators they often mean self-adjoint operators. In quantum mechanics, observables such as position, momentum, energy, and angular momentum are represented by (Hermitian) linear operators, and the measurements of observables are given by the eigenvalues of these operators. Physical observables are regarded as bounded and continuous, because measurements are made in a laboratory (so bounded), points of discontinuity are mathematical points, and nothing smaller than the Planck length can be observed. As is well known, any bounded linear operator defined on a Hilbert space is continuous.

For those who are interested: This may cause a notational confusion, but in mathematics the complex conjugate $a^\ast$ is replaced by $\bar a$ and the adjoint $a^\dagger$ is replaced by $a^\ast$. Let $\mathcal{H}$ be a Hilbert space. By the Riesz Representation Theorem, it can be shown that for any bounded linear operator $a:\mathcal{H}\longrightarrow\mathcal{H}'$, there exists a unique bounded linear operator $a^\ast: \mathcal{H}'\longrightarrow\mathcal{H}$ such that
$$\langle a^\ast\eta|\xi\rangle=\langle\eta|a\xi\rangle$$ for all $\xi\in\mathcal{H}$, $\eta\in\mathcal{H}'$. This $a^\ast$ is defined to be the adjoint of the bounded operator $a$. ${}^\ast$ defines an involution on $\mathcal{B}(\mathcal{H})$, the set of all bounded linear operators on $\mathcal{H}$, and $\mathcal{B}(\mathcal{H})$ with ${}^\ast$ becomes a C${}^\ast$-algebra. In the mathematical formulation of quantum mechanics, observables are represented by self-adjoint operators of the form $a^\ast a$, where $a\in\mathcal{B}(\mathcal{H})$. Note that $a^\ast a$ is positive, i.e. its eigenvalues are non-negative.

Definition. The expectation value of an operator $\mathcal{L}$ is
$$\langle\mathcal{L}\rangle=\int \psi^\ast\mathcal{L}\psi d\tau$$
$\langle\mathcal{L}\rangle$ corresponds to the result of a measurement of the physical quantity represented by $\mathcal{L}$ when the physical system is in a state described by $\psi$. The expectation value of an operator should be real and this is guaranteed if the operator is Hermitian. To see this suppose that $\mathcal{L}$ is Hermitian. Then
\begin{align*}
\langle\mathcal{L}\rangle^\ast&=\left[\int \psi^\ast\mathcal{L}\psi d\tau\right]^\ast\\
&=\int\psi\mathcal{L}^\ast\psi^\ast d\tau\\
&=\int(\mathcal{L}\psi)^\ast\psi d\tau\\
&=\int\psi^\ast\mathcal{L}\psi d\tau\ (\mbox{since $\mathcal{L}$ is Hermitian})\\
&=\langle\mathcal{L}\rangle
\end{align*}
That is, $\langle\mathcal{L}\rangle$ is real.

There are three important properties of Hermitian (self-adjoint) operators:

  • The eigenvalues of a Hermitian operator are real.
  • The eigenfunctions of a Hermitian operator are orthogonal.
  • The eigenfunctions of a Hermitian operator form a complete set.

References:

  1. G. Arfken, Mathematical Methods for Physicists, 3rd Edition, Academic Press 1985
  2. W. Greiner, Quantum Mechanics, An Introduction, 4th Edition, Springer-Verlag 2001
  3. P. Szekeres, A Course in Modern Mathematical Physics: Groups, Hilbert Space and Differential Geometry, Cambridge University Press 2004

Self-Adjoint Differential Equations I

Let $\mathcal{L}$ be the second-order linear differential operator
$$\mathcal{L}=p_0(x)\frac{d^2}{dx^2}+p_1(x)\frac{d}{dx}+p_2(x)$$
which acts on a function $u(x)$ as
$$\mathcal{L}u(x)=p_0(x)\frac{d^2u(x)}{dx^2}+p_1(x)\frac{du(x)}{dx}+p_2(x)u(x).\ \ \ \ \ (1)$$

Define an adjoint operator $\bar{\mathcal{L}}$ by
\begin{align*}
\bar{\mathcal{L}}u&:=\frac{d^2}{dx^2}[p_0u]-\frac{d}{dx}[p_1u]+p_2u\\
&=p_0\frac{d^2u}{dx^2}+(2p_0^\prime-p_1)\frac{du}{dx}+(p_0^{\prime\prime}-p_1^\prime+p_2)u.
\end{align*}
If $\mathcal{L}=\bar{\mathcal{L}}$, $\mathcal{L}$ is said to be self-adjoint. One can immediately see that $\mathcal{L}=\bar{\mathcal{L}}$ if and only if $$p_0^\prime=p_1.\ \ \ \ \ (2)$$ Let $p(x)=p_0(x)$ and $q(x)=p_2(x)$. Then
\begin{align*}
\mathcal{L}u=\bar{\mathcal{L}}u&=p\frac{d^2u}{dx^2}+\frac{dp}{dx}\frac{du}{dx}+qu\\
&=\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right]+qu(x).
\end{align*}
Note that one can transform a non-self-adjoint 2nd-order linear differential operator into a self-adjoint one. The idea is similar to that of finding an integrating factor for a first-order linear differential equation.

Suppose that (1) is not self-adjoint, i.e. $p_1\ne p_0'$. Multiply $\mathcal{L}$ by $\frac{f(x)}{p_0(x)}$. Then
$$\mathcal{L}':=\frac{f}{p_0}\mathcal{L}=f\frac{d^2u}{dx^2}+f\frac{p_1}{p_0}\frac{du}{dx}+f\frac{p_2}{p_0}u.$$
Suppose $\mathcal{L}'$ is self-adjoint. Then by (2)
$$f'=f\frac{p_1}{p_0}.$$
That is,
$$f(x)=\exp\left[\int^x\frac{p_1(t)}{p_0(t)}dt\right].$$
If $p_1=p_0'$, then
\begin{align*}
\frac{f(x)}{p_0}&=\frac{1}{p_0}\exp\left[\int^x\frac{p_1}{p_0}dt\right]\\
&=\frac{1}{p_0}\exp\left[\int^x\frac{p_0^\prime}{p_0}dt\right]\\
&=\frac{1}{p_0}\exp(\ln p_0(x))\\
&=\frac{1}{p_0(x)}\cdot p_0\\
&=1
\end{align*}
i.e. $f(x)=p_0(x)$ as expected.
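As a concrete illustration (my example, not from the reference), Hermite's equation $y^{\prime\prime}-2xy^\prime+2ny=0$ is not self-adjoint, since $p_0^\prime=0\ne -2x=p_1$. Its integrating factor can be computed in Maxima:

(%i1) p0: 1$
(%i2) p1: -2*x$
(%i3) f: exp(integrate(p1/p0, x));
(%o3) %e^(-x^2)

Multiplying by $\frac{f}{p_0}=e^{-x^2}$ puts Hermite's equation in the self-adjoint form $\frac{d}{dx}\left[e^{-x^2}y^\prime\right]+2ne^{-x^2}y=0$.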

Eigenfunctions, Eigenvalues

From separation of variables or directly from a physical problem, we have second-order linear differential equation of the form
$$\mathcal{L}u(x)+\lambda w(x)u(x)=0,\ \ \ \ \ (3)$$
where $\lambda$ is a constant and $w(x)>0$ is a function called a density or weighting function. The constant $\lambda$ is called an eigenvalue and $u(x)$ is called an eigenfunction.

Example. [Schrödinger Equation]

The Schrödinger equation
$$H\psi=E\psi$$
is of the form (3). Recall that $H$ is the Hamiltonian operator
$$H=-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}+V(x)$$
where $V(x)$ is a potential. So $H$ is a second-order linear differential operator. Writing the equation as $-H\psi+E\psi=0$, we may take $\mathcal{L}=-H$, weight function $w(x)=1$, and the energy $E$ as the eigenvalue. Clearly the Schrödinger equation is self-adjoint.

Example. [Legendre's Equation]

Legendre’s equation
$$(1-x^2)y^{\prime\prime}-2xy'+n(n+1)y=0$$ is of the form (3), where $\mathcal{L}y=(1-x^2)y^{\prime\prime}-2xy'$, $w(x)=1$, and $\lambda=n(n+1)$. Since $p_0^\prime=-2x=p_1$, Legendre's equation is self-adjoint.

Legendre Functions III: Special Values, Parity, Orthogonality

Special Values

From the generating function
$$g(x,t)=\frac{1}{(1-2xt+t^2)^{1/2}},$$
when $x=1$ we obtain
\begin{align*}
g(1,t)&=\frac{1}{(1-2t+t^2)^{1/2}}\\
&=\frac{1}{1-t}\\
&=\sum_{n=0}^\infty t^n,
\end{align*}
since $|t|<1$. On the other hand,
$$g(1,t)=\sum_{n=0}^\infty P_n(1)t^n.$$
So by comparison we get
$$P_n(1)=1.$$ Similarly, if we let $x=-1$,
$$P_n(-1)=(-1)^n.$$
For $x=0$, the generating function results
$$(1+t^2)^{-1/2}=1-\frac{1}{2}t^2+\frac{3}{8}t^4+\cdots+(-1)^n\frac{1\cdot 3\cdots (2n-1)}{2^nn!}t^{2n}+\cdots.$$
Thus we obtain
\begin{align*}
P_{2n}(0)&=(-1)^n\frac{1\cdot 3\cdots (2n-1)}{2^nn!}=(-1)^n\frac{(2n-1)!!}{(2n)!!},\\
P_{2n+1}(0)&=0,\ n=0,1,2,\cdots.
\end{align*}
Recall that the double factorial !! is defined by
\begin{align*}
(2n)!!&=2\cdot 4\cdot 6\cdots (2n),\\
(2n-1)!!&=1\cdot 3\cdot 5\cdots (2n-1).
\end{align*}

Parity

$g(t,x)=g(-t,-x)$, that is
$$\sum_{n=0}^\infty P_n(x)t^n=\sum_{n=0}^\infty P_n(-x)(-t)^n$$
which yields the parity relation
$$P_n(-x)=(-1)^nP_n(x).\ \ \ \ \ (1)$$
(1) tells us that if $n$ is even, $P_n(x)$ is an even function and if $n$ is odd, $P_n(x)$ is an odd function.

Orthogonality

Multiply Legendre's differential equation
$$\frac{d}{dx}[(1-x^2)P_n'(x)]+n(n+1)P_n(x)=0\ \ \ \ \ (2)$$ by $P_m(x)$.
$$P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]+n(n+1)P_m(x)P_n(x)=0.\ \ \ \ \ (3)$$
Replace $n$ by $m$ in (2) and then multiply the resulting equation by $P_n(x)$.
$$P_n(x)\frac{d}{dx}[(1-x^2)P_m'(x)]+m(m+1)P_m(x)P_n(x)=0.\ \ \ \ \ (4)$$
Subtract (4) from (3) and integrate the resulting equation with respect to $x$ from $-1$ to 1.
\begin{align*}
\int_{-1}^1&\left\{P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]-P_n(x)\frac{d}{dx}[(1-x^2)P_m'(x)]\right\}dx\\
&=[m(m+1)-n(n+1)]\int_{-1}^1P_m(x)P_n(x)dx.\end{align*}
Using integration by parts,
\begin{align*}
\int_{-1}^1P_m(x)\frac{d}{dx}[(1-x^2)P_n'(x)]dx&=\\&(1-x^2)P_m(x)P_n'(x)|_{-1}^1-\int_{-1}^1(1-x^2)P_m'(x)P_n'(x)dx\\
&=-\int_{-1}^1(1-x^2)P_m'(x)P_n'(x)dx.
\end{align*}
Since integrating the second term inside $\{\ \ \}$ by parts gives the same value, the LHS vanishes.
Hence for $m\ne n$,
$$\int_{-1}^1P_m(x)P_n(x)dx=0.\ \ \ \ \ (5)$$
That is, $P_m(x)$ and $P_n(x)$ are orthogonal for the interval $[-1,1]$.
For $x=\cos\theta$, the orthogonality (5) is given by
$$\int_0^\pi P_n(\cos\theta)P_m(\cos\theta)\sin\theta d\theta=0.$$

Integrate
$$(1-2xt+t^2)^{-1}=\left[\sum_{n=0}^\infty P_n(x)t^n\right]^2$$
with respect to $x$ from $-1$ to $1$. Due to the orthogonality (5), the integrals of all the cross terms on the RHS vanish, and so we obtain
$$\int_{-1}^1\frac{dx}{1-2xt+t^2}=\sum_{n=0}^\infty \left\{\int_{-1}^1[P_n(x)]^2dx\right\}t^{2n}.$$
\begin{align*}
\int_{-1}^1\frac{dx}{1-2xt+t^2}&=\frac{1}{2t}\int_{(1-t)^2}^{(1+t)^2}\frac{dy}{y}\\
&=\frac{1}{t}\ln\left(\frac{1+t}{1-t}\right)\\
&=\sum_{n=0}^\infty\frac{2}{2n+1}t^{2n}\ (\mbox{since $|t|<1$}).
\end{align*}
Therefore we have the normalization of the Legendre polynomial $P_n(x)$
$$\int_{-1}^1[P_n(x)]^2dx=\frac{2}{2n+1}.$$
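Both the orthogonality (5) and this normalization are easy to verify in Maxima for small $n$:

(%i1) load("orthopoly")$
(%i2) integrate(legendre_p(2, x)*legendre_p(3, x), x, -1, 1);
(%o2) 0
(%i3) makelist(integrate(legendre_p(n, x)^2, x, -1, 1), n, 0, 3);
(%o3) [2, 2/3, 2/5, 2/7]

matching $\frac{2}{2n+1}$ for $n=0,1,2,3$.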

Expansion of Functions

Suppose that
$$\sum_{n=0}^\infty a_nP_n(x)=f(x).\ \ \ \ \ (6)$$
Multiply (6) by $P_m(x)$ and integrate with respect to $x$ from $-1$ to 1:
$$\sum_{n=0}^\infty a_n\int_{-1}^1 P_n(x)P_m(x)dx=\int_{-1}^1f(x)P_m(x)dx.$$
By the orthogonality (5), we obtain
$$\frac{2}{2m+1}a_m=\int_{-1}^1f(x)P_m(x)dx\ \ \ \ \ (7)$$
and hence $f(x)$ can be written as
$$f(x)=\sum_{n=0}^\infty\frac{2n+1}{2}\left(\int_{-1}^1 f(t)P_n(t)dt\right)P_n(x).\ \ \ \ \ (8)$$
This expansion in a series of Legendre polynomials is called a Legendre series. Clearly if $f(x)$ is continuous (or integrable) on the interval $[-1,1]$, it can be expanded as a Legendre series.

(7) can be considered as an integral transform (a finite Legendre transform), and (8) can be considered as the inverse transform.

Let us consider the integral operator
$$\mathcal{P}_m:=P_m(x)\frac{2m+1}{2}\int_{-1}^1P_m(t)[\ \cdot\ ]dt.\ \ \ \ \ (9)$$
Then
$$\mathcal{P}_mf=a_mP_m(x).$$
The operator (9) projects out the $m$th component of the function $f(x)$.
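Here is a short Maxima sketch (mine) of the finite Legendre transform (7) applied to the sample function $f(x)=x^3$:

(%i1) load("orthopoly")$
(%i2) a(m, f) := (2*m+1)/2 * integrate(f*legendre_p(m, x), x, -1, 1)$
(%i3) makelist(a(m, x^3), m, 0, 3);
(%o3) [0, 3/5, 0, 2/5]
(%i4) ratsimp(sum(a(m, x^3)*legendre_p(m, x), m, 0, 3));
(%o4) x^3

So $x^3=\frac{3}{5}P_1(x)+\frac{2}{5}P_3(x)$, and the inverse transform (8) recovers $f$.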

Legendre Functions II: Recurrence Relations and Special Properties

In this lecture, we derive some important recurrence relations of Legendre functions and use them to show that Legendre functions are indeed solutions of a differential equation, called Legendre’s differential equation.

Differentiating the generating function
$$g(x,t)=(1-2xt+t^2)^{-1/2}=\sum_{n=0}^\infty P_n(x)t^n,\ |t|<1\ \ \ \ \ \mbox{(1)}$$
with respect to $t$, we get
\begin{align*}
\frac{\partial g(x,t)}{\partial t}&=\frac{x-t}{(1-2xt+t^2)^{3/2}}\ \ \ \ \ \mbox{(2)}\\&=\sum_{n=0}^\infty nP_n(x)t^{n-1}.\ \ \ \ \ \mbox{(3)}\end{align*}
(2) can be written as
$$\frac{x-t}{(1-2xt+t^2)(1-2xt+t^2)^{1/2}}=\frac{(x-t)(1-2xt+t^2)^{-1/2}}{1-2xt+t^2}.$$
By (1) and (3), we obtain
$$(x-t)\sum_{n=0}^\infty P_n(x)t^n=(1-2xt+t^2)\sum_{n=0}^\infty nP_n(x) t^{n-1}$$ or
$$(1-2xt+t^2)\sum_{n=0}^\infty nP_n(x) t^{n-1}+(t-x)\sum_{n=0}^\infty P_n(x)t^n=0$$
which can be written out as
\begin{align*}
\sum_{n=0}^\infty nP_n(x)t^{n-1}-\sum_{n=0}^\infty &2xnP_n(x)t^n+\sum_{n=0}^\infty nP_n(x)t^{n+1}\\&+\sum_{n=0}^\infty P_n(x)t^{n+1}-\sum_{n=0}^\infty xP_n(x)t^n=0.\ \ \ \ \ \mbox{(4)}\end{align*}
In (4) replace $n$ by $n+1$ in the first term, and replace $n$ by $n-1$ in the third and fourth term. Then (4) becomes
\begin{align*}
\sum_{n=0}^\infty (n+1)P_{n+1}(x)t^n-\sum_{n=0}^\infty &2xnP_n(x)t^n+\sum_{n=0}^\infty (n-1)P_{n-1}(x)t^n\\&+\sum_{n=0}^\infty P_{n-1}(x)t^n-\sum_{n=0}^\infty xP_n(x)t^n=0.
\end{align*}
This can be simplified to
$$\sum_{n=0}^\infty[(n+1)P_{n+1}(x)-(2n+1)xP_n(x)+nP_{n-1}(x)]t^n=0$$
which implies that
$$(2n+1)xP_n(x)=(n+1)P_{n+1}(x)+nP_{n-1}(x).\ \ \ \ \ \mbox{(5)}$$
The recurrence relation (5) can be used to calculate Legendre polynomials. For example, we found $P_0(x)=1$ and $P_1(x)=x$ here. For $n=1$, (5) is
$$3xP_1(x)=2P_2(x)+P_0(x)$$
i.e.
$$P_2(x)=\frac{1}{2}(3x^2-1).$$
Continuing this using the recurrence relation (5), we obtain
\begin{align*}
P_3(x)&=\frac{1}{2}(5x^3-3x),\\
P_4(x)&=\frac{1}{8}(35x^4-30x^2+3),\\
P_5(x)&=\frac{1}{8}(63x^5-70x^3+15x),\\
\cdots.
\end{align*}
A great advantage of having the recurrence relation (5) is that one can easily calculate Legendre polynomials on a computer with a simple program. This can be done, for instance, in Maxima.

Let us load the following simple program to run the recurrence relation (5).

(%i1) Legendre(n,x):=block ([],
if n = 0 then 1
else
if n = 1 then x
else ((2*n - 1)*x*Legendre(n - 1, x) - (n - 1)*Legendre(n - 2, x))/n);

(%o1) Legendre(n, x) := block([], if n = 0 then 1
else (if n = 1 then x else ((2 n - 1) x Legendre(n - 1, x)
- (n - 1) Legendre(n - 2, x))/n))

Now we are ready to calculate Legendre polynomials. For example, let us calculate $P_3(x)$.

(%i2) Legendre(3,x);

The output is not exactly what we may like because it is not simplified.

In Maxima, simplification can be done by the command ratsimp.

(%i3) ratsimp(Legendre(3,x));

The output is now $\frac{1}{2}(5x^3-3x)$, i.e. $P_3(x)$.

That looks better. Let us calculate one more, say $P_4(x)$:

(%i4) ratsimp(Legendre(4,x));

which returns $\frac{1}{8}(35x^4-30x^2+3)=P_4(x)$.

Now we differentiate $g(x,t)$ with respect to $x$.
$$\frac{\partial g(x,t)}{\partial x}=\frac{t}{(1-2xt+t^2)^{3/2}}=\sum_{n=0}^\infty P_n'(x)t^n.$$
From this we obtain
$$(1-2xt+t^2)\sum_{n=0}^\infty P_n'(x)t^n-t\sum_{n=0}^\infty P_n(x)t^n=0$$
which leads to
$$P_{n+1}'(x)+P_{n-1}'(x)=2xP_n'(x)+P_n(x).\ \ \ \ \ \mbox{(6)}$$
Differentiating (5) with respect to $x$, multiplying by 2, and then subtracting $2n+1$ times (6), we get
$$(2n+1)P_n(x)=P_{n+1}'(x)-P_{n-1}'(x).\ \ \ \ \ \mbox{(7)}$$
$\frac{1}{2}[(6)+(7)]$ results in
$$P_{n+1}'(x)=(n+1)P_n(x)+xP_n'(x).\ \ \ \ \ \mbox{(8)}$$
$\frac{1}{2}[(6)-(7)]$ results in
$$P_{n-1}'(x)=-nP_n(x)+xP_n'(x).\ \ \ \ \ \mbox{(9)}$$
Replacing $n$ by $n-1$ in (8) and using (9) to eliminate $P_{n-1}'(x)$, we obtain
$$(1-x^2)P_n'(x)=nP_{n-1}(x)-nxP_n(x).\ \ \ \ \ \mbox{(10)}$$
Differentiate (10) with respect to $x$ and add the result to $n$ times (9):
$$(1-x^2)P_n^{\prime\prime}(x)-2xP_n'(x)+n(n+1)P_n(x)=0.\ \ \ \ \ \mbox{(11)}$$
The linear second-order differential equation (11) is called Legendre’s differential equation and as seen $P_n(x)$ satisfies (11). This is why $P_n(x)$ is called a Legendre polynomial.

In physics (11) is often expressed in terms of differentiation with respect to $\theta$. Let $x=\cos\theta$. Then by the chain rule,
\begin{align*}
\frac{dP_n(\cos\theta)}{d\theta}&=-\sin\theta\frac{dP_n(x)}{dx},\ \ \ \ \ \mbox{(12)}\\ \frac{d^2P_n(\cos\theta)}{d\theta^2}&=-x\frac{dP_n(x)}{dx}+(1-x^2)\frac{d^2P_n(x)}{dx^2}.\ \ \ \ \ \mbox{(13)}
\end{align*}
Using (12) and (13), Legendre’s differential equation (11) can be written as
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left[\sin\theta\frac{dP_n(\cos\theta)}{d\theta}\right]+n(n+1)P_n(\cos\theta)=0.$$

Tensors I

Tensors may be considered as a generalization of vectors and covectors. They are extremely important quantities for studying differential geometry and physics.

Let $M^n$ be an $n$-dimensional differentiable manifold. For each $x\in M^n$, let $E_x=T_xM^n$, i.e. the tangent space to $M^n$ at $x$. We denote the canonical basis of $E$ by $\partial=\left(\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}\right)$ and its dual basis by $\sigma=dx=(dx^1,\cdots,dx^n)$, where $x^1,\cdots,x^n$ are local coordinates. The canonical basis $\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}$ is also simply denoted by $\partial_1,\cdots,\partial_n$.

Covariant Tensors

Definition. A covariant tensor of rank $r$ is a multilinear real-valued function
$$Q:E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
of $r$-tuples of vectors. A covariant tensor of rank $r$ is also called a tensor of type $(0,r)$ or, shortly, a $(0,r)$-tensor. Note that the values of $Q$ must be independent of the basis in which the components of the vectors are expressed. A covariant vector (also called a covector or a 1-form) is a covariant tensor of rank 1. An important example of a covariant tensor of rank 2 is the metric tensor $G$:
$$G(v,w)=\langle v,w\rangle=\sum_{i,j}g_{ij}v^iw^j.$$

In components, by multilinearity
\begin{align*}
Q(v_1\cdots,v_r)&=Q\left(\sum_{i_1}v_1^{i_1}\partial_{i_1},\cdots,\sum_{i_r}v_r^{i_r}\partial_{i_r}\right)\\
&=\sum_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}Q(\partial_{i_1},\cdots,\partial_{i_r}).
\end{align*}
Denote $Q(\partial_{i_1},\cdots,\partial_{i_r})$ by $Q_{i_1,\cdots,i_r}$. Then
$$Q(v_1\cdots,v_r)=\sum_{i_1,\cdots,i_r}Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.\ \ \ \ \ \mbox{(1)}$$
Using Einstein's summation convention, (1) can be written shortly as
$$Q(v_1\cdots,v_r)=Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.$$
The set of all covariant tensors of rank $r$ forms a vector space over $\mathbb{R}$. The number of components in such a tensor is $n^r$. The vector space of all covariant $r$-th rank tensors is denoted by
$$E^\ast\otimes E^\ast\otimes\cdots\otimes E^\ast=\otimes^r E^\ast.$$

If $\alpha,\beta\in E^\ast$, i.e. covectors, we can form the 2nd rank covariant tensor, the tensor product $\alpha\otimes\beta$ of $\alpha$ and $\beta$: Define $\alpha\otimes\beta: E\times E\longrightarrow\mathbb{R}$ by
$$\alpha\otimes\beta(v,w)=\alpha(v)\beta(w).$$
If we write $\alpha=a_idx^i$ and $\beta=b_jdx^j$, then
$$(\alpha\otimes\beta)_{ij}=\alpha\otimes\beta(\partial_i,\partial_j)=\alpha(\partial_i)\beta(\partial_j)=a_ib_j.$$

Contravariant Tensors

A contravariant vector, i.e. an element of $E$ can be considered as a linear functional $v: E^\ast\longrightarrow\mathbb{R}$ defined by
$$v(\alpha)=\alpha(v)=a_iv^i,\ \alpha=a_idx^i\in E^\ast.$$

Definition. A contravariant tensor of rank $s$ is a multilinear real-valued function $T$ on $s$-tuples of covectors
$$T:E^\ast\times E^\ast\times\cdots\times E^\ast\longrightarrow\mathbb{R}.$$ A contravariant tensor of rank $s$ is also called a tensor of type $(s,0)$ or shortly $(s,0)$-tensor.
For 1-forms $\alpha_1,\cdots,\alpha_s$
$$T(\alpha_1,\cdots,\alpha_s)=a_{1_{i_1}}\cdots a_{s_{i_s}}T^{i_1\cdots i_s}$$
where
$$T^{i_1\cdots i_s}:=T(dx^{i_1},\cdots,dx^{i_s}).$$
The space of all contravariant tensors of rank $s$ is denoted by
$$E\otimes E\otimes\cdots\otimes E:=\otimes^s E.$$
Contravariant vectors are contravariant tensors of rank 1. An example of a contravariant tensor of rank 2 is the inverse of the metric tensor $G^{-1}=(g^{ij})$:
$$G^{-1}(\alpha,\beta)=g^{ij}a_ib_j.$$

Given a pair $v,w$ of contravariant vectors, we can form the tensor product $v\otimes w$ in the same manner as we did for covariant vectors. It is the 2nd rank contravariant tensor with components $(v\otimes w)^{ij}=v^iw^j$. The metric tensor $G$ and its inverse $G^{-1}$ may be written as
$$G=g_{ij}dx^i\otimes dx^j\ \mbox{and}\ G^{-1}=g^{ij}\partial_i\otimes\partial_j.$$

Mixed Tensors

Definition. A mixed tensor, $r$ times covariant and $s$ times contravariant, is a real multilinear function $W$
$$W: E^\ast\times E^\ast\times\cdots\times E^\ast\times E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
on $s$-tuples of covectors and $r$-tuples of vectors. It is also called a tensor of type $(s,r)$ or simply $(s,r)$-tensor. By multilinearity
$$W(\alpha_1,\cdots,\alpha_s, v_1,\cdots, v_r)=a_{1_{i_1}}\cdots a_{s_{i_s}}W^{i_1\cdots i_s}{}_{j_1\cdots j_r}v_1^{j_1}\cdots v_r^{j_r}$$
where
$$W^{i_1\cdots i_s}{}_{j_1\cdots j_r}:=W(dx^{i_1},\cdots,dx^{i_s},\partial_{j_1},\cdots,\partial_{j_r}).$$

A 2nd rank mixed tensor may arise from a linear operator $A: E\longrightarrow E$. Define $W_A: E^\ast\times E\longrightarrow\mathbb{R}$ by $W_A(\alpha,v)=\alpha(Av)$. Let $A=(A^i{}_j)$ be the matrix associated with $A$, i.e. $A(\partial_j)=\partial_i A^i{}_j$. Let us calculate the components of $W_A$:
$$W_A^i{}_j=W_A(dx^i,\partial_j)=dx^i(A(\partial_j))=dx^i(\partial_kA^k{}_j)=\delta^i_kA^k{}_j=A^i{}_j.$$
So the matrix of the mixed tensor $W_A$ is just the matrix associated with $A$. Conversely, given a mixed tensor $W$, once covariant and once contravariant, we can define a linear transformation $A$ such that $W(\alpha,v)=\alpha(Av)$. We do not distinguish between a linear transformation $A$ and its associated mixed tensor $W_A$. In components, $W(\alpha,v)$ is written as
$$W(\alpha,v)=a_iA^i{}_jv^j=aAv.$$

The tensor product $w\otimes\beta$ of a vector and a covector is the mixed tensor defined by
$$(w\otimes\beta)(\alpha,v)=\alpha(w)\beta(v).$$ The associated transformation can be written as
$$A=A^i{}_j\partial_i\otimes dx^j=\partial_i\otimes A^i{}_jdx^j.$$

For math undergraduates, the different ways of writing indices (raised, lowered, and mixed) in tensor notation can be very confusing. The main reason is that in standard math courses such as linear algebra or elementary differential geometry (the classical differential geometry of curves and surfaces in $\mathbb{E}^3$) the matrix of a linear transformation is usually written as $A_{ij}$. Physics undergraduates don't usually get a chance to learn tensors in undergraduate physics courses. In order to study more advanced differential geometry or physics, such as the theory of special and general relativity and field theory, one must be able to distinguish three different ways of writing matrices: $A_{ij}$, $A^{ij}$, and $A^i{}_j$. To summarize, $A_{ij}$ and $A^{ij}$ are bilinear forms on $E$ and $E^\ast$, respectively, defined by
$$A_{ij}v^iw^j\ \mbox{and}\ A^{ij}a_ib_j\ (\mbox{respectively}).$$ $A^i{}_j$ is the matrix of a linear transformation $A: E\longrightarrow E$.

Let $(E,\langle\ ,\ \rangle)$ be an inner product space. Given a linear transformation $A: E\longrightarrow E$ (i.e. a mixed tensor), one can associate a covariant bilinear form $A'$ by
$$A'(v,w):=\langle v,Aw\rangle=v^ig_{ij}A^j{}_k w^k.$$ So we see that the matrix of $A'$ is
$$A'_{ik}=g_{ij}A^j{}_k.$$ This process is described as "lowering the index $j$, making it an $i$, by means of the metric tensor $g_{ij}$." In tensor analysis one uses the same letter, i.e. instead of $A'$, one writes
$$A_{ik}:=g_{ij}A^j{}_k.$$ This is clearly a covariant tensor. In general, the components of the associated covariant tensor $A_{ik}$ differ from those of the mixed tensor $A^i{}_j$. But if the basis is orthonormal, i.e. $g_{ij}=\delta_{ij}$, then they coincide. That is the reason why we simply write $A_{ij}$ without making any distinction in linear algebra or in elementary differential geometry.
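Computationally, lowering an index is just a matrix product: $A_{ik}=g_{ij}A^j{}_k$ is the $(i,k)$ entry of $gA$. A toy Maxima example (mine, not from the text) with a diagonal but non-Euclidean metric shows that the lowered components indeed differ from the mixed ones:

(%i1) g: matrix([1, 0], [0, 2])$    /* a toy metric g_ij */
(%i2) A: matrix([1, 2], [3, 4])$    /* mixed tensor A^i_j as a matrix */
(%i3) g . A;                        /* lowered tensor A_ik = g_ij A^j_k */
(%o3) matrix([1, 2], [6, 8])

With $g$ the identity matrix (an orthonormal basis), the product would return $A$ itself, which is the coincidence noted above.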

Similarly, one may associate to the linear transformation $A$ a contravariant bilinear form
$$\bar A(\alpha,\beta)=a_iA^i{}_jg^{jk}b_k$$ whose matrix components can be written as
$$A^{ik}=A^i{}_jg^{jk}.$$

Note that the metric tensor $g_{ij}$ represents a linear map from $E$ to $E^\ast$, sending the vector with components $v^j$ into the covector with components $g_{ij}v^j$. In quantum mechanics, the covector $g_{ij}v^j$ is denoted by $\langle v|$ and called a bra vector, while the vector $v^j$ is denoted by $|v\rangle$ and called a ket vector. Usually the inner product on $E$
$$\langle\ ,\ \rangle:E\times E\longrightarrow\mathbb{R};\ \langle v,w\rangle=g_{ij}v^iw^j$$ is considered as a covariant tensor of rank 2. But in quantum mechanics $\langle v,w\rangle$ is not considered as a covariant tensor $g_{ij}$ of rank 2 acting on a pair of vectors $(v,w)$, rather it is regarded as the braket $\langle v|w\rangle$, a bra vector $\langle v|$ acting on a ket vector $|w\rangle$.

Helmholtz Equation

Helmholtz equation
$$\nabla^2\psi+k^2\psi=0\ \ \ \ \ \mbox{(1)}$$
is extremely important in physics. Solving many physically important partial differential equations, such as the heat equation, the wave equation (Klein-Gordon equation), Maxwell's equations, and the Schrödinger equation, often requires solving the Helmholtz equation (1).

In these notes, we discuss how to solve the Helmholtz equation by separation of variables in rectangular, cylindrical, and spherical coordinate systems. The solutions discussed here will be used when solving boundary value problems associated with the Helmholtz equation.

Helmholtz Equation in Rectangular Coordinates

Assume that $\psi(x,y,z)=X(x)Y(y)Z(z)$. Then the equation (1) becomes
$$YZ\frac{d^2X}{dx^2}+XZ\frac{d^2Y}{dy^2}+XY\frac{d^2Z}{dz^2}+k^2XYZ=0.\ \ \ \ \ \mbox{(2)}$$
Dividing (2) by $XYZ$, we obtain
$$\frac{1}{X}\frac{d^2X}{dx^2}+\frac{1}{Y}\frac{d^2Y}{dy^2}+\frac{1}{Z}\frac{d^2Z}{dz^2}+k^2=0.\ \ \ \ \mbox{(3)}$$
Let us write (3) as
$$\frac{1}{X}\frac{d^2X}{dx^2}=-\frac{1}{Y}\frac{d^2Y}{dy^2}-\frac{1}{Z}\frac{d^2Z}{dz^2}-k^2.\ \ \ \ \ \mbox{(4)}$$
Now we have a paradox: the LHS of (4) depends only on the variable $x$, while the RHS of (4) depends on the variables $y$ and $z$. One way to avoid this paradox is to assume that both sides of (4) equal a constant, say $-l^2$. If you are wondering why we choose a negative constant, the reason comes from physics: we need an oscillating solution, which is obtained by choosing a negative separation constant. Often the boundary conditions for the Helmholtz equation lead to a trivial solution when the separation constant is positive. Continuing in a similar manner, we separate the Helmholtz equation into three ordinary differential equations:
\begin{align*}
\frac{1}{X}\frac{d^2 X}{dx^2}&=-l^2,\\
\frac{1}{Y}\frac{d^2Y}{dy^2}&=-m^2,\\
\frac{1}{Z}\frac{d^2Z}{dz^2}&=-n^2,
\end{align*}
where $k^2=l^2+m^2+n^2$.

Each mode is given by
$$\psi_{lmn}(x,y,z)=X_l(x)Y_m(y)Z_n(z)$$ and the most general solution is given by the linear combination of the modes
$$\psi(x,y,z)=\sum_{l,m,n}a_{lmn}\psi_{lmn}(x,y,z).$$
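Each separated equation is a harmonic oscillator equation whose solutions are the sines and cosines appearing in the modes. For instance, Maxima's ode2 solves the $X$ equation directly (assuming a nonzero separation constant $l$); the output should be something like:

(%i1) assume(l > 0)$
(%i2) ode2('diff(X, x, 2) + l^2*X = 0, X, x);
(%o2) X = %k1*sin(l*x) + %k2*cos(l*x)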

Helmholtz Equation in Cylindrical Coordinates

In the cylindrical coordinate system $(\rho,\varphi,z)$, the Helmholtz equation (1) is written as
$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right)+\frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2}+\frac{\partial^2\psi}{\partial z^2}+k^2\psi=0.\ \ \ \ \ \mbox{(5)}$$

We assume that $\psi(\rho,\varphi,z)=P(\rho)\Phi(\varphi)Z(z)$. Then (5) can be written as
$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right)+\frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2}+\frac{1}{Z}\frac{d^2Z}{dz^2}+k^2=0.\ \ \ \ \ \mbox{(6)}$$
As we have done in rectangular coordinate system, by introducing the separation constants we can separate (6) into three ordinary differential equations
\begin{align*}
\frac{d^2Z}{dz^2}=l^2Z,\\
\frac{d^2\Phi}{d\varphi^2}=-m^2\Phi,\\
\rho\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right)+(n^2\rho^2-m^2)P=0,\ \ \ \ \ \mbox{(7)}
\end{align*}
where $n^2=k^2+l^2$. The last equation (7) is Bessel’s differential equation.

The general solution of Helmholtz equation in cylindrical coordinates is given by
$$\psi(\rho,\varphi,z)=\sum_{m,n}a_{mn}P_{mn}(\rho)\Phi_m(\varphi)Z_n(z).$$

Helmholtz Equation in Spherical Coordinates

In spherical coordinates $(r,\theta,\varphi)$, Helmholtz equation (1) is written as
$$\frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right)+\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right)+\frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right]=-k^2\psi.\ \ \ \ \ \mbox{(8)}$$
Assume that $\psi(r,\theta,\varphi)=R(r)\Theta(\theta)\Phi(\varphi)$. Then (8) can be written as
$$\frac{1}{Rr^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)+\frac{1}{\Theta r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)+\frac{1}{\Phi r^2\sin^2\theta}\frac{d^2\Phi}{d\varphi^2}=-k^2.\ \ \ \ \ \mbox{(9)}$$
By introducing separation constants, (9) is separated into three ordinary differential equations
\begin{align*}
\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2}=-m^2,\\
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)+\left(Q-\frac{m^2}{\sin^2\theta}\right)\Theta=0,\ \ \ \ \ \mbox{(10)}\\
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)+\left(k^2-\frac{Q}{r^2}\right)R=0.\ \ \ \ \ \mbox{(11)}
\end{align*}
The second equation (10) is the associated Legendre equation with $Q=l(l+1)$. The third equation (11) is the spherical Bessel equation for $k^2>0$.

The general solution of Helmholtz equation (8) is then given by
$$\psi(r,\theta,\varphi)=\sum_{Q,m}a_{Qm}R_Q(r)\Theta_{Qm}(\theta)\Phi_m(\varphi).$$

The restriction that $k^2$ be a constant is unnecessary; the separation process is still possible, for instance, when $k^2=f(r)$. In that case (11) becomes the associated Laguerre equation, which appears in the hydrogen atom problem in quantum mechanics.

Legendre Functions I: A Physical Origin of Legendre Functions

Consider an electric charge $q$ placed on the $z$-axis at $z=a$.

Electric Potential

The electrostatic potential of charge $q$ is $$\varphi=\frac{1}{4\pi\epsilon_0}\frac{q}{r_1}.\ \ \ \ \ \mbox{(1)}$$ Using the Law of Cosines, one can write $r_1$ in terms of $r$ and $\theta$:
$$r_1=\sqrt{r^2+a^2-2ar\cos\theta}$$
and thereby the electrostatic potential (1) can be written as
$$\varphi=\frac{q}{4\pi\epsilon_0}(r^2+a^2-2ar\cos\theta)^{-1/2}.\ \ \ \ \ \mbox{(2)}$$

Recall the Binomial Expansion Formula: Suppose that $x,y\in\mathbb{R}$ and $|x|>|y|$. Then
$$(x+y)^r=\sum_{k=0}^\infty\begin{pmatrix}r\\k\end{pmatrix}x^{r-k}y^k,\ \ \ \ \ \mbox{(3)}$$ where $\begin{pmatrix}r\\k\end{pmatrix}=\frac{r(r-1)\cdots(r-k+1)}{k!}$ is the generalized binomial coefficient (which reduces to $\frac{r!}{k!(r-k)!}$ when $r$ is a non-negative integer).

Legendre Polynomials: If $r>a$ (or more specifically $r^2>|a^2-2ar\cos\theta|$), we can expand the radical to obtain:
$$\varphi=\frac{q}{4\pi\epsilon_0 r}\sum_{n=0}^\infty P_n(\cos\theta)\left(\frac{a}{r}\right)^n.$$ The coefficients $P_n$ are called the Legendre polynomials. The Legendre polynomials can be defined by the generating function
$$g(t,x)=(1-2xt+t^2)^{-1/2}=\sum_{n=0}^\infty P_n(x)t^n,\ \ \ \ \ \mbox{(4)}$$ where $|t|<1$. Using the binomial expansion formula (3), we obtain
\begin{align*}
(1-2xt+t^2)^{-1/2}&=\sum_{n=0}^\infty\frac{(2n)!}{2^{2n}(n!)^2}(2xt-t^2)^n\ \ \ \ \ \mbox{(5)}\\
&=\sum_{n=0}^\infty\frac{(2n-1)!!}{(2n)!!}(2xt-t^2)^n.
\end{align*}
Let us write out the first three terms:
\begin{align*}
\frac{0!}{2^0(0!)^2}&(2xt-t^2)^0+\frac{2!}{2^2(1!)^2}(2xt-t^2)^1+\frac{4!}{2^4(2!)^2}(2xt-t^2)^2\\
&=1+xt+\left(\frac{3}{2}x^2-\frac{1}{2}\right)t^2+\mathcal{O}(t^3).
\end{align*}
Thus we see that $P_0(x)=1$, $P_1(x)=x$, and $P_2(x)=\frac{3}{2}x^2-\frac{1}{2}$. In practice, we don’t calculate Legendre polynomials using the power series (5). Instead, we use the recurrence relation of Legendre polynomials that will be discussed later.
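These coefficients can be checked in Maxima by Taylor-expanding the generating function in $t$:

(%i1) taylor(1/sqrt(1 - 2*x*t + t^2), t, 0, 3);

which returns $1+xt+\frac{3x^2-1}{2}t^2+\frac{5x^3-3x}{2}t^3+\cdots$, i.e. $P_0$, $P_1$, $P_2$, $P_3$ as the coefficients.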

The Maxima name for the Legendre polynomial $P_n(x)$ is legendre_p(n,x). The following graphs of $P_2(x)$, $P_3(x)$, $P_4(x)$, $P_5(x)$ for $-1\leq x\leq 1$ are made with Maxima using the command:

plot2d([legendre_p(2,x),legendre_p(3,x),legendre_p(4,x),legendre_p(5,x)],[x,-1,1]);

Legendre Polynomials

Now expand the polynomial $(2xt-t^2)^n$ in the power series (5):
\begin{align*}
(1-2xt+t^2)^{-1/2}&=\sum_{n=0}^\infty\frac{(2n)!}{2^{2n}(n!)^2}t^n\sum_{k=0}^n(-1)^k\frac{n!}{k!(n-k)!}(2x)^{n-k}t^k\\
&=\sum_{n=0}^\infty\sum_{k=0}^n(-1)^k\frac{(2n)!}{2^{2n}n!k!(n-k)!}(2x)^{n-k}t^{k+n}.\ \ \ \ \ \mbox{(6)}
\end{align*}
By rearranging the order of summation, (6) can be written as
$$(1-2xt+t^2)^{-1/2}=\sum_{n=0}^\infty\sum_{k=0}^{[n/2]}(-1)^k\frac{(2n-2k)!}{2^{2n-2k}k!(n-k)!(n-2k)!}(2x)^{n-2k}t^n,$$ where
$$\left[\frac{n}{2}\right]=\left\{\begin{array}{ccc}
\frac{n}{2} & \mbox{for} & n=\mbox{even}\\
\frac{n-1}{2} & \mbox{for} & n=\mbox{odd}.
\end{array}\right.$$

Hence,
$$P_n(x)=\sum_{k=0}^{[n/2]}(-1)^k\frac{(2n-2k)!}{2^{2n-2k}k!(n-k)!(n-2k)!}(2x)^{n-2k}.\ \ \ \ \ \mbox{(7)}$$
In practice, we hardly use the formula (7). Again, we use the recurrence relation of Legendre polynomials instead.

Electric Dipole: The generating function (4) can be used for the electric multipole potential. Here we consider an electric dipole. Let us place electric charges $q$ and $-q$ at $z=a$ and $z=-a$, respectively.

Electric Dipole Potential

The electric dipole potential is given by
$$\varphi=\frac{q}{4\pi\epsilon_0}\left(\frac{1}{r_1}-\frac{1}{r_2}\right).\ \ \ \ \ \mbox{(8)}$$
$r_2$ is written in terms of $r$ and $\theta$ using the Law of Cosines as
\begin{align*}
r_2^2&=r^2+a^2-2ar\cos(\pi-\theta)\\
&=r^2+a^2+2ar\cos\theta.
\end{align*}
So by the generating function (4), the electric dipole potential (8) can be written as
\begin{align*}
\varphi&=\frac{q}{4\pi\epsilon_0 r}\left\{\left[1-2\left(\frac{a}{r}\right)\cos\theta+\left(\frac{a}{r}\right)^2\right]^{-\frac{1}{2}}-\left[1+2\left(\frac{a}{r}\right)\cos\theta+\left(\frac{a}{r}\right)^2\right]^{-\frac{1}{2}}\right\}\\
&=\frac{q}{4\pi\epsilon_0 r}\left[\sum_{n=0}^\infty P_n(\cos\theta)\left(\frac{a}{r}\right)^n-\sum_{n=0}^\infty P_n(\cos\theta)(-1)^n\left(\frac{a}{r}\right)^n\right]\\
&=\frac{2q}{4\pi\epsilon_0 r}\left[P_1(\cos\theta)\left(\frac{a}{r}\right)+P_3(\cos\theta)\left(\frac{a}{r}\right)^3+\cdots\right]
\end{align*}
for $r>a$.

For $r\gg a$,
$$\varphi\approx\frac{2aq}{4\pi\epsilon_0}\frac{P_1(\cos\theta)}{r^2}=\frac{2aq}{4\pi\epsilon_0}\frac{\cos\theta}{r^2}.$$
This is the usual electric dipole potential. The quantity $2aq$ is called the dipole moment in electromagnetism.

Spherical Bessel Functions

When the Helmholtz equation is separated in spherical coordinates the radial equation has the form
$$r^2\frac{d^2R}{dr^2}+2r\frac{dR}{dr}+[k^2r^2-n(n+1)]R=0.\ \ \ \ \ \mbox{(1)}$$
The equation (1) looks similar to Bessel’s equation.  If we use the transformation $R(kr)=\frac{Z(kr)}{(kr)^{1/2}}$, (1) turns into Bessel’s equation
$$r^2\frac{d^2Z}{dr^2}+r\frac{dZ}{dr}+\left[k^2r^2-\left(n+\frac{1}{2}\right)^2\right]Z=0.\ \ \ \ \ \mbox{(2)}$$
Hence $Z(kr)=J_{n+\frac{1}{2}}(kr)$, a Bessel function of order $n+\frac{1}{2}$, where $n$ is an integer.

Spherical Bessel Functions: Spherical Bessel functions of the first kind and the second kind are defined by
\begin{align*}
j_n(x)&:=\sqrt{\frac{\pi}{2x}}J_{n+\frac{1}{2}}(x),\\
n_n(x)&:=\sqrt{\frac{\pi}{2x}}N_{n+\frac{1}{2}}(x)=(-1)^{n+1}\sqrt{\frac{\pi}{2x}}J_{-n-\frac{1}{2}}(x).
\end{align*}
Spherical Bessel functions $j_n(kr)$ and $n_n(kr)$ are two linearly independent solutions of the equation (1).

One can obtain power series representations of $j_n(x)$ and $n_n(x)$ using Legendre Duplication Formula

$$z!\left(z+\frac{1}{2}\right)!=2^{-2z-1}\pi^{1/2}(2z+1)!$$ from $$J_{n+\frac{1}{2}}(x)=\sum_{s=0}^\infty\frac{(-1)^s}{s!\left(s+n+\frac{1}{2}\right)!}\left(\frac{x}{2}\right)^{2s+n+\frac{1}{2}}:$$
\begin{align*}
j_n(x)&=2^nx^n\sum_{s=0}^\infty\frac{(-1)^s(s+n)!}{s!(2s+2n+1)!}x^{2s},\\
n_n(x)&=(-1)^{n+1}\frac{2^n\pi^{1/2}}{x^{n+1}}\sum_{s=0}^\infty\frac{(-1)^s}{s!\left(s-n-\frac{1}{2}\right)!}\left(\frac{x}{2}\right)^{2s}\\
&=\frac{(-1)^{n+1}}{2^nx^{n+1}}\sum_{s=0}^\infty\frac{(-1)^s(s-n)!}{s!(2s-2n)!}x^{2s}.
\end{align*}
From these power series representations, we obtain
\begin{align*}
j_0(x)&=\frac{\sin x}{x}\left(=\sum_{s=0}^\infty\frac{(-1)^s}{(2s+1)!}x^{2s}\right)\\
n_0(x)&=-\frac{\cos x}{x}\\
j_1(x)&=\frac{\sin x}{x^2}-\frac{\cos x}{x}\\
n_1(x)&=-\frac{\cos x}{x^2}-\frac{\sin x}{x}.
\end{align*}
Orthogonality: Recall the orthogonality of Bessel functions
$$\int_0^aJ_\nu\left(\frac{\alpha_{\nu p}}{a}\rho\right)J_\nu\left(\frac{\alpha_{\nu q}}{a}\rho\right)\rho d\rho=\frac{a^2}{2}[J_{\nu+1}(\alpha_{\nu p})]^2\delta_{pq}$$ as discussed here. By a substitution, we obtain the orthogonality of spherical Bessel functions
$$\int_0^aj_n\left(\frac{\alpha_{np}}{a}\rho\right)j_n\left(\frac{\alpha_{nq}}{a}\rho\right)\rho^2 d\rho=\frac{a^3}{2}[j_{n+1}(\alpha_{np})]^2\delta_{pq},$$ where $\alpha_{np}$ and $\alpha_{nq}$ are roots of $j_n$.

Example: [Particle in a Sphere]

Let us consider a particle inside a sphere with radius $a$. The wave function that describes the state of the particle satisfies Schrödinger equation
$$-\frac{\hbar^2}{2m}\nabla^2\psi=E\psi\ \ \ \ \ \mbox{(3)}$$
with boundary conditions:
\begin{align*}
&\psi(r\leq a)\ \mbox{is finite},\\
&\psi(a)=0.
\end{align*}
This corresponds to the potential $V=0$ for $r\leq a$ and $V=\infty$ for $r>a$. Here $m$ is the mass of the particle and $\hbar=\frac{h}{2\pi}$ is the reduced Planck constant (also called the Dirac constant).
Note that (3) is the Helmholtz equation $\nabla^2\psi+k^2\psi=0$ with $k^2=\frac{2mE}{\hbar^2}$, whose radial part satisfies
$$\frac{d^2R}{dr^2}+\frac{2}{r}\frac{dR}{dr}+\left[k^2-\frac{n(n+1)}{r^2}\right]R=0.$$ Now we determine the minimum energy (zero-point energy) $E_{\mbox{min}}$. Since any angular dependence would increase the energy, we take $n=0$. The solution $R$ is given by
$$R(kr)=Aj_0(kr)+Bn_0(kr).$$ Since $n_0(kr)$ diverges at the origin, $B=0$. From the boundary condition $\psi(a)=0$, $R(ka)=0$, i.e. $j_0(ka)=0$. Thus $ka=\frac{\sqrt{2mE}}{\hbar}a=\alpha$ must be a root of $j_0(x)$. The smallest such $\alpha$ is the first zero of $j_0(x)$, namely $\alpha=\pi$. Therefore,
\begin{align*}
E_{\mbox{min}}&=\frac{\hbar^2\alpha^2}{2ma^2}\\
&=\frac{\hbar^2\pi^2}{2ma^2}\\
&=\frac{h^2}{8ma^2},
\end{align*}
where $h$ is the Planck constant. This means that for any finite sphere, the particle will have a positive minimum energy (or zero-point energy).
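Numerically, the first zero of $j_0$ is easy to find in Maxima, confirming $\alpha=\pi$:

(%i1) find_root(sin(x)/x, x, 2, 4);
(%o1) 3.141592653589793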

Neumann Functions, Bessel Function of the Second Kind $N_\nu(x)$

In an earlier post (here), we considered Bessel functions $J_\nu(x)$ only for integer $\nu$. Note that $J_\nu$ and $J_{-\nu}$ are linearly independent if $\nu$ is not an integer. If $\nu$ is an integer, $J_\nu$ and $J_{-\nu}$ satisfy the relation $J_{-\nu}=(-1)^\nu J_\nu$, i.e. they are no longer linearly independent. Thus we need a second solution of Bessel's equation.

Let us define
$$N_\nu(x):=\frac{\cos\nu\pi J_\nu(x)-J_{-\nu}(x)}{\sin\nu\pi}.$$
$N_\nu(x)$ is called the Neumann function or the Bessel function of the second kind. For integer $\nu$, $N_\nu(x)$ is an indeterminate form of type $\frac{0}{0}$. So by l'Hôpital's rule,
\begin{align*}
N_n(x)&=\lim_{\nu\to n}\frac{\frac{\partial}{\partial\nu}[\cos\nu\pi J_\nu(x)-J_{-\nu}(x)]}{\frac{\partial}{\partial\nu}\sin\nu\pi}\\
&=\frac{1}{\pi}\left[\frac{\partial J_\nu(x)}{\partial\nu}-(-1)^n\frac{\partial J_{-\nu}(x)}{\partial\nu}\right]_{\nu=n}.
\end{align*}
Neumann function can be also written as a power series:
\begin{align*}
N_n(x)=&\frac{2}{\pi}\left[\ln\left(\frac{x}{2}\right)+\gamma-\frac{1}{2}\sum_{p=1}^n\frac{1}{p}\right]J_n(x)\\
&-\frac{1}{\pi}\sum_{r=0}^\infty(-1)^r\frac{\left(\frac{x}{2}\right)^{n+2r}}{r!(n+r)!}\sum_{p=1}^r\left[\frac{1}{p}+\frac{1}{p+n}\right]\\
&-\frac{1}{\pi}\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}\left(\frac{x}{2}\right)^{-n+2r},
\end{align*}
where $\gamma$ is the Euler-Mascheroni constant.

In Maxima, the Neumann function is denoted by bessel_y(n,x). Let us plot $N_0(x)$, $N_1(x)$, $N_2(x)$ together using the command

plot2d([bessel_y(0,x),bessel_y(1,x),bessel_y(2,x)],[x,0.5,10]);

Neumann Functions

We now show that Neumann functions do satisfy Bessel’s differential equation. Differentiate Bessel’s equation
$$x^2\frac{d^2}{dx^2}J_{\pm\nu}(x)+x\frac{d}{dx}J_{\pm\nu}(x)+(x^2-\nu^2)J_{\pm\nu}(x)=0$$
with respect to $\nu$. (Here of course we are assuming that $\nu$ is a continuous variable.) Then we obtain the following equations:
\begin{align*}
x^2\frac{d^2}{dx^2}\frac{\partial J_\nu(x)}{\partial\nu}+x\frac{d}{dx}\frac{\partial J_\nu(x)}{\partial\nu}+(x^2-\nu^2)\frac{\partial J_\nu(x)}{\partial\nu}=2\nu J_\nu(x)\ \ \ \ \ \mbox{(1)}\\
x^2\frac{d^2}{dx^2}\frac{\partial J_{-\nu}(x)}{\partial\nu}+x\frac{d}{dx}\frac{\partial J_{-\nu}(x)}{\partial\nu}+(x^2-\nu^2)\frac{\partial J_{-\nu}(x)}{\partial\nu}=2\nu J_{-\nu}(x)\ \ \ \ \ \mbox{(2)}
\end{align*}
Subtract $\frac{1}{\pi}(-1)^n$ times (2) from $\frac{1}{\pi}$ times (1) and then take the limit of the resulting equation as $\nu\to n$. Since $J_{-n}=(-1)^nJ_n$, the right-hand side vanishes in the limit, and we see that $N_n$ satisfies Bessel's differential equation
$$x^2\frac{d^2}{dx^2}N_n(x)+x\frac{d}{dx}N_n(x)+(x^2-n^2)N_n(x)=0.$$
The general solution of Bessel’s differential equation is given by
$$y_n(x)=AJ_n(x)+BN_n(x).$$