
Parallel Transport, Holonomy, and Curvature

Let $\gamma: [0,1]\longrightarrow M$ be a path. Using a connection $\nabla$, one can make sense of moving a vector in $L_{\gamma(0)}$ to $L_{\gamma(1)}$ without changing it. This is called parallel transport of a vector from $L_{\gamma(0)}$ to $L_{\gamma(1)}$. The change is measured relative to $\nabla$, so if $\xi(t)\in L_{\gamma(t)}$ moves without changing, it must satisfy the differential equation
$$\nabla_{\dot{\gamma}(t)}\xi=0,$$
where $\dot{\gamma}(t)$ is the tangent vector field to the curve $\gamma(t)$. The image of $\gamma$ is covered by open sets $U_\alpha$ on which $L$ has nowhere vanishing sections $s_\alpha$. Since $\gamma([0,1])$ is compact, it is covered by finitely many such open sets. Let $U_\alpha$ be one of them and assume that it contains $\gamma(0)$. Then $\xi|_{U_\alpha}(t)=\xi_\alpha(\gamma(t))s_\alpha(\gamma(t))$ where $\xi_\alpha: U_\alpha\longrightarrow\mathbb{C}$.
\begin{align*}
\nabla_{\dot{\gamma}(t)}\xi&=d\xi_\alpha(\dot{\gamma}(t))s_\alpha(\gamma(t))+A_\alpha(\dot{\gamma}(t))\xi_\alpha(\gamma(t))s_\alpha(\gamma(t))\\
&=\left(\frac{d\xi_\alpha}{dt}(\gamma(t))+A_\alpha(\dot{\gamma}(t))\xi_\alpha(\gamma(t))\right)s_\alpha(\gamma(t)).
\end{align*}
$\nabla_{\dot{\gamma}(t)}\xi=0$ implies that
$$\frac{d\xi_\alpha}{dt}=-A_\alpha(\dot{\gamma}(t))\xi_\alpha.$$
The solution of this equation is given by
$$\xi_\alpha(t)=\xi_\alpha(\gamma(0))\exp\left(-\int_0^tA_\alpha(\dot{\gamma}(u))du\right).$$
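The closed-form solution can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a hypothetical pulled-back potential $a(t)=A_\alpha(\dot{\gamma}(t))=i(1+2t)$, integrates $\frac{d\xi_\alpha}{dt}=-a(t)\xi_\alpha$ with a classical Runge-Kutta scheme, and compares the result with the exponential formula. The function names are illustrative only.

```python
import cmath

def transport_coefficient(a, xi0, t_end, steps=1000):
    """Integrate d(xi)/dt = -a(t) * xi from 0 to t_end with classical RK4."""
    h = t_end / steps
    xi = complex(xi0)
    t = 0.0
    for _ in range(steps):
        k1 = -a(t) * xi
        k2 = -a(t + h / 2) * (xi + h / 2 * k1)
        k3 = -a(t + h / 2) * (xi + h / 2 * k2)
        k4 = -a(t + h) * (xi + h * k3)
        xi += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return xi

def closed_form(a_integral, xi0, t_end):
    """Evaluate xi0 * exp(-int_0^t a(u) du), given an antiderivative of a."""
    return xi0 * cmath.exp(-(a_integral(t_end) - a_integral(0.0)))

# Hypothetical potential along the path (an assumption for illustration).
a = lambda t: 1j * (1 + 2 * t)
a_integral = lambda t: 1j * (t + t * t)  # antiderivative of a

numeric = transport_coefficient(a, 1.0, 1.0)
exact = closed_form(a_integral, 1.0, 1.0)
print(abs(numeric - exact))  # small discretization error
```

Because the assumed potential is purely imaginary, the transported coefficient keeps modulus 1: parallel transport here only changes the phase.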
The standard existence and uniqueness theorem for linear ordinary differential equations tells us that parallel transport defines an isomorphism $L_{\gamma(0)}\cong L_{\gamma(t)}$ for any $\gamma(t)\in U_\alpha$. Suppose that the path $\gamma$ is covered by finitely many open sets $U_\alpha$, $U_{\alpha_1}$, $U_{\alpha_2}$, $\cdots$, $U_{\alpha_n}$, and choose $0<t_1<\cdots<t_n<1$ such that each $\gamma(t_k)$ lies in the overlap of two consecutive open sets, as shown in the following figure.

As discussed, we know that $L_{\gamma(0)}\cong L_{\gamma(t_1)}$. Using $\xi_{\alpha_1}(\gamma(t_1))=\xi_\alpha(\gamma(t_1))$ as the initial condition, we also find $\xi|_{U_{\alpha_1}}(t)=\xi_{\alpha_1}(\gamma(t))s_{\alpha_1}(\gamma(t))$ where
$$\xi_{\alpha_1}(t)=\xi_{\alpha_1}(\gamma(t_1))\exp\left(-\int_{t_1}^tA_{\alpha_1}(\dot{\gamma}(u))du\right).$$
This implies that  $L_{\gamma(t_1)}\cong L_{\gamma(t_2)}$. Continuing this process, we obtain $L_{\gamma(t_2)}\cong L_{\gamma(t_3)}$, $\cdots$, $L_{\gamma(t_n)}\cong L_{\gamma(1)}$. Since the relation $\cong$ is transitive, we have
$$L_{\gamma(0)}\stackrel{P_\gamma}{\cong}L_{\gamma(1)}.$$
In general, $P_\gamma$ depends on $\gamma$ and $\nabla$. Now we are particularly interested in the case when $\gamma:[0,1]\longrightarrow M$ is a loop i.e. $\gamma(0)=\gamma(1)$. Then we can define the holonomy $\mathrm{hol}(\nabla,\gamma)$ of the connection $\nabla$ along the loop $\gamma$ by
$$P_\gamma(s)=\mathrm{hol}(\nabla,\gamma)s$$
for any vector $s\in L_{\gamma(0)}$. So, what does the holonomy really mean? In Euclidean space (the world we are familiar with), we can move a vector without changing its direction and magnitude by parallel translation. That is, in Euclidean space parallel translation is parallel transport, so we do not distinguish vectors that have the same direction and magnitude. In a curved manifold, there is no such parallel translation, and parallel transport is considered relative to the connection $\nabla$ as discussed above. Those who live in a manifold with connection $\nabla$ will not notice any change while a vector is parallel transported relative to $\nabla$ along a loop, because locally the vector never changes. However, from our perspective (that of someone living in the ambient Euclidean space), the initial vector and the one that comes back to the initial point after parallel transport may differ. The holonomy measures this difference.

Since $\gamma$ is a loop, both $\gamma(0)$ and $\gamma(1)$ belong to the same open set, say $U_\alpha$. For simplicity, assume that the whole loop lies in $U_\alpha$, and write $s=s_\alpha$.
\begin{align*}
P_\gamma(\xi(0))&=\xi(1)\\
&=\xi_\alpha({\gamma(0)})\exp\left(-\oint_{\gamma}A_\alpha(\dot{\gamma}(u))du\right)s(\gamma(1))\\
&=\xi_\alpha({\gamma(0)})\exp\left(-\oint_{\gamma}A_\alpha(\dot{\gamma}(u))du\right)s(\gamma(0)).
\end{align*}
On the other hand, $\xi(0)=\xi_\alpha(\gamma(0))s(\gamma(0))$. So
\begin{align*}
P_\gamma(\xi(0))&=P_\gamma(\xi_\alpha(\gamma(0))s(\gamma(0)))\\
&=\xi_\alpha(\gamma(0))P_\gamma(s(\gamma(0))).
\end{align*}
Hence, we see that
$$P_\gamma(s(\gamma(0)))=\exp\left(-\oint_{\gamma}A_\alpha(\dot{\gamma}(u))du\right)s(\gamma(0))$$
and that the holonomy is given by
$$\mathrm{hol}(\nabla,\gamma)=\exp\left(-\oint_\gamma A_\alpha\right).$$
If $\gamma$ is the boundary of a disk, then by Stokes’ theorem we have
\begin{align*}
\mathrm{hol}(\nabla,\gamma)&=\exp\left(-\int_DdA_\alpha\right)\\
&=\exp\left(-\int_D F\right)\ \ \ \ \ \ \ (1)
\end{align*}
where $D$ is the interior of the disk.
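Formula (1) can be illustrated on a flat example. The sketch below is an assumption-laden illustration, not taken from the notes: it uses the hypothetical potential $A=\frac{iB}{2}(x\,dy-y\,dx)$ on $\mathbb{R}^2$ with a constant $B$, so that $F=dA=iB\,dx\wedge dy$, and compares $\exp\left(-\oint_\gamma A\right)$ computed by numerical quadrature around a circle with $\exp\left(-\int_DF\right)$.

```python
import cmath
import math

B = 0.7  # hypothetical constant field strength (assumption)

def potential(x, y):
    """Components (A_x, A_y) of the assumed A = (iB/2)(x dy - y dx)."""
    return (-0.5j * B * y, 0.5j * B * x)

def loop_integral(radius, steps=2000):
    """Midpoint-rule approximation of the line integral of A around a circle."""
    total = 0j
    dt = 2 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * dt, radius * math.cos(t) * dt
        ax, ay = potential(x, y)
        total += ax * dx + ay * dy
    return total

def holonomy_from_loop(radius):
    """exp(-loop integral of A), the left-hand side of (1)."""
    return cmath.exp(-loop_integral(radius))

def holonomy_from_curvature(radius):
    """exp(-integral of F over the disk); here F = iB dx^dy."""
    return cmath.exp(-1j * B * math.pi * radius ** 2)

print(abs(holonomy_from_loop(1.5) - holonomy_from_curvature(1.5)))
```

The two numbers agree up to rounding, which is Stokes' theorem applied to this particular potential.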

Proposition. If $L\stackrel{\pi}{\longrightarrow}M$ is a line bundle with connection $\nabla$ and $\Sigma$ is a compact oriented two-dimensional submanifold of $M$ with boundary loop $\gamma=\partial\Sigma$, then
$$\mathrm{hol}(\nabla,\gamma)=\exp\left(-\int_\Sigma F\right).\ \ \ \ \ \ \ (2)$$

Proof. By compactness, we can triangulate $\Sigma$ so that each of the triangles is in some $U_\alpha$. Then we apply (1) to each triangle and the holonomy up and down the interior edges cancels to give the required result.

Remark. Clearly holonomy is a gauge invariant quantity. In gauge theory, (2) is called a Wilson line or a Wilson loop. It is important to note that the gauge connection may be constructed from the collection of Wilson loops up to gauge transformation.

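Example. [Parallel Transport on the 2-Sphere] In this example, we calculate the holonomy of the standard connection on $TS^2$. The loop we consider runs from the north pole down a meridian to the equator, along the equator through an angle $\theta$, and back up another meridian to the north pole. Before we proceed, let us take a look at the figure below.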

It clearly shows that the holonomy is $e^{i\theta}$ since the discrepancy between the initial vector and the parallel transported vector along the loop is given by a rotation by angle $\theta$. Recall that
\begin{align*}
\frac{\partial}{\partial\theta}&=(-\sin\theta\sin\phi,\cos\theta\sin\phi,0),\\
\frac{\partial}{\partial\phi}&=(\cos\theta\cos\phi,\sin\theta\cos\phi,-\sin\phi).
\end{align*}
The unit normal vector field $\hat n$ is computed to be
\begin{align*}
\hat n&=(\cos\theta\sin\phi,\sin\theta\sin\phi,\cos\phi)\\
&=\frac{1}{\sin\phi}\frac{\partial}{\partial\phi}\times\frac{\partial}{\partial\theta}.
\end{align*}
Consider a nowhere vanishing section
$$s=(-\sin\theta,\cos\theta,0).$$
Then
$$ds=(-\cos\theta,-\sin\theta,0)d\theta.$$
\begin{align*}
\nabla s&=\pi(ds)\\
&=ds-\langle ds,\hat n\rangle\hat n\\
&=(-\cos\theta\cos^2\phi,-\sin\theta\cos^2\phi,\sin\phi\cos\phi)d\theta\\
&=\cos\phi\hat n\times s d\theta\\
&=i\cos\phi sd\theta.
\end{align*}
The last expression is obtained by the definition of scalar multiplication
$$(\alpha+i\beta)v=\alpha v+\beta\cdot u\times v$$
for $\alpha,\beta\in\mathbb{R}$, $u\in S^2$ and $v\in T_uS^2$ (here $u=\hat n$), as defined in the post on line bundles. So the connection 1-form is
$$A=i\cos\phi d\theta$$
and the curvature is
\begin{align*}
F&=dA\\
&=-i\sin\phi d\phi\wedge d\theta.
\end{align*}
Note that $\sin\phi d\phi\wedge d\theta$ is the area form on $S^2$, so
$$F=-i\mathrm{area}.$$
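The identity $\nabla s=\cos\phi\,\hat n\times s\,d\theta$ used above can be double-checked numerically. The following sketch (helper names are illustrative, not from the notes) differentiates $s$ in $\theta$ by a central difference, removes the normal component, and compares the result with $\cos\phi\,\hat n\times s$.

```python
import math

def normal(phi, theta):
    """Unit normal (equal to the position vector) on the unit sphere."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def section(theta):
    """The section s = (-sin(theta), cos(theta), 0) from the text."""
    return (-math.sin(theta), math.cos(theta), 0.0)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def projected_derivative(phi, theta, h=1e-6):
    """Tangential part of ds/dtheta: the coefficient of dtheta in nabla s."""
    plus, minus = section(theta + h), section(theta - h)
    ds = [(a - b) / (2 * h) for a, b in zip(plus, minus)]
    n = normal(phi, theta)
    c = sum(a * b for a, b in zip(ds, n))
    return tuple(a - c * b for a, b in zip(ds, n))

def claimed_value(phi, theta):
    """cos(phi) * (n x s), the closed form derived in the text."""
    n = normal(phi, theta)
    return tuple(math.cos(phi) * c for c in cross(n, section(theta)))

gap = max(abs(a - b) for a, b in
          zip(projected_derivative(0.8, 1.3), claimed_value(0.8, 1.3)))
print(gap)  # close to zero
```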
The area of the region $D$ bounded by the loop is
$$\int_0^\theta\int_0^{\frac{\pi}{2}}\sin\phi\, d\phi\, d\theta'=\theta.$$
Therefore, by (1), the holonomy is
$$\mathrm{hol}(\nabla,\gamma)=\exp\left(-\int_DF\right)=\exp\left(i\int_D\mathrm{area}\right)=e^{i\theta}$$
as we already know.
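The same answer can be reproduced by transporting a vector around the loop numerically. This is a rough sketch under stated assumptions: the loop is the one bounding the region integrated above (north pole, down the meridian $\theta=0$, along the equator, back up the meridian at angle `theta0`), and parallel transport is approximated by repeatedly projecting the vector onto the next tangent plane and rescaling it. The function names are hypothetical.

```python
import math

def sphere_point(phi, theta):
    """Point of the unit sphere with polar angle phi and azimuth theta."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def project_and_normalize(v, n):
    """Project v onto the tangent plane with unit normal n, rescale to length 1."""
    c = sum(a * b for a, b in zip(v, n))
    w = [a - c * b for a, b in zip(v, n)]
    length = math.sqrt(sum(a * a for a in w))
    return tuple(a / length for a in w)

def transport_around_loop(theta0, steps=1000):
    """Transport (1, 0, 0) from the north pole around the three-sided loop."""
    quarter = math.pi / 2
    path = [sphere_point(quarter * k / steps, 0.0) for k in range(1, steps + 1)]
    path += [sphere_point(quarter, theta0 * k / steps) for k in range(1, steps + 1)]
    path += [sphere_point(quarter * (1 - k / steps), theta0) for k in range(1, steps + 1)]
    v = (1.0, 0.0, 0.0)
    for n in path:  # on the unit sphere each point is its own unit normal
        v = project_and_normalize(v, n)
    return v

def holonomy_angle(theta0):
    """Angle by which the returned vector is rotated about the z-axis."""
    v = transport_around_loop(theta0)
    return math.atan2(v[1], v[0])

print(holonomy_angle(1.0))  # approximately 1.0, the enclosed area
```

Each side of this loop is a great-circle arc, and along a great circle the projection step moves the chosen vector exactly as parallel transport does, so the rotation angle matches the enclosed area $\theta$ up to rounding.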

 

References:

[1] M. Murray, Notes on Line Bundles

Sections of a Line Bundle II: Gauge Potential, Gauge Transformation, and Field Strength

A connection on a line bundle can be defined in much the same way as a connection on a manifold, discussed in Sections of a Line Bundle I, since sections are like vector fields. Let $L\longrightarrow M$ be a line bundle. A connection $\nabla$ is a bilinear map which sends a pair $(X,s)$ of a tangent vector field $X$ on $M$ and a section $s: M\longrightarrow L$ to a section $\nabla_Xs$ such that
\begin{align*}
\nabla_{fX+gY}s&=f\nabla_Xs+g\nabla_Ys\ (\mbox{linearity})\\
\nabla_Xfs&=df(X)s+f\nabla_Xs\ (\mbox{Leibniz rule})
\end{align*}
where $X,Y\in\mathfrak{X}(M)$, $f,g\in C^\infty (M)$ and $s:M\longrightarrow L$ is a section. Denote by $\Gamma(M,L)$ the set of sections $M\longrightarrow L$. If we do not specify the tangent vector field on which $\nabla$ acts, the Leibniz rule can be written as
$$\nabla fs=df\otimes s+f\nabla s$$
where the tensor product $\otimes$ is evaluated as
$$df\otimes s(X(m),m)=df(X(m))s(m)$$
for $m\in M$.

Example. Trivial bundle $L=M\times\mathbb{C}$.

Let $\nabla$ be a general connection. Let $s$ be a nowhere vanishing section. Define a 1-form $A'$ on $M$ by $\nabla s=A'\otimes s$. If $\xi\in\Gamma(M,L)$ then $\xi=fs$ for some $f:M\longrightarrow\mathbb{C}$. By the Leibniz rule,
\begin{align*}
\nabla\xi&=df\otimes s+f\nabla s\\
&=(df+fA')\otimes s.
\end{align*}
Recall that every section of a trivial bundle looks like $s(x)=(x,g(x))$ for some function $g: M\longrightarrow\mathbb{C}$. By identifying sections with functions, the ordinary differentiation $d$ of functions defines a connection. More specifically, writing $1$ for the constant section $x\longmapsto(x,1)$, so that $s=g\cdot 1$,
$$ds:=dg\otimes 1=g^{-1}dg\otimes s,$$
where $g^{-1}$ makes sense because $s$ is nowhere vanishing. Now,
\begin{align*}
\nabla s-ds&=A'\otimes s-g^{-1}dg\otimes s\\
&=(A'-g^{-1}dg)\otimes s.
\end{align*}
Let $A:=A'-g^{-1}dg$. Then $A$ is a 1-form on $M$ and
$$\nabla s=ds+A\otimes s.$$
Hence, all connections on $L$ are of the form
$$\nabla=d+A$$
where $A$ is a 1-form on $M$.
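As a small sanity check, one can verify numerically that $\nabla=d+A$ satisfies the Leibniz rule. The sketch below is only an illustration under assumptions: the base is $M=\mathbb{R}$, sections of the trivial bundle are identified with complex-valued functions, $A=a(x)dx$ with a hypothetical coefficient $a(x)=ix$, and derivatives are approximated by central differences.

```python
import cmath

def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def covariant_derivative(section, a, x):
    """(d + A) applied to a section of the trivial bundle over R, A = a(x) dx."""
    return derivative(section, x) + a(x) * section(x)

# Hypothetical data chosen only for illustration.
a = lambda x: 1j * x                      # connection coefficient
section = lambda x: cmath.exp(1j * x)     # a section, viewed as a function
scalar = lambda x: x * x + 1.0            # a smooth function f

def leibniz_gap(x):
    """|nabla(f s) - (df s + f nabla s)| at the point x."""
    lhs = covariant_derivative(lambda y: scalar(y) * section(y), a, x)
    rhs = derivative(scalar, x) * section(x) + scalar(x) * covariant_derivative(section, a, x)
    return abs(lhs - rhs)

print(leibniz_gap(0.3))  # close to zero
```

The gap is at the level of the finite-difference error, as expected: the term $A\otimes s$ is linear over functions, so it does not disturb the Leibniz rule satisfied by $d$.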

Let $L\stackrel{\pi}{\longrightarrow}M$ be a line bundle and $s_\alpha: U_\alpha\longrightarrow L$ local nowhere vanishing sections. Define a one-form $A_\alpha$ on $U_\alpha$ by $\nabla s_\alpha=A_\alpha\otimes s_\alpha$. $A_\alpha$ is called a connection one-form (in differential geometry) or a gauge potential (in physics) on $U_\alpha$. If $\xi\in\Gamma(M,L)$ then $\xi|_{U_\alpha}=\xi_\alpha s_\alpha$ where $\xi_\alpha : U_\alpha\longrightarrow\mathbb{C}$. By the Leibniz rule,
\begin{align*}
\nabla\xi|_{U_\alpha}&=d\xi_\alpha\otimes s_\alpha+\xi_\alpha\nabla s_\alpha\\
&=( d\xi_\alpha+A_\alpha\xi_\alpha)\otimes s_\alpha.
\end{align*}
Since each fibre $L_m$ is a one-dimensional complex vector space, the transition map would be $g_{\alpha\beta}: U_\alpha\cap U_\beta\longrightarrow\mathrm{GL}(1,\mathbb{C})\cong\mathbb{C}^\times$, where $\mathbb{C}^\times$ is the multiplicative group of non-zero complex numbers. The transition maps satisfy
\begin{equation}
\label{eq:transition}
s_\alpha=g_{\alpha\beta}s_\beta. \ \ \ \ \ (1)
\end{equation}
The collection of functions $\xi_\alpha$ defines a section $\xi$ if on any intersection $U_\alpha\cap U_\beta\ne\emptyset$, $\xi_\beta=g_{\alpha\beta}\xi_\alpha$ (so that $\xi_\alpha s_\alpha=\xi_\beta s_\beta$ by (1)). The transition map $g_{\alpha\beta}$ gives rise to the change of coordinates. Since $s_\alpha$ and $s_\beta$ are related by (1) on $U_\alpha\cap U_\beta\ne\emptyset$,
$$\nabla s_\alpha=(dg_{\alpha\beta})\otimes s_\beta+g_{\alpha\beta}\nabla s_\beta.$$
Since $\nabla s_\alpha=A_\alpha\otimes s_\alpha$,
\begin{align*}
A_\alpha\otimes s_\alpha&=(dg_{\alpha\beta})\otimes s_\beta+g_{\alpha\beta}\nabla s_\beta\\
&=(dg_{\alpha\beta})\otimes s_\beta+g_{\alpha\beta}A_\beta\otimes s_\beta\\
&=(dg_{\alpha\beta}+g_{\alpha\beta}A_\beta)\otimes s_\beta.
\end{align*}
So, we obtain
\begin{equation}
\label{eq:gauge}
A_\alpha=g^{-1}_{\alpha\beta}dg_{\alpha\beta}+A_\beta.\ \ \ \ \ \ (2)
\end{equation}
In physics, this is the gauge transformation for electromagnetism. The converse is also true, namely if $\{A_\alpha\}$ is a collection of 1-forms satisfying (2) on $U_\alpha\cap U_\beta$, then there exists a connection $\nabla$ such that $\nabla s_\alpha=A_\alpha\otimes s_\alpha$. First define $\nabla s_\alpha=A_\alpha\otimes s_\alpha$ for each nowhere vanishing  section $s_\alpha: U_\alpha\longrightarrow L$. On $U_\alpha\cap U_\beta\ne\emptyset$, by (1)
\begin{align*}
\nabla s_\alpha&=\nabla(g_{\alpha\beta}s_\beta)\\
&=dg_{\alpha\beta}\otimes s_\beta+g_{\alpha\beta}\nabla s_\beta.
\end{align*}
This must coincide with $A_\alpha\otimes s_\alpha$. By the gauge transformation (2)
\begin{align*}
A_\alpha\otimes s_\alpha&=g^{-1}_{\alpha\beta}dg_{\alpha\beta}\otimes s_\alpha+A_\beta\otimes s_\alpha\\
&=dg_{\alpha\beta}\otimes(g^{-1}_{\alpha\beta}s_\alpha)+A_\beta\otimes(g_{\alpha\beta}s_\beta)\\
&=dg_{\alpha\beta}\otimes s_\beta+g_{\alpha\beta}A_\beta\otimes s_\beta\\
&=\nabla s_\alpha.
\end{align*}
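For $\xi\in\Gamma(M,L)$ with $\xi|_{U_\alpha}=\xi_\alpha s_\alpha$, $\nabla$ is extended by the Leibniz rule: $\nabla\xi|_{U_\alpha}=(d\xi_\alpha+A_\alpha\xi_\alpha)\otimes s_\alpha$.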

The next discussion requires some knowledge of differential forms, the wedge product and the exterior derivative. If you are not familiar with these, please study them before you continue. One good source is Barrett O’Neill’s Elementary Differential Geometry [2].

Let $F_\alpha$ be the two-form
$$F_\alpha=dA_\alpha.$$
Physically $F_\alpha$ is the field strength relative to the section (field) $s_\alpha: U_\alpha\longrightarrow L$. Recall that on $U_\alpha\cap U_\beta\ne\emptyset$ the gauge potentials $A_\alpha$ and $A_\beta$ are related by the gauge transformation (2). If $F_\alpha$ and $F_\beta$ do not agree on $U_\alpha\cap U_\beta$, it would be a physically awkward situation. The following proposition tells us that it will not happen.

Proposition. If $s_\beta: U_\beta\longrightarrow L$ is another local section where $U_\alpha\cap U_\beta\ne\emptyset$, then $F_\alpha=F_\beta$.

Proof. \begin{align*}
F_\alpha&=dA_\alpha\\
&=d(g^{-1}_{\alpha\beta}dg_{\alpha\beta}+A_\beta)\\
&=dg^{-1}_{\alpha\beta}\wedge dg_{\alpha\beta}+g^{-1}_{\alpha\beta}d(dg_{\alpha\beta})+dA_\beta\\
&=-g^{-1}_{\alpha\beta}(dg_{\alpha\beta})g^{-1}_{\alpha\beta}\wedge dg_{\alpha\beta}+dA_\beta\\
&=dA_\beta=F_\beta.
\end{align*}
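From the third line to the fourth, $d(dg_{\alpha\beta})=d^2g_{\alpha\beta}=0$ and $dg^{-1}_{\alpha\beta}=-g^{-1}_{\alpha\beta}(dg_{\alpha\beta})g^{-1}_{\alpha\beta}$ (which is obtained by differentiating $g^{-1}_{\alpha\beta}g_{\alpha\beta}=1$) have been used. In the last step, since $g_{\alpha\beta}$ is $\mathbb{C}^\times$-valued, the remaining term is $-g^{-2}_{\alpha\beta}dg_{\alpha\beta}\wedge dg_{\alpha\beta}$, which vanishes because the wedge product of a 1-form with itself is zero.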

Physically speaking the proposition says that the field strength is invariant under gauge transformation. The two-forms agree on the intersection of two open sets in the cover and hence define a global two-form. It is denoted by $F$ and is also called the curvature of the connection $\nabla$ in differential geometry.
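Gauge invariance of the field strength can also be seen in a quick numerical experiment. The sketch below makes several assumptions that are not in the notes: the base is $\mathbb{R}^2$, $A_\beta=i(P\,dx+Q\,dy)$ for hypothetical functions $P,Q$, and the transition function is $g=e^{i\chi}$ for a hypothetical $\chi$, so that $g^{-1}dg=i\,d\chi$. The coefficient of $dx\wedge dy$ in $dA$ is approximated by central differences.

```python
import math

def partial(f, x, y, axis, h=1e-4):
    """Central-difference partial derivative of f along x (axis 0) or y (axis 1)."""
    if axis == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def curvature_coefficient(ax, ay, x, y):
    """Coefficient of dx^dy in dA for A = ax dx + ay dy."""
    return partial(ay, x, y, 0) - partial(ax, x, y, 1)

# Hypothetical potential A_beta = i(P dx + Q dy) with P = -xy, Q = x^2.
a_beta_x = lambda x, y: 1j * (-x * y)
a_beta_y = lambda x, y: 1j * (x * x)

# Hypothetical gauge function chi(x, y) = sin(x) * y, with g = exp(i chi).
dchi_x = lambda x, y: math.cos(x) * y
dchi_y = lambda x, y: math.sin(x)

# Gauge transformation (2): A_alpha = g^{-1} dg + A_beta = i dchi + A_beta.
a_alpha_x = lambda x, y: a_beta_x(x, y) + 1j * dchi_x(x, y)
a_alpha_y = lambda x, y: a_beta_y(x, y) + 1j * dchi_y(x, y)

f_alpha = curvature_coefficient(a_alpha_x, a_alpha_y, 0.4, -0.9)
f_beta = curvature_coefficient(a_beta_x, a_beta_y, 0.4, -0.9)
print(abs(f_alpha - f_beta))  # close to zero
```

For this choice both coefficients equal $3ix$, so the exact-form term $i\,d\chi$ contributes nothing to the curvature.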

Remark. In a principal $G$-bundle with a Lie group $G$, the transition map is given by $g_{\alpha\beta}:U_\alpha\cap U_\beta\longrightarrow G$ and the connection 1-forms (gauge potentials) $A_\alpha$ take values in $\mathfrak{g}$, the Lie algebra of $G$. The gauge transformation is given by
$$A_\alpha=g^{-1}_{\alpha\beta}dg_{\alpha\beta}+g^{-1}_{\alpha\beta}A_\beta g_{\alpha\beta}.$$
The curvature (field strength) is given locally by
$$F_\alpha=dA_\alpha+A_\alpha\wedge A_\alpha.$$
For a non-abelian $G$ it is no longer invariant but transforms by conjugation, $F_\alpha=g^{-1}_{\alpha\beta}F_\beta g_{\alpha\beta}$, which reduces to $F_\alpha=F_\beta$ when $G$ is abelian. Note that for each pair of tangent vector fields $(X,Y)$, $F_\alpha$ is evaluated as
$$F_\alpha(X,Y)=dA_\alpha(X,Y)+[A_\alpha(X),A_\alpha(Y)].$$

References:

[1] M. Murray, Notes on Line Bundles

[2] B. O’Neill, Elementary Differential Geometry, Academic Press, 1966

Sections of a Line Bundle I

A section of a line bundle is like a vector field. It is a map $s: M\longrightarrow L$ such that $s(m)\in L_m$, i.e. $\pi\circ s(m)=m$, for each $m\in M$. In particular, a section of a line bundle is one-to-one.

Example. For the trivial bundle $L=M\times\mathbb{C}$,  every section $s$ looks like $s(x)=(x,f(x))$ for some function $f$.

Example. For a tangent bundle $TM$, sections are vector fields.
\begin{align*}
s: M&\longrightarrow TM\\
x&\longmapsto v_x\in T_xM
\end{align*}
For the tangent bundle $TS^2$ (minitwistor space) over $S^2$, one can think of a section as a map $s: S^2\longrightarrow TS^2$ such that $\langle s(x),x\rangle=0$ for each $x\in S^2$.

Proposition. A line bundle $L$ is trivial if and only if it has a nowhere vanishing section.

Proof. Suppose that $L$ is trivial. Let $\varphi: L\longrightarrow M\times\mathbb{C}$ be the trivialization. Then $s: M\longrightarrow L$ defined by $s(m)=\varphi^{-1}(m,1)$ is a nowhere vanishing section. Conversely, if $s$ is a nowhere vanishing section, define a trivialization $M\times\mathbb{C}\longrightarrow L$ by $(m,\lambda)\longmapsto\lambda s(m)$. This is an isomorphism.

Physically, sections are fields, and if we cannot differentiate fields, we cannot do physics. Let $L\stackrel{\pi}{\longrightarrow}M$ be a line bundle and $s:M\longrightarrow L$ a section. Let $\gamma:(-\epsilon,\epsilon)\longrightarrow M$ be a path through $\gamma(0)=m$. The conventional definition of the derivative of $s$ would be
$$\lim_{t\to 0}\frac{s(\gamma(t))-s(\gamma(0))}{t}.$$
However, this definition makes no sense because $s(\gamma(t))\in L_{\gamma(t)}$ and $s(\gamma(0))\in L_{\gamma(0)}=L_m$ and we cannot perform the required subtraction
$$s(\gamma(t))-s(\gamma(0)).$$
So, we need to devise a way to differentiate sections of a line bundle. To get a clue, we need to review what we already know and maybe start from there since we cannot create something from nothing. At least we learned how to differentiate vector fields in Euclidean space, say $\mathbb{R}^3$. Let $X$ be a vector field in $\mathbb{R}^3$. The covariant derivative $\nabla_vX$ of $X$ in the direction of the tangent vector $v\in T_p\mathbb{R}^3$ is
\begin{align*}
\nabla_vX&=\left.\frac{d}{dt}X(p+tv)\right|_{t=0}\\
&=\lim_{t\to 0}\frac{X(p+tv)-X(p)}{t}.
\end{align*}
At this moment, one may say “Wait a minute! We have already examined the definition and know that it does not work for sections of a line bundle.” I know and please be patient. We haven’t got a clue yet and something useful may come out of this along the way. Since we cannot create the derivative of a section out of thin air, it is still important to review what we already know. We can naturally extend the above definition to the covariant derivative $\nabla_XY$ of a vector field $Y$ with respect to a vector field $X$. The covariant derivative $\nabla$ satisfies the following properties:

  1. $\nabla_{f X+gZ}Y=f\nabla_XY+g\nabla_ZY$;
  2. $\nabla_XfY=(Xf)Y+f\nabla_XY$ where $Xf$ denotes the directional derivative $Xf=\sum_{i=1}^n\alpha^i\frac{\partial f}{\partial x^i}$ of $f$ along $X=\sum_{i=1}^n\alpha^i\frac{\partial}{\partial x^i}$.

The first property is linearity and the second property is the Leibniz rule. These are the most basic rules that you would expect from differentiation. Denote by $\mathfrak{X}(\mathbb{R}^3)$ the set of all tangent vector fields on $\mathbb{R}^3$. The covariant derivative may be regarded as a bilinear map $\nabla:\mathfrak{X}(\mathbb{R}^3)\times\mathfrak{X}(\mathbb{R}^3)\longrightarrow\mathfrak{X}(\mathbb{R}^3)$ satisfying properties 1 and 2, and we write $\nabla(X,Y)$ as $\nabla_XY$. This gives us a clue on how to define the derivative of a section. It turns out that there isn’t a unique way to differentiate a section, for there can be many different maps $\nabla$ satisfying properties 1 and 2. In fact, one needs to make a choice. Such a choice of differentiation is called a connection.

Definition. Let $M$ be a differentiable manifold of dimension $n$ and $\mathfrak{X}(M)$ the set of all tangent vector fields on $M$. A connection on $M$ is a bilinear map $\nabla:\mathfrak{X}(M)\times\mathfrak{X}(M)\longrightarrow\mathfrak{X}(M)$ such that
\begin{align*}
\nabla_{f X+gZ}Y&=f\nabla_XY+g\nabla_ZY\\
\nabla_XfY&=(Xf)Y+f\nabla_XY
\end{align*}
where $\nabla(X,Y)$ is written as $\nabla_XY$.

What we have defined here is a connection, i.e. a way of differentiating vector fields, on a differentiable manifold, but we still have not defined a way of differentiating sections of a line bundle. We now have a much clearer picture of it, though. Before we continue, let us briefly discuss differentials because they are closely related to directional derivatives. For each $i=1,\cdots,n$, the differential $dx^i$ is the 1-form on $M$ such that
$$dx^i\left(\frac{\partial}{\partial x^j}\right)=\frac{\partial x^i}{\partial x^j}=\delta_{ij}$$
where $\delta_{ij}$ is the Kronecker delta. That is, $dx^i$ is the dual vector of the tangent vector $\frac{\partial}{\partial x^i}$, $i=1,\cdots,n$, and the $dx^i$, $i=1,\cdots,n$, form the standard basis for the cotangent space $T^\ast M$. For any tangent vector field $X=\sum_{j=1}^n\alpha^j\frac{\partial}{\partial x^j}$,
$$dx^i(X)=\alpha^i=Xx^i.$$
So if we define the differential of $f$ by
$$df:=\sum_{i=1}^n\frac{\partial f}{\partial x^i}dx^i,$$
then
$$df(X)=\sum_{i=1}^n\frac{\partial f}{\partial x^i}dx^i(X)=Xf.$$
Thus the Leibniz rule can be also written as
$$\nabla_XfY=df(X)Y+f\nabla_XY.$$

Let $\gamma: (-\epsilon,\epsilon)\longrightarrow M$ be a path with $\gamma(0)=p$. On a local coordinate neighborhood $(U(p),\varphi)$, $\gamma(t)$ is written as $\gamma(t)=(x^1(t),\cdots,x^n(t))$ and
$$\frac{d\gamma}{dt}(0)=\sum_{i=1}^n\frac{dx^i}{dt}(0)\left(\frac{\partial}{\partial x^i}\right)_p\in T_p(M).$$
Now,
\begin{align*}
df\left(\frac{d\gamma}{dt}(0)\right)&=\sum_{i=1}^n\left(\frac{\partial f}{\partial x^i}\right)_pdx^i\left(\frac{d\gamma}{dt}(0)\right)\\
&=\sum_{i=1}^n\left(\frac{\partial f}{\partial x^i}\right)_p\frac{dx^i}{dt}(0)\\
&=\frac{df}{dt}(0).
\end{align*}
For the tangent vector field $\dot{\gamma}(t)=\frac{d\gamma}{dt}$, we have
$$df(\dot{\gamma}(t))=\frac{df}{dt}.$$
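The identity $df(\dot{\gamma}(t))=\frac{df}{dt}$ is the chain rule, and it is easy to test on a concrete case. The sketch below assumes a hypothetical function $f(x,y)=x^2y+\sin y$ and a hypothetical path $\gamma(t)=(\cos t,t^2)$ in $\mathbb{R}^2$; it compares $df$ evaluated on the velocity vector with a finite-difference derivative of $f\circ\gamma$.

```python
import math

def f(x, y):
    """Hypothetical test function on R^2."""
    return x * x * y + math.sin(y)

def grad_f(x, y):
    """Partial derivatives (df/dx, df/dy), the coefficients of dx and dy in df."""
    return (2 * x * y, x * x + math.cos(y))

def gamma(t):
    """Hypothetical path in R^2."""
    return (math.cos(t), t * t)

def gamma_dot(t):
    """Components (dx/dt, dy/dt) of the tangent vector to gamma."""
    return (-math.sin(t), 2 * t)

def df_on_velocity(t):
    """df(gamma'(t)) = sum_i (df/dx^i) * (dx^i/dt)."""
    fx, fy = grad_f(*gamma(t))
    vx, vy = gamma_dot(t)
    return fx * vx + fy * vy

def derivative_along_curve(t, h=1e-6):
    """Central-difference derivative of t -> f(gamma(t))."""
    return (f(*gamma(t + h)) - f(*gamma(t - h))) / (2 * h)

print(abs(df_on_velocity(0.4) - derivative_along_curve(0.4)))  # close to zero
```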

We will discuss connections on a line bundle in the second part of this lecture. I would like to end this lecture with a physical motivation for considering bundles. The fields in physics are usually given by maps $\phi:M\longrightarrow\mathbb{C}^n$ where $M$ is the spacetime. In quantum mechanics, particles are described by complex-valued wave functions (also called state functions) $\phi: M\longrightarrow\mathbb{C}$. Due to the Uncertainty Principle, one cannot pinpoint the exact location of a particle. The best one can do is to measure a probable location of the particle. The probability that a particle in the state $\phi$ is found in the region $U\subset M$ is
$$\int_Udx\langle\phi(x)|\phi(x)\rangle.$$
To define probability, all we need to know is that $\phi(x)$ takes its value in a one-dimensional complex vector space with a Hermitian product, and there is no reason for this to be the same vector space for all $x$. Functions like $\phi$, which generalize complex-valued functions, are called sections of vector bundles.

References:

[1] M. Murray, Notes on Line Bundles

[2] B. O’Neill, Elementary Differential Geometry, Academic Press, 1966

Line Bundles

Simply speaking, a line bundle is a complex vector bundle such that each fibre $F_x$ is a one-dimensional complex vector space i.e. one-dimensional vector space over the complex field $\mathbb{C}$. More specifically,

Definition. A complex line bundle over a manifold $M$ is a manifold $L$ and a smooth onto map $\pi: L\longrightarrow M$ such that

  1. For each $m\in M$, $\pi^{-1}(m)=L_m$ is a one-dimensional complex vector space.
  2. For each $m\in M$, there exists an open neighborhood $U(m)\subset M$ such that $\pi^{-1}(U(m))\stackrel{\varphi}{\cong}U(m)\times\mathbb{C}$ (here $\cong$ means “is homeomorphic to” as usual) and $\varphi(L_m)\subset\{m\}\times\mathbb{C}$. Moreover, $\varphi|_{L_m}:L_m\longrightarrow\{m\}\times\mathbb{C}$ is a linear isomorphism.

Example. The Trivial Bundle $M\times\mathbb{C}$.

Example. If $u\in S^2$, the tangent plane at $u$ is identified with
$$T_uS^2=\{v\in\mathbb{R}^3:\langle u,v\rangle=0\}.$$
We can make this 2-dimensional real vector space a 1-dimensional complex vector space by defining
$$(\alpha+i\beta)v:=\alpha v+\beta\cdot u\times v.$$
So, the tangent bundle $TS^2$ is a line bundle. $TS^2$ as a complex line bundle is called the mini-twistor space and it plays an important role in the study of BPS monopoles in physics.

Example. Let $\Sigma\subset\mathbb{R}^3$ be a surface. If $x\in\Sigma$ and $\hat n_x$ is a unit normal, then $T_x\Sigma=\hat n_x^\perp$ (the orthogonal complement of $\hat n_x$). We make this a 1-dimensional complex vector space by defining
$$(\alpha+i\beta)v=\alpha v+\beta\hat n_x\times v.$$
So, the tangent bundle $T\Sigma$ is a line bundle.

Example. [Hopf Bundle] Let $\mathbb{C}P^1$ be the set of all lines through the origin in $\mathbb{C}^2$. Denote the line through the vector $z=(z^0,z^1)$ by $[z]=[z^0,z^1]$. Define two open sets $U_i$, $i=0,1$ by
$$U_i=\{[z^0,z^1]:z^i\ne 0\},\ i=0,1$$
and $\psi_i:U_i\longrightarrow\mathbb{C}$ by
$$\psi_0([z])=\frac{z^1}{z^0},\ \psi_1([z])=\frac{z^0}{z^1}.$$
Then $\mathbb{C}P^1$ is a complex manifold of dimension 1. As a manifold $\mathbb{C}P^1$ is diffeomorphic to $S^2$. An explicit diffeomorphism $S^2\longrightarrow\mathbb{C}P^1$ is given by
$$(x^1,x^2,x^3)\longmapsto[x^1+ix^2,1-x^3].$$
At the north pole $(0,0,1)$, where this expression is not defined, the map is extended by sending $(0,0,1)$ to $[1,0]$. Define a line bundle $H\subset\mathbb{C}^2\times\mathbb{C}P^1$ over $\mathbb{C}P^1$ by
$$H=\{(\omega,[z]): \omega=\lambda z\ \mbox{for some}\ \lambda\in\mathbb{C}\}.$$
Define a projection $\pi:H\longrightarrow\mathbb{C}P^1$ by $\pi(\omega,[z])=[z]$. The fibre $H_{[z]}=\pi^{-1}([z])$ is the set $\{(\lambda z,[z]):\lambda\in\mathbb{C}\}$ which is identified with the line $[z]$ through the vector $z$. The fibre $H_{[z]}$ is made into a 1-dimensional complex vector space, with zero vector $(0,[z])$, by
\begin{align*}
\alpha(\omega,[z])+\beta(\omega',[z])&:=(\alpha\omega+\beta\omega',[z]),\ \alpha,\beta\in\mathbb{C}.
\end{align*}
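The two charts $\psi_0,\psi_1$ and the map $S^2\longrightarrow\mathbb{C}P^1$ from this example can be explored with a few lines of code. The sketch below is illustrative only (the function names are not from the notes): it represents $[z^0,z^1]$ by a pair of complex numbers, checks that the chart values do not depend on the chosen representative, and checks that $\psi_0\psi_1=1$ on $U_0\cap U_1$.

```python
def to_cp1(x1, x2, x3):
    """Homogeneous coordinates [x1 + i x2, 1 - x3] of a point of S^2 (not the north pole)."""
    return (complex(x1, x2), complex(1.0 - x3, 0.0))

def psi0(z):
    """Chart on U_0 = {z^0 != 0}: [z] -> z^1 / z^0."""
    return z[1] / z[0]

def psi1(z):
    """Chart on U_1 = {z^1 != 0}: [z] -> z^0 / z^1."""
    return z[0] / z[1]

def rescale(z, lam):
    """Another representative lam * z of the same line [z]."""
    return (lam * z[0], lam * z[1])

# A sample point of S^2: 0.6^2 + 0^2 + 0.8^2 = 1.
z = to_cp1(0.6, 0.0, 0.8)
w = rescale(z, 2.0 - 3.0j)

print(psi1(z), psi1(w))      # same chart value for both representatives
print(psi0(z) * psi1(z))     # product of the two chart values on the overlap
```

In the chart $\psi_1$ the map reads $(x^1,x^2,x^3)\longmapsto\frac{x^1+ix^2}{1-x^3}$, which is stereographic projection from the north pole.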

References:

[1] M. Murray, Notes on Line Bundles

Vector Bundles

Let $M$ be a differentiable manifold of dimension $n$. Consider an atlas $\mathcal{U}=\{U_\alpha\}_{\alpha\in\mathcal{A}}$ along with coordinates $x_\alpha^1,\cdots,x_\alpha^n$ in $U_\alpha$. For $x=(x_\alpha^1(x),\cdots,x_\alpha^n(x))\in U_\alpha$, a tangent vector is given by
$$v=\sum_{j=1}^nv_\alpha^j\frac{\partial}{\partial x_\alpha^j}.$$
If $x\in U_\alpha\cap U_\beta$, then $v$ is also written as
$$v=\sum_{j=1}^nv_\beta^j\frac{\partial}{\partial x_\beta^j}.$$
Here, the change of coordinates is given by
$$v_\beta^j=vx_\beta^j=\sum_{k=1}^nv_\alpha^k\frac{\partial x_\beta^j}{\partial x_\alpha^k}.$$
For $x\in U_\alpha\cap U_\beta$ and $f=(f^1,\cdots,f^n)\in\mathbb{R}^n$, define
\begin{align*}
h_{\alpha\beta}(x)(f)&=\left(\sum_{k=1}^n\frac{\partial x_\beta^1}{\partial x_\alpha^k}f^k,\cdots,\sum_{k=1}^n\frac{\partial x_\beta^n}{\partial x_\alpha^k}f^k\right)\\
&=\begin{pmatrix}
\frac{\partial x_\beta^1}{\partial x_\alpha^1} & \cdots & \frac{\partial x_\beta^1}{\partial x_\alpha^n}\\
\vdots & \ddots & \vdots\\
\frac{\partial x_\beta^n}{\partial x_\alpha^1} & \cdots & \frac{\partial x_\beta^n}{\partial x_\alpha^n}
\end{pmatrix}\begin{pmatrix}
f^1\\
\vdots\\
f^n
\end{pmatrix}
\end{align*}
Hence $h_{\alpha\beta}:U_\alpha\cap U_\beta\longrightarrow\mathrm{Aut(\mathbb{R}^n)}$. The resulting bundle over $M$ with fibre $F=\mathbb{R}^n$ is called the tangent bundle of $M$ and is denoted by $TM$. Note that $TM$ is the set of all tangent vectors of $M$ i.e. $TM=\bigcup_{x\in M}T_xM$. For each $x\in U_\alpha$, the fibre $\pi^{-1}(x)$ of $x\in M$ is $T_xM\cong\{x\}\times\mathbb{R}^n$. The local trivialization map $h_\alpha:\pi^{-1}(U_\alpha)\longrightarrow U_\alpha\times\mathbb{R}^n$ is given by
$$h_\alpha(v)=(x,(v_\alpha^1,\cdots,v_\alpha^n)),\ v\in T_xU_\alpha(=T_xM),\ x\in U_\alpha.$$

A fibre bundle $(E,M,F,\pi)$ is called a vector bundle over $M$ if each fibre $F_x$ of $x\in M$ is a vector space. So, a tangent bundle is a vector bundle. The tangent bundle $TM$ is a differentiable manifold of dimension $2n$ with local coordinates in $\pi^{-1}(U_\alpha)$ being $(x_\alpha^1,\cdots,x_\alpha^n,v_\alpha^1,\cdots,v_\alpha^n)$. The Jacobian is given by
\begin{align*}
J(x_\beta^1,\cdots,x_\beta^n;x_\alpha^1,\cdots,x_\alpha^n)&=\frac{\partial(x_\beta^1,\cdots,x_\beta^n)}{\partial(x_\alpha^1,\cdots,x_\alpha^n)}\\
&=\begin{pmatrix}
\frac{\partial x_\beta^1}{\partial x_\alpha^1} & \cdots & \frac{\partial x_\beta^1}{\partial x_\alpha^n}\\
\vdots & \ddots & \vdots\\
\frac{\partial x_\beta^n}{\partial x_\alpha^1} & \cdots & \frac{\partial x_\beta^n}{\partial x_\alpha^n}
\end{pmatrix}:U_\alpha\cap U_\beta\longrightarrow\mathrm{GL}(n,\mathbb{R}).
\end{align*}
Let $g_{\alpha\beta}=J(x_\beta^1,\cdots,x_\beta^n;x_\alpha^1,\cdots,x_\alpha^n)$. Then $g_{\alpha\beta}$ satisfies
\begin{align*}
g_{\alpha\alpha}(x)&=I_n;\\
g_{\beta\alpha}(x)&=g_{\alpha\beta}^{-1}(x),\ x\in U_\alpha\cap U_\beta;\\
g_{\alpha\beta}(x)g_{\beta\gamma}(x)g_{\gamma\alpha}(x)&=I_n,\ x\in U_\alpha\cap U_\beta\cap U_\gamma.
\end{align*}
For $x\in U_\alpha\cap U_\beta$ and $f\in\mathbb{R}^n$, $h_{\alpha\beta}(x)(f)=g_{\alpha\beta}\cdot f$. So, $\mathrm{GL}(n,\mathbb{R})$ acts on the fibre $\mathbb{R}^n$. The map $g_{\alpha\beta}$ itself is often called a transition map.
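To make the transition maps concrete, here is a small sketch for two charts on the punctured plane. It is an illustration under assumptions: the chart $\alpha$ is taken to be Cartesian coordinates $(x,y)$ and the chart $\beta$ polar coordinates $(r,\theta)$, so that $g_{\alpha\beta}=\frac{\partial(r,\theta)}{\partial(x,y)}$ and $g_{\beta\alpha}=\frac{\partial(x,y)}{\partial(r,\theta)}$. The code checks $g_{\alpha\beta}g_{\beta\alpha}=I_2$ at a sample point.

```python
import math

def g_alpha_beta(x, y):
    """Jacobian d(r, theta)/d(x, y) at the point (x, y) != (0, 0)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def g_beta_alpha(x, y):
    """Jacobian d(x, y)/d(r, theta) at the same point."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(a, v):
    """Action of a 2x2 matrix on a vector of fibre coordinates."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

product = matmul(g_alpha_beta(1.0, 2.0), g_beta_alpha(1.0, 2.0))
polar_components = matvec(g_alpha_beta(1.0, 2.0), [1.0, 0.0])
print(product)            # approximately the identity matrix
print(polar_components)   # components of d/dx in the polar frame
```

The call to `matvec` is the rule $v_\beta^j=\sum_kv_\alpha^k\frac{\partial x_\beta^j}{\partial x_\alpha^k}$ applied to the vector $\frac{\partial}{\partial x}$.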

If the transition map $h_{\alpha\beta}$ is the group action of a Lie group $G$ on the fibre $F$, the fibre bundle $(E,M,F,\pi)$ is called a $G$-bundle and the Lie group $G$ is called a structure group. The tangent bundle $TM$ is also a $G$-bundle with structure group $\mathrm{GL}(n,\mathbb{R})$.

Differentiable Manifolds and Tangent Spaces

In $\mathbb{R}^n$, there is a globally defined orthonormal frame
$$E_{1p}=(1,0,\cdots,0)_p,\ E_{2p}=(0,1,0,\cdots,0)_p,\cdots,E_{np}=(0,\cdots,0,1)_p.$$
For any tangent vector $X_p\in T_p(\mathbb{R}^n)$, $X_p=\sum_{i=1}^n\alpha^iE_{ip}$. Note that the coefficients $\alpha^i$ are the ones that distinguish tangent vectors in $T_p(\mathbb{R}^n)$. For a differentiable function $f$, the directional derivative $X_p^\ast f$ of $f$ with respect to $X_p$ is given by
$$X_p^\ast f=\sum_{i=1}^n\alpha^i\left(\frac{\partial f}{\partial x_i}\right).$$
We identify each $X_p$ with the differential operator
$$X_p^\ast=\sum_{i=1}^n\alpha^i\frac{\partial}{\partial x_i}:C^\infty(p)\longrightarrow\mathbb{R}.$$
Then the frame fields $E_{1p},E_{2p},\cdots,E_{np}$ are identified with
$$\left(\frac{\partial}{\partial x_1}\right)_p,\left(\frac{\partial}{\partial x_2}\right)_p,\cdots,\left(\frac{\partial}{\partial x_n}\right)_p$$
respectively. Unlike $\mathbb{R}^n$, we cannot always have a globally defined frame on a differentiable manifold. So it is necessary for us to use local coordinate neighborhoods that are homeomorphic to $\mathbb{R}^n$ and the associated frames $\frac{\partial}{\partial x_1},\frac{\partial}{\partial x_2},\cdots,\frac{\partial}{\partial x_n}$.

Example. The points $(x,y,z)$ are represented in terms of the spherical coordinates $(\phi,\theta)$ as
$$x=\sin\phi\cos\theta,y=\sin\phi\sin\theta,z=\cos\phi,\ 0\leq\phi\leq\pi,\ 0\leq\theta\leq 2\pi.$$
By chain rule, one finds the standard basis $\frac{\partial}{\partial\phi},\frac{\partial}{\partial\theta}$ for $T_\ast S^2$:
\begin{align*}
\frac{\partial}{\partial\phi}&=\cos\phi\cos\theta\frac{\partial}{\partial x}+\cos\phi\sin\theta\frac{\partial}{\partial y}-\sin\phi\frac{\partial}{\partial z},\\
\frac{\partial}{\partial\theta}&=-\sin\phi\sin\theta\frac{\partial}{\partial x}+\sin\phi\cos\theta\frac{\partial}{\partial y}.
\end{align*}
The frame field is not globally defined on $S^2$ since $\frac{\partial}{\partial\theta}$ vanishes at $\phi=0,\pi$. More generally, the following theorem holds.

Figure: Frame field on the 2-sphere

Theorem. [Hairy Ball Theorem] If $n$ is even, a non-vanishing $C^\infty$ vector field on $S^n$ does not exist i.e. a $C^\infty$ vector field on $S^n$ must take zero value at some point of $S^n$.

The Hairy Ball Theorem tells us why we have bald spots on our heads. It can also be stated as “you cannot comb a hairy ball flat.” There may also be a meteorological implication of this theorem: if the wind over the surface of the earth is modelled as a continuous tangent vector field, there must be at least one spot on earth where there is no wind at all. Such a no-wind spot may be the eye of a hurricane. So, loosely speaking, as long as there is wind on earth, there must be a calm spot, possibly the eye of a storm, somewhere at all times.

It is known that all odd-dimensional spheres have at least one non-vanishing $C^\infty$ vector field and that only the spheres $S^1, S^3, S^7$ have a global $C^\infty$ frame field, i.e. are parallelizable. For instance, there are three mutually perpendicular unit vector fields on $S^3\subset\mathbb{R}^4$, i.e. a frame field: Let $S^3=\{(x^1,x^2,x^3,x^4)\in\mathbb{R}^4: \sum_{i=1}^4(x^i)^2=1\}$. Then
\begin{align*}
X&=-x^2\frac{\partial}{\partial x^1}+x^1\frac{\partial}{\partial x^2}+x^4\frac{\partial}{\partial x^3}-x^3\frac{\partial}{\partial x^4},\\
Y&=-x^3\frac{\partial}{\partial x^1}-x^4\frac{\partial}{\partial x^2}+x^1\frac{\partial}{\partial x^3}+x^2\frac{\partial}{\partial x^4},\\
Z&=-x^4\frac{\partial}{\partial x^1}+x^3\frac{\partial}{\partial x^2}-x^2\frac{\partial}{\partial x^3}+x^1\frac{\partial}{\partial x^4}
\end{align*}
form an orthonormal basis of $C^\infty$ vector fields on $S^3$.
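
The orthonormality claim can be checked numerically at any point of $S^3$. The following is a minimal sketch in plain Python, assuming $X$ has components $(-x^2,x^1,x^4,-x^3)$; the sample point and the helper names `frame_on_s3` and `dot` are chosen here for illustration only. It also checks that $X$, $Y$, $Z$ are perpendicular to the position vector, i.e. tangent to $S^3$.

```python
import math

def frame_on_s3(x1, x2, x3, x4):
    """Components of the vector fields X, Y, Z at the point (x1, x2, x3, x4)."""
    X = (-x2, x1, x4, -x3)
    Y = (-x3, -x4, x1, x2)
    Z = (-x4, x3, -x2, x1)
    return X, Y, Z

def dot(u, v):
    """Euclidean dot product in R^4."""
    return sum(a * b for a, b in zip(u, v))

# An arbitrary sample point (an assumption), normalized so that it lies on S^3.
raw = (0.3, -1.2, 0.5, 2.0)
norm = math.sqrt(dot(raw, raw))
p = tuple(c / norm for c in raw)

X, Y, Z = frame_on_s3(*p)
vectors = {"p": p, "X": X, "Y": Y, "Z": Z}

# Gram matrix of p, X, Y, Z: it should be the 4x4 identity matrix.
gram = {(a, b): dot(u, v)
        for a, u in vectors.items() for b, v in vectors.items()}
is_orthonormal = all(
    abs(value - (1.0 if a == b else 0.0)) < 1e-12
    for (a, b), value in gram.items()
)
```

Since the position vector $p$ is the unit normal to $S^3$ at $p$, the Gram matrix being the identity says exactly that $X,Y,Z$ form an orthonormal basis of the tangent space at $p$.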

Fibre Bundles

A fibre bundle is an object $(E,M,F,\pi)$ consisting of

  1. The total space $E$;
  2. The base space $M$ with an open covering $\mathcal{U}=\{U_\alpha\}_{\alpha\in\mathcal{A}}$;
  3. The fibre $F$ and the projection map $E\stackrel{\pi}{ \longrightarrow}M$.

The simplest case is $E=M\times F$. In this case, the bundle is called a trivial bundle. In general the total space may be too complicated for us to understand, so it would be nice if we could always find smaller parts that are simple enough for us to understand, such as trivial bundles. For this reason, we want the fibre bundle to have the additional property: For each $U_\alpha\in\mathcal{U}$, there exists a homeomorphism $h_\alpha : \pi^{-1}(U_\alpha)\longrightarrow U_\alpha\times F$. Such a homeomorphism $h_\alpha$ is called a local trivialization. For each $x\in M$, $F_x:=\pi^{-1}(x)$ is homeomorphic to $\{x\}\times F$. $F_x$ is called the fibre over $x$.

Let $x\in U_\alpha\cap U_\beta$. Then $F_x^\alpha\subset\pi^{-1}(U_\alpha)$ and $F_x^\beta\subset\pi^{-1}(U_\beta)$ may not be the same. However, the two fibres are homeomorphic. For each $x\in U_\alpha\cap U_\beta$, denote by $h_{\alpha\beta}(x)$ the homeomorphism from $F_x^\alpha$ to $F_x^\beta$. Then for each $x\in U_\alpha\cap U_\beta$, $h_{\alpha\beta}(x)\in\mathrm{Aut}(F)$ where $\mathrm{Aut}(F)$ is the group of homeomorphisms from $F$ to itself, i.e. the automorphism group of $F$. The map $h_{\alpha\beta}: U_\alpha\cap U_\beta\longrightarrow\mathrm{Aut}(F)$ is called a transition map. Note that for $U_\alpha,U_\beta\in\mathcal{U}$ with $U_\alpha\cap U_\beta\ne\emptyset$, $h_\alpha\circ h_\beta^{-1}:(U_\alpha\cap U_\beta)\times F\longrightarrow (U_\alpha\cap U_\beta)\times F$ satisfies
$$h_\alpha\circ h_\beta^{-1}(x,f)=(x,h_{\alpha\beta}(x)f)$$
for any $x\in U_\alpha\cap U_\beta$, $f\in F$.

Structural Equations

Definition. The dual 1-forms $\theta_1,\theta_2,\theta_3$ of a frame $E_1,E_2,E_3$ on $\mathbb{E}^3$ are defined by
$$\theta_i(v)=v\cdot E_i(p),\ v\in T_p\mathbb{E}^3.$$
Clearly $\theta_i$ is linear.

Example. The dual 1-forms of the natural frame $U_1,U_2,U_3$ are $dx_1$, $dx_2$, $dx_3$ since
$$dx_i(v)=v_i=v\cdot U_i(p)$$
for each $v\in T_p\mathbb{E}^3$.

For any vector field $V$ on $\mathbb{E}^3$,
$$V=\sum_i\theta_i(V)E_i.$$
To see this, let us calculate for each $V(p)\in T_p\mathbb{E}^3$
\begin{align*}
\sum_i\theta_i(V(p))E_i(p)&=\sum_i(V(p)\cdot E_i(p))E_i(p)\\
&=\sum_iV_i(p)E_i(p)\\
&=V(p).
\end{align*}

Lemma. Let $\theta_1,\theta_2,\theta_3$ be the dual 1-forms of a frame $E_1, E_2, E_3$. Then any 1-form $\phi$ on $\mathbb{E}^3$ has a unique expression
$$\phi=\sum_i\phi(E_i)\theta_i.$$

Proof. Let $V$ be any vector field on $\mathbb{E}^3$. Then
\begin{align*}
\left(\sum_i\phi(E_i)\theta_i\right)(V)&=\sum_i\phi(E_i)\theta_i(V)\\
&=\phi\left(\sum_i\theta_i(V)E_i\right)\ \mbox{by linearity of $\phi$}\\
&=\phi(V).
\end{align*}
Let $A=(a_{ij})$ be the attitude matrix of a frame field $E_1$, $E_2$, $E_3$, i.e.
$$E_i=\sum_ja_{ij}U_j,\ i=1,2,3.\ \ \ \ \ \mbox{(1)}$$
Clearly $\theta_i=\sum_j\theta_i(U_j)dx_j$. On the other hand,
$$\theta_i(U_j)=E_i\cdot U_j=\left(\sum_ka_{ik}U_k\right)\cdot U_j=a_{ij}.$$ Hence the dual formulation of (1) is
$$\theta_i=\sum_ja_{ij}dx_j.\ \ \ \ \ \mbox{(2)}$$

Theorem. [Cartan Structural Equations] Let $E_1$, $E_2$, $E_3$ be a frame field on $\mathbb{E}^3$ with dual 1-forms $\theta_1$, $\theta_2$, $\theta_3$ and connection forms $\omega_{ij}$, $i,j=1,2,3$. Then

  1. The First Structural Equations: $$d\theta_i=\sum_j\omega_{ij}\wedge\theta_j.$$
  2. The Second Structural Equations: $$d\omega_{ij}=\sum_k\omega_{ik}\wedge\omega_{kj}.$$

Proof. The exterior derivative of (2) is
$$d\theta_i=\sum_jda_{ij}\wedge dx_j.$$ Since $\omega=dA\cdot{}^tA$ and ${}^tA=A^{-1}$ (recall that $A$ is an orthogonal matrix), $dA=\omega\cdot A$, i.e.
$$da_{ij}=\sum_k\omega_{ik}a_{kj}.$$
So,
\begin{align*}
d\theta_i&=\sum_j\left\{\left(\sum_k\omega_{ik}a_{kj}\right)\wedge dx_j\right\}\\
&=\sum_k\left\{\omega_{ik}\wedge\sum_j a_{kj}dx_j\right\}\\
&=\sum_k\omega_{ik}\wedge\theta_k.
\end{align*}

From $\omega=dA\cdot{}^tA$,
$$\omega_{ij}=\sum_kda_{ik}a_{jk}.\ \ \ \ \ \mbox{(3)}$$
The exterior derivative of (3) is
\begin{align*}
d\omega_{ij}&=\sum_k da_{jk}\wedge da_{ik}\\
&=-\sum_k da_{ik}\wedge da_{jk},
\end{align*}
i.e.
\begin{align*}
d\omega&=-dA\wedge{}^t(dA)\\
&=-(\omega\cdot A)\cdot({}^tA\cdot{}^t\omega)\\
&=-\omega\cdot (A\cdot{}^tA)\cdot{}^t\omega\\
&=-\omega\cdot{}^t\omega\ \ \ (A\cdot{}^tA=I)\\
&=\omega\cdot\omega.\ \ \ (\mbox{$\omega$ is skew-symmetric.})
\end{align*}
This is equivalent to the second structural equations.

Example. [Structural Equations for the Spherical Frame Field] Let us first calculate the dual forms and connection forms.

From the spherical coordinates
\begin{align*}
x_1&=\rho\cos\varphi\cos\theta,\\
x_2&=\rho\cos\varphi\sin\theta,\\
x_3&=\rho\sin\varphi,
\end{align*}
we obtain differentials
\begin{align*}
dx_1&=\cos\varphi\cos\theta d\rho-\rho\sin\varphi\cos\theta d\varphi-\rho\cos\varphi\sin\theta d\theta,\\
dx_2&=\cos\varphi\sin\theta d\rho-\rho\sin\varphi\sin\theta d\varphi+\rho\cos\varphi\cos\theta d\theta,\\
dx_3&=\sin\varphi d\rho+\rho\cos\varphi d\varphi.
\end{align*}
From the spherical frame field $F_1$, $F_2$, $F_3$ discussed here, we find its attitude matrix
$$A=\begin{pmatrix}
\cos\varphi\cos\theta & \cos\varphi\sin\theta & \sin\varphi\\
-\sin\theta & \cos\theta & 0\\
-\sin\varphi\cos\theta & -\sin\varphi\sin\theta & \cos\varphi
\end{pmatrix}.$$
Thus by (2) we find the dual 1-forms
\begin{align*}
\begin{pmatrix}
\theta_1\\
\theta_2\\
\theta_3
\end{pmatrix}&=\begin{pmatrix}
\cos\varphi\cos\theta & \cos\varphi\sin\theta & \sin\varphi\\
-\sin\theta & \cos\theta & 0\\
-\sin\varphi\cos\theta & -\sin\varphi\sin\theta & \cos\varphi
\end{pmatrix}\begin{pmatrix}
dx_1\\
dx_2\\
dx_3
\end{pmatrix}\\
&=\begin{pmatrix}
d\rho\\
\rho\cos\varphi d\theta\\
\rho d\varphi
\end{pmatrix}.
\end{align*}
\begin{align*}
&dA=\\
&\begin{bmatrix}
-\sin\varphi\cos\theta d\varphi-\cos\varphi\sin\theta d\theta & -\sin\varphi\sin\theta d\varphi+\cos\varphi\cos\theta d\theta & \cos\varphi d\varphi\\
-\cos\theta d\theta & -\sin\theta d\theta & 0\\
-\cos\varphi\cos\theta d\varphi+\sin\varphi\sin\theta d\theta & -\cos\varphi\sin\theta d\varphi-\sin\varphi\cos\theta d\theta & -\sin\varphi d\varphi
\end{bmatrix}\end{align*}
and so,
\begin{align*}
\omega&=\begin{pmatrix}
0 & \omega_{12} & \omega_{13}\\
-\omega_{12} & 0 & \omega_{23}\\
-\omega_{13} & -\omega_{23} & 0
\end{pmatrix}\\
&=dA\cdot{}^tA\\
&=\begin{pmatrix}
0 & \cos\varphi d\theta & d\varphi\\
-\cos\varphi d\theta & 0 & \sin\varphi d\theta\\
-d\varphi & -\sin\varphi d\theta & 0
\end{pmatrix}.
\end{align*}
From these dual 1-forms and connection forms one can immediately verify the first and the second structural equations.
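
As a sketch of that verification, take $\theta_1=d\rho$, $\theta_2=\rho\cos\varphi\,d\theta$, $\theta_3=\rho\,d\varphi$ together with $\omega_{12}=\cos\varphi\,d\theta$, $\omega_{13}=d\varphi$, $\omega_{23}=\sin\varphi\,d\theta$ and $\omega_{ji}=-\omega_{ij}$. The first structural equations read
\begin{align*}
d\theta_1&=0=\cos\varphi\,d\theta\wedge\rho\cos\varphi\,d\theta+d\varphi\wedge\rho\,d\varphi=\omega_{12}\wedge\theta_2+\omega_{13}\wedge\theta_3,\\
d\theta_2&=\cos\varphi\,d\rho\wedge d\theta-\rho\sin\varphi\,d\varphi\wedge d\theta=-\cos\varphi\,d\theta\wedge d\rho+\sin\varphi\,d\theta\wedge\rho\,d\varphi=\omega_{21}\wedge\theta_1+\omega_{23}\wedge\theta_3,\\
d\theta_3&=d\rho\wedge d\varphi=-d\varphi\wedge d\rho-\sin\varphi\,d\theta\wedge\rho\cos\varphi\,d\theta=\omega_{31}\wedge\theta_1+\omega_{32}\wedge\theta_2,
\end{align*}
and the second structural equations read
\begin{align*}
d\omega_{12}&=-\sin\varphi\,d\varphi\wedge d\theta=d\varphi\wedge(-\sin\varphi\,d\theta)=\omega_{13}\wedge\omega_{32},\\
d\omega_{13}&=0=\cos\varphi\,d\theta\wedge\sin\varphi\,d\theta=\omega_{12}\wedge\omega_{23},\\
d\omega_{23}&=\cos\varphi\,d\varphi\wedge d\theta=-\cos\varphi\,d\theta\wedge d\varphi=\omega_{21}\wedge\omega_{13}.
\end{align*}
In each sum the terms with $\omega_{ii}=0$ have been dropped.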

Tensors I

Tensors may be considered as a generalization of vectors and covectors. They are extremely important quantities for studying differential geometry and physics.

Let $M^n$ be an $n$-dimensional differentiable manifold. For each $x\in M^n$, let $E_x=T_xM^n$, i.e. the tangent space to $M^n$ at $x$. We denote the canonical basis of $E$ by $\partial=\left(\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}\right)$ and its dual basis by $\sigma=dx=(dx^1,\cdots,dx^n)$, where $x^1,\cdots,x^n$ are local coordinates. The canonical basis $\frac{\partial}{\partial x^1},\cdots,\frac{\partial}{\partial x^n}$ is also simply denoted by $\partial_1,\cdots,\partial_n$.

Covariant Tensors

Definition. A covariant tensor of rank $r$ is a multilinear real-valued function
$$Q:E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
of $r$-tuples of vectors. A covariant tensor of rank $r$ is also called a tensor of type $(0,r)$ or, for short, a $(0,r)$-tensor. Note that the values of $Q$ must be independent of the basis in which the components of the vectors are expressed. A covariant vector (also called a covector or a 1-form) is a covariant tensor of rank 1. An important example of a covariant tensor of rank 2 is the metric tensor $G$:
$$G(v,w)=\langle v,w\rangle=\sum_{i,j}g_{ij}v^iw^j.$$

In components, by multilinearity
\begin{align*}
Q(v_1,\cdots,v_r)&=Q\left(\sum_{i_1}v_1^{i_1}\partial_{i_1},\cdots,\sum_{i_r}v_r^{i_r}\partial_{i_r}\right)\\
&=\sum_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}Q(\partial_{i_1},\cdots,\partial_{i_r}).
\end{align*}
Denote $Q(\partial_{i_1},\cdots,\partial_{i_r})$ by $Q_{i_1,\cdots,i_r}$. Then
$$Q(v_1,\cdots,v_r)=\sum_{i_1,\cdots,i_r}Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.\ \ \ \ \ \mbox{(1)}$$
Using the Einstein summation convention, (1) can be written briefly as
$$Q(v_1,\cdots,v_r)=Q_{i_1,\cdots,i_r}v_1^{i_1}\cdots v_r^{i_r}.$$
The set of all covariant tensors of rank $r$ forms a vector space over $\mathbb{R}$. The number of components in such a tensor is $n^r$. The vector space of all covariant $r$-th rank tensors is denoted by
$$E^\ast\otimes E^\ast\otimes\cdots\otimes E^\ast=\otimes^r E^\ast.$$

If $\alpha,\beta\in E^\ast$, i.e. covectors, we can form the 2nd rank covariant tensor, the tensor product $\alpha\otimes\beta$ of $\alpha$ and $\beta$: Define $\alpha\otimes\beta: E\times E\longrightarrow\mathbb{R}$ by
$$\alpha\otimes\beta(v,w)=\alpha(v)\beta(w).$$
If we write $\alpha=a_idx^i$ and $\beta=b_jdx^j$, then
$$(\alpha\otimes\beta)_{ij}=\alpha\otimes\beta(\partial_i,\partial_j)=\alpha(\partial_i)\beta(\partial_j)=a_ib_j.$$

Contravariant Tensors

A contravariant vector, i.e. an element of $E$ can be considered as a linear functional $v: E^\ast\longrightarrow\mathbb{R}$ defined by
$$v(\alpha)=\alpha(v)=a_iv^i,\ \alpha=a_idx^i\in E^\ast.$$

Definition. A contravariant tensor of rank $s$ is a multilinear real-valued function $T$ on $s$-tuples of covectors
$$T:E^\ast\times E^\ast\times\cdots\times E^\ast\longrightarrow\mathbb{R}.$$ A contravariant tensor of rank $s$ is also called a tensor of type $(s,0)$ or shortly $(s,0)$-tensor.
For 1-forms $\alpha_1,\cdots,\alpha_s$
$$T(\alpha_1,\cdots,\alpha_s)=a_{1_{i_1}}\cdots a_{s_{i_s}}T^{i_1\cdots i_s}$$
where
$$T^{i_1\cdots i_s}:=T(dx^{i_1},\cdots,dx^{i_s}).$$
The space of all contravariant tensors of rank $s$ is denoted by
$$E\otimes E\otimes\cdots\otimes E:=\otimes^s E.$$
Contravariant vectors are contravariant tensors of rank 1. An example of a contravariant tensor of rank 2 is the inverse of the metric tensor $G^{-1}=(g^{ij})$:
$$G^{-1}(\alpha,\beta)=g^{ij}a_ib_j.$$

Given a pair $v,w$ of contravariant vectors, we can form the tensor product $v\otimes w$ in the same manner as we did for covariant vectors. It is the 2nd rank contravariant tensor with components $(v\otimes w)^{ij}=v^iw^j$. The metric tensor $G$ and its inverse $G^{-1}$ may be written as
$$G=g_{ij}dx^i\otimes dx^j\ \mbox{and}\ G^{-1}=g^{ij}\partial_i\otimes\partial_j.$$

Mixed Tensors

Definition. A mixed tensor, $r$ times covariant and $s$ times contravariant, is a real multilinear function $W$
$$W: E^\ast\times E^\ast\times\cdots\times E^\ast\times E\times E\times\cdots\times E\longrightarrow\mathbb{R}$$
on $s$-tuples of covectors and $r$-tuples of vectors. It is also called a tensor of type $(s,r)$ or simply $(s,r)$-tensor. By multilinearity
$$W(\alpha_1,\cdots,\alpha_s, v_1,\cdots, v_r)=a_{1_{i_1}}\cdots a_{s_{i_s}}W^{i_1\cdots i_s}{}_{j_1\cdots j_r}v_1^{j_1}\cdots v_r^{j_r}$$
where
$$W^{i_1\cdots i_s}{}_{j_1\cdots j_r}:=W(dx^{i_1},\cdots,dx^{i_s},\partial_{j_1},\cdots,\partial_{j_r}).$$

A 2nd rank mixed tensor may arise from a linear operator $A: E\longrightarrow E$. Define $W_A: E^\ast\times E\longrightarrow\mathbb{R}$ by $W_A(\alpha,v)=\alpha(Av)$. Let $A=(A^i{}_j)$ be the matrix associated with $A$, i.e. $A(\partial_j)=\partial_i A^i{}_j$. Let us calculate the component of $W_A$:
$$W_A^i{}_j=W_A(dx^i,\partial_j)=dx^i(A(\partial_j))=dx^i(\partial_kA^k{}_j)=\delta^i_kA^k{}_j=A^i{}_j.$$
So the matrix of the mixed tensor $W_A$ is just the matrix associated with $A$. Conversely, given a mixed tensor $W$, once covariant and once contravariant, we can define a linear transformation $A$ such that $W(\alpha,v)=\alpha(Av)$. We do not distinguish between a linear transformation $A$ and its associated mixed tensor $W_A$. In components, $W(\alpha,v)$ is written as
$$W(\alpha,v)=a_iA^i{}_jv^j=aAv.$$

The tensor product $w\otimes\beta$ of a vector and a covector is the mixed tensor defined by
$$(w\otimes\beta)(\alpha,v)=\alpha(w)\beta(v).$$ The associated transformation can be written as
$$A=A^i{}_j\partial_i\otimes dx^j=\partial_i\otimes A^i{}_jdx^j.$$

For math undergraduates, the different ways of writing indices (raised, lowered, and mixed) in tensor notation can be very confusing. The main reason is that in standard math courses such as linear algebra or elementary differential geometry (classical differential geometry of curves and surfaces in $\mathbb{E}^3$) the matrix of a linear transformation is usually written as $A_{ij}$. Physics undergraduates don’t usually get a chance to learn tensors in undergraduate physics courses. In order to study more advanced differential geometry or physics, such as the theory of special and general relativity and field theory, one must be able to distinguish the three different ways of writing matrices: $A_{ij}$, $A^{ij}$, and $A^i{}_j$. To summarize, $A_{ij}$ and $A^{ij}$ are the matrices of bilinear forms on $E$ and $E^\ast$, respectively, defined by
$$A_{ij}v^iw^j\ \mbox{and}\ A^{ij}a_ib_j\ (\mbox{respectively}).$$ $A^i{}_j$ is the matrix of a linear transformation $A: E\longrightarrow E$.

Let $(E,\langle\ ,\ \rangle)$ be an inner product space. Given a linear transformation $A: E\longrightarrow E$ (i.e. a mixed tensor), one can associate a covariant bilinear form $A'$ by
$$A'(v,w):=\langle v,Aw\rangle=v^ig_{ij}A^j{}_k w^k.$$ So we see that the matrix of $A'$ is
$$A'_{ik}=g_{ij}A^j{}_k.$$ The process can be described as “we lower the upper index $j$ of $A^j{}_k$, making it an $i$, by means of the metric tensor $g_{ij}$.” In tensor analysis one uses the same letter, i.e. instead of $A'$, one writes
$$A_{ik}:=g_{ij}A^j{}_k.$$ This is clearly a covariant tensor. In general, the components of the associated covariant tensor $A_{ik}$ differ from those of the mixed tensor $A^i{}_j$. But if the basis is orthonormal, i.e. $g_{ij}=\delta_{ij}$, then they coincide. That is the reason why we simply write $A_{ij}$ without making any distinction in linear algebra or in elementary differential geometry.

Similarly, one may associate to the linear transformation $A$ a contravariant bilinear form
$$\bar A(\alpha,\beta)=a_iA^i{}_jg^{jk}b_k$$ whose matrix components can be written as
$$A^{ik}=A^i{}_jg^{jk}.$$
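
Lowering and raising an index amounts to a matrix product with $g_{ij}$ or $g^{ij}$. The following minimal sketch in plain Python illustrates this; the $2\times 2$ metric `g`, the matrix `A`, and the vectors `v`, `w` are hypothetical sample data, not taken from the text. It computes $A_{ik}=g_{ij}A^j{}_k$ and $A^{ik}=A^i{}_jg^{jk}$ and compares $\langle v,Aw\rangle$ with $v^iA_{ik}w^k$.

```python
# Hypothetical sample data on a 2-dimensional space E: a metric g_ij that is
# not the identity, and the matrix A^i_j of a linear transformation.
g = [[2.0, 1.0],
     [1.0, 3.0]]
A = [[1.0, 2.0],
     [0.0, -1.0]]

def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

g_inv = inverse_2x2(g)        # g^{ij}
A_lower = matmul(g, A)        # A_ik = g_ij A^j_k
A_upper = matmul(A, g_inv)    # A^ik = A^i_j g^jk

def bilinear(m, u, w):
    """Evaluate the sum over i, j of u[i] * m[i][j] * w[j]."""
    return sum(u[i] * m[i][j] * w[j] for i in range(2) for j in range(2))

def apply(m, w):
    """Apply the matrix m to the column of components w."""
    return [sum(m[i][j] * w[j] for j in range(2)) for i in range(2)]

v, w = [1.0, -2.0], [3.0, 0.5]
lhs = bilinear(g, v, apply(A, w))   # <v, Aw> = v^i g_ij (Aw)^j
rhs = bilinear(A_lower, v, w)       # v^i A_ik w^k
```

With this non-orthonormal metric, `A_lower` and `A_upper` both differ from `A`, which is the reason the three index placements have to be kept apart.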

Note that the metric tensor $g_{ij}$ represents a linear map from $E$ to $E^\ast$, sending the vector with components $v^j$ into the covector with components $g_{ij}v^j$. In quantum mechanics, the covector $g_{ij}v^j$ is denoted by $\langle v|$ and called a bra vector, while the vector $v^j$ is denoted by $|v\rangle$ and called a ket vector. Usually the inner product on $E$
$$\langle\ ,\ \rangle:E\times E\longrightarrow\mathbb{R};\ \langle v,w\rangle=g_{ij}v^iw^j$$ is considered as a covariant tensor of rank 2. But in quantum mechanics $\langle v,w\rangle$ is not considered as a covariant tensor $g_{ij}$ of rank 2 acting on a pair of vectors $(v,w)$, rather it is regarded as the braket $\langle v|w\rangle$, a bra vector $\langle v|$ acting on a ket vector $|w\rangle$.

Connection Forms

Let $E_1, E_2, E_3$ be an arbitrary frame field on $\mathbb{E}^3$. For each $v\in T_p\mathbb{E}^3$, $\nabla_v E_i\in T_p\mathbb{E}^3$, $i=1,2,3$. So, there exist unique 1-forms $\omega_{ij}:T_p\mathbb{E}^3\longrightarrow\mathbb{R}$, $i,j=1,2,3$ such that
\begin{align*}
\nabla_vE_1&=\omega_{11}(v)E_1(p)+\omega_{12}(v)E_2(p)+\omega_{13}(v)E_3(p),\\
\nabla_vE_2&=\omega_{21}(v)E_1(p)+\omega_{22}(v)E_2(p)+\omega_{23}(v)E_3(p),\\
\nabla_vE_3&=\omega_{31}(v)E_1(p)+\omega_{32}(v)E_2(p)+\omega_{33}(v)E_3(p)
\end{align*}
for each $v\in T_p\mathbb{E}^3$. These equations are called the connection equations of the frame field $E_1$, $E_2$, $E_3$. One can clearly see that $\omega_{ij}$ is determined by
$$\omega_{ij}(v)=\nabla_v E_i\cdot E_j(p).\ \ \ \ \ \mbox{(1)}$$ The 1-forms $\omega_{ij}$ are called the connection forms of the frame field $E_1,E_2,E_3$. Often the matrix $\omega=(\omega_{ij})$ is called the connection 1-form of the frame field $E_1,E_2,E_3$. The linearity of $\omega_{ij}$ is due to the linearity of the covariant derivative $\nabla_vE_i$ in $v$.

Proposition. The matrix $\omega$ is a skew symmetric matrix, i.e. $\omega+{}^t\omega=0$.

Proof. Since $E_i\cdot E_j=\delta_{ij}$ is constant, the directional derivative $v[E_i\cdot E_j]=0$. On the other hand, by the Leibniz rule,
\begin{align*}
v[E_i\cdot E_j]&=\nabla_vE_i\cdot E_j(p)+E_i(p)\cdot \nabla_vE_j\\
&=\omega_{ij}(v)+\omega_{ji}(v).
\end{align*}
Hence,
$$\omega_{ij}+\omega_{ji}=0.\ \ \ \ \ \mbox{(2)}$$

If $i=j$ in (2), we get $\omega_{ii}=0$. So, the connection 1-form $\omega$ is written as
$$\omega=\begin{pmatrix}
0 & \omega_{12} & \omega_{13}\\
-\omega_{12} & 0 &\omega_{23}\\
-\omega_{13} & -\omega_{23} & 0
\end{pmatrix}.\ \ \ \ \ \mbox{(3)}$$

Remark. The set of all $3\times 3$ skew symmetric matrices is denoted by $\mathfrak{o}(3)$. It is the Lie algebra of the orthogonal group $\mathrm{O}(3)$. The orthogonal group $\mathrm{O}(3)$ is the set of all $3\times 3$ orthogonal matrices and it is a Lie group. Recall that a square matrix $A$ is orthogonal if and only if $A\cdot{}^tA=I$, i.e. $A^{-1}={}^tA$.

The connection equations of the frame field $E_1$, $E_2$, $E_3$
$$\nabla_VE_i=\sum_j\omega_{ij}(V)E_j,\ i=1,2,3\ \ \ \ \ \mbox{(4)}$$
where $V$ is a vector field on $\mathbb{E}^3$ become
$$\begin{array}{ccccccc}
\nabla_VE_1&=&&&\omega_{12}(V)E_2&+&\omega_{13}(V)E_3,\\
\nabla_VE_2&=&-\omega_{12}(V)E_1& & &+&\omega_{23}(V)E_3,\\
\nabla_VE_3&=&-\omega_{13}(V)E_1&-&\omega_{23}(V)E_2.
\end{array}
$$
The connection equations are in fact a generalization of the Frenet-Serret formulas.

Let $Y$ be a vector field defined on a region containing a curve $\alpha(t)$. Then $Y_\alpha(t):=Y(\alpha(t))$ defines a vector field on the curve $\alpha(t)$. Then one can easily see that
$$\nabla_{\dot\alpha(t)}Y=\frac{d}{dt}Y_\alpha(t).$$
Let $\alpha(t)$ be a curve with unit speed. Let $E_1=T$, $E_2=N$, $E_3=B$. Then
\begin{align*}
\omega_{12}(\dot\alpha(t))&=\nabla_{\dot\alpha(t)}E_1\cdot E_2=\dot T\cdot N=(\kappa N)\cdot N=\kappa,\\
\omega_{13}(\dot\alpha(t))&=\nabla_{\dot\alpha(t)}E_1\cdot E_3=\dot T\cdot B=(\kappa N)\cdot B=0,\\
\omega_{23}(\dot\alpha(t))&=\nabla_{\dot\alpha(t)}E_2\cdot E_3=\dot N\cdot B=(-\kappa T+\tau B)\cdot B=\tau.
\end{align*}
The connection equations (4) are then nothing but the Frenet-Serret formulas
$$\begin{array}{ccccccc}
\dot T&=&&&\kappa N&&\\
\dot N&=&-\kappa T& & &+&\tau B\\
\dot B&=&&-&\tau N.
\end{array}
$$
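
The identification $\omega_{12}=\kappa$, $\omega_{13}=0$, $\omega_{23}=\tau$ along the curve can be tested numerically. The following is a minimal sketch in plain Python, assuming as a sample curve the unit-speed helix $\alpha(s)=(a\cos(s/c),a\sin(s/c),bs/c)$ with $c=\sqrt{a^2+b^2}$, whose curvature and torsion are $a/c^2$ and $b/c^2$. The values of `a`, `b` and the helper names are illustrative assumptions, and the derivatives are approximated by central finite differences.

```python
import math

# Hypothetical sample curve: the unit-speed helix
# alpha(s) = (a cos(s/c), a sin(s/c), b s/c) with c = sqrt(a^2 + b^2).
a, b = 3.0, 4.0
c = math.hypot(a, b)

def T(s):
    """Unit tangent of the helix."""
    return (-a / c * math.sin(s / c), a / c * math.cos(s / c), b / c)

def N(s):
    """Principal normal of the helix."""
    return (-math.cos(s / c), -math.sin(s / c), 0.0)

def B(s):
    """Binormal B = T x N (cross product)."""
    t, n = T(s), N(s)
    return (t[1] * n[2] - t[2] * n[1],
            t[2] * n[0] - t[0] * n[2],
            t[0] * n[1] - t[1] * n[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def derivative(field, s, h=1e-5):
    """Central finite difference of a vector field along the curve."""
    return tuple((p - m) / (2 * h) for p, m in zip(field(s + h), field(s - h)))

def connection_forms(s):
    """Return (omega_12, omega_13, omega_23) evaluated on the velocity at alpha(s)."""
    dT, dN = derivative(T, s), derivative(N, s)
    return dot(dT, N(s)), dot(dT, B(s)), dot(dN, B(s))

kappa, tau = a / c**2, b / c**2     # curvature and torsion of this helix
w12, w13, w23 = connection_forms(1.3)
```

Up to finite-difference error, `w12` and `w23` should agree with `kappa` and `tau`, and `w13` should vanish.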

The frame $E_1,E_2,E_3$ can be written in terms of the natural frame $U_1,U_2,U_3$ as
\begin{align*}
E_1&=a_{11}U_1+a_{12}U_2+a_{13}U_3,\\
E_2&=a_{21}U_1+a_{22}U_2+a_{23}U_3,\\
E_3&=a_{31}U_1+a_{32}U_2+a_{33}U_3.
\end{align*}
Each real-valued function $a_{ij}:\mathbb{E}^3\longrightarrow\mathbb{R}$ is uniquely determined by $a_{ij}=E_i\cdot U_j$. The matrix $A=(a_{ij})$ is called the attitude matrix (also called rotation matrix or orientation matrix) of the frame field $E_1,E_2,E_3$. One can clearly see that the attitude matrix $A$ is an orthogonal matrix. In the above remark, I mentioned that the set of all $3\times 3$ skew symmetric matrices is the Lie algebra $\mathfrak{o}(3)$. The Lie algebra $\mathfrak{g}$ of a Lie group $G$ is defined to be the tangent space $T_e G$ to $G$ at the identity element $e$. (A Lie group is a differentiable manifold, so it makes sense to talk about tangent spaces to $G$.)

Let $A(t)$ denote the attitude matrix evaluated along a curve in $\mathbb{E}^3$, and let us define a curve $\gamma: \mathbb{R}\longrightarrow\mathrm{O}(3)$ by
$$\gamma(t)=A(t)\cdot{}^tA(0).$$
Then $\gamma(0)=I$.
Hence $\dot{\gamma}(0)=\frac{dA(t)}{dt}|_{t=0}\cdot{}^tA(0)$ is a tangent vector to $\mathrm{O}(3)$ at the identity matrix $I$. That is, $\dot{\gamma}(0)\in\mathfrak{o}(3)$. Hence one can easily expect that the following theorem holds.

Theorem. If $A=(a_{ij})$ is the attitude matrix and $\omega=(\omega_{ij})$ the connection 1-form of a frame field $E_1, E_2, E_3$, then
$$\omega=dA\cdot{}^tA\ \ \ \ \ \mbox{(5)}$$
or equivalently
$$\omega_{ij}=\sum_k da_{ik} \cdot a_{jk}\ \mbox{for}\ i,j=1,2,3.$$

Proof. For each $v\in T_p\mathbb{E}^3$,
$$\omega_{ij}(v)=\nabla_vE_i\cdot E_j(p).$$
In terms of the natural frame field $U_i$, $i=1,2,3$,
$$E_i=\sum_ka_{ik}U_k,\ i=1,2,3.$$
So,
\begin{align*}
\nabla_vE_i&=\sum_k v[a_{ik}]U_k(p)\\
&=\sum_k da_{ik}(v) U_k(p).
\end{align*}
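Taking the dot product with $E_j(p)=\sum_ka_{jk}(p)U_k(p)$ and using the orthonormality of the natural frame, we get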
$$\omega_{ij}=\sum_k da_{ik}a_{jk},$$
i.e.
$$\omega=dA\cdot{}^tA.$$

Remark. In general, if $G$ is a Lie group then its Lie algebra $\mathfrak{g}$ is given by the set of differential $1$-forms
$$\mathfrak{g}=\{g^{-1}dg:\ g\in G\}=\{(dg^{-1})g:\ g\in G\}.$$

Example. Let us compute the connection forms of the cylindrical frame field. The attitude matrix is
$$A=\begin{pmatrix}
\cos\theta & \sin\theta & 0\\
-\sin\theta & \cos\theta & 0\\
0 & 0 & 1
\end{pmatrix}.$$ Thus
$$dA=\begin{pmatrix}
-\sin\theta d\theta & \cos\theta d\theta & 0\\
-\cos\theta d\theta & -\sin\theta d\theta & 0\\
0 & 0 & 0
\end{pmatrix}.$$
Hence,
\begin{align*}
\omega&=dA\cdot{}^tA\\
&=\begin{pmatrix}
-\sin\theta d\theta & \cos\theta d\theta & 0\\
-\cos\theta d\theta & -\sin\theta d\theta & 0\\
0 & 0 & 0
\end{pmatrix}\begin{pmatrix}
\cos\theta & -\sin\theta & 0\\
\sin\theta & \cos\theta & 0\\
0 & 0 & 1\end{pmatrix}\\
&=\begin{pmatrix}
0 & d\theta & 0\\
-d\theta & 0 & 0\\
0 & 0 & 0
\end{pmatrix}.
\end{align*}
The connection equations of the cylindrical frame field are then
\begin{align*}
\nabla_VE_1&=d\theta(V)E_2=V[\theta]E_2,\\
\nabla_VE_2&=-d\theta(V)E_1=-V[\theta]E_1,\\
\nabla_VE_3&=0
\end{align*}
for all vector fields $V$. As expected the vector field $E_3$ is parallel.
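
The matrix product above can be double-checked numerically. Since $A$ depends only on $\theta$, we may write $dA=\frac{dA}{d\theta}d\theta$, so $\omega=W\,d\theta$ with $W=\frac{dA}{d\theta}\cdot{}^tA$. The following minimal sketch in plain Python evaluates $W$ at a sample angle; the function names and the angle $0.7$ are illustrative assumptions.

```python
import math

def attitude(theta):
    """Attitude matrix A of the cylindrical frame field."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def attitude_derivative(theta):
    """Entrywise derivative dA/dtheta, so that dA = (dA/dtheta) dtheta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-s, c, 0.0],
            [-c, -s, 0.0],
            [0.0, 0.0, 0.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def connection_coefficients(theta):
    """Matrix W with omega = W dtheta, computed as (dA/dtheta) times the transpose of A."""
    return matmul(attitude_derivative(theta), transpose(attitude(theta)))

W = connection_coefficients(0.7)
```

Up to rounding, `W` has entries $W_{12}=1$, $W_{21}=-1$ and zeros elsewhere, independently of the angle, in agreement with $\omega_{12}=d\theta$ and with the skew-symmetry of $\omega$.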