# Functional Analysis 5: Completion of Metric Spaces

This lecture will conclude our discussion on metric spaces with completion of metric spaces.

Definition. Let $X=(X,d)$ and $\tilde X=(\tilde X,d)$ be two metric space. A mapping $T: X\longrightarrow\tilde X$ is said to be an isometry of $T$ preserves distances, i.e.
$$\forall x,y\in X,\ \tilde d(Tx,Ty)=d(x,y).$$ The space $X$ is said to be isometric with $\tilde X$ if there exists a bijective isometry of $X$ onto $\tilde X$.

Theorem [Completion] For a metric space $(X,d)$ there exists a complete metric space $\hat X=(\hat X,\hat d)$ which has s subspace $W$ that is isometric with $X$ and is dense in $\hat X$. $\hat X$ is called the completion of $X$ and it is unique up to isometries.

Proof. This will be a lengthy proof and I have divided it into steps.

Step 1. Construction of $\hat X=(\hat X,\hat d)$.

Let $(x_n)$ and $(y_n)$ be Cauchy sequences in $X$. We say $(x_n)$ is equivalent to $(x_n’)$, and write $(x_n)\sim (x_n’)$, if
$$\lim_{n\to\infty}d(x_n,x_n’)=0.$$ $\sim$ is actually an equivalence relation on the set of all Cauchy sequences of $X$. Clearly $\sim$ is reflexive and symmetric. Let us show that $\sim$ is also transitive. Let $(x_n)\sim (x_n’)$ and $(x_n’)\sim (x_n^{\prime\prime})$. Then
$$\lim_{n\to\infty}d(x_nx_n’)=0\ \mbox{and}\ \lim_{n\to\infty}d(x_n’,x_n^{\prime\prime})=0.$$
\begin{align*}
\lim_{n\to\infty}d(x_n,x_n^{\prime\prime})&\leq\lim_{n\to\infty}d(x_n,x_n’)+\lim_{n\to\infty}d(x_n’,x_n^{\prime\prime})\\
&=0
\end{align*}
Let $\hat X$ be the set of all equivalence classes $\hat x,\hat y,\cdots$ of Cauchy sequences. Define
$$\hat d(\hat x,\hat y)=\lim_{n\to\infty}d(x_n,y_n),$$
where $x_n\in\hat x$ and $y_n\in\hat y$. We claim that $\hat d$ is a metric on $\hat X$. It suffices to show that $\hat d$ is well-defined. The conditions (M1)-(M3) hold due to the fact that $d$ is a metric. First we show that $\hat d(\hat x,\hat y)$ exists. It follows from (M3) that
$$d(x_n,y_n)\leq d(x_n,x_m)+d(x_m,y_m)+d(y_m,y_n)$$ and so we have
$$d(x_n,y_n)-d(x_m,y_m)\leq d(x_n,x_m)+d(y_m,y_n).$$ Similarly, we obtain
$$d(x_m,y_m)-d(x_n,y_n)\leq d(x_n,x_m)+d(y_m,y_n).$$ Hence,
$$|d(x_n,y_n)-d(x_m,y_m)|\leq d(x_n,x_m)+d(y_m,y_n)\rightarrow 0$$ as $n,m\rightarrow\infty$, i.e.
$$\lim_{n,m\to\infty}|d(x_n,y_n)-d(x_m,y_m)|=0.$$
Since $\mathbb{R}$ is complete, $\displaystyle\lim_{n\to\infty}d(x_n,y_n)$ exists. Now we show that the limit is independent of the choice of representatives $(x_n)$ and $(y_n)$. If $(x_n)\sim (x_n’)$ and $(y_n)\sim(y_n’)$, then
$$|d(x_n,y_n)-d(x_n’,y_n’)|\leq d(x_n,x_m)+d(y_m,y_n)\rightarrow 0$$ as $n\to\infty$.

Step 2. Construction of an isometry $T: X\longrightarrow W\subset \hat X$.

For each $b\in X$, let $\hat b$ be the equivalence class of the Cauchy sequence $(b,b,b,\cdots)$. Then $T(b):=\hat b\in\hat X$. Now,
$$\hat d(Tb,Tc)=\hat d(\hat b,\hat c)=d(b,c).$$ So, $T$ is an isometry. An isometry is automatically injective. $T$ is onto since since $T(X)=W$. Let us show that $W$ is dense in $\hat X$. Let $\hat x\in \hat X$ and let $(x_n)\in\hat x$. Since $(x_n)$ is Cauchy, given $\epsilon>0$ $\exists N$ such that $d(x_n,x_N)<\frac{\epsilon}{2}$ $\forall n\geq N$. Let $(x_N,x_N,\cdots)\in\hat x_N$. Then $\hat x_N\in W$.
\begin{align*}
\hat d(\hat x,\hat x_N)=\lim_{n\to\infty}d(x_n,x_N)\leq\frac{\epsilon}{2}<\epsilon&\Longrightarrow \hat x_N\in B(\hat x,\epsilon)\\
&\Longrightarrow B(\hat x,\epsilon)\cap W\ne\emptyset.
\end{align*}
Hence, $\bar W=\hat X$ i.e. $W$ is dense in $\hat X$.

Step 3. Completeness of $\hat X$.

Let $(\hat x_n)$ be any Cauchy sequence in $\hat X$. Since $W$ is dense in $\hat X$, $\forall \hat x_n$, $\exists\hat z_n\in W$ such that $\hat d(\hat x_n,\hat z_n)<\frac{1}{n}$.
\begin{align*}
\hat d(\hat z_m,\hat z_n)&\leq \hat d(\hat z_n,\hat x_m)+\hat d(\hat x_m,\hat x_n)+\hat d(\hat x_n,\hat z_n)\\
&<\frac{1}{m}+\hat d(\hat x_m,\hat x_n)+\frac{1}{n}.
\end{align*}
Given $\epsilon>0$ by Archimedean property $\exists$ a positive integer $N_1$ such that $N>\frac{\epsilon}{3}$. Since $(\hat x_n)$ is a Cauchy sequence, $\exists$ a positive integer $N_2$ such that $\hat d(\hat x_m,\hat x_n)<\frac{\epsilon}{3}$ $\forall m,n\geq N$. Let $N=\max\{N_1,N_2\}$. Then $\forall m,n\geq N$, $\hat d(\hat z_m,\hat z_n)<\epsilon$ i.e. $(\hat z_m)$ is Cauchy. Since $T: X\longrightarrow W$ is an isometry and $\hat z_m\in W$, the sequence $(z_m)$, where $z_m=T^{-1}\hat z_m$, is Cauchy in $X$. Let $\hat x\in\hat X$ be the class to which $(z_m)$ belongs. Show that $\hat x$ is the limit of $(\hat x_n)$. For each $n=1,2,\cdots$,
\begin{align*}
\hat d(\hat x_n,\hat x)&\leq \hat d(\hat x_n,\hat z_n)+\hat d(\hat z_n,\hat x)\\
&<\frac{1}{n}+\hat d(\hat z_n,\hat x)\\
&=\frac{1}{n}+\lim_{m\to\infty}d(z_n,z_m)
\end{align*} since $(z_m)\in\hat x$ and $(z_n,z_n,\cdots)\in\hat z_n\in W$. This implies that $\displaystyle\lim_{n\to\infty}\hat d(\hat x_n,\hat x)=0$ i.e. the Cauchy sequence $(\hat x_n)$ in $\hat X$ has the limit $\hat x\in\hat X$. Therefore, $\hat X$ is complete.

Step 4. Uniqueness of $\hat X$ up to isometries.

Suppose that $(\tilde X,\tilde d)$ is another completion of $X$ i.e. it is a complete metric space with a subspace $\tilde W$ dense in $\tilde X$ and isometric with $X$. We show that $\hat X$ is isometric with $\tilde X$. Let $X$ is isometric with $W$ and $\tilde W$ via isometries $T$ and $\tilde T$, respectively. Then $W$ is isometric with $\tilde W$ via the isometry $\rho=\tilde T\circ T^{-1}$.

$$\begin{array}{ccc} & & W\\ & \nearrow &\downarrow\\ X & \longrightarrow & \tilde{W} \end{array}$$

Let $\hat x\in\hat X$. Then $\exists$ a sequence in $(\hat x_n)\in W$ such that $\displaystyle\lim_{n\to\infty}\hat x_n=\hat x$. $(\hat x_n)$ is a Cauchy sequence and $\rho$ is an isometry, so $(\tilde x_n)$, where $\tilde x_n:=\rho\hat x_n$, is a Cauchy sequence in $\tilde W\subset \tilde X$. Since $\tilde X$ is complete, $\exists\tilde x\in\tilde X$ such that $\displaystyle\lim_{n\to\infty}\tilde x_n=\tilde x$. Define a mapping $\psi:\hat X\longrightarrow\tilde X$ by $\psi\hat x=\tilde x$. Then we claim that $\hat X$ is isometric with $\tilde X$ via $\psi$.

Step A. $\psi$ is well-defined.

It suffices to show that $T\hat x$ does not depend on the choice of $(\hat x_n)\in W$ such that $\displaystyle\lim_{n\to\infty}\hat x_n=\hat x$. Let $(\hat x_n’)$ be another sequence in $W$ such that $\displaystyle\lim_{n\to\infty}\hat x_n’=\hat x$. Then $(\tilde x_n’)$, where $\tilde x_n’=\rho\hat x_n’$, is a Cauchy sequence in $\tilde W$ and so $\exists\tilde x’\in\tilde X$ such that $\displaystyle\lim_{n\to\infty}\tilde x_n’=\tilde x’$. Now,
\begin{align*}
\tilde d(\tilde x,\tilde x’)&=\lim_{n\to\infty}\tilde d(\tilde x_n,\tilde x_n’)\\
&=\lim_{n\to\infty}\hat d(\hat x_n,\hat x_n’)\ (\rho\ \mbox{is an isometry})\\
&=\hat d(\hat x,\hat x)\\
&=0.
\end{align*}
Hence, $\tilde x=\tilde x’$.

Step B. $\psi$ is onto.

Let $\tilde x\in\tilde X$. Then $\exists$ a sequence $(\tilde x_n)$ in $\tilde W$ such that $\displaystyle\lim_{n\to\infty}\tilde x_n=\tilde x$. $(\tilde x_n)$ is Cauchy (since it is a convergent sequence) and $\rho^{-1}$ is an isometry, so the sequence $(\hat x_n)\subset \hat X$, where $\hat x_n=\rho^{-1}\tilde x_n$, is Cauchy. Since $\hat X$ is complete, $\exists\hat x\in\hat X$ such that $\displaystyle\lim_{n\to\infty}\hat x_n=\hat x$. Clearly $\psi\hat x=\tilde x$ and hence $\psi$ is onto.

Step C. $\psi$ is an isometry.

Let $\hat x,\hat y\in\hat X$. Then $\exists$ sequences $(\hat x_n)$, $(\hat y_n)$ in $W$ such that $\displaystyle\lim_{n\to\infty}\hat x_n=\hat x$ and $\displaystyle\lim_{n\to\infty}\hat y_n=\hat y$, respectively.
\begin{align*}
\hat d(\hat x,\hat y)&=\lim_{n\to\infty}\hat d(\hat x_n,\hat y_n)\\
&=\lim_{n\to\infty}\tilde d(\tilde x_n,\tilde y_n)\ (\tilde x_n:=\rho\hat x_n,\ \tilde y_n:=\rho y_n)\\
&=\tilde d(\tilde x,\tilde y)\ (\lim_{n\to\infty}\tilde x_n=\tilde x,\ \lim_{n\to\infty}\tilde y_n=\tilde y)\\
&=\tilde d(\psi\hat x,\psi\hat y).
\end{align*}
Thus, $\psi$ is an isometry.

Remember that an isometry from a metric space into another metric space is automatically one-to-one. Therefore, $\hat X$ is isometric with $\tilde X$ via $\psi$.

# Group Theory 6: Lagrange’s Theorem

In this lecture, we study Lagrange’s Theorem which is named after the Italian/French mathematician, physicists and astronomer Joseph-Louis Lagrange. It states that the order of a subgroup $H$ of a finite group $G$ divides the order of $G$. Lagrange’s Theorem has many important applications in group theory. Before we discuss the theorem we first need to study an important class of binary relations called equivalence relations.

Definition. A binary relation $R$ on a nonempty set $S$ is a subset $R\subset S\times S$. If $(a,b)\in R$, we say $a$ is $R$-related to $b$ and write $aRb$. A binary relation $R$ on a set $S$ is called an equivalence relation on $S$ if it satisfies:

1. $aRa$ $\forall a\in S$. ($R$ is reflexive.)

2. $\forall a,b,\in S$, $aRb\Longrightarrow bRa$. ($R$ is symmetric.)

3. $\forall a,b,c\in S$, $aRb$ and $bRc$ $\Longrightarrow$ $aRc$. ($R$ is transitive.)

Examples. 1. Define a binary relation $\equiv\mod n$ on $\mathbb{Z}$, the set of integers by
$$\forall a,b\in\mathbb{Z},\ a\equiv b\mod n\ \mbox{if}\ n|(a-b).$$ Then $\equiv\mod n$ is an equivalence relation in $\mathbb{Z}$, called the congruence relation modulo $n$. If $a\equiv b\mod n$, we say $a$ is congruent to $b$ modulo $n$. Note that $n|(a-b)$ if and only if $a-b=nk$ for some $k\in\mathbb{Z}$ if and only if $a$ and $b$ have the same remainder when they are divided by $n$ if and only if $a-b\in n\mathbb{Z}$ where
$$n\mathbb{Z}=\{nk: k\in\mathbb{Z}\},$$ the set of all integer multiples of $n$. $\forall n\in\mathbb{Z}$, $n\mathbb{Z}$ is a subgroup of the additive group $(\mathbb{Z},+)$.

2. Let $G$ be a group and $H\leq G$. Define a binary relation $R$ on $G$ by
$$\forall a,b\in G,\ aRb\ \mbox{if}\ ab^{-1}\in H.$$ Then $R$ is an equivalence relation on $G$.

Definition. If $R$ is an equivalence relation on a set $S$, then $[a]$, the equivalence class of $a\in S$, is defined by
$$[a]=\{b\in S: bRa\}.$$

Theorem. Let $R$ be an equivalence relation on a set $S$. Then

1. $S=\bigcup_{a\in S}[a]$.

2. $\forall a,b\in S$, either $[a]=[b]$ or $[a]\cap [b]=\emptyset$.

This means that an equivalence relation $R$ on a set $S$ partitions $S$ into equivalence classes. The set of all equivalence classes $\{[a]: a\in S\}$ is denoted by $S/R$ and is called the quotient set of $S$ modulo $R$.

The converse of the above theorem is also true, namely

Theorem. Let $\wp=\{A_i:i\in I\}$ be a partition of a set $S$ i.e. $S=\bigcup_{i\in I}A_i$ and $\forall i\ne j\in I$, $A_i\cap A_j=\emptyset$. Define a binary relation $R$ on $S$ by
$$\forall a,b\in S,\ aRb\ \mbox{if}\ a,b\in A_i\ \mbox{for some}\ i\in I.$$ Then $R$ is an equivalence relation on $S$, called the equivalence relation induced by the partition $\wp$.

Example. Let $f: X\longrightarrow Y$ be a surjective map. Then $\{f^{-1}(y): y\in Y\}$ forms a partition of $X$. The equivalence relation induced by this partition is
$$\ker f=\{(a,b)\in X\times X: f(a)=f(b)\}.$$ This equivalence relation is called the kernel of the function $f$.

Let $G$ be a group and $H\leq G$. Consider the equivalence relation $R$ on $G$ given by
$$\forall a,b\in G,\ aRb\ \mbox{if}\ ab^{-1}\in H.$$
Then for each $a\in G$, $[a]=Ha$ where $Ha=\{ha:h\in H\}$. $Ha$ is called a right coset of $H$ in $G$.

Example. Consider $\equiv\mod n$, the congruence relation modulo $n$ on $\mathbb{Z}$. For each $a\in \mathbb{Z}$, $[a]=n\mathbb{Z}+a$. The right coset $n\mathbb{Z}+a$ is called the congruence class of $a$ modulo $n$.

Now we are ready to discuss Lagrange’s Theorem.

Theorem [Lagrange's Theorem] If $G$ is a finite group and $H\leq G$, then $|H|||G|$.

Proof. $G=Ha_1\cup Ha_2\cup\cdots\cup Ha_k$ and $Ha_i\cup Ha_j=\emptyset$ if $i\ne j$, $i,j=1,2,\dots, k$. Due to the cancellation law, we see that $|H|=|Ha_i|$ for all $i=1,2,\cdots, k$. Therefore,
\begin{align*}
|G|&=\sum_{i=1}^k|Ha_i|\\
&=\sum_{i=1}^k|H|\\
&=k|H|.
\end{align*}
This completes the proof.

If $G$ is finite, the number of right cosets of $H$ in $G$, namely $\frac{|G|}{|H|}$ is called the index of $H$ in $G$ and is written as $|G:H|$ or $i_G(H)$. Here, we use the notation $|G:H|$.

It should be noted that the converse of Lagrange’s Theorem need not be true. For example, the alternating group of index 4
$$A_4=\{1,(123),(132),(124),(142),(134),(143),(234),(243),(12)(34),(13)(24),(14)(23)\}$$ has no subgroup of order 6 though $|A_4|=12$ and $6|12$.

Theorem. A group $G$ of prime order is cyclic.

Proof. Let $|G|=p$, a prime. Let $H\leq G$. Then by Lagrange’s Theorem, $|H|||G|$ and so either $|H|=1$ or $|H|=p$. If $H\ne\{e\}$, then $H=G$. If $a\ne e\in G$, $\langle a\rangle=G$.

Let $G$ be a finite group and $a\ne e\in G$. Then $H=\langle a\rangle\leq G$. So, $a^m=e$ for some $m\in\mathbb{N}$.

Definition. If $G$ is finite, then the order of $a\ne e\in G$, written $|a|$ or $o(a)$, is the least positive integer $n$ such that $a^n=e$.

Corollary. If $G$ is finite and $a\ne e\in G$, then $|\langle a\rangle|||G|$.

Theorem. If $G$ is a finite group of order $n$, then $a^n=e$ for all $a\in G$.

Proof. Let $a\ne e\in G$ and $|\langle a\rangle|=k$. Then by Lagrange’s Theorem $k|n$ i.e. $n=kl$ for some $l\in\mathbb{N}$. So,
$$a^n=a^{kl}=(a^k)^l=e^l=e.$$

# Group Theory 5: Subgroups

In this lecture, we study subgroups. A nonempty subset $H$ of a group $G$ is called a subgroup of $G$ if $H$ itself forms a group relative to the operation $\cdot$ in $G$. If $H$ is a subgroup of $G$, we simply write $H\leq G$. Clearly $G\leq G$ and $\{e\}\leq G$. $G$ and $\{e\}$ are called trivial subgroups of $G$. Before we discuss subgroups further we need to go over some basic properties of a group as we need them.

Lemma. If $G$ is a group, then:

(a) Its identity element is unique.

(b) Each $a\in G$ has a unique inverse $a^{-1}\in G$.

(c) $\forall a\in G$, $(a^{-1})^{-1}=a$.

(d) $\forall a,b\in G$, $(ab)^{-1}=b^{-1}a^{-1}$.

The proof of this lemma is pretty much straightforward and is left to readers.

In order to show that a nonempty subset $H$ of a group $G$ is a subgroup we need to check:

1. $H$ is closed under $\cdot$, the operation in $G$. That is, $\forall a,b\in H$, $ab\in H$.

2. $e\in H$.

3. $\forall a\in H$, $a^{-1}\in H$.

Note that the associative law holds automatically. It turns out that there is a simpler criterion to check if a nonempty subset is a subgroup of a group.

Lemma. A nonempty subset $H\subset G$ is a subgroup of $G$ if and only of $\forall a,b\in H$, $ab^{-1}\in H$.

Proof. ($\Longrightarrow$) Clear.

($\Longleftarrow$) Let $a,b\in H$. Then by assumption, $aa^{-1}=e\in H$. Since $a,e\in H$, $ea^{-1}=a^{-1}\in H$. Since $b^{-1}\in H$, $a(b^{-1})^{-1}=ab\in H$. Therefore, $H$ is a subgroup of $G$.

Example. $\forall n\in\mathbb{N}\cup\{0\}$, $n\mathbb{Z}\leq(\mathbb{Z},+)$. Here, $n\mathbb{Z}=\{nx: x\in\mathbb{Z}\}$. In fact, they are all the subgroups of $(\mathbb{Z},+)$.

Definition. The cyclic subgroup of $G$ generated by $a\in G$ is $\{a^k: k\in\mathbb{Z}\}$. It is denoted by $\langle a\rangle$.

Example. Let $G$ be a group and $a\in G$. Let $C(a)=\{g\in G: ga=ag\}$. Then $C(a)\leq G$. $C(a)$ is called the centralizer of $a$ in $G$.

Example. Let $G$ be a group and let $Z(G)=\{z\in G: zx=xz\ \forall x\in G\}$. Then $Z(G)$ is a subgroup of $G$, called the center of $G$.

Example. Let $G$ be a group and $H\leq G$. $\forall a\in G$, $a^{-1}Ha=\{a^{-1}ha: h\in H\}$ is also a subgroup of $G$.

Lemma. Suppose that $(G,\cdot)$ is a group and $H$ is a finite nonempty subset of $H$. If $H$ is closed under $\cdot$, then $H$ is a subgroup of $G$.

Proof. Let $H=\{a_1,a_2,\cdots,a_n\}$. Then $\forall a\in H$, $aH=\{aa_1,aa_2,\cdots,aa_n\}\subset H$. On the other hand, if $aa_i=aa_j$ then $a_i=a_j$. So, $|aH|=|H|$. This implies that $aH=H$ and hence
\begin{align*}
a\in aH &\Longrightarrow a=aa_i\ \mbox{for some}\ i\\
&\Longrightarrow a_i=e\in H.
\end{align*}
Since $e\in aH$, $e=aa_j$ for some $j$ $\Longrightarrow$ $a_j=a^{-1}$.

Example. The symmetric group $S_3=\{1,(123),(132),(23),(13),(12)\}$ has 5 subgroups $\{1\}$, $\{1,(12)\}$, $\{1,(13)\}$, $\{1,(23)\}$, and $A_3=\{1,(123),(132)\}$. $A_3$ is called alternating group of degree 3. Cycles of length 2 such as (12), (13), (23) are called transpositions. Any permutation can be written as a product of either an even number of transpositions or an odd number of transpositions. If a permutation is a product of an even number of transpositions, the permutation is called an even permutation. If a permutation is a product of an odd number of transpositions, the permutation is called an odd permutation. The identity permutation 1 is an even permutation as $\begin{pmatrix} 1 & 2 & 3\\ 1 & 2 & 3 \end{pmatrix}$ can be written as (12)(21)(23)(32)(31)(13). The permutation (123) can be written as (12)(13). So, (123) is an even permutation. Similarly (132) is an even permutation as well. So it turns out that the alternating group $A_3$ is the set of all even permutations. In general, the set $A_n$ of all even permutations of $\{1,2,3,\cdots,n\}$ form a subgroup of the symmetric group $S_n$. Remember that $S_n$ has order $n!$. Since there are exactly the same number of even permutations and odd permutations, the alternating group $A_n$ of degree $n$ has order $\frac{n!}{2}$.

Example. $\mathrm{SL}(n,\mathbb{R})=\{A\in\mathrm{GL}(n,\mathbb{R}): \det A=1\}$ is a subgroup of the general linear group $\mathrm{GL}(n,\mathbb{R})$ of degree $n$. $\mathrm{SL}(n,\mathbb{R})$ is called the special linear group of degree $n$. The general linear group $\mathrm{GL}(n,\mathbb{R})$ is the group of isometries of $\mathbb{R}^n$. An isometry of $\mathbb{R}^n$ is a linear bijective map from $\mathbb{R}^n$ onto itself which preserves the inner product. What this means is that if $T: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ is an isometry then for any two vectors $v,w\in\mathbb{R}^n$, $\langle Tv,Tw\rangle=\langle v,w\rangle$. The special linear group $\mathrm{SL}(n,\mathbb{R})$ is the group of all orientation preserving isometries of $\mathbb{R}^n$.

# Functional Analysis 4: Convergence, Cauchy Sequence, Completeness

The set $\mathbb{Q}$ of rational numbers is not complete (or not a continuum) since it has gaps or holes. For instance, $\sqrt{2}$ is not in $\mathbb{Q}$. On the other hand, the set $\mathbb{R}$ of real numbers has no gaps or holes, so it is complete (or is a continuum). Let $(x_n)$ be a sequence of real numbers. Suppose that $(x_n)$ converges to a real number $x$. Then by the triangle inequality, for any $m,n\in\mathbb{N}$, we have
$$|x_m-x_n|\leq |x_m-x|+|x-x_n|.$$
Hence, $\displaystyle\lim_{m,n\to\infty}|x_m-x_n|=0$, i.e. $(x_n)$ is a Cauchy sequence. Conversely, Georg Cantor introduced the completeness axiom that every Cauchy sequence of real numbers converges and defined a real number as the limit of a Cauchy sequence of rational numbers. For instance, consider the Cauchy sequence $(x_n)$ defined by
$$x_1=1,\ x_{n+1}=\frac{x_n}{2}+\frac{1}{x_n},\ \forall n\geq 2.$$
If $(x_n)$ converges to a number $x$, it would satisfy $x^2=2$ i.e. $(x_n)$ converges to $\sqrt{2}$. There is another way to obtain the completeness of $\mathbb{R}$ by a Dedekind cut, though we are not going to delve into that here.

More generally, one can also consider a complete metric space and that is what we are going to study in this lecture.

Definition. A sequence $(x_n)$ is a metric space $(X,d)$ is said to converge or to be convergent to $x\in X$ if
$$\lim_{n\to\infty}d(x_n,x)=0.$$
$x$ is called the limit if $(x_n)$ and we write
$$\lim_{n\to\infty}x_n=x\ \mbox{or}\ x_n\rightarrow x.$$
If $(x_n)$ is not convergent, it is sad to be divergent. We can generalize the definiton of the convergence of a sequence we learned in calculus in terms of a metric as:

Definition. $\displaystyle\lim_{n\to\infty}d(x_n,x)=0$ if and only if given $\epsilon>0$ $\exists$ a positive integer $N$ s.t. $x_n\in B(x,\epsilon)$ $\forall n\geq N$.

A nonempty subset $M\subset X$ is said to be bounded if
$$\delta(M)=\sup_{x,y\in M}d(x,y)<\infty.$$

Lemma. Let $(X,d)$ be a metric space.

(a) A convergent sequence in $X$ is bounded and its limit is unique.

(b) If $x_n\rightarrow x$ and $y_n\rightarrow y$, then $d(x_n,y_n)\rightarrow d(x,y)$.

Proof. (a) Suppose that $x_n\rightarrow x$. Then one can find a positive integer $N$ such that $d(x_n,x)<1$ $\forall n\geq N$. Let $M=2\max\{d(x_1,x),\cdots,d(x_{N-1},x),1\}$. Then for all $m,n\in\mathbb{N}$,
\begin{align*}
d(x_m,x_n)&\leq d(x_m,x)+d(x,x_n)\ (\mbox{ (M3) triangle inequality)}\\
&\leq M.
\end{align*}
This means that $\delta((x_n))\leq M<\infty$ i.e. $(x_n)$ is bounded.

Suppose that $x_n\rightarrow x$ and $x_n\rightarrow y$. Then
\begin{align*}
0\leq d(x,y)&\leq d(x,x_n)+d(x_ny)\\
&\rightarrow 0
\end{align*}
as $n\to\infty$. So, $d(x,y)=0\Rightarrow x=y$ by (M1).

(b) By (M3),
$$d(x_n,y_n)\leq d(x_n,x)+d(x,y)+d(y,y_n)$$
and so we obtain
$$d(x_n,y_n)-d(x,y)\leq d(x_n)+d(y,y_n).$$
Similarly, we also obtain the inequality
$$d(x,y)-d(x_n,y_n)\leq d(x,x_n)+d(y_n,y).$$
Hence,
$$0\leq |d(x_n,y_n)-d(x,y)|\leq d(x_n,x)+d(y_n,y)\rightarrow 0$$
as $n\to\infty$.

Definition. A sequence $(x_n)\subset (X,d)$ is said to be Cauchy if given $\epsilon>0$ $\exists$ a positive integer $N$ such that
$$d(x_m,x_n)<\epsilon\ \forall m,n\geq N.$$
The space $X$ is said to be complete if every Cauchy sequence in $X$ converges.

Examples. The real line $\mathbb{R}$ and the complex plane $\mathbb{C}$ are complete.

Theorem. Every convergent sequence is Cauchy.

Proof. Suppose that $x_n\rightarrow x$. Then given $\epsilon>0$ $\exists$ a poksitive integer $N$ s.t. $d(x_n,x)<\frac{\epsilon}{2}$ for all $n\geq N$. Now, $\forall m,n\geq N$
$$d(x_m,x_n)\leq d(x_m,x)+d(x,x_n)<\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon.$$
Therefore, $(x_n)$ is Cauchy.

Theorem. Let $M$ be a nonempty subset of a metric space $(X,d)$. Then

(a) $x\in\bar M\Longleftrightarrow \exists$ a seqence $(x_n)\subset M$ such that $x_n\rightarrow x$.

(b) $M$ is closed $\Longleftrightarrow$ given a sequence $(x_n)\subset M$, $x_n\rightarrow x$ implies $x\in M$.

Proof. (a) ($\Longrightarrow$) Since $x\in\bar M$, $\forall n\in\mathbb{N}$ $\exists x_n\in B\left(x,\frac{1}{n}\right)\cap M\ne\emptyset$. Let $\epsilon>0$ be given. Then by the Archimedean property, $\exists$ a positive integer $N$ s.t. $N\geq\frac{1}{\epsilon}$. Now,
$$n\geq N\Longrightarrow d(x_n,x)<\frac{1}{n}\leq\frac{1}{N}<\epsilon.$$

($\Longleftarrow$) Suppose that $\exists$ a sequence $(x_n)\subset M$ s.t. $x_n\rightarrow x$. Then given $\epsilon>0$ $\exists$ a positive integer $N$ s.t. $x_n\in B(x,\epsilon)$ $\forall n\geq N$. This means that $\forall\epsilon>0$, $B(x,\epsilon)\cap M\ne\emptyset$. So, $x\in\bar M$.

(b) ($\Longrightarrow$) Clear

($\Longleftarrow$) It suffices to show that $\bar M\subset M$. Let $x\in\bar M$. Then $\exists$ a sequence $(x_n)\subset M$ such that $x_n\rightarrow x$. By assumption, $x\in M$.
Theorem. A subspace $M$ of a complete metric space $X$ itself is complete if and only if $M$ is closed in $X$.

Proof. ($\Longrightarrow$) Let $M\subset X$ be complete. Let $(x_n)$ be a sequence in $M$ such that $x_n\rightarrow x$. Then $(x_n)$ is Cauchy. Since $M$ is complete, every Cauchy sequence must converge and hence $x\in M$. This means that $M$ is closed.

($\Longleftarrow$) Suppose that $M\subset X$ is closed. Let $(x_n)$ be a Cauchy sequence in $M\subset X$. Since $X$ is complete, $\exists x\in X$ such that $x_n\rightarrow x$. Since $M$ is closed, $x\in M$. Therefore, $M$ is complete.

Example. In $\mathbb{R}$ with Euclidean metric, the closed intervals $[a,b]$ are complete. $\mathbb{Z}$, the set of integers is also complete by the above theorem since it is closed in $\mathbb{R}$. One can directly see why $\mathbb{Z}$ is complete without quoting the theorem though. Let $(x_n)$ be a Cauchy sequence in $\mathbb{Z}$. Then we see that there exists a positive integer $N$ such that $x_N=x_{N+1}=x_{N+2}=\cdots$. Hence any Cauchy sequence in $\mathbb{Z}$ is a convergent sequence in $\mathbb{Z}$. Therefore, $\mathbb{Z}$ is complete.

Theorem. A mapping $T: X\longrightarrow Y$ is continuous at $x_0\in X$ if and only if $x_n\rightarrow x$ implies $Tx_n\rightarrow Tx_0$.

Proof. ($\Longrightarrow$) Suppose that $T$ is continuous and $x_n\rightarrow x$ in $X$. Let $\epsilon>0$ be given. Then $\exists\delta>0$ s.t. whenever $d(x,x_0)<\delta$, $d(Tx,Tx_0)<\epsilon$. Since $x_n\rightarrow x$, $\exists$ a positive integer $N$ s.t. $d(x_n,x_0)<\delta$ $\forall n\geq N$. So, $\forall n\geq N$, $d(Tx_n,Tx_0)<\epsilon$. Hence, $Tx_n\rightarrow Tx_0$.

($\Longleftarrow$) Suppose that $T$ is not continuous. Then $\exists\epsilon>0$ s.t. $\forall\delta>0$, $\exists x\ne x_0$ satisfying $d(x,x_0)<\delta$ but $d(Tx,tx_0)\geq\epsilon$. So, $\forall n=1,2,\cdots$, $\exists x_n\ne x_0$ satisfying $d(x_n,x_0)<\frac{1}{n}$ but $d(Tx_n,Tx_0)\geq\epsilon$. This means that $x_n\rightarrow x_0$ but $Tx_n\not\rightarrow Tx_0$.

# Group Theory 4: Examples of Groups

In the first lecture here, we defined a group. We have also seen an example of a group here, which is a symmetric group. In this lecture, we study some well-known examples of groups of finite and infinite orders. Recall that the order of a group os the number of elements in the group.

Examples of Groups

1. $(\mathbb{Z},+)$, $(\mathbb{Q},+)$, $(\mathbb{R},+)$, and $(\mathbb{C},+)$ are abelian groups of infinite order.
2. $(\mathbb{Q}\setminus\{0\},\cdot)$ is an abelian group of infinite order.
3. $(\mathbb{R}^+,\cdot)$ is an abelian group of infinite order. Here $\mathbb{R}^+$ denotes the set of all positive real numbers.
4. Let $E_n=\{e^{\frac{2k\pi i}{n}}: k=0,1,2,\cdots,n-1\}$. $e^{\frac{2k\pi i}{n}}$, $k=0,1,2,\cdots,n-1$ are the $n$-th roots of unity i.e. the zeros of $z^n=1$. $E_n$ forms an abelian group of order $n$. It is generated by a single element $e^{\frac{2\pi i}{n}}$. Such a group is called a cyclic group.

Definition. A group $G$ is said to be a finite group if it has a finite number of elements. The number of elements in $G$ is called the order of $G$ as mentioned previously and it is denoted by $|G|$ or $\mathrm{ord}(G)$. We use the notation $|G|$ for the order of the group $G$.

Example.Let $\mathbb{Z}_n=\{0,1,2,\cdots,n-1\}$. $\mathbb{Z}_n$ is the set of all possible remainders when an integer is divided by 2. The addition $+$ on $\mathbb{Z}_n$ is defined naturally as follows.
\begin{aligned} \mathbb{Z}_2\\ \begin{array}{|c||c|c|} \hline + & 0 & 1\\ \hline \hline 0 & 0 & 1\\ \hline 1 & 1 & 0\\ \hline \end{array} \end{aligned}
\begin{aligned} \mathbb{Z}_3\\ \begin{array}{|c||c|c|c|} \hline + & 0 & 1 & 2\\ \hline \hline 0 & 0 & 1 & 2\\ \hline 1 & 1 & 2 & 0\\ \hline 2 & 0 & 1 & 2\\ \hline \end{array} \end{aligned}
$(\mathbb{Z}_n,+)$ is an abelian group of order $n$. We will see later that the group $E_n$ and $\mathbb{Z}_n$ are indeed the same group.

Example. The set $M_{m\times n}(\mathbb{R})$ of all $m\times n$ matrices of real entries under matrix addition is an abelian group of infinite order.

Example. The set $\mathrm{GL}(n,\mathbb{R})$ of alll $n\times n$ non-singular matrices of real entries under matrix multiplication is a non-abelian group of infinite order. $\mathrm{GL}(n,\mathbb{R})$ is called the general linear group of degree $n$. It can be viewed as the set of all invertible linear transformations $T: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ that preserve the standard Euclidean inner product. A map $T: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ is said to be an isometry if it is a linear isomorphism (i.e linear, one-to-one and onto, but remember from linear algebra that a linear map $T: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ is one-to-one, it is also onto) and it preserves inner product i.e. for any vectors $v$ and $w$ in $\mathbb{R}$, $\langle Tv,Tw\rangle=\langle v,w\rangle$.

Example. [Klein four-group (Vierergruppe in German)]

Klein four-group

Let $V=\{e,a,b,c\}$ where $a$ is counterclockwise rotation of the rectangle in the picture about the $x$-axis by $180^\circ$, $b$ is counterclockwise rotation of the rectangle about the $y$-axis by $180^\circ$ and $c$ is counterclockwise rotation of the rectangle about the $z$-axis (the axis coming out of the origin toward you) by $180^\circ$. Then $a$, $b$ and $c$ satisfy the following relationship
\begin{align*}
a^2=b^2=c^2=e,\ ab&=ba=c,\ bc=cb=a,\\
ca&=ac=b.
\end{align*}
Here, $\cdot$ is the function composition i.e. successive application of rotations. By labling the vertices of the rectangle as 1, 2, 3, 4 as seen in the picture, we can define each rotation as a permutation of $\{1,2,3,4\}$.
\begin{align*}
e&=\begin{pmatrix}
1 & 2 & 3 & 4\\
1 & 2 & 3 & 4
\end{pmatrix},\ a=\begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 1 & 4 &3
\end{pmatrix},\\
b&=\begin{pmatrix}
1 & 2 & 3 & 4\\
4 & 3 & 2 & 1
\end{pmatrix},\ c=\begin{pmatrix}
1 & 2 & 3 & 4\\
3 & 4 & 1 &2
\end{pmatrix}.
\end{align*}
For instance, we calculate $ab$:
\begin{align*}
ab&=\begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 1 & 4 &3
\end{pmatrix}\begin{pmatrix}
1 & 2 & 3 & 4\\
4 & 3 & 2 & 1
\end{pmatrix}\\
&=\begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 1 & 4 &3
\end{pmatrix}\begin{pmatrix}
4 & 3 & 2 & 1\\
3 & 4 & 1 & 2
\end{pmatrix}\\
&=\begin{pmatrix}
1 & 2 & 3 & 4\\
3 & 4 & 1 &2
\end{pmatrix}\\
&=c.
\end{align*}

Example. [The $n$-th dihedral group $D_n$, $n\geq 3$]

Consider an equilateral triangle shown in the following picture.

Dihedral group D3

$\rho$ is counterclockwise rotation about the axis coming out of the origin by $\frac{360^\circ}{3}=120^\circ$ and $\mu_i$, $i=1,2,3$ is counterclockwise rotation about the axis of rotation through each vertex $i$ by $180^\circ$. As permutations of the vertices $\{1,2,3\}$, $\rho$ and $\mu_i$ , $i=1,2,3$ are given by
\begin{align*}
\rho&=\begin{pmatrix}
1 & 2 & 3\\
2 & 3 & 1
\end{pmatrix},\ \mu_1=\begin{pmatrix}
1 & 2 & 3\\
1 & 3 & 2
\end{pmatrix}\\
\mu_2&=\begin{pmatrix}
1 & 2 & 3\\
3 & 2 & 1
\end{pmatrix},\ \mu_3=\begin{pmatrix}
1 & 2 & 3\\
2 & 1 &3
\end{pmatrix}.
\end{align*}
$D_3=\{e,\rho,\rho^2,\mu_1,\mu_2,\mu_3\}$ is a group called the 3rd dihedral group. $$\mu_1\mu_2=\rho^2\ne\rho=\mu_2\mu_1.$$
So, we see that $D_3$ is not an abelian group. $D_3$ is the group of symmetries of an equilateral triangle.

Now this time let us consider a square as shown in the following picture.

Dihedral group D4

$\rho$ is counterclockwise rotation about the $z$-axis coming out of the origin by $\frac{360^\circ}{4}=90^\circ$. $\mu_i$, $i=1,2$ are counterclockwise rotations about $y$-axis and $x$-axis, respectively by $180^\circ$. $\delta_i$, $i=1,2$ are counterclockwise rotations about the axis through the vertices 2 and 4 and the vertices through 1 and 3, respectively by $180^\circ$. As permutations of the vertices $\{1,2,3,4\}$, $\rho$, $\mu_i$ and $\delta_i$, $i=1,2$ are given by
\begin{align*}
\rho&=\begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 3 & 4 & 1
\end{pmatrix},\\
\mu_1&=\begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 1 & 4 & 3
\end{pmatrix},\ \mu_2=\begin{pmatrix}
1 & 2 & 3 & 4\\
4 & 3 & 2 & 1
\end{pmatrix},\\
\delta_1&=\begin{pmatrix}
1 & 2 & 3 & 4\\
3 & 2 & 1 & 4
\end{pmatrix},\ \delta_2=\begin{pmatrix}
1 & 2 & 3 & 4\\
1 & 4 & 3 & 2
\end{pmatrix}.
\end{align*}
$D_4=\{e,\rho,\rho^2,\rho^3,\mu_1,\mu_2,\delta_1,\delta_2\}$ is the group of symmetries of a square, called the 4th dihedral group. It is also called the octic group. Note that $|D_n|=2n$, $n\geq 3$.

# Functional Analysis 3: Basic (Metric) Topology

Let $(X,d)$ be a metric space.

Definition. A subset $U\subset X$ is said to be open if $\forall x\in U$ $\exists\epsilon>0$ s.t. $B(x,\epsilon)\subset U$.

If $U\subset X$ is open then $U$ can be expressed as union of open balls $B(x,\epsilon)$. Hence, the set of all open balls in $X$, $\mathcal{B}=\{B(x,\epsilon): x\in X,\ \epsilon>0\}$ form a basis for a topology (a metric topology, the topology induced by the metric $d$) on $X$. Those who have not studied topology before may simply understand it as the set of all open sets in $X$.

Definition. A subset $F\subset X$ is said to be closed if its complement, $F^c=X\setminus F$ is open in $X$.

The following is the definition of a continuous function that you are familiar with from calculus. The definition is written in terms of metrics.

Definition. Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces. A mapping $T:X\longrightarrow Y$ is said to be continuous at $x_0\in X$ if $\forall\epsilon>0$ $\exists\delta>0$ s.t $d_Y(Tx,Tx_0)<\epsilon$ whenever $d_X(x,x_0)<\delta$.

$T$ is said to be continuous if it is continuous at every point of $X$.

The above definition can be generalized in terms of open sets as follows.

Theorem. A mapping $T: (X,d_X)\longrightarrow(Y,d_Y)$ is continuous if and only if $\forall$ open set $U$ in $Y$, $T^{-1}U$ is open in $X$.

Proof. (Only if, $\Rightarrow$) Suppose that $T:X\longrightarrow Y$ is continuous. Let $U$ be open in $Y$. Then we show that $T^{-1}U$ is open in $X$. Let $x_0\in T^{-1}U$. Then $Tx_0\in U$. Since $U$ is open in $Y$, $\exists\epsilon>0$ s.t. $B(Tx_0,\epsilon)\subset U$. By the continuity of $T$, for this $\epsilon>0$ $\exists\delta>0$ s.t. whenever $d(x,x_0)<\delta$, $d(Tx,Tx_0)<\epsilon$. This means that
$$TB(x_0,\delta)\subset B(Tx_0,\epsilon)\subset U\Longrightarrow B(x_0,\delta)\subset T^{-1}(TB(x_0,\delta))\subset T^{-1}U.$$ Hence, $T^{-1}U$ is open in $X$.

(If, $\Leftarrow$) Suppose that $\forall$ open set $U$ in $Y$, $T^{-1}U$ is open in $X$. We show that $T$ is continuous. Let $x_0\in X$ and let $\epsilon>0$ be given. Then $B(Tx_0,\epsilon)$ is open in $Y$. So by the assumption, $x_0\in T^{-1}B(Tx_0,\epsilon)$ is open in $X$. This means that $\exists\delta>0$ s.t.
$$B(x_0,\delta)\subset T^{-1}B(Tx_0,\epsilon)\Longrightarrow TB(x_0,\delta)\subset T(T^{-1}B(Tx_0,\epsilon))\subset B(Tx_0,\epsilon).$$ This is equivalent to saying that $\exists\delta>0$ s.t. whenever $d(x,x_0)<\delta$, $d(Tx,Tx_0)<\epsilon$. That is, $T$ is continuous at $x_0$. Since the choice $x_0\in X$ was arbitrary, the proof is complete.

Let $A\subset X$. $x\in X$ is called an accumulation point or a limit point of $A$ if $\forall$ open set $U(x)$ in $X$, $(U(x)-\{x\})\cap A\ne\emptyset$. Here the notation $U(x)$ means that it contains $x$. The set of all accumulation points of $A$ is denoted by $A’$ and is called the derived set of $A$. $\bar A:=A\cup A’$ is called the closure of $A$. $\bar A$ is the smallest closed set containing $A$.

Theorem. Let $A\subset X$. Then $x\in\bar A$ if and only if $\forall$ open set $U(x)$, $U(x)\cap A\ne\emptyset$.

Definition. $D\subset X$ is said to be dense if $\bar D=X$. This means that $\forall$ open set $U$ in $X$, $U\cap D\ne\emptyset$.

Definition. $X$ is said to be separable if it has a countable dense subset.

Examples. The real line $\mathbb{R}$ is separable. The complex plane $\mathbb{C}$ is also separable.

Theorem. The space $\ell^\infty$ is not separable.

Proof. Let $y=(\eta_1,\eta_2,\eta_3,\cdots)$ be a sequence of zeros and ones. Then $y\in\ell^\infty$. We can then associate $y$ with the binary representation
$$\hat y=\frac{\eta_1}{2}+\frac{\eta_2}{2^2}+\frac{\eta_3}{2^3}+\cdots\in [0,1].$$ Each $\hat y\in [0,1]$ has a binary representation and different $\hat y$’s have different binary representations. So, there are uncountably many sequences of zeros and ones. If $y$ and $z$ are sequences of zeros and ones and $y\ne z$, then $d(y,z)=1$. This means that for any two distinct sequences $y$ and $z$ of zeros and ones, $B\left(y,\frac{1}{3}\right)\cap B\left(z,\frac{1}{3}\right)=\emptyset$. Let $A$ be a dense subset of $\ell^\infty$. Then for each sequence $y$ of zeros and ones, $B\left(y,\frac{1}{3}\right)$ has at least one element of $A$. This means that $A$ cannot be countable.

Theorem. The space $\ell^p$ with $1\leq p<\infty$ is separable.

Proof. Let $A$ be the set of all sequences $y$ of the form
$$y=(\eta_1,\eta_2,\cdots,\eta_n,0,0,\cdots,0),$$ where $n$ is a positive integer and the $\eta_j$’s are rational. For each $n=1,2,\cdots$, the number of sequences of the form $y=(\eta_1,\eta_2,\cdots,\eta_n,0,0,\cdots,0)$ is the same as the number of functions from $\{1,2,3,\cdots,n\}$ to $\mathbb{Q}$, the set of all rational numbers. $\mathbb{Q}$ has the cardinality $\aleph_0$ and so the number is $\aleph_0^n=\aleph_0$. The cardinality of $A$ is then $\aleph_0\cdot\aleph_0=\aleph_0$ i.e. $A$ is countable. Now we show that $A$ is dense in $\ell^p$. Let $x=(\xi_j)\in\ell^p$. Let $\epsilon>0$ be given. Since $\displaystyle\sum_{j=1}^\infty|\xi_j|^p<\infty$, $\exists$ a positive integer $N$ s.t. $\displaystyle\sum_{j=N+1}^\infty|\xi_j|^p<\frac{\epsilon^p}{2}$. Since rationals are dense in $\mathbb{R}$, one can find $y=(\eta_1,\eta_2,\cdots,\eta_N,0,0,\cdots)\in A$ s.t. $\displaystyle\sum_{j=1}^N|\xi_j-\eta_j|^p<\frac{\epsilon^p}{2}$. Hence,
$$[d(x,y)]^p=\sum_{j=1}^N|\xi_j-\eta_j|^p+\sum_{j=N+1}^\infty|\xi_j|^p<\epsilon^p,$$
i.e. $d(x,y)<\epsilon$. This means that $y\in B(x,\epsilon)\cap A\ne\emptyset$. This completes the proof.

# Group Theory 3: Preliminaries (Basic Number Theory)

In this lecture, we study some basic number theory as it is needed to study group theory.

Let $\mathbb{Z}$ denote the set of integers. $\mathbb{Z}$ satisfies well-ordering principle, namely any non-empty set of nonnegative integers has a smallest member.

One of the most fundamental theorems regarding numbers is Euclid’s Algorithm. Although we will not discuss its proof, it can be proved using well-ordering principle.

Theorem. [Euclid's Algorithm] If $m$ and $n$ are integers with $n>0$, then $\exists$ integers $q$ and $r$ with $0\leq r<n$ such that $m=qn+r$.

Euclid’s algorithm hints us how we can define the notion that one integer divides another.

Definition. Given $m\ne 0, n\in\mathbb{Z}$, we say $m$ divides $n$ and write $m|n$ if $n=cm$ for some $c\in\mathbb{Z}$.

Examples. $2|14$, $(-7)|14$, $4|(-16)$.

If $m|n$, we call $m$ a divisor or a factor of $n$, and $n$ a multiple of $m$. To indicate $m$ is not a divisor of $n$, we write $m\not|n$. For example, $3\not|5$.

Lemma. The following properties hold.

(a) $1|n$ $\forall n$.

(b) If $m\ne 0$ then $m|0$.

(c) If $m|n$ and $n|q$, then $m|q$.

(d) If $m|n$ and $m|q$ then $m|(\mu n+\nu q)$ $\forall \mu,\nu$.

(e) If $m|1$ then $m=\pm 1$.

(f) If $m|n$ and $n|m$ then $m=\pm n$.

Definition. Given $a,b$ (not both 0), their greatest common divisor (in short gcd) $c$ is defined by the following properties:

(a) $c>0$

(b) $c|a$ and $c|b$

(c) If $d|a$ and $d|b$ then $d|c$.

If $c$ is the gcd of $a$ and $b$, we write $c=(a,b)$.

$(24,9)=3$. Note that the gcd 3 can be written in terms of 24 and 9 as $3\cdot 9+1\cdot (-24)$ or $(-5)9+2\cdot 24$. In general, we have the following theorem holds.

Theorem. If $a,b$ are not both 0, their gcd exists uniquely. Moreover, $\exists m,n\in\mathbb{Z}$ s.t. $c=ma+nb$.

Now let us talk about how to find the gcd of two positive numbers $a$ and $b$. W.L.O.G. (Without Loss Of Generality), we may assume that $b<a$. Then by Euclid’s algorithm we have
$$a=bq+r,\ \mbox{where}\ 0\leq r<b.$$
Let $c=(a,b)$. Then $c|r$, so $c$ is a common divisor of $b$ and $r$. If $d$ is a common divisor of $b$ and $r$, it is also a common divisor of $a$ and $b$. This implies that $d\leq c$ and so $c=(b,r)$. Finding $(b,r)$ is of course easier because one of the numbers is smaller than before.

Example. [Finding GCD]
\begin{aligned} (100,28)&=(28,16)\ &(100&=28\cdot 3+16)\\ &=(16,12)\ &(28&=16\cdot 1+12)\\ &=(12,4)\ &(16&=12\cdot 1+14)\\ &=4. \end{aligned}
By working backward, we can also find integers $m$ and $n$ such that
$$4=m\cdot 100+n\cdot 28.$$
\begin{align*}
4&=16+12(-1)\\
&=16+(-1)[28+(-1)16]\\
&=(-1)28+2\cdot 16\\
&=(-1)28+2[100+(-3)28]\\
&=2\cdot 100+(-7)28.
\end{align*}
Therefore, $m=2$ and $n=-7$.

Definition. We say that $a$ and $b$ are relatively prime if $(a,b)=1$.

Theorem. The integers $a$ and $b$ are relatively prime if and only if $1=ma+nb$ for some $m$ and $n$.

Theorem. If $(a,b)=1$ and $a|bc$ then $a|c$.

Theorem. If $b$ and $c$ are both relatively prime to $a$, then $bc$ is also relatively prime to $a$.

Definition. A prime number, or shortly prime, is an integer $p>1$ such that $\forall a\in\mathbb{Z}$, either $p|a$ or $(p,a)=1$.

Suppose that $p$ is a prime as defined above and $p=ab$, where $1\leq a<p$. Then $p\not|a$ since $a<p$, so $(p,a)=1$. This implies that $p|b$. On the other hand, $b|p(=ab)$ and hence $p=b$ and $a=1$. So, the above definition coincides with the definition of a prime we are familiar with.

Theorem. If $p$ is a prime and $p|a_1a_2\cdots a_n$, then $p|a_i$ for some $i$ with $1\leq i\leq n$.

Proof. If $p|a_1$, we are done. If not, $(p,a_1)=1$ and so $p|a_2a_3\cdots a_n$. Continuing this, we see that $p|a_i$ for some $i$.

Regarding primes, we have the following theorems.

Theorem. If $n>1$, then either $n$ is a prime or the product of primes.

Theorem. [Unique Factorization Theorem] Given $n>1$, there is a unique way to write $n$ in the form $n=p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}$, where $p_1<p_2<\cdots<p_k$ are primes and the exponents $a_1,\cdots,a_k$ are all positive.

Theorem. [Euclid] There is an infinite number of primes.

# Functional Analysis 2: $\ell^p$ and $L^p$ as Metric Spaces

Let $p\geq 1$ be a fixed number and let
$$\ell^p=\left\{x=(\xi_j): \sum_{j=1}^\infty|\xi_j|^p<\infty\right\}.$$
Define $d:\ell^p\times\ell^p\longrightarrow\mathbb{R}^+\cup\{0\}$ by
$$d(x,y)=\left(\sum_{j=1}^\infty|\xi_j-\eta_j|^p\right)^{\frac{1}{p}}.$$
Then $(\ell^p,d)$ is a metric space. The properties (M1) and (M2) are clearly satisfied. We prove the remaining property (M3) the triangle inequality. $p=1$ case can be easily shown by the triangle inequality of numbers. We need a few steps to do this. First we prove the following inequality: $\forall\alpha>0,\beta>0$,
$$\alpha\beta\leq\frac{\alpha^p}{p}+\frac{\beta^q}{q},$$
where $p>1$ and $\frac{1}{p}+\frac{1}{q}=1$. The numbers $p$ and $q$ are called conjugate exponenets. It follows from $\frac{1}{p}+\frac{1}{q}=1$ that $(p-1)(q-1)=1$ i.e. $\frac{1}{p-1}=q-1$. If we let $u=t^{p-1}$ then $t=u^{\frac{1}{p-1}}=u^{q-1}$. By comparing areas, we obtain
$$\alpha\beta\leq\int_0^{\alpha}t^{p-1}dt+\int_0^{\beta}u^{q-1}du=\frac{\alpha^p}{p}+\frac{\beta^q}{q}.$$
Next, using this inequality we prove the Hölder inequality
$$\sum_{j=1}^\infty|\xi_j\eta_j|\leq\left(\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}\left(\sum_{m=1}^\infty|\eta_m|^q\right)^{\frac{1}{q}}$$
where $p>1$ and $\frac{1}{p}+\frac{1}{q}=1$. When $p=2$ and $q=2$, we obtain the well-known Cauchy-Schwarz inequality.

Proof. Let $(\tilde\xi_j)$ and $(\tilde\eta_j)$ be two sequences such that
$$\sum_{j=1}^\infty|\tilde\xi_j|^p=1,\ \sum_{j=1}^\infty|\tilde\eta_j|^q=1.$$
Let $\alpha=|\tilde\xi_j|$ and $\beta=|\tilde\eta_j|$. Then by the inequality we proved previously,
$$|\tilde\xi_j\tilde\eta_j|\leq\frac{|\tilde\xi_j|^p}{p}+\frac{|\tilde\eta_j|^q}{q}$$
and so we obtain
$$\sum_{j=1}^\infty|\tilde\xi_j\tilde\eta_j|\leq\sum_{j=1}^\infty\frac{|\tilde\xi_j|^p}{p}+\sum_{j=1}^\infty\frac{|\tilde\eta_j|^q}{q}=1.$$
Now take any nonzero $x=(\xi_j)\in\ell^p$, $y=(\eta_j)\in\ell^q$. Setting
$$\tilde\xi_j=\frac{\xi_j}{\left(\displaystyle\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}},\ \tilde\eta_j=\frac{\eta_j}{\left(\displaystyle\sum_{m=1}^\infty|\eta_m|^q\right)^{\frac{1}{q}}}.$$
results the Hölder inequality.

Next, we prove the Minkowski inequality
$$\left(\sum_{j=1}^\infty|\xi_j+\eta_j|^p\right)^{\frac{1}{p}}\leq\left(\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}+\left(\sum_{m=1}^\infty|\eta_m|^p\right)^{\frac{1}{p}}$$
where $x=(\xi_j)\,y=(\eta_j)\in\ell^p$ and $p\geq 1$. $p=1$ case comes from the triangle inequality for numbers. Let $p>1$. Then
\begin{align*}
|\xi_j+\eta_j|^p&=|\xi_j+\eta_j||\xi_j|\eta_j|^{p-1}\\
&=(|\xi_j|+|\eta_j|)|\xi_j+\eta_j|^{p-1}\ (\mbox{triangle inequality for numbers}).
\end{align*}
For a fixed $n$, we have
$$\sum_{j=1}^n|\xi_j+\eta_j|^p\leq\sum_{j=1}^n|\xi_j||\xi_j+\eta_j|^{p-1}+\sum_{j=1}^n|\eta_j||\xi_j+\eta_j|^{p-1}.$$
Using the Hölder inequality, we get the following inequality
\begin{align*}
\sum_{j=1}^n|\xi_j||\xi_j+\eta_j|^{p-1}&\leq \sum_{j=1}^\infty |\xi_j||\xi_j+\eta_j|^{p-1}\\
&\leq\left(\sum_{k=1}^\infty |\xi_k|^p\right)^{\frac{1}{p}}\left(\sum_{m=1}^\infty(|\xi_m+\eta_m|^{p-1})^q\right)^{\frac{1}{q}}\ (\mbox{Hölder})\\
&=\left(\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}\left(\sum_{m=1}^\infty|\xi_m+\eta_m|^p\right)^{\frac{1}{q}}.
\end{align*}
Similarly we also get the inequality
$$\sum_{j=1}^n|\eta_j||\xi_j+\eta_j|^{p-1}\leq \left(\sum_{k=1}^\infty|\eta_k|^p\right)^{\frac{1}{p}}\left(\sum_{m=1}^\infty|\xi_m+\eta_m|^p\right)^{\frac{1}{q}}.$$
Combining these two inequalities, we get
$$\sum_{j=1}^n|\xi_j+\eta_j|^p\leq\left\{\left(\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}+\left(\sum_{k=1}^\infty|\eta_k|^p\right)^{\frac{1}{p}}\right\}\left(\sum_{m=1}^\infty|\xi_m+\eta_m|^p\right)^{\frac{1}{q}}$$
and by taking the limit $n\to \infty$ on the left hand side, we get
$$\sum_{j=1}^\infty|\xi_j+\eta_j|^p\leq\left\{\left(\sum_{k=1}^\infty|\xi_k|^p\right)^{\frac{1}{p}}+\left(\sum_{k=1}^\infty|\eta_k|^p\right)^{\frac{1}{p}}\right\}\left(\sum_{m=1}^\infty|\xi_m+\eta_m|^p\right)^{\frac{1}{q}}.$$
Finally dividing this inequality by $\displaystyle\left(\sum_{m=1}^\infty|\xi_m+\eta_m|^p\right)^{\frac{1}{q}}$ results the Minkowski inequality. The Minkowski inequality tells that
$$d(x,y)=\left(\sum_{j=1}^\infty|\xi_j-\eta_j|^p\right)^{\frac{1}{p}}<\infty$$
for $x,y\in\ell^p$. Let $x=(\xi_j), y=(\eta_j),\ z=(\zeta_j)\in\ell^p$. Then
\begin{align*}
d(x,y)&=\left(\sum_{j=1}^\infty|\xi_j-\eta_j|^p\right)^{\frac{1}{p}}\\
&\leq\left(\sum_{j=1}^\infty[|\xi_j-\zeta_j|+|\zeta_j-\eta_j|]^p\right)^{\frac{1}{p}}\\
&\leq\left(\sum_{j=1}^\infty|\xi_j-\zeta_j|^p\right)^{\frac{1}{p}}+\left(\sum_{j=1}^\infty|\zeta_j-\eta_j|^p\right)^{\frac{1}{p}}\\
&=d(x,z)+d(z,y).
\end{align*}

A measurable function $f$ on a closed interval $[a,b]$ is said to belong to $L^p$ if $\int_a^b|f(t)|^p dt<\infty$. $L^p$ is a vector space. For functions $f,g\in L^p$, we define
$$d(f,g)=\left\{\int_a^b|f(t)-g(t)|^pdt\right\}^{\frac{1}{p}}.$$
Then clearly (M2) symmetry is satisfied and one can also prove that (M3) triangle inequality holds. However, (M1) is not satisfied since what we have is that if $d(f,g)=0$ then $f=g$ a.e. (almost everywhere) i.e. the set $\{t\in[a,b]: f(t)\ne g(t)\}$ has measure $0$. It turns out that $=$ a.e. is an equivalence relation on $L^p$, so by considering $f\in L^p$ as its equivalence class $[f]$, $d$ can be defined as a metric on $L^p$ (actually the quotient space of $L^p$). Later, we will be particularly interested in the case when $p=2$ in which case $L^p$ as well as $\ell^p$ become Hilbert spaces. Those of you who want to know details about $L^p$ space are referred to

Real Analysis, H. L. Royden, 3rd Edition. Macmillan Publishing Company, 1988

# Group Theory 2: Preliminaries (Functions)

In my previous notes here, I mentioned some about logical symbols. The logical symbols I will use often are $\forall$ which means “for all”, “for any”, “for each”, or “for every” depending on the context, $\exists$ which means “there exists”, and $\ni$ which means “such that” (don’t be confused with $\in$ which means “be an element of”). We also use s.t. for “such that.” There are also $\Longrightarrow$ which means “implies” and $\Longleftrightarrow$ which means “if and only if.” I guess these pretty much cover what we use most of time.

Now lets review about functions in a more formal way. Let $X$ and $Y$ be two non-empty sets. The the Cartesian product $X\times Y$ of $X$ and $Y$ is defined as the set
$$X\times Y=\{(x,y): x\in X,\ y\in Y\}.$$
A subset $f$ of the Cartesian product $X\times Y$ (we write $f\subset X\times Y$) is called a graph from $X$ to $Y$. A graph $f\subset X\times Y$ is called a function from $X$ to $Y$ (we write $f: X\longrightarrow Y$) if whenever $(x,y_1),(x,y_2)\in f$, $y_1=y_2$. If $f: X\longrightarrow Y$ and $(x,y)\in f$, we also write $y=f(x)$. A function $f: X\longrightarrow Y$ is said to be one-to-one or injective if whenever $(x_1,y),(x_2,y)\in f$, $x_1=x_2$. This is equivalent to saying $f(x_1)=f(x_2)$ implies $x_1=x_2$. A function $f: X\longrightarrow Y$ is said to be onto or surjective if $\forall y\in Y$ $\exists x\in X$ s.t. $(x,y)\in f$. A function $f: X\longrightarrow Y$ is said to be one-to-one and onto (or bijective) if it is both one-to-one and onto (or both injective and surjective).

Let $f: X\longrightarrow Y$ and $g: Y\longrightarrow Z$ be two functions. Then the composition or the composite function $g\circ f: X\longrightarrow Z$ is defined by $g\circ f(x)=g(f(x))$ $\forall x\in X$. The function composition $\circ$ may be considered as an operation and it is associative.

Lemma. If $h: X\longleftrightarrow Y$, $g:Y\longleftrightarrow Z$ and $f:Z\longleftrightarrow W$, then $f\circ(g\circ h)=(f\circ g)\circ h$.

Note that $\circ$ is not commutative i.e. it is not necessarily true that $f\circ g=g\circ f$ even when both $f\circ g$ and $g\circ f$ are defined.

The following lemmas will be useful when we study group theory later.

Lemma. If both $f: X\longrightarrow Y$ and $g: Y\longrightarrow Z$ are one-to-one, so is $g\circ f: X\longrightarrow Z$.

Lemma. If both $f: X\longrightarrow Y$ and $g: Y\longrightarrow Z$ are onto, so is $g\circ f: X\longrightarrow Z$.

As an immediate consequence of combining these two lemmas, we obtain

Lemma. If both $f: X\longrightarrow Y$ and $g: Y\longrightarrow Z$ are bijective, so is $g\circ f: X\longrightarrow Z$.

If $f\subset X\times Y$, then the inverse graph $f^{-1}\subset Y\times X$ is defined by
$$f^{-1}=\{(y,x)\in Y\times X: (x,y)\in f\}.$$
If $f: X\longrightarrow Y$ is one-to-one and onto (bijective) then its inverse graph $f^{-1}$ is a function $f^{-1}: Y\longrightarrow X$. The inverse $f^{-1}$ is also one-to-one and onto.

Lemma. If $f: X\longrightarrow Y$ is a bijection, then $f\circ f^{-1}=\imath_Y$ and $f^{-1}\circ f=\imath_X$, where $\imath_X$ and $\imath_Y$ are the identity mappings of $X$ and $Y$, respectively.

Let $A(X)$ be the set of all one-to-one functions of $X$ onto $X$ itself. Then $(A(X),\circ)$ is a group. If $X$ is a finite set of $n$-elements (we may conveniently say $X=\{1,2,\cdots,n\})$, then $(A(X),\circ)$ is a finite group of order $n!$, called the symmetric group of degree $n$. The symmetric group of degree $n$ is denoted by $S_n$ and the elements of $S_n$ are called permutations.

# Analytic Continuation

The function $f(z)=\displaystyle\frac{1}{1+z}$ has an isolated singularity at $z=-1$. It has the Maclaurin series representation

$$f(z)=\sum_{n=0}^\infty(-1)^nz^n$$
for $|z|<1$. The power series $f_1(z)=\displaystyle\sum_{n=0}^\infty(-1)^nz^n$ converges only on the open unit disk $D_1:\ |z|<1$. For instance, the series diverges at $z=\frac{3}{2}i$ i.e. $f_1\left(\frac{3}{2}i\right)$ is not defined. The first 25 partial sums of the series $f_1\left(\frac{3}{2}i\right)$ are listed below and they do not appear to be approaching somewhere.

S[1] = 1.
S[2] = 1. – 1.500000000 I
S[3] = -1.250000000 – 1.500000000 I
S[4] = -1.250000000 + 1.875000000 I
S[5] = 3.812500000 + 1.875000000 I
S[6] = 3.812500000 – 5.718750000 I
S[7] = -7.578125000 – 5.718750000 I
S[8] = -7.578125000 + 11.36718750 I
S[9] = 18.05078125 + 11.36718750 I
S[10] = 18.05078125 – 27.07617188 I
S[11] = -39.61425781 – 27.07617188 I
S[12] = -39.61425781 + 59.42138672 I
S[13] = 90.13208008 + 59.42138672 I
S[14] = 90.13208008 – 135.1981201 I
S[15] = -201.7971802 – 135.1981201 I
S[16] = -201.7971802 + 302.6957703 I
S[17] = 455.0436554 + 302.6957703 I
S[18] = 455.0436554 – 682.5654831 I
S[19] = -1022.848225 – 682.5654831 I
S[20] = -1022.848225 + 1534.272337 I
S[21] = 2302.408505 + 1534.272337 I
S[22] = 2302.408505 – 3453.612758 I
S[23] = -5179.419137 – 3453.612758 I
S[24] = -5179.419137 + 7769.128706 I
S[25] = 11654.69306 + 7769.128706 I

Also shown below are the graphics of partial sums of the series $f_1\left(\frac{3}{2}i\right)$.

The first 10 partial sums

The first 20 partial sums

The first 30 partial sums

Let us expand $f(z)=\displaystyle\frac{1}{1+z}$ at $z=i$. Then we obtain
\begin{align*}
f(z)&=\frac{1}{1+z}\\
&=\frac{1}{1+i}\cdot\frac{1}{1+\frac{z-i}{1+i}}\\
&=\sum_{n=0}^\infty (-1)^n\frac{(z-i)^n}{(1+i)^{n+1}}
\end{align*}
for $|z-i|<\sqrt{2}$. Let $f_2(z)=\displaystyle\sum_{n=0}^\infty (-1)^n\frac{(z-i)^n}{(1+i)^{n+1}}$. This series converges only on the open disk $D_2:\ |z-i|<\sqrt{2}$, in particular at $z=\frac{3}{2}i$ and $f_2\left(\frac{3}{2}i\right)=f\left(\frac{3}{2}i\right)=\frac{4}{13}-\frac{6}{13}i$. The first 25 partial sums of the series $f_2\left(\frac{3}{2}i\right)$ are listed below and it appears that they are approaching a number. In fact, they are approaching the complex number $f\left(\frac{3}{2}i\right)=\frac{4}{13}-\frac{6}{13}i$.

S[1] = 0.5000000000 – 0.5000000000 I
S[2] = 0.2500000000 – 0.5000000000 I
S[3] = 0.3125000000 – 0.4375000000 I
S[4] = 0.3125000000 – 0.4687500000 I
S[5] = 0.3046875000 – 0.4609375000 I
S[6] = 0.3085937500 – 0.4609375000 I
S[7] = 0.3076171875 – 0.4619140625 I
S[8] = 0.3076171875 – 0.4614257812 I
S[9] = 0.3077392578 – 0.4615478516 I
S[10] = 0.3076782227 – 0.4615478516 I
S[11] = 0.3076934814 – 0.4615325928 I
S[12] = 0.3076934814 – 0.4615402222 I
S[13] = 0.3076915741 – 0.4615383148 I
S[14] = 0.3076925278 – 0.4615383148 I
S[15] = 0.3076922894 – 0.4615385532 I
S[16] = 0.3076922894 – 0.4615384340 I
S[17] = 0.3076923192 – 0.4615384638 I
S[18] = 0.3076923043 – 0.4615384638 I
S[19] = 0.3076923080 – 0.4615384601 I
S[20] = 0.3076923080 – 0.4615384620 I
S[21] = 0.3076923075 – 0.4615384615 I
S[22] = 0.3076923077 – 0.4615384615 I
S[23] = 0.3076923077 – 0.4615384616 I
S[24] = 0.3076923077 – 0.4615384615 I
S[25] = 0.3076923077 – 0.4615384615 I

The following graphics shows that the real parts of the partial sums of the series $f_2\left(\frac{3}{2}i\right)$ are approaching $\frac{3}{14}$ (blue line).

The real parts of the first 25 partial sums

The next graphics shows that the imaginary parts of the partial sums of the series $f_2\left(\frac{3}{2}i\right)$  are approaching $-\frac{6}{13}$ (blue line).

The imaginary parts of the first 25 partial sums

Also shown below is the graphics of the first 25 partial sums of the series $f_2\left(\frac{3}{2}i\right)$. They are approaching the complex number $f\left(\frac{3}{2}i\right)=\frac{4}{13}-\frac{6}{13}i$ (the intersection of horizontal and vertical blue lines).

The first 25 partial sums

Note that $f_1(z)=f_2(z)$ on $D_1\cap D_2$. Define $F(z)$ as

$$F(z)=\left\{\begin{array}{ccc} f_1(z) & \mbox{if} & z\in D_1,\\ f_2(z) & \mbox{if} & z\in D_2. \end{array}\right.$$

Analytic continuation

Then $F(z)$ is analytic in $D_1\cup D_2$. The function $F(z)$ is called the analytic continuation into $D_1\cup D_2$ of either $f_1$ or $f_2$, and $f_1$ and $f_2$ are called elements of $F$. The function $f_1(z)$ can be continued analytically to the punctured plane $\mathbb{C}\setminus\{-1\}$ and the function $f(z)=\frac{1}{1+z}$ is indeed the analytic continuation into $\mathbb{C}\setminus\{-1\}$ of $f_1$. In general, whenever analytic continuation exists it is unique.