Category Archives: Linear Algebra

Determinants II: Determinants of Order $n$

A determinant of order $n$ can be calculated by expanding it in terms of determinants of order $n-1$. Let $A=(a_{ij})$ be an $n\times n$ matrix and let us denote by $A_{ij}$ the $(n-1)\times (n-1)$ matrix obtained by deleting the $i$-th row and the $j$-th column from $A$.

Then $\det A$ is given by the Laplace expansion
\begin{align*}
\det A&=(-1)^{i+1}a_{i1}\det A_{i1}+\cdots+(-1)^{i+n}a_{in}\det A_{in}\\
&=(-1)^{1+j}a_{1j}\det A_{1j}+\cdots+(-1)^{n+j}a_{nj}\det A_{nj}.
\end{align*}

All the properties of the determinants of order 2 we studied here still hold in general for the determinants of order $n$. In particular,

Theorem. Let $A^1,\cdots,A^n$ be column vectors of dimension $n$. They are linearly dependent if and only if
$$\det(A^1,\cdots,A^n)=0.$$

Example. Let us calculate the determinant of the following $3\times 3$ matrix
$$A=\begin{pmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{pmatrix}.$$
You may use any column or row to calculate $\det A$ using the Laplace expansion. In this example, we use the first row to calculate $\det A$. By the Laplace expansion,
\begin{align*}
\det A&=a_{11}\det A_{11}-a_{12}\det A_{12}+a_{13}\det A_{13}\\
&=a_{11}\left|\begin{array}{cc}
a_{22} & a_{23}\\
a_{32} & a_{33}
\end{array}\right|-a_{12}\left|\begin{array}{cc}
a_{21} & a_{23}\\
a_{31} & a_{33}
\end{array}\right|+a_{13}\left|\begin{array}{cc}
a_{21} & a_{22}\\
a_{31} & a_{32}
\end{array}\right|.
\end{align*}
Let us redo this calculation with
$$A=\begin{pmatrix}
2 & 1 & 0\\
1 & 1 & 4\\
-3 & 2 & 5
\end{pmatrix}.$$
Since the first row and the third column each contain a 0, it is convenient to expand along the first row or the third column.
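Expanding along the first row, for instance, gives
$$\det A=2\left|\begin{array}{cc}
1 & 4\\
2 & 5
\end{array}\right|-1\left|\begin{array}{cc}
1 & 4\\
-3 & 5
\end{array}\right|+0\left|\begin{array}{cc}
1 & 1\\
-3 & 2
\end{array}\right|=2(-3)-17+0=-23.$$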

For $3\times 3$ matrices, there is a quicker way to calculate the determinant, as shown in the following figure. You multiply the three entries along each indicated arrow; when you multiply the three entries along a red arrow, you also multiply by $-1$. This is called the Rule of Sarrus, named after the French mathematician Pierre Frédéric Sarrus. Please be warned that the rule of Sarrus works only for $3\times 3$ matrices.

The Rule of Sarrus
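Written out as a formula, the rule of Sarrus says
$$\det A=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}-a_{13}a_{22}a_{31}-a_{11}a_{23}a_{32}-a_{12}a_{21}a_{33}.$$
Applied to the matrix in the example above, it gives $10-12+0-0-16-5=-23$, in agreement with the Laplace expansion.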

Example. [Cross Product] Let $v=v_1E_1+v_2E_2+v_3E_3$ and $w=w_1E_1+w_2E_2+w_3E_3$ be two vectors in Euclidean 3-space $\mathbb{R}^3$. The cross product is defined by
$$v\times w=\left|\begin{array}{ccc}
E_1 & E_2 & E_3\\
v_1 & v_2 & v_3\\
w_1 & w_2 & w_3
\end{array}\right|.$$
Note that the cross product is perpendicular to both $v$ and $w$.
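Expanding the determinant along the first row gives the familiar component formula
$$v\times w=(v_2w_3-v_3w_2)E_1-(v_1w_3-v_3w_1)E_2+(v_1w_2-v_2w_1)E_3,$$
from which one can check directly that $(v\times w)\cdot v=(v\times w)\cdot w=0$.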

Clearly, if there are many 0 entries in a given determinant, it is easier to calculate: there are fewer terms that you actually have to compute in the Laplace expansion. For any given determinant, we can indeed make this happen. Recall the theorem we studied here:

Theorem. If one adds a scalar multiple of one column (row) to another column (row), then the value of the determinant does not change.

Using the particular column (row) operation in the Theorem, we can turn a given determinant into one with more 0 entries.

Example. Find
$$\left|\begin{array}{cccc}
1 & 3 & 1 & 1\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|.$$

Solution. By the above Theorem,
\begin{align*}
\left|\begin{array}{cccc}
1 & 3 & 1 & 1\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
2 & 1 & 5 & 2\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|\ (\mbox{add $-1$ times row 3 to row 1})\\
&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
0 & 3 & 1 & -4\\
1 & -1 & 2 & 3\\
4 & 1 & -3 & 7
\end{array}\right|\ (\mbox{add $-2$ times row 3 to row 2})\\
&=\left|\begin{array}{cccc}
0 & 4 & -1 & -2\\
0 & 3 & 1 & -4\\
1 & -1 & 2 & 3\\
0 & 5 & -11 & -5
\end{array}\right|\ (\mbox{add $-4$ times row 3 to row 4})\\
&=\left|\begin{array}{ccc}
4 & -1 & -2\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|\ (\mbox{Laplace expansion along column 1})
\end{align*}
You may compute the resulting determinant of order 3 using the rule of Sarrus, or you may simplify it further. For instance, you may do:
\begin{align*}
\left|\begin{array}{ccc}
4 & -1 & -2\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|&=\left|\begin{array}{ccc}
7 & 0 & -6\\
3 & 1 & -4\\
5 & -11 & -5
\end{array}\right|\ (\mbox{add row 2 to row 1})\\
&=\left|\begin{array}{ccc}
7 & 0 & -6\\
3 & 1 & -4\\
38 & 0 & -49
\end{array}\right|\ (\mbox{add 11 times row 2 to row 3})\\
&=\left|\begin{array}{cc}
7 & -6\\
38 & -49
\end{array}\right|\ (\mbox{Laplace expansion along column 2})\\
&=-115.
\end{align*}

Determinants I: Determinants of Order 2

Let $A=\begin{pmatrix}
a & b\\
c & d
\end{pmatrix}$. Then we define the determinant $\det A$ by
$$\det A=ad-bc.$$
$\det A$ is also denoted by $|A|$ or $\left|\begin{array}{ccc}
a & b\\
c & d
\end{array}\right|$. In terms of the column vectors $A^1,A^2$, the determinant of $A$ may also be written as $\det(A^1,A^2)$.

Example. If $A=\begin{pmatrix}
2 & 1\\
1 & 4
\end{pmatrix}$, then $\det A=8-1=7$.

Property 1. The determinant $\det (A^1,A^2)$ may be considered as a bilinear map of the column vectors. As a bilinear map, $\det (A^1,A^2)$ is linear in each slot. For example, if $A^1=C+C'$ then
$$\det(A^1,A^2)=\det(C,A^2)+\det(C',A^2).$$
If $x$ is a number,
$$\det(xA^1,A^2)=x\det(A^1,A^2).$$

Property 2. If the two columns are equal, the determinant is 0.

Property 3. $\det I=\det (E^1,E^2)=1$.

Combining Properties 1-3, we can show that:

Theorem. If one adds a scalar multiple of one column to another, then the value of the determinant does not change.

Proof. We prove the theorem for a particular case.
\begin{align*}
\det(A^1+xA^2,A^2)&=\det(A^1,A^2)+x\det(A^2,A^2)\\
&=\det(A^1,A^2).
\end{align*}

Theorem. If the two columns are interchanged, the determinant changes by a sign i.e.
$$\det(A^2,A^1)=-\det(A^1,A^2).$$

Proof. \begin{align*}
0&=\det(A^1+A^2,A^1+A^2)\\
&=\det(A^1,A^1)+\det(A^1,A^2)+\det(A^2,A^1)+\det(A^2,A^2)\\
&=\det(A^1,A^2)+\det(A^2,A^1).
\end{align*}

Theorem. $\det A=\det {}^tA$.

Proof. This theorem can be proved directly from the definition of $\det A$.

Remark. Because of this theorem, we can also say that if one adds a scalar multiple of one row to another row, then the value of the determinant does not change.

Theorem. The column vectors $A^1,A^2$ are linearly dependent if and only if $\det(A^1,A^2)=0$.

Proof. Suppose that $A^1,A^2$ are linearly dependent. Then there exist numbers $x,y$, not both equal to 0, such that $xA^1+yA^2=0$. Let us say $x\ne 0$. Then $A^1=-\frac{y}{x}A^2$. So,
\begin{align*}
\det(A^1,A^2)&=\det\left(-\frac{y}{x}A^2,A^2\right)\\
&=-\frac{y}{x}\det(A^2,A^2)\\
&=0.
\end{align*}
To prove the converse, suppose that $A^1,A^2$ are linearly independent. Then $E^1,E^2$ can be written as linear combinations of $A^1,A^2$:
\begin{align*}
E^1&=xA^1+yA^2,\\
E^2&=zA^1+wA^2.
\end{align*}
Now,
\begin{align*}
1&=\det(E^1,E^2)\\
&=xw\det(A^1,A^2)+yz\det(A^2,A^1)\\
&=(xw-yz)\det(A^1,A^2).
\end{align*}
Hence, $\det(A^1,A^2)\ne 0$.

Orthogonal Bases

Let $V$ be a vector space with a positive definite scalar product $\langle\ ,\ \rangle$. A basis $\{v_1,\cdots,v_n\}$ of $V$ is said to be orthogonal if $\langle v_i,v_j\rangle=0$ if $i\ne j$. In addition, if $||v_i||=1$ for all $i=1,\cdots,n$, then the basis is said to be orthonormal.

Example. The unit vectors $E_1,\cdots,E_n$ form an orthonormal basis of $\mathbb{R}^n$.

Why is having an orthonormal basis a big deal? To answer this question, let us suppose that $e_1,\cdots,e_n$ is an orthonormal basis of a vector space $V$. Let $v,w\in V$. Then
\begin{align*}
v&=v_1e_1+\cdots+v_ne_n,\\
w&=w_1e_1+\cdots+w_ne_n.
\end{align*}
Since $\langle e_i,e_j\rangle=\delta_{ij}$,
$$\langle v,w\rangle=v_1w_1+\cdots+v_nw_n=v\cdot w.$$
Hence, once an orthonormal basis is given, the scalar product $\langle\ ,\ \rangle$ is identified with the dot product. The next question, then, is: can we always come up with an orthonormal basis? The answer is affirmative. Given a basis, we can construct from it a new basis which is orthogonal (and, after normalizing, orthonormal) through a process called the Gram-Schmidt orthogonalization process. Here is how it works. Let $w_1,\cdots,w_n$ be a basis of a vector space $V$. Let $v_1=w_1$ and
$$v_2=w_2-\frac{\langle w_2,v_1\rangle}{\langle v_1,v_1\rangle}v_1.$$
Then $v_2$ is perpendicular to $v_1$. Note that if $w_2$ is already perpendicular to $v_1=w_1$, then $v_2=w_2$.
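Indeed, by the bilinearity of the scalar product,
$$\langle v_2,v_1\rangle=\langle w_2,v_1\rangle-\frac{\langle w_2,v_1\rangle}{\langle v_1,v_1\rangle}\langle v_1,v_1\rangle=0.$$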

Gram-Schmidt Orthogonalization Process

Let
$$v_3=w_3-\frac{\langle w_3,v_1\rangle}{\langle v_1,v_1\rangle}v_1-\frac{\langle w_3,v_2\rangle}{\langle v_2,v_2\rangle}v_2.$$
Then $v_3$ is perpendicular to both $v_1$ and $v_2$, as seen in the following figure.
Continuing this process, we have
$$v_n=w_n-\frac{\langle w_n,v_1\rangle}{\langle v_1,v_1\rangle}v_1-\cdots-\frac{\langle w_n,v_{n-1}\rangle}{\langle v_{n-1},v_{n-1}\rangle}v_{n-1}$$
and $v_1,\cdots,v_n$ are mutually perpendicular.

Gram-Schmidt Orthogonalization Process

Therefore, the following theorem holds.

Theorem. Let $V\ne\{O\}$ be a finite dimensional vector space with a positive definite scalar product. Then $V$ has an orthonormal basis.

Example. Find an orthonormal basis for the vector space generated by
$$A=(1,1,0,1),\ B=(1,-2,0,0),\ C=(1,0,-1,2).$$
Here the scalar product is the dot product.

Solution. Let
\begin{align*}
A'&=A,\\
B'&=B-\frac{B\cdot A'}{A'\cdot A'}A'\\
&=\frac{1}{3}(4,-5,0,1),\\
C'&=C-\frac{C\cdot A'}{A'\cdot A'}A'-\frac{C\cdot B'}{B'\cdot B'}B'\\
&=\frac{1}{7}(-4,-2,-7,6).
\end{align*}
Then $A',B',C'$ form an orthogonal basis. We obtain an orthonormal basis by normalizing each basis member:
\begin{align*}
\frac{A'}{||A'||}&=\frac{1}{\sqrt{3}}(1,1,0,1),\\
\frac{B'}{||B'||}&=\frac{1}{\sqrt{42}}(4,-5,0,1),\\
\frac{C'}{||C'||}&=\frac{1}{\sqrt{105}}(-4,-2,-7,6).
\end{align*}
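If you would like to verify this computation with Sage, something along the following lines should work (this is just one way to set it up):

A=vector(QQ,[1,1,0,1])
B=vector(QQ,[1,-2,0,0])
C=vector(QQ,[1,0,-1,2])
Ap=A  # A'
Bp=B-(B.dot_product(Ap)/Ap.dot_product(Ap))*Ap  # B'
Cp=C-(C.dot_product(Ap)/Ap.dot_product(Ap))*Ap-(C.dot_product(Bp)/Bp.dot_product(Bp))*Bp  # C'

Evaluating Bp and Cp should return $(4/3,-5/3,0,1/3)$ and $(-4/7,-2/7,-1,6/7)$, which agree with $B'$ and $C'$ above.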

Theorem. Let $V$ be a vector space of dimension $n$ with a positive definite scalar product $\langle\ ,\ \rangle$. Let $\{w_1,\cdots,w_r,u_1,\cdots,u_s\}$ with $r+s=n$ be an orthonormal basis of $V$. Let $W$ be a subspace generated by $w_1,\cdots,w_r$ and let $U$ be a subspace generated by $u_1,\cdots,u_s$. Then $U=W^{\perp}$ and $\dim V=\dim W+\dim W^{\perp}$.

Scalar Products

Let $V$ be a vector space. A scalar product is a map $\langle\ ,\ \rangle: V\times V\longrightarrow\mathbb{R}$ such that

SP 1. $\langle v,w\rangle=\langle w,v\rangle$

SP 2. $\langle u,v+w\rangle=\langle u,v\rangle+\langle u,w\rangle$

SP 3. If $x$ is a number, then
$$\langle xu,v\rangle=x\langle u,v\rangle=\langle u,xv\rangle$$

Additionally, we also assume the condition

SP 4. $\langle v,v\rangle>0$ if $v\ne O$

A scalar product with SP 4 is said to be positive definite.

Remark. If $v=O$, then $\langle v,w\rangle=0$ for any vector $w$ in $V$. This follows immediately from SP 3 by taking $x=0$.

Example. Let $V=\mathbb{R}^n$ and define
$$\langle X,Y\rangle=X\cdot Y.$$
Then $\langle\ ,\ \rangle$ is a positive definite scalar product.

Example. Let $V$ be the function space of all continuous real-valued functions on $[-\pi,\pi]$. For $f,g\in V$, we define
$$\langle f,g\rangle=\int_{-\pi}^{\pi}f(t)g(t)dt.$$
Then $\langle\ ,\ \rangle$ is a positive definite scalar product.

Using a scalar product, we can introduce the notion of orthogonality of vectors. Two vectors $v,w$ are said to be orthogonal or perpendicular if $\langle v,w\rangle=0$.

Let $S\subset V$ be a subspace of $V$. Let $S^{\perp}=\{w\in V: \langle v,w\rangle=0\ \mbox{for all}\ v\in S\}$. Then $S^{\perp}$ is also a subspace of $V$. (Check for yourself.) It is called the orthogonal space of $S$.

Define the length or the norm of $v\in V$ by
$$||v||=\sqrt{\langle v,v\rangle}.$$
It follows from SP 3 that
$$||cv||=|c|||v||$$
for any number $c$.

The Projection of a Vector onto Another Vector

For vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, the vector projection of a vector $v$ onto another vector $w$ is
$$||v||\cos\theta\,\frac{w}{||w||}=\langle v,w\rangle\frac{w}{||w||^2}=\frac{\langle v,w\rangle}{\langle w,w\rangle}w$$
as seen in the above Figure. The number $c=\frac{\langle v,w\rangle}{\langle w,w\rangle}$ is called the component of $v$ along $w$. Note that the vector projection of $v$ onto $w$
$$\frac{\langle v,w\rangle}{\langle w,w\rangle}w$$
can still be defined in any vector space with a scalar product.

Proposition. The vector $v-cw$ is perpendicular to $w$.

Proof. \begin{align*}
\langle v-cw,w\rangle&=\langle v,w\rangle-c\langle w,w\rangle\\
&=\langle v,w\rangle-\langle v,w\rangle\\
&=0.
\end{align*}
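For a concrete (made-up) illustration in $\mathbb{R}^2$ with the dot product, take $v=(2,1)$ and $w=(3,0)$. Then
$$c=\frac{\langle v,w\rangle}{\langle w,w\rangle}=\frac{6}{9}=\frac{2}{3},\qquad cw=(2,0),$$
and $v-cw=(0,1)$ is indeed perpendicular to $w$.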

Example. Let $V=\mathbb{R}^n$. Then the component of $X=(x_1,\cdots,x_n)$ along $E_i$ is
$$X\cdot E_i=x_i.$$

Example. Let $V$ be the space of continuous functions on $[-\pi,\pi]$. Let $f(x)=\sin kx$, where $k$ is a positive integer. Then
$||f||=\sqrt{\pi}$. The component of $g(x)$ along $f(x)$ is
$$\frac{\langle g,f\rangle}{\langle f,f\rangle}=\frac{1}{\pi}\int_{-\pi}^{\pi}g(x)\sin kx\,dx.$$
It is called the Fourier coefficient of $g$ along $f$.
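The value $||f||=\sqrt{\pi}$ comes from a direct computation:
$$\langle f,f\rangle=\int_{-\pi}^{\pi}\sin^2 kx\,dx=\int_{-\pi}^{\pi}\frac{1-\cos 2kx}{2}\,dx=\pi.$$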

The following two inequalities are well-known for vectors in $\mathbb{R}^n$. They still hold in any vector space with a positive definite scalar product.

Theorem [Schwarz Inequality] Let $V$ be a vector space with a positive definite scalar product. For any $v,w\in V$,
$$|\langle v,w\rangle|\leq ||v||||w||.$$

Theorem [Triangle Inequality] Let $V$ be a vector space with a positive definite scalar product. For any $v,w\in V$,
$$||v+w||\leq ||v||+||w||.$$
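In fact, the triangle inequality follows from the Schwarz inequality in two lines:
\begin{align*}
||v+w||^2&=\langle v+w,v+w\rangle=||v||^2+2\langle v,w\rangle+||w||^2\\
&\leq ||v||^2+2||v||\,||w||+||w||^2=(||v||+||w||)^2.
\end{align*}
Taking square roots on both sides gives the triangle inequality.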

Sage: Basic Matrix Operations

A note on using Sage: Sage is open source math software whose interface is a web browser (in particular Firefox). You don't have to install Sage on your computer to use it. You can access any Sage server, including the main Sage server. I am running a Sage server at sage.st.usm.edu. If you are a student at the University of Southern Mississippi, you are more than welcome to create an account at sage.st.usm.edu and use it.

Matrix Constructions

In Sage, the $2\times 3$ matrix
$$\begin{pmatrix}
1 & 1 & -2\\
-1 & 4 & -5
\end{pmatrix}$$ can be created as follows. Let us say we want to call the matrix $A$. Type the following command in a blank line of your Sage worksheet:

A=matrix([[1,1,-2],[-1,4,-5]])

If you are familiar with Maple: unlike Maple, you will not see your matrix $A$ as output when you click on “evaluate”. To see your matrix, you need to type

A

in the next blank line and click on “evaluate” again:

[ 1  1 -2]
[-1  4 -5]

Scalar Multiplication

If you want to multiply the matrix $A$ by a number 5, type the command

5*A

and click on “evaluate”. The output is

[  5   5 -10]
[ -5  20 -25]

Matrix Addition

To add the two matrices
$$\begin{pmatrix}
1 & 1 & -2\\
-1 & 4 & -5
\end{pmatrix}+\begin{pmatrix}
2 & 1 & 5\\
1 & 3 & 2
\end{pmatrix},$$ first call the second matrix $B$:

B=matrix([[2,1,5],[1,3,2]])

and do

A+B

the output is

[ 3  2  3]
[ 0  7 -3]

The linear combination $3A+2B$ can be calculated by the command

3*A+2*B

and the output is

[  7   5   4]
[ -1  18 -11]

Transpose of a Matrix

To find the transpose of the matrix $A$ do

A.transpose()

and the output is

[ 1 -1]
[ 1  4]
[-2 -5]

Matrix Multiplication

An $m\times n$ matrix can be multiplied by a $p\times q$ matrix as long as $n=p$, and the resulting product is an $m\times q$ matrix. Let $C=\begin{pmatrix}
3 & 4\\
-1 & 2\\
2 & 1
\end{pmatrix}$. The number of columns of $A$ and the number of rows of $C$ are both 3, so the product $AC$ is defined. In Sage it is computed as:

A*C

The output is

[ -2   4]
[-17  -1]
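Note that the order matters: $C$ is $3\times 2$ and $A$ is $2\times 3$, so the product $CA$ is also defined and is $3\times 3$, whereas a product such as $AB$ is not defined since $B$ has 2 rows while $A$ has 3 columns. For instance,

C*A

should return the $3\times 3$ matrix $\begin{pmatrix} -1 & 19 & -26\\ -3 & 7 & -8\\ 1 & 6 & -9 \end{pmatrix}$, while evaluating A*B results in an error.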

Inverses

Let $F: V\longrightarrow W$ be a mapping. $F$ is said to be invertible if there exists a map $G: W\longrightarrow V$ such that
$$G\circ F=I_V,\ F\circ G=I_W,$$
where $I_V: V\longrightarrow V$ and $I_W: W\longrightarrow W$ are the identity maps on $V$ and $W$, respectively. The map $G$ is called the inverse of $F$ and is denoted by $F^{-1}$.

Theorem. A map $F: V\longrightarrow W$ has an inverse if and only if it is one-to-one (injective) and onto (surjective).

Remark. If a map $F: V\longrightarrow W$ has an inverse, it is unique.

Example. The inverse map of the translation $T_u$ is $T_{-u}$.

Example. Let $A$ be a square matrix. Then $L_A: \mathbb{R}^n\longrightarrow\mathbb{R}^n$ is invertible if and only if $A$ is invertible.

Proof. If $A$ is invertible, then one can immediately see that $L_A\circ L_{A^{-1}}=L_{A^{-1}}\circ L_A=I$. On the other hand, any linear map from $\mathbb{R}^n$ to $\mathbb{R}^n$ may be written as $L_B:\mathbb{R}^n\longrightarrow\mathbb{R}^n$ for some $n\times n$ square matrix $B$ as seen here. Suppose that $L_B:\mathbb{R}^n\longrightarrow\mathbb{R}^n$ is an inverse of $L_A: \mathbb{R}^n\longrightarrow\mathbb{R}^n$. Then $L_A\circ L_B=L_B\circ L_A=I$ i.e. for any $X\in\mathbb{R}^n$,
$$(AB)X=(BA)X=IX=X.$$
This implies that $AB=BA=I$. (Why?) Therefore, $A$ is invertible and $B=A^{-1}$.

Theorem. Let $F: U\longrightarrow V$ be a linear map, and assume that this map has an inverse map  $G:V\longrightarrow U$. Then $G$ is a linear map.

Proof. The proof is straightforward and is left for an exercise.

Recall that a linear map $F: V\longrightarrow W$ is injective if and only if $\ker F=\{O\}$ as seen here.

Theorem. Let $V$ be a vector space of dimension $n$. If $W\subset V$ is a subspace of dimension $n$, then $W=V$.

Proof. It follows from the fact that a basis of a vector space is a maximal set of linearly independent vectors.

Theorem. Let $F: V\longrightarrow W$ be a linear map. Assume that $\dim V=\dim W$. Then

(i) If $\ker F=\{O\}$ then $F$ is invertible.

(ii) If $F$ is surjective then $F$ is invertible.

Proof. It follows from the formula
$$\dim V=\dim\ker F+\dim\mathrm{Im}F.$$

Example. Let $F:\mathbb{R}^2\longrightarrow\mathbb{R}^2$ be the linear map such that
$$F(x,y)=(3x-y,4x+2y).$$
Show that $F$ is invertible.

Solution. Suppose that $(x,y)\in\ker F$. Then
$$\left\{\begin{aligned}
3x-y&=0,\\
4x+2y&=0.
\end{aligned}\right.$$
This system of linear equations has only the trivial solution $(x,y)=(0,0)$, so $\ker F=\{O\}$. Therefore, $F$ is invertible.
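Alternatively, $F=L_A$ with $A=\begin{pmatrix} 3 & -1\\ 4 & 2 \end{pmatrix}$, and $\det A=6+4=10\ne 0$, so the columns of $A$ are linearly independent and $A$ is invertible. A short computation gives
$$A^{-1}=\frac{1}{10}\begin{pmatrix} 2 & 1\\ -4 & 3 \end{pmatrix},$$
so $F^{-1}(x,y)=\frac{1}{10}(2x+y,-4x+3y)$.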

A linear map $F: V\longrightarrow W$ which is invertible is called an isomorphism. If there is an isomorphism from $V$ onto $W$, we say $V$ is isomorphic to $W$ and write $V\cong W$, or $V\stackrel{F}{\cong}W$ in case we want to specify the isomorphism $F$. If $V\cong W$, then $V$ and $W$ are essentially the same as vector spaces and we do not distinguish them.

The following theorem says that there is essentially only one vector space of dimension $n$, namely $\mathbb{R}^n$.

Theorem. Let $V$ be a vector space of dimension $n$. Then $\mathbb{R}^n\cong V$.

Proof. Let $\{v_1,\cdots,v_n\}$ be a basis of $V$. Define a map $L:\mathbb{R}^n\longrightarrow V$ by
$$L(x_1,\cdots,x_n)=x_1v_1+\cdots+x_nv_n$$
for each $(x_1,\cdots,x_n)\in\mathbb{R}^n$. Then $L$ is an isomorphism.

Theorem. A square matrix $A$ is invertible if and only if its columns $A^1,\cdots,A^n$ are linearly independent.

Proof. Let $L_A:\mathbb{R}^n\longrightarrow\mathbb{R}^n$ be the map such that $L_AX=AX$. For $X=\begin{pmatrix}
x_1\\
\vdots\\
x_n
\end{pmatrix}\in\mathbb{R}^n$,
$$L_AX=x_1A^1+\cdots+x_nA^n.$$
Hence, $\ker L_A=\{O\}$ if and only if $A^1,\cdots,A^n$ are linearly independent. By the preceding theorems, $\ker L_A=\{O\}$ if and only if $L_A$ is invertible, which holds if and only if $A$ is invertible.

Composition of Linear Maps

Let $F: U\longrightarrow V$ and $G: V\longrightarrow W$ be two maps. The composite map $G\circ F: U\longrightarrow W$ is defined by
$$G\circ F(v)=G(F(v))$$
for each $v\in U$.

Example. Let $A$ be an $m\times n$ matrix and let $B$ be a $q\times m$ matrix. Let $L_A: \mathbb{R}^n\longrightarrow\mathbb{R}^m$ be the linear map such that $L_AX=AX$ for each $X\in\mathbb{R}^n$, and let $L_B:\mathbb{R}^m\longrightarrow\mathbb{R}^q$ be the linear map such that $L_BY=BY$ for each $Y\in\mathbb{R}^m$. Then
\begin{align*}
L_B\circ L_A(X)&=L_B(L_A(X))\\
&=L_B(AX)\\
&=B(AX)\\
&=(BA)X\\
&=L_{BA}(X)
\end{align*}
for each $X\in\mathbb{R}^n$. Therefore, $L_B\circ L_A=L_{BA}$.

Example. Let $A$ be an $m\times n$ matrix, and let $L_A:\mathbb{R}^n\longrightarrow\mathbb{R}^m$ be the linear map such that $L_A(X)=AX$ for each $X\in\mathbb{R}^n$. Let $C$ be a vector in $\mathbb{R}^m$ and let $T_C:\mathbb{R}^m\longrightarrow\mathbb{R}^m$ be the translation by $C$
$$T_C(Y)=Y+C$$
for each $Y\in\mathbb{R}^m$. Then for each $X\in\mathbb{R}^n$,
\begin{align*}
T_C\circ L_A(X)&=T_C(AX)\\
&=AX+C.
\end{align*}

Example. Let $V$ be a vector space, and let $w$ be an element of $V$. Let $T_w: V\longrightarrow V$ be the translation by $w$ i.e.
$$T_w(v)=v+w$$
for each $v\in V$. Then
\begin{align*}
T_{w_1}\circ T_{w_2}(v)&=T_{w_1}(T_{w_2}(v))\\
&=T_{w_1}(v+w_2)\\
&=v+w_2+w_1
\end{align*}
for each $v\in V$. Therefore, $T_{w_1}\circ T_{w_2}=T_{w_1+w_2}$ i.e. the composite of translations is again a translation.

Remark. Note that translations are not linear. One easy way to see this is that $T_w(O)=O+w=w$. So $T_w$ is not linear unless $w=O$ in which case $T_w$ is the identity map.

Example. [Rotations] Let $\theta$ be a number, and $A(\theta)$ the matrix
$$A(\theta)=\begin{pmatrix}
\cos\theta & -\sin\theta\\
\sin\theta & \cos\theta
\end{pmatrix}.$$
Let $R_\theta:\mathbb{R}^2\longrightarrow\mathbb{R}^2$ be the rotation by angle $\theta$ i.e.
$$R_\theta(X)=A(\theta)X$$
for each $X\in\mathbb{R}^2$. Clearly, rotations are linear.
Now,
\begin{align*}
R_{\theta_1}\circ R_{\theta_2}(X)&=R_{\theta_1}(R_{\theta_2}(X))\\
&=R_{\theta_1}(A(\theta_2)X)\\
&=A(\theta_1)A(\theta_2)X\\
&=A(\theta_1+\theta_2)X\\
&=R_{\theta_1+\theta_2}(X).
\end{align*}
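The step $A(\theta_1)A(\theta_2)=A(\theta_1+\theta_2)$ used above is just the angle addition formulas written in matrix form:
\begin{align*}
A(\theta_1)A(\theta_2)&=\begin{pmatrix}
\cos\theta_1\cos\theta_2-\sin\theta_1\sin\theta_2 & -(\sin\theta_1\cos\theta_2+\cos\theta_1\sin\theta_2)\\
\sin\theta_1\cos\theta_2+\cos\theta_1\sin\theta_2 & \cos\theta_1\cos\theta_2-\sin\theta_1\sin\theta_2
\end{pmatrix}\\
&=\begin{pmatrix}
\cos(\theta_1+\theta_2) & -\sin(\theta_1+\theta_2)\\
\sin(\theta_1+\theta_2) & \cos(\theta_1+\theta_2)
\end{pmatrix}=A(\theta_1+\theta_2).
\end{align*}
So composing two rotations amounts to adding the angles, i.e. $R_{\theta_1}\circ R_{\theta_2}=R_{\theta_1+\theta_2}$.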

Theorem. Let $U,V,W$ be vector spaces. Let $F:U\longrightarrow V$ and $G:V\longrightarrow W$ be linear maps. Then the composite map $G\circ F:U\longrightarrow W$ is also linear.

Proof. The proof is straightforward and is left for an exercise.

The Matrix Associated with a Linear Map

Given an $m\times n$ matrix $A$, there is an associated linear map $L_A: \mathbb{R}^n\longrightarrow\mathbb{R}^m$ as seen here. Conversely, given a linear map $L: \mathbb{R}^n\longrightarrow\mathbb{R}^m$ there exists an $m\times n$ matrix $A$ such that $L=L_A$. To see this, consider the unit column vectors $E^1,\cdots,E^n$ of $\mathbb{R}^n$. For each $j=1,\cdots,n$, let $L(E^j)=A^j$, where $A^j$ is a column vector in $\mathbb{R}^m$. For each $X\in\mathbb{R}^n$,
$$X=x_1E^1+\cdots+x_nE^n=\begin{pmatrix}
x_1\\
\vdots\\
x_n
\end{pmatrix}$$
and hence
\begin{align*}
LX&=x_1L(E^1)+\cdots+x_nL(E^n)\\
&=x_1A^1+\cdots+x_nA^n\\
&=AX
\end{align*}
where $A$ is the matrix whose column vectors are $A^1,\cdots,A^n$. The matrix $A$ is called the matrix associated with the linear map $L$.

Example. Let $L:\mathbb{R}^3\longrightarrow\mathbb{R}^2$ be the projection given by $$L\begin{pmatrix}
x\\
y\\
z
\end{pmatrix}=\begin{pmatrix}
x\\
y
\end{pmatrix}.$$
Find the matrix associated with $L$.

Solution. $$L(E^1)=\begin{pmatrix}
1\\
0\end{pmatrix},\ L(E^2)=\begin{pmatrix}
0\\
1
\end{pmatrix},\ L(E^3)=\begin{pmatrix}
0\\
0
\end{pmatrix}.$$
Thus, the matrix associated with $L$ is
$$A=\begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0
\end{pmatrix}.$$

Let us now consider a more general case. Let $V$ be a vector space and $\{v_1,\cdots,v_n\}$ a given basis of $V$. Let $L:V\longrightarrow V$ be a linear map. Then there exist numbers $c_{ij}$, $i,j=1,\cdots,n$ such that
\begin{align*}
Lv_1&=c_{11}v_1+\cdots+c_{1n}v_n,\\
&\vdots\\
Lv_n&=c_{n1}v_1+\cdots+c_{nn}v_n.
\end{align*}
Let $v=x_1v_1+\cdots+x_nv_n$. Then
\begin{align*}
Lv&=\sum_{i=1}^nx_iLv_i\\
&=\sum_{i=1}^nx_i\sum_{j=1}^nc_{ij}v_j\\
&=\sum_{j=1}^n(\sum_{i=1}^nx_ic_{ij})v_j.
\end{align*}
Hence, we have the following theorem.

Theorem. If $C=(c_{ij})$ is the matrix such that $Lv_i=\sum_{j=1}^nc_{ij}v_j$ and $X=\begin{pmatrix}
x_1\\
\vdots\\
x_n
\end{pmatrix}$ is the coordinate vector of $v$, then the coordinate vector of $Lv$ is ${}^tCX$ i.e. the matrix associated with $L$ is ${}^tC$ with respect to the basis $\{v_1,\cdots,v_n\}$.

Example. Let $L: V\longrightarrow V$ be a linear map. Let $\{v_1,v_2,v_3\}$ be a basis of $V$ such that
\begin{align*}
L(v_1)&=2v_1-v_2,\\
L(v_2)&=v_1+v_2-4v_3,\\
L(v_3)&=5v_1+4v_2+2v_3.
\end{align*}
The matrix associated with $L$ is
$$\begin{pmatrix}
2 & 1 & 5\\
-1 & 1 & 4\\
0 & -4 & 2
\end{pmatrix}.$$
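As a quick check, the coordinate vector of $v_1$ is the first unit column vector, and
$$\begin{pmatrix}
2 & 1 & 5\\
-1 & 1 & 4\\
0 & -4 & 2
\end{pmatrix}\begin{pmatrix}
1\\
0\\
0
\end{pmatrix}=\begin{pmatrix}
2\\
-1\\
0
\end{pmatrix},$$
which is indeed the coordinate vector of $L(v_1)=2v_1-v_2$.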

The Kernel and Image of a Linear Map

Let $F:V\longrightarrow W$ be a linear map. The image of $F$ is the set
$$\mathrm{Im}F=\{w\in W: F(v)=w\ \mbox{for some}\ v\in V\}.$$

Proposition. The image of $F$ is a subspace of $W$.

Proof. The proof is straightforward. It is left for an exercise.

The preimage of the zero vector $O$ under the linear map $F$, i.e. the set of elements $v\in V$ such that $F(v)=O$, is called the kernel of $F$ and is denoted by $\ker F$.

Proposition. The kernel of $F$ is a subspace of $V$.

Proof. It is straightforward and left for an exercise.

Example. Let $L: \mathbb{R}^3\longrightarrow\mathbb{R}$ be the map defined by
$$L(x,y,z)=3x-2y+z.$$
If we write $A=(3,-2,1)$, then $L(X)$ may be written as
$$L(X)=X\cdot A.$$
So, the kernel of $L$ is the set of all $X$ that are perpendicular to $A$.

Example. Let $A$ be an $m\times n$ matrix and let $L_A:\mathbb{R}^n\longrightarrow\mathbb{R}^m$ be the linear map defined by
$$L_A(X)=AX.$$
The kernel of $L_A$ is the subspace of solutions $X$ of the system of linear equations
$$AX=O.$$

Example. Let $\mathcal{F}$ be the vector space of smooth functions. Let $a_1,\cdots,a_m$ be numbers and let
$$L=a_m\frac{d^m}{dx^m}+a_{m-1}\frac{d^{m-1}}{dx^{m-1}}+\cdots+a_1.$$
Then $L:\mathcal{F}\longrightarrow\mathcal{F}$ is a linear map. $\ker L$ is the space of solutions of the homogeneous linear differential equation
$$a_m\frac{d^mf}{dx^m}+a_{m-1}\frac{d^{m-1}f}{dx^{m-1}}+\cdots+a_1f=0.$$
If there exists one solution $h_0$ for the non-homogeneous linear differential equation $L(h)=g$, then any solution $h$ may be written as $h=f+h_0$ where $f$ is a solution of the homogeneous equation $L(f)=0$. The proof is left as an exercise.

Theorem. Let $F: V\longrightarrow W$ be a linear map such that $\ker F=\{O\}$. If $v_1,\cdots,v_n$ are linearly independent elements of $V$, then $F(v_1),\cdots,F(v_n)$ are linearly independent elements of $W$.

Proof. The proof is straightforward and is left for an exercise.

Theorem. Let $F: V\longrightarrow W$ be a linear map. $F$ is one-to-one if and only if $\ker F=\{O\}$.

Proof. Suppose that $F$ is one-to-one. Let $v\in\ker F$. Then $F(v)=O=F(O)$. Since $F$ is one-to-one, $v=O$. So, $\ker F=\{O\}$. Suppose that $\ker F=\{O\}$. Let $F(v_1)=F(v_2)$. Then $F(v_1-v_2)=O$ and so $v_1-v_2\in\ker F=\{O\}$ i.e. $v_1=v_2$. Hence, $F$ is one-to-one.

Given a linear map $L: V\longrightarrow W$, there is a relationship between the dimensions of $V$, $\ker L$, and $\mathrm{Im}L$, namely
$$\dim V=\dim\ker L+\dim\mathrm{Im}L.$$
We will not prove it here, but those interested may find a proof in [1].

Example. Consider the linear map $L:\mathbb{R}^3\longrightarrow\mathbb{R}$ given by
$$L(x,y,z)=3x-2y+z.$$
The image is not $\{O\}$, so it is all of $\mathbb{R}$. Therefore the dimension of $\ker L$, the space of all solutions of $3x-2y+z=0$, is $3-1=2$. Indeed, $3x-2y+z=0$ is the equation of a plane through the origin, and we know that the dimension of a plane as a vector space is 2.
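If you would like to check this with Sage, one way (using the matrix commands from the Sage post in this archive) is:

A=matrix(QQ,[[3,-2,1]])
A.right_kernel()

Sage should report a vector space of degree 3 and dimension 2, with a basis such as $(1,0,-3)$ and $(0,1,2)$; both vectors indeed satisfy $3x-2y+z=0$.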

References:

[1] Serge Lang, Introduction to Linear Algebra, Second Edition, Undergraduate Texts in Mathematics, Springer, 1986

Linear Maps

Let $V, W$ be two vector spaces. A map $L:V\longrightarrow W$ is called a linear map if it satisfies the following properties: for any elements $u,v\in V$ and any scalar $c$,

LM 1. $L(u+v)=L(u)+L(v)$.

LM 2. $L(cu)=cL(u)$.

That is, linear maps are maps that preserve addition and scalar multiplication.

Proposition. A map $L: V\longrightarrow W$ is linear if and only if for any elements $u,v\in V$ and scalars $a,b$,
$$L(au+bv)=aL(u)+bL(v).$$

Proof. It is straightforward and left as an exercise.

Example. Let $A$ be an $m\times n$ matrix. Define
$$L_A:\mathbb{R}^n\longrightarrow\mathbb{R}^m$$
by
$$L_A(X)=A\cdot X.$$ Then $L_A$ is linear.

Example. Let $A=(a_1,\cdots,a_n)$ be a fixed vector in $\mathbb{R}^n$. Define $L_A:\mathbb{R}^n\longrightarrow\mathbb{R}$ by
$$L_A(X)=A\cdot X.$$
Then $L_A$ is a linear map. The dot product $A\cdot X$ can be viewed as a matrix multiplication if we view $A$ as a row vector and $X$ as a column vector, so this example is a special case of the previous example.

Example. Let $\mathcal{F}$ be the set of all smooth functions. Then the derivative $D:\mathcal{F}\longrightarrow\mathcal{F}$ is a linear map.

Example. Define $\wp: \mathbb{R}^3\longrightarrow\mathbb{R}^2$ by $\wp(x,y,z)=(x,y)$, i.e. $\wp$ is a projection. It is a linear map.

Proposition. Let $L: V\longrightarrow W$ be a linear map. Then $L(O)=O$.

Proof. Let $v\in V$. Then
$$L(O)=L(v-v)=L(v)-L(v)=O.$$

Example. Let $L:\mathbb{R}^2\longrightarrow\mathbb{R}^2$ be a linear map. Suppose that
$$L(1,1)=(1,4)\ \rm{and}\ L(2,-1)=(-2,3).$$
Find $L(3,-1)$.

Solution. $(3,-1)$ is written as a linear combination of $(1,1)$ and $(2,-1)$ as
$$(3,-1)=\frac{1}{3}(1,1)+\frac{4}{3}(2,-1).$$
Hence,
$$L(3,-1)=\frac{1}{3}L(1,1)+\frac{4}{3}L(2,-1)=\frac{1}{3}(1,4)+\frac{4}{3}(-2,3)=\left(-\frac{7}{3},\frac{16}{3}\right).$$

The coordinates of a linear map

Consider a map $F: V\longrightarrow\mathbb{R}^n$. For any $v\in V$, $F(v)\in\mathbb{R}^n$ so $F(v)$ may be written as
$$F(v)=(F_1(v),F_2(v),\cdots,F_n(v))$$
where each $F_i$ is a function $F_i:V\longrightarrow\mathbb{R}$ called the $i$-th coordinate function.

Proposition. A map $F: V\longrightarrow\mathbb{R}^n$ is linear if and only if each coordinate function $F_i$ is linear.

Proof. Straightforward. Left as an exercise.

Example. Let $F:\mathbb{R}^2\longrightarrow\mathbb{R}^3$ be the map
$$F(x,y)=(2x-y,3x+4y,x-5y).$$
Then
$$F_1(x,y)=2x-y,\ F_2(x,y)=3x+4y,\ F_3(x,y)=x-5y.$$
These coordinate functions can be written as
$$F_1(x,y)=\begin{pmatrix}
2 & -1
\end{pmatrix}\begin{pmatrix}
x\\y
\end{pmatrix},\ F_2(x,y)=\begin{pmatrix}
3 & 4
\end{pmatrix}\begin{pmatrix}
x\\y
\end{pmatrix},\ F_3(x,y)=\begin{pmatrix}
1 & -5
\end{pmatrix}\begin{pmatrix}
x\\y
\end{pmatrix}.$$
Hence, each $F_i$ is linear, $i=1,2,3$ and therefore $F$ is linear by the Proposition. In fact, $F$ may be written as $L_A:\mathbb{R}^2\longrightarrow\mathbb{R}^3$ where
$$A=\begin{pmatrix}
2 & -1\\
3 & 4\\
1 & -5
\end{pmatrix}.$$