2.2 Vector spaces
It will be useful to talk about vector spaces. These are sets of vectors that can be added together, or multiplied by a scalar. You should be familiar with these from your undergraduate degree. We don’t provide a formal definition here, but you can think of a real vector space \(V\) as a set of vectors such that for any \(\mathbf v_1, \mathbf v_2 \in V\) and \(\alpha_1, \alpha_2 \in \mathbb{R}\), we have
\[\alpha_1 \mathbf v_1 + \alpha_2 \mathbf v_2 \in V\]
i.e., vector spaces are closed under addition and scalar multiplication.
A subset \(U \subset V\) of a vector space \(V\) is called a vector subspace if \(U\) is also a vector space.
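As a small illustration (an example chosen here for concreteness), in \(\mathbb{R}^2\) the line through the origin \[U=\{\mathbf x\in\mathbb{R}^2: x_2=2x_1\}\] is a vector subspace, since any linear combination of points on the line stays on the line. The shifted line \(\{\mathbf x\in\mathbb{R}^2: x_2=2x_1+1\}\) is not, because it does not contain \(\boldsymbol 0\), which closure requires (take \(\alpha_1=\alpha_2=0\)).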
2.2.1 Linear independence
2.2.2 Row and column spaces
We can think about the matrix-vector multiplication \(\mathbf A\mathbf x\) in two ways. The usual way is to take the inner product of each row of \(\mathbf A\) with \(\mathbf x\):
\[ \left( \begin{array}{cc} 1 & 2\\ 3&4\\5&6\end{array}\right) \left(\begin{array}{c}x_1\\ x_2\end{array}\right) = \left(\begin{array}{c} x_1+2x_2\\3x_1+4x_2\\5x_1+6x_2\end{array}\right)\]
But a better way to think of \(\mathbf A\mathbf x\) is as a linear combination of the columns of \(\mathbf A\):
\[ \left( \begin{array}{cc} 1 & 2\\ 3&4\\5&6\end{array}\right) \left(\begin{array}{c}x_1\\ x_2\end{array}\right) = x_1\left(\begin{array}{c}1\\3\\5 \end{array}\right)+x_2\left(\begin{array}{c}2\\4\\6 \end{array}\right)\]
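The two views can be checked numerically. The following is a minimal sketch in plain Python; the values chosen for \(x_1\) and \(x_2\) are arbitrary and purely illustrative.

```python
A = [[1, 2], [3, 4], [5, 6]]
x = [10, -1]  # arbitrary illustrative values for x_1 and x_2

# Row view: entry i of Ax is the inner product of row i of A with x.
row_view = [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Column view: Ax = x_1 * (column 1) + x_2 * (column 2).
columns = [list(col) for col in zip(*A)]
col_view = [0] * len(A)
for xj, col in zip(x, columns):
    col_view = [acc + xj * c for acc, c in zip(col_view, col)]

print(row_view)  # [8, 26, 44]
print(col_view)  # [8, 26, 44]
```

Both computations give the same vector; only the order in which the sums are grouped differs.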
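The set of all linear combinations of the columns of \(\mathbf A\) is called the column space of \(\mathbf A\), written \(\mathcal{C}(\mathbf A)\). For \[\mathbf A=\left( \begin{array}{cc} 1 & 2\\ 3&4\\5&6\end{array}\right) \] the two columns are linearly independent, so the column space is a 2-dimensional plane in \(\mathbb{R}^3\). The matrix \(\mathbf B\) has the same column space as \(\mathbf A\), because its first two columns are the columns of \(\mathbf A\), its third column is their sum, and its fourth column is twice the first plus the second: \[\mathbf B=\left( \begin{array}{cccc} 1 & 2&3 &4\\ 3&4 &7&10\\5&6&11&16\end{array}\right) \]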
The number of linearly independent columns of \(\mathbf A\) is called the column rank of \(\mathbf A\), and is equal to the dimension of the column space \(\mathcal{C}(\mathbf A)\). The column rank of both \(\mathbf A\) and \(\mathbf B\) is 2.
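The ranks quoted above can be checked by Gaussian elimination. The sketch below uses exact rational arithmetic from the standard library; the helper `rank` is a hypothetical name introduced here for illustration, not a library function. Placing the columns of \(\mathbf A\) and \(\mathbf B\) side by side and finding that the rank is still 2 confirms that the two column spaces coincide.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 2, 3, 4], [3, 4, 7, 10], [5, 6, 11, 16]]

rank_A = rank(A)
rank_B = rank(B)
# Columns of A followed by columns of B: its rank is dim(C(A) + C(B)).
rank_AB = rank([ra + rb for ra, rb in zip(A, B)])

print(rank_A, rank_B, rank_AB)  # 2 2 2
```

In floating-point work one would more typically use a numerical routine with a tolerance; exact fractions are used here only to keep the sketch free of rounding questions.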
The row space of \(\mathbf A\) is defined to be the column space of \(\mathbf A^\top\), and the row rank is the number of linearly independent rows of \(\mathbf A\).
For any matrix, the row rank is equal to the column rank. Thus we can simply refer to the rank of the matrix, \(\operatorname{rank}(\mathbf A)\).
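For instance, for the \(3\times 2\) matrix \(\mathbf A\) above, the first two rows are linearly independent and the third row is a combination of them, \[(5,\;6)=2\,(3,\;4)-(1,\;2),\] so \(\mathbf A\) has exactly two linearly independent rows, the same as its number of linearly independent columns.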
2.2.3 Linear transformations
We can view an \(n\times p\) matrix \(\mathbf A\) as a linear map between two vector spaces: \[\begin{align*} \mathbf A: \;\mathbb{R}^p &\rightarrow \mathbb{R}^n\\ \mathbf x&\mapsto \mathbf A\mathbf x \end{align*}\]
The image of \(\mathbf A\) is precisely the column space of \(\mathbf A\): \[\operatorname{Im}(\mathbf A) = \{\mathbf A\mathbf x: \mathbf x\in \mathbb{R}^p\}=\mathcal{C}(\mathbf A) \subset \mathbb{R}^n\]
The kernel of \(\mathbf A\) is the set of vectors mapped to zero: \[\operatorname{Ker}(\mathbf A)=\{\mathbf x: \mathbf A\mathbf x=\boldsymbol 0\}\subset \mathbb{R}^p\] and is sometimes called the null-space of \(\mathbf A\) and denoted \(\mathcal{N}(\mathbf A)\).
The rank-nullity theorem relates the dimensions of these two spaces. For an \(n\times p\) matrix \(\mathbf A\) it says that \(\dim \mathcal{C}(\mathbf A)+\dim \mathcal{N}(\mathbf A)=p\), or equivalently that
\(\operatorname{rank}(\mathbf A)+\dim \mathcal{N}(\mathbf A)=p\).
We’ve already said that the row space of \(\mathbf A\) is \(\mathcal{C}(\mathbf A^\top)\). The left-null space is \(\{\mathbf x\in \mathbb{R}^n: \mathbf x^\top \mathbf A=\boldsymbol 0^\top\}\) or equivalently \(\{\mathbf x \in \mathbb{R}^n: \mathbf A^\top \mathbf x=\boldsymbol 0\}=\mathcal{N}(\mathbf A^\top)\). Applying the rank-nullity theorem to \(\mathbf A^\top\), which maps \(\mathbb{R}^n\) to \(\mathbb{R}^p\), we must have \[n=\dim \mathcal{C}(\mathbf A^\top) + \dim \mathcal{N}(\mathbf A^\top)= \operatorname{rank}(\mathbf A)+\dim \mathcal{N}(\mathbf A^\top).\]
Example 2.10 Consider again the matrix \(\mathbf D: \mathbb{R}^3\rightarrow \mathbb{R}^2\) \[ \mathbf D=\left( \begin{array}{ccc} 1 & 2&3\\ 2&4&6 \end{array}\right)= \left( \begin{array}{c} 1 \\ 2 \end{array}\right)\left(\begin{array}{ccc}1&2&3\end{array}\right) \] We have already seen that \[\mathcal{C}(\mathbf D)=\operatorname{span}\left\{\left(\begin{array}{c}1\\2\end{array}\right)\right\}\] and so \(\dim \mathcal{C}(\mathbf D)=\operatorname{rank}(\mathbf D)=1\). The kernel, or null-space, of \(\mathbf D\) is the set of vectors for which \(\mathbf D\mathbf x=\boldsymbol 0\). The second row of \(\mathbf D\) is twice the first, so both rows give the same condition, \[x_1+2x_2+3x_3=0\] This is a single equation in three unknowns, and so there must be a plane of solutions. We need two linearly independent vectors in this plane to describe it. Convince yourself that \[\mathcal{N}(\mathbf D) = \operatorname{span}\left\{\left(\begin{array}{c}0\\3\\-2\end{array}\right), \left(\begin{array}{c}2\\-1\\0\end{array}\right)\right\}\] So we have \[\dim \mathcal{C}(\mathbf D)+\dim \mathcal{N}(\mathbf D)=1+2=3\] as required by the rank-nullity theorem.
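The null-space claim in Example 2.10 can be verified directly. The following minimal sketch checks that both spanning vectors are mapped to zero and that they are linearly independent; the helper name `matvec` is introduced here for illustration only.

```python
D = [[1, 2, 3], [2, 4, 6]]
null_basis = [[0, 3, -2], [2, -1, 0]]  # the two vectors from the example

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Both vectors should be mapped to the zero vector in R^2.
images = [matvec(D, v) for v in null_basis]

# Two vectors are linearly independent if some 2x2 minor is nonzero.
# Here we use the first two components of each vector.
u, w = null_basis
minor = u[0] * w[1] - u[1] * w[0]

rank_D = 1                  # from the example: C(D) is spanned by (1, 2)
nullity_D = len(null_basis)

print(images)               # [[0, 0], [0, 0]]
print(minor)                # -6
print(rank_D + nullity_D)   # 3, the number of columns of D
```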
If we consider \(\mathbf D^\top\), we already know \(\dim \mathcal{C}(\mathbf D^\top)=1\) (as row rank = column rank), and the rank-nullity theorem tells us that the dimension of the null space of \(\mathbf D^\top\) must be \(2-1=1\). This is easy to confirm, as \(\mathbf D^\top \mathbf x=\boldsymbol 0\) implies \[x_1+2x_2=0\] which is a line in \(\mathbb{R}^2\): \[\mathcal{N}(\mathbf D^\top) = \operatorname{span}\left\{ \left(\begin{array}{c}-2\\1\end{array}\right)\right\}\]
Question: When does a square matrix \(\mathbf A\) have an inverse?
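- Precisely when the kernel of \(\mathbf A\) contains only the zero vector, i.e., has dimension 0. In this case \(\mathbf A\) is injective, and by the rank-nullity theorem the column space of \(\mathbf A\) has full dimension and so is the whole space, i.e., \(\mathbf A\) is also surjective. A linear map that is both injective and surjective has an inverse. A simpler way to determine if \(\mathbf A\) has an inverse is to consider its determinant: \(\mathbf A\) is invertible if and only if \(\det \mathbf A\neq 0\).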
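To make the link between the kernel and the determinant concrete, here is a small sketch for \(2\times 2\) matrices. The matrices `M` and `S` and the helper `det2` are hypothetical examples chosen for illustration; they do not appear elsewhere in these notes.

```python
def det2(M):
    """Determinant of a 2x2 matrix stored as a list of two rows."""
    (a, b), (c, d) = M
    return a * d - b * c

M = [[1, 2], [3, 4]]  # illustrative invertible matrix
S = [[1, 2], [2, 4]]  # illustrative singular matrix (second row = 2 * first row)

det_M = det2(M)
det_S = det2(S)

# A nonzero vector in the kernel of S, which is why S has no inverse.
u = [2, -1]
S_u = [sum(a * b for a, b in zip(row, u)) for row in S]

print(det_M)  # -2, nonzero, so M is invertible
print(det_S)  # 0, so S is not invertible
print(S_u)    # [0, 0]
```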
Question: Suppose we are given an \(n\times p\) matrix \(\mathbf A\), and an \(n\)-vector \(\mathbf y\). When does \[\mathbf A\mathbf x= \mathbf y\] have a solution?
- When \(\mathbf y\) is in the column space of \(\mathbf A\), \[\mathbf y\in \mathcal{C}(\mathbf A)\]
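For the \(3\times 2\) matrix \(\mathbf A\) used earlier, whose column space is a plane in \(\mathbb{R}^3\), membership can be tested geometrically: \(\mathbf y\) lies in the plane exactly when it is orthogonal to a normal vector of the plane. The sketch below relies on that special structure, so it is not a general-purpose test; the vectors `y_in` and `y_out` are illustrative choices.

```python
A = [[1, 2], [3, 4], [5, 6]]
col1, col2 = [list(c) for c in zip(*A)]

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Normal vector to the plane spanned by the two columns of A.
normal = cross(col1, col2)

y_in = [3, 7, 11]   # equals col1 + col2, so Ax = y has a solution
y_out = [1, 0, 0]   # illustrative vector that is off the plane

print(normal)              # [-2, 4, -2]
print(dot(normal, y_in))   # 0, so y_in is in C(A)
print(dot(normal, y_out))  # -2, so y_out is not in C(A)
```

For a general \(n\times p\) matrix, an equivalent check is that appending \(\mathbf y\) to \(\mathbf A\) as an extra column does not increase the rank.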
Question: When is the answer unique?
- Suppose \(\mathbf x\) and \(\mathbf x'\) are both solutions with \(\mathbf x\not =\mathbf x'\). We can write \(\mathbf x'=\mathbf x+\mathbf u\) for some nonzero vector \(\mathbf u\) and note that \[\mathbf y=\mathbf A\mathbf x' = \mathbf A\mathbf x+\mathbf A\mathbf u= \mathbf y+\mathbf A\mathbf u\] and so \(\mathbf A\mathbf u=\boldsymbol 0\), i.e., \(\mathbf u\in \mathcal{N}(\mathbf A)\). So there are multiple solutions when the null-space of \(\mathbf A\) contains more than the zero vector, and a solution, when one exists, is unique precisely when \(\mathcal{N}(\mathbf A)=\{\boldsymbol 0\}\). If the dimension of \(\mathcal{N}(\mathbf A)\) is one, there is a line of solutions. If the dimension is two, there is a plane of solutions, etc.
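The plane-of-solutions case can be illustrated with the matrix \(\mathbf D\) from Example 2.10, whose null-space has dimension two. In the sketch below the right-hand side `y` and the particular solution `x_particular` are illustrative choices; adding any combination of the null-space vectors to a solution gives another solution.

```python
D = [[1, 2, 3], [2, 4, 6]]
y = [6, 12]                            # illustrative right-hand side in C(D)
x_particular = [6, 0, 0]               # one solution of Dx = y
null_basis = [[0, 3, -2], [2, -1, 0]]  # spans N(D), from Example 2.10

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def shifted(s, t):
    """x_particular + s * u1 + t * u2, a point on the plane of solutions."""
    u1, u2 = null_basis
    return [xp + s * a + t * b for xp, a, b in zip(x_particular, u1, u2)]

# Try a handful of coefficient pairs; each should give a distinct solution.
solutions = [shifted(s, t) for s in (-1, 0, 2) for t in (0, 1)]
all_solve = all(matvec(D, x) == y for x in solutions)

print(shifted(1, 1))  # [8, 2, -2]
print(all_solve)      # True
```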