2.4 The Centering Matrix
The centering matrix will play an important role in this module: we will use it to remove the column means from a matrix (so that each column has mean zero), thereby centering the matrix.
Definition 2.12 The centering matrix is \[\begin{equation} \mathbf H=\mathbf I_n - \frac{1}{n} {\mathbf 1}_n {\mathbf 1}_n^\top, \tag{2.2} \end{equation}\] where \(\mathbf I_n\) is the \(n \times n\) identity matrix, and \({\mathbf 1}_n\) is an \(n \times 1\) column vector of ones.
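To make the definition concrete, here is a minimal NumPy sketch that builds \(\mathbf H\) directly from (2.2); the variable names `n` and `H` are our own choices for illustration:

```python
import numpy as np

n = 5  # illustrative sample size; any n works
# H = I_n - (1/n) 1_n 1_n^T, as in (2.2)
H = np.eye(n) - np.ones((n, n)) / n
print(H)
```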
You will be asked to prove the following results about \(\mathbf H\) in the exercises; short numerical sketches checking each property follow the list:
1. The matrix \(\mathbf H\) is a projection matrix, i.e. \(\mathbf H^2=\mathbf H\), and it is symmetric, \(\mathbf H^\top =\mathbf H\).
2. Writing \({\mathbf 0}_n\) for the \(n \times 1\) vector of zeros, we have \(\mathbf H{\mathbf 1}_n={\mathbf 0}_n\) and \({\mathbf 1}_n^\top \mathbf H={\mathbf 0}_n^\top\). In words: the sum of each row and each column of \(\mathbf H\) is \(0\).
3. If \(\mathbf x=(x_1, \ldots , x_n)^\top\), then \(\mathbf H\mathbf x= \mathbf x- \bar{x}{\mathbf 1}_n\), where \(\bar{x}=n^{-1}\sum_{i=1}^n x_i\). I.e., \(\mathbf H\) subtracts the mean \(\bar{x}\) from each entry of \(\mathbf x\).
4. With \(\mathbf x\) as in 3., we have \[ \mathbf x^\top \mathbf H\mathbf x= \sum_{i=1}^n (x_i-\bar{x})^2, \] and so \[ \frac{1}{n}\mathbf x^\top \mathbf H\mathbf x=\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2 = \hat{\sigma}^2, \] where \(\hat{\sigma}^2\) is the sample variance.
5. If \[\mathbf X=\left[\begin{array}{ccc}-&\mathbf x_1^\top&-\\ &\vdots& \\ -&\mathbf x_n^\top&-\end{array}\right] = [\mathbf x_1, \ldots, \mathbf x_n]^\top\] is an \(n \times p\) data matrix containing data points \(\mathbf x_1, \ldots, \mathbf x_n\in \mathbb{R}^p\), then \[ \mathbf H\mathbf X=\left[ \begin{array}{ccc} -&(\mathbf x_1-\bar{\mathbf x})^\top&-\\ -&(\mathbf x_2 -\bar{\mathbf x})^\top&-\\ &\vdots&\\ -&(\mathbf x_n - \bar{\mathbf x})^\top&- \end{array}\right ]= \left[ \mathbf x_1 -\bar{\mathbf x}, \ldots , \mathbf x_n-\bar{\mathbf x}\right]^\top \] where \[\bar{\mathbf x} = \frac{1}{n} \sum_{i=1}^n \mathbf x_i \in \mathbb{R}^p\] is the \(p\)-dimensional sample mean of \(\mathbf x_1, \ldots, \mathbf x_n\in \mathbb{R}^p\). In words, \(\mathbf H\) subtracts the column mean from each column of \(\mathbf X\).
6. With \(\mathbf X\) as in 5., \[ \frac{1}{n}\mathbf X^\top \mathbf H\mathbf X=\frac{1}{n} \sum_{i=1}^n (\mathbf x_i -\bar{\mathbf x})(\mathbf x_i -\bar{\mathbf x})^\top =\mathbf S, \] where \(\mathbf S\) is the sample covariance matrix.
7. If \(\mathbf A=(a_{ij})_{i,j=1}^n\) is a symmetric \(n \times n\) matrix, then \[ \mathbf B=\mathbf H\mathbf A\mathbf H= \mathbf A- {\mathbf 1}_n \bar{\mathbf a}_+^\top -\bar{\mathbf a}_+{\mathbf 1}_n^\top +\bar{a}_{++}{\mathbf 1}_n {\mathbf 1}_n^\top, \] or, equivalently, \[ b_{ij}=a_{ij}-\bar{a}_{i+}-\bar{a}_{+j}+\bar{a}_{++}, \qquad i,j=1, \ldots , n, \] where \[ \bar{\mathbf a}_{+}\equiv (\bar{a}_{1+}, \ldots , \bar{a}_{n+})^\top=\frac{1}{n}\mathbf A{\mathbf 1}_n, \] \(\bar{a}_{+j}=\bar{a}_{j+}\) for \(j=1, \ldots , n\) (by the symmetry of \(\mathbf A\)), and \(\bar{a}_{++}=n^{-2}\sum_{i,j=1}^n a_{ij}\).
Note that Property 3 is a special case of Property 5, and Property 4 is a special case of Property 6. However, it is useful to see these results in the simpler scalar case before moving on to the general matrix case.
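The following NumPy sketches check each property numerically; they are illustrations rather than proofs, and all variable names are our own. First, Properties 1 and 2 (projection, symmetry, and annihilation of \({\mathbf 1}_n\)):

```python
import numpy as np

n = 5
H = np.eye(n) - np.ones((n, n)) / n

# Property 1: H is a symmetric projection matrix.
print(np.allclose(H @ H, H))  # H^2 = H
print(np.allclose(H.T, H))    # H^T = H

# Property 2: H annihilates the vector of ones,
# so every row and column of H sums to zero.
print(np.allclose(H @ np.ones(n), np.zeros(n)))  # H 1_n = 0_n
```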
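Next, Properties 3 and 4: applying \(\mathbf H\) to a vector subtracts its mean, and the quadratic form \(n^{-1}\mathbf x^\top \mathbf H\mathbf x\) recovers the sample variance. Note that `np.var` divides by \(n\) by default, matching \(\hat{\sigma}^2\) above; the random test vector is just for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
H = np.eye(n) - np.ones((n, n)) / n
x = rng.normal(size=n)

# Property 3: H x = x - x̄ 1_n.
print(np.allclose(H @ x, x - x.mean()))

# Property 4: (1/n) x^T H x equals the sample variance.
print(np.allclose(x @ H @ x / n, x.var()))
```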
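Similarly for the matrix case, Properties 5 and 6. Here `np.cov` with `bias=True` divides by \(n\), matching the definition of \(\mathbf S\) above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
H = np.eye(n) - np.ones((n, n)) / n
X = rng.normal(size=(n, p))

# Property 5: H X subtracts each column's mean from that column.
print(np.allclose(H @ X, X - X.mean(axis=0)))

# Property 6: (1/n) X^T H X is the sample covariance matrix S.
print(np.allclose(X.T @ H @ X / n, np.cov(X, rowvar=False, bias=True)))
```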
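Finally, Property 7, the double-centering identity; the symmetric matrix `A` below is randomly generated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
H = np.eye(n) - np.ones((n, n)) / n

M = rng.normal(size=(n, n))
A = (M + M.T) / 2  # an arbitrary symmetric matrix

B = H @ A @ H  # double centering

row_means = A.mean(axis=1, keepdims=True)  # ā_{i+}, shape (n, 1)
col_means = A.mean(axis=0, keepdims=True)  # ā_{+j}, shape (1, n)
grand_mean = A.mean()                      # ā_{++}

# Entrywise: b_ij = a_ij - ā_{i+} - ā_{+j} + ā_{++}.
print(np.allclose(B, A - row_means - col_means + grand_mean))
```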