4.5 Exercises

Consider the following data points in \(\mathbb R^2\):
\[\mathbf x_1 =\begin{pmatrix}1\\-1\end{pmatrix},\; \mathbf x_2 =\begin{pmatrix}-1\\1\end{pmatrix}, \;\mathbf x_3 =\begin{pmatrix}2\\2\end{pmatrix}\]

What is the orthogonal projection of these points onto \[\mathbf u_1 = \begin{pmatrix}1\\0\end{pmatrix}\] and onto \[\mathbf u_2 =\frac{1}{\sqrt{5}}\begin{pmatrix}1\\2\end{pmatrix}?\]
Compute the sample variance matrix of the data points, and compute its spectral decomposition.
Which unit vector \(\mathbf u\) would maximize the variance of these projections?
Which unit vector \(\mathbf u\) would minimize \[\sum_{i=1}^3 ||\mathbf x_i -\mathbf u\mathbf u^\top \mathbf x_i||^2_2?\] This is the sum of squared errors from a rank 1 approximation to the data.
Plot the data points and convince yourself that your answers make intuitive sense.
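
One way to check hand calculations for this exercise is a short script. The following is a minimal sketch in plain Python, not a prescribed method: the helper names `project`, `projected_variance` and `reconstruction_error` are hypothetical, and the variance uses the divisor \(n-1\), which is an assumption (swap in \(n\) if that is the convention you are working with).

```python
import math

# The three data points listed in the exercise.
points = [(1.0, -1.0), (-1.0, 1.0), (2.0, 2.0)]


def project(x, u):
    """Orthogonal projection of x onto the line spanned by the unit vector u."""
    score = x[0] * u[0] + x[1] * u[1]      # the scalar u^T x
    return (score * u[0], score * u[1])    # the vector (u u^T) x


def projected_variance(xs, u):
    """Sample variance of the scores u^T x_i (assumed divisor: n - 1)."""
    scores = [x[0] * u[0] + x[1] * u[1] for x in xs]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)


def reconstruction_error(xs, u):
    """Sum of squared distances between each x_i and u u^T x_i."""
    total = 0.0
    for x in xs:
        p = project(x, u)
        total += (x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2
    return total


u1 = (1.0, 0.0)
u2 = (1 / math.sqrt(5), 2 / math.sqrt(5))

for name, u in (("u1", u1), ("u2", u2)):
    print(name)
    print("  projections:", [project(x, u) for x in points])
    print("  variance of scores:", projected_variance(points, u))
    print("  sum of squared errors:", reconstruction_error(points, u))
```

Trying a few other unit vectors, for example \((\cos\theta, \sin\theta)\) over a grid of angles, gives a numerical feel for which direction maximizes the variance and which minimizes the squared error before comparing with the spectral decomposition.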

The following eigenvalues were obtained from a principal component analysis:

##  [1] 4.22 2.38 1.88 1.11 0.91 0.82 0.58 0.44 0.35 0.19 0.05 0.04 0.04

Sketch a scree plot.
Determine the minimum number of principal components needed to explain 90% of the total variation.
Determine the number of principal components whose eigenvalues are above average.
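
The two selection rules above can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, and the values are assumed to be eigenvalues sorted in decreasing order, as in the printed output.

```python
# Eigenvalues copied from the output above (assumed sorted, largest first).
eigenvalues = [4.22, 2.38, 1.88, 1.11, 0.91, 0.82, 0.58,
               0.44, 0.35, 0.19, 0.05, 0.04, 0.04]


def cumulative_proportions(values):
    """Cumulative proportion of total variation explained by the first k values."""
    total = sum(values)
    running = 0.0
    proportions = []
    for v in values:
        running += v
        proportions.append(running / total)
    return proportions


def components_needed(values, threshold):
    """Smallest k whose cumulative proportion is at least `threshold`."""
    for k, prop in enumerate(cumulative_proportions(values), start=1):
        if prop >= threshold:
            return k
    return len(values)


def count_above_average(values):
    """Number of eigenvalues strictly greater than their mean."""
    mean = sum(values) / len(values)
    return sum(1 for v in values if v > mean)


print("cumulative proportions:",
      [round(p, 3) for p in cumulative_proportions(eigenvalues)])
print("components for 90%:", components_needed(eigenvalues, 0.90))
print("eigenvalues above average:", count_above_average(eigenvalues))
```

The cumulative proportions printed here are the same quantities you would read off a scree plot when looking for an elbow, so they are a useful companion to the sketch.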

Measurements are taken on \(p=3\) variables \(x_1\), \(x_2\) and \(x_3\), with sample correlation matrix \[ \mathbf R= \begin{pmatrix} 1 & 0.5792 & 0.2414 \\ 0.5792 & 1 & 0.5816 \\ 0.2414 & 0.5816 & 1 \end{pmatrix}. \] The variable \(z_j\) is the standardised version of \(x_j\), \(j=1,2,3\), i.e. each \(z_j\) has sample mean \(0\) and variance \(1\). One observation has \(z_1 = z_2 = z_3 = 0\) and a second observation has \(z_1 = z_2 = z_3 =1\). Calculate the three principal component scores for each of these observations.
Do exam question 1 part (a) from the 2017-18 exam paper. You will find the past exam papers on Moodle.
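
For the correlation-matrix exercise above, it may help to write the score definition out explicitly. As a sketch, assume the spectral decomposition \(\mathbf R = \mathbf V \boldsymbol\Lambda \mathbf V^\top\), where the columns \(\mathbf v_1, \mathbf v_2, \mathbf v_3\) of \(\mathbf V\) are unit eigenvectors ordered by decreasing eigenvalue. The symbols \(\mathbf V\), \(\boldsymbol\Lambda\), \(\mathbf v_k\) and \(y_k\) are notation introduced here for illustration and are not taken from the exercise. The \(k\)th principal component score of a standardised observation \(\mathbf z = (z_1, z_2, z_3)^\top\) is then
\[ y_k = \mathbf v_k^\top \mathbf z = \sum_{j=1}^3 v_{jk} z_j, \qquad k = 1, 2, 3. \]
Under this definition an observation with \(\mathbf z = \mathbf 0\) has every score equal to zero, and an observation with \(z_1 = z_2 = z_3 = 1\) has \(y_k\) equal to the sum of the entries of \(\mathbf v_k\). Each eigenvector is determined only up to sign, so the sign of each score is arbitrary as well.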