1.5 Exercises
Show that the two formulae for the population covariance matrix \(\boldsymbol{\Sigma}\) are equivalent, i.e. show that \[\boldsymbol{\Sigma}= {\mathbb{E}}[(\mathbf x- {\boldsymbol{\mu}}) (\mathbf x- {\boldsymbol{\mu}})^\top ] = {\mathbb{E}}[ \mathbf x\mathbf x^\top ] - {\boldsymbol{\mu}}{\boldsymbol{\mu}}^\top.\]
Let \(\mathbf x_1, \ldots, \mathbf x_n\) be a \(p\)-dimensional sample with mean \(\bar{\mathbf x}\) and sample covariance matrix \(\mathbf S\). Consider the transformation \(\mathbf y_i = \mathbf A\mathbf x_i + \mathbf c\) where \(\mathbf A\) is a fixed \(q \times p\) matrix and \(\mathbf c\) is a fixed \(q\)-dimensional vector. Let \(\bar{\mathbf y}\) and \(\mathbf T\) be the sample mean and sample covariance matrix of \(\mathbf y_1, \ldots, \mathbf y_n\). Show that
\(\bar{\mathbf y} = \mathbf A\bar{\mathbf x} + \mathbf c\),
\(\mathbf T= \mathbf A\mathbf S\mathbf A^\top\).
Now assume that \(\mathbf x\) is a random vector with \({\mathbb{E}}(\mathbf x)={\boldsymbol{\mu}}\) and \({\mathbb{V}\operatorname{ar}}(\mathbf x)=\boldsymbol{\Sigma}\), and let \(\mathbf y=\mathbf A\mathbf x+\mathbf c\) with \(\mathbf A\) and \(\mathbf c\) as before. Let \({\mathbb{E}}(\mathbf y)={\pmb \phi}\) and \({\mathbb{V}\operatorname{ar}}(\mathbf y)={\pmb \Omega}\) denote the population mean and covariance matrix of \(\mathbf y\). What are the analogous population-level versions of the sample-based results above (i.e. express \({\pmb \phi}\) and \({\pmb \Omega}\) in terms of \({\boldsymbol{\mu}}\), \(\boldsymbol{\Sigma}\), \(\mathbf A\) and \(\mathbf c\))?
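The two sample-level identities in this exercise can be sanity-checked numerically before attempting the proof. The following is a minimal sketch in Python with NumPy, not part of the original notes: the dimensions, the random seed and the simulated data are arbitrary assumptions, and the sample covariance is computed with divisor \(n\) (the identities hold equally with divisor \(n-1\)).

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n, p, q = 50, 3, 2               # illustrative sizes (assumptions)

X = rng.normal(size=(n, p))      # rows are the observations x_i^T
A = rng.normal(size=(q, p))      # fixed q x p matrix
c = rng.normal(size=q)           # fixed q-dimensional vector

Y = X @ A.T + c                  # rows are y_i^T = (A x_i + c)^T

x_bar = X.mean(axis=0)
y_bar = Y.mean(axis=0)
S = np.cov(X, rowvar=False, bias=True)   # sample covariance of the x_i (divisor n)
T = np.cov(Y, rowvar=False, bias=True)   # sample covariance of the y_i (divisor n)

mean_ok = np.allclose(y_bar, A @ x_bar + c)
cov_ok = np.allclose(T, A @ S @ A.T)
print(mean_ok, cov_ok)
```

A numerical check of this kind only confirms the identities for one simulated data set; it does not replace the algebraic argument the exercise asks for.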
A sample of size \(n=144\) produced the following summary statistics: \[ \sum_{i=1}^n \mathbf x_i = \begin{pmatrix} 392.2 \\ 1530.8 \end{pmatrix}, \qquad \sum_{i=1}^n \mathbf x_i \mathbf x_i^\top = \begin{pmatrix} 1101.88 & 4305.17 \\ 4305.17 & 17120.88 \end{pmatrix}.\] Calculate the sample mean, the sample covariance matrix and the sample correlation coefficient.
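A hand calculation for this exercise can be checked with a few lines of code. The sketch below, in Python with NumPy, assumes the divisor-\(n\) convention for the sample covariance matrix, so that \(\mathbf S = \frac{1}{n}\sum_{i=1}^n \mathbf x_i \mathbf x_i^\top - \bar{\mathbf x}\bar{\mathbf x}^\top\) (the sample analogue of the identity in the first exercise); if your convention uses \(n-1\), rescale as shown. The variable names are illustrative.

```python
import numpy as np

n = 144
sum_x = np.array([392.2, 1530.8])                # sum of the x_i
sum_xxT = np.array([[1101.88, 4305.17],
                    [4305.17, 17120.88]])        # sum of the x_i x_i^T

x_bar = sum_x / n                                # sample mean

# Assumption: divisor n, i.e. S = (1/n) sum x_i x_i^T - x_bar x_bar^T
S = sum_xxT / n - np.outer(x_bar, x_bar)

# Alternative convention with divisor n - 1
S_unbiased = S * n / (n - 1)

# The correlation coefficient does not depend on the choice of divisor
r = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])

print(x_bar)
print(S)
print(r)
```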
Let \(\mathbf x\) and \(\mathbf y\) be independent random \(p\)-dimensional vectors. Assuming that all relevant moments exist, show that for any real scalars \(\alpha\) and \(\beta\), \[{\mathbb{V}\operatorname{ar}}(\alpha \mathbf x+ \beta \mathbf y) = \alpha^2 {\mathbb{V}\operatorname{ar}}(\mathbf x) + \beta^2 {\mathbb{V}\operatorname{ar}}(\mathbf y).\]
What is the corresponding formula when \(\mathbf x\) and \(\mathbf y\) are not independent? Express your answer in terms of \({\mathbb{V}\operatorname{ar}}(\mathbf x)\), \({\mathbb{V}\operatorname{ar}}(\mathbf y)\) and \({\mathbb{C}\operatorname{ov}}(\mathbf x, \mathbf y)\).
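To see why the independence assumption matters, it can help to compare both sides of the displayed formula on simulated data. The sketch below (Python with NumPy) is an illustration only: the distributions, sample size, seed, the values of \(\alpha\) and \(\beta\), and the helper names `sample_var` and `gap` are all assumptions made for the example. It reports the largest entrywise difference between the sample version of \({\mathbb{V}\operatorname{ar}}(\alpha \mathbf x+ \beta \mathbf y)\) and \(\alpha^2 {\mathbb{V}\operatorname{ar}}(\mathbf x) + \beta^2 {\mathbb{V}\operatorname{ar}}(\mathbf y)\), once for independently generated vectors and once for dependent ones.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
n, p = 2000, 2                   # illustrative sample size and dimension
alpha, beta = 2.0, -1.0          # illustrative scalars


def sample_var(Z):
    """Sample covariance matrix of the rows of Z (divisor n)."""
    return np.cov(Z, rowvar=False, bias=True)


def gap(X, Y):
    """Largest entrywise difference between the two sides of the formula."""
    lhs = sample_var(alpha * X + beta * Y)
    rhs = alpha**2 * sample_var(X) + beta**2 * sample_var(Y)
    return np.abs(lhs - rhs).max()


X = rng.normal(size=(n, p))
Y_indep = rng.normal(size=(n, p))            # generated independently of X
Y_dep = X + 0.5 * rng.normal(size=(n, p))    # built from X, hence dependent on it

gap_indep = gap(X, Y_indep)
gap_dep = gap(X, Y_dep)
print(gap_indep, gap_dep)
```

In the independent case the gap is small and reflects sampling noise only; in the dependent case it does not shrink as \(n\) grows, and the missing terms are exactly what the second part of the exercise asks you to identify.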
You are given the following data points: \[\mathbf x_1 = \begin{pmatrix} 1\\ 2\end{pmatrix}, \quad \mathbf x_2 = \begin{pmatrix} 2 \\ 2\end{pmatrix}, \quad \mathbf x_3 = \begin{pmatrix} 3\\ 1\end{pmatrix}, \quad \mathbf x_4 = \begin{pmatrix} 0 \\3\end{pmatrix}.\] Write down the data matrix \(\mathbf X\), the sample mean vector, and the sample covariance matrix.
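Answers to this exercise can be checked numerically. The sketch below (Python with NumPy) assumes the usual convention that the data matrix \(\mathbf X\) is \(n \times p\) with \(\mathbf x_i^\top\) as its \(i\)th row, and computes the sample covariance matrix with both divisors, since conventions differ; use whichever matches your definition of \(\mathbf S\).

```python
import numpy as np

# Data matrix: one observation per row (assumed n x p convention)
X = np.array([[1, 2],
              [2, 2],
              [3, 1],
              [0, 3]], dtype=float)
n = X.shape[0]

x_bar = X.mean(axis=0)             # sample mean vector

Xc = X - x_bar                     # centred data
S_n = Xc.T @ Xc / n                # sample covariance, divisor n
S_n_minus_1 = Xc.T @ Xc / (n - 1)  # sample covariance, divisor n - 1

print(x_bar)
print(S_n)
print(S_n_minus_1)
```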