7.5 Exercises

If \(\mathbf M\sim W_p(\boldsymbol{\Sigma},n)\), prove that \({\mathbb{E}}\mathbf M= n \boldsymbol{\Sigma}\).
If \(\mathbf x\sim N_p({\boldsymbol{\mu}},\boldsymbol{\Sigma})\) and \(\mathbf a\) is any fixed vector of dimension \(p\), show that \[ z = \frac{ \mathbf a^\top (\mathbf x- {\boldsymbol{\mu}}) }{ \sqrt{\mathbf a^\top \boldsymbol{\Sigma}\mathbf a} } \sim N(0,1).\]
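If \(\mathbf x_1, \ldots, \mathbf x_n\) are i.i.d. \(N_p({\boldsymbol{\mu}}, \boldsymbol{\Sigma})\) with sample mean \(\bar{\mathbf x}\), prove that \[ n ( \bar{\mathbf x} - {\boldsymbol{\mu}})^\top \boldsymbol{\Sigma}^{-1} ( \bar{\mathbf x} - {\boldsymbol{\mu}}) \sim \chi_p^2 . \]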
If \(\mathbf x_1, \ldots, \mathbf x_n\) are i.i.d. \(N_p({\boldsymbol{\mu}}, \boldsymbol{\Sigma})\) with sample mean \(\bar{\mathbf x}\) and sample covariance matrix \(\mathbf S\), show that \({\mathbb{C}\operatorname{ov}}(\bar{\mathbf x}, \mathbf x_i - \bar{\mathbf x}) = {\boldsymbol 0}\). Hence deduce that \(\bar{\mathbf x}\) and \(\mathbf S\) are independent.

Hint: If the random vector \(\mathbf x\) is independent of the random vector \(\mathbf y\), then \(\mathbf x\) is also independent of any function \(f(\mathbf y)\).
A survey of \(n=25\) families records the head length of the first (\(x_1\)) and second (\(x_2\)) sons. The sample mean is \(\bar{\mathbf x} = (185.72, 183.84)^\top\). Assume that the observations are sampled from \(N_2({\boldsymbol{\mu}},\boldsymbol{\Sigma})\) where \(\boldsymbol{\Sigma}= \text{diag}(100,100)\).
1. Conduct a hypothesis test of \({\boldsymbol{\mu}}= (182,182)^\top\).
2. Show that the confidence region for \({\boldsymbol{\mu}}\) is circular. Find its centre and radius.
3. Repeat the hypothesis test and sketch the confidence region if we assume that \[\boldsymbol{\Sigma}= \begin{pmatrix} 100 & 50 \\ 50 & 100 \end{pmatrix}.\]
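As a numerical check on the head-length exercise, the sketch below evaluates the statistic \(n(\bar{\mathbf x}-{\boldsymbol{\mu}}_0)^\top \boldsymbol{\Sigma}^{-1}(\bar{\mathbf x}-{\boldsymbol{\mu}}_0)\) for both covariance matrices, using only the Python standard library. It relies on the fact that the upper tail of a \(\chi^2_2\) distribution is \(\exp(-x/2)\). The 5% level (`alpha`) and the helper names `quad_form_2x2` and `chi2_2dof_sf` are assumptions made for illustration; they do not come from the exercise.

```python
import math

def quad_form_2x2(d, sigma):
    """Return d^T sigma^{-1} d for a 2-vector d and a 2x2 matrix sigma."""
    (a, b), (c, e) = sigma
    det = a * e - b * c
    # The inverse of a 2x2 matrix is (1/det) * [[e, -b], [-c, a]].
    return (e * d[0] * d[0] - (b + c) * d[0] * d[1] + a * d[1] * d[1]) / det

def chi2_2dof_sf(x):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

n = 25
xbar = (185.72, 183.84)
mu0 = (182.0, 182.0)
d = (xbar[0] - mu0[0], xbar[1] - mu0[1])

sigma_diag = ((100.0, 0.0), (0.0, 100.0))
sigma_corr = ((100.0, 50.0), (50.0, 100.0))

# Test statistics, each compared with a chi-squared(2) distribution under H0.
stat_diag = n * quad_form_2x2(d, sigma_diag)
stat_corr = n * quad_form_2x2(d, sigma_corr)
p_diag = chi2_2dof_sf(stat_diag)
p_corr = chi2_2dof_sf(stat_corr)

# Assumed significance level; the exercise does not fix one.
alpha = 0.05
crit = -2.0 * math.log(alpha)  # upper alpha-quantile of chi-squared(2)

# With Sigma = 100 * I the region n * |mu - xbar|^2 / 100 <= crit is a disc.
radius = math.sqrt(crit * 100.0 / n)

print(f"diagonal Sigma:   statistic = {stat_diag:.3f}, p-value = {p_diag:.3f}")
print(f"correlated Sigma: statistic = {stat_corr:.3f}, p-value = {p_corr:.3f}")
print(f"critical value = {crit:.3f}, radius of circular region = {radius:.3f}")
```

The closed-form tail probability only holds for two degrees of freedom; for general \(p\) a table or a statistics library would be needed.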
Let \(\mathbf x_1, \ldots, \mathbf x_{20}\) be a random sample of vectors from a \(N_3({\boldsymbol{\mu}},\boldsymbol{\Sigma})\) population where \({\boldsymbol{\mu}}\) and \(\boldsymbol{\Sigma}\) are unknown. The sample mean and sample covariance matrix are given by \[\bar{\mathbf x} = \begin{pmatrix} 0.358 \\ -1.056 \\ -1.795 \end{pmatrix} \qquad \mathbf S= \begin{pmatrix} 0.522 & 0.556 & -2.285 \\ 0.556 & 3.258 & -0.765 \\ -2.285 & -0.765 & 14.093 \end{pmatrix}.\]
1. Use Hotelling’s \(T^2\) distribution to perform a significance test of the hypothesis \(H_0: {\boldsymbol{\mu}}= (0,-1,-1)^\top\). Note that \(\mathbf S= \mathbf V\boldsymbol \Lambda\mathbf V^\top\) where \(\boldsymbol \Lambda= \text{diag}(14.531, 3.253,0.090)\) and \[\mathbf V= \begin{pmatrix} -0.163 & -0.121 & -0.979 \\ -0.075 & -0.988 & 0.135 \\ 0.984 & -0.095 & -0.152 \end{pmatrix}.\]
2. Let \({\boldsymbol{\mu}}= (\mu_1,\mu_2,\mu_3)^\top\). Perform separate (univariate) \(t\)-tests of the following hypotheses: \(\mu_1 = 0\); \(\mu_2 = -1\); \(\mu_3 = -1\). Compare the results of the individual tests with the combined test based on Hotelling’s \(T^2\) distribution in part 1. Comment briefly.
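The spectral decomposition supplied in part 1 makes the quadratic form easy to evaluate without inverting \(\mathbf S\): since \(\mathbf S^{-1} = \mathbf V\boldsymbol \Lambda^{-1}\mathbf V^\top\), we have \(\mathbf d^\top \mathbf S^{-1}\mathbf d = \sum_j (\mathbf v_j^\top \mathbf d)^2/\lambda_j\), where \(\mathbf d = \bar{\mathbf x} - {\boldsymbol{\mu}}_0\) and \(\mathbf v_j\) is the \(j\)th column of \(\mathbf V\). The sketch below computes this quantity in plain Python. How it is scaled to an \(F_{p,n-p}\) statistic depends on whether \(\mathbf S\) uses divisor \(n\) or \(n-1\); the exercise does not say, so both scalings are shown, and the variable names are illustrative assumptions.

```python
n, p = 20, 3
xbar = (0.358, -1.056, -1.795)
mu0 = (0.0, -1.0, -1.0)

# Eigenvalues and eigenvectors of S as given in the exercise (columns of V).
eigvals = (14.531, 3.253, 0.090)
V = (
    (-0.163, -0.121, -0.979),
    (-0.075, -0.988, 0.135),
    (0.984, -0.095, -0.152),
)

d = [x - m for x, m in zip(xbar, mu0)]

# q = d^T S^{-1} d = sum over j of (v_j^T d)^2 / lambda_j.
q = 0.0
for j in range(p):
    proj = sum(V[i][j] * d[i] for i in range(p))  # v_j^T d
    q += proj ** 2 / eigvals[j]

# ASSUMPTION A: S has divisor n, so T^2 = (n - 1) * q.
f_divisor_n = (n - p) / p * q

# ASSUMPTION B: S has divisor n - 1 (unbiased), so T^2 = n * q.
f_divisor_n_minus_1 = n * (n - p) / (p * (n - 1)) * q

print(f"d^T S^-1 d = {q:.4f}")
print(f"F statistic if S has divisor n:     {f_divisor_n:.3f}")
print(f"F statistic if S has divisor n - 1: {f_divisor_n_minus_1:.3f}")
print(f"compare with the F distribution on ({p}, {n - p}) degrees of freedom")
```

Note that the smallest eigenvalue dominates the sum, so the direction \(\mathbf v_3\) contributes most of the evidence against \(H_0\); this is worth bearing in mind when comparing with the univariate tests in part 2.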
Two measurements were collected on each of 36 flea-beetles; 18 of the beetles were from a species called Chaetocnema concinna and the other 18 were from another species called Chaetocnema heikertingeri. The first variable consisted of the sum of widths (in micrometres) of the first joints of the first two tarsi (“feet”); and the second variable consisted of the corresponding sum for the second joints. It is of interest to know whether or not the population means of the two species are different.

The sample means are \(\bar{\mathbf x}_1 = (181.50,129.17)^\top\) and \(\bar{\mathbf x}_2 = (205.06,120.44)^\top\); and the sample covariance matrices are \[\mathbf S_1 = \begin{pmatrix} 120.58 & 56.25 \\ 56.25 & 44.63 \end{pmatrix} \qquad \mathbf S_2 = \begin{pmatrix} 203.94 & 73.42 \\ 73.42 & 47.14 \end{pmatrix}.\] Conduct a suitable hypothesis test. State your conclusion in words. What assumptions have you made in constructing the test? Do any of these assumptions seem suspect with these data?
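One candidate for the flea-beetle data is the two-sample Hotelling \(T^2\) test with a pooled covariance matrix. The following is a minimal sketch of that calculation in plain Python, not a definitive solution. The function name `two_sample_t2` and its `unbiased` flag are hypothetical: the flag selects whether \(\mathbf S_1\) and \(\mathbf S_2\) are taken to have divisor \(n_i - 1\) (the default assumed here) or divisor \(n_i\), since the exercise does not state the convention. The pooled estimate also presupposes equal population covariance matrices, which is one of the assumptions the exercise asks you to examine.

```python
def two_sample_t2(xbar1, xbar2, S1, S2, n1, n2, unbiased=True):
    """Two-sample Hotelling T^2 for bivariate data; returns (T2, F, df1, df2).

    unbiased=True  assumes each S_i uses divisor n_i - 1.
    unbiased=False assumes each S_i uses divisor n_i.
    """
    p = 2
    w1, w2 = (n1 - 1, n2 - 1) if unbiased else (n1, n2)
    dof = n1 + n2 - 2

    # Pooled covariance estimate (assumes a common population covariance).
    Sp = [[(w1 * S1[i][j] + w2 * S2[i][j]) / dof for j in range(p)]
          for i in range(p)]

    d = (xbar1[0] - xbar2[0], xbar1[1] - xbar2[1])

    # d^T Sp^{-1} d via the explicit 2x2 inverse.
    det = Sp[0][0] * Sp[1][1] - Sp[0][1] * Sp[1][0]
    q = (Sp[1][1] * d[0] ** 2
         - (Sp[0][1] + Sp[1][0]) * d[0] * d[1]
         + Sp[0][0] * d[1] ** 2) / det

    T2 = n1 * n2 / (n1 + n2) * q
    # Under H0, F follows an F distribution on (p, n1 + n2 - p - 1) d.f.
    F = (n1 + n2 - p - 1) / (dof * p) * T2
    return T2, F, p, n1 + n2 - p - 1

xbar1 = (181.50, 129.17)
xbar2 = (205.06, 120.44)
S1 = ((120.58, 56.25), (56.25, 44.63))
S2 = ((203.94, 73.42), (73.42, 47.14))

T2, F, df1, df2 = two_sample_t2(xbar1, xbar2, S1, S2, 18, 18)
print(f"T^2 = {T2:.2f}, F = {F:.2f} on ({df1}, {df2}) degrees of freedom")
```

Calling the function with `unbiased=False` shows how much the result depends on the divisor convention; with equal sample sizes it only rescales the statistic by a constant factor.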