Chapter 2 Linear Models
In this chapter we consider linear regression methods. Suppose we have a response variable \(y\) that we would like to predict using covariate information \(\mathbf x=(x_1, \ldots, x_p)^\top \in \mathbb{R}^p\).
A linear model for \(y\) assumes that the mean of \(y\), \(\mu={\mathbb{E}}[y]\), is a linear function of \(\mathbf x\), i.e. \[ \mu = \mathbf x^\top \boldsymbol \beta, \] where \(\boldsymbol \beta\in \mathbb{R}^p\) is an unknown parameter vector to be estimated.
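For example, with \(p=2\), \(\boldsymbol \beta=(1,-2)^\top\) and \(\mathbf x=(3, 0.5)^\top\) (values chosen purely for illustration), the model gives \[ \mu = \mathbf x^\top \boldsymbol \beta = 3 \times 1 + 0.5 \times (-2) = 2. \]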
Notation
Suppose we have observations on \(n\) cases, \(\{\mathbf x_1, y_1\}, \ldots, \{\mathbf x_n, y_n\}\). The standard linear model is \[\begin{align} y_i &= \mathbf x_i^\top \boldsymbol \beta+\epsilon_i \tag{2.1}\\ &=\beta_1 x_{i1}+\ldots+\beta_p x_{ip}+\epsilon_i \end{align}\] for \(i=1, \ldots , n\), where \(\boldsymbol \beta\in \mathbb{R}^p\) is the unknown parameter vector we wish to estimate. We can write (2.1) in matrix form as \[\begin{equation} \mathbf y=\mathbf X\boldsymbol \beta+{\pmb \epsilon}, \tag{2.2} \end{equation}\] where \(\stackrel{n \times p}{\mathbf X}= \begin{pmatrix} - & \mathbf x_1^\top &-\\ &\vdots&\\ -&\mathbf x_n^\top&-\end{pmatrix}\) is the matrix of covariates, \(\stackrel{n \times 1}{\mathbf y}\) is the vector of univariate responses and \(\stackrel{n \times 1}{\pmb \epsilon}\) is the vector of univariate error terms.
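The following sketch simulates data from (2.2) in Python, just to make the dimensions concrete; the values of \(n\), \(p\), \(\boldsymbol \beta\) and the error scale are illustrative choices, not taken from the notes.

```python
import numpy as np

# Simulate from y = X beta + eps as in (2.2).
rng = np.random.default_rng(1)
n, p = 100, 3                          # illustrative sizes
X = rng.normal(size=(n, p))            # n x p covariate matrix; row i is x_i^T
beta = np.array([1.0, -2.0, 0.5])      # "true" parameter vector (illustrative)
eps = rng.normal(scale=0.3, size=n)    # error terms eps_1, ..., eps_n
y = X @ beta + eps                     # vector of responses y_1, ..., y_n

print(X.shape, y.shape)                # (100, 3) (100,)
```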
We will assume throughout this chapter that \(\mathbf y\) and \(\mathbf X\) have been centred, so that \(\bar{y}=0\) and the mean of each column of \(\mathbf X\) is zero. We do this to simplify the exposition: it lets us omit the intercept term, which would otherwise make the notation slightly messier. For uncentred data, we can think of the model as
\[y_i - \bar{y} = \beta_1 (x_{i1}-\bar{x}_1)+\ldots+\beta_p (x_{ip}-\bar{x}_p)+\epsilon_i\]
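To see why centring lets us drop the intercept, here is a small check (a sketch, assuming ordinary least squares is used for the fit; the simulated data are illustrative): the slope estimates from an uncentred fit with an explicit intercept column coincide with those from a centred fit without one.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 2
X = rng.normal(loc=5.0, size=(n, p))               # uncentred covariates
y = 2.0 + X @ np.array([1.0, -1.0]) + rng.normal(size=n)

# Uncentred fit with an explicit intercept column.
X1 = np.column_stack([np.ones(n), X])
coef_raw = np.linalg.lstsq(X1, y, rcond=None)[0]   # [intercept, slope_1, slope_2]

# Centred fit with no intercept.
Xc = X - X.mean(axis=0)                            # centre each column of X
yc = y - y.mean()                                  # centre y
coef_centred = np.linalg.lstsq(Xc, yc, rcond=None)[0]

print(np.allclose(coef_raw[1:], coef_centred))     # True: the slopes agree
```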