In a toxicology experiment, 28 mice were randomly assigned either to one of six treatment groups or to a control group, with four mice per group. The control group received no treatment. After the treatment period the liver weight of each mouse was measured. Note that the response is a continuous quantity, but the regressor is a discrete factor with 7 levels.
filepath <- "https://www.maths.nottingham.ac.uk/personal/pmzrdw/ToxicologyData.txt"
# Download the data from the internet
download.file(filepath, destfile = "ToxicologyData.txt", method = "curl")
ToxData <- read.table(file='ToxicologyData.txt', header=TRUE)
str(ToxData) # look at the data structure
## 'data.frame': 28 obs. of 2 variables:
## $ Weight : num 89.8 93.8 88.4 112.6 84.4 ...
## $ Treatment: Factor w/ 7 levels "Control","T1",..: 1 1 1 1 2 2 2 2 3 3 ...
# Don't worry about these commands - they just allow us to see the data
# structured in a nice way
library(reshape2)
ToxData$mouse = rep(1:4,7)
dcast(ToxData, mouse~Treatment, value.var = "Weight")
## mouse Control T1 T2 T3 T4 T5 T6
## 1 1 89.8 84.4 64.4 75.2 88.4 56.4 65.6
## 2 2 93.8 116.0 79.8 62.4 90.2 83.2 79.4
## 3 3 88.4 84.0 88.0 62.4 73.2 90.4 65.6
## 4 4 112.6 68.6 69.4 73.8 87.8 85.6 70.2
ToxData <- subset(ToxData, select = -c(mouse) )
What questions might we want to answer about this data set?
Check the data and plot it
plot(Weight ~ Treatment, data=ToxData) # Plot the data
This informally suggests that the treatments are having an effect. But with only 4 mice in each group, are the differences statistically significant?
How can we model these data? A simple model is to assume that each treatment group has a mean weight, but that this mean might be different for different treatment groups.
Let \(y_{ij}\) be the \(j^{th}\) mouse weight in treatment group \(i\).
\[ y_{ij}=\begin{cases}\mu_C+\epsilon_{ij}, & \text{if}~ i=\text{`Control'}\\ \mu_1+\epsilon_{ij}, & \text{if}~ i=1\\ \mu_2+\epsilon_{ij}, & \text{if}~ i=2\\ \mu_3+\epsilon_{ij}, & \text{if}~ i=3\\ \mu_4+\epsilon_{ij}, & \text{if}~ i=4\\ \mu_5+\epsilon_{ij}, & \text{if}~ i=5\\ \mu_6+\epsilon_{ij}, & \text{if}~ i=6 \end{cases} \] This model allows the mean responses \(\mu_i\)’s to be different for different treatments. This is an Analysis of Variance (ANOVA) model.
What is \(Z\) here? We need to decide on a way to stack the \(y_{ij}\) into an observation vector. If we set
\[ y= \left[ \begin{array}{c} 89.8 \\ 93.8 \\ 88.4 \\ 112.6 \\ 84.4 \\ \vdots \\ 70.2 \end{array} \right] , \;\;\;\; x= \left[ \begin{array}{c} C \\ C \\ C \\ C \\ 1 \\ \vdots \\ 6 \end{array} \right] , \;\;\;\; \beta= \left[ \begin{array}{c} \mu_C \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_6 \end{array} \right] \]
then the model in matrix form is
\[ y=\left[ \begin{array}{ccccc} 1&0& 0 & \dots & 0 \\ 1&0& 0 & \dots & 0 \\ 1&0& 0 & & \vdots \\ 1&0& 0 & & \\ 0&1& 0 & & \\ 0&1& 0 & & \\ 0&1& 0 & & \\ 0&1& 0 & & \\ 0&0& 1 & & \vdots \\ 0&0& 1 & \dots & 0 \\ \vdots & & & & \vdots \\ 0&0& 0 & \dots & 1 \\ 0&0& 0 & \dots & 1 \\ 0&0& 0 & \dots & 1 \\ 0&0& 0 & \dots & 1 \end{array} \right] \;\begin{bmatrix}\mu_C\\\mu_1 \\\mu_2\\\mu_3\\\mu_4\\\mu_5\\\mu_6 \end{bmatrix}+\epsilon. \] So, \(Z\) is a (28 \(\times\) 7) matrix of 0’s and 1’s (4 replicates of 7 treatments).
One can show that the least squares estimators are \[\hat{\mu}_i=\bar y_{i\cdot} = \frac{1}{4} \sum_j y_{ij}\] (the mean of the observations from group \(i\)).
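As a quick numerical check (a sketch using the ToxData object loaded above), the group means can be computed directly; they should agree with the coefficients reported by lm below.
# Least squares estimates: the sample mean of Weight within each Treatment group
tapply(ToxData$Weight, ToxData$Treatment, mean)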
It is possible to parameterise this model in a different way. Consider:
\[ y_{ij}=\begin{cases}\mu +\epsilon_{ij}, & \text{if}~ i=\text{`Control'}\\ \mu +\alpha_{i}+\epsilon_{ij}, & \text{if}~ i=1,\dots,6. \end{cases} \]
This is still a 7 parameter model with parameters \(\mu,\,\alpha _{1},\,\alpha _{2},\,\dots,\,\alpha _{6}\). Now \[ Z=\left[ \begin{array}{cccccc} 1 & 0 & 0 & \dots & 0 & 0 \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 1 & 1 & 0 & \dots & 0 & 0 \\ 1 & 1 & 0 & \dots & 0 & 0 \\ 1 & 1 & 0 & \dots & 0 & 0 \\ 1 & 1 & 0 & \dots & 0 & 0 \\ 1 & 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & 0 & 0 & \dots & 0 & 1 \end{array} \right] . \]
We can show that \(\hat{\mu}=\bar{y}_{C\cdot}\) and \(\hat{\alpha}_{i}=\bar{y}_{i\cdot}-\bar{y}_{C\cdot}\), and thus the residuals are \(y_{ij}-\bar{y}_{i\cdot}\), the same as in model 1.
In this case model 2 is simply a reparametrisation of model 1, i.e., \[ \begin{aligned} \mu &=\mu_{C} \\ \mu +\alpha _{i} &=\mu _{i}. \end{aligned} \]
We can fit model 1 in R using the command
fit <- lm(Weight ~ Treatment - 1, data = ToxData)
fit
##
## Call:
## lm(formula = Weight ~ Treatment - 1, data = ToxData)
##
## Coefficients:
## TreatmentControl TreatmentT1 TreatmentT2 TreatmentT3
## 96.15 88.25 75.40 68.45
## TreatmentT4 TreatmentT5 TreatmentT6
## 84.90 78.90 70.20
The formula used contains a \(-1\) as otherwise R automatically assumes an intercept term. If we don’t use a -1 then we fit model 2 instead. Try
fit2 <- lm(Weight ~ Treatment, data=ToxData)
Note that we can see the design matrix used by lm by typing
model.matrix(fit)
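To see the reparametrisation concretely, here is a quick check (a sketch using fit and fit2 as defined above) that the two fits describe the same group means.
coef(fit)                                # model 1: the seven group means
coef(fit2)[1] + c(0, coef(fit2)[-1])     # model 2: mu plus the alphas gives the same means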
Let's consider testing the two models \[\begin{aligned} &M_0: y_{ij} = \mu+\epsilon_{ij}\\ &\mbox{ vs }M_1: y_{ij}= \mu+\alpha_i+\epsilon_{ij} \end{aligned}\] which is equivalent to testing \[\begin{aligned} &H_0:\alpha_1=\ldots = \alpha_6=0\\ &\mbox{ vs }H_1: \alpha_i \mbox{ arbitrary} \end{aligned}\]
We use an F-test to do this (\(q=6\) as we must constrain 6 parameters to go from \(M_1\) to \(M_0\)). The anova command is the easiest way to do this in R (although you could use linearHypothesis if you wished).
fit <- lm(Weight ~ Treatment-1, data=ToxData)
anova(fit)
## Analysis of Variance Table
##
## Response: Weight
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatment 7 183059 26151.3 179.32 < 2.2e-16 ***
## Residuals 21 3063 145.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Be careful with the interpretation here: because fit was fitted without an intercept, the F value of 179.32 on 7 and 21 degrees of freedom tests the stronger null hypothesis that all seven group means are zero, and that is rejected overwhelmingly (p-value \(< 2.2\times 10^{-16}\)). The test of \(H_0:\alpha_1=\dots = \alpha_6=0\) against \(H_1\) constrains \(q=6\) parameters; its F-statistic is 2.761 on 6 and 21 degrees of freedom with p-value 0.0387 (see the summary(fit2) output below), so there is still evidence at the 5% level to reject \(H_0\) in favour of \(H_1\).
Can you calculate these F statistics by hand?
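Here is a sketch of the calculation in R, using fit from above and comparing residual sums of squares of nested models:
Q  <- deviance(fit)                             # RSS under M1 (same as deviance(fit2))
QH <- deviance(lm(Weight ~ 1, data = ToxData))  # RSS under M0: common mean only
F1 <- ((QH - Q) / 6) / (Q / 21)                 # about 2.761, as in summary(fit2)
1 - pf(F1, 6, 21)                               # p-value, about 0.039
# The 179.32 in the anova table instead compares M1 against the model
# in which all seven means are zero:
Q0 <- sum(ToxData$Weight^2)
((Q0 - Q) / 7) / (Q / 21)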
Suppose we want to test \(H_{0}: \alpha_1=0\) vs \(H_{1}:\alpha_1\) arbitrary.
We can use a simple \(t\)-test here (we could use an F-test, but since \(q=1\) and \(F_{1,n-p} = (t_{n-p})^2\), the two tests are equivalent). The test statistic is \[ T=\frac{\mathbf{a}^{T}\hat{\beta}-\mathbf{a}^{T} \beta }{s \sqrt{\mathbf{a}^{T}\left(Z^{T}Z\right)^{-1}\mathbf{a}}} \] with \(\mathbf{a}^{T}=(0,1,0,0,0,0,0)\).
Our \(T\) statistic is therefore \[ T=\frac{\widehat{\alpha}_{1}}{\textrm{std.error}(\widehat{\alpha}_{1})}, \quad\text{where}\quad \textrm{std.error}(\widehat{\alpha}_{1})= s\sqrt{d_{ii}} \] and \(d_{ii}\) is the diagonal entry of \((Z^{T}Z)^{-1}\) corresponding to \(\alpha_1\).
R tells us that \(s=12.076\), \(\widehat\alpha_1=-7.9\), and \(d_{ii}=0.5\).
coef(fit2)
## (Intercept) TreatmentT1 TreatmentT2 TreatmentT3 TreatmentT4 TreatmentT5
## 96.15 -7.90 -20.75 -27.70 -11.25 -17.25
## TreatmentT6
## -25.95
fit.sum <- summary(fit2)
fit.sum$sigma
## [1] 12.07629
solve(t(model.matrix(fit2))%*%model.matrix(fit2))
## (Intercept) TreatmentT1 TreatmentT2 TreatmentT3 TreatmentT4
## (Intercept) 0.25 -0.25 -0.25 -0.25 -0.25
## TreatmentT1 -0.25 0.50 0.25 0.25 0.25
## TreatmentT2 -0.25 0.25 0.50 0.25 0.25
## TreatmentT3 -0.25 0.25 0.25 0.50 0.25
## TreatmentT4 -0.25 0.25 0.25 0.25 0.50
## TreatmentT5 -0.25 0.25 0.25 0.25 0.25
## TreatmentT6 -0.25 0.25 0.25 0.25 0.25
## TreatmentT5 TreatmentT6
## (Intercept) -0.25 -0.25
## TreatmentT1 0.25 0.25
## TreatmentT2 0.25 0.25
## TreatmentT3 0.25 0.25
## TreatmentT4 0.25 0.25
## TreatmentT5 0.50 0.25
## TreatmentT6 0.25 0.50
Hence, \[ T=\frac{-7.9 }{12.0763\times\sqrt{0.5} }=-0.925. \]
This is not significant, as \(t_{21}(0.975)=2.0796\).
qt(0.975, df=21)
## [1] 2.079614
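Equivalently, we can assemble the pieces of the \(t\)-statistic in R (a sketch using fit2 and fit.sum from above):
alpha1 <- coef(fit2)["TreatmentT1"]        # alpha_1 hat = -7.9
s      <- fit.sum$sigma                    # s = 12.076
d11    <- solve(crossprod(model.matrix(fit2)))["TreatmentT1", "TreatmentT1"]  # 0.5
Tstat  <- alpha1 / (s * sqrt(d11))         # about -0.925
Tstat
2 * pt(-abs(Tstat), df = 21)               # two-sided p-value, about 0.365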
NB: there are \(28-7=21\) degrees of freedom for this hypothesis test. Note that all of the 7 groups are contributing to the estimation of \(\sigma^{2}\), whereas in the two-sample unpaired \(t\)-test (comparing group 1 with the control group) only two groups would be used, and so the number of degrees of freedom would be \(n_1+n_C-2=4+4-2=6\).
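For comparison, a two-sample pooled \(t\)-test using only the Control and T1 mice would look like this (a sketch; note that it has only 6 degrees of freedom):
two <- droplevels(subset(ToxData, Treatment %in% c("Control", "T1")))
t.test(Weight ~ Treatment, data = two, var.equal = TRUE)   # 4 + 4 - 2 = 6 df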
Note that the details above are just for illustrative purposes. To perform this test you can simply type
summary(fit2)
##
## Call:
## lm(formula = Weight ~ Treatment, data = ToxData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.500 -6.050 -1.175 5.688 27.750
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 96.150 6.038 15.924 3.38e-13 ***
## TreatmentT1 -7.900 8.539 -0.925 0.36540
## TreatmentT2 -20.750 8.539 -2.430 0.02415 *
## TreatmentT3 -27.700 8.539 -3.244 0.00389 **
## TreatmentT4 -11.250 8.539 -1.317 0.20188
## TreatmentT5 -17.250 8.539 -2.020 0.05631 .
## TreatmentT6 -25.950 8.539 -3.039 0.00624 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.08 on 21 degrees of freedom
## Multiple R-squared: 0.441, Adjusted R-squared: 0.2813
## F-statistic: 2.761 on 6 and 21 DF, p-value: 0.03871
and read off the answer.
Now let's consider the parameterisation \[y_{ij}=\mu_{i}+\epsilon_{ij},\] for \(i=C, 1,\dots,6\), \(j=1,\dots,4\) and \(\beta=(\mu_C, \mu_1,\dots,\mu_6)^T\). It is easy to show that \(Z^\top Z = 4 I_7\).
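A quick check in R, using the design matrix of fit from above:
crossprod(model.matrix(fit))   # t(Z) %*% Z: 4 on the diagonal, 0 elsewhere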
The general linear hypothesis test can test any hypothesis of the form \(\mathbf{A}\beta=\mathbf{c}\).
We use the R command linearHypothesis in the package car. Read about this function in the manual pages (type ?linearHypothesis in R).
Here are some examples:
Is there any difference between treatments 3 and 6?
\(H_0:\mu_3=\mu_6\)
vs
\(H_1:\beta\) arbitrary.
Here \(\mathbf{A}=[0,0,0,1,0,0,-1]_{1\times 7}\) and \(\mathbf{c}=[0]_{1\times 1}\). This test can equivalently be done with a \(t\)-test as there is only one constraint (as in the previous section).
Since \((Z^TZ)^{-1}=\tfrac{1}{4}I_7\), we have \(\mathbf{A}\left(Z^{T}Z\right)^{-1}\mathbf{A}^{T}=\tfrac{2}{4}\) and \(\mathbf{A}\hat{\beta}-\mathbf{c}=\hat{\mu}_3-\hat{\mu}_6=68.45-70.20=-1.75\), so after some multiplication
\[ \begin{aligned} Q_H-Q&=\left(\mathbf{A}\hat{\beta}-\mathbf{c}\right)^{T}\left[\mathbf{A}\left(Z^{T}Z\right)^{-1}\mathbf{A}^{T}\right]^{-1} \left(\mathbf{A}\hat{\beta}-\mathbf{c}\right)\\ &= -1.75 \times \left( \frac{2}{4}\right)^{-1} \times (-1.75)\\ &=6.125.\end{aligned} \]
and the residual sum of squares of the full model is \(Q=3062.6\):
deviance(fit)
## [1] 3062.57
and so \[F=\frac{(Q_H-Q)/1}{Q/21}= \frac{6.125}{3062.6/21}=0.042,\] which is not significant, since \(F_{1,21}(0.95)=4.32\), and so there is no evidence to reject the hypothesis that \(\mu_3=\mu_6\).
qf(0.95, df1=1, df2=21)
## [1] 4.324794
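The same arithmetic can be done directly in R (a sketch using fit and the formula for \(Q_H-Q\) above):
A     <- matrix(c(0, 0, 0, 1, 0, 0, -1), nrow = 1)          # hypothesis matrix
d     <- A %*% coef(fit) - 0                                 # A beta_hat - c, i.e. -1.75
ZtZ   <- crossprod(model.matrix(fit))
QHmQ  <- t(d) %*% solve(A %*% solve(ZtZ) %*% t(A)) %*% d     # Q_H - Q, about 6.125
Fstat <- (QHmQ / 1) / (deviance(fit) / 21)                   # about 0.042
1 - pf(Fstat, 1, 21)                                         # p-value, about 0.84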
Using linearHypothesis, all of this can be done in one step:
library(car)
A <- c(0, 0, 0, 1, 0, 0, -1)          # the single row of the hypothesis matrix A
rhs <- 0                              # the right-hand side c
hyp1 <- linearHypothesis(fit, A, rhs)
hyp1
## Linear hypothesis test
##
## Hypothesis:
## TreatmentT3 - TreatmentT6 = 0
##
## Model 1: restricted model
## Model 2: Weight ~ Treatment - 1
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 22 3068.7
## 2 21 3062.6 1 6.125 0.042 0.8396
Make sure you know what all the numbers mean and how to calculate them yourself.
\(H_0:\mu_3=\mu_6=70\)
vs
\(H_1:\beta\) arbitrary.
This time, \[\mathbf{A}= \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 & 0\cr 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}_{2\times 7} \mbox{ and } \mathbf{c}= \begin{bmatrix} 70\cr 70 \end{bmatrix}_{2\times 1} \].
A <- rbind(c(0, 0, 0, 1, 0, 0, 0),    # picks out mu_3
           c(0, 0, 0, 0, 0, 0, 1))    # picks out mu_6
rhs <- c(70, 70)
hyp2 <- linearHypothesis(fit, A, rhs)
hyp2
## Linear hypothesis test
##
## Hypothesis:
## TreatmentT3 = 70
## TreatmentT6 = 70
##
## Model 1: restricted model
## Model 2: Weight ~ Treatment - 1
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 23 3072.3
## 2 21 3062.6 2 9.77 0.0335 0.9671
This is not significant at the 5% level, since the p-value is greater than 0.05, so there is no evidence against \(H_0:\mu_3=\mu_6=70\).
\(H_0:\mu_3+\mu_6=2\mu_1\), \(\mu_1=80\) and \(\mu_2=60\)
vs
\(H_1:\beta\) arbitrary.
Now \[\mathbf{A}= \begin{bmatrix} 0 & -2 & 0 & 1 & 0 & 0 & 1\cr 0 & 1 & 0 & 0 & 0 & 0 & 0\cr 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}_{3\times 7} \mbox{ and }\mathbf{c}= \begin{bmatrix} 0\cr 80\cr 60 \end{bmatrix}_{3\times 1}\] (writing the first constraint as \(\mu_3+\mu_6-2\mu_1=0\), to match the R output below).
A <- rbind(c(0, -2, 0, 1, 0, 0, 1),   # mu_3 + mu_6 - 2*mu_1 = 0
           c(0, 1, 0, 0, 0, 0, 0),    # mu_1 = 80
           c(0, 0, 1, 0, 0, 0, 0))    # mu_2 = 60
rhs <- c(0, 80, 60)
hyp3 <- linearHypothesis(fit, A, rhs)
hyp3
## Linear hypothesis test
##
## Hypothesis:
## - 2 TreatmentT1 + TreatmentT3 + TreatmentT6 = 0
## TreatmentT1 = 80
## TreatmentT2 = 60
##
## Model 1: restricted model
## Model 2: Weight ~ Treatment - 1
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 24 5195.1
## 2 21 3062.6 3 2132.5 4.8743 0.009998 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
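This time the p-value is about 0.01, so at the 5% level there is evidence to reject \(H_0\): at least one of the three constraints \(\mu_3+\mu_6=2\mu_1\), \(\mu_1=80\), \(\mu_2=60\) is inconsistent with the data.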