LECTURE 7: a priori orthogonal contrasts

class: center, middle, inverse, title-slide

.title[
# LECTURE 7: <em>a priori</em> orthogonal contrasts
]
.subtitle[
## FANR 6750 (Experimental design)
]
.author[
### <br/><br/><br/>Fall 2022
]

---

# motivation

> As the name *a priori* implies, these tests are planned **before** the experiment is done

<br/>
--

> Previously, we considered *a posteriori* tests, which often feature all possible comparisons, some of which may not be of interest

<br/>
--

> Further, hypotheses involving certain combinations of treatment groups are sometimes of interest. These cannot be tested using most multiple comparison procedures

<br/>
--

> Because tests that are not of interest are ignored, and the experiment is set up to test only certain hypotheses, orthogonal contrasts offer a more powerful procedure

---
# mussel size

<table class="table table-striped table-hover table-condensed" style="margin-left: auto; margin-right: auto;">
 <thead>
<tr><th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="4"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Watershed</div></th></tr>
  <tr>
   <th style="text-align:center;"> Twelvemile </th>
   <th style="text-align:center;"> Chattooga </th>
   <th style="text-align:center;"> Keowee </th>
   <th style="text-align:center;"> Coneross </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:center;"> 16 </td>
   <td style="text-align:center;"> 18 </td>
   <td style="text-align:center;"> 28 </td>
   <td style="text-align:center;"> 14 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 8 </td>
   <td style="text-align:center;"> 25 </td>
   <td style="text-align:center;"> 22 </td>
   <td style="text-align:center;"> 20 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 12 </td>
   <td style="text-align:center;"> 22 </td>
   <td style="text-align:center;"> 24 </td>
   <td style="text-align:center;"> 11 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 17 </td>
   <td style="text-align:center;"> 16 </td>
   <td style="text-align:center;"> 20 </td>
   <td style="text-align:center;"> 17 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 13 </td>
   <td style="text-align:center;"> 26 </td>
   <td style="text-align:center;"> 27 </td>
   <td style="text-align:center;"> 15 </td>
  </tr>
</tbody>
</table>

--
- Chattooga and Keowee are more forested

- Coneross and Twelvemile are more agricultural

---
# anova table

<br/>

---
# hypotheses

#### Questions

1) Do mussels from forested watersheds differ in size from mussels in agricultural watersheds?

2) Do the agricultural watersheds differ from one another?

3) Do the forested watersheds differ from one another?

#### Hypotheses

<table class="table table-condensed" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">  </th>
   <th style="text-align:left;"> Comparisons </th>
   <th style="text-align:center;"> H0 to test </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> Forested vs agricultural </td>
   <td style="text-align:center;"> $\frac{\mu_T + \mu_{Co}}{2} - \frac{\mu_{Ch} + \mu_K}{2}=0$ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 2 </td>
   <td style="text-align:left;"> Twelvemile vs Coneross </td>
   <td style="text-align:center;"> $\mu_{T} - \mu_{Co} = 0$ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 3 </td>
   <td style="text-align:left;"> Chattooga vs Keowee </td>
   <td style="text-align:center;"> $\mu_{Ch} - \mu_K = 0$ </td>
  </tr>
</tbody>
</table>

---
# hypotheses

#### Hypotheses

<table class="table table-condensed" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">  </th>
   <th style="text-align:left;"> Comparisons </th>
   <th style="text-align:center;"> H0 to test </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> 1 </td>
   <td style="text-align:left;"> Forested vs agricultural </td>
   <td style="text-align:center;"> $\frac{\mu_T + \mu_{Co}}{2} - \frac{\mu_{Ch} + \mu_K}{2}=0$ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 2 </td>
   <td style="text-align:left;"> Twelvemile vs Coneross </td>
   <td style="text-align:center;"> $\mu_{T} - \mu_{Co} = 0$ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 3 </td>
   <td style="text-align:left;"> Chattooga vs Keowee </td>
   <td style="text-align:center;"> $\mu_{Ch} - \mu_{K} = 0$ </td>
  </tr>
</tbody>
</table>

#### Linear combinations

---
# are these contrasts orthogonal?

A set of linear combinations is called a set of orthogonal contrasts if the following conditions hold for all pairs of linear combinations:

<br/>

--
Given

`$$\Large L_1 = a_1\mu_1 + a_2\mu_2 + ... + a_a\mu_a$$`
--

and

`$$\Large L_2 = b_1\mu_1 + b_2\mu_2 +...+ b_a\mu_a$$`
--

then `$L_1$` and `$L_2$` are orthogonal if:

`$$\large \sum_i a_i=0;\;\; \sum_i b_i=0; \;\;\text{and} \;\; \sum_i a_ib_i=0$$`
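
The three conditions can be checked mechanically. The following is a minimal sketch in Python, assuming integer coefficients and equal sample sizes in every group; the function name `are_orthogonal` is hypothetical and the coefficient vectors are purely illustrative.

```python
def are_orthogonal(a, b):
    """Return True if coefficient vectors a and b are a pair of orthogonal contrasts.

    Assumes integer coefficients, so exact comparison with 0 is safe.
    """
    if len(a) != len(b):
        raise ValueError("each contrast needs one coefficient per group")
    # Conditions 1 and 2: each set of coefficients sums to zero
    sums_to_zero = sum(a) == 0 and sum(b) == 0
    # Condition 3: the products of paired coefficients sum to zero
    products_sum_to_zero = sum(ai * bi for ai, bi in zip(a, b)) == 0
    return sums_to_zero and products_sum_to_zero


# Illustrative coefficients for four groups
print(are_orthogonal([1, -1, -1, 1], [1, 0, 0, -1]))  # True
print(are_orthogonal([1, -1, 0, 0], [1, 0, -1, 0]))   # False: products sum to 1
```

The second pair shows that two valid contrasts (each sums to zero) need not be orthogonal to each other.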

---
# back to the mussel data

Returning to the question "Does mussel size differ between forested and agricultural watersheds?", the null hypothesis is:

`$$\large H_{0_1}: \frac{\mu_T + \mu_{Co}}{2} - \frac{\mu_{Ch} + \mu_{K}}{2} = 0$$`

--
Multiplying through by 2 gives:

`$$\large H_{0_1}: (\mu_{T} + \mu_{Co}) - (\mu_{Ch} + \mu_K) = 0$$`

--
Which can be written as:
`$$\large L_1 = (1)\mu_T + (-1)\mu_{Ch} + (-1)\mu_K + (1)\mu_{Co}$$`
where the coefficients are `$a_1 = 1$`, `$a_2 = -1$`, `$a_3 = -1$`, and `$a_4 = 1$`

> Note that it's easier to work with coefficients that are integers rather than fractions

---
# are the contrasts orthogonal?

#### Does each set of coefficients sum to 0?

- `$\large L_1$`: `$\large \sum_i a_i = 1 - 1 - 1 + 1 = 0$`

- `$\large L_2$`: `$\large \sum_i a_i = 1 + 0 + 0 - 1 = 0$`

- `$\large L_3$`: `$\large \sum_i a_i = 0 + 1 - 1 + 0 = 0$`

#### Do the products of pairs of coefficients sum to 0?

- For `$\large L_1$` and `$\large L_2$`: `$\large (1)(1) + (-1)(0) + (-1)(0) + (1)(-1) = 0$`

- For `$\large L_1$` and `$\large L_3$`: `$\large (1)(0) + (-1)(1) + (-1)(-1) + (1)(0) = 0$`

- For `$\large L_2$` and `$\large L_3$`: `$\large (1)(0) + (0)(1) + (0)(-1) + (-1)(0) = 0$`

---
# testing the null hypotheses

To obtain the sums of squares for each contrast, we use the general formula:

`$$\large SS_L =\frac{(\sum_i a_i T_i)^2}{n \sum_i a_i^2}$$`

where `$\large  T_i$` is the sum of observations in group `$\large  i$`, and `$\large  a_i$` is the corresponding coefficient for group `$\large  i$`

For the first hypothesis we have:

`$$\large SS_{L_1} = \frac{(66-107-121+77)^2}{5(1 + 1 + 1 + 1)} = 361.25$$`

---
# sums of squares for each contrast

For the second hypothesis we have: `$\large L_2 = \mu_T + 0 + 0 - \mu_{Co}$` with coefficients `$\large a_i = (1, 0, 0,-1)$`

`$$\large SS_{L_2} = \frac{(66 +0 +0 - 77)^2}{5(1 + 0 + 0 + 1)} = 12.1$$`

For the third hypothesis we have: `$\large L_3 = 0 + \mu_{Ch} - \mu_K + 0$` with coefficients `$\large a_i = (0, 1,-1, 0)$`

`$$\large SS_{L_3} = \frac{(0 + 107 - 121 + 0)^2}{5(0 + 1 + 1 + 0)} = 19.6$$`
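
The same calculation can be sketched in Python from the raw mussel data. This is an illustration rather than the course's own code: the helper name `contrast_ss` is hypothetical, and the formula assumes equal sample sizes in every group.

```python
# Mussel sizes from the watershed table, in the order T, Ch, K, Co
data = {
    "Twelvemile": [16, 8, 12, 17, 13],
    "Chattooga": [18, 25, 22, 16, 26],
    "Keowee": [28, 22, 24, 20, 27],
    "Coneross": [14, 20, 11, 17, 15],
}


def contrast_ss(coefs, groups):
    """SS_L = (sum_i a_i T_i)^2 / (n * sum_i a_i^2), assuming equal n per group."""
    totals = [sum(obs) for obs in groups]  # T_i: sum of observations in group i
    n = len(groups[0])                     # observations per group
    numerator = sum(a * t for a, t in zip(coefs, totals)) ** 2
    return numerator / (n * sum(a * a for a in coefs))


groups = list(data.values())
contrasts = {
    "L1": [1, -1, -1, 1],  # agricultural vs forested
    "L2": [1, 0, 0, -1],   # Twelvemile vs Coneross
    "L3": [0, 1, -1, 0],   # Chattooga vs Keowee
}
for name, coefs in contrasts.items():
    print(name, contrast_ss(coefs, groups))
```

The group totals computed inside `contrast_ss` are 66, 107, 121, and 77, matching the values used in the hand calculations.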
---
# expanded anova table

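Each contrast has 1 d.f., so its mean square equals its SS. Dividing that mean square by MSW gives the F statistic, which is compared with the critical value of F for 1 and 16 d.f.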

<table class="table table-condensed" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> Source </th>
   <th style="text-align:center;"> df </th>
   <th style="text-align:center;"> SS </th>
   <th style="text-align:center;"> MS </th>
   <th style="text-align:center;"> F </th>
   <th style="text-align:center;"> Fcrit </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> Among watersheds </td>
   <td style="text-align:center;"> 3 </td>
   <td style="text-align:center;"> 393 </td>
   <td style="text-align:center;"> 131.0 </td>
   <td style="text-align:center;"> 9.70 </td>
   <td style="text-align:center;"> 3.24 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> F vs Ag </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> 361 </td>
   <td style="text-align:center;"> 361.0 </td>
   <td style="text-align:center;"> 26.76 </td>
   <td style="text-align:center;"> 4.49 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> T vs Co </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> 12 </td>
   <td style="text-align:center;"> 12.0 </td>
   <td style="text-align:center;"> 0.90 </td>
   <td style="text-align:center;"> 4.49 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> Ch vs K </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> 20 </td>
   <td style="text-align:center;"> 20.0 </td>
   <td style="text-align:center;"> 1.45 </td>
   <td style="text-align:center;"> 4.49 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> Within </td>
   <td style="text-align:center;"> 16 </td>
   <td style="text-align:center;"> 216 </td>
   <td style="text-align:center;"> 13.5 </td>
   <td style="text-align:center;">  </td>
   <td style="text-align:center;">  </td>
  </tr>
</tbody>
</table>

Notice that the Sums of Squares are partitioned according to hypotheses we are interested in, unlike in multiple comparison procedures.
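
As a rough check on the table, the F statistics can be recomputed in Python. This sketch takes the within-group SS, its d.f., and the critical value 4.49 directly from the table. The contrast sums of squares are the unrounded results of the `$SS_L$` formula, and the variable names are illustrative.

```python
ss_within, df_within = 216.0, 16  # "Within" row of the ANOVA table
f_crit = 4.49                     # tabled critical F for 1 and 16 d.f., from the table

# Unrounded contrast sums of squares from the SS_L formula
ss_contrasts = {"F vs Ag": 361.25, "T vs Co": 12.1, "Ch vs K": 19.6}

msw = ss_within / df_within       # mean square within
for name, ss in ss_contrasts.items():
    f_stat = (ss / 1) / msw       # each contrast has 1 d.f.
    verdict = "reject H0" if f_stat > f_crit else "fail to reject H0"
    print(f"{name}: F = {f_stat:.2f} ({verdict})")

# The three orthogonal contrasts partition the among-watershed SS (393 after rounding)
print(round(sum(ss_contrasts.values()), 2))
```

Only the forested vs agricultural contrast exceeds the critical value, and the three contrast sums of squares add up to the among-watershed SS.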

---
# example

A study on the effectiveness of analgesics. Five people who have a headache are chosen at random for each treatment. All patients take the medication in capsule form and do not know which group they are in. The capsules containing aspirin (with or without something else) all contain the same amount of aspirin.

.pull-left[
<table class="table table-condensed" style="font-size: 12px; width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">  </th>
   <th style="text-align:left;"> Treatment </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> I </td>
   <td style="text-align:left;"> Placebo (control) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> II </td>
   <td style="text-align:left;"> Aspirin, Brand 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> III </td>
   <td style="text-align:left;"> Aspirin with caffeine </td>
  </tr>
  <tr>
   <td style="text-align:left;"> IV </td>
   <td style="text-align:left;"> Aspirin, Brand 2 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> V </td>
   <td style="text-align:left;"> Aspirin with buffer </td>
  </tr>
  <tr>
   <td style="text-align:left;"> VI </td>
   <td style="text-align:left;"> Aspirin with buffer and caffeine </td>
  </tr>
</tbody>
</table>
]

.pull-right[
<br/>
<br/>
The response variable is the amount of time (in hours) until relief from pain is felt.
]

---
# example

#### Questions:

1. How many orthogonal contrasts can be made?

--
2. Make the following orthogonal comparisons:

    a. Placebo vs. analgesic  
    
    b. Pure aspirin vs. aspirin with additives  
    
    c. Aspirin 1 vs. aspirin 2  
    
    d. Aspirin with caffeine (alone) vs. aspirin with buffer (with or without caffeine)  
    
    e. Aspirin with buffer vs. aspirin with buffer and caffeine

--
3. Show that the set of comparisons in Part 2 are mutually orthogonal.

---
# example

1\. How many orthogonal contrasts can be made?

- *Answer:* `$a - 1 = 5$`, where `$a = 6$` is the number of treatment groups

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 +\mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 +\mu_4 +\mu_5 +\mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 +\mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

- `$\small H0 : \mu_2 - \mu_4 = 0$`

- **Coefficients**: `$\small (0, 1, 0, -1, 0, 0)$`  
]

.pull-right[

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

- `$\small H0 : \mu_2 - \mu_4 = 0$`

- **Coefficients**: `$\small (0, 1, 0, -1, 0, 0)$`  
]

.pull-right[

d. Aspirin with caffeine (alone) vs. aspirin with buffer (with or without caffeine)

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

- `$\small H0 : \mu_2 - \mu_4 = 0$`

- **Coefficients**: `$\small (0, 1, 0, -1, 0, 0)$`  
]

.pull-right[

d. Aspirin with caffeine (alone) vs. aspirin with buffer (with or without caffeine)

- `$\small H0 : \mu_3 - \frac{\mu_5 + \mu_6}{2} = 0$`

- **Coefficients**: `$\small (0, 0, 2, 0, -1, -1)$`

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

- `$\small H0 : \mu_2 - \mu_4 = 0$`

- **Coefficients**: `$\small (0, 1, 0, -1, 0, 0)$`  
]

.pull-right[

d. Aspirin with caffeine (alone) vs. aspirin with buffer (with or without caffeine)

- `$\small H0 : \mu_3 - \frac{\mu_5 + \mu_6}{2} = 0$`

- **Coefficients**: `$\small (0, 0, 2, 0, -1, -1)$`

e. Aspirin with buffer vs. aspirin with buffer and caffeine

]

---
# example

2\. Make the following orthogonal comparisons:

.pull-left[

a. Placebo vs. analgesic

- `$\small H0 : \mu_1 - \frac{\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6}{5} = 0$`

- **Coefficients**: `$\small (5, -1, -1, -1, -1, -1)$`

b. Pure aspirin vs. aspirin with additives

- `$\small H0 : \frac{\mu_2 + \mu_4}{2} - \frac{\mu_3 + \mu_5 + \mu_6}{3} = 0$`

- **Coefficients**: `$\small (0, 3, -2, 3, -2, -2)$`

c. Aspirin 1 vs. aspirin 2

- `$\small H0 : \mu_2 - \mu_4 = 0$`

- **Coefficients**: `$\small (0, 1, 0, -1, 0, 0)$`  
]

.pull-right[

d. Aspirin with caffeine (alone) vs. aspirin with buffer (with or without caffeine)

- `$\small H0 : \mu_3 - \frac{\mu_5 + \mu_6}{2} = 0$`

- **Coefficients**: `$\small (0, 0, 2, 0, -1, -1)$`

e. Aspirin with buffer vs. aspirin with buffer and caffeine

- `$\small H0 : \mu_5 - \mu_6 = 0$`

- **Coefficients**: `$\small (0, 0, 0, 0, 1, -1)$`
]
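
Integer coefficients like those above can be generated by weighting each side of a comparison by the size of the other side, which amounts to multiplying the null hypothesis through by a common denominator. A minimal sketch in Python, where the helper name `contrast_coefficients` is hypothetical and treatments are numbered I to VI as 1 to 6:

```python
from math import gcd


def contrast_coefficients(first, second, n_groups):
    """Integer coefficients comparing the mean of `first` with the mean of `second`.

    `first` and `second` hold 1-based treatment numbers; groups in neither set get 0.
    """
    scale = gcd(len(first), len(second))
    coefs = [0] * n_groups
    for g in first:
        coefs[g - 1] = len(second) // scale    # positive weight for the first set
    for g in second:
        coefs[g - 1] = -(len(first) // scale)  # negative weight for the second set
    return coefs


print(contrast_coefficients([1], [2, 3, 4, 5, 6], 6))  # [5, -1, -1, -1, -1, -1]
print(contrast_coefficients([2, 4], [3, 5, 6], 6))     # [0, 3, -2, 3, -2, -2]
print(contrast_coefficients([3], [5, 6], 6))           # [0, 0, 2, 0, -1, -1]
```

Because each side is weighted by the size of the other, the coefficients always sum to zero. Whether a set of such contrasts is mutually orthogonal still has to be checked separately.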

---
# example

3\. Show that the set of comparisons in Part 2 are mutually orthogonal

--
- The sum of each set of coefficients = zero

    + **a**: `$5 - 1 - 1 - 1 - 1 - 1 = 0$`  
    
    + **b**: `$0 + 3 - 2 + 3 - 2 - 2 = 0$`

- The sum of the cross products of the coefficients (equivalent to the inner, or dot, product) of any pair is also zero

    + e.g., **a** versus **b**: `$(a \cdot b) : (5)(0) + (-1)(3) + (-1)(-2) + (-1)(3) +(-1)(-2) + (-1)(-2) = 0$`
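
With five contrasts there are 10 pairs to check, so a short script is less error-prone than doing each by hand. A sketch in Python using the coefficients from Part 2 (the variable names are illustrative):

```python
from itertools import combinations

contrasts = {
    "a": [5, -1, -1, -1, -1, -1],
    "b": [0, 3, -2, 3, -2, -2],
    "c": [0, 1, 0, -1, 0, 0],
    "d": [0, 0, 2, 0, -1, -1],
    "e": [0, 0, 0, 0, 1, -1],
}

# Condition 1: every set of coefficients sums to zero
all_sum_to_zero = all(sum(c) == 0 for c in contrasts.values())

# Condition 2: the dot product of every pair is zero (10 pairs for 5 contrasts)
dot_products = {
    (p, q): sum(x * y for x, y in zip(contrasts[p], contrasts[q]))
    for p, q in combinations(contrasts, 2)
}
all_pairs_orthogonal = all(v == 0 for v in dot_products.values())

print(all_sum_to_zero, all_pairs_orthogonal)  # True True
```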

---
# summary

- Because only the planned hypotheses are tested, orthogonal contrasts are generally more powerful than *a posteriori* multiple comparison procedures

- They also require more thought and preparation (good things!)

- A set of mutually orthogonal contrasts can contain at most `$a - 1$` comparisons, where `$a$` is the number of treatment groups

- If the contrasts are not orthogonal, they can't be used to fully partition the Sums of Squares among groups

- If the comparisons really represent more than 1 treatment variable, it will be better to use a factorial design. More on that later.