class: center, middle, inverse, title-slide

.title[
# LECTURE 6: statistical power
]
.subtitle[
## FANR 6750 (Experimental design)
]
.author[
### Fall 2024
]

---
# outline

<br/>

1) Motivation

<br/>

--

2) Type I and Type II error

<br/>

--

3) Two-sample *t*-test

<br/>

--

4) ANOVA

---
class: inverse

# motivation

<br/>
<br/>

> A statistical test will not be able to detect a true difference if the sample size is too small compared with the magnitude of the difference.

> Since data are sampled at random, there is always a risk of reaching a wrong conclusion, and things can go wrong in two ways

- Dalgaard (2008)

---
# type i & type ii errors

#### Type I error (i.e., false positive)

> The null hypothesis is correct, but the test rejects it.

`$$\large \alpha = Pr(Type\;I\;error)$$`

--

#### Type II error (i.e., false negative)

> The null hypothesis is wrong, but the test fails to reject it.

`$$\large \beta = Pr(Type\;II\;error)$$`

--

#### Power

> The test's ability to reject a false null hypothesis.

`$$\large Power = 1 - \beta$$`

---
# type i & type ii errors

#### The Type I error rate is set by the scientist

--

#### The Type II error rate, and hence the power of the test, depends on many factors

--

#### In the context of a linear model, these are:

1) Magnitude of the slope coefficients ( `\(\beta\)` )

2) Standard deviation (or variance) of the population ( `\(\sigma\)` )

3) The sample size ( `\(n\)` )

4) The Type I error rate ( `\(\alpha\)` )

---
# magnitude of the difference

<img src="06_power_files/figure-html/unnamed-chunk-1-1.png" width="504" />

---
# magnitude of the difference

<img src="06_power_files/figure-html/unnamed-chunk-2-1.png" width="504" />

---
# standard deviation

<img src="06_power_files/figure-html/unnamed-chunk-3-1.png" width="504" />

---
# standard deviation

<img src="06_power_files/figure-html/unnamed-chunk-4-1.png" width="504" />

---
# sample size

.pull-left[
<img src="06_power_files/figure-html/unnamed-chunk-5-1.png" width="504" />
]

.pull-right[
<img src="06_power_files/figure-html/unnamed-chunk-6-1.png" width="504" />
]

---
# sample size

.pull-left[
<img src="06_power_files/figure-html/unnamed-chunk-7-1.png" width="504" />
]

.pull-right[
<img src="06_power_files/figure-html/unnamed-chunk-8-1.png" width="504" />
]

---
# type i error rate

#### `\(\large \alpha = 0.05\)`

<img src="06_power_files/figure-html/cv-1.png" width="576" style="display: block; margin: auto;" />

---
# type i error rate

#### `\(\large \alpha = 0.001\)`

<img src="06_power_files/figure-html/cv2-1.png" width="576" style="display: block; margin: auto;" />

---
# factors affecting power

#### In general, power increases when:

1) The difference in means/magnitude of slope increases

2) The standard deviation **of the population** decreases

3) The sample size increases

4) The Type I error rate increases

--

#### Which of these, as researchers, do we have control over?
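---
# factors affecting power

Each of these factors can be explored with the same `power.t.test()` function used on the next slides. Below is a minimal sketch tracing how power grows with sample size while the other factors are held fixed, here at a difference of 10, a standard deviation of 5, and `\(\alpha = 0.05\)` (the range of sample sizes is an arbitrary choice):

.small-code[

```r
# power of a two-sample t-test across a range of per-group sample sizes,
# holding the difference, standard deviation, and Type I error rate fixed
n <- 2:15
pwr <- sapply(n, function(i) {
  power.t.test(n = i, delta = 10, sd = 5, sig.level = 0.05)$power
})

plot(n, pwr, type = "b", xlab = "Sample size (per group)", ylab = "Power")
abline(h = 0.8, lty = 2)  # a commonly used target for power
```
]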
---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-9-1.png" width="504" style="display: block; margin: auto;" />
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-10-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
#### `\(\large n = 5\)`

.small-code[

```r
power.t.test(n = 5, delta = 10, sd = 5,
             sig.level = 0.05, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 5
##           delta = 10
##              sd = 5
##       sig.level = 0.05
##           power = 0.7905
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-12-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 15\)`

.small-code[

```r
power.t.test(n = 15, delta = 10, sd = 5,
             sig.level = 0.05, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 10
##              sd = 5
##       sig.level = 0.05
##           power = 0.9996
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-14-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large \alpha = 0.001\)`

.small-code[

```r
power.t.test(n = 15, delta = 10, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 10
##              sd = 5
##       sig.level = 0.001
##           power = 0.9501
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 94\)`, `\(\large \mu_2 = 97\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-16-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 15\)`

.small-code[

```r
power.t.test(n = 15, delta = 3, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 3
##              sd = 5
##       sig.level = 0.001
##           power = 0.03597
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 94\)`, `\(\large \mu_2 = 97\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-18-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 100\)`

.small-code[

```r
power.t.test(n = 100, delta = 3, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 100
##           delta = 3
##              sd = 5
##       sig.level = 0.001
##           power = 0.8143
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
## When should I do a power analysis?

#### Prospective is always better than retrospective!

--

**Retrospective** (conducted after the experiment)

- If you failed to reject the null, then your power was low

- But you can't use this as an excuse!

- Only useful as a way of planning a subsequent experiment

--

**Prospective** (conducted before the experiment)

- Used to determine sample size or power, given `\(\beta\)` and `\(\sigma\)`

- How can `\(\beta\)` and/or `\(\sigma\)` be known ahead of time?
  + Requires prior knowledge, perhaps from a pilot study

  + Requires clear-headed thinking about what constitutes a biologically significant difference

---
## What level of power should I aim for?

<br/>

#### We want power to be as close to 1 as possible

<br/>

--

#### Sometimes it may be prohibitively expensive to obtain a sample size large enough to achieve power close to 1

<br/>

--

#### In practice, we are usually satisfied with power > 0.8

---
# summary

- Power analysis lets you determine the necessary sample size (or power) for testing an effect size of interest

--

- Power is influenced by the magnitude of the effect, the standard deviation of the population, the Type I error rate, and the sample size

--

- Retrospective power analysis isn't useful unless you are planning a subsequent experiment

--

- `R` has several functions for conducting power analysis, but only for simple tests

--

- More complicated power analysis can be performed using simulation (not covered in this course)

---
# looking ahead

<br/>

### **Next time**: Linear models part 2: categorical predictors with >2 levels

<br/>

### **Reading**: [Fieberg chp. 3.7](https://statistics4ecologists-v1.netlify.app/matrixreg#ancova)
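---
# solving for sample size

For prospective planning, `power.t.test()` can also be run in reverse: leave `n = NULL` and supply the desired power, and it returns the per-group sample size required. A minimal sketch using illustrative values from the earlier example (difference of 3, `\(\sigma = 5\)`, `\(\alpha = 0.05\)`) and the conventional target of 0.8:

.small-code[

```r
# leave n = NULL and supply the target power to solve for sample size;
# the returned n is the number required in *each* group
power.t.test(n = NULL, delta = 3, sd = 5,
             sig.level = 0.05, power = 0.8)
```
]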