class: center, middle, inverse, title-slide

.title[
# LECTURE 6: statistical power
]
.subtitle[
## FANR 6750 (Experimental design)
]
.author[
### Fall 2024
]

---
# outline

<br/>

1) Motivation

<br/>

--

2) Type I and Type II error

<br/>

--

3) Two-sample *t*-test

<br/>

--

4) ANOVA

---
class: inverse

# motivation

<br/>
<br/>

> A statistical test will not be able to detect a true difference if the sample size is too small compared with the magnitude of the difference.

> Since data are sampled at random, there is always a risk of reaching a wrong conclusion, and things can go wrong in two ways

- Dalgaard (2008)

---
# type i & type ii errors

#### Type I error (i.e., false positive)

> The null hypothesis is correct, but the test rejects it.

`$$\large \alpha = Pr(Type\;I\;error)$$`

--

#### Type II error (i.e., false negative)

> The null hypothesis is wrong, but the test fails to reject it.

`$$\large \beta = Pr(Type\;II\;error)$$`

--

#### Power

> The test's ability to reject a false null hypothesis.

`$$\large Power = 1 - \beta$$`

---
# type i & type ii errors

#### The Type I error rate is set by the scientist

--

#### The Type II error rate, and hence the power of the test, depends on many factors

--

#### In the context of a linear model, these are:

1) Magnitude of the slope coefficients ( `\(\beta\)` )

2) Standard deviation (or variance) of the population ( `\(\sigma\)` )

3) The sample size ( `\(n\)` )

4) The Type I error rate ( `\(\alpha\)` )

---
# magnitude of the difference

<img src="06_power_files/figure-html/unnamed-chunk-1-1.png" width="504" />

---
# magnitude of the difference

<img src="06_power_files/figure-html/unnamed-chunk-2-1.png" width="504" />

---
# standard deviation

<img src="06_power_files/figure-html/unnamed-chunk-3-1.png" width="504" />

---
# standard deviation

<img src="06_power_files/figure-html/unnamed-chunk-4-1.png" width="504" />

---
# sample size

.pull-left[
<img src="06_power_files/figure-html/unnamed-chunk-5-1.png" width="504" />
]

.pull-right[
<img src="06_power_files/figure-html/unnamed-chunk-6-1.png" width="504" />
]

---
# sample size

.pull-left[
<img src="06_power_files/figure-html/unnamed-chunk-7-1.png" width="504" />
]

.pull-right[
<img src="06_power_files/figure-html/unnamed-chunk-8-1.png" width="504" />
]

---
# type i error rate

#### `\(\large \alpha = 0.05\)`

<img src="06_power_files/figure-html/cv-1.png" width="576" style="display: block; margin: auto;" />

---
# type i error rate

#### `\(\large \alpha = 0.001\)`

<img src="06_power_files/figure-html/cv2-1.png" width="576" style="display: block; margin: auto;" />

---
# factors affecting power

#### In general, power increases when:

1) The difference in means/magnitude of slope increases

2) The standard deviation **of the population** decreases

3) The sample size increases

4) The Type I error rate increases

--

#### Which of these, as researchers, do we have control over?
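---
# factors affecting power

Each of these factors can be explored with the same `power.t.test()` function used on the next slides. Below is a minimal sketch tracing how power grows with sample size while the other factors are held fixed, here at a difference of 10, a standard deviation of 5, and `\(\alpha = 0.05\)` (the range of sample sizes is an arbitrary choice):

.small-code[

```r
# power of a two-sample t-test across a range of per-group sample sizes,
# holding the difference, standard deviation, and Type I error rate fixed
n <- 2:15
pwr <- sapply(n, function(i) {
  power.t.test(n = i, delta = 10, sd = 5, sig.level = 0.05)$power
})

plot(n, pwr, type = "b", xlab = "Sample size (per group)", ylab = "Power")
abline(h = 0.8, lty = 2)  # a commonly used target for power
```
]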
---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-9-1.png" width="504" style="display: block; margin: auto;" />
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-10-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
#### `\(\large n = 5\)`

.small-code[

```r
power.t.test(n = 5, delta = 10, sd = 5,
             sig.level = 0.05, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 5
##           delta = 10
##              sd = 5
##       sig.level = 0.05
##           power = 0.7905
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-12-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 15\)`

.small-code[

```r
power.t.test(n = 15, delta = 10, sd = 5,
             sig.level = 0.05, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 10
##              sd = 5
##       sig.level = 0.05
##           power = 0.9996
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 90\)`, `\(\large \mu_2 = 100\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-14-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large \alpha = 0.001\)`

.small-code[

```r
power.t.test(n = 15, delta = 10, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 10
##              sd = 5
##       sig.level = 0.001
##           power = 0.9501
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 94\)`, `\(\large \mu_2 = 97\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-16-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 15\)`

.small-code[

```r
power.t.test(n = 15, delta = 3, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 15
##           delta = 3
##              sd = 5
##       sig.level = 0.001
##           power = 0.03597
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
# example in `R`

.pull-left[
#### `\(\large \mu_1 = 94\)`, `\(\large \mu_2 = 97\)`

#### `\(\large \sigma = 5\)`

<img src="06_power_files/figure-html/unnamed-chunk-18-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
#### `\(\large n = 100\)`

.small-code[

```r
power.t.test(n = 100, delta = 3, sd = 5,
             sig.level = 0.001, power = NULL)
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 100
##           delta = 3
##              sd = 5
##       sig.level = 0.001
##           power = 0.8143
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
]
]

---
## When should I do a power analysis?

#### Prospective is always better than retrospective!

--

**Retrospective** (conducted after the experiment)

- If you failed to reject the null, then your power was low

- But you can't use this as an excuse!

- Only useful as a way of planning a subsequent experiment

--

**Prospective** (conducted before the experiment)

- Used to determine sample size or power, given `\(\beta\)` and `\(\sigma\)`

- How can `\(\beta\)` and/or `\(\sigma\)` be known ahead of time?
  + Requires prior knowledge, perhaps from a pilot study

  + Requires clear-headed thinking about what constitutes a biologically significant difference

---
## What level of power should I aim for?

<br/>

#### We want power to be as close to 1 as possible

<br/>

--

#### Sometimes it may be prohibitively expensive to obtain a sample size large enough to achieve power close to 1

<br/>

--

#### In practice, we are usually satisfied with power > 0.8

---
# summary

- Power analysis lets you determine the necessary sample size (or power) for testing an effect size of interest

--

- Power is influenced by the magnitude of the effect, the standard deviation of the population, the Type I error rate, and the sample size

--

- Retrospective power analysis isn't useful unless you are planning a subsequent experiment

--

- `R` has several functions for conducting power analysis, but only for simple tests

--

- More complicated power analysis can be performed using simulation (not covered in this course)

---
# looking ahead

<br/>

### **Next time**: Linear models part 2: categorical predictors with >2 levels

<br/>

### **Reading**: [Fieberg chp. 3.7](https://statistics4ecologists-v1.netlify.app/matrixreg#ancova)
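---
# solving for sample size

For prospective planning, `power.t.test()` can also be run in reverse: leave `n = NULL` and supply the desired power, and it returns the per-group sample size required. A minimal sketch using illustrative values from the earlier example (difference of 3, `\(\sigma = 5\)`, `\(\alpha = 0.05\)`) and the conventional target of 0.8:

.small-code[

```r
# leave n = NULL and supply the target power to solve for sample size;
# the returned n is the number required in *each* group
power.t.test(n = NULL, delta = 3, sd = 5,
             sig.level = 0.05, power = 0.8)
```
]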