class: center, middle, inverse, title-slide

.title[
# LECTURE 10: confidence intervals (again)
]
.subtitle[
## FANR 6750 (Experimental design)
]
.author[
###
Fall 2022
]

---
# outline

<br/>

1) Motivation

<br/>

--

2) Example

<br/>

--

3) Graphical displays

---
class: inverse

# motivation

<br/>

Thus far, we have approached ANOVA from a statistical hypothesis testing standpoint only.

<br/>

--

This approach and the use of statistical significance alone have been criticized because:

--

- Null hypotheses are almost always known to be false *a priori*

--

- `\(\alpha\)` levels are arbitrary

--

- Statistical significance depends on sample size

--

- Statistical and biological significance are easily confused

<br/>

--

Practitioners are often less interested in whether a null hypothesis can be rejected than in the magnitude of the treatment effect

---
# parameter estimation

Often an investigator wants to obtain one or more estimates of parameters after (or as part of) an analysis of variance

<table>
<thead>
<tr>
<th style="text-align:center;"> Parameter </th>
<th style="text-align:center;"> Description </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center;"> \(\mu\) </td>
<td style="text-align:center;"> Overall mean </td>
</tr>
<tr>
<td style="text-align:center;"> \(\mu + \alpha_i\) </td>
<td style="text-align:center;"> The \(i\)th treatment mean </td>
</tr>
<tr>
<td style="text-align:center;"> \(\alpha_i\) </td>
<td style="text-align:center;"> The \(i\)th treatment effect </td>
</tr>
<tr>
<td style="text-align:center;"> \(\mu_i - \mu_{i^\prime}\) </td>
<td style="text-align:center;"> Difference between 2 means </td>
</tr>
<tr>
<td style="text-align:center;"> \(\sum_i a_i \mu_i\) </td>
<td style="text-align:center;"> Some linear combination of parameters </td>
</tr>
</tbody>
</table>

---
# confidence intervals

#### For each estimate, we often want a confidence interval to quantify uncertainty

<br/>

--

#### What is a confidence interval?`\(^*\)`

> An interval [a,b] that is likely to include the true parameter of interest

--

> If we calculated an x% confidence interval for each of an infinite number of samples from the population, x% of those intervals would contain the true parameter value

<br/>

--

#### In the context of ANOVA, the formula is:

`$$\large CI_{1-\alpha} = \text{point estimate} \pm t_{\alpha/2,\,a(n-1)} \times SE$$`

For a balanced design with `\(a\)` treatments and `\(n\)` replicates per treatment, `\(a(n-1)\)` is the error degrees of freedom

???

`\(^*\)`Do you find confidence intervals confusing? If so, you are not alone. Here are a few resources to help:

- A [nice webapp](https://www.zoology.ubc.ca/~whitlock/Kingfisher/CIMean.htm) for simulating samples and confidence intervals

- [This blog](https://econometricsense.blogspot.com/2018/12/thinking-about-confidence-intervals.html) does a nice job explaining that it's the interval, not the true value, that is the random variable. That is, the true parameter value is a fixed point. But every time we collect a sample, the bounds of the confidence interval we calculate will change

- [Another metaphor](https://medium.com/@EpiEllie/having-confidence-in-confidence-intervals-8f881712d837) for thinking about what a confidence interval is

---
# standard errors

#### The standard error (SE) is the standard deviation of the sampling distribution of a statistic

<br/>

--

#### We usually estimate the SE using a single sample of data

<br/>

--

#### But the appropriate equation for the SE depends on the statistic of interest

---
# standard errors

<br/>

<table>
<thead>
<tr>
<th style="text-align:center;"> Parameter </th>
<th style="text-align:center;"> Symbol </th>
<th style="text-align:center;"> Point estimate </th>
<th style="text-align:center;"> SE </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center;"> Overall mean </td>
<td style="text-align:center;"> \(\mu\) </td>
<td style="text-align:center;"> \(\bar{y}.\) </td>
<td style="text-align:center;"> \(\sqrt{MSe/N}\) </td>
</tr>
<tr>
<td style="text-align:center;"> Treatment mean </td>
<td style="text-align:center;"> \(\mu_i = \mu + \alpha_i\) </td>
<td
style="text-align:center;"> \(\bar{y}_i\) </td>
<td style="text-align:center;"> \(\sqrt{MSe/n_i}\) </td>
</tr>
<tr>
<td style="text-align:center;"> Difference in means </td>
<td style="text-align:center;"> \(\mu_i - \mu_{i^\prime}\) </td>
<td style="text-align:center;"> \(\bar{y}_i - \bar{y}_{i^\prime}\) </td>
<td style="text-align:center;"> \(\sqrt{MSe(1/n_i + 1/n_{i^\prime})}\) </td>
</tr>
<tr>
<td style="text-align:center;"> Linear combination </td>
<td style="text-align:center;"> \(\sum_i a_i \mu_i\) </td>
<td style="text-align:center;"> \(\sum_i a_i \bar{y}_i\) </td>
<td style="text-align:center;"> \(\sqrt{MSe \sum_i a_i^2/n_i}\) </td>
</tr>
</tbody>
</table>

--

#### Note: Confidence intervals based on these SEs are not adjusted for multiple comparisons

---
# mussel data example

<table class="table table-striped table-hover table-condensed" style="margin-left: auto; margin-right: auto;border-bottom: 0;">
<thead>
<tr><th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="4"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Watershed</div></th></tr>
<tr>
<th style="text-align:center;"> Twelvemile </th>
<th style="text-align:center;"> Chattooga </th>
<th style="text-align:center;"> Keowee </th>
<th style="text-align:center;"> Coneross </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center;"> 16 </td>
<td style="text-align:center;"> 18 </td>
<td style="text-align:center;"> 28 </td>
<td style="text-align:center;"> 14 </td>
</tr>
<tr>
<td style="text-align:center;"> 8 </td>
<td style="text-align:center;"> 25 </td>
<td style="text-align:center;"> 22 </td>
<td style="text-align:center;"> 20 </td>
</tr>
<tr>
<td style="text-align:center;"> 12 </td>
<td style="text-align:center;"> 22 </td>
<td style="text-align:center;"> 24 </td>
<td style="text-align:center;"> 11 </td>
</tr>
<tr>
<td style="text-align:center;"> 17 </td>
<td style="text-align:center;"> 16 </td>
<td style="text-align:center;"> 20
</td>
<td style="text-align:center;"> 17 </td>
</tr>
<tr>
<td style="text-align:center;"> 13 </td>
<td style="text-align:center;"> 26 </td>
<td style="text-align:center;"> 27 </td>
<td style="text-align:center;"> 15 </td>
</tr>
</tbody>
<tfoot>
<tr><td style="padding: 0; " colspan="100%"><span style="font-style: italic;">Note: </span></td></tr>
<tr><td style="padding: 0; " colspan="100%">
<sup></sup> MSe = 13.5</td></tr>
</tfoot>
</table>

--

#### What types of confidence intervals can we compute?

---
# displaying confidence intervals

<img src="10_estimation_files/figure-html/unnamed-chunk-4-1.png" width="648" style="display: block; margin: auto;" />

---
# displaying confidence intervals

<img src="10_estimation_files/figure-html/unnamed-chunk-5-1.png" width="648" style="display: block; margin: auto;" />

---
# displaying confidence intervals

<img src="10_estimation_files/figure-html/unnamed-chunk-6-1.png" width="648" style="display: block; margin: auto;" />

---
# displaying confidence intervals

<img src="10_estimation_files/figure-html/unnamed-chunk-7-1.png" width="648" style="display: block; margin: auto;" />

---
# summary

- Confidence intervals allow one to focus on effect sizes and statistical significance at the same time

<br/>

--

- If `\(p < 0.05\)` for a test of no difference, the corresponding 95% CI will not include 0, and vice versa

<br/>

--

- If the CIs of two means do not overlap, the means are significantly different. However, if the CIs of two means overlap, it does not necessarily indicate that there is no difference

<br/>

--

- It is better to assess whether the CI of the difference in means includes 0
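
---
# mussel data: worked example

The CI formula and the SE table can be applied directly to the mussel data. What follows is a minimal sketch, not the course's own code: it is written in Python purely for illustration, the names `ci`, `se_linear_combination`, and `T_CRIT` are hypothetical, and the critical value `\(t_{0.025,16} \approx 2.120\)` is taken from a standard t table rather than from these slides. The difference in means is treated as a linear combination with coefficients `\(+1\)` and `\(-1\)`.

```python
import math
from statistics import mean

# Mussel counts by watershed, transcribed from the mussel data table.
mussels = {
    "Twelvemile": [16, 8, 12, 17, 13],
    "Chattooga": [18, 25, 22, 16, 26],
    "Keowee": [28, 22, 24, 20, 27],
    "Coneross": [14, 20, 11, 17, 15],
}

a = len(mussels)                            # number of treatments
N = sum(len(y) for y in mussels.values())   # total sample size
df_error = N - a                            # equals a(n - 1) when balanced

# Treatment means and the error mean square (MSe) from the one-way ANOVA.
means = {name: mean(y) for name, y in mussels.items()}
sse = sum((obs - means[name]) ** 2 for name, y in mussels.items() for obs in y)
mse = sse / df_error

# ASSUMPTION: two-sided 95% critical value for 16 error df, read from a
# t table (t_{0.025,16} is about 2.120). It is not computed here.
T_CRIT = 2.120


def ci(estimate, se, t_crit=T_CRIT):
    """Return (lower, upper) for: point estimate +/- t * SE."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width


def se_linear_combination(coefs):
    """SE of sum_i a_i * ybar_i, i.e. sqrt(MSe * sum_i a_i^2 / n_i)."""
    return math.sqrt(mse * sum(c ** 2 / len(mussels[k]) for k, c in coefs.items()))


# CI for one treatment mean: SE = sqrt(MSe / n_i).
keowee_ci = ci(means["Keowee"], math.sqrt(mse / len(mussels["Keowee"])))

# CI for a difference in means, using coefficients +1 and -1.
diff = means["Keowee"] - means["Twelvemile"]
diff_ci = ci(diff, se_linear_combination({"Keowee": 1, "Twelvemile": -1}))

print(f"MSe = {mse:.1f} on {df_error} df")
print(f"Keowee mean: {means['Keowee']:.1f}, 95% CI = ({keowee_ci[0]:.2f}, {keowee_ci[1]:.2f})")
print(f"Keowee - Twelvemile: {diff:.1f}, 95% CI = ({diff_ci[0]:.2f}, {diff_ci[1]:.2f})")
```

With these inputs the sketch reproduces MSe = 13.5 and gives roughly 24.2 ± 3.48 for the Keowee mean and 11.0 ± 4.93 for the Keowee minus Twelvemile difference. The second interval excludes 0. Neither interval is adjusted for multiple comparisons.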