class: center, middle, inverse, title-slide .title[ # LECTURE 17: generalized linear models ] .subtitle[ ## FANR 6750 (Experimental design) ] .author[ ###
Fall 2023 ] --- # outline 1) Motivation <br/> -- 2) Linear model review <br/> -- 3) Distributions <br/> -- 4) Link functions <br/> -- 5) Generalized linear models --- # motivation #### Although the central limit theorem often allows us to meet the normality assumption, there are many cases where the assumption does not hold: -- - Presence-absence studies - Studies of survival - Seed germination studies - Count-based surveys (especially when counts are close to zero) --- # linear model review <br/> <br/> `$$\Large response = deterministic\; part+stochastic\; part$$` <br/> <br/> -- `$$\underbrace{\LARGE E[y_i] = \beta_0 + \beta_1 \times x_i}_{Deterministic}$$` <br/> <br/> -- `$$\underbrace{\LARGE y_i \sim normal(E[y_i], \sigma)}_{Stochastic}$$` --- # example Imagine we are interested in the effect of seed size on the probability of germination -- We measure 100 seeds and then plant them in a greenhouse. All seeds are kept at the same temperature, soil moisture, etc. After several weeks, we record which seeds germinated and which did not <table> <thead> <tr> <th style="text-align:center;"> Size </th> <th style="text-align:center;"> Fate </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 15.13 </td> <td style="text-align:center;"> 1 </td> </tr> <tr> <td style="text-align:center;"> 11.93 </td> <td style="text-align:center;"> 1 </td> </tr> <tr> <td style="text-align:center;"> 14.94 </td> <td style="text-align:center;"> 1 </td> </tr> <tr> <td style="text-align:center;"> 15.44 </td> <td style="text-align:center;"> 1 </td> </tr> </tbody> </table> -- What is the response variable `\(y\)` in this example? What are the possible values it can take? -- What is the predictor variable `\(x\)` in this example?
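---

# example

Before fitting anything, it helps to see how binary data like these arise. Below is a hedged sketch, *not* the course's `seeddata`: the seed sizes, intercept, and slope are made-up values chosen only to illustrate that each fate must be 0 or 1

.small-code[

```r
# Simulate binary germination outcomes for 100 hypothetical seeds
set.seed(1)
size <- runif(100, min = 8, max = 22)   # made-up seed sizes
p <- plogis(-6 + 0.4 * size)            # assumed germination probability
                                        # (inverse-logit keeps p between 0 and 1)
fate <- rbinom(100, size = 1, prob = p) # every simulated fate is 0 or 1
table(fate)
```

]

No matter what values `size` takes, the response can only ever be 0 or 1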
--- # example <br/> <img src="17_glm_files/figure-html/unnamed-chunk-2-1.png" width="504" style="display: block; margin: auto;" /> --- # example .small-code[ ```r data("seeddata") fit <- lm(fate ~ size, data = seeddata) summary(fit) ``` ``` ## ## Call: ## lm(formula = fate ~ size, data = seeddata) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.941 -0.168 -0.051 0.257 0.859 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -1.1558 0.1895 -6.10 2.1e-08 *** ## size 0.1140 0.0124 9.18 7.4e-15 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.369 on 98 degrees of freedom ## Multiple R-squared: 0.462, Adjusted R-squared: 0.457 ## F-statistic: 84.2 on 1 and 98 DF, p-value: 7.35e-15 ``` ] --- # example <br/> <img src="17_glm_files/figure-html/unnamed-chunk-4-1.png" width="504" style="display: block; margin: auto;" /> -- What's wrong with the expected values? What is an "expected value" in this example? --- # another example We are interested in the effect of non-native Joro Spiders (*Trichonephila clavata*) on the number of native Golden Silk Orb Weaver Spiders (*Trichonephila clavipes*) <table> <thead> <tr> <th style="text-align:center;"> T. clavata </th> <th style="text-align:center;"> T. clavipes </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 17 </td> <td style="text-align:center;"> 0 </td> </tr> <tr> <td style="text-align:center;"> 13 </td> <td style="text-align:center;"> 0 </td> </tr> <tr> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> 1 </td> </tr> <tr> <td style="text-align:center;"> 7 </td> <td style="text-align:center;"> 1 </td> </tr> </tbody> </table> -- What is the response variable `\(y\)` in this example? What are the possible values it can take? -- What is the predictor variable `\(x\)` in this example? 
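---

# another example

As with the seed data, counts like these can be sketched directly in R. This is a hedged simulation, *not* the course's `spiderdata`: the intercept and slope are made-up values for illustration only

.small-code[

```r
# Simulate native spider counts that decline with invader abundance
set.seed(1)
clavata <- rpois(50, lambda = 10)   # made-up T. clavata counts
lam <- exp(1.2 - 0.15 * clavata)    # assumed expected count
                                    # (exp keeps the expected value positive)
clavipes <- rpois(50, lambda = lam) # simulated T. clavipes counts
summary(clavipes)
```

]

The simulated counts are never negative, even when the expected count is close to 0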
--- # another example <br/> <img src="17_glm_files/figure-html/unnamed-chunk-6-1.png" width="504" style="display: block; margin: auto;" /> --- # another example .small-code[ ```r data("spiderdata") fit <- lm(clavipes ~ clavata, data = spiderdata) summary(fit) ``` ``` ## ## Call: ## lm(formula = clavipes ~ clavata, data = spiderdata) ## ## Residuals: ## Min 1Q Median 3Q Max ## -1.343 -0.633 -0.185 0.239 3.023 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 2.4513 0.4009 6.11 1.7e-07 *** ## clavata -0.1583 0.0358 -4.42 5.6e-05 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.947 on 48 degrees of freedom ## Multiple R-squared: 0.29, Adjusted R-squared: 0.275 ## F-statistic: 19.6 on 1 and 48 DF, p-value: 5.56e-05 ``` ] --- # another example <br/> <img src="17_glm_files/figure-html/unnamed-chunk-8-1.png" width="504" style="display: block; margin: auto;" /> --- class: inverse, middle # generalized linear models --- # generalized linear models GLMs are a class of models that extend the conventional linear model framework to include data-generating distributions other than the normal distribution -- All GLMs are composed of: 1) Distribution `$$f(y_i|x_i)$$` -- 2) Linear predictor `$$\mu_i = \beta_0 + \beta_1 x_i + ...$$` -- 3) Link function `$$g(E[y_i]) = \mu_i$$` -- The distribution allows us to model response variables that are not normally distributed. The link function allows us to map the linear predictor to the expected values of the probability distribution --- # the linear model as a glm #### Linear regression -- - Distribution: `\(f(y_i|x_i) = normal(\mu_i, \sigma^2)\)` -- - Linear predictor: `\(\mu_i = \beta_0 + \beta_1 x_i\)` -- - Link function: `\(g(\mu_i) = \mu_i\)` (identity link) --- class: inverse, middle # probability distributions --- # probability distributions There are *many* probability distributions. We've seen several this semester (what are they?)
-- The three that we will focus on for the remainder of the course are: 1) Poisson distribution 2) Binomial distribution 3) Bernoulli distribution --- # poisson distribution The Poisson distribution is a *discrete* probability distribution (only integers) with support (i.e., possible values) from 0 to `\(+\infty\)`. It is often used to model count data -- The Poisson distribution has one parameter `\(\lambda\)`, which is both the mean and the variance -- <img src="17_glm_files/figure-html/unnamed-chunk-9-1.png" width="576" style="display: block; margin: auto;" /> -- Note that although `\(y\)` must be an integer, `\(\lambda\)` only has to be positive --- # poisson distribution ```r rpois(n = 10, lambda = 1.2) ``` ``` ## [1] 1 2 1 0 2 1 2 2 0 1 ``` ```r rpois(n = 10, lambda = 10.5) ``` ``` ## [1] 8 17 9 8 13 13 14 10 10 14 ``` ```r rpois(n = 10, lambda = 100.68) ``` ``` ## [1] 90 108 93 90 102 104 100 105 97 102 ``` --- # binomial distribution The binomial distribution is also a discrete probability distribution with support from 0 to `\(N\)`. It is often used to model count data where there is an upper limit (e.g., # of heads from `\(N\)` coin flips) -- The binomial distribution has two parameters `\(N\)` (# of trials) and `\(p\)` (probability of success). The mean is `\(Np\)` and the variance is `\(Np(1-p)\)` -- <img src="17_glm_files/figure-html/unnamed-chunk-13-1.png" width="576" style="display: block; margin: auto;" /> -- Note that `\(N\)` is often fixed and known. We are interested in estimating `\(p\)` --- # binomial distribution ```r rbinom(n = 10, size = 20, p = 0.25) ``` ``` ## [1] 4 2 4 5 5 5 7 5 5 6 ``` ```r rbinom(n = 10, size = 20, p = 0.75) ``` ``` ## [1] 15 15 13 17 16 16 17 15 13 17 ``` ```r rbinom(n = 10, size = 50, p = 0.5) ``` ``` ## [1] 15 25 27 26 24 22 26 21 23 23 ``` --- # bernoulli distribution The Bernoulli distribution is a special case of the binomial distribution where `\(N = 1\)`. What are the possible values of `\(y\)`?
-- The Bernoulli distribution is very useful for data where the response variable is binary in nature (e.g., dead/alive, presence/absence). The parameter we estimate is `\(p\)` - the probability of "success" <img src="17_glm_files/figure-html/unnamed-chunk-17-1.png" width="576" style="display: block; margin: auto;" /> --- # bernoulli distribution ```r rbinom(n = 10, size = 1, p = 0.1) ``` ``` ## [1] 0 0 0 0 0 0 0 0 0 0 ``` ```r rbinom(n = 10, size = 1, p = 0.5) ``` ``` ## [1] 0 0 0 0 1 1 1 1 1 0 ``` ```r rbinom(n = 10, size = 1, p = 0.9) ``` ``` ## [1] 1 1 1 1 1 0 1 1 1 1 ``` --- class: inverse, middle # link functions --- # link functions In a conventional linear regression model, the expected value of `\(y_i\)` given `\(x_i\)` is simply: `$$\large E[y_i|x_i] = \mu_i = \beta_0 + \beta_1x_i$$` <br/> <img src="17_glm_files/figure-html/unnamed-chunk-21-1.png" width="432" style="display: block; margin: auto;" /> --- # link functions When the data follow distributions other than the normal, we need to use a transformation to ensure that the linear predictor `\(\mu_i\)` is consistent with the parameters governing the mean of the distribution -- - For the Poisson distribution, the expected value `\(\lambda_i\)` must be positive -- - For the binomial/Bernoulli distributions, the probability of success `\(p_i\)` must be between 0 and 1 -- Link functions allow us to go back and forth between the linear scale ( `\(-\infty\)` to `\(+\infty\)` ) and the scale of the Poisson/binomial distributions --- # link functions <br/> <table> <thead> <tr> <th style="text-align:center;"> Distribution </th> <th style="text-align:center;"> link name </th> <th style="text-align:center;"> link equation </th> <th style="text-align:center;"> inverse link equation </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> Binomial </td> <td style="text-align:center;"> logit </td> <td style="text-align:center;"> \(\mu = log(\frac{p}{1-p})\) </td> <td style="text-align:center;"> \(p =
\frac{e^{\mu}}{1+e^{\mu}}\) </td> </tr> <tr> <td style="text-align:center;"> Poisson </td> <td style="text-align:center;"> log </td> <td style="text-align:center;"> \(\mu = log(\lambda)\) </td> <td style="text-align:center;"> \(\lambda = e^{\mu}\) </td> </tr> </tbody> </table> <br/> -- <table> <thead> <tr> <th style="text-align:center;"> Distribution </th> <th style="text-align:center;"> link name </th> <th style="text-align:center;"> link in R </th> <th style="text-align:center;"> inverse link in R </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> Binomial </td> <td style="text-align:center;"> logit </td> <td style="text-align:center;"> qlogis </td> <td style="text-align:center;"> plogis </td> </tr> <tr> <td style="text-align:center;"> Poisson </td> <td style="text-align:center;"> log </td> <td style="text-align:center;"> log </td> <td style="text-align:center;"> exp </td> </tr> </tbody> </table> --- # log link .small-code[ ```r beta0 <- 5 beta1 <- -0.08 (log.lambda <- beta0 + beta1*100) # linear predictor for x = 100 ``` ``` ## [1] -3 ``` ] -- How do we convert -3 to a positive value? Use the inverse-link: .small-code[ ```r (lambda <- exp(log.lambda)) ``` ``` ## [1] 0.04979 ``` ] -- To go back, use the link function itself: .small-code[ ```r log(lambda) ``` ``` ## [1] -3 ``` ] --- # log link <br/> <img src="17_glm_files/figure-html/unnamed-chunk-27-1.png" width="432" style="display: block; margin: auto;" /> --- # log link example .pull-left[ <br/> <img src="17_glm_files/figure-html/unnamed-chunk-28-1.png" width="288" style="display: block; margin: auto;" /> ] -- .pull-right[ <br/> <img src="17_glm_files/figure-html/unnamed-chunk-29-1.png" width="288" style="display: block; margin: auto;" /> ] --- # logit link .small-code[ ```r beta0 <- 5 beta1 <- -0.08 (logit.p <- beta0 + beta1*100) # linear predictor for x = 100 ``` ``` ## [1] -3 ``` ] -- How do we convert -3 to a probability? 
Use the inverse-link: .small-code[ ```r (p <- exp(logit.p)/(1+exp(logit.p))) # same as plogis(logit.p) ``` ``` ## [1] 0.04743 ``` ] -- To go back, use the link function itself: .small-code[ ```r log(p/(1-p)) ``` ``` ## [1] -3 ``` ```r qlogis(p) ``` ``` ## [1] -3 ``` ] --- # logit link <br/> <img src="17_glm_files/figure-html/unnamed-chunk-33-1.png" width="432" style="display: block; margin: auto;" /> --- # logit link example .pull-left[ <br/> <img src="17_glm_files/figure-html/unnamed-chunk-34-1.png" width="288" style="display: block; margin: auto;" /> ] -- .pull-right[ <br/> <img src="17_glm_files/figure-html/unnamed-chunk-35-1.png" width="288" style="display: block; margin: auto;" /> ] --- # generalized linear models Putting the distribution and link functions together, we can write several of the most common GLMs -- #### Logistic regression -- - Distribution: `\(f(y_i|x_i) = Bernoulli(p_i)\)` -- - Linear predictor: `\(\mu_i = \beta_0 + \beta_1 x_i\)` -- - Link function: `\(g(p_i) = logit(p_i) = \mu_i\)` (or `\(p_i = ilogit(\mu_i)\)`) --- # generalized linear models Putting the distribution and link functions together, we can write several of the most common GLMs #### Poisson regression -- - Distribution: `\(f(y_i|x_i) = Poisson(\lambda_i)\)` -- - Linear predictor: `\(\mu_i = \beta_0 + \beta_1 x_i\)` -- - Link function: `\(g(\lambda_i) = log(\lambda_i) = \mu_i\)` (or `\(\lambda_i = exp(\mu_i)\)`) --- # looking ahead <br/> ### **Next time**: Logistic regression <br/> ### **Reading**: [Fieberg chp. 16](https://statistics4ecologists-v1.netlify.app/logistic)