class: center, middle, inverse, title-slide # Lecture 7 ## Generalized linear model review ###
WILD6900 (Spring 2021) --- ## Readings > ### Kéry & Schaub pp. 48-55 --- # Statistics cookbook <img src="stats_flow_chart.png" width="50%" style="display: block; margin: auto;" /> --- class: inverse, center, middle # From linear models to GLMs --- # Linear models <br/> <br/> `$$\Large response = deterministic\; part + stochastic\; part$$` <br/> <br/> -- `$$\underbrace{\LARGE \mu_i = \beta_0 + \beta_1 \times x_i}_{Deterministic}$$` <br/> <br/> -- `$$\underbrace{\LARGE y_i \sim normal(\mu_i, \tau)}_{Stochastic}$$` ??? Note that the deterministic portion of the model has the same form as the equation for a line: `\(y = a + b \times x\)`, which is why we call these linear models --- class: inverse, middle, center # Linear models under the hood ## Variations on the deterministic model --- # A simple linear model: the t-test <br/> <br/> <br/> `$$\LARGE \mu_i = \beta_0 + \beta_1 \times x_i$$` <br/> <br/> -- `$$\LARGE x_i \in \{0,1\}$$` --- ## Under the hood: The t-test `$$\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ .\\ .\\ .\\ \mu_N \end{bmatrix} = \begin{bmatrix} 1 & x_1\\ 1 & x_2\\ 1 & x_3\\ . \\ . \\ . \\ 1 & x_N \end{bmatrix} \times \begin{bmatrix} \beta_0\\ \beta_1 \end{bmatrix}$$` <br/> <br/> -- `$$\LARGE \mu_1 = \beta_0 \times 1 + \beta_1 \times x_1$$` `$$\LARGE \mu_2 = \beta_0 \times 1 + \beta_1 \times x_2$$` --- ## Under the hood: The t-test ```r model.matrix(lm(mpg ~ am, data = mtcars))[c(3,1,5),] ``` ``` ## (Intercept) am1 ## Datsun 710 1 1 ## Mazda RX4 1 1 ## Hornet Sportabout 1 0 ``` -- ### `\(\LARGE \beta_0\)` is the mean mpg for automatic transmissions -- ### `\(\LARGE \beta_1\)` is the *difference* between automatic and manual transmissions ??? 
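A quick numerical check of this interpretation (a sketch using only the built-in `mtcars` data; here `am` is treated as the 0/1 transmission indicator):

```r
# Fit the t-test as a linear model (am = 0: automatic, am = 1: manual)
fit <- lm(mpg ~ am, data = mtcars)

# Group means computed directly from the raw data
auto_mean   <- mean(mtcars$mpg[mtcars$am == 0])
manual_mean <- mean(mtcars$mpg[mtcars$am == 1])

# beta_0 recovers the automatic-group mean (17.147)
# beta_1 recovers the manual-minus-automatic difference (7.245)
all.equal(unname(coef(fit)[1]), auto_mean)
all.equal(unname(coef(fit)[2]), manual_mean - auto_mean)
```

Both comparisons return `TRUE`: with a single binary predictor, the intercept is the reference-group mean and the slope is the between-group difference.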
This is called the *effects* parameterization --- ## Under the hood: The t-test ```r lm(mpg ~ am, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 17.147 </td> <td style="text-align:right;"> 1.125 </td> <td style="text-align:right;"> 15.248 </td> <td style="text-align:right;"> 0e+00 </td> </tr> <tr> <td style="text-align:left;"> am1 </td> <td style="text-align:right;"> 7.245 </td> <td style="text-align:right;"> 1.764 </td> <td style="text-align:right;"> 4.106 </td> <td style="text-align:right;"> 3e-04 </td> </tr> </tbody> </table> --- ## Under the hood: The t-test ```r model.matrix(lm(mpg ~ as.factor(am) - 1, data = mtcars))[c(3,1,5),] ``` ``` ## am0 am1 ## Datsun 710 0 1 ## Mazda RX4 0 1 ## Hornet Sportabout 1 0 ``` -- ### `\(\large \beta_0\)` is the mean mpg for automatic transmissions -- ### `\(\large \beta_1\)` is the mean mpg for manual transmissions ??? This is called the *means* parameterization. A more compact way to write this model would be `\(\mu_i = \beta_{0[j]}\)`. 
Instead of `\(\beta_0\)` (intercept) and `\(\beta_1\)` (slope), we essentially have one intercept for each factor level --- ## Under the hood: The t-test ```r lm(mpg ~ as.factor(am) - 1, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> am0 </td> <td style="text-align:right;"> 17.15 </td> <td style="text-align:right;"> 1.125 </td> <td style="text-align:right;"> 15.25 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> am1 </td> <td style="text-align:right;"> 24.39 </td> <td style="text-align:right;"> 1.360 </td> <td style="text-align:right;"> 17.94 </td> <td style="text-align:right;"> 0 </td> </tr> </tbody> </table> --- # The t-test becomes an ANOVA <br/> <br/> <br/> `$$\LARGE \mu_i = \beta_0 + \beta_{1[j]} \times x_i$$` <br/> <br/> -- `$$j \in \{1,2,3,...,J-1\}$$` `$$\LARGE x_i \in \{0,1\}$$` --- ## Under the hood: ANOVA ```r model.matrix(lm(mpg ~ as.factor(cyl), data = mtcars))[c(3,1,5),] ``` ``` ## (Intercept) cyl6 cyl8 ## Datsun 710 1 0 0 ## Mazda RX4 1 1 0 ## Hornet Sportabout 1 0 1 ``` -- ### `\(\large \beta_0\)` is the mean mpg for 4-cylinders -- ### `\(\large \beta_{1[6-cyl]}\)` is the *difference* between 4-cyl & 6-cyl -- ### `\(\large \beta_{1[8-cyl]}\)` is the *difference* between 4-cyl & 8-cyl --- ## Under the hood: ANOVA ### Effects parameterization ```r lm(mpg ~ as.factor(cyl), data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 26.664 
</td> <td style="text-align:right;"> 0.9718 </td> <td style="text-align:right;"> 27.437 </td> <td style="text-align:right;"> 0e+00 </td> </tr> <tr> <td style="text-align:left;"> cyl6 </td> <td style="text-align:right;"> -6.921 </td> <td style="text-align:right;"> 1.5583 </td> <td style="text-align:right;"> -4.441 </td> <td style="text-align:right;"> 1e-04 </td> </tr> <tr> <td style="text-align:left;"> cyl8 </td> <td style="text-align:right;"> -11.564 </td> <td style="text-align:right;"> 1.2986 </td> <td style="text-align:right;"> -8.905 </td> <td style="text-align:right;"> 0e+00 </td> </tr> </tbody> </table> --- ## Under the hood: ANOVA ### Means parameterization `$$\LARGE \mu_i = \beta_{0[j]}$$` ```r lm(mpg ~ as.factor(cyl) - 1, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> cyl4 </td> <td style="text-align:right;"> 26.66 </td> <td style="text-align:right;"> 0.9718 </td> <td style="text-align:right;"> 27.44 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> cyl6 </td> <td style="text-align:right;"> 19.74 </td> <td style="text-align:right;"> 1.2182 </td> <td style="text-align:right;"> 16.21 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> cyl8 </td> <td style="text-align:right;"> 15.10 </td> <td style="text-align:right;"> 0.8614 </td> <td style="text-align:right;"> 17.53 </td> <td style="text-align:right;"> 0 </td> </tr> </tbody> </table> --- # The ANOVA becomes an ANCOVA <br/> <br/> <br/> `$$\LARGE \mu_i = \beta_0 + \beta_{1[j]} \times x1_i + \beta_2 \times x2_i$$` <br/> <br/> -- `$$\LARGE x1_i \in \{0,1\}$$` <br/> `$$\LARGE x2_i \in (-\infty, \infty)$$` --- ## Under the hood: ANCOVA ```r model.matrix(lm(mpg ~ as.factor(cyl) 
+ hp, data = mtcars))[c(3,1,5),] ``` ``` ## (Intercept) cyl6 cyl8 hp ## Datsun 710 1 0 0 93 ## Mazda RX4 1 1 0 110 ## Hornet Sportabout 1 0 1 175 ``` -- #### `\(\large \beta_0\)` is the mean mpg for 4-cylinders **@ 0hp** -- #### `\(\large \beta_{1[6-cyl]}\)` is the *difference* between 4-cyl & 6-cyl **@ 0hp** -- #### `\(\large \beta_{1[8-cyl]}\)` is the *difference* between 4-cyl & 8-cyl **@ 0hp** -- #### `\(\large \beta_2\)` is the effect of hp on mpg ??? In this formulation, we assume that the effect of horsepower is the same for all cylinder levels --- ## Under the hood: ANCOVA ```r lm(mpg ~ as.factor(cyl) + hp, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 28.650 </td> <td style="text-align:right;"> 1.5878 </td> <td style="text-align:right;"> 18.044 </td> <td style="text-align:right;"> 0.0000 </td> </tr> <tr> <td style="text-align:left;"> cyl6 </td> <td style="text-align:right;"> -5.968 </td> <td style="text-align:right;"> 1.6393 </td> <td style="text-align:right;"> -3.640 </td> <td style="text-align:right;"> 0.0011 </td> </tr> <tr> <td style="text-align:left;"> cyl8 </td> <td style="text-align:right;"> -8.521 </td> <td style="text-align:right;"> 2.3261 </td> <td style="text-align:right;"> -3.663 </td> <td style="text-align:right;"> 0.0010 </td> </tr> <tr> <td style="text-align:left;"> hp </td> <td style="text-align:right;"> -0.024 </td> <td style="text-align:right;"> 0.0154 </td> <td style="text-align:right;"> -1.560 </td> <td style="text-align:right;"> 0.1300 </td> </tr> </tbody> </table> --- ## Under the hood: Interactions `$$\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ .\\ .\\ .\\ \mu_N \end{bmatrix} = \begin{bmatrix} 1 & x1_1 & x2_1 & 
x1_1 * x2_1\\ 1 & x1_2 & x2_2 & x1_2 * x2_2\\ 1 & x1_3 & x2_3 & x1_3 * x2_3\\ . \\ . \\ . \\ 1 & x1_N & x2_N & x1_N * x2_N \end{bmatrix} \times \begin{bmatrix} \beta_0\\ \beta_1\\ \beta_2\\ \beta_3 \end{bmatrix}$$` --- ## Under the hood: Interactions ```r model.matrix(lm(mpg~as.factor(cyl)*hp, data = mtcars))[c(3,1,5),] ``` ``` ## (Intercept) cyl6 cyl8 hp cyl6:hp cyl8:hp ## Datsun 710 1 0 0 93 0 0 ## Mazda RX4 1 1 0 110 110 0 ## Hornet Sportabout 1 0 1 175 0 175 ``` -- ##### `\(\large \beta_0\)` is the mean mpg for 4-cylinders **@ 0hp** -- ##### `\(\large \beta_{1[6-cyl]}\)` is the *difference* between 4-cyl & 6-cyl **@ 0hp** -- ##### `\(\large \beta_{1[8-cyl]}\)` is the *difference* between 4-cyl & 8-cyl **@ 0hp** -- ##### `\(\large \beta_2\)` is the effect of hp on mpg for 4-cylinders -- ##### `\(\large \beta_{3[6-cyl]}\)` is the *difference* between the effect of hp in 4-cyl vs 6-cyl -- ##### `\(\large \beta_{3[8-cyl]}\)` is the *difference* between the effect of hp in 4-cyl vs 8-cyl --- ## Under the hood: Interactions ```r lm(mpg ~ as.factor(cyl) * hp, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:right;"> 35.9830 </td> <td style="text-align:right;"> 3.8891 </td> <td style="text-align:right;"> 9.252 </td> <td style="text-align:right;"> 0.0000 </td> </tr> <tr> <td style="text-align:left;"> cyl6 </td> <td style="text-align:right;"> -15.3092 </td> <td style="text-align:right;"> 7.4346 </td> <td style="text-align:right;"> -2.059 </td> <td style="text-align:right;"> 0.0496 </td> </tr> <tr> <td style="text-align:left;"> cyl8 </td> <td style="text-align:right;"> -17.9030 </td> <td style="text-align:right;"> 5.2596 </td> <td style="text-align:right;"> 
-3.404 </td> <td style="text-align:right;"> 0.0022 </td> </tr> <tr> <td style="text-align:left;"> hp </td> <td style="text-align:right;"> -0.1128 </td> <td style="text-align:right;"> 0.0457 </td> <td style="text-align:right;"> -2.465 </td> <td style="text-align:right;"> 0.0206 </td> </tr> <tr> <td style="text-align:left;"> cyl6:hp </td> <td style="text-align:right;"> 0.1052 </td> <td style="text-align:right;"> 0.0685 </td> <td style="text-align:right;"> 1.536 </td> <td style="text-align:right;"> 0.1367 </td> </tr> <tr> <td style="text-align:left;"> cyl8:hp </td> <td style="text-align:right;"> 0.0985 </td> <td style="text-align:right;"> 0.0486 </td> <td style="text-align:right;"> 2.026 </td> <td style="text-align:right;"> 0.0531 </td> </tr> </tbody> </table> --- ## Under the hood: Interactions ```r model.matrix(lm(mpg ~ as.factor(cyl) * hp - 1 - hp, data = mtcars))[c(3,1,5),] ``` ``` ## cyl4 cyl6 cyl8 cyl4:hp cyl6:hp cyl8:hp ## Datsun 710 1 0 0 93 0 0 ## Mazda RX4 0 1 0 0 110 0 ## Hornet Sportabout 0 0 1 0 0 175 ``` -- `\(\large \beta_{0[4-cyl]}\)` is the mean mpg for 4-cylinders **@ 0hp** -- `\(\large \beta_{0[6-cyl]}\)` is the mean mpg for 6-cylinders **@ 0hp** -- `\(\large \beta_{0[8-cyl]}\)` is the mean mpg for 8-cylinders **@ 0hp** -- `\(\large \beta_{1[4-cyl]}\)` is the effect of hp on mpg for 4-cylinders -- `\(\large \beta_{1[6-cyl]}\)` is the effect of hp on mpg for 6-cylinders -- `\(\large \beta_{1[8-cyl]}\)` is the effect of hp on mpg for 8-cylinders --- ### Under the hood: Interactions ```r lm(mpg ~ as.factor(cyl) * hp - 1, data = mtcars) ``` <table> <thead> <tr> <th style="text-align:left;"> term </th> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> std.error </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> cyl4 </td> <td style="text-align:right;"> 35.9830 </td> <td style="text-align:right;"> 3.8891 </td> <td 
style="text-align:right;"> 9.2523 </td> <td style="text-align:right;"> 0.0000 </td> </tr> <tr> <td style="text-align:left;"> cyl6 </td> <td style="text-align:right;"> 20.6739 </td> <td style="text-align:right;"> 6.3362 </td> <td style="text-align:right;"> 3.2628 </td> <td style="text-align:right;"> 0.0031 </td> </tr> <tr> <td style="text-align:left;"> cyl8 </td> <td style="text-align:right;"> 18.0801 </td> <td style="text-align:right;"> 3.5410 </td> <td style="text-align:right;"> 5.1059 </td> <td style="text-align:right;"> 0.0000 </td> </tr> <tr> <td style="text-align:left;"> cyl4:hp </td> <td style="text-align:right;"> -0.1128 </td> <td style="text-align:right;"> 0.0457 </td> <td style="text-align:right;"> -2.4652 </td> <td style="text-align:right;"> 0.0206 </td> </tr> <tr> <td style="text-align:left;"> cyl6:hp </td> <td style="text-align:right;"> -0.0076 </td> <td style="text-align:right;"> 0.0510 </td> <td style="text-align:right;"> -0.1494 </td> <td style="text-align:right;"> 0.8824 </td> </tr> <tr> <td style="text-align:left;"> cyl8:hp </td> <td style="text-align:right;"> -0.0142 </td> <td style="text-align:right;"> 0.0165 </td> <td style="text-align:right;"> -0.8645 </td> <td style="text-align:right;"> 0.3952 </td> </tr> </tbody> </table> --- class: inverse, middle, center # Linear models under the hood ## Variations on the stochastic model --- ## Stochasticity in the linear model `$$\underbrace{\LARGE \mu_i = -2 + 0.5 \times x_i}_{Deterministic}$$` -- <img src="Lecture7_files/figure-html/unnamed-chunk-28-1.png" width="288" style="display: block; margin: auto;" /> --- ## Stochasticity in the linear model `$$\LARGE \mu_i = -2 + 0.5 \times x_i$$` `$$\underbrace{\LARGE y_i \sim normal(\mu_i, \tau)}_{Stochastic}$$` -- <img src="Lecture7_files/figure-html/unnamed-chunk-29-1.png" width="288" style="display: block; margin: auto;" /> --- ## Stochasticity in the linear model `$$\LARGE \mu_i = -2 + 0.5 \times x_i$$` <img 
src="Lecture7_files/figure-html/unnamed-chunk-30-1.png" width="288" style="display: block; margin: auto;" /> --- ## Stochasticity in the linear model `$$\LARGE \mu_i = -2 + 0.5 \times x_i$$` `$$\LARGE y_i \sim normal(\mu_i, \tau)$$` <img src="Lecture7_files/figure-html/unnamed-chunk-31-1.png" width="288" style="display: block; margin: auto;" /> --- class: inverse, center, middle # Components of the linear model --- ## Components of the linear model ### 1) Distribution `$$\LARGE y_i \sim normal(\mu_i, \tau)$$` -- ### 2) Link function `$$\LARGE \mu_i = E(y_i) = linear\;\;predictor$$` -- ### 3) Linear predictor `$$\LARGE \beta_0 + \beta_1 \times x_i$$` --- ## Stochasticity in the linear model ### What happens if `\(\Large 0 \leq y_i\)`? <br/> <br/> -- <img src="Lecture7_files/figure-html/unnamed-chunk-32-1.png" width="288" style="display: block; margin: auto;" /> --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim normal(\mu_i, \tau)$$` --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim Poisson(\lambda_i)$$` -- ### 2) Link function `$$\LARGE \lambda_i = E(y_i) = linear\;\;predictor$$` --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim Poisson(\lambda_i)$$` ### 2) Link function `$$\LARGE log(\lambda_i) = log(E(y_i)) = linear\;\;predictor$$` -- <img src="Lecture7_files/figure-html/unnamed-chunk-33-1.png" width="252" style="display: block; margin: auto;" /> --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim Poisson(\lambda_i)$$` ### 2) Link function `$$\LARGE log(\lambda_i) = log(E(y_i)) = linear\;\;predictor$$` --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim Poisson(\lambda_i)$$` ### 2) Link function `$$\LARGE log(\lambda_i) = log(E(y_i)) = linear\;\;predictor$$` ### 3) Linear predictor `$$\LARGE \beta_0 + \beta_1 \times x_i$$` --- ## Components of the generalized linear model ### 
1) Distribution `$$\LARGE y_i \sim Bernoulli(p_i)$$` -- ### 2) Link function `$$\LARGE logit(p_i) = log \bigg(\frac{p_i}{1-p_i}\bigg) = linear\;\;predictor$$` -- <img src="Lecture7_files/figure-html/unnamed-chunk-34-1.png" width="252" style="display: block; margin: auto;" /> --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim Bernoulli(p_i)$$` ### 2) Link function `$$\LARGE logit(p_i) = log \bigg(\frac{p_i}{1-p_i}\bigg) = linear\;\;predictor$$` ### 3) Linear predictor `$$\LARGE \beta_0 + \beta_1 \times x_i$$` --- ## Components of the generalized linear model ### 1) Distribution `$$\LARGE y_i \sim binomial(N, p_i)$$` ### 2) Link function `$$\LARGE logit(p_i) = log \bigg(\frac{p_i}{1-p_i}\bigg) = linear\;\;predictor$$` ### 3) Linear predictor `$$\LARGE \beta_0 + \beta_1 \times x_i$$` --- class: inverse ## Generalized linear models <br/> <br/> -- - Flexible method to model observations arising from different probability distributions <br/> <br/> -- - Link many classical tests under a unified framework <br/> <br/> -- - Underlie nearly all common ecological models
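---
## Putting the three components together in R

A minimal simulation sketch (the coefficient values `\(\beta_0 = -2\)` and `\(\beta_1 = 0.5\)` echo the earlier example; the sample size is an arbitrary choice):

```r
# 1) distribution: y_i ~ Poisson(lambda_i)
# 2) link:         log(lambda_i) = linear predictor
# 3) predictor:    beta0 + beta1 * x_i
set.seed(123)
n <- 1000
x <- rnorm(n)

beta0 <- -2
beta1 <- 0.5
lambda <- exp(beta0 + beta1 * x)  # inverse of the log link
y <- rpois(n, lambda)

fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)  # estimates should land near beta0 = -2 and beta1 = 0.5
```

The same recipe handles the Bernoulli/binomial case by swapping in `rbinom()` with `plogis()` and `family = binomial(link = "logit")`.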