Lecture 10

class: center, middle, inverse, title-slide

.title[
# Lecture 10
]
.subtitle[
## Dynamic Occupancy Models
]
.author[
### Spring 2025
]

---

# Readings: 
 
> Kéry & Schaub 383-411

> Occupancy Estimation and Modeling : Inferring Patterns and Dynamics of Species Occurrence by Darryl I. MacKenzie, James D. Nichols, J. Andrew Royle, Kenneth H. Pollock, Larissa Bailey, and James E. Hines

---
## Closed Occupancy models

$$\Large logit(\psi_j) = \beta_0 + \beta_1*x_j $$

$$ \Large z_j \sim Bernoulli(\psi_j) $$

$$ \Large y_{j,k} \sim Bernoulli(z_j, p_k)$$
---

## Assumptions of closed occupancy models

1) Binomial distribution is a true description of state/observation processes

2) Occupancy at each site is random and independent of occupancy at all other sites

3) Population is closed between surveys

4) Observers do not produce false positives

---
## Estimating `$\LARGE p$`

#### Imagine a single site surveyed 3 times:

- Assume site is closed across samples

- Assume constant `$\large p$`

`$$\LARGE y_i = [111]$$`

--
#### What is the likelihood of this observation?

--
P(occupied) x P(observed 3 times)

`$$\LARGE \psi p^3$$`

---
## Estimating `$\LARGE p$`

#### What about?

`$$\LARGE y_i = [011]$$`

--
P(occupied) x P(not observed 1 time) x P(observed 2 times)

`$$\LARGE \psi (1-p)p^2$$`

---
## Estimating `$\LARGE p$`

#### What about?

`$$\LARGE y_i = [000]$$`

--
 
Three 0's can arise from two different scenarios - the site is not occupied, or the site is occupied but we missed it 3 times.

P(not occupied) + P(occupied) x P(not observed 3 times)

`$$\LARGE (1 - \psi) + \psi (1-p)^3$$`

---
## Multi-season (dynamic) occupancy model

#### What if occupancy can change over time?

- Data collection using the *robust design*
    + Population open between primary periods (e.g., years)
    + Population closed within secondary periods (e.g., occasions)

`$$\Large y_i = [\underbrace{101}_{Year\;1}\;\;\;\;\; \underbrace{000}_{Year\;2}\;\;\;\;\;\underbrace{110}_{Year\;3}\;\;\;\;\;\underbrace{100}_{Year\;4}]$$`

---
## Probability of Observations

#### What is the likelihood of this observation?

`$$\Large y_i = [\underbrace{101}_{Year\;1}\;\;\;\;\; \underbrace{000}_{Year\;2}\;\;\;\;\;\underbrace{110}_{Year\;3}]$$`
--

Year 1: `$$\psi_1p_1^2(1-p_1)$$`
 
--
Year 2: `$$\psi_2(1-p_2)^3 + (1-\psi_2)$$`
 
--
Year 3: `$$\psi_3p_3^2(1-p_3)$$`

---
## Probability of Observations

#### What is the likelihood of this observation?

`$$\Large y_i = [\underbrace{101}_{Year\;1}\;\;\;\;\; \underbrace{000}_{Year\;2}\;\;\;\;\;\underbrace{110}_{Year\;3}]$$`
Therefore `$$Pr(y_i = [101\;\;\;\;\; 000 \;\;\;\;\;110 \;\;\;\;\; 100 ]| \boldsymbol{ \psi, p} ) = \\\psi_1p_1^2(1-p_1)\\\times\psi_2(1-p_2)^3 + (1-\psi_2)\\\times\psi_3p_3^2(1-p_3)$$`

---
## Occupancy Dynamics

Remember that occupancy is a *markovian* process - therefore occupancy status is not independent between years! We need a model to explain *why* occupancy can change.

Assume some site is occupied in year 1. In year 2, it can either stay occupied (survive) or no longer be occupied (go 'extinct')

---
## Occupancy Dynamics

Similarly, a site can be empty in year 1. In year 2, it can either become occupied (colonization) or remain unoccupied.

---
## Multi-season (dynamic) occupancy model

- In year 1:
`$$\Large z_{i,1} \sim Bernoulli(\psi)$$`

- In years 2+:
`$$\Large z_{i,t} \sim Bernoulli(z_{i,t-1}(1-\phi)+(1-z_{i,t-1})\gamma)$$`

- `$\Large z_{i,t}$` = Binary occupancy state

- `$\Large\psi$` = Initial occupancy probability

- `$\Large\phi$` = Survival probability

- `$\Large\gamma$` = Colonization probability

*Some authors will use different greek letters

---
## Multi-season (dynamic) occupancy model

We can easily model any of our probabilities (`$\psi$`,  `$\phi$`,  `$\gamma$`) as functions of covariates.

`$$logit(\Large\psi_i) = \beta_0 + \beta_1\boldsymbol{x_i}$$`
$$logit(\Large\phi_t) = \alpha_0 + \alpha_1\boldsymbol{x_t}  ...  $$

---
# Probability of Occurrence

There are several derived quantites that are often of interest.

Probability of occurrence is just the mean `$\psi$` at average covariate conditions. We calculate this via `$\phi$` and `$\gamma$`.

`$$\Large \hat{\psi_1} = logit(\beta_0)$$`
For `$t$` > 1

`$$\Large \hat{\psi_t} = \hat{\psi}_{t-1}\phi_{t-1} + (1-\hat{\psi}_{t-1})\gamma_{t-1}$$`
---
# Turnover

Turnover is another common derived quantity. This can be defined multiple ways, though the most common is the proportion of all sites that *just* became occupied relative to the total number of occupied sites.

You can calculate this as either expected or realized turnover, though expected is more common.

`$$\Large \tau_{t} = \frac{(1-\hat{\psi}_{t})\gamma_{t}}{\hat{\psi}_{t}} =\frac{(1-\hat{\psi}_{t})\gamma_{t}}{\hat{\psi}_{t}\phi_{t} + (1-\hat{\psi}_{t})\gamma_{t}}$$`

---
# Growth Rate

Finally, some authors like to include a growth rate parameter, which is just the rate of change in occupancy probability over time.

`$$\Large \lambda_t = \frac{\hat{\psi_t}}{\hat{\psi}_{t-1}}$$`

---

## In Class Exercise

Simulate 15 years of a dynamic occupancy system using the following information:

- `set.seed(55)`

- `$\psi_1$` = .62

- `$\phi$` = .8

- `$\gamma$` = .2

- nsites = 40

- Binomial detection process with `$p$` = .4 and 4 site visits per year

---

## In Class Exercise

``` r
set.seed(55)
nyears <- 15
psi1 <- .62
phi <- .8
gamma <- .1
nsites <- 40
nvisits <- 4
p <- .4

z <- y <-array(NA, c(nsites, nyears))
z[,1] <- rbinom(nsites, 1, psi1)
for(i in 1:nsites){
 y[i,1] <- rbinom(1, z[i,1]*nvisits, p)
 for(t in 2:nyears){
 z[i, t] <- rbinom(1, 1, z[i,t-1]*(1-phi)+(1-z[i,t-1])*gamma)
 y[i, t] <- rbinom(1, z[i,t]*nvisits, p)
 }
}
```

---

## In Class Exercise

What proportion of sites were occupied in each year?

``` r
colSums(z)/nsites
```

```
##  [1] 0.700 0.250 0.150 0.125 0.150 0.175 0.175 0.050 0.100 0.225 0.125 0.125
## [13] 0.075 0.125 0.025
```

How about observed occupancy?

``` r
colSums(y > 0)/nsites
```

```
##  [1] 0.500 0.200 0.150 0.125 0.125 0.125 0.150 0.050 0.075 0.200 0.125 0.125
## [13] 0.025 0.125 0.025
```

---
## In Class Exercise

What's the expected probability of occurence?

`$$\hat{\psi_1} = logit(\beta_0)$$`

For `$t$` > 1

`$$\hat{\psi_t} = \hat{\psi}_{t-1}\phi_{t-1} + (1-\hat{\psi}_{t-1})\gamma_{t-1}$$`

``` r
psi <- array(NA, nyears)
psi[1] <- psi1
for(t in 2:nyears){
 psi[t] <- psi[t-1]*phi + (1-psi[t-1])*gamma
}
psi
```

```
##  [1] 0.6200 0.5340 0.4738 0.4317 0.4022 0.3815 0.3671 0.3569 0.3499 0.3449
## [11] 0.3414 0.3390 0.3373 0.3361 0.3353
```

---
# Expected vs Realized Occupancy

---
## Nimble Model

``` r
occ.mod <- nimbleCode({
for(i in 1:nsites){
 z[i,1] ~dbern(psi)
 y[i,1] ~dbinom(p, z[i,1]*nvisits)
 
 for(t in 2:nyears){
 z[i, t] ~dbern(z[i,t-1]*(1-phi)+(1-z[i,t-1])*gamma)
 y[i, t] ~dbinom(p, z[i,t]*nvisits)
 } #end t
} #end i
 
gamma ~ dbeta(1, 1) 
phi ~ dbeta(1, 1) 
p ~ dbeta(1, 1)
psi ~ dbeta(1,1)
})
```

---
## Nimble Model

``` r
params <- c('gamma', 'phi', 'p', 'psi')
n.c <- list(nsites = nsites, nvisits = nvisits, nyears = nyears)
n.d <- list(y =y)
n.i <- function()list(p = runif(1), gamma = runif(1), phi = runif(1), psi = runif(1))
```

Initial values for `$z$` can be tricky - we need to make sure we don't suggest a site is unoccupied if we had observations there 
 
--
One option is to start anywhere with y = 0 as unoccupied

``` r
z.init <- (y >0)*1
n.i <- function()list(p = runif(1), gamma = runif(1), phi = runif(1), psi = runif(1), z.init =z.init)
```

--
Alternatively, flip a coin for any sites where y = 0

``` r
z.init <- matrix(rbinom(nsites*nyears, 1, .5), ncol = nyears)
z.init[which(y > 0)] <- 1 
n.i <- function()list(p = runif(1), gamma = runif(1), phi = runif(1), psi = runif(1), z =z.init)
```

---
## Run the model in Nimble

``` r
occ_out <- nimbleMCMC(code = occ.mod,
 data = n.d,
 constants = n.c,
 inits = n.i(),
 monitors = params,
 thin = 1,
 niter = 5000,
 nburnin = 2500,
 nchains = 3,
 samplesAsCodaMCMC = TRUE
 )
```

---
## Check the model
<img src="10_DynamicOcc_files/figure-html/unnamed-chunk-15-1.png" width="576" style="display: block; margin: auto;" /><img src="10_DynamicOcc_files/figure-html/unnamed-chunk-15-2.png" width="576" style="display: block; margin: auto;" />

---
## Compare with simulation

|      |   2.5%|    25%|    50%|    75%|  97.5%| true|
|:-----|------:|------:|------:|------:|------:|----:|
|gamma | 0.0914| 0.1106| 0.1224| 0.1336| 0.1599| 0.10|
|p     | 0.3373| 0.3784| 0.3996| 0.4214| 0.4646| 0.40|
|phi   | 0.7000| 0.7692| 0.8004| 0.8288| 0.8818| 0.80|
|psi   | 0.4059| 0.5184| 0.5813| 0.6434| 0.7610| 0.62|

---
## Expansions

Royle and Dorazio (2008) suggested a reparameterization of this model that can sometimes help with convergence.

Before we had: 
`$$\Large z_{i,t} \sim Bernoulli(z_{i,t-1}(1-\phi)+(1-z_{i,t-1})\gamma)$$`

--
Instead, let:

`$$\Large z_{i,t} \sim Bernoulli(\pi_{i,t})$$`

`$$\Large logit(\pi_{i,t}) = \alpha_{t-1} + \beta_{t-1}*z_{i,t-1}$$`

---
## Expansions

`$$\Large logit(\pi_{i,t}) = \alpha_{t-1} + \beta_{t-1}*z_{i,t-1}$$`

We can then derive `$\phi$` and `$\gamma$`

`$$\Large logit(\phi_{i,t}) = \alpha_{t-1} + \beta_{t-1}$$`

`$$\Large logit(\gamma_{i,t}) = \alpha_{t-1}$$`