In this lab, we’ll continue to explore multi-state models using a data set from Kéry & Schaub (2015). The data comes from a long-term study of Showy Lady’s Slipper (Cypripedium reginae) in West Virginia (Kéry & Gregg 2004). This perennial orchid is one of the tallest orchid species native to North America and is the state flower of Minnesota. The species has always been rare but in recent decades it has been extirpated throughout much of its range due to habitat loss.

The Showy Lady's Slipper. Image courtesy of shcostello via Wikicommons (CC BY 3.0)

The Showy Lady’s Slipper. Image courtesy of shcostello via Wikicommons (CC BY 3.0)

Why, you may ask, should we perform a mark-recapture study to estimate the survival of a plant species? First, individual plants can remain dormant below ground in years when conditions are not suitable for growth. As a result, even if detection probability is 1 for plants in the vegetative and flowering stages, we do not know the true state (dead or alive) of dormant plants. Second, the multi-state model is a natural framework for estimating transitions between stages (vegetative to flowering, dormant to flowering, etc.). These transition probabilities form the basis for matrix projection models so the ability to estimate them at the same time as survival is a big advantage of this modeling framework.

Following Kéry & Gregg (2004) and Kéry & Schaub (2015), we will define four states: “vegetative”, “flowering”, “dormant”, and “dead”. Individuals that are in the vegetative, flowering, and dormant states can die with probability \(s\) (note that because there is not emigration in plants, we can model true rather than apparent survival). Individuals can also transition between stages or remain in their current state with probabilities:

  • \(\psi_{VV}\): probability of staying in the vegetative state

  • \(\psi_{FV}\): probability of transitioning from flowering to vegetative

  • \(\psi_{DV}\): probability of transitioning from dormant to vegetative

  • \(\psi_{FF}\): probability of staying in the flowering state

  • \(\psi_{VF}\): probability of transitioning from vegetative to flowering

  • \(\psi_{DF}\): probability of transitioning from dormant to flowering

  • \(\psi_{DD}\): probability of staying in the dormant state

  • \(\psi_{VD}\): probability of transitioning from flowering to dormant

  • \(\psi_{FD}\): probability of transitioning from flowering to dormant

Vegetative Flowering Dormant Dead
Vegetative \(s \psi_{VV}\) \(s \psi_{VF}\) \(s \psi_{VD}\) \(1 - s\)
Flowering \(s \psi_{FV}\) \(s \psi_{FF}\) \(s \psi_{FD}\) \(1 -s\)
Dormant \(s \psi_{DV}\) \(s \psi_{DF}\) \(s \psi_{DD}\) \(1 -s\)
Dead 0 0 0 1

We will assume, following Kéry & Gregg (2004), that individuals in the vegetative and flowering states are observed perfectly and that dead or dormant individuals are never detected. Therefore:

Seen veg. Seen flow. Not Seen
Vegetative 1 0 0
Flowering 0 1 0
Dormant 0 0 1
Dead 0 0 1

Reading in the data

Capture histories for 490 individual plants over an 8 year period are providing by Kéry & Schaub (2015):

  • observed vegetative is coded as 1

  • observed flowering is coded as 2

  • not observed is coded as 0

data("ladyslipper")

Next, determine the first capture occasion of each individual and recode “not seen” as 3:

# First capture occasion
f <- apply(ladyslipper, 1, function(x) min(which(x != 0)))

# Recode "not seen" as 3
rCH <- ladyslipper
rCH[rCH == 0] <- 3

The model

Here’s the JAGS code to fit this model. In this case, we will assume that survival probability \(s\) can vary across occasions and treat it as a random effect.

Priors for the transition probabilities require a distribution we haven’t seen yet: the Dirichlet. The Dirichlet prior is used when we need to put priors on a vector of probabilities that must sum to 1. To implement the Dirichlet prior in JAGS, we define a set of hyperpriors, one for each element in the vector of probabilities, call these \(a_j\). We put a vague gamma prior on each hyperprior and then estimate the transition probabilities as \(\psi_{ij} = \frac{a_j}{\sum_{j = 1}^n a_j}\):

sink("jags/ladyslipper.jags")
cat([2045 chars quoted with '"'],fill=TRUE)
sink()

Fitting the model

Multi-state models can be very computationally expensive. One way to speed them up is the provide the latent matrix z as partially-observed data. When the individual is seen, we know \(z\). When the individual is not seen, we don’t know the state so we code that as z = NA. The following function will take the observed capture history and create a matrix z with the partially-known states:

known.state.ms <- function(ms, notseen){
  # notseen: label for not seen
  state <- ms
  state[state==notseen] <- NA
  for (i in 1:dim(ms)[1]){
    m <- min(which(!is.na(state[i,])))
    state[i,m] <- NA
  }
  return(state)
}

We can also define a function to create initial values for z and thereby avoid the dreaded Invalid parent node error:

# Function to create initial values for unknown z
ms.init.z <- function(ch, f){
  for (i in 1:dim(ch)[1]){ch[i,1:f[i]] <- NA}
  states <- max(ch, na.rm = TRUE)
  known.states <- 1:(states-1)
  v <- which(ch==states)
  ch[-v] <- NA
  ch[v] <- sample(known.states, length(v), replace = TRUE)
  return(ch)
}

Now bundle the data and set the MCMC settings. Currently, ni and nb are set very low. This model takes a long time to run so these settings will allow us to work out any bugs without having to wait a long time to see the error message. Once you have the model running, increase these values to, say, 50,000 and 10,000:

# Bundle data
jags.data <- list(y = rCH, f = f, n.occasions = dim(rCH)[2], nind = dim(rCH)[1], z = known.state.ms(rCH, 3))

# Initial values
# Need 1 value of s for each occasion
inits <- function(){list(s = runif(dim(rCH)[2] - 1, 0, 1), z = ms.init.z(rCH, f))}  

# Parameters monitored
parameters <- c("s", "psiV", "psiF", "psiD")

# MCMC settings (change these once you know the model is running)
ni <- 500
nt <- 10
nb <- 100
nc <- 3

Finally, fit the model:

# Call JAGS from R
fit <- jagsUI::jags(data = jags.data, inits = inits, parameters.to.save = parameters, 
                        model.file = here::here("jags/ladyslipper.jags"), 
                        n.chains = nc, n.thin = nt, n.iter = ni, n.burnin = nb, 
                        parallel = TRUE)

print(fit, digits = 3)

Be sure to check the model for convergence and to determine whether the parameter estimates are consistent with the data generating values.