fill_transitions()
fill_fertility()
knitr::opts_chunk$set(echo = TRUE, tidy = FALSE, collapse = TRUE, comment = "#>")
library(raretrans)Matrix population models are a standard tool in ecology for projecting population dynamics and quantifying the contributions of survival, growth, and reproduction to population growth. A population projection matrix \(\mathbf{A}\) is typically split into two components:
The full model is \(\mathbf{A} = \mathbf{T} + \mathbf{F}\), and the asymptotic population growth rate \(\lambda\) is the dominant eigenvalue of \(\mathbf{A}\).
When populations are small or study periods are short, the observed transition frequencies produce two distinct types of biased estimates:
Structural zeros: biologically possible transitions that are never observed and appear as 0% in the data. These are rare events missed by chance — not truly impossible transitions.
Structural ones: transitions estimated at 100% survival, stasis, or mortality. For example, if all 3 observed juveniles survived, the data suggest 100% juvenile survival — a biologically implausible certainty caused entirely by small sample size.
Both are artefacts of insufficient data, and both distort the matrix:
raretrans addresses both problems by using
Bayesian priors to regularise estimates, adding prior
biological knowledge to the observed data so that all transitions
receive a credible, non-extreme estimate.
Throughout this vignette we use a published matrix for the understorey palm Chamaedorea elegans (parlour palm), compiled by Burns et al. (2013) and archived in the COMPADRE Plant Matrix Database (Salguero-Gómez et al. 2015). The population has three life-history stages:
| Stage label | Description |
|---|---|
seed |
Propagule (seed) |
nonrep |
Non-reproductive individual (seedling/juvenile) |
rep |
Reproductive adult |
The transition (\(\mathbf{T}\)) and
fecundity (\(\mathbf{F}\)) matrices are
entered directly below. This is the standard way to provide data to
raretrans when you do not have individual-level
observations.
# ── Transition matrix T ────────────────────────────────────────────────────────
# Each column must sum to ≤ 1 (the remainder is mortality).
# Rows = destination stage; Columns = source stage.
T_mat <- matrix(
c(0.2368, 0.0000, 0.0000, # seed → seed, nonrep, rep
0.0010, 0.0100, 0.0200, # nonrep → seed, nonrep, rep
0.0000, 0.1429, 0.1364), # rep → seed, nonrep, rep
nrow = 3, ncol = 3, byrow = TRUE,
dimnames = list(c("seed", "nonrep", "rep"),
c("seed", "nonrep", "rep"))
)
# ── Fecundity matrix F ─────────────────────────────────────────────────────────
# F[i,j] = average number of stage-i individuals produced by one stage-j parent.
F_mat <- matrix(
c(0, 0, 1.9638, # reproductive adults produce seeds
0, 0, 8.3850, # reproductive adults produce nonreproductive offspring
0, 0, 0.0000), # no fecundity contributions to the reproductive stage
nrow = 3, ncol = 3, byrow = TRUE,
dimnames = list(c("seed", "nonrep", "rep"),
c("seed", "nonrep", "rep"))
)
# ── Wrap in a named list — the format expected by raretrans ────────────────────
TF <- list(T = T_mat, F = F_mat)
TF
#> $T
#> seed nonrep rep
#> seed 0.237 0.000 0.000
#> nonrep 0.001 0.010 0.020
#> rep 0.000 0.143 0.136
#>
#> $F
#> seed nonrep rep
#> seed 0 0 1.96
#> nonrep 0 0 8.38
#> rep 0 0 0.00Each column of \(\mathbf{T}\) describes one source stage.
For example, the column seed tells us that in one time step
a seed has:
seed →
seed),seed → nonrep), andThe \(\mathbf{F}\) matrix shows that
only reproductive adults (rep) contribute offspring: each
adult produces on average 1.96 seeds and 8.39 non-reproductive
individuals per time step. These large fecundity values are typical for
tropical palms that invest heavily in reproduction.
Many functions in raretrans require the starting
number of individuals observed in each stage at the beginning
of the census period. This vector \(\mathbf{N}\) is used to convert the
transition probabilities back to raw counts (because the Bayesian model
works on counts, not proportions).
If you have individual-level data in a data frame you can use
get_state_vector() to extract \(\mathbf{N}\) automatically (see the
onepopperiod vignette for a worked example). Here we supply
it directly, using a hypothetical census of 80 seeds, 25
non-reproductive plants, and 12 reproductive adults.
Before adding any priors, let us examine the raw matrix.
# Dominant eigenvalue = asymptotic growth rate λ
A_obs <- TF$T + TF$F
lambda_obs <- Re(eigen(A_obs)$values[1])
cat("Observed lambda:", round(lambda_obs, 3), "\n")
#> Observed lambda: 1.17Now look at the column sums of \(\mathbf{T}\):
The seed column sums to only 0.238. The rest (0.762)
represents mortality of seeds. Importantly, the transition from
seed to nonrep is 0.001 — very small and
easily missed in a short study. In a small sample we might observe zero
such transitions, leaving a structural zero in the data
matrix even though we know germination happens.
Similarly, the nonrep → rep entry (0.1429)
and the back-transition rep → nonrep (0.0200)
could be missed in small populations. Conversely, if all observed
individuals in a stage survived, the data would show a
structural one (100% survival or stasis) — equally
unreliable because it is driven by small sample size rather than
biology. When either type of bias appears in real data, we need a
principled way to incorporate our biological knowledge and regularise
the estimates.
fill_transitions()The simplest prior is the uniform Dirichlet prior. It adds a small equal weight to every possible fate. For a 3-stage system there are 4 fates per column (3 stages + death), so a uniform prior with weight 1 adds 0.25 to each fate count.
The prior matrix P has one more row than the
transition matrix \(\mathbf{T}\): the extra row represents
death.
# Uniform prior: equal weight to all fates (including death)
P_uniform <- matrix(0.25, nrow = 4, ncol = 3)
fill_transitions(TF, N, P = P_uniform)
#> [,1] [,2] [,3]
#> [1,] 0.23696 0.00962 0.0192
#> [2,] 0.00407 0.01923 0.0377
#> [3,] 0.00309 0.14702 0.1451Compare with the original \(\mathbf{T}\):
TF$T
#> seed nonrep rep
#> seed 0.237 0.000 0.000
#> nonrep 0.001 0.010 0.020
#> rep 0.000 0.143 0.136The uniform prior has filled in all the zero entries, including the
biologically impossible seed → rep transition.
This is one weakness of a uniform prior: it does not distinguish
between rare-but-possible transitions and impossible ones. This
is an example and should NOT be used as a default prior
in practice.
A better approach is to use an informative prior that reflects biological knowledge. The prior matrix still has 4 rows (3 stages + death), and ideally each column sums to 1 so that the weight is easy to interpret.
For C. elegans we specify:
P_info <- matrix(
c(0.25, 0.05, 0.00, # → seed
0.05, 0.55, 0.15, # → nonrep
0.00, 0.15, 0.70, # → rep
0.70, 0.25, 0.15), # → dead
nrow = 4, ncol = 3, byrow = TRUE
)
# Verify columns sum to 1
colSums(P_info)
#> [1] 1 1 1Now apply this prior with the default weight of 1 (equivalent to adding 1 individual’s worth of prior “observations”):
T_posterior <- fill_transitions(TF, N, P = P_info)
T_posterior
#> [,1] [,2] [,3]
#> [1,] 0.2370 0.00192 0.00
#> [2,] 0.0016 0.03077 0.03
#> [3,] 0.0000 0.14317 0.18The informative prior fills in the rare transitions without creating
the impossible seed → rep entry (which stays
at 0 because we set that prior entry to 0).
The priorweight argument controls how strongly the prior
influences the posterior. A weight of 1 (the default) means the prior
counts as 1 individual. Setting priorweight = 0.5 halves
that influence:
fill_transitions(TF, N, P = P_info, priorweight = 0.5)
#> [,1] [,2] [,3]
#> [1,] 0.2412 0.0167 0.0000
#> [2,] 0.0173 0.1900 0.0633
#> [3,] 0.0000 0.1453 0.3243With priorweight = 0.5 and \(N
= 25\) non-reproductive plants observed, the effective prior
sample size for that column is \(0.5 \times 25
= 12.5\) individuals, still less than the data. This is the
recommended approach: the prior sample size should be smaller
than the observed sample size so that the data dominate the
posterior.
fill_fertility()The fecundity prior follows a Gamma-Poisson
(negative binomial) model. We specify two matrices, alpha
and beta, whose entries are the shape and rate parameters
of the Gamma prior on the Poisson rate. Use NA for entries
that represent stages that cannot reproduce or fates that do not arise
from reproduction.
# alpha = prior "offspring counts"; beta = prior "adult counts"
# Only the (seed, rep) and (nonrep, rep) entries are reproductive
alpha <- matrix(
c(NA, NA, 1e-5,
NA, NA, 1e-5,
NA, NA, NA),
nrow = 3, ncol = 3, byrow = TRUE
)
beta <- matrix(
c(NA, NA, 1e-5,
NA, NA, 1e-5,
NA, NA, NA),
nrow = 3, ncol = 3, byrow = TRUE
)
F_posterior <- fill_fertility(TF, N, alpha = alpha, beta = beta)
F_posterior
#> seed nonrep rep
#> seed 0 0 1.96
#> nonrep 0 0 8.38
#> rep 0 0 0.00The tiny values of alpha and beta (1e-5)
produce a nearly non-informative prior: the posterior fecundity is
almost identical to the observed fecundity because the data completely
dominate. This is the correct behaviour when you have no prior
information about fecundity.
posterior <- list(
T = fill_transitions(TF, N, P = P_info),
F = fill_fertility(TF, N, alpha = alpha, beta = beta)
)
posterior
#> $T
#> [,1] [,2] [,3]
#> [1,] 0.2370 0.00192 0.00
#> [2,] 0.0016 0.03077 0.03
#> [3,] 0.0000 0.14317 0.18
#>
#> $F
#> seed nonrep rep
#> seed 0 0 1.96
#> nonrep 0 0 8.38
#> rep 0 0 0.00
# Asymptotic growth rate of the posterior matrix
lambda_post <- Re(eigen(posterior$T + posterior$F)$values[1])
cat("Posterior lambda:", round(lambda_post, 3), "\n")
#> Posterior lambda: 1.21
cat("Observed lambda:", round(lambda_obs, 3), "\n")
#> Observed lambda: 1.17The posterior \(\lambda\) is slightly different from the observed \(\lambda\) because the prior redistributes a small amount of probability mass from the observed transitions to the rare ones.
Both fill_transitions() and
fill_fertility() accept a returnType argument
for accessing different representations of the posterior.
Setting returnType = "TN" returns the augmented
fate matrix — the raw Dirichlet posterior counts (not divided
by column sums). The extra bottom row gives the posterior count for the
death fate.
TN <- fill_transitions(TF, N, P = P_info, returnType = "TN")
TN
#> [,1] [,2] [,3]
#> [1,] 19.19 0.05 0.00
#> [2,] 0.13 0.80 0.39
#> [3,] 0.00 3.72 2.34
#> [4,] 61.68 21.43 10.27This is useful for simulation (see Part 5) and for computing credible intervals directly (see Part 6).
Setting returnType = "A" returns \(\mathbf{T} + \mathbf{F}\) directly:
Setting returnType = "ab" in
fill_fertility() returns the posterior Gamma
parameters:
fill_fertility(TF, N, alpha = alpha, beta = beta, returnType = "ab")
#> $alpha
#> seed nonrep rep
#> seed NA NA 23.6
#> nonrep NA NA 100.6
#> rep NA NA NA
#>
#> $beta
#> [,1] [,2] [,3]
#> [1,] NA NA 12
#> [2,] NA NA 12
#> [3,] NA NA NAPoint estimates of \(\lambda\) do
not convey uncertainty. To obtain a credible interval
on the asymptotic growth rate we simulate many matrices from the
posterior distribution using sim_transitions().
set.seed(20240301) # for reproducibility
# Simulate 1000 projection matrices from the posterior
sims <- sim_transitions(TF, N,
P = P_info,
alpha = alpha,
beta = beta,
priorweight = 0.5,
samples = 1000)
# Extract λ from each simulated matrix
lambdas <- sapply(sims, function(mat) Re(eigen(mat)$values[1]))
# Summarise
cat("Posterior median λ:", round(median(lambdas), 3), "\n")
#> Posterior median λ: 1.01
cat("95% credible interval: [",
round(quantile(lambdas, 0.025), 3), ",",
round(quantile(lambdas, 0.975), 3), "]\n")
#> 95% credible interval: [ 0.729 , 1.34 ]
cat("P(λ > 1):", round(mean(lambdas > 1), 3), "\n")
#> P(λ > 1): 0.516
# Plot
hist(lambdas,
breaks = 30,
main = expression("Posterior distribution of " * lambda),
xlab = expression(lambda),
col = "steelblue", border = "white")
abline(v = 1, lty = 2, lwd = 2, col = "firebrick")The histogram shows the full posterior uncertainty about \(\lambda\). P(λ > 1) is the
posterior probability that the population is growing —
a more informative summary than a point estimate alone.
transition_CrI()transition_CrI() computes the marginal posterior
beta credible intervals for every entry in \(\mathbf{T}\), including the probability of
dying. It returns a tidy data frame with one row per source–destination
pair.
cri <- transition_CrI(TF, N,
P = P_info,
priorweight = 0.5,
ci = 0.95,
stage_names = c("seed", "nonrep", "rep"))
cri| seed | seed | 0.241 | 0.169 | 0.321 |
| seed | nonrep | 0.0173 | 0.00226 | 0.0471 |
| seed | rep | 0 | 0 | 0 |
| seed | dead | 0.742 | 0.66 | 0.816 |
| nonrep | seed | 0.0167 | 6.26e-05 | 0.0744 |
| nonrep | nonrep | 0.19 | 0.0831 | 0.328 |
| nonrep | rep | 0.145 | 0.0534 | 0.273 |
| nonrep | dead | 0.648 | 0.491 | 0.79 |
| rep | seed | 0 | 0 | 0 |
| rep | nonrep | 0.0633 | 0.00251 | 0.209 |
| rep | rep | 0.324 | 0.135 | 0.55 |
| rep | dead | 0.612 | 0.385 | 0.817 |
Each row gives the posterior mean transition probability and its 95%
symmetric credible interval. The dead rows summarise the
probability of mortality from each source stage.
plot_transition_CrI()You can hide the death fate if you prefer to focus on the living stages:
Notice that the credible intervals for the seed →
nonrep transition are very wide, reflecting the fact that
this germination event is rarely observed (only 0.1% per time step in
this matrix). The prior provides a small regularising signal but the
data are sparse, so uncertainty is high.
plot_transition_density()plot_transition_density() arranges the full marginal
posterior beta density for every transition as a matrix
of panels, mirroring the layout of \(\mathbf{T}\). Each column corresponds to a
source stage and each row to a destination stage. The shaded region is
the 95% credible interval.
plot_transition_density(TF, N,
P = P_info,
priorweight = 0.5,
stage_names = c("seed", "nonrep", "rep"))Panels on the main diagonal (e.g., seed →
seed, rep → rep) tend to have
well-constrained distributions because stasis is commonly observed.
Off-diagonal transitions that are biologically rare show broad, flat
densities — exactly where the prior matters most.
Excluding the death row gives a tighter layout focused on the living stages:
plot_transition_density(TF, N,
P = P_info,
priorweight = 0.5,
stage_names = c("seed", "nonrep", "rep"),
include_dead = FALSE)This vignette introduced the core workflow of
raretrans:
popbio::projection.matrix() and
get_state_vector()) or by entering them directly.P that encodes
biological knowledge about which transitions are possible and how likely
they are. Prefer informative priors over the uniform default when you
have relevant expertise.fill_transitions() and fill_fertility() to
correct structural zeros, structural ones, and other small-sample
biases.sim_transitions() to obtain a credible interval on \(\lambda\).transition_CrI(), plot_transition_CrI(), and
plot_transition_density().For a worked example that starts from individual-level field
observations see vignette("onepopperiod").
Burns, J., Blomberg, S., Crone, E., Ehrlén, J., Knight, T., Pichancourt, J.-B.,, Ramula, S., Wardle, G., & Buckley, Y. (2010). Empirical tests of life-history, evolution theory using phylogenetic analysis of plant demography., Journal of Ecology, 98(2), 334–344., https://doi.org/10.1111/j.1365-2745.2009.01634.x
Salguero-Gómez, R., Jones, O. R., Archer, C. R., Buckley, Y. M., Che-Castaldo, J., Caswell, H., …, & Vaupel, J. W. (2015). The COMPADRE Plant Matrix Database: an open online repository for plant demography. Journal of Ecology, 103, 202–218. https://doi.org/10.1111/1365-2745.12334
Tremblay, R. L., Tyre, A. J., Pérez, M.-E., & Ackerman, J. D. (2021). Population projections from holey matrices: Using prior information to estimate rare transition events. Ecological Modelling, 447, 109526. https://doi.org/10.1016/j.ecolmodel.2021.109526