1 Introduction

ErrorTracer addresses a gap in the ecological and genomic forecasting toolkit: existing Bayesian packages (brms, rstanarm) do not decompose forecast uncertainty into its contributing sources, and no R package quantifies the forecast shelf life — the time horizon at which a credible interval becomes too wide to be informative.

The package provides a seven-step pipeline:

Step	Function	What it does
1	`extract_priors()`	Convert a regularized model to `brms` priors
2	`et_fit()`	Fit the informed Bayesian model
3	`et_diagnose()`	Convergence, ESS, LOO-CV
4	`et_predict()`	Posterior prediction + environmental noise propagation
5	`decompose_uncertainty()`	Three-way variance decomposition
6	`shelf_life()`	Forecast horizon / shelf-life analysis
7	`et_calibrate()`	Observed vs. nominal coverage

Visualisation functions (et_plot_*) return ggplot2 objects at every step.

1.1 The simulated dataset

We work with et_sim, a list bundled with the package. It represents a hypothetical mountain plant site where allele-frequency change on the arcsin-sqrt scale (z_diff) in two SNP clusters (A and B) is linked to three growing-season climate predictors.

Note on the arcsin-sqrt transformation. Raw allele frequencies $f \in [0,\,1]$ are transformed via $z = \arcsin(\sqrt{f})$. The result is unbounded (theoretically $[0,\,\pi/2] \approx [0,\,1.57]$ for a single chromosome, but changes $\Delta z$ are unbounded in principle). Do not use plausible_range = c(-1, 1) for this response — instead derive the range from the training data or use a biologically motivated interval.

Predictor	Description	Units (raw)
`Tmean`	Mean growing-season temperature	°C (standardised)
`PPT`	Total growing-season precipitation	mm (standardised)
`SWE`	Peak snow water equivalent	mm (standardised)

The dataset has a training period (1995–2014, 20 years) and a forecast period (2015–2019, 5 years). The forecast climate is slightly warmer than the training period on average (Tmean ~1 SD above training mean), representing a mild warming trend. Because the data are simulated, we know the true regression coefficients — stored in et_sim$true_params — which lets us validate parameter recovery and calibration.

True data-generating model:

\[ z\_diff_t = \alpha + \beta_1 \cdot Tmean_t + \beta_2 \cdot PPT_t + \beta_3 \cdot SWE_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0,\, \sigma^2) \]

Parameter	Cluster A	Cluster B
$\alpha$ (intercept)	0.00	0.10
$\beta_1$ (Tmean)	+0.50	+0.30
$\beta_2$ (PPT)	−0.30	−0.20
$\beta_3$ (SWE)	+0.20	−0.10
$\sigma$	0.40	0.50

Both clusters have a positive response to warming (Tmean) and a negative response to precipitation (PPT). They differ in the direction of their response to snowpack (SWE) and in residual variability — cluster B is noisier (σ = 0.50 vs. 0.40), which will be reflected in its wider forecast intervals and shelf-life ratio closer to the uninformative threshold.

2 Setup

library(ErrorTracer)
library(glmnet)      # for elastic net (Suggests)
library(ggplot2)

# Load the bundled simulated dataset
data(et_sim)

# Use small MCMC settings for this tutorial.
# In a real analysis: CHAINS = 4, ITER = 2000 (or more).
CHAINS <- 2L
ITER   <- 1000L
SEED   <- 42L

3 Exploring the dataset

str(et_sim, max.level = 2)
# List of 7
#  $ train         :'data.frame': 40 obs. of 6 variables:
#  $ forecast      :'data.frame': 10 obs. of 5 variables:
#  $ validation    :'data.frame': 10 obs. of 3 variables:
#  $ true_params   :List of 2
#  $ env_noise     :List of 3
#  $ standardization:List of 3
#  $ description   : chr ...

head(et_sim$train)
#   year cluster_id      Tmean        PPT        SWE     z_diff
# 1 1995          A -0.6775963  0.9119435 -0.1386249 -0.4917161
# 2 1996          A  0.4866542 -1.7284276 -1.4391959  0.4088745
# 3 1997          A -0.4935765 -0.3945083 -0.7267905  0.2352018
# 4 1998          A -0.2397691 -0.9805886 -0.8206722  0.3194194
# 5 1999          A -0.5193449  1.1304845  1.6453631 -0.2554886
# 6 2000          A -1.5333598  0.6437605  0.6252043 -0.5864645

# True coefficients — known because the data are simulated
et_sim$true_params
# $A
#   intercept     Tmean       PPT       SWE     sigma
#        0.00      0.50     -0.30      0.20      0.40
#
# $B
#   intercept     Tmean       PPT       SWE     sigma
#        0.10      0.30     -0.20     -0.10      0.50

# Forecast predictors are slightly above training range in Tmean (warming trend)
et_sim$forecast[et_sim$forecast$cluster_id == "A", ]
#   year cluster_id     Tmean       PPT        SWE
# 1 2015          A 0.6918714  0.291...  0.102...
# 2 2016          A 0.7012... ...
# ...
# Forecast Tmean is ~0.7-1.4 SDs above training mean — mild extrapolation.

3.1 Visualise the training data

ggplot(et_sim$train, aes(x = year, y = z_diff, colour = cluster_id)) +
  geom_line() +
  geom_point(size = 2) +
  labs(
    title   = "Simulated allele-frequency change (training period)",
    x       = "Year",
    y       = expression(italic(z)[diff]),
    colour  = "Cluster"
  ) +
  et_theme()

# Climate predictors over the full period — warming trend visible in Tmean
all_years <- rbind(
  transform(et_sim$train[et_sim$train$cluster_id == "A",
                          c("year", "Tmean", "PPT", "SWE")],
            period = "Training"),
  transform(et_sim$forecast[et_sim$forecast$cluster_id == "A",
                             c("year", "Tmean", "PPT", "SWE")],
            period = "Forecast")
)

all_long <- reshape(all_years, direction = "long",
                    varying = c("Tmean", "PPT", "SWE"),
                    v.names = "value", timevar = "predictor",
                    times   = c("Tmean", "PPT", "SWE"))

ggplot(all_long, aes(x = year, y = value,
                     colour = period, group = period)) +
  geom_line() +
  geom_point(size = 1.8) +
  facet_wrap(~ predictor, ncol = 3) +
  labs(
    title  = "Climate predictors: training and forecast periods",
    x      = "Year", y = "Standardised value", colour = "Period"
  ) +
  et_theme()

4 Single-cluster workflow: Cluster A

We walk through the complete pipeline for cluster A, then repeat for both clusters simultaneously using the grouping argument.

4.1 Step 1 — Elastic net for prior extraction

We fit a cross-validated elastic net (alpha = 0.5) on the training data. The elastic net simultaneously performs variable selection under regularisation and provides coefficient estimates that become prior means in the subsequent Bayesian model.

train_A <- et_sim$train[et_sim$train$cluster_id == "A", ]

x_train <- as.matrix(train_A[, c("Tmean", "PPT", "SWE")])
y_train <- train_A$z_diff

set.seed(42)
cv_fit_A <- glmnet::cv.glmnet(
  x     = x_train,
  y     = y_train,
  alpha = 0.5          # 0 = ridge, 1 = lasso, 0.5 = elastic net
)

# Coefficient estimates at lambda.min
coef(cv_fit_A, s = "lambda.min")
# 4 x 1 sparse Matrix of class "dgCMatrix"
#             lambda.min
# (Intercept)  0.0430
# Tmean        0.6658
# PPT         -0.4591
# SWE          0.3775

Note on small n: With 20 observations and 10-fold CV, R will print a warning about fewer than 3 observations per fold. This is expected and harmless — the elastic net still regularises correctly; it just uses the default grouped cross-validation.

The elastic net correctly recovers the signs of all three predictors. The magnitudes are somewhat inflated relative to the true values (Tmean = 0.50, PPT = −0.30, SWE = 0.20) because with n = 20 the regularisation cannot fully separate the signal from sampling noise. This shrinkage toward the true values will be completed by the Bayesian posterior.

plot(cv_fit_A)
title("Cross-validated elastic net — Cluster A", line = 2.5)

4.2 Step 2 — Extract informed priors

extract_priors() converts the elastic net coefficients into a brmsprior object. The prior SD for each coefficient is set to multiplier × |coef| (floored at min_sd).

prior_spec_A <- extract_priors(
  model      = cv_fit_A,
  lambda     = "lambda.min",
  multiplier = 2.0,   # prior SD = 2 × |elastic-net coef|
  min_sd     = 0.10   # floor: avoids spike priors on near-zero coefficients
)

print(prior_spec_A)
# ErrorTracer prior specification
#   Method      : glmnet
#   Predictors  : 3
#   Multiplier  : 2
#   Min SD      : 0.1
#   Coefficients:
#     Tmean                 mean =   0.6658  sd =   1.3316
#     PPT                   mean =  -0.4591  sd =   0.9182
#     SWE                   mean =   0.3775  sd =   0.7550

The large prior SDs (1.3 for Tmean, 0.9 for PPT, 0.8 for SWE) reflect the multiplier = 2 setting — priors are wide enough that the likelihood can readily update them, while still encoding the directional information from the elastic net.

Design note: The intercept is excluded from the informed prior. In this simulation the predictors are standardised to zero mean in the training period, so the intercept carries no predictive information. The intercept receives a default Normal(0, 2.5) prior.

4.3 Step 3 — Fit the Bayesian model

et_fit() wraps brms::brm(), attaches the prior specification, and returns an et_model object.

fit_A <- et_fit(
  formula = z_diff ~ Tmean + PPT + SWE,
  data    = train_A,
  priors  = prior_spec_A,
  chains  = CHAINS,
  iter    = ITER,
  warmup  = ITER %/% 2L,
  cores   = CHAINS,
  seed    = SEED
)

print(fit_A)
# ErrorTracer model (et_model)
#   Formula : z_diff ~ Tmean + PPT + SWE
#   n obs   : 20
#   Chains  : 2  Iter: 1000  Warmup: 500
#   Priors  : informed (glmnet, 3 predictors)
#   Rhat max: 1.004

summary(fit_A)
# === ErrorTracer model summary ===
#
# --- Fixed effects ---
#              Estimate  Est.Error     Q2.5    Q97.5
# Intercept    0.041      0.104    -0.162    0.246
# Tmean        0.666      0.113     0.438    0.888
# PPT         -0.487      0.200    -0.909   -0.123
# SWE          0.406      0.206     0.023    0.827

The posterior means are shrunk slightly relative to the elastic-net estimates, particularly for PPT and SWE — the likelihood is pulling the estimates toward the true values (Tmean = 0.50, PPT = −0.30, SWE = 0.20). All 95% credible intervals exclude zero for Tmean and PPT; SWE is positive but its lower bound just crosses zero, consistent with the true value being 0.20 with σ = 0.40 and only 20 training observations.

4.4 Step 4 — Diagnose convergence and model fit

diag_A <- et_diagnose(fit_A, loo = TRUE)

# Convergence
diag_A$convergence
# $rhat_max
# [1] 1.004         # all chains well below 1.05 threshold
#
# $rhat_all_ok
# [1] TRUE
#
# $neff_min
# [1] 0.275         # lowest effective-sample-size ratio
#
# $neff_all_ok
# [1] TRUE          # all Neff ratios > 0.10
#
# $n_divergences
# [1] 0             # no divergent transitions

diag_A$loo[c("elpd_loo", "p_loo", "looic", "n_bad_pareto_k")]
# $elpd_loo
# [1] -14.55
#
# $p_loo
# [1] 4.00          # effective parameters ≈ 4 (intercept, 3 betas); sigma
#                   # is implicit — consistent with a 4-parameter model
# $looic
# [1] 29.10
#
# $n_bad_pareto_k
# [1] 0             # no influential observations (Pareto k > 0.7)

All Rhat < 1.05, zero divergences, and no Pareto-k warnings — the sampler has converged and the posterior is not dominated by any single observation.

4.5 Step 5 — Predict with environmental noise propagation

et_predict() generates posterior predictive draws and propagates uncertainty about the predictor values themselves via the env_noise argument. Each perturbation draw adds Gaussian noise to the predictor values before computing the linear predictor, capturing sensitivity to measurement or forecast error in the climate inputs.

env_noise accepts three forms: - A named list of scalars — constant noise SD per predictor across all observations (suitable for short validation windows). - A named list of vectors of length nrow(newdata) — time-varying noise, where each observation gets its own SD. This is the recommended form for GCM projections, where ensemble spread grows with forecast horizon. - A single scalar — applied as a fraction of each predictor’s empirical SD.

fcast_A <- et_sim$forecast[et_sim$forecast$cluster_id == "A", ]

pred_A <- et_predict(
  model     = fit_A,
  newdata   = fcast_A,
  env_noise = et_sim$env_noise,   # list(Tmean = 0.30, PPT = 0.20, SWE = 0.15)
  n_draws   = 1000L,
  ci_levels = c(0.50, 0.80, 0.90, 0.95),
  n_perturb = 300L                # draws for the perturbation step
)

print(pred_A)
# ErrorTracer prediction (et_prediction)
#   Observations : 5
#   Draws        : 1000
#   CI levels    : 0.5, 0.8, 0.9, 0.95
#   Mean var decomposition (across observations):
#     Parameter    : 0.0441
#     Environmental: 0.0579
#     Residual     : 0.2173
#     Total        : 0.2673

In this dataset the environmental uncertainty (0.058) is comparable in magnitude to the parameter uncertainty (0.044) — a noteworthy finding. With only 20 training observations, measurement noise in the climate inputs contributes almost as much uncertainty as imprecision in the estimated regression coefficients. Residual process noise (0.217) remains dominant because σ = 0.40 is large relative to the signal.

# 90% credible intervals by forecast year
ci90 <- pred_A$credible_intervals[pred_A$credible_intervals$ci_level == 0.90, ]
ci90$year <- fcast_A$year
ci90[, c("year", "lower", "median", "upper", "width")]
#   year    lower  median   upper  width
# 1 2015   -0.507   0.381   1.213  1.720
# 2 2016    0.009   0.794   1.660  1.651
# 3 2017   -0.009   0.875   1.631  1.640
# 4 2018   -0.179   0.649   1.460  1.639
# 5 2019   -0.767   0.131   1.002  1.769

The medians are positive across all forecast years, consistent with the mild warming trend pushing Tmean ~0.7–1.4 SDs above the training mean. CI widths are 1.64–1.77, somewhat narrower than the full plausible range of 2.

4.6 Step 6 — Three-way uncertainty decomposition

decomp_A <- decompose_uncertainty(pred_A)
decomp_A
#   obs_id total_var param_var  env_var residual_var
# 1      1    0.2627    0.0202   0.0541       0.2173
# 2      2    0.2667    0.0361   0.0588       0.2173
# 3      3    0.2573    0.0364   0.0541       0.2173
# 4      4    0.2529    0.0427   0.0653       0.2173
# 5      5    0.2972    0.0849   0.0574       0.2173

residual_var is constant across observations — it equals the posterior mean of σ² and represents process noise that cannot be reduced by better predictors or more training data. param_var increases slightly in observations 4–5, reflecting that those forecast years (2018–2019) are more distant from the training period’s centroid, where the model is less certain.

et_plot_decomposition(decomp_A, proportional = TRUE) +
  labs(subtitle = "Cluster A, forecast 2015–2019")

The proportional view makes it clear that process noise (residual) accounts for ~80% of total forecast variance, parameter uncertainty for ~12–20%, and environmental measurement noise for ~17–25%. To substantially improve forecast precision for this system, the primary target should be the residual variance — identifying additional drivers of allele-frequency change.

4.7 Step 7 — Forecast shelf life

The shelf life analysis asks: at which forecast year does the 90% CI width exceed the biologically plausible response range? Because z_diff is on the arcsin-sqrt scale (unbounded), we derive the plausible range from the training data rather than assuming a fixed ±1 bound.

# Derive plausible range from training data (arcsin-sqrt scale — unbounded)
pr_A <- range(train_A$z_diff)   # e.g. c(-1.50, 2.04) => width ~ 3.54

sl_A <- shelf_life(
  predictions              = pred_A,
  plausible_range          = pr_A,
  ci_level                 = 0.90,
  threshold                = 1.0,        # uninformative when CI width >= plausible range
  time_col                 = "year",
  min_slope_for_projection = 1e-4,       # positive slope required to project
  max_extrapolation_factor = 10          # cap: project <= last_year + 10 * window_width
)

print(sl_A)
# ErrorTracer shelf life analysis
#   Observations     : 5
#   Plausible range  : 3.54
#   Threshold        : 1
#   Informative      : 5 / 5
#   Mean CI/range    : 0.488
#   Max CI/range     : 0.499
#   Shelf life       : > 2019 (lower bound — all periods informative, no trend)
#   Detail           : All 5 forecast periods informative. Linear trend
#                      (slope = 0.00241 per time unit) projects threshold
#                      crossing at ~2234.3, but this exceeds the extrapolation
#                      cap (2059). Shelf life > 2019.

as.data.frame(sl_A)
#   obs_id  time ci_width plausible_range  ratio informative
# 1      1  2015    1.720            3.54  0.486        TRUE
# 2      2  2016    1.651            3.54  0.466        TRUE
# 3      3  2017    1.640            3.54  0.463        TRUE
# 4      4  2018    1.639            3.54  0.463        TRUE
# 5      5  2019    1.769            3.54  0.499        TRUE

All five forecast years are strongly informative (ratio ≈ 0.47–0.50, well below the threshold of 1.0). The near-flat ratio trend means the forecast stays reliable throughout the 2015–2019 window. A faint positive slope (0.00241 per year) technically implies a linear crossing at ~2234 — but since that is far beyond the 10× extrapolation cap (2059), the result is reported as a lower bound: the forecast remains informative for at least the entire observed window, and likely far beyond.

Interpreting the three horizon modes:

Mode When it applies What is returned

observed Threshold crossed within the prediction window First crossing time

projected All periods informative; positive trend within cap Extrapolated crossing time

lower_bound All informative; no trend, or projection beyond cap > last observed time

Increase max_extrapolation_factor (or set it to Inf) to allow longer projections if the biological context supports it.

Mode	When it applies	What is returned
`observed`	Threshold crossed within the prediction window	First crossing time
`projected`	All periods informative; positive trend within cap	Extrapolated crossing time
`lower_bound`	All informative; no trend, or projection beyond cap	`> last observed time`

et_plot_shelf_life(sl_A, show_ratio = TRUE) +
  labs(subtitle = "Cluster A — 90% CI width / training range (arcsin-sqrt scale)")

4.8 Step 8 — Calibration assessment

We compare the posterior predictive intervals against the held-out true values in et_sim$validation.

valid_A <- et_sim$validation[et_sim$validation$cluster_id == "A", ]

cal_A <- et_calibrate(
  predictions  = pred_A,
  observed     = valid_A,
  response_col = "z_diff",
  ci_levels    = c(0.50, 0.80, 0.90, 0.95)
)

print(cal_A)
#   ci_level nominal observed_coverage n_obs calibration_error
# 1     0.50    0.50              0.60     5              0.10
# 2     0.80    0.80              0.80     5              0.00
# 3     0.90    0.90              1.00     5              0.10
# 4     0.95    0.95              1.00     5              0.05

The 80% CI achieves exactly nominal coverage. The 90% and 95% CIs are slightly over-conservative (all 5 held-out values fall inside), which is expected with only 5 validation observations and regularising priors. Coverage estimates based on n = 5 have high sampling variance; the important result is that no CI is systematically under-covering.

et_plot_calibration(cal_A) +
  labs(subtitle = "Cluster A — posterior predictive calibration")

5 Visualisations

5.1 Forecast fan chart

et_plot_forecast(
  predictions  = pred_A,
  observed     = valid_A,
  response_col = "z_diff",
  time_col     = "year"
) +
  labs(subtitle = "Cluster A — shaded ribbons: 50%, 80%, 90%, 95% CI")

Nested credible-interval ribbons (darker = narrower interval) overlay the posterior median. Red points are the held-out validation values. All five observed values fall within the 95% CI.

5.2 Prior versus posterior

et_plot_prior_posterior(fit_A, max_preds = 3L) +
  labs(subtitle = "Cluster A — how 20 observations update the elastic-net prior")

Each panel overlays the prior density (grey, wide) with the posterior density (blue, narrower). A tightly shifted posterior means the data are informative; a posterior that closely mirrors the prior means that the likelihood has little leverage on that coefficient. Tmean shows the largest update, consistent with its largest true effect size.

5.3 Forest plot: Bayesian vs. elastic net

et_plot_coefficients(fit_A) +
  labs(subtitle = "Cluster A — Bayesian 95% CI (blue) vs. enet coefficient (red ×)")

Red crosses mark the elastic-net estimates used as prior means. The Bayesian posterior means (blue points) with their 95% CIs show how the likelihood shifted the estimates. For all three predictors the posterior is pulled slightly toward zero relative to the elastic-net estimate, reflecting regularisation by the wide but informative priors.

6 Parameter recovery validation

A key advantage of simulated data is that we can check whether the 95% posterior credible intervals contain the known true values.

post    <- brms::fixef(fit_A$fit)
true_A  <- et_sim$true_params$A

post_df <- data.frame(
  parameter  = c("Intercept", "Tmean", "PPT", "SWE"),
  true_value = c(true_A["intercept"], true_A["Tmean"],
                 true_A["PPT"],       true_A["SWE"]),
  post_mean  = post[, "Estimate"],
  lower_95   = post[, "Q2.5"],
  upper_95   = post[, "Q97.5"]
)
post_df$covered <- with(post_df,
                        true_value >= lower_95 & true_value <= upper_95)

post_df[, c("parameter", "true_value", "post_mean", "lower_95", "upper_95", "covered")]
#   parameter true_value post_mean lower_95 upper_95 covered
# 1 Intercept       0.00     0.041   -0.162    0.246    TRUE
# 2     Tmean       0.50     0.666    0.438    0.888    TRUE
# 3       PPT      -0.30    -0.487   -0.909   -0.123    TRUE
# 4       SWE       0.20     0.406    0.023    0.827    TRUE

All four true parameter values fall within their 95% posterior credible intervals. The posterior means for Tmean, PPT, and SWE are consistently above the true values in absolute magnitude — this is typical over-estimation bias with n = 20 and informative priors near the OLS estimate. The bias is small relative to the interval widths, and would shrink with more training data.

7 Multi-cluster grouped workflow

In practice you often need to fit the same model to many units (SNP clusters, species, sites) simultaneously. et_fit() handles this with the grouping argument, returning an et_model_list.

7.1 Fitting

# Build one et_prior_spec per cluster from individual elastic nets
prior_list <- lapply(c("A", "B"), function(cl) {
  df <- et_sim$train[et_sim$train$cluster_id == cl, ]
  x  <- as.matrix(df[, c("Tmean", "PPT", "SWE")])
  set.seed(42)
  suppressWarnings(cv <- glmnet::cv.glmnet(x, df$z_diff, alpha = 0.5))
  extract_priors(cv, multiplier = 2.0, min_sd = 0.10)
})
names(prior_list) <- c("A", "B")

# One brms model per cluster, returned as et_model_list
fit_all <- et_fit(
  formula  = z_diff ~ Tmean + PPT + SWE,
  data     = et_sim$train,
  priors   = prior_list,       # named list: one et_prior_spec per cluster
  grouping = "cluster_id",
  chains   = CHAINS,
  iter     = ITER,
  warmup   = ITER %/% 2L,
  cores    = CHAINS,
  seed     = SEED
)

print(fit_all)
# ErrorTracer grouped model list (et_model_list)
#   Grouping : cluster_id
#   Formula  : z_diff ~ Tmean + PPT + SWE
#   Groups   : 2
#   Fitted   : 2 / 2

summary(fit_all)
# --- Per-group Rhat max ---
#   A                     Rhat max = 1.004
#   B                     Rhat max = 1.007

7.2 Diagnostics

diag_all <- et_diagnose(fit_all, loo = TRUE)
diag_all$summary
#   group rhat_ok neff_ok n_divergences elpd_loo n_bad_pareto_k
# 1     A    TRUE    TRUE             0   -14.55              0
# 2     B    TRUE    TRUE             0   -17.15              1

Cluster B has one observation with Pareto-k > 0.7, which is a mild flag worth investigating (it suggests that observation has an outsized influence on the posterior). With only 20 training points this is not unusual.

7.3 Prediction and decomposition

pred_all <- et_predict(
  model     = fit_all,
  newdata   = et_sim$forecast,
  env_noise = et_sim$env_noise,
  n_draws   = 1000L,
  ci_levels = c(0.50, 0.80, 0.90, 0.95),
  n_perturb = 300L
)

print(pred_all)
# ErrorTracer grouped predictions (et_prediction_list)
#   Grouping : cluster_id
#   Groups   : 2 / 2 predicted

decomp_all <- decompose_uncertainty(pred_all)
head(decomp_all)
#   group obs_id total_var param_var  env_var residual_var
# 1     A      1    0.2627    0.0202   0.0541       0.2173
# 2     A      2    0.2667    0.0361   0.0588       0.2173
# ...
# 6     B      1    0.3374    0.0252   0.0122       0.2999
# 7     B      2    0.3472    0.0488   0.0180       0.2999

Cluster B has a systematically larger residual_var (0.30 vs. 0.22), directly reflecting its higher σ (0.50 vs. 0.40).

7.4 Shelf life comparison

# Use the pooled training range across both clusters (arcsin-sqrt scale)
pr_all <- range(et_sim$train$z_diff)   # e.g. c(-1.67, 2.04) => width ~ 3.71

sl_all <- shelf_life(
  pred_all,
  plausible_range          = pr_all,
  ci_level                 = 0.90,
  threshold                = 1.0,
  time_col                 = "year",
  max_extrapolation_factor = 10
)

print(sl_all)
# ErrorTracer shelf life analysis
#   Observations     : 10
#   Plausible range  : 3.71
#   Threshold        : 1
#   [A]  100.0% informative  mean ratio = 0.465  shelf life > 2019 (lower bound)
#   [B]  100.0% informative  mean ratio = 0.544  shelf life > 2019 (lower bound)

Both clusters remain strongly informative across all five forecast years. Cluster B’s mean ratio (~0.54) is noticeably higher than cluster A’s (~0.47), directly reflecting cluster B’s higher residual variance (σ = 0.50 vs. 0.40). Both are well below the uninformative threshold of 1.0, and neither crosses within the 2015–2019 window, so both are reported as lower bounds.

The difference in ratios has a practical implication: under a longer forecast horizon or with larger environmental noise inputs, cluster B would approach the uninformative threshold first — a direct consequence of its higher biological process noise.

et_plot_shelf_life(sl_all, show_ratio = TRUE) +
  labs(
    subtitle = "Both clusters — 90% CI width / plausible range",
    colour   = "Cluster"
  )

7.5 Calibration and forecast plots

cal_all <- et_calibrate(
  pred_all,
  observed     = et_sim$validation,
  response_col = "z_diff",
  ci_levels    = c(0.50, 0.80, 0.90, 0.95)
)

et_plot_calibration(cal_all) +
  labs(subtitle = "Both clusters — posterior predictive calibration")

# Forest plot for all groups in a single faceted figure
et_plot_coefficients(fit_all) +
  labs(subtitle = "Bayesian 95% CI (blue) vs. enet coefficient (red ×)")

8 Long-horizon forecasting with GCM projections

The shelf-life analysis becomes most powerful when combined with climate projections from Global Circulation Models (GCMs). The idea is:

Validate that the model works on real held-out years (e.g. 2022–2025).
Project onto GCM-simulated climate for 2025–2075 to ask: over what horizon does the forecast remain informative?

A critical refinement is that GCM uncertainty is not constant — ensemble spread (the disagreement among model runs) typically grows with the projection horizon. The env_noise argument now accepts per-row vectors so you can pass growing noise SDs directly.

# Suppose you have GCM ensemble-mean predictors for 2025-2075,
# standardised using your training-period statistics.
gcm_years <- 2025:2075

gcm_df <- data.frame(
  year  = gcm_years,
  Tmean = ...,    # standardised GCM ensemble-mean temperature
  PPT   = ...,    # standardised GCM ensemble-mean precipitation
  SWE   = ...     # standardised GCM ensemble-mean snowpack
)

# GCM ensemble spread (SD across runs) grows roughly linearly with time.
# Here we model spread as starting at the validation-period noise level
# and increasing by ~0.01 SD per year for temperature.
base_year   <- 2025
tmean_noise <- 0.30 + 0.01 * (gcm_years - base_year)  # 0.30 → 0.80
ppt_noise   <- 0.20 + 0.005 * (gcm_years - base_year) # 0.20 → 0.45
swe_noise   <- 0.15 + 0.005 * (gcm_years - base_year) # 0.15 → 0.40

# Predict over the 50-year GCM window with time-varying noise
pred_gcm <- et_predict(
  model     = fit_A,
  newdata   = gcm_df,
  env_noise = list(
    Tmean = tmean_noise,   # per-row vector, length = nrow(gcm_df)
    PPT   = ppt_noise,
    SWE   = swe_noise
  ),
  ci_levels = c(0.90, 0.95),
  n_draws   = 1000L
)

# Shelf life over the 50-year horizon
sl_gcm <- shelf_life(
  pred_gcm,
  plausible_range          = range(train_A$z_diff),
  time_col                 = "year",
  max_extrapolation_factor = Inf    # allow projection beyond the 50-yr window
)

print(sl_gcm)
# If the 90% CI stays below the plausible range throughout 2025-2075:
#   Shelf life : > 2075 (lower bound — all periods informative, no trend)
# If it crosses (say, around 2058):
#   Shelf life : ~2058 (observed — threshold first exceeded)

The per-row noise means that decompose_uncertainty() now shows env_var increasing across forecast years — directly capturing the epistemic uncertainty from the GCM ensemble spread.

decomp_gcm <- decompose_uncertainty(pred_gcm)
# env_var should be small in 2025 and growing toward 2075 as GCM spread
# accumulates; param_var and residual_var remain roughly constant.
plot(gcm_years, decomp_gcm$env_var,
     type = "l", xlab = "Year", ylab = "Environmental variance",
     main = "Growing GCM ensemble uncertainty propagated into forecast")

Interpreting the result for conservation. If the shelf life exceeds your planning horizon (e.g. “> 2075”), you can state: “Under the [GCM scenario], the 90% credible interval for allele- frequency change remains within the plausible biological range throughout the 50-year planning horizon, supporting its use in long-term conservation genomic forecasting.” If a crossing year is observed (e.g. 2058), you can report the horizon directly and note which uncertainty source (parameter, environmental, or residual) drives the loss of informativeness.

9 Using `lm` instead of `glmnet`

extract_priors() is not limited to elastic net. The lm method uses OLS standard errors (× multiplier) as prior SDs instead of coefficient magnitudes, giving narrower priors when SEs are small.

lm_fit_A <- lm(z_diff ~ Tmean + PPT + SWE, data = train_A)

prior_lm_A <- extract_priors(
  model      = lm_fit_A,
  multiplier = 2.0,
  min_sd     = 0.10
)

print(prior_lm_A)
# ErrorTracer prior specification
#   Method      : lm
#   Predictors  : 3
#   Multiplier  : 2
#   Min SD      : 0.1
#   Coefficients:
#     Tmean                 mean =   0.6669  sd =   1.3338
#     PPT                   mean =  -0.4677  sd =   0.9354
#     SWE                   mean =   0.3859  sd =   0.7719

In this case the OLS and elastic-net estimates are very close (n = 20 is not a sparse problem with only 3 predictors), so the two prior specifications lead to nearly identical posteriors. The key advantage of the elastic-net pathway arises when the number of predictors is large relative to n, where regularisation is essential.

10 Standardising and back-transforming

The standardization slot stores the training-period mean and SD for each predictor, enabling back-transformation to original units with unstandardize().

# Standardisation constants used when generating et_sim
et_sim$standardization
# $Tmean
#      mean        sd
# 15.5422  0.6797
#
# $PPT
#      mean        sd
# 110.5460  14.6855
#
# $SWE
#      mean        sd
# 86.9190  15.1885

# A Tmean of +1 SD corresponds to:
unstandardize(z = 1.0,
              mu = et_sim$standardization$Tmean["mean"],
              s  = et_sim$standardization$Tmean["sd"])
# [1] 16.22   ← 1 SD above training mean = 16.2 °C

11 Summary

This vignette has demonstrated the complete ErrorTracer pipeline on simulated data where the ground truth is known:

extract_priors() converted elastic-net coefficients into brms-compatible informed priors, transferring regularised estimates as Bayesian prior means.
et_fit() fitted the Bayesian model for a single cluster and for both clusters simultaneously via grouping = "cluster_id".
et_diagnose() confirmed convergence (Rhat < 1.05, zero divergences) and adequate model fit (no high Pareto-k observations for cluster A).
et_predict() generated posterior predictive draws and propagated environmental measurement uncertainty. Environmental noise contributed ~17–25% of total variance — comparable to parameter uncertainty in this small-sample setting.
decompose_uncertainty() partitioned forecast variance into three components, identifying residual process noise as the dominant bottleneck (~80%) in both clusters.
shelf_life() quantified that cluster A forecasts (mean ratio ~0.47) are more informative than cluster B (~0.54), consistent with cluster B’s higher residual variance. Both clusters are well below the uninformative threshold (1.0), so the shelf life exceeds the entire 2015–2019 validation window — reported as lower bounds. The three-mode horizon logic (observed / projected / lower_bound) and the max_extrapolation_factor cap prevent spurious far-future projections when the CI/range trend is nearly flat.
et_calibrate() confirmed near-nominal posterior predictive coverage on held-out validation data.
Parameter recovery confirmed that all four true coefficients fell within their 95% posterior credible intervals, validating the pipeline’s statistical integrity.

12 Session information

#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=es_CU.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=es_CU.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=es_CU.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=es_CU.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/Chicago
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.39   R6_2.6.1        fastmap_1.2.0   xfun_0.57      
#>  [5] cachem_1.1.0    knitr_1.51      htmltools_0.5.9 rmarkdown_2.31 
#>  [9] lifecycle_1.0.5 cli_3.6.6       sass_0.4.10     jquerylib_0.1.4
#> [13] compiler_4.5.3  tools_4.5.3     evaluate_1.0.5  bslib_0.10.0   
#> [17] yaml_2.3.12     otel_0.2.0      rlang_1.2.0     jsonlite_2.0.0

Getting Started with ErrorTracer

Bayesian error propagation and forecast uncertainty decomposition

Luis Javier Madrigal-Roca, John Kelly

2026-04-29

1 Introduction

1.1 The simulated dataset

2 Setup

3 Exploring the dataset

3.1 Visualise the training data

4 Single-cluster workflow: Cluster A

4.1 Step 1 — Elastic net for prior extraction

4.2 Step 2 — Extract informed priors

4.3 Step 3 — Fit the Bayesian model

4.4 Step 4 — Diagnose convergence and model fit

4.5 Step 5 — Predict with environmental noise propagation

4.6 Step 6 — Three-way uncertainty decomposition

4.7 Step 7 — Forecast shelf life

4.8 Step 8 — Calibration assessment

5 Visualisations

5.1 Forecast fan chart

5.2 Prior versus posterior

5.3 Forest plot: Bayesian vs. elastic net

6 Parameter recovery validation

7 Multi-cluster grouped workflow

7.1 Fitting

7.2 Diagnostics

7.3 Prediction and decomposition

7.4 Shelf life comparison

7.5 Calibration and forecast plots

8 Long-horizon forecasting with GCM projections

9 Using `lm` instead of `glmnet`

10 Standardising and back-transforming

11 Summary

12 Session information

Parameter	Cluster A	Cluster B
\(\alpha\) (intercept)	0.00	0.10
\(\beta_1\) (Tmean)	+0.50	+0.30
\(\beta_2\) (PPT)	−0.30	−0.20
\(\beta_3\) (SWE)	+0.20	−0.10
\(\sigma\)	0.40	0.50

Getting Started with ErrorTracer

Bayesian error propagation and forecast uncertainty decomposition

Luis Javier Madrigal-Roca, John Kelly

2026-04-29

1 Introduction

1.1 The simulated dataset

2 Setup

3 Exploring the dataset

3.1 Visualise the training data

4 Single-cluster workflow: Cluster A

4.1 Step 1 — Elastic net for prior extraction

4.2 Step 2 — Extract informed priors

4.3 Step 3 — Fit the Bayesian model

4.4 Step 4 — Diagnose convergence and model fit

4.5 Step 5 — Predict with environmental noise propagation

4.6 Step 6 — Three-way uncertainty decomposition

4.7 Step 7 — Forecast shelf life

4.8 Step 8 — Calibration assessment

5 Visualisations

5.1 Forecast fan chart

5.2 Prior versus posterior

5.3 Forest plot: Bayesian vs. elastic net

6 Parameter recovery validation

7 Multi-cluster grouped workflow

7.1 Fitting

7.2 Diagnostics

7.3 Prediction and decomposition

7.4 Shelf life comparison

7.5 Calibration and forecast plots

8 Long-horizon forecasting with GCM projections

9 Using lm instead of glmnet

10 Standardising and back-transforming

11 Summary

12 Session information

9 Using `lm` instead of `glmnet`