---
title: "Comparing ML-UMR, STC, and Naive Methods"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Comparing ML-UMR, STC, and Naive Methods}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE, purl = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE,
  message = FALSE,
  warning = FALSE
)
```

## Why compare methods?

Unanchored indirect treatment comparisons (ITCs) rely on strong assumptions
that cannot be fully verified from data alone. Running several methods and
comparing their results is therefore an important sensitivity check in any
ITC analysis: it reveals how conclusions depend on modeling choices.

mlumr provides three methods (naive, STC, and ML-UMR, the last in SPFA and
Relaxed variants) in a single package with a unified data interface, making
side-by-side comparison straightforward.

| Method | Adjustment | Framework | Key assumption |
|--------|-----------|-----------|----------------|
| Naive | None | Frequentist | Populations are exchangeable |
| STC | Outcome regression | Frequentist | Correct outcome model specification |
| ML-UMR SPFA | Joint Bayesian model | Bayesian | Shared prognostic effects (SPFA) |
| ML-UMR Relaxed | Joint Bayesian model | Bayesian | Correct specification and enough information/prior support for treatment-specific effects |

## Setup: shared data preparation

All three methods start from the same `mlumr_data` object. STC and ML-UMR
use the integration points added below; the naive method ignores them.

```{r, eval = TRUE, purl = FALSE}
library(mlumr)
set.seed(2026)

# --- Simulate IPD (index treatment) ---
n_A <- 500
age_A <- rbinom(n_A, 1, 0.40)
sex_A <- rbinom(n_A, 1, 0.55)

# True DGP: logit(p) = -0.5 + 0.8*age - 0.3*sex
logit_p_A <- -0.5 + 0.8 * age_A - 0.3 * sex_A
y_A <- rbinom(n_A, 1, plogis(logit_p_A))

ipd_df <- data.frame(
  trt = "Drug_A", outcome = y_A,
  age_group = age_A, sex = sex_A
)

# --- Simulate AgD (comparator) ---
# Same covariate effects, different intercept and covariate distribution
n_B <- 400
r_B <- 148  # pre-computed from true model with different covariates

agd_df <- data.frame(
  trt = "Drug_B", n_total = n_B, n_events = r_B,
  age_group_mean = 0.35, sex_mean = 0.50
)

# --- Prepare data ---
ipd <- set_ipd(ipd_df, treatment = "trt", outcome = "outcome",
               covariates = c("age_group", "sex"))
agd <- set_agd(agd_df, treatment = "trt",
               outcome_n = "n_total", outcome_r = "n_events",
               cov_means = c("age_group_mean", "sex_mean"),
               cov_types = c("binary", "binary"))

dat <- combine_data(ipd, agd)

# Add integration points (used by STC and ML-UMR)
dat <- add_integration(
  dat, n_int = 64,
  age_group = distr(qbern, prob = age_group_mean),
  sex = distr(qbern, prob = sex_mean)
)
```
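
To build intuition for what integration points represent, the sketch below
maps evenly spaced uniform values through a Bernoulli quantile function using
base R only. It is a one-dimensional illustration under our own assumptions
(midpoint-spaced uniforms and `qbinom()` as the quantile function);
`add_integration()` may generate its points differently.

```r
# Illustration only: not necessarily the scheme used by add_integration().
n_int <- 64
p_age_B <- 0.35  # comparator mean of age_group, as in agd_df

# Evenly spaced values in (0, 1), pushed through the Bernoulli quantile function
u <- (seq_len(n_int) - 0.5) / n_int
age_points <- qbinom(u, size = 1, prob = p_age_B)

# The share of 1s approximates the reported covariate mean
mean(age_points)
```

With a finite number of points the share of 1s can only match the target
mean to within `1 / n_int`, which is one reason to prefer more points when
covariate means are extreme.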

## Running all three methods

### Naive estimate

```{r, eval = TRUE, purl = FALSE}
res_naive <- naive(dat)
print(res_naive)
```

The naive method ignores covariate differences between populations. It
serves as a reference: if the naive and adjusted estimates agree, imbalance
in the measured covariates has little practical impact on the comparison.
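
For readers who want to see what an unadjusted comparison amounts to, the
following base R sketch computes a crude log odds ratio with a Wald interval
from event counts. It re-simulates the index-arm outcomes with the same seed
and data-generating process as above. The direction (index versus comparator)
and the Wald standard error are our assumptions for illustration, not a
description of how `naive()` is implemented.

```r
# Re-create the index-arm outcomes from the setup chunk (same seed and DGP)
set.seed(2026)
n_A <- 500
age_A <- rbinom(n_A, 1, 0.40)
sex_A <- rbinom(n_A, 1, 0.55)
y_A <- rbinom(n_A, 1, plogis(-0.5 + 0.8 * age_A - 0.3 * sex_A))

r_A <- sum(y_A)  # events on the index treatment
n_B <- 400       # comparator sample size (from agd_df)
r_B <- 148       # comparator events (from agd_df)

# Crude log odds ratio, index vs comparator (direction is an assumption)
lor_crude <- log(r_A / (n_A - r_A)) - log(r_B / (n_B - r_B))

# Wald standard error and 95% confidence interval
se_crude <- sqrt(1 / r_A + 1 / (n_A - r_A) + 1 / r_B + 1 / (n_B - r_B))
ci_crude <- lor_crude + c(-1, 1) * qnorm(0.975) * se_crude

c(LOR = lor_crude, SE = se_crude, lower = ci_crude[1], upper = ci_crude[2])
```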

### STC estimate

```{r, eval = TRUE, purl = FALSE}
res_stc <- stc(dat)
print(res_stc)
```

STC fits a logistic regression on IPD and predicts counterfactual outcomes
for the comparator population via G-computation. The delta method provides
frequentist standard errors. STC is fast (sub-second) and a good default
when covariate adjustment is needed but a full Bayesian model is not
warranted.
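
The outcome-regression and G-computation steps can be sketched in base R as
follows. This is a minimal illustration, not the package's implementation: it
assumes the two binary covariates are independent in the comparator
population, compares against the observed comparator event rate, and omits
the delta-method standard error. All object names are ours.

```r
# Re-create IPD with the same seed and DGP as the setup chunk
set.seed(2026)
n_A <- 500
ipd_sim <- data.frame(
  age_group = rbinom(n_A, 1, 0.40),
  sex = rbinom(n_A, 1, 0.55)
)
ipd_sim$outcome <- rbinom(
  n_A, 1, plogis(-0.5 + 0.8 * ipd_sim$age_group - 0.3 * ipd_sim$sex)
)

# Step 1: outcome regression on the index-treatment IPD
fit_glm <- glm(outcome ~ age_group + sex, family = binomial(), data = ipd_sim)

# Step 2: comparator covariate distribution on the 2 x 2 grid
# (independence of age_group and sex is an assumption of this sketch)
grid <- expand.grid(age_group = 0:1, sex = 0:1)
grid$weight <- dbinom(grid$age_group, 1, 0.35) * dbinom(grid$sex, 1, 0.50)

# Step 3: G-computation, averaging predicted risks over the comparator population
p_A_in_B <- sum(grid$weight * predict(fit_glm, newdata = grid, type = "response"))

# Step 4: marginal log odds ratio against the observed comparator event rate
p_B <- 148 / 400
lor_sketch <- qlogis(p_A_in_B) - qlogis(p_B)
lor_sketch
```

Averaging on the probability scale before taking the logit is what makes the
result a marginal rather than a conditional effect.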

### ML-UMR SPFA

```{r, eval = TRUE, purl = FALSE}
fit_spfa <- mlumr(
  dat, model = "spfa",
  prior_intercept = prior_normal(0, 10),
  prior_beta = prior_normal(0, 2.5),
  # Short chains keep the vignette fast; use longer runs for real analyses
  chains = 2, iter = 500, warmup = 250,
  seed = 42, refresh = 0, verbose = FALSE
)
summary(fit_spfa)
```

The SPFA model assumes shared covariate effects across treatments. It
jointly models both data sources and produces posterior distributions
for all parameters.

### ML-UMR Relaxed

```{r, eval = TRUE, purl = FALSE}
fit_relaxed <- mlumr(
  dat, model = "relaxed",
  prior_intercept = prior_normal(0, 10),
  prior_beta = prior_normal(0, 2.5),
  chains = 2, iter = 500, warmup = 250,
  seed = 43, refresh = 0, verbose = FALSE
)
summary(fit_relaxed)
```

The Relaxed model allows treatment-specific covariate coefficients,
capturing potential effect modification. Compare it with SPFA to assess
whether assuming shared effects is reasonable. With sparse AgD, relaxed-model
comparator coefficients can be prior-sensitive, so inspect `delta_beta` and
run prior-sensitivity checks when the relaxed model is central to the
interpretation.

## Building a comparison table

```{r, eval = TRUE, purl = FALSE}
# Extract ML-UMR marginal effects (LOR in comparator population)
me_spfa <- marginal_effects(fit_spfa, effect = "lor")
me_relaxed <- marginal_effects(fit_relaxed, effect = "lor")

# Comparator-population LORs from ML-UMR
lor_spfa <- me_spfa[me_spfa$population == "Comparator", ]
lor_relaxed <- me_relaxed[me_relaxed$population == "Comparator", ]

# Assemble results
comparison <- data.frame(
  Method = c("Naive", "STC", "ML-UMR SPFA", "ML-UMR Relaxed"),
  LOR = c(res_naive$link_effect, res_stc$link_effect, lor_spfa$mean, lor_relaxed$mean),
  # SE holds frequentist standard errors (Naive, STC) and posterior SDs (ML-UMR)
  SE = c(res_naive$se, res_stc$se, lor_spfa$sd, lor_relaxed$sd),
  CI_lower = c(res_naive$ci_lower, res_stc$ci_lower,
               lor_spfa$q2.5, lor_relaxed$q2.5),
  CI_upper = c(res_naive$ci_upper, res_stc$ci_upper,
               lor_spfa$q97.5, lor_relaxed$q97.5),
  stringsAsFactors = FALSE
)

# Add odds ratios for clinical interpretation
comparison$OR <- exp(comparison$LOR)
comparison$OR_lower <- exp(comparison$CI_lower)
comparison$OR_upper <- exp(comparison$CI_upper)

print(comparison, digits = 3)
```

## Interpreting differences between methods

### Naive vs STC

A large gap between the naive and STC estimates indicates that imbalance in
the measured covariates is driving part of the unadjusted comparison. The
sign of the shift shows which population has the more favorable covariate
profile for the outcome.

```{r, eval = TRUE, purl = FALSE}
bias_naive <- res_naive$link_effect - res_stc$link_effect
cat(sprintf("Apparent bias from covariate imbalance: %.3f (log OR scale)\n",
            bias_naive))
```
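
To see how covariate imbalance alone can move an unadjusted comparison, the
sketch below applies the simulation's true outcome model to both covariate
distributions and compares the resulting marginal log odds. It assumes
independent binary covariates (as in the simulation) and uses a hypothetical
helper, `marginal_prob()`, that is not part of mlumr.

```r
# Hypothetical helper: marginal event probability under the true DGP
# logit(p) = b0 + b_age * age + b_sex * sex, with independent binary covariates
marginal_prob <- function(p_age, p_sex, b0 = -0.5, b_age = 0.8, b_sex = -0.3) {
  grid <- expand.grid(age = 0:1, sex = 0:1)
  w <- dbinom(grid$age, 1, p_age) * dbinom(grid$sex, 1, p_sex)
  sum(w * plogis(b0 + b_age * grid$age + b_sex * grid$sex))
}

# Same treatment, two covariate distributions
p_ipd_pop <- marginal_prob(p_age = 0.40, p_sex = 0.55)  # IPD population
p_agd_pop <- marginal_prob(p_age = 0.35, p_sex = 0.50)  # comparator population

# Shift in marginal log odds attributable to covariate mix alone
shift <- qlogis(p_ipd_pop) - qlogis(p_agd_pop)
shift
```

Here the shift is positive but small because the two imbalances partly
offset: the IPD population has more of the higher-risk age group and more of
the lower-risk sex category.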

### STC vs ML-UMR SPFA

Under the shared prognostic factor assumption, STC and ML-UMR SPFA should
give similar point estimates because both adjust for the same covariates.
Differences arise because:

1. **STC** uses maximum likelihood; **ML-UMR** uses Bayesian inference with
   priors. With large samples, this difference is small.
2. **ML-UMR** jointly models both data sources; **STC** only uses AgD for
   prediction targets, not as a likelihood contribution.
3. ML-UMR credible intervals tend to be wider than STC confidence intervals
   because they propagate more sources of uncertainty (prior, integration,
   joint modeling).

### SPFA vs Relaxed

```{r, eval = TRUE, purl = FALSE}
dic_comparison <- compare_models(fit_spfa, fit_relaxed)
print(dic_comparison)
```

If the Relaxed model gives markedly different LORs or a substantially better
DIC, this *may* suggest effect modification, that is, covariate effects that
differ by treatment. However, DIC is a rough metric with known limitations
and should not be the sole basis for claiming effect modification. Always
inspect `delta_beta` credible intervals, prior sensitivity, and clinical
plausibility. If SPFA and Relaxed agree, the simpler SPFA model is preferred.

```{r, eval = TRUE, purl = FALSE}
# Inspect delta_beta from the Relaxed model
# A credible interval that excludes zero suggests (but does not prove)
# effect modification
relaxed_summary <- fit_relaxed$summary
delta_rows <- grepl("^delta_beta", relaxed_summary$variable)
if (any(delta_rows)) {
  cat("Effect modification parameters (delta_beta):\n")
  print(relaxed_summary[delta_rows, c("variable", "mean", "2.5%", "97.5%")],
        row.names = FALSE)
}
```

The chunk below, which is not evaluated in this vignette, checks how the
Relaxed model responds to a range of `prior_beta` scales:

```{r, eval = FALSE, purl = FALSE}
prior_sensitivity(fit_relaxed, prior_beta_scales = c(1, 2.5, 5, 10))
```

## Decision guide: which method to report?

The choice of primary analysis depends on the clinical and regulatory
context. The following steps summarize the main considerations:

1. **Are covariate distributions similar across trials?**
   - Yes: Naive estimate may suffice as primary, with STC as sensitivity.
   - No: Adjustment is needed. Proceed to step 2.

2. **Is the SPFA assumption plausible?**
   - Yes: ML-UMR SPFA is the recommended primary analysis. Report STC and
     Naive as sensitivity analyses.
   - Uncertain: Run both SPFA and Relaxed. If they agree, report SPFA.
     If they disagree, report Relaxed as primary and discuss the evidence
     for effect modification.
   - No (known effect modification): ML-UMR Relaxed as primary.

3. **Is Bayesian analysis acceptable to the audience?**
   - No: Report STC as primary with ML-UMR as sensitivity.
   - Yes: Report ML-UMR as primary with STC as sensitivity.

In all cases, report the naive estimate as a benchmark to quantify the
impact of covariate adjustment.
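
The steps above can be written down as a small helper, which is sometimes
handy for documenting the choice in an analysis plan. The function
`choose_primary()` and its arguments are hypothetical and not part of mlumr.
It encodes one reading of the guide, in which audience acceptance of Bayesian
methods (step 3) is checked before the SPFA question (step 2).

```r
# Hypothetical helper: one possible encoding of the decision guide
choose_primary <- function(covariates_similar,
                           spfa_plausible = c("yes", "uncertain", "no"),
                           models_agree = NA,
                           bayesian_ok = TRUE) {
  spfa_plausible <- match.arg(spfa_plausible)

  # Step 1: similar covariate distributions, so no adjustment is required
  if (covariates_similar) return("Naive")

  # Step 3 (checked early): frequentist audience, so fall back to STC
  if (!bayesian_ok) return("STC")

  # Step 2: plausibility of shared prognostic factor effects
  if (spfa_plausible == "yes") return("ML-UMR SPFA")
  if (spfa_plausible == "no") return("ML-UMR Relaxed")

  # Uncertain: the answer depends on whether the two fits agree
  if (is.na(models_agree)) {
    stop("Fit both SPFA and Relaxed, then set `models_agree`.")
  }
  if (models_agree) "ML-UMR SPFA" else "ML-UMR Relaxed"
}

choose_primary(covariates_similar = FALSE,
               spfa_plausible = "uncertain",
               models_agree = TRUE)
```

Whatever the helper returns, the remaining methods are still reported as
sensitivity analyses.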

## Presenting results

For a complete ITC report, present:

1. **A summary table** (as above) with LOR, SE, and CI for each method
2. **Model diagnostics** for ML-UMR (divergences, Rhat, ESS)
3. **DIC comparison** between SPFA and Relaxed (if both were fit)
4. **A narrative** explaining which method is primary and why, with
   sensitivity analyses supporting the conclusions

```{r, eval = TRUE, purl = FALSE}
# Example narrative output
cat("Primary analysis: ML-UMR SPFA\n")
cat(sprintf("  LOR = %.3f (95%% CrI: %.3f to %.3f)\n",
            lor_spfa$mean, lor_spfa$q2.5, lor_spfa$q97.5))
cat(sprintf("  OR  = %.3f (95%% CrI: %.3f to %.3f)\n\n",
            exp(lor_spfa$mean), exp(lor_spfa$q2.5), exp(lor_spfa$q97.5)))

cat("Sensitivity analyses:\n")
cat(sprintf("  STC:    LOR = %.3f (95%% CI: %.3f to %.3f)\n",
            res_stc$link_effect, res_stc$ci_lower, res_stc$ci_upper))
cat(sprintf("  Naive:  LOR = %.3f (95%% CI: %.3f to %.3f)\n",
            res_naive$link_effect, res_naive$ci_lower, res_naive$ci_upper))
cat(sprintf("  Relaxed: LOR = %.3f (95%% CrI: %.3f to %.3f)\n",
            lor_relaxed$mean, lor_relaxed$q2.5, lor_relaxed$q97.5))
```

## Summary

Running all three methods and comparing their results strengthens any
unanchored ITC analysis. mlumr's unified data interface makes this
comparison straightforward: the same `mlumr_data` object feeds every
method, so the inputs are consistent across analyses.

Key takeaways:

- **Always report the naive estimate** as a baseline to show the impact
  of adjustment.
- **STC is a fast, frequentist alternative** that adjusts for measured
  covariates via outcome regression.
- **ML-UMR SPFA** is the recommended primary analysis under the shared
  prognostic factor assumption, offering fully Bayesian uncertainty
  quantification.
- **ML-UMR Relaxed** is a useful sensitivity analysis when effect
  modification is plausible.
- **Compare SPFA and Relaxed via DIC** as a rough sensitivity check.
  DIC is a legacy metric with known limitations; treat large differences
  as suggestive rather than definitive evidence for effect modification.