In contrast to CausalMix, which was limited to modeling
the average treatment effect, CausalMixGPD expands on the
framework of causal inference by fitting treatment-specific outcome
distributions and obtaining the contrasts by evaluating posterior
functional of the fitted distribution (Rubin 1974; Rosenbaum and Rubin 1983; Imbens and Rubin 2015). Thus, it
is a distribution-based approach where one obtains means, quantiles,
restricted means, survival summaries, and their contrasts based on
posterior objects obtained from the modeled outcome distributions.
The key point made in the software article is that the proposed approach becomes especially relevant when the outcome is skewed or heavy-tailed. In such settings, treatment heterogeneity might not manifest itself via contrasts of means but rather in terms of other summaries of interest such as quantiles, and hence the modeling of treatment-specific distributions becomes critical for obtaining a coherent picture of the heterogeneous treatment effect.
In this vignette, we describe the process of performing causal
analysis with CausalMixGPD in four parts. First, we
introduce potential outcomes framework and discuss the causal estimands
computed by the package. Next, we explain what happens under the hood
when the spliced DPM-GPD model is fit to each of the arms. After that,
we demonstrate the usage of the key functions of the package using the
example of the provided simulated data set.
Let \(A \in \{0,1\}\) denote a binary treatment indicator, with \(A = 1\) for treatment and \(A = 0\) for control. Let \(Y(1)\) and \(Y(0)\) be the potential outcomes, and let \(X\) denote the pre-treatment covariates. The observed outcome is
\[ Y = A Y(1) + (1-A)Y(0). \]
For each treatment arm \(a \in \{0,1\}\), define the arm-specific conditional CDF and density by
\[ F_a(y \mid x) = \Pr\{Y(a) \le y \mid X = x\}, \qquad f_a(y \mid x) = \frac{\partial}{\partial y} F_a(y \mid x). \]
The package models these arm-specific conditional distributions directly. Once they are available, all reported causal estimands are computed from them.
The causal workflow follows the standard potential-outcomes framework under consistency, no interference, conditional ignorability, and overlap (Rubin 1974; Rosenbaum and Rubin 1983; Imbens and Rubin 2015). Under these assumptions,
\[ F_a(y \mid x) = \Pr(Y \le y \mid A=a, X=x), \qquad a \in \{0,1\}. \]
That identification step matters because it justifies fitting arm-specific outcome models to the observed data and then interpreting posterior contrasts as causal quantities.
In observational settings, the package can estimate a propensity score
\[ \rho(x) = \Pr(A=1 \mid X=x) \]
and append a summary of it to the outcome model covariates. If \(\widehat\rho(x)\) is the estimated score, the augmented predictor used by the outcome model is typically of the form
\[ r(x) = (x^\top, \psi\{\widehat\rho(x)\})^\top, \]
where \(\psi(\cdot)\) is either the identity map on the probability scale or the logit transformation, depending on the setting. This is the package’s default strategy for incorporating design information into the arm-specific outcome regressions.
Within each arm, CausalMixGPD uses the same spliced
model described in the one-arm workflow. For arm \(a\), the conditional distribution combines
a DPM bulk and an optional GPD tail:
\[ f_a(y \mid x) = \begin{cases} f_{a,\mathrm{DP}}(y \mid x; \Theta_a), & y \le u_a(x), \\ \{1-p_{u_a}(x;\Theta_a)\} f_{a,\mathrm{GPD}}(y \mid x; \Phi_a), & y > u_a(x), \end{cases} \]
where \(p_{u_a}(x;\Theta_a) = F_{a,\mathrm{DP}}(u_a(x) \mid x; \Theta_a)\). The main modeling choice in the causal workflow is therefore not the contrast itself but the quality of the two fitted conditional outcome distributions.
Once the arm-specific distributions are fitted, the package returns several causal functionals.
The conditional mean in arm \(a\) is
\[ \mu_a(x) = E\{Y(a) \mid X=x\}, \]
and the conditional quantile is
\[ Q_a(\tau \mid x) = \inf\{y : F_a(y \mid x) \ge \tau\}. \]
From these, the package constructs conditional and standardized estimands. The main ones are
\[ \mathrm{CATE}(x) = \mu_1(x) - \mu_0(x), \]
\[ \mathrm{CQTE}(\tau \mid x) = Q_1(\tau \mid x) - Q_0(\tau \mid x), \]
\[ \mathrm{ATE} = E\{\mu_1(X) - \mu_0(X)\}, \]
and
\[ \mathrm{QTE}(\tau) = Q^m_1(\tau) - Q^m_0(\tau), \]
where the superscript \(m\) indicates standardization over the empirical covariate distribution in the full sample. The package also provides ATT and QTT analogues standardized over the treated covariate distribution.
These definitions explain the structure of the causal API. Functions
such as ate(), qte(), cate(), and
cqte() are not fitting new models. They are post-processing
a fitted causal object to summarize different posterior functionals.
The most important exported functions in this vignette are:
dpmgpd.causal() for spliced arm-specific outcome
modeling.dpmix.causal() for the bulk-only causal analogue.ate() and att() for mean or
restricted-mean treatment contrasts.qte() and qtt() for marginal quantile
contrasts.cate() and cqte() for profile-specific
conditional contrasts.predict() for arm-wise predictive summaries and
contrast-based summaries at new covariate profiles.summary() and plot() for effect-object
summaries and visualization.mcmc_vig <- list(
niter = 1200,
nburnin = 300,
thin = 2,
nchains = 2,
seed = 2026
)
# Settings used when regenerating `inst/extdata/causal_*.csv` (see tools/.Rscripts/build_vignette_fits.R).
mcmc_out <- list(
niter = 1200,
nburnin = 300,
thin = 2,
nchains = 2,
seed = 2027
)
mcmc_ps <- list(
niter = 1000,
nburnin = 250,
thin = 2,
nchains = 2,
seed = 2028
)We use the bundled causal dataset causal_pos500_p3_k2,
which contains a positive outcome y, a binary treatment
indicator A, and a predictor matrix X.
data("causal_pos500_p3_k2", package = "CausalMixGPD")
causal_dat <- data.frame(
y = causal_pos500_p3_k2$y,
A = causal_pos500_p3_k2$A,
causal_pos500_p3_k2$X
)A quick empirical comparison between the observed outcome distributions by treatment arm gives a sense of the modeling problem.
ggplot(causal_dat, aes(x = y, colour = factor(A), fill = factor(A))) +
geom_histogram(aes(y = after_stat(density)), bins = 25,
alpha = 0.25, position = "identity") +
labs(
x = "y",
y = "Density",
colour = "A",
fill = "A",
title = "Observed outcome distributions by treatment arm"
) +
theme_minimal()Even in a simulated benchmark, the treated and control distributions can differ in ways that are not well summarized by a single average shift. That is exactly where the distributional causal framework is useful.
The high-level causal wrapper is dpmgpd.causal(). It
fits treatment-specific outcome models and, when requested, a propensity
score stage. The software article presents this as the direct causal
extension of the one-arm workflow.
To keep vignette building stable, the numerical outputs shown below
are read from small precomputed files shipped in
inst/extdata/. The following chunk shows the package call
you would use for a full fit; the tables and figures in this section are
rendered from those shipped artifacts in hidden companion chunks.
fit_causal <- dpmgpd.causal(
formula = y ~ x1 + x2 + x3,
data = causal_dat,
treat = "A",
backend = "crp",
kernel = "gamma",
components = 6,
PS = "logit",
ps_scale = "logit",
ps_summary = "mean",
mcmc_outcome = mcmc_out,
mcmc_ps = mcmc_ps,
parallel_arms = FALSE
)The call matches the structure used to build the shipped causal summaries: arm-specific spliced outcome distributions, a gamma bulk kernel, a CRP backend for the mixture, and a logistic propensity score stage summarized on the logit scale.
The standardized ATE contrasts the arm-specific means after averaging over the empirical covariate distribution.
ate_fit <- ate(fit_causal, interval = "hpd", level = 0.95, show_progress = FALSE)
knitr::kable(summary(ate_fit)$effect_table, format = "html", digits = 4)| estimand | estimate | lower | upper |
|---|---|---|---|
| ATE | -240.3653 | -222.5154 | 0.1209 |
This table is intentionally simple. In the package,
ate() returns an effect object with summary and plotting
methods. The key interpretation point is that the reported effect is
built from the arm-specific fitted distributions, not from a stand-alone
regression coefficient.
Quantile treatment effects are often more informative than a single average contrast in skewed or heavy-tailed settings.
qte_fit <- qte(
fit_causal,
probs = c(0.25, 0.50, 0.75, 0.90, 0.95),
interval = "credible",
level = 0.95,
show_progress = FALSE
)
knitr::kable(summary(qte_fit)$effect_table, format = "html", digits = 4)| prob | estimate | lower | upper |
|---|---|---|---|
| 0.50 | -0.0447 | -1.0870 | 1.4699 |
| 0.90 | 0.0476 | -3.0921 | 3.6900 |
| 0.95 | 0.2909 | -3.8436 | 4.8933 |
qte_fit <- qte(
fit_causal,
probs = c(0.25, 0.50, 0.75, 0.90, 0.95),
interval = "credible",
level = 0.95,
show_progress = FALSE
)
plot(qte_fit, type = "effect")The QTE display shows how the treatment effect varies with the quantile level. This is one of the main scientific motivations for fitting treatment-specific distributions directly rather than relying on mean-only summaries.
The conditional functions cate() and cqte()
evaluate treatment contrasts at user-supplied covariate profiles. A
common strategy is to build representative profiles from empirical
quartiles of the predictors.
qs <- c(0.25, 0.50, 0.75)
Xgrid <- expand.grid(lapply(causal_dat[, c("x1", "x2", "x3")], quantile, probs = qs))
head(Xgrid)
#> x1 x2 x3
#> 1 -0.66114531 -0.54573642 -0.6595519
#> 2 0.07649836 -0.54573642 -0.6595519
#> 3 0.66247098 -0.54573642 -0.6595519
#> 4 -0.66114531 -0.09628321 -0.6595519
#> 5 0.07649836 -0.09628321 -0.6595519
#> 6 0.66247098 -0.09628321 -0.6595519To illustrate the profile-specific summaries in a lightweight vignette, we reuse the package’s shipped conditional quantile summaries.
Xgrid <- as.data.frame(lapply(
causal_dat[, c("x1", "x2", "x3")],
stats::quantile,
probs = c(0.25, 0.50, 0.75),
na.rm = TRUE
))
rownames(Xgrid) <- c("low", "mid", "high")
cqte_fit <- cqte(
fit_causal,
newdata = Xgrid,
probs = c(0.25, 0.50, 0.75, 0.90, 0.95),
interval = "credible",
level = 0.95,
show_progress = FALSE
)
knitr::kable(summary(cqte_fit)$effect_table, format = "html", digits = 4)| estimate | index | id | lower | upper |
|---|---|---|---|---|
| 1.4411 | 0.50 | 1 | 1.2335 | 1.6327 |
| 1.8302 | 0.50 | 2 | 1.2346 | 2.3465 |
| 42.4650 | 0.90 | 1 | 13.1888 | 129.6083 |
| 50.5770 | 0.90 | 2 | 14.9305 | 144.2923 |
| 134.7847 | 0.95 | 1 | 25.1334 | 478.7875 |
| 156.0091 | 0.95 | 2 | 28.9078 | 518.2162 |
The columns have the following meaning.
id indexes the covariate profile.index is the quantile level \(\tau\).estimate is the posterior point estimate of the
arm-contrast quantile summary.lower and upper are the corresponding
interval bounds.In a full causal fit, cqte() would return an effect
object that can be plotted profile by profile. The important conceptual
point is that these conditional quantile contrasts are produced by
evaluating the fitted treated and control quantile functions at the same
covariate profile and differencing the results.
The causal workflow also supports direct prediction. After fitting
the causal object, predict() can be used to obtain
arm-specific densities, survival probabilities, quantiles, means, or
restricted means at new covariate profiles. For mean and quantile-type
outputs, the treated-minus-control contrast is often the quantity of
interest; for density and survival outputs, users frequently inspect the
arm-specific predictions themselves.
The predictive logic is the same as in the one-arm case. First, the model builds arm-specific conditional distributions. Second, desired posterior functionals are evaluated from those distributions. Third, contrasts are formed draw by draw and then summarized.
The causal analysis in CausalMixGPD is best understood
as a structured post-processing layer on top of two fitted conditional
outcome models. The ATE summarizes a standardized mean contrast, but the
QTE and CQTE summaries reveal whether treatment effects differ across
the outcome scale or across covariate profiles. In heavy-tailed
problems, this distinction is often substantive rather than cosmetic.
Two interventions may have similar average effects while behaving quite
differently in the upper tail.
The package’s causal interface is especially useful in that situation because the same fitted object supports several scientifically relevant views of the treatment effect. Mean, restricted-mean, quantile, and conditional contrasts all remain tied to the same posterior predictive mechanism.
The package appendix lists the full causal customization surface; the most practically important arguments are summarized here.
kernel, backend, GPD,
components, and param_specs can each be
supplied either as a single shared value or as arm-specific values of
length two. This allows the treated and control outcome models to differ
in kernel family, backend, truncation size, and prior structure.
dpmgpd.causal() activates the spliced tail, while
dpmix.causal() fits the bulk-only causal analogue.
PS selects the propensity score model. Common values are
"logit", "probit", "naive", and
FALSE.
ps_scale controls whether the propensity score enters
the outcome model on the probability scale or the logit scale.
ps_summary selects the posterior summary of the
propensity score appended to the outcome regressors, typically
"mean" or "median".
ps_prior, ps_clamp, and
include_intercept provide additional control over
propensity score estimation and numerical stability.
The causal workflow separates iteration controls for the propensity
score stage and the outcome stage through mcmc_ps and
mcmc_outcome.
parallel_arms, workers, and related runtime
arguments can be used when the treated and control models are fitted in
parallel.
ate(), att(), cate(), and
related functions accept type = "mean" or
"rmean" when the target is mean-based.
qte(), qtt(), and cqte() use
the argument probs to specify the quantile levels.
Effect intervals are controlled by interval and
level, just as in the one-arm workflow.
predict() supports the same distributional output types
as the one-arm interface, now interpreted in an arm-specific or
contrast-based manner depending on the chosen output.
sessionInfo()
#> R version 4.5.3 (2026-03-11 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#>
#> Matrix products: default
#> LAPACK version 3.12.1
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_4.0.2 CausalMixGPD_0.7.0 nimble_1.4.0
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 jsonlite_2.0.0 dplyr_1.2.0
#> [4] compiler_4.5.3 tidyselect_1.2.1 parallel_4.5.3
#> [7] jquerylib_0.1.4 scales_1.4.0 yaml_2.3.12
#> [10] fastmap_1.2.0 lattice_0.22-9 coda_0.19-4.1
#> [13] R6_2.6.1 labeling_0.4.3 generics_0.1.4
#> [16] igraph_2.2.1 knitr_1.51 tibble_3.3.0
#> [19] bslib_0.10.0 pillar_1.11.1 RColorBrewer_1.1-3
#> [22] rlang_1.1.7 cachem_1.1.0 xfun_0.56
#> [25] sass_0.4.10 S7_0.2.1 cli_3.6.5
#> [28] withr_3.0.2 magrittr_2.0.4 digest_0.6.39
#> [31] grid_4.5.3 rstudioapi_0.18.0 lifecycle_1.0.5
#> [34] vctrs_0.7.1 evaluate_1.0.5 pracma_2.4.6
#> [37] glue_1.8.0 farver_2.1.2 numDeriv_2016.8-1.1
#> [40] rmarkdown_2.30 tools_4.5.3 pkgconfig_2.0.3
#> [43] htmltools_0.5.9