Classical reliability texts assume all components in a series system
share the same lifetime distribution: all exponential, all Weibull, or
(in the homogeneous case) all Weibull with a shared shape. This
simplification enables closed-form likelihood analysis — which is why
the reference implementations shipped with maskedcauses
cover these specific families.
Real systems rarely cooperate. A server’s failure modes include:
When you fit a single-family model to such a system, you’re forced to
average over regimes that behave qualitatively differently.
maskedhaz lets each component carry its own
distribution.
Consider an implantable medical device with four independently-failing components:
| Component | Distribution | Parameters | Failure physics |
|---|---|---|---|
| Battery | Weibull | shape \(=3\), scale \(=10\) (years) | Wear-out at end of rated life |
| Electronics | Exponential | rate \(= 0.02\) | Random bit-flips / latch-ups |
| Lead wire | Gompertz | \(a=0.02\), \(b=0.3\) | Fatigue cycling, accelerating |
| Sensor | Log-logistic | \(\alpha = 8\), \(\beta = 2\) | Hump-shaped: peaks in mid-life, declines after |
library(maskedhaz)
device <- dfr_series_md(components = list(
battery = dfr_weibull(shape = 3, scale = 10),
electronics = dfr_exponential(0.02),
lead_wire = dfr_gompertz(a = 0.02, b = 0.3),
sensor = dfr_loglogistic(alpha = 8, beta = 2)
))
device
#> Masked-cause likelihood model (4-component series)
#> Component 1: 2 param(s) [ 3, 10]
#> Component 2: 1 param(s) [0.02]
#> Component 3: 2 param(s) [0.02, 0.30]
#> Component 4: 2 param(s) [8, 2]
#> Data columns: t (lifetime), omega (type), x * (candidates)The total parameter count is \(2 + 1 + 2 +
2 = 7\). param_layout() shows which components own
which positions:
We simulate 500 devices observed for 8 years (right-censoring at
tau = 8). Masking probability p = 0.3 means
diagnostics include each non-failing component in the candidate set with
30% probability.
true_par <- c(3, 10, # battery: shape, scale
0.02, # electronics: rate
0.02, 0.3, # lead wire: a, b
8, 2) # sensor: alpha, beta
set.seed(2026)
rdata_fn <- rdata(device)
df <- rdata_fn(theta = true_par, n = 500, tau = 8, p = 0.3)
table(df$omega)
#>
#> exact right
#> 437 63About 87% of devices reach failure within the 8-year observation window, the rest are right-censored — a high exact-failure fraction because the combined hazard becomes substantial by the end of the window.
The exp/Weibull reference implementations in
maskedcauses don’t cover this mixture — no single
family-specific likelihood does. But the protocol they
implement (the series_md class, the C1–C2–C3 schema, the
loglik/score/fit generics)
applies uniformly. maskedhaz is another implementation
under the same protocol: it handles arbitrary component combinations via
component-wise hazard accumulation and numerical integration where
needed (none here, since we have no left or interval censoring).
solver <- fit(device)
# Initialize reasonably far from the truth to exercise the optimizer
init <- c(2, 8, 0.03, 0.03, 0.2, 6, 1.5)
result <- solver(df, par = init)Building the comparison table — coef() from
stats, vcov() from stats, and the
se() diagonal computation is what
algebraic.mle::se() wraps:
comparison <- data.frame(
parameter = c("battery shape", "battery scale",
"electronics rate",
"lead wire a", "lead wire b",
"sensor alpha", "sensor beta"),
truth = true_par,
estimate = round(coef(result), 4),
se = round(sqrt(diag(vcov(result))), 4)
)
comparison
#> parameter truth estimate se
#> 1 battery shape 3.00 3.1443 0.3069
#> 2 battery scale 10.00 9.3948 0.4318
#> 3 electronics rate 0.02 0.0250 0.0043
#> 4 lead wire a 0.02 0.0135 0.0033
#> 5 lead wire b 0.30 0.3613 0.0486
#> 6 sensor alpha 8.00 7.1420 0.3617
#> 7 sensor beta 2.00 2.4346 0.1893All seven parameters are recovered within roughly one standard error of truth. Identifiability is comfortable here because the four components have qualitatively different hazard shapes — the system hazard \(h_{\text{sys}}(t) = \sum_j h_j(t)\) carries distinguishable contributions from each regime, and each regime dominates a different time window within the 8-year observation horizon.
A fundamental constraint in masked-cause inference: when all
components share a family, sometimes only aggregate parameters
are identifiable from system-level data. The canonical example is
exponential components — only \(\sum_j
\lambda_j\) is identifiable, not the individual rates (see
vignette("censoring-and-masking") for the details).
Mixed families break this degeneracy. If component 1 is Weibull (power-law growth) and component 2 is exponential (flat), the system hazard shape tells you which component dominates at which time regime. The likelihood surface separates them cleanly.
We can visualize this by looking at conditional cause probabilities — given a failure at time \(t\), which component was most likely at fault?
ccp_fn <- conditional_cause_probability(device)
t_grid <- seq(0.5, 15, length.out = 60)
probs <- ccp_fn(t_grid, par = true_par)
matplot(t_grid, probs, type = "l", lwd = 2,
xlab = "system failure time (years)",
ylab = "P(component j caused failure | T = t)",
main = "Which component fails when?")
legend("topright",
legend = c("battery", "electronics", "lead wire", "sensor"),
col = 1:4, lty = 1:4, lwd = 2, bty = "n")Three regimes are visible. In early life (\(t < 2\)), the log-logistic sensor and constant-rate electronics compete for dominance — both have small but comparable hazards while the Weibull and Gompertz components are still negligible. In mid-life (\(t \approx 2\)–\(6\)), the Weibull battery rapidly takes over as its power-law hazard \(h_1(t) = 3(t/10)^2/10\) scales up. At late life (\(t > 10\)), the Gompertz lead wire competes with the battery — its accelerating hazard \(h_3(t) = 0.02 \cdot e^{0.3 t}\) catches up to the battery’s quadratic growth. The electronics has a flat contribution at rate 0.02 but its share declines because other components ramp up around it.
The fact that these regimes are temporally separated is exactly what makes the individual parameters identifiable from system-level data — each time window provides a different “view” of which component is contributing. Under a single-family assumption (say, “all Weibull”), the model would be forced to absorb all four physical modes into two aggregate parameters, losing both interpretability and the ability to predict which component will fail next.
The dfr_dist abstraction from flexhaz lets
you define a component from nothing more than its hazard function.
Suppose we want a bathtub-shaped failure mode with a
well-chosen hazard \(h(t) = \alpha t^{-1/2} +
\beta + \gamma t\):
library(flexhaz)
bathtub <- dfr_dist(
rate = function(t, par, ...) par[1]/sqrt(t) + par[2] + par[3]*t,
cum_haz_rate = function(t, par, ...) 2*par[1]*sqrt(t) + par[2]*t + par[3]*t^2/2,
n_par = 3L,
par = c(0.1, 0.05, 0.01),
par_lower = c(1e-6, 1e-6, 1e-6)
)
# Plug it into a series system like any other dfr_dist
mixed <- dfr_series_md(components = list(
bathtub,
dfr_exponential(0.02)
))Because maskedhaz never assumes closed-form likelihoods,
this works exactly like the built-in families. The log-likelihood,
score, Hessian, and MLE are all computed by the same machinery — your
custom hazard doesn’t need any special integration code.
maskedhaz supports any combination of
dfr_dist components — no closed-form assumptions