| Type: | Package |
| Title: | Convergence and Dynamic Factor Models |
| Version: | 0.3.2 |
| Description: | Tests convergence in macro-financial panels combining Dynamic Factor Models (DFM) and mean-reverting, discrete-time Ornstein-Uhlenbeck/AR(1) factor processes. Provides: (i) static factor extraction with VAR stability checks, Portmanteau tests and rolling out-of-sample R^2, in the spirit of Stock and Watson (2002) <doi:10.1198/073500102317351921> and the Generalized Dynamic Factor Model of Forni, Hallin, Lippi and Reichlin (2000) <doi:10.1162/003465300559037>; (ii) cointegration analysis a la Johansen (1988) <doi:10.1016/0165-1889(88)90041-3>; (iii) Bayesian factor-OU/AR(1) estimation with convergence and half-life summaries grounded in Uhlenbeck and Ornstein (1930) <doi:10.1103/PhysRev.36.823> and Vasicek (1977) <doi:10.1016/0304-405X(77)90016-2>, with full Markov chain Monte Carlo convergence diagnostics; (iv) heteroskedasticity-consistent (HC) and, when the suggested 'sandwich' (Zeileis (2004) <doi:10.18637/jss.v011.i10>) and 'lmtest' packages are available, heteroskedasticity- and autocorrelation- consistent (HAC) robust inference, with a self-contained HC fallback; (v) coupling significance tests based on time-shift / block-bootstrap nulls that preserve marginal dynamics while breaking cross-series dependence; and (vi) optional PLS-based factor preselection (Mevik and Wehrens (2007) <doi:10.18637/jss.v018.i02>). Functions emphasize reproducibility (explicit seeds throughout) and clear, publication-ready summaries. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.1) |
| Imports: | stats, methods, parallel, pls, vars, urca, readxl, dplyr, tidyr, stringr, magrittr, zoo, BayesianDisaggregation (≥ 0.2.1) |
| Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown, cmdstanr, posterior, rstan, sandwich, lmtest |
| Additional_repositories: | https://mc-stan.org/r-packages/ |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Config/roxygen2/version: | 8.0.0 |
| Packaged: | 2026-06-21 16:05:54 UTC; josemgomezj |
| Author: | José Mauricio Gómez Julián
|
| Maintainer: | José Mauricio Gómez Julián <isadore.nabi@pm.me> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-26 08:00:13 UTC |
convergenceDFM: Dynamic Factor Models for Economic Convergence Analysis
Description
Tools to analyze economic convergence using Dynamic Factor Models (DFM) and Factor Ornstein-Uhlenbeck (OU) processes. Includes data preparation, PLS-based factor selection, DFM estimation with robust inference, OU estimation (Stan or fallback), formal convergence tests, robustness checks, rotation/coupling tests, and visualization utilities.
Main functions
-
run_complete_factor_analysis_robust: End-to-end pipeline. -
estimate_DFM: Estimate a VAR-based Dynamic Factor Model. -
estimate_factor_OU: Estimate Factor OU model (Stan or fallback). -
run_convergence_robustness_tests: Robustness test suite. -
visualize_factor_dynamics: Visualization helpers.
Author(s)
Maintainer: José Mauricio Gómez Julián isadorenabi@pm.me
Authors:
José Mauricio Gómez Julián isadorenabi@pm.me
See Also
vignette("convergence-analysis")
Build a fallback sector-to-cluster map
Description
Constructs a sector-to-cluster partition for test_leave_cluster_out() when a
genuine input-output (Leontief, "MIP") partition is not yet available. This is
an explicitly documented stopgap proxy: the canonical clusters are value
chains defined by inter-industry demand linkages, which neither correlation
nor organic composition reproduces. Supply the real partition through the
cluster_map argument of test_leave_cluster_out() once the MIP is at hand.
Usage
build_cluster_map(
Y_matrix,
n_clusters = 3,
method = c("correlation", "com"),
com = NULL
)
Arguments
Y_matrix |
Numeric matrix (periods in rows, sectors in columns). |
n_clusters |
Integer number of clusters (clamped to |
method |
Fallback rule, |
com |
Numeric vector of organic-composition values, one per sector
(length |
Value
A named integer vector of length ncol(Y_matrix): the cluster label of
each sector, named by sector.
Fallback methods
"correlation"Average-linkage hierarchical clustering on the correlation distance
1 - \rho_{ij}between the sectoral series (columns ofY_matrix), cut inton_clustersgroups. Sectors that move together over time are grouped; this is a purely statistical proxy for co-movement, not a demand-linkage cluster."com"Quantile binning of a per-sector organic-composition vector
cominton_clustersgroups. Motivated by the gravitation reading (sectors of similar organic composition share a profit-rate neighbourhood), but again a one-dimensional proxy, not an input-output partition.
See Also
Select optimal VAR lag order with multiple criteria
Description
Determines the optimal lag order for a Vector Autoregression model using information criteria, stability checks, serial correlation tests, and optional out-of-sample validation.
Usage
choose_var_lag(
F_combined,
lag.max = 4,
type = "const",
p_pref = c("SC(n)", "HQ(n)"),
alpha = 0.05,
oos_eval = TRUE,
oos_start = 0.7,
verbose = TRUE
)
Arguments
F_combined |
Numeric matrix (T x K) of factor scores to be modeled. |
lag.max |
Integer. Maximum lag order to consider. Default is 4. |
type |
Character string. Type of deterministic terms: "const" (default), "trend", "both", or "none". |
p_pref |
Character vector. Preferred information criteria for initial
selection. Default is |
alpha |
Numeric. Significance level for serial correlation test. Default is 0.05. |
oos_eval |
Logical. Should out-of-sample evaluation be performed? Default is TRUE. |
oos_start |
Numeric. Proportion of sample to use for training in OOS validation. Default is 0.7. |
verbose |
Logical; print progress information. Default |
Details
The function combines multiple selection criteria: (1) information criteria (AIC, BIC, HQ), (2) VAR stability (eigenvalue modulus < 1), (3) Portmanteau test for serial correlation, and (4) out-of-sample forecast performance. Returns the model that best balances these considerations.
Value
List with components:
pSelected optimal lag order.
fitFitted VAR model object of class
varest.roots_okLogical indicating if stability condition is satisfied.
serial_okLogical indicating if serial correlation test passed.
oos_mseOut-of-sample mean squared error (if
oos_eval = TRUE).
Construct the transformation wedge W = Phi - V = K * G' - p
Description
Builds the sector-by-time wedge that isolates the marxian redistribution of
surplus from the shared cost price. The wedge cancels the cost price
k = c + v that both the production price \Phi = k + K G' and the
value V = k + p share, leaving W_i = K_i G' - p_i. By
construction of the general profit rate G' = \sum_i p_i / \sum_i K_i,
the wedge sums to zero across sectors each period (redistribution, not
creation of value: the marxian invariance).
Usage
compute_wedge(K, Gprime, p, tol = 1e-08, check_zero_sum = TRUE)
Arguments
K |
Numeric matrix |
Gprime |
Numeric vector of length |
p |
Numeric matrix |
tol |
Relative tolerance for the per-period zero-sum check. Default
|
check_zero_sum |
Logical; if |
Value
A list with: W (the [T, S] wedge), row_sums
(per-period cross-sectional sums, expected ~0), max_abs_rowsum_rel
(the largest relative zero-sum violation), and markup (the profit
markup K_i G').
Examples
set.seed(1)
Tn <- 20; S <- 5
K <- matrix(runif(Tn * S, 1, 5), Tn, S)
p <- matrix(runif(Tn * S, 0.1, 1), Tn, S)
Gp <- rowSums(p) / rowSums(K)
w <- compute_wedge(K, Gp, p)
max(abs(w$row_sums)) # ~ 0 by construction
Incremental R-squared from X in OU model
Description
Computes the incremental explanatory power (delta R-squared) contributed by X factors in predicting Y factors, both in-sample and out-of-sample.
Usage
deltaR2_ou(
results_robust,
lag = 1,
oos = TRUE,
seed = 123,
ridge = 1e-08,
verbose = TRUE
)
Arguments
results_robust |
Output from run_complete_factor_analysis_robust() |
lag |
Number of lags for the model (default: 1) |
oos |
Logical, perform out-of-sample validation (default: TRUE) |
seed |
Random seed for reproducibility (default: 123) |
ridge |
Ridge regularization parameter (default: 1e-08) |
verbose |
Logical, print progress messages (default: TRUE) |
Value
List with components:
in_sampleData frame with R2_full, R2_baseline, and deltaR2.
per_equationVector of deltaR2 for each Y factor equation.
OOSList with out-of-sample RMSE and deltaR2 (if
oos = TRUE).
Diagnose and prepare data matrices
Description
Performs data validation, missing value imputation, and variance checks on input matrices to prepare them for factor analysis. Handles dimension compatibility, NA values, and zero-variance columns.
Usage
diagnose_data(
X_matrix,
Y_matrix,
verbose = TRUE,
seed = NULL,
max_impute_frac = 0.2
)
Arguments
X_matrix |
Numeric matrix or data frame of X variables (e.g., labour-value price indices). |
Y_matrix |
Numeric matrix or data frame of Y variables (e.g., market prices / CPI). |
verbose |
Logical; print diagnostic information. Default |
seed |
Optional integer. If supplied, the random number generator state is
set (and restored on exit) before injecting the jitter used to break exactly
zero-variance columns, making the operation reproducible. Default
|
max_impute_frac |
Numeric in (0, 1). If the fraction of imputed entries
in either matrix exceeds this value a warning is emitted (heavy imputation
can distort downstream covariance structure). Default |
Details
The function:
Converts to matrix format if needed
Validates dimensional compatibility
Imputes missing values via interpolation (using zoo if available)
Adds minimal, optionally seeded, noise to zero-variance columns
Reports diagnostic information and the imputation fraction
Imputation by interpolation with boundary carry-forward (and a mean fallback) is a convenience for exploratory work; for inference on series with substantial missingness, prefer an explicit, model-based imputation upstream.
Value
List with components:
X_matrixCleaned and prepared X matrix.
Y_matrixCleaned and prepared Y matrix.
impute_fracNamed numeric vector with the fraction of imputed entries in X and Y.
Estimate Dynamic Factor Model with VAR dynamics
Description
Estimates a Dynamic Factor Model by extracting factors via PLS and modeling their dynamics with a Vector Autoregression. Includes automatic lag selection, robust inference, and optional out-of-sample evaluation.
Usage
estimate_DFM(
factors_data,
p = 2,
compute_oos = TRUE,
hc_type = "HC3",
hac = FALSE,
verbose = TRUE
)
Arguments
factors_data |
List containing PLS-extracted factor scores ( |
p |
Integer. VAR lag order. If |
compute_oos |
Logical. Should out-of-sample diagnostics be computed? Default is TRUE. |
hc_type |
Character string. Heteroskedasticity-consistent SE type. Default is "HC3". |
hac |
Logical. If |
verbose |
Logical; print progress and diagnostic information. Default |
Details
This function models the joint dynamics of X and Y factors using a VAR. It performs stability checks, tests for serial correlation, computes robust standard errors, and optionally evaluates forecast performance out-of-sample. Robust covariance uses sandwich/lmtest when available and a self-contained HC (HC0-HC4) implementation otherwise. Because the factors are PLS scores (supervised on Y), in-sample fit is optimistic; rely on the out-of-sample diagnostics. See the vignette section "Methodological notes".
Value
List with components:
var_modelFitted VAR model (
vars::varest).p_usedVAR lag order used.
roots,is_stableCompanion-root moduli and stability flag.
half_lives,half_life_dominantPer-root half-lives and the half-life of the largest-modulus (slowest) root, which governs system persistence.
r2_eq,r2_adj_eq,r2_globalIn-sample R2.
r2_oos_eq,r2_oos_global,mse_oos_eqOut-of -sample diagnostics (if requested), reported unclipped.
robust_sePer-equation list of robust standard errors.
robust_inferencePer-equation list of
coef,se,t,pfrom HC/HAC robust inference.portmanteau_pPortmanteau serial-correlation p-value.
hc_typeThe HC type used.
Estimate Factor Ornstein-Uhlenbeck / AR(1) model (Stan if available)
Description
Estimates a Bayesian multivariate, discrete-time mean-reverting model
(a first-order vector autoregression with cross-equation coupling) in which
Y_t depends on its own lag and on lagged X_{t-1} through a
\beta matrix. This is the discrete-time analogue of a coupled
Ornstein-Uhlenbeck system; see the vignette section "Methodological notes" for
the exact mapping \phi = e^{-\kappa \Delta t}. Uses cmdstanr when
available, otherwise rstan, with a discrete AR(1) fallback.
Usage
estimate_factor_OU(
factors_data,
data_prep = NULL,
chains = 4,
iter = 2000,
seed = 1234,
adapt1 = 0.95,
adapt2 = 0.999,
mtd1 = 12,
mtd2 = 15,
verbose = TRUE
)
Arguments
factors_data |
List with |
data_prep |
Optional preprocessed data (reserved). |
chains, iter, seed |
Stan sampling controls. |
adapt1, mtd1 |
|
adapt2, mtd2 |
|
verbose |
Logical; print progress/details. |
Value
A list with posterior medians (\phi, \mu, \beta),
half-lives, coupling strength, pseudo-R2, MCMC convergence diagnostics
(diagnostics: max R-hat, min ESS, divergences) and the fitted Stan
object.
Convergence is testable, not assumed
The persistence \phi is not constrained to (0,1): it is
given a generous support and a weakly-informative, convergence-neutral prior
(normal(0.5, 0.5)), so the posterior can place mass on unit-root or
mildly explosive dynamics. Mean reversion (convergence) is therefore a
conclusion supported by the data, not an artefact of the parameterization.
Examples
set.seed(123)
n <- 60
X_scores <- matrix(rnorm(n * 2), n, 2)
Y_scores <- matrix(rnorm(n * 2), n, 2)
factors_data <- list(scores_X = X_scores, scores_Y = Y_scores)
ou_result <- estimate_factor_OU(factors_data, chains = 2, iter = 600,
verbose = FALSE)
print(ou_result$half_lives_Y)
print(ou_result$diagnostics)
Extract X innovations (reduced-form VAR residuals)
Description
Computes reduced-form innovations for the X factors by fitting a VAR and returning its residuals. These are not structural shocks: identifying structural shocks would require an identification scheme (e.g. a Cholesky or sign restriction); the residuals here are correlated across equations.
Usage
make_X_innovations(results, p = NULL)
Arguments
results |
List. Output from main analysis. |
p |
Integer. VAR lag order. If |
Value
List with components:
shocksMatrix (T x Kx) of reduced-form innovations with NA padding for the initial lags.
var_fitFitted VAR model object.
lag_usedInteger lag order used.
Aggregate-preserving placebo values (negative controls for gravitation)
Description
Builds theoretically false "values" that respect the SAME aggregate marxian
constraints (per-period total surplus \sum_i p_i and the general profit
rate G') but BREAK the marxian sectoral structure. Used as negative
controls: if the genuine sectoral value gravitates (wedge reverts / OOS)
better than placebos that share all aggregate accounting, the specific
marxian assignment carries information beyond the shared cost price.
Usage
placebo_values(
k_cost,
K,
Gprime,
p,
scheme = c("uniform", "permute"),
seed = 1L
)
Arguments
k_cost |
Numeric matrix |
K |
Numeric matrix |
Gprime |
Numeric vector length |
p |
Numeric matrix |
scheme |
Either |
seed |
Integer seed for the permutation (reproducibility). Default
|
Details
Schemes:
"uniform"Every sector earns the average profit rate:
p^{plac}_i = G' K_i, soV^{plac} = k + G' K = \Phiand the wedge is identically zero (the "no redistribution" null). Tests whether the value structure adds anything over the production price."permute"A fixed (seeded) derangement
\sigmareassigns each sector's surplus series to another sector:p^{plac}_i = p_{\sigma(i)}. Per-period totals are preserved (\sum_i p_{\sigma(i)} = \sum_i p_i) but theK_i \leftrightarrow p_ipairing is destroyed, so the wedgeK_i G' - p_{\sigma(i)}mis-pairs capital and surplus.
Value
A list with p_placebo, V_placebo (= k + p^{plac}),
W_placebo (= K G' - p^{plac}), the scheme, and (for
"permute") the perm used.
Examples
set.seed(1); Tn <- 12; S <- 6
k <- matrix(runif(Tn * S, 1, 3), Tn, S)
K <- k + matrix(runif(Tn * S, 1, 3), Tn, S)
p <- matrix(runif(Tn * S, 0.1, 1), Tn, S)
Gp <- rowSums(p) / rowSums(K)
pl <- placebo_values(k, K, Gp, p, scheme = "permute", seed = 42)
max(abs(rowSums(pl$p_placebo) - rowSums(p))) # ~0: per-period total preserved
Plot error correction panel
Description
Creates a two-panel plot showing (1) error correction terms over time and (2) their estimated half-lives, providing visual assessment of convergence speed.
Usage
plot_error_correction_panel(
results_robust,
use_contemporaneous_X = TRUE,
main_left = expression(u[t] == Y[t] - hat(mu)[Y](X[t])),
main_right = "Half-life de u[t]"
)
Arguments
results_robust |
List. Results from main analysis pipeline. |
use_contemporaneous_X |
Logical. Use contemporaneous X (time t, the default) or lagged X (time t-1) for computing the equilibrium path. |
main_left |
Expression/character. Title for the left panel. |
main_right |
Character string. Title for the right panel. |
Value
Invisibly returns a list with components:
UMatrix of error correction terms (T x Ky).
half_livesVector of estimated half-lives for each Y factor.
Read CPI data from an Excel file
Description
Reads an aggregate Consumer Price Index series from an Excel file, with robust
column detection and data validation. The date/year and CPI columns are
located by pattern-matching on lower-cased header names; localized numerics
(comma decimals) are parsed with to_num_commas(); duplicate years are
collapsed by averaging.
Usage
read_cpi(path_cpi)
Arguments
path_cpi |
Path to the CPI Excel file. |
Value
A data.frame with two columns: Year (integer) and CPI
(numeric), sorted ascending by Year. The function errors (it does not
return NULL) when the file is missing or no date/CPI column can be
identified.
See Also
Rescue short-run channel test
Description
Evaluates the short-run relationship from X to Y factors using three complementary, valid pieces of evidence: (1) a coupling null obtained by circularly time-shifting the X residual (breaking its alignment with Y while preserving its own autocorrelation – the rotation-based null used previously was invariant and therefore vacuous); (2) an expanding-window out-of-sample comparison of forecasts with and without lagged X, with a Clark-West test of the improvement; and (3) a Granger causality test.
Usage
rescue_short_run_channel(
results_robust,
lag = 1,
B = 1000,
seed = NULL,
ridge = 0.001,
oos_start = 0.6,
verbose = TRUE
)
Arguments
results_robust |
Output from run_complete_factor_analysis_robust() |
lag |
Number of lags for the model (default: 1) |
B |
Number of null replicates (default: 1000) |
seed |
Random seed for reproducibility (default: NULL) |
ridge |
Ridge regularization parameter (default: 0.001) |
oos_start |
Proportion of data to use for training (default: 0.6) |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
p_values,p_values_fdrCoupling p-values (time-shift null) and their Benjamini-Hochberg adjustment.
OOSOut-of-sample RMSE/R2 comparison plus the Clark-West test (
cw_stat,cw_p).granger_pGranger causality p-value.
observed,sim_quantiles,null_statsCoupling statistics and their null distributions.
Null hypothesis test for X->Y factor coupling
Description
Tests whether the observed association between the lagged X factor space and the Y factor space is stronger than expected when the X->Y temporal alignment is broken.
Usage
rotation_null_test(
scores_X,
scores_Y,
lag = 1,
B = 1000,
seed = 123,
compute = c("procrustes", "cca", "principal", "dynbeta"),
null_method = c("circular_shift", "rotation"),
progress = TRUE
)
Arguments
scores_X |
Factor scores from the first dataset (T x Kx). |
scores_Y |
Factor scores from the second dataset (T x Ky). |
lag |
Number of lags for the X->Y relationship (default 1). |
B |
Number of null replicates (default 1000). |
seed |
Random seed for reproducibility (default 123). The RNG state is restored on exit. |
compute |
Statistics to compute: any of 'procrustes', 'cca', 'principal', 'dynbeta'. |
null_method |
Null-generating mechanism: "circular_shift" (default) or "rotation" (invariance diagnostic only). |
progress |
Logical, show progress bar (default TRUE). |
Value
List with components:
observedObserved statistics.
null_statsList of length-
Bnull vectors per statistic.p_valuesMonte Carlo one-sided p-values
(1 + #{null >= obs}) / (B + 1).p_values_fdrBenjamini-Hochberg FDR-adjusted p-values.
paramsTest settings, including the seed and null method.
Why not a rotation null
The coupling statistics used here (Procrustes nuclear-norm ratio, canonical
correlations, principal angles, dynamic-beta Frobenius norm) are
invariant to orthogonal rotation of either factor space: rotating Y by
an orthogonal matrix leaves all of them unchanged, so a rotation-based null is
degenerate (its "distribution" is a point mass at the observed value and every
p-value is ~1). The default null (null_method = "circular_shift")
instead applies a random circular time shift to Y, which preserves each
series' own autocorrelation while destroying the cross-series alignment – the
statistics then genuinely vary under the null. null_method = "rotation"
is retained only as an invariance diagnostic and will (correctly) yield
non-significant results.
Normalize matrix rows to sum to one
Description
Scales each row of a matrix so that its elements sum to 1. Handles zero-sum rows by leaving them unchanged (avoiding division by zero).
Usage
row_norm1(M)
Arguments
M |
Numeric matrix to be row-normalized. |
Value
Matrix of the same dimensions as M with each row summing to 1
(except rows that originally summed to zero).
Examples
M <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
M_norm <- row_norm1(M)
rowSums(M_norm) # Should be c(1, 1)
Complete factor-OU convergence analysis pipeline
Description
Executes the end-to-end analysis workflow: data preparation, PLS-based factor extraction, DFM estimation, Factor-OU/AR(1) estimation, convergence tests, and robustness checks. This is the main user-facing function.
Usage
run_complete_factor_analysis_robust(
X_matrix,
Y_matrix,
TMG = NULL,
COM_matrix = NULL,
SPVR_matrix = NULL,
CA = NULL,
sector_names = NULL,
max_comp = 3,
dfm_lags = 1,
ou_chains = 4,
ou_iter = 2000,
skip_ou = FALSE,
run_convergence_tests = TRUE,
make_plots = FALSE,
plot_file = NULL,
seed = 123,
path_cpi = NULL,
path_weights = NULL,
verbose = TRUE
)
Arguments
X_matrix |
Matrix of first set of variables |
Y_matrix |
Matrix of second set of variables |
TMG |
Optional total Marxist gross output (default: NULL) |
COM_matrix |
Optional commodity-composition matrix (default: NULL) |
SPVR_matrix |
Optional surplus-value-rate matrix (default: NULL) |
CA |
Optional additional covariates (default: NULL) |
sector_names |
Vector of sector names (default: NULL) |
max_comp |
Maximum number of components (default: 3) |
dfm_lags |
Number of lags for DFM (default: 1) |
ou_chains |
Number of MCMC chains for OU estimation (default: 4) |
ou_iter |
Number of MCMC iterations (default: 2000) |
skip_ou |
Logical, skip OU estimation (default: FALSE) |
run_convergence_tests |
Logical, run convergence tests (default: TRUE) |
make_plots |
Logical, draw the factor-dynamics figure (default: FALSE).
When |
plot_file |
Optional path for the saved figure (default: NULL). |
seed |
Integer master seed for the stochastic steps (default: 123). |
path_cpi |
Path to CPI data (default: NULL) |
path_weights |
Path to weights data (default: NULL) |
verbose |
Logical; print progress and diagnostic information. Default |
Details
This function orchestrates the complete analysis: data validation,
PLS factor extraction, VAR-based DFM, Bayesian factor-OU/AR(1), convergence
tests (stationary band on phi, Johansen cointegration, residual unit-root
with caveats) and robustness tests (time-shift permutation, reweighting,
delete-one-sector jackknife). Plotting is opt-in (make_plots); by
default no graphics device is opened and no file is written, so the function
is safe in batch/non-interactive use.
Value
List with components:
data_prepScaled (possibly auxiliary-augmented) X and Y.
selectionPLS component selection (see
select_optimal_components_safe).factorsPLS-extracted factors (
scores_X,scores_Y) and interpretation.dfmDFM/VAR estimation (see
estimate_DFM).factor_ouFactor-OU/AR(1) estimates (see
estimate_factor_OU), including MCMC diagnostics.convergence_testsConvergence and robustness test results (if
run_convergence_tests = TRUE).
See Also
estimate_DFM, estimate_factor_OU,
run_convergence_robustness_tests, visualize_factor_dynamics
Examples
set.seed(123)
X <- matrix(rnorm(100 * 10), 100, 10)
Y <- X + matrix(rnorm(100 * 10, 0, 0.5), 100, 10)
results <- run_complete_factor_analysis_robust(
X_matrix = X, Y_matrix = Y,
max_comp = 3, dfm_lags = 1,
skip_ou = TRUE, # set FALSE to run the Bayesian factor-OU model
verbose = FALSE
)
str(results$dfm$r2_global)
Run comprehensive robustness test suite
Description
Executes all available robustness tests (permutation, reweighting, jackknife) and synthesizes results into an integrated assessment.
Usage
run_convergence_robustness_tests(
results_robust,
X_matrix,
Y_matrix,
path_cpi = NULL,
path_weights = NULL,
sector_names = NULL,
run_permutation = TRUE,
run_reweighting = FALSE,
run_jackknife = TRUE,
run_leadlag = FALSE,
run_common_factor = FALSE,
sensitivity_analysis = TRUE,
verbose = TRUE
)
Arguments
results_robust |
Output from run_complete_factor_analysis_robust() |
X_matrix |
Matrix of first set of variables |
Y_matrix |
Matrix of second set of variables |
path_cpi |
Path to CPI data (default: NULL) |
path_weights |
Path to weights data (default: NULL) |
sector_names |
Vector of sector names (default: NULL) |
run_permutation |
Logical, run permutation test (default: TRUE) |
run_reweighting |
Logical, run reweighting test (default: FALSE) |
run_jackknife |
Logical, run jackknife test (default: TRUE) |
run_leadlag |
Logical, run lead-lag test (default: FALSE) |
run_common_factor |
Logical, run common factor test (default: FALSE) |
sensitivity_analysis |
Logical, run sensitivity analysis (default: TRUE) |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
permutationResults from permutation test.
reweightingResults from reweighting test (if applicable).
jackknifeResults from jackknife test (if requested).
summaryData frame summarizing all tests.
overall_robustLogical indicating if convergence is robust across all tests.
Run the coupling null test on complete analysis results
Description
Wrapper that extracts factors from a complete analysis object and runs
rotation_null_test with the corrected time-shift null.
Usage
run_rotation_null_on_results(
results_robust,
lag = 1,
B = 1000,
seed = 42,
null_method = c("circular_shift", "rotation"),
compute = c("procrustes", "cca", "principal", "dynbeta")
)
Arguments
results_robust |
Output from |
lag |
Number of lags for the X->Y relationship (default 1). |
B |
Number of null replicates (default 1000). |
seed |
Random seed for reproducibility (default 42). |
null_method |
Null mechanism passed to |
compute |
Statistics to compute: 'procrustes', 'cca', 'principal', 'dynbeta'. |
Value
List. Output from rotation_null_test.
Select optimal number of PLS components with cross-validation
Description
Determines the optimal number of Partial Least Squares (PLS) components using k-fold or leave-one-out cross-validation, with robust error handling. The number of components is chosen with the one-standard-error rule (the most parsimonious model whose cross-validated RMSEP is within one standard error of the minimum), falling back to an R2 elbow and then to the RMSEP minimizer.
Usage
select_optimal_components_safe(
X,
Y,
max_comp = 15,
validation = c("CV", "LOO"),
segments = NULL,
seed = NULL,
verbose = TRUE
)
Arguments
X |
First data matrix (predictors), T x p. |
Y |
Second data matrix (responses), T x q (or a vector). |
max_comp |
Maximum number of components to test. Default 15. It is
internally capped at |
validation |
Character; cross-validation type, one of |
segments |
Integer; number of folds when |
seed |
Optional integer. With |
verbose |
Logical; print progress and diagnostic information. Default
|
Details
The function fits PLS models with 1 to max_comp components,
evaluates each via cross-validation, and selects the number that minimizes
prediction error while avoiding overfitting (one-SE rule).
Value
A list with components:
optimal_ncompInteger, number of components selected.
pls_modelFitted
pls::plsrmodel (allmax_compcomponents, validated).R2_cvCross-validated R2 at
optimal_ncomp.var_explainedPercent of Y variance explained (CV) at
optimal_ncomp.rmsep_cvNumeric vector of cross-validated RMSEP for 1..
max_compcomponents.validationThe validation method actually used.
Classical cointegration control (Johansen trace or eigen)
Description
Runs Johansen's cointegration test on the first min(2, ncol(X), ncol(Y))
factors from scores_X and scores_Y, and counts the number of
cointegrating relations at the 5% level.
Usage
test_cointegration_control(
factors_data,
max_lag = 4,
type = "trace",
ecdet = "const",
verbose = TRUE
)
Arguments
factors_data |
A list with matrices |
max_lag |
Integer; maximum lag |
type |
Character; Johansen test type, one of |
ecdet |
Character; deterministic terms, e.g. |
verbose |
Logical; print progress and a summary of the test. Default |
Details
This function requires the optional package urca (declared in
Suggests). It does not attempt to install packages at runtime; if
urca is unavailable, an informative error is thrown.
The 5% critical values are taken from the "5pct" column of the
cval slot returned by urca::ca.jo().
Value
A list with:
-
test: theurca::ca.jofitted object, -
n_coint: integer number of cointegrating relations at 5%, -
vectors: matrix of cointegrating vectors (orNULLif none).
References
Johansen, S. (1991). Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica, 59(6), 1551-1580.
Johansen, S. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press.
See Also
Examples
if (requireNamespace("urca", quietly = TRUE)) {
set.seed(1)
T <- 120
X <- cbind(cumsum(rnorm(T)), cumsum(rnorm(T)))
Y <- cbind(cumsum(rnorm(T)), cumsum(rnorm(T)))
fd <- list(scores_X = X, scores_Y = Y)
out <- test_cointegration_control(fd, max_lag = 2, verbose = FALSE)
str(out)
}
Delete-one-sector jackknife of the X->Y coupling
Description
Genuine leave-one-sector-out jackknife: each sector (column) is dropped in
turn from both X and Y, the coupling is re-estimated, and the delete-one
replicates are used to compute the jackknife bias and standard error and to
rank sectors by influence. (The previous version dropped the top-k
highest-variance sectors a single time and was not a jackknife.)
Usage
test_jackknife_sectors(
X_matrix,
Y_matrix,
sector_names = NULL,
k_exclude = 3,
max_comp = 3,
seed = 123,
verbose = TRUE
)
Arguments
X_matrix |
Matrix of first set of variables (sectors in columns) |
Y_matrix |
Matrix of second set of variables (sectors in columns) |
sector_names |
Vector of sector names (default: column names of Y) |
k_exclude |
Number of most-influential sectors to report (default: 3) |
max_comp |
Maximum number of components (default: 3) |
seed |
Optional integer for reproducible component selection/diagnostics. |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
baselineFull-sample coupling.
jackknife_estimatesNamed vector of delete-one coupling estimates.
biasJackknife bias estimate
(n-1)(mean(jk) - full).seJackknife standard error.
influencePer-sector change
full - jk_i(signed).retentionPer-sector
jk_i / full.influential_sectorsThe
k_excludemost influential sectors.robustTRUEif no single sector changes the coupling by more than 50 percent.
Leave-cluster-out robustness of the X->Y coupling
Description
Generalizes the delete-one-sector jackknife (test_jackknife_sectors()) to
delete-one-cluster: each cluster of sectors is dropped in turn from both
X and Y, and the coupling is re-estimated on the remaining sectors. Under
cross-sectional dependence of the input-output ("MIP") kind, a naive
leave-one-out is optimistic because the excluded sector's information leaks in
through intermediate demand; dropping an entire value chain forces the
prediction to rely on the general gravitation rather than on near-collinear
neighbours. This is the cross-sectional complement of the temporal
time-shift / block-bootstrap nulls already in the package
(rotation_null_test(), test_permutation_robustness()); it reuses the same
coupling pipeline (.coupling_estimate), it does not reimplement it.
Usage
test_leave_cluster_out(
X_matrix,
Y_matrix,
cluster_map = NULL,
n_clusters = 3,
fallback = c("correlation", "com"),
com = NULL,
max_comp = 3,
seed = 123,
verbose = TRUE
)
Arguments
X_matrix, Y_matrix |
Numeric matrices (sectors in columns; |
cluster_map |
Optional sector-to-cluster map (see "Cluster map"). |
n_clusters |
Integer; number of clusters for the fallback (ignored when
|
fallback |
Fallback rule when |
com |
Optional per-sector organic-composition vector for |
max_comp |
Maximum number of PLS components (default |
seed |
Optional integer for reproducible component selection/diagnostics. |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
baselineFull-sample coupling.
cluster_estimatesNamed vector of leave-one-cluster coupling estimates.
cluster_membersNamed list of the sectors in each cluster.
cluster_sizesNamed integer vector of cluster sizes.
bias,seDelete-a-group jackknife bias and standard error.
influencePer-cluster signed change
baseline - lco_c.retentionPer-cluster
lco_c / baseline.influential_clustersClusters ordered by absolute influence.
n_clustersNumber of clusters.
cluster_mapThe resolved per-sector cluster labels.
map_source"user"or"fallback:<method>".robustTRUEif no single cluster changes the coupling by more than 50 percent.
Cluster map (pluggable)
Supply cluster_map as the genuine input-output partition (a per-sector label
vector, or a named list mapping each cluster to its sector names/indices). When
it is NULL, a documented fallback partition is built with
build_cluster_map() ("correlation" or "com"); the fallback is a stopgap
proxy, not a demand-linkage partition, and a message flags its use.
Statistical layer (do not over-read)
bias and se are the delete-a-group (block) jackknife estimates with the
(g-1)/g factor over the g cluster-deletion replicates. They are
well-calibrated for roughly balanced clusters; with strongly unequal MIP
clusters they are an approximate, conservative summary, and the primary
outputs are the per-cluster influence/retention and the robust verdict.
The verdict is purely a robustness diagnostic, not a coupling point estimate.
See Also
test_jackknife_sectors(), build_cluster_map()
Permutation-based robustness test for X->Y coupling
Description
Tests whether the estimated X->Y coupling strength is stronger than expected when the temporal alignment between X and Y is broken. The null is generated by circularly time-shifting the Y factors, which preserves Y's own autocorrelation while destroying the cross-series relationship.
Usage
test_permutation_robustness(
factors_data,
data_prep,
n_perms = 100,
seed = 123,
use_stan = TRUE,
chains = 4,
iter = 2000,
verbose = TRUE
)
Arguments
factors_data |
List with |
data_prep |
Prepared data object (passed to the OU estimator). |
n_perms |
Number of permutations (default: 100). |
seed |
Random seed for reproducibility (default: 123). Now honoured. |
use_stan |
Logical, use Stan for estimation (default: TRUE). |
chains |
Number of MCMC chains (default: 4). |
iter |
Number of MCMC iterations (default: 2000). |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
baselineObserved coupling strength.
permuted_mean,permuted_median,permuted_sd-
Summary of the null coupling distribution.
p_valueMonte Carlo one-sided p-value
(1 + #{null >= baseline}) / (n + 1).z_scoreStandardized effect size.
robustTRUEifp_value < 0.05.n_successfulNumber of usable replicates.
Why not a column permutation
The previous version permuted the columns of the Y factors. The coupling strength is a Frobenius norm of the coupling matrix and is invariant to such a relabelling, so the old null was degenerate (every replicate reproduced the baseline). Shuffling time, not factor labels, is what tests coupling.
Reweighting-based robustness test
Description
Tests sensitivity of the coupling result to alternative sectoral weighting
schemes. For each perturbed weight vector the sectoral price levels are
obtained from the canonical closed-form conjugate disaggregation
(BayesianDisaggregation::disaggregate_conjugate(), a Kalman/RTS smoother that
conditions genuinely on the aggregate CPI), and the (fallback) factor-OU
coupling is re-estimated. The coefficient of variation of the coupling across
schemes summarizes the sensitivity.
Usage
test_reweighting_robustness(
path_cpi,
path_weights,
X_matrix,
max_comp = 3,
seed = 123,
verbose = TRUE
)
Arguments
path_cpi |
Path to CPI data file |
path_weights |
Path to weights data file |
X_matrix |
Matrix of variables |
max_comp |
Maximum number of components (default: 3) |
seed |
Optional integer for reproducible priors (default: 123). |
verbose |
Logical; print progress and diagnostic information. Default |
Value
List with components:
resultsPer-scheme list with
couplingandhalf_life_mean.cv_couplingCoefficient of variation of coupling across schemes.
robustTRUEifcv_coupling < 0.3.
Canonical disaggregation
Earlier versions used a deterministic weight blend
(run_disaggregation_custom_prior) that did not condition on the CPI. That
method is removed; the single source of truth is now the conjugate engine in
BayesianDisaggregation. A constant-in-time custom prior is replicated
across periods to form the value-added (VAB) weight matrix W, and the point
estimate used is the posterior median ($phi_summary$median), i.e. the
RTS-smoothed sectoral levels, now genuinely conditioned on the CPI.
Convert localized number strings to numeric
Description
Converts character strings that may use either the European convention (comma decimal separator, period thousands separator, e.g. "1.234,56") or the Anglo-Saxon convention (period decimal separator, comma thousands separator, e.g. "1,234.56") to numeric. Plain numbers and already-numeric inputs are returned unchanged. The function is vectorized.
Usage
to_num_commas(x)
Arguments
x |
Numeric value or character vector with localized number strings. |
Details
Disambiguation rules, applied per element after stripping whitespace:
If both
.and,are present, the rightmost of the two is treated as the decimal mark and the other as the thousands separator.If only commas are present: a single comma is treated as a decimal mark; multiple commas are treated as thousands separators and removed.
If only periods are present: a single period is treated as a decimal mark (Anglo-Saxon); multiple periods are treated as thousands separators and removed.
The single ambiguous case "1.234" (one period, no comma) is interpreted as the decimal 1.234, not as 1234.
Value
Numeric vector the same length as x. Entries that cannot be
parsed become NA.
Examples
to_num_commas("1.234,56") # European -> 1234.56
to_num_commas("1,234.56") # US -> 1234.56
to_num_commas("1234.56") # plain -> 1234.56
to_num_commas(1234.56) # numeric -> 1234.56
Visualize factor dynamics comprehensively
Description
Creates a multi-panel visualization summarizing the factor model: X and Y
factor scores over time and the DFM and OU half-lives. Plotting is a side
effect; nothing is written unless save_plot = TRUE.
Usage
visualize_factor_dynamics(
dfm_result,
ou_result,
factors_data,
save_plot = FALSE,
plot_file = NULL,
use_device = "default",
verbose = TRUE
)
Arguments
dfm_result |
Result object from DFM analysis |
ou_result |
Result object from OU estimation |
factors_data |
Data frame with factor information |
save_plot |
Logical, save plot to a PDF file (default: FALSE) |
plot_file |
File name for the saved plot. If |
use_device |
Graphics device to use: 'default', 'pdf', 'none' (default: 'default') |
verbose |
Logical; print progress and diagnostic information. Default |
Value
Invisibly returns TRUE when all four panels were created.
Simple factor dynamics visualization
Description
Creates a simplified multi-panel plot of factor scores over time with minimal dependencies. Useful for quick diagnostics.
Usage
visualize_factor_dynamics_simple(
factors_data,
output_file = NULL,
verbose = TRUE
)
Arguments
factors_data |
List containing factor scores. |
output_file |
Character string. Optional file path for saving. Default is |
verbose |
Logical; print progress and diagnostic information. Default |
Value
Invisibly returns NULL.
Examples
data <- list(scores_X = matrix(rnorm(100), ncol = 2),
scores_Y = matrix(rnorm(100), ncol = 2))
tmp_file <- file.path(tempdir(), "test_plot.pdf")
visualize_factor_dynamics_simple(data, output_file = tmp_file)
Per-sector and panel stationarity / mean-reversion of the wedge
Description
Tests whether the transformation wedge is a bounded, mean-reverting series
(evidence that the redistribution gravitates and does not diverge) rather
than a random walk. For each sector it fits a discrete AR(1)
W_t = c + \rho W_{t-1} + e_t (reporting \rho, the implied
mean-reversion speed \kappa = -\log\rho and half-life
\log(0.5)/\log\rho) and an augmented Dickey-Fuller test
(urca::ur.df, drift). The panel summary reports the fraction of
sectors rejecting the unit-root null and the distribution of reversion
speeds. Boundedness is reported as the range and the max-abs over standard
deviation. This is a descriptive diagnostic; no single pooled p-value is
claimed (anti-overreach: panel unit-root nulls require their own tabulated
distribution).
Usage
wedge_stationarity(W, adf_lags = 2, alpha = 0.05, verbose = TRUE)
Arguments
W |
Numeric matrix |
adf_lags |
Maximum lag order for the ADF auxiliary regression
( |
alpha |
Significance level for the ADF unit-root rejection. Default
|
verbose |
Logical; print a short panel summary. Default |
Value
A list with per_sector (a data frame: rho,
kappa, half_life, adf_stat, adf_crit,
reject_unit_root, range, maxabs_over_sd) and
panel (fraction rejecting, median/IQR of rho and
half_life, and the boundedness summary).
Examples
if (requireNamespace("urca", quietly = TRUE)) {
set.seed(1); Tn <- 60; S <- 4
W <- sapply(1:S, function(s) as.numeric(stats::arima.sim(list(ar = 0.6), Tn)))
W <- W - rowMeans(W) # impose zero-sum like a real wedge
wedge_stationarity(W, verbose = FALSE)$panel$frac_reject
}