Help for package dcce

Title:

Dynamic Common Correlated Effects Estimation for Panel Data

Version:

0.4.2

Description:

Estimates heterogeneous coefficient models for large panels with cross-sectional dependence. Implements the Mean Group (MG) estimator of Pesaran and Smith (1995) <doi:10.1016/0304-4076(94)01644-F>, the Common Correlated Effects (CCE) and Dynamic CCE (DCCE) estimators of Pesaran (2006) <doi:10.1111/j.1468-0262.2006.00692.x> and Chudik and Pesaran (2015) <doi:10.1016/j.jeconom.2015.03.007>, the regularized CCE of Juodis (2022), the Augmented Mean Group (AMG) of Eberhardt and Teal (2010), the Interactive Fixed Effects (IFE) estimator of Bai (2009) <doi:10.3982/ECTA6135>, and long-run estimators including Cross-Sectionally augmented Distributed Lag (CS-DL), Cross-Sectionally augmented Autoregressive Distributed Lag (CS-ARDL), and Pooled Mean Group (PMG) (Chudik et al. 2016; Shin et al. 1999). Also provides rolling-window estimation, high-dimensional fixed effect absorption, spatial CCE via user-supplied weight matrices, and structural break tests (Chow and sup-Wald) following Andrews (1993), Bai and Perron (1998), and Ditzen, Karavias and Westerlund (2024). Supplies a comprehensive cross-sectional dependence (CD) test suite including the Pesaran (2015) CD test <doi:10.1080/07474938.2014.956623>, the Juodis and Reese (2022) randomized weighted CD (CDw) test, the Baltagi et al. (2012) bias-adjusted weighted CD (CDw+) test, the Fan et al. (2015) Power Enhancement Approach (PEA) test, and the Pesaran and Xie (2021) bias-corrected CD (CD*) test. 
Further diagnostics include the Pesaran (2007) Cross-sectionally Augmented IPS (CIPS) panel unit root test <doi:10.1002/jae.951>, the Westerlund (2007) panel cointegration tests, the Dumitrescu and Hurlin (2012) panel Granger causality test, the Im-Pesaran-Shin (IPS) and Levin-Lin-Chu (LLC) panel unit root tests, the Pedroni (2004) and Kao (1999) residual cointegration tests, the Swamy (1970) and Pesaran and Yamagata (2008) slope homogeneity tests, a Hausman-type test for MG versus pooled, the exponent of cross-sectional dependence from Bailey et al. (2016) <doi:10.1002/jae.2490>, information criteria for Cross-Sectional Average (CSA) selection, the rank condition classifier, impulse response functions, cross-section and wild bootstrap inference, and 'broom'-compatible methods.

License:

GPL (≥ 3)

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.3

Depends:

R (≥ 4.1.0)

Imports:

stats, Matrix, collapse (≥ 2.0.0), sandwich, generics, rlang (≥ 1.1.0), cli (≥ 3.0.0), tibble, Rcpp (≥ 1.0.0)

LinkingTo:

Rcpp, RcppArmadillo

Suggests:

broom, ggplot2, lifecycle, plm, testthat (≥ 3.0.0), knitr, rmarkdown, marginaleffects, parallel

VignetteBuilder:

knitr

Config/testthat/edition:

NeedsCompilation:

yes

Packaged:

2026-05-02 12:00:44 UTC; musta

Author:

Mustapha Wasseja [aut, cre]

Maintainer:

Mustapha Wasseja <muswaseja@gmail.com>

Repository:

CRAN

Date/Publication:

2026-05-05 18:20:18 UTC

dcce: Dynamic Common Correlated Effects Estimation for Panel Data

Description

Estimates heterogeneous coefficient models for large panels with cross-sectional dependence. Implements the Mean Group (MG) estimator of Pesaran and Smith (1995) doi:10.1016/0304-4076(94)01644-F, the Common Correlated Effects (CCE) and Dynamic CCE (DCCE) estimators of Pesaran (2006) doi:10.1111/j.1468-0262.2006.00692.x and Chudik and Pesaran (2015) doi:10.1016/j.jeconom.2015.03.007, the regularized CCE of Juodis (2022), the Augmented Mean Group (AMG) of Eberhardt and Teal (2010), the Interactive Fixed Effects (IFE) estimator of Bai (2009) doi:10.3982/ECTA6135, and long-run estimators including Cross-Sectionally augmented Distributed Lag (CS-DL), Cross-Sectionally augmented Autoregressive Distributed Lag (CS-ARDL), and Pooled Mean Group (PMG) (Chudik et al. 2016; Shin et al. 1999). Also provides rolling-window estimation, high-dimensional fixed effect absorption, spatial CCE via user-supplied weight matrices, and structural break tests (Chow and sup-Wald) following Andrews (1993), Bai and Perron (1998), and Ditzen, Karavias and Westerlund (2024). Supplies a comprehensive cross-sectional dependence (CD) test suite including the Pesaran (2015) CD test doi:10.1080/07474938.2014.956623, the Juodis and Reese (2022) randomized weighted CD (CDw) test, the Baltagi et al. (2012) bias-adjusted weighted CD (CDw+) test, the Fan et al. (2015) Power Enhancement Approach (PEA) test, and the Pesaran and Xie (2021) bias-corrected CD (CD*) test. 
Further diagnostics include the Pesaran (2007) Cross-sectionally Augmented IPS (CIPS) panel unit root test doi:10.1002/jae.951, the Westerlund (2007) panel cointegration tests, the Dumitrescu and Hurlin (2012) panel Granger causality test, the Im-Pesaran-Shin (IPS) and Levin-Lin-Chu (LLC) panel unit root tests, the Pedroni (2004) and Kao (1999) residual cointegration tests, the Swamy (1970) and Pesaran and Yamagata (2008) slope homogeneity tests, a Hausman-type test for MG versus pooled, the exponent of cross-sectional dependence from Bailey et al. (2016) doi:10.1002/jae.2490, information criteria for Cross-Sectional Average (CSA) selection, the rank condition classifier, impulse response functions, cross-section and wild bootstrap inference, and 'broom'-compatible methods.

Author(s)

Maintainer: Mustapha Wasseja muswaseja@gmail.com

Apply absorb projection to a panel

Description

Demeans the dependent variable, the regressors, and any user-supplied CSA source variables using the given grouping factors. Returns the panel with the affected numeric columns overwritten by their within transforms. Non-numeric columns and columns not listed in the target set are left untouched.

Usage

.absorb_apply(panel, target_vars, groups)

Arguments

panel

A panel data.frame from .make_panel().

target_vars

Character vector: column names to be demeaned.

groups

A list of grouping factors from .absorb_resolve().

Value

The panel with the target columns demeaned.

Alternating projections for multi-factor demeaning

Description

Iteratively projects X off each grouping factor until convergence. Correia (2016) shows this converges geometrically for any finite number of factors; in practice 10-30 iterations suffice for the tolerances used below.

Usage

.absorb_demean_multi(X, groups, tol = 1e-10, max_iter = 200L)

Arguments

X

Numeric vector or matrix.

groups

A list of factors (from .absorb_resolve()).

tol

Numeric: convergence tolerance on max absolute change.

max_iter

Integer: hard iteration cap.

Value

Within-transformed X.
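The alternating-projections idea can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: demean_multi_sketch() is a hypothetical name, and ave() stands in for whatever group-mean routine the package actually uses.

```r
# Hypothetical sketch of multi-factor demeaning by alternating projections.
demean_multi_sketch <- function(x, groups, tol = 1e-10, max_iter = 200L) {
  for (iter in seq_len(max_iter)) {
    x_old <- x
    for (g in groups) {
      x <- x - ave(x, g)          # subtract the group means of factor g
    }
    # Stop once a full sweep over all factors no longer changes x.
    if (max(abs(x - x_old)) < tol) break
  }
  x
}

# Small unbalanced firm-by-year design (three cells removed).
set.seed(1)
d <- expand.grid(year = 1:6, firm = 1:4)[-c(2, 9, 17), ]
x <- rnorm(nrow(d))
groups <- list(firm = factor(d$firm), year = factor(d$year))

x_dm <- demean_multi_sketch(x, groups)

# After convergence the group means are numerically zero for both factors.
max(abs(tapply(x_dm, groups$firm, mean)))
max(abs(tapply(x_dm, groups$year, mean)))
```

With a single factor the inner loop runs once per sweep and the procedure reduces to ordinary group-mean demeaning.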

Within transform: demean numeric columns by a single factor

Description

Within transform: demean numeric columns by a single factor

Usage

.absorb_demean_one(X, g)

Arguments

X

Numeric vector or matrix.

g

A factor of length matching the rows of X.

Value

Demeaned X.

Resolve absorb argument to a list of grouping vectors

Description

Resolve absorb argument to a list of grouping vectors

Usage

.absorb_resolve(absorb, panel)

Arguments

absorb

Either NULL, a character vector of column names, or a one-sided formula like ~ industry + year.

panel

A panel data.frame.

Value

A named list of grouping factors (one per variable) or NULL if no absorb requested.

Ahn-Horenstein ER criterion

Description

Eigenvalue Ratio criterion: k* = argmax_{k=1,...,K_max} lambda[k] / lambda[k+1]

Usage

.ahn_horenstein_er(eigenvalues, K_max)

Arguments

eigenvalues

Numeric vector of eigenvalues (decreasing order).

K_max

Integer: maximum number of factors to consider.

Value

Integer: estimated number of factors.
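As a rough illustration of the eigenvalue-ratio rule, the sketch below picks the k that maximises lambda[k] / lambda[k+1]. The function name er_sketch() and the eigenvalues are made up for the example, and the input is assumed to be sorted in decreasing order with more than K_max entries.

```r
# Hypothetical helper: eigenvalue-ratio estimate of the number of factors.
er_sketch <- function(eigenvalues, K_max) {
  stopifnot(length(eigenvalues) > K_max)
  k <- seq_len(K_max)
  ratios <- eigenvalues[k] / eigenvalues[k + 1]
  which.max(ratios)
}

# Two dominant eigenvalues followed by a flat tail:
# the ratios are 1.67, 15, 1.33, 1.25, 1.2, so the maximum is at k = 2.
lambda <- c(50, 30, 2, 1.5, 1.2, 1.0, 0.9)
k_hat <- er_sketch(lambda, K_max = 5)
k_hat
```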

Ahn-Horenstein GR criterion

Description

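Growth Ratio criterion: k* = argmax_{k=1,...,K_max} GR(k), with GR(k) = log(1 + lambda[k]/V_k) / log(1 + lambda[k+1]/V_{k+1}), where V_k = sum(lambda[(k+1):m]) is the sum of the eigenvalues beyond the k-th and m is the number of eigenvalues.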

Usage

.ahn_horenstein_gr(eigenvalues, K_max)

Arguments

eigenvalues

Numeric vector of eigenvalues.

K_max

Integer: maximum number of factors.

Value

Integer: estimated number of factors.

Attach the cumulative CDP to a panel as a new column

Description

Merges the CDP into the panel by time period (observations dated before the first differenced period receive NA), then cumulates within each unit to recover the level-form process \sum_{s \le t} \hat\mu_s.

Usage

.amg_attach_cdp_level(panel, cdp_df)

Arguments

panel

A panel data.frame.

cdp_df

Two-column data.frame from .amg_common_dynamic_process().

Value

The panel augmented with a cdp_level column.

Compute the Common Dynamic Process from a pooled FD regression

Description

Runs a pooled first-difference regression with time dummies and returns a two-column data.frame mapping each time period to its estimated common dynamic process value. The reference time period (the first one after differencing) has CDP 0.

Usage

.amg_common_dynamic_process(panel, y_name, x_names)

Arguments

panel

A panel data.frame from .make_panel().

y_name

Character: dependent variable name.

x_names

Character: regressor names (level variables).

Value

A two-column data frame: time and cdp.
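The pooled first-difference regression with time dummies can be sketched with lm() on simulated data. Everything below is an assumption made for illustration (the simulated panel, the helper fd_within(), and the column names); it is not the package's implementation.

```r
# Simulated balanced panel with an unobserved common process f.
set.seed(1)
n_units <- 5
n_periods <- 6
d <- expand.grid(time = seq_len(n_periods), unit = seq_len(n_units))
f <- cumsum(rnorm(n_periods))
d$x <- rnorm(nrow(d))
d$y <- 0.5 * d$x + f[d$time] + rnorm(nrow(d), sd = 0.1)

# First differences within each unit (rows are sorted by unit, then time).
fd_within <- function(v, unit) ave(v, unit, FUN = function(z) c(NA, diff(z)))
d$dy <- fd_within(d$y, d$unit)
d$dx <- fd_within(d$x, d$unit)

# Pooled FD regression with time dummies; the first differenced period
# is the reference category, so its CDP is 0 by construction.
fd <- d[!is.na(d$dy), ]
fd$time_f <- factor(fd$time)
fit <- lm(dy ~ dx + time_f, data = fd)

dummies <- coef(fit)[grep("^time_f", names(coef(fit)))]
cdp_df <- data.frame(time = sort(unique(fd$time)),
                     cdp  = c(0, unname(dummies)))
cdp_df
```

Cumulating cdp_df$cdp over time (for example with cumsum()) gives a level-form series of the kind the cdp_level column is described as holding.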

Andrews (1993) sup-Wald critical values

Description

Lookup for asymptotic critical values of the sup-Wald statistic under symmetric trimming, taken from Andrews (1993, Econometrica 61(4), Table I, columns for the sup-LM / sup-Wald test). Values are tabulated for q (number of tested parameters) from 1 to 10 and trimming fractions pi_0 in the set (0.01, 0.05, 0.10, 0.15, 0.20, 0.25). For values of q above 10 the function linearly extrapolates in q at the nearest tabulated pi_0. For intermediate pi_0 values the function uses the nearest tabulated column.

Usage

.andrews_sup_wald_cv(q, pi0)

Arguments

q

Integer: number of tested parameters.

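pi0

Numeric: symmetric trimming fraction pi_0.

Build group-specific cross-sectional averages

Description

Computes cross-sectional means within groups defined by column by. Useful for models with group-specific common factors.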

Usage

.build_cluster_csa(panel, vars, lags = 0L, by)

Arguments

panel

A panel data.frame from .make_panel().

vars

Character vector of variable names.

lags

Integer or named integer vector of CSA lags.

by

Character: name of the grouping variable in panel.

Value

panel with CSA columns appended (group-specific averages).

Build cross-sectional averages

Description

Computes cross-sectional means of vars at each time period and appends contemporaneous plus lagged CSAs as new columns. For unbalanced panels, means are computed using only observed units at each period.

Usage

.build_csa(panel, vars, lags = 0L)

Arguments

panel

A panel data.frame from .make_panel().

vars

Character vector of variable names to average.

lags

Integer: number of lags of CSAs to include (0 = contemporaneous only). Can also be a named integer vector for per-variable lags.

Value

The input panel with additional CSA columns appended.
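A minimal sketch of the computation on a toy unbalanced panel is given below. The column names csa_y and csa_y_L1 are chosen for the example only, and the lagged CSA is assumed to be the previous period's average.

```r
# Toy panel: unit C is missing y in period 3.
panel <- data.frame(
  unit = rep(c("A", "B", "C"), each = 4),
  time = rep(1:4, times = 3),
  y    = c(1, 2, 3, 4,  3, 4, 5, 6,  5, 6, NA, 8)
)

# Period-by-period mean over the units observed in that period.
csa      <- tapply(panel$y, panel$time, mean, na.rm = TRUE)
csa_time <- as.numeric(names(csa))
csa_val  <- as.numeric(csa)

# Broadcast the averages back to every row, then attach a one-period lag.
panel$csa_y    <- csa_val[match(panel$time, csa_time)]
panel$csa_y_L1 <- csa_val[match(panel$time - 1, csa_time)]

panel[panel$unit == "A", ]
```

In period 3 the average is taken over units A and B only, which is the behaviour described above for unbalanced panels.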

Build cross-sectional averages from a wider sample

Description

Computes cross-sectional means using full_panel (a broader sample) and merges them into panel (the estimation sample). Useful when the estimation sample is a subset of the available data.

Usage

.build_global_csa(panel, full_panel, vars, lags = 0L)

Arguments

panel

A panel data.frame (estimation sample).

full_panel

A panel data.frame (wider sample for computing averages).

vars

Character vector of variable names.

lags

Integer or named integer vector of CSA lags.

Value

panel with CSA columns appended (computed from full_panel).

Build spatial (local) cross-sectional averages

Description

For each variable in vars and each unit i at time t, computes \bar y^W_{i,t} = \sum_j w_{ij} y_{j,t} where W is the row-normalised spatial weight matrix. Appends these as new columns on the panel, alongside lagged versions when requested.

Usage

.build_spatial_csa(panel, vars, W, lags = 0L)

Arguments

panel

A panel data.frame from .make_panel().

vars

Character vector of variables to build spatial CSAs for.

W

A validated spatial weight matrix from .spatial_validate_W(), with rows/columns labelled by unit id.

lags

Integer scalar or named integer vector of lag orders. Default 0L (contemporaneous only).

Details

Compared to .build_csa() (which appends a single global series per variable, broadcast to all units), the spatial variant produces unit-specific values — each unit sees its own neighbourhood average.

Value

The panel with new csa_* columns appended.
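With y arranged as an N x T matrix, the spatial average for all units and periods is a single matrix product. The sketch below uses a made-up three-unit weight matrix and is only meant to show the arithmetic; the object names are hypothetical.

```r
units <- c("A", "B", "C")

# Raw contiguity weights: A neighbours B and C; B and C neighbour only A.
W <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE,
            dimnames = list(units, units))
W <- W / rowSums(W)          # row-normalise so each row sums to one

# y as an N x T matrix (rows = units, columns = periods).
Y <- matrix(c(1, 2,
              3, 4,
              5, 6), nrow = 3, byrow = TRUE,
            dimnames = list(units, c("t1", "t2")))

# Element [i, t] equals sum_j w_ij * y_jt.
Y_bar_W <- W %*% Y
Y_bar_W
```

Unit A receives the mean of B and C (4 and 5), while B and C each receive A's own values (1 and 2), so the averages differ across units within a period.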

Internal: build a suggested dcce() call string

Description

Internal: build a suggested dcce() call string

Usage

.build_suggested_call(
  rec_model,
  unit_index,
  time_index,
  y_name,
  x_names,
  rec_lags,
  data_name
)

Build the regressor matrix for a single unit

Description

Build the regressor matrix for a single unit

Usage

.build_unit_X(panel, idx, x_names, csa_colnames, include_constant, unit_trend)

Arguments

panel

Panel data.frame.

idx

Integer indices for this unit.

x_names

Character: structural regressor names.

csa_colnames

Character: CSA column names (or NULL).

include_constant

Logical: include intercept?

unit_trend

Logical: include a unit-specific trend?

Bias-adjusted weighted CD (CDw+) test

Description

Implements a bias-adjusted LM-style test with random sign weighting. The base statistic is the Baltagi, Feng & Kao (2012) bias-adjusted LM:

LM_{adj} = \sqrt{\frac{1}{N(N-1)}} \sum_{i<j} \frac{(T_{ij}-1) \hat\rho_{ij}^2 - 1}{\sqrt{2}},

which is asymptotically standard normal under the null. The "weighted plus" variant multiplies each pairwise term by random Rademacher weights w_i w_j and averages the absolute value across draws, producing a statistic that is robust to heteroskedasticity and sensitive to sparse alternatives.

Usage

.cd_weighted_plus(em, n_reps)

Check panel balance

Description

Check panel balance

Usage

.check_balance(panel)

Arguments

panel

A panel data.frame from .make_panel().

Value

A list with elements is_balanced, T_i (named vector of per-unit time periods), T_min, T_max, T_bar, N.
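The balance summary amounts to counting observations per unit. A hedged base-R sketch that produces a list with the same element names is shown below; the toy panel and the use of split() are assumptions, not the package's code.

```r
# Toy panel: unit A has three periods, unit B has two.
panel <- data.frame(unit = c("A", "A", "A", "B", "B"),
                    time = c(1, 2, 3, 1, 2))

# Number of observed periods per unit, as a named integer vector.
T_i <- vapply(split(panel$time, panel$unit), length, integer(1))

balance <- list(
  is_balanced = length(unique(T_i)) == 1L,
  T_i   = T_i,
  T_min = min(T_i),
  T_max = max(T_i),
  T_bar = mean(T_i),
  N     = length(T_i)
)
str(balance)
```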

Check rank of augmented CSA matrix

Description

Verifies that the CSA matrix (with lags) has full column rank. Issues a warning but continues if the matrix is rank deficient (MG estimates remain consistent per Chudik & Pesaran 2015, p. 398).

Usage

.check_csa_rank(Z, unit_id = "")

Arguments

Z

Numeric matrix: the CSA matrix for a given unit.

unit_id

Character: unit identifier used to label the warning.

Classify CS-ARDL regressor terms

Description

Given the dependent variable name, the list of regressor names, and the panel, classify each regressor term into:

y-lag (e.g. L(y,1))
contemporaneous or lagged x (grouped by base variable name)

Usage

.csardl_classify_terms(y_name, x_names)

Arguments

y_name

Character: dependent variable name.

x_names

Character vector: regressor names from the parsed formula.

Value

A list with

y_lag_terms: Character vector of y-lag term names.
y_lag_orders: Integer vector of the lag orders matching y_lag_terms.
x_groups: Named list: each element is a character vector of terms (level + lags) belonging to a single base regressor.
x_group_lags: Named list: each element is an integer vector of lag orders matching x_groups.

Aggregate unit-level long-run and adjustment results to the MG level

Description

Aggregate unit-level long-run and adjustment results to the MG level

Usage

.csardl_mg_lr(unit_lr)

Arguments

unit_lr

A list of unit-level results from .csardl_unit_lr().

Value

A list with MG long-run coefficients, MG variance, MG adjustment speed, and an inverse-variance weighted pooled long-run estimate (used by PMG).

Post-process a CS-ARDL fit to attach LR / adjustment information

Description

Called from dcce() after the unit-level OLS and MG aggregation are done. Classifies terms, computes unit-level LR and adjustment, aggregates to MG, and returns the augmented fit object.

Usage

.csardl_postprocess(fit)

Arguments

fit

A dcce_fit object from the main dcce() pipeline.

Value

The fit object augmented with lr_coef, lr_vcov, lr_se, adjustment, adjustment_se, and related fields.

Recover unit-level long-run coefficients and adjustment speed

Description

For a single unit, given the ARDL coefficient vector b and its variance V, compute the long-run coefficient on each base regressor and the speed of adjustment, with delta-method standard errors.

Usage

.csardl_unit_lr(b, V, classify)

Arguments

b

Numeric vector: unit-level coefficients from the ARDL regression (must have names matching the regressor columns).

V

Numeric matrix: unit-level variance-covariance matrix.

classify

List returned by .csardl_classify_terms().

Value

A list with

lr_coef: Named numeric vector of long-run coefficients.
lr_vcov: Variance-covariance matrix of lr_coef.
phi: Adjustment speed \varphi_i = -(1 - \sum \phi_p).
phi_se: Delta-method SE of phi.

Augment a CS-DL panel with first-difference lags of the regressors

Description

Given a panel and the list of base regressor names, attach columns for the first difference of each regressor and its lags 1, ..., p_x, and also the first difference of the dependent variable (used as the CS-DL LHS). The returned list contains the augmented panel plus the vectors of column names that should be used on the LHS and RHS of the CS-DL regression.

Usage

.csdl_augment(panel, y_name, x_names, p_x = 3L)

Arguments

panel

A panel data.frame with unit_var/time_var attributes.

y_name

Character scalar: dependent variable name.

x_names

Character vector: base regressor names (levels).

p_x

Integer: number of \Delta x lags to include (default 3).

Value

A list with

panel: Augmented panel with new columns.
y_diff_name: Name of the first-difference dependent variable.
rhs_terms: Character vector: level x + lagged \Delta x terms for the CS-DL regression.
lr_names: Character vector: column names whose coefficients are the long-run effects (one per base x).

Post-process a CS-DL fit to label long-run coefficients

Description

The coefficients on the level x terms are the long-run effects. This helper extracts them and stores them in the fit's lr_coef and lr_vcov fields so that downstream S3 methods can print a long-run block.

Usage

.csdl_postprocess(fit, lr_names)

Arguments

fit

A dcce_fit object.

lr_names

Character vector of x level names whose coefficients are the long-run effects.

Value

Augmented fit object.

Check whether the compiled C++ unit-OLS routines are available

Description

Check whether the compiled C++ unit-OLS routines are available

Usage

.dcce_cpp_available()

Delta Method

Description

Computes standard errors for nonlinear transformations of parameters using the delta method: V(g(b)) = G' V(b) G, where G = dg/db is the matrix of derivatives with one row per parameter and one column per element of g(b).

Usage

.delta_method(g, b, V, h = 1e-06)

Arguments

g

A function of the parameter vector returning the transformed values.

b

Numeric vector: parameter estimates.

V

Numeric matrix: variance-covariance of b.

h

Numeric: step size for numerical differentiation. Default 1e-6.

Value

A list with estimate (transformed values) and vcov (delta-method variance).
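A numerical delta method can be sketched with a forward-difference Jacobian. The function delta_sketch() below is a hypothetical stand-in written for this example; it builds the Jacobian J with one row per element of g(b), so the variance is J V J', which equals G' V G when G is the transpose of J.

```r
delta_sketch <- function(g, b, V, h = 1e-6) {
  g0 <- g(b)
  # Forward differences: column j holds the change in g when b[j] moves by h.
  J <- sapply(seq_along(b), function(j) {
    b_step <- b
    b_step[j] <- b_step[j] + h
    (g(b_step) - g0) / h
  })
  J <- matrix(J, nrow = length(g0), ncol = length(b))
  list(estimate = g0, vcov = J %*% V %*% t(J))
}

# Long-run effect beta / (1 - phi) with illustrative estimates.
b <- c(phi = 0.6, beta = 0.4)
V <- diag(c(0.01, 0.04))
out <- delta_sketch(function(p) p[["beta"]] / (1 - p[["phi"]]), b, V)

out$estimate        # 1
sqrt(out$vcov)      # delta-method standard error
```

Both partial derivatives equal 2.5 at these values, so the analytic variance is 2.5^2 * (0.01 + 0.04) = 0.3125, which the numerical result should match closely.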

Panel-aware differencing

Description

Computes k-th differences within each cross-sectional unit.

Usage

.diff_panel(panel, vars, k = 1L)

Arguments

panel

A panel data.frame from .make_panel().

vars

Character vector of column names to difference.

k

Integer difference order (default 1).

Value

A data.frame (or vector if single var) of differenced values.

Estimate MG on a subset of time periods

Description

Estimate MG on a subset of time periods

Usage

.estimate_half(
  panel,
  y_name,
  x_names,
  csa_colnames,
  unit_var,
  time_var,
  constant,
  trend,
  time_subset
)

Evaluate a formula term

Description

Handles plain variable names, L(), D(), and Lrange() calls. Adds the computed column(s) to the parent panel in the calling environment.

Usage

.eval_formula_term(expr, panel, unit_var, time_var)

Arguments

expr

An expression (from formula).

panel

A panel data.frame.

unit_var

Character: unit variable name.

time_var

Character: time variable name.

Value

Character vector of column name(s) created or referenced.

Extract base variable names

Description

Given potentially transformed names like "L(y,1)" or "D(x,1)", extract the underlying variable names.

Usage

.extract_base_vars(var_names, panel)

Arguments

var_names

Character vector of variable names.

panel

A panel data.frame.

Value

Character vector of base variable names that exist in panel.

Post-process a dcce_fit into an IFE fit

Description

Called from dcce() when model = "ife". Runs the Bai (2009) iterative PC estimator on the panel stored inside the fit.

Usage

.fit_ife(
  panel,
  y_name,
  x_names,
  n_factors = NULL,
  max_iter = 100L,
  tol = 1e-08,
  include_constant = TRUE
)

Arguments

panel

Prepared panel data.frame.

y_name

Character: dependent variable name.

x_names

Character: regressor names.

n_factors

Integer or NULL: number of factors. If NULL, the BIC3 criterion of Bai & Ng (2002) is used to select r.

max_iter

Integer: maximum iterations. Default 100.

tol

Numeric: convergence tolerance on beta. Default 1e-8.

include_constant

Logical: include unit FE? Default TRUE.

Value

A list with coefficients, vcov, se, factors, loadings, n_factors, residuals, r2, and related fields.

Get CSA column names

Description

Returns all column names matching the CSA naming convention.

Usage

.get_csa_colnames(panel)

Arguments

panel

A panel data.frame with CSA columns.

Value

Character vector of CSA column names.

Half-Panel Jackknife Bias Correction

Description

Implements the Chudik & Pesaran (2015) half-panel jackknife for dynamic CCE/DCCE estimators. Each unit's time series is split into two halves at the midpoint; MG estimates are computed on each half, and the full-sample MG is bias-corrected as

\hat\beta_{HPJ} = 2\hat\beta_{full} - \frac{1}{2}(\hat\beta_{half1} + \hat\beta_{half2}).

Usage

.half_panel_jackknife(panel_list, coef_names, fast = TRUE)

Arguments

panel_list

Named list of list(y, X) pairs (from the main dcce() pipeline).

coef_names

Character vector: structural coefficient names to extract from each half-sample fit.

fast

Logical: use the C++ fast path.

Details

This targets the Nickell (1981) bias that arises in short-T dynamic panels when the lagged dependent variable is correlated with the unit fixed effect.

Value

A named numeric vector of the half-panel MG average (to be combined with the full-sample estimate in dcce()), or NULL if the correction cannot be applied.

Select number of factors via BIC3 (Bai & Ng 2002)

Description

Select number of factors via BIC3 (Bai & Ng 2002)

Usage

.ife_select_factors(Y_dm, max_r = 10L)

Half-panel jackknife bias correction

Description

Implements the Chudik & Pesaran (2015) jackknife bias correction: b_jk = 2 * b_full - 0.5 * (b_first + b_second) where b_first and b_second are MG estimates from the first and second half of the time dimension, respectively.

Usage

.jackknife_bias_correction(
  panel,
  y_name,
  x_names,
  csa_colnames,
  unit_var,
  time_var,
  constant,
  trend,
  b_full
)

Arguments

panel

A panel data.frame with all CSA columns already appended.

y_name

Character: dependent variable name.

x_names

Character: regressor names.

csa_colnames

Character: CSA column names.

unit_var

Character: unit variable name.

time_var

Character: time variable name.

constant

Logical: include intercept.

trend

Logical: include trend.

b_full

Numeric vector: full-sample MG coefficient.

Value

A list with b_jk (bias-corrected coefficients).
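The correction itself is a one-line combination of the three MG estimates. The numbers below are invented purely to show the arithmetic.

```r
# Illustrative MG estimates (made-up values).
b_full   <- c(x = 0.52)   # full sample
b_first  <- c(x = 0.47)   # first half of the time dimension
b_second <- c(x = 0.49)   # second half of the time dimension

# b_jk = 2 * b_full - 0.5 * (b_first + b_second) = 1.04 - 0.48 = 0.56
b_jk <- 2 * b_full - 0.5 * (b_first + b_second)
b_jk
```

Because the half-sample estimates here lie below the full-sample estimate, the corrected coefficient moves upward.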

Kao ADF statistic

Description

Kao ADF statistic

Usage

.kao_core(resid_list, lags)

Panel-aware lag

Description

Computes lagged values within each cross-sectional unit. Returns NA for the first k observations of each unit to avoid cross-unit contamination.

Usage

.lag_panel(panel, vars, k = 1L)

Arguments

panel

A panel data.frame from .make_panel().

vars

Character vector of column names to lag.

k

Integer lag order (positive = lag, negative = lead).

Value

A data.frame (or vector if single var) of lagged values, same length as nrow(panel).
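The key point of a panel-aware lag is that values never cross unit boundaries. A minimal base-R sketch, assuming rows are sorted by unit then time and k is non-negative, is shown below; lag_within() is a hypothetical name.

```r
lag_within <- function(x, unit, k = 1L) {
  # Shift each unit's series down by k positions, padding with NA.
  ave(x, unit, FUN = function(v) {
    n <- length(v)
    if (k >= n) return(rep(NA_real_, n))
    c(rep(NA_real_, k), v[seq_len(n - k)])
  })
}

unit <- rep(c("A", "B"), each = 3)
y    <- c(1, 2, 3, 10, 20, 30)

y_lag <- lag_within(y, unit, 1)
y_lag   # NA 1 2 NA 10 20
```

The first observation of unit B is NA rather than 3, which is exactly the cross-unit contamination a naive shift of the whole column would introduce.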

Prepare a panel data frame

Description

Validates the data and index columns, sorts by unit then time, and attaches "unit_var" and "time_var" attributes. Accepts either the new unit_index/time_index arguments or the legacy index argument.

Usage

.make_panel(data, unit_index = NULL, time_index = NULL, index = NULL)

Arguments

data

A data.frame.

unit_index

Character scalar: name of the unit identifier column.

time_index

Character scalar: name of the time identifier column.

index

Character vector of length 2 (legacy): c(unit_col, time_col). Ignored if unit_index and time_index are provided.

Value

A sorted data.frame with attributes unit_var and time_var.

Split a time vector into segments based on break dates

Description

Split a time vector into segments based on break dates

Usage

.make_segments(all_times, break_dates)

Mean Group aggregation

Description

Computes the unweighted average of unit-level coefficient vectors.

Usage

.mg_aggregate(coef_list)

Arguments

coef_list

A list of numeric coefficient vectors (one per unit).

Value

Numeric vector: Mean Group coefficient.
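As a small worked example, the sketch below averages three made-up unit coefficients and then applies the variance formula stated in this manual for .mg_variance(), with its 1/N^2 scaling. The object names are illustrative only.

```r
# Illustrative unit-level coefficient vectors (one regressor).
coef_list <- list(A = c(x = 0.4), B = c(x = 0.6), C = c(x = 0.8))

B_mat <- do.call(rbind, coef_list)   # N x K matrix of unit coefficients
N     <- nrow(B_mat)

b_mg <- colMeans(B_mat)              # unweighted average: 0.6

# V = (1/N^2) * sum_i (b_i - b_mg)(b_i - b_mg)'
dev  <- sweep(B_mat, 2, b_mg)
V_mg <- crossprod(dev) / N^2         # (0.04 + 0 + 0.04) / 9

b_mg
sqrt(diag(V_mg))
```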

Mean Group variance

Description

Non-parametric variance estimator for the MG coefficient: V = (1/N^2) * sum_i (b_i - b_mg)(b_i - b_mg)'

Usage

.mg_variance(coef_list, b_mg)

Arguments

coef_list

A list of numeric coefficient vectors.

b_mg

Numeric vector: the Mean Group coefficient.

Parse a dcce formula

Description

Extracts the dependent variable name and regressor names from the formula. Evaluates L(), D(), and Lrange() calls by applying them in a panel-aware way and adding the resulting columns to the panel.

Usage

.parse_dcce_formula(formula, panel, unit_var, time_var)

Arguments

formula

A formula.

panel

A panel data.frame.

unit_var

Character: unit variable name.

time_var

Character: time variable name.

Value

A list with y_name and x_names.

Partial out CSAs from regressors

Description

Projects out the CSA (cross-sectional average) component from a matrix of variables using the Frisch-Waugh-Lovell projection: M_Z X = X - Z (Z'Z)^{-1} Z' X

Usage

.partial_out(X, Z)

Arguments

X

Numeric matrix of variables to be partialled.

Z

Numeric matrix of CSA variables to project out.

Moore-Penrose pseudoinverse

Description

Moore-Penrose pseudoinverse

Usage

.pinv(M, tol = .Machine$double.eps^0.5)

Arguments

M

A numeric matrix.

tol

Tolerance for zero singular values.

Value

The pseudoinverse of M.
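A pseudoinverse can be built from the singular value decomposition by inverting only the non-negligible singular values. The sketch below is one common way to do this; pinv_sketch() is a hypothetical name, and treating tol as relative to the largest singular value is an assumption rather than documented package behaviour.

```r
pinv_sketch <- function(M, tol = .Machine$double.eps^0.5) {
  s <- svd(M)
  keep <- s$d > tol * max(s$d)       # assumption: relative threshold
  if (!any(keep)) return(matrix(0, ncol(M), nrow(M)))
  # V_r %*% diag(1 / d_r) %*% t(U_r), using only the retained components.
  s$v[, keep, drop = FALSE] %*%
    (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Rank-one matrix, for which solve(M) would fail.
M <- matrix(c(1, 2, 2, 4), 2, 2)
P <- pinv_sketch(M)

# Two of the Moore-Penrose conditions.
all.equal(M %*% P %*% M, M)
all.equal(P %*% M %*% P, P)
```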

Pool unit-level long-run coefficients via inverse-variance weighting

Description

For each base regressor k, compute

\hat\theta^{PMG}_k = \frac{\sum_i w_{ik} \hat\theta_{ik}}{\sum_i w_{ik}}, \quad w_{ik} = 1 / \hat V(\hat\theta_{ik}),

with pooled standard error (\sum_i w_{ik})^{-1/2}.

Usage

.pmg_pool_lr(unit_lr)

Arguments

unit_lr

A list of unit-level results from .csardl_unit_lr().

Value

A list with lr_coef, lr_se, and the full pooled vcov (diagonal).
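The pooling formula above can be checked by hand for a single regressor. The unit-level estimates and variances below are made up for the illustration.

```r
# Illustrative unit-level long-run estimates and their variances.
theta <- c(1.0, 2.0, 4.0)
v     <- c(1.0, 0.5, 0.25)

w <- 1 / v                               # weights 1, 2, 4

theta_pmg <- sum(w * theta) / sum(w)     # (1 + 4 + 16) / 7 = 3
se_pmg    <- sum(w)^(-1/2)               # 1 / sqrt(7)

c(theta_pmg = theta_pmg, se_pmg = se_pmg)
```

The most precisely estimated unit (variance 0.25) receives the largest weight, pulling the pooled estimate toward 4.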

Post-process a PMG fit

Description

Takes a fit that has already been run through .csardl_postprocess() and overrides the long-run block with the pooled PMG estimates. The adjustment speed remains the MG average.

Usage

.pmg_postprocess(fit)

Arguments

fit

A dcce_fit object with CS-ARDL post-processing.

Value

Augmented fit object.

Pesaran (2006) non-parametric pooled variance

Description

For pooled (CCEP) coefficients, computes the non-parametric VCE: V_p = (1/N^2) sum_i (b_i - b_p)(b_i - b_p)'. This is the same formula as the MG variance, but applied to the deviation from the pooled estimate rather than the MG estimate.

Usage

.pooled_vcov_pesaran(coef_list, b_pooled)

Arguments

coef_list

List of unit-level coefficient vectors.

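b_pooled

Numeric vector: the pooled coefficient estimate.

Resolve cross-sectional average variables

Description

Resolve cross-sectional average variables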

Usage

.resolve_cr_vars(cr_formula, y_name, x_names, panel)

Arguments

cr_formula

A one-sided formula for CSA variables.

y_name

Character: dependent variable name.

x_names

Character: regressor names.

panel

A panel data.frame.

Value

Character vector of variable names to compute CSAs for.

Run unit-level OLS over a list of (y, X) pairs

Description

Dispatcher used by dcce(). Accepts a named list where each element is a list with elements y and X. Returns a list of unit-level OLS result lists with the same names. Internally dispatches to:

the C++ batch routine (.batch_ols_cpp) when fast = TRUE and the compiled shared library is available;
parallel::mclapply() when n_cores > 1 and the platform is Unix/macOS;
the pure-R .unit_ols() in all other cases.

Usage

.run_unit_loop(panel_list, fast = TRUE, n_cores = 1L)

Arguments

panel_list

Named list of list(y, X) pairs.

fast

Logical: use the compiled C++ fast path when available? Default TRUE.

n_cores

Integer: number of cores for parallel unit estimation. Only effective on Unix/macOS. Default 1L (no parallelism).

Value

A named list of unit-level OLS results matching the structure returned by .unit_ols().

Sequential Bai-Perron: recursively search pre/post segments

Description

Sequential Bai-Perron: recursively search pre/post segments

Usage

.seq_bai_perron(
  panel,
  formula,
  model,
  unit_index,
  time_index,
  first_break,
  trim,
  n_breaks,
  test_terms,
  ...
)

Validate and normalise a spatial weight matrix

Description

Validate and normalise a spatial weight matrix

Usage

.spatial_validate_W(W, units)

Arguments

W

A numeric matrix (N x N).

units

Character vector of unit identifiers in the panel, used both for dimension checks and for aligning rows/columns.

Value

A row-normalised matrix with zero diagonal.

Fit pre/post regimes at a given break date and return a Wald statistic

Description

Fit pre/post regimes at a given break date and return a Wald statistic

Usage

.structural_break_at(
  panel,
  formula,
  model,
  unit_index,
  time_index,
  break_date,
  test_terms,
  ...
)

Arguments

panel

A prepared panel.

formula

A dcce formula.

model

Character: base estimator.

break_date

The split point.

test_terms

Optional subset of coefficient names.

Value

A list with fit_pre, fit_post, wald, df.

SVD of CSA matrix

Description

Computes thin SVD and returns right singular vectors (principal components in the time dimension).

Usage

.svd_csa(Z, npc = NULL, criterion = c("er", "gr"))

Arguments

Z

Numeric matrix (T x K) of CSA variables.

npc

Integer: number of principal components to retain. If NULL, determined by ER/GR criterion.

criterion

Character: "er" (eigenvalue ratio) or "gr" (growth ratio). Default "er".

Value

A list with V_k (T x k matrix of PCs), eigenvalues, and k.

Unit-level OLS

Description

Fast OLS for a single cross-sectional unit. Uses normal equations rather than lm() for speed when called in a loop.

Usage

.unit_ols(y, X)

Arguments

y

Numeric vector: dependent variable.

X

Numeric matrix: regressors (including intercept if desired).

Value

A list with elements:

b: Coefficient vector.
V: Variance-covariance matrix of coefficients.
e: Residual vector.
r2: R-squared.
df_resid: Residual degrees of freedom.
sigma2: Residual variance (s^2).
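A normal-equations OLS that returns a list with the same element names can be sketched as below. This is an illustration only: unit_ols_sketch() is a hypothetical name, the R-squared assumes X contains an intercept, and no rank-deficiency handling is included.

```r
unit_ols_sketch <- function(y, X) {
  XtX_inv  <- solve(crossprod(X))                 # (X'X)^{-1}
  b        <- drop(XtX_inv %*% crossprod(X, y))   # (X'X)^{-1} X'y
  e        <- drop(y - X %*% b)
  df_resid <- nrow(X) - ncol(X)
  sigma2   <- sum(e^2) / df_resid
  list(b = b,
       V = sigma2 * XtX_inv,
       e = e,
       r2 = 1 - sum(e^2) / sum((y - mean(y))^2),  # assumes an intercept in X
       df_resid = df_resid,
       sigma2 = sigma2)
}

set.seed(42)
X <- cbind(1, rnorm(30))
y <- drop(X %*% c(1, 2) + rnorm(30))

fit <- unit_ols_sketch(y, X)
ref <- lm.fit(X, y)

# The coefficients agree with the QR-based lm.fit() up to numerical tolerance.
all.equal(fit$b, unname(ref$coefficients))
```

Normal equations avoid the overhead of lm() but are less stable than a QR decomposition when the regressors are nearly collinear.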

Validate CD test names

Description

Validate CD test names

Usage

.validate_cd_tests(test)

Validate Westerlund test names

Description

Validate Westerlund test names

Usage

.validate_westerlund_tests(test)

Internal: convert a single variable column of a panel to a (N x T) matrix

Description

Internal: convert a single variable column of a panel to a (N x T) matrix

Usage

.var_to_matrix(panel, var)

Aggregate unit-level ECM statistics into Westerlund Ga/Gt/Pa/Pt

Description

Aggregate unit-level ECM statistics into Westerlund Ga/Gt/Pa/Pt

Usage

.westerlund_aggregate(unit_res, tests = c("ga", "gt", "pa", "pt"))

Approximate asymptotic p-value for a Westerlund statistic

Description

Uses critical values from Westerlund (2007, Table 3). For statistics more extreme than the 1% critical value, the p-value is clamped to 0.001. For statistics less extreme than the 10% critical value, the p-value is set to 0.5. Between tabulated critical values, the p-value is linearly interpolated. All four statistics reject for large negative values.

Usage

.westerlund_pvalue(stat, test_name)

Run error-correction regressions per unit

Description

Run error-correction regressions per unit

Usage

.westerlund_unit_regressions(
  panel,
  y_name,
  x_names,
  lags = 1L,
  leads = 1L,
  show_progress = FALSE
)

Arguments

panel

Prepared panel data.

y_name

Character: dependent variable.

x_names

Character: regressor names (levels).

lags

Integer: number of \Delta y lags.

leads

Integer: number of \Delta x leads (not used in this simplified implementation).

show_progress

Logical.

Value

A data frame with columns unit, T_i, alpha, alpha_se, t_alpha.

Difference operator for dcce formulas

Description

Creates a differenced version of a variable for use inside dcce() formulas.

Usage

D(x, k = 1L)

Arguments

x

A numeric vector.

k

Integer difference order. Default 1.

Value

A numeric vector of the same length as x with leading NAs.

Examples

x <- c(10, 20, 30, 40, 50)
D(x, 1)   # NA 10 10 10 10
D(x, 2)   # NA NA 20 20 20

Lag operator for dcce formulas

Description

Creates a lagged version of a variable for use inside dcce() formulas. This function is evaluated during formula processing by dcce() and should not be called directly on raw vectors outside of a dcce formula context.

Usage

L(x, k = 1L)

Arguments

x

A numeric vector (column name evaluated within dcce()).

k

Integer lag order. Default 1. Positive values lag, negative lead.

Value

A numeric vector of the same length as x, with leading NAs for lags (k > 0) and trailing NAs for leads (k < 0).

Examples

x <- c(10, 20, 30, 40, 50)
L(x, 1)   # NA 10 20 30 40
L(x, 2)   # NA NA 10 20 30

Lag range operator for dcce formulas

Description

Creates multiple lagged versions of a variable from lag k0 to lag k1. Useful for distributed lag specifications.

Usage

Lrange(x, k0, k1)

Arguments

x

A numeric vector.

k0

Integer start lag (inclusive).

k1

Integer end lag (inclusive).

Value

A matrix with columns L[k0] through L[k1].

Examples

x <- c(10, 20, 30, 40, 50)
Lrange(x, 0, 2)   # 3 columns: lag 0, lag 1, lag 2
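
For readers who want to see the shifting logic spelled out, here is a minimal base-R sketch for a single series. The helper names lag_sketch and lrange_sketch and the column names are assumptions made for illustration; the package operators are evaluated per unit inside dcce() and may differ in detail.

```r
# Hypothetical helpers mirroring the documented behaviour on one series.
lag_sketch <- function(x, k = 1L) {
  n <- length(x)
  if (k == 0L) return(x)
  if (k > 0L) c(rep(NA, k), x[seq_len(n - k)])       # lag: leading NAs
  else        c(x[seq.int(1 - k, n)], rep(NA, -k))   # lead: trailing NAs
}

lrange_sketch <- function(x, k0, k1) {
  out <- sapply(k0:k1, function(k) lag_sketch(x, k)) # one column per lag
  colnames(out) <- paste0("L", k0:k1)                # assumed column names
  out
}

x <- c(10, 20, 30, 40, 50)
lag_mat <- lrange_sketch(x, 0, 2)
lag_mat
```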

High-Dimensional Fixed Effect Absorption

Description

Internal helpers to project out one or more grouping factors from a set of numeric columns before the main dcce() unit loop runs. When a single factor is supplied the within transformation reduces to a simple group-mean demeaning; for two or more factors the alternating projections algorithm of Guimaraes & Portugal (2010) / Correia (2016) is used.

Details

The implementation is pure base R and requires no additional package dependencies.
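
The alternating projections idea can be illustrated with a short base-R sketch: demean by each factor in turn and repeat until the vector stops changing. The function absorb_sketch and the simulated factors are hypothetical, and the package's own helpers may use a different stopping rule.

```r
# Hypothetical helper: alternating projections over a list of factors.
absorb_sketch <- function(v, factors, tol = 1e-10, max_iter = 1000L) {
  for (iter in seq_len(max_iter)) {
    v_old <- v
    for (g in factors) v <- v - ave(v, g)   # subtract group means in turn
    if (max(abs(v - v_old)) < tol) break    # stop once a full pass changes nothing
  }
  v
}

set.seed(2)
industry <- factor(sample(letters[1:4], 200, replace = TRUE))
region   <- factor(sample(LETTERS[1:3], 200, replace = TRUE))
y <- rnorm(200) + as.integer(industry) + 2 * as.integer(region)

y_tilde <- absorb_sketch(y, list(industry, region))
```

With a single factor the loop reduces to one pass of group-mean demeaning; with two factors the result should agree with the residuals from regressing y on both sets of dummies.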

Augmented Mean Group (AMG) Estimator Internals

Description

Internal helpers for the Augmented Mean Group (AMG) estimator of Eberhardt & Teal (2010) and Bond & Eberhardt (2013). AMG accounts for cross-sectional dependence by augmenting the unit-level regressions with an estimated common dynamic process; the steps are listed in Details.

Details

1. Fit a pooled first-difference regression of \Delta y_{it} on \Delta x_{it} augmented with T-1 time dummies.
2. Extract the time-dummy coefficients as the Common Dynamic Process (CDP), a non-parametric proxy for unobserved common factors.
3. Cumulate the CDP within each unit (back to the level) and add it as an extra regressor in a unit-level OLS on levels.
4. Average the unit-level slopes (excluding the CDP) to obtain the AMG Mean Group estimate.

This implementation uses base R throughout and does not add any new package dependencies.
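
The AMG steps can be sketched on simulated data as follows. The data-generating process, variable names, and the choice to start the cumulated CDP at zero are assumptions made for illustration; this is not the package's internal code.

```r
set.seed(1)
N <- 15; TT <- 25
f      <- cumsum(rnorm(TT))              # unobserved common factor
b_true <- 1 + rnorm(N, sd = 0.2)         # heterogeneous slopes
lam    <- 1 + rnorm(N, sd = 0.3)         # factor loadings
panel <- do.call(rbind, lapply(seq_len(N), function(i) {
  x <- 0.5 * f + rnorm(TT)
  y <- i / N + b_true[i] * x + lam[i] * f + rnorm(TT, sd = 0.5)
  data.frame(id = i, t = seq_len(TT), y = y, x = x)
}))

# Pooled first-difference regression with T-1 time dummies
fd <- do.call(rbind, lapply(split(panel, panel$id), function(d) {
  data.frame(id = d$id[-1], t = d$t[-1], dy = diff(d$y), dx = diff(d$x))
}))
pooled <- lm(dy ~ 0 + dx + factor(t), data = fd)

# Time-dummy coefficients, cumulated back to levels (CDP set to 0 at t = 1)
cdp <- c(0, cumsum(coef(pooled)[-1]))

# Unit-level OLS in levels with the CDP as an extra regressor
slopes <- sapply(split(panel, panel$id), function(d) {
  d$cdp <- cdp[d$t]
  coef(lm(y ~ x + cdp, data = d))[["x"]]
})

# Average the unit slopes on x
amg <- mean(slopes)
c(amg = amg, truth = mean(b_true))
```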

Bootstrap Inference for DCCE Models

Description

Computes bootstrap standard errors and confidence intervals for dcce_fit objects using either cross-section or wild bootstrap.

Usage

bootstrap(
  object,
  type = c("crosssection", "wild"),
  reps = 500L,
  percentile = TRUE,
  cfresiduals = FALSE,
  seed = NULL
)

Arguments

object

A dcce_fit object.

type

Character: "crosssection" (default) or "wild".

reps

Integer: number of bootstrap repetitions. Default 500.

percentile

Logical: compute percentile CIs? Default TRUE.

cfresiduals

Logical: for wild bootstrap, use common-factor residuals instead of defactored residuals? Default FALSE.

seed

Integer: random seed for reproducibility. Default NULL.

Value

An object of class dcce_boot with elements:

se_boot: Bootstrap standard errors.
ci_lower: Percentile CI lower bound (if percentile=TRUE).
ci_upper: Percentile CI upper bound.
b_boot: B x K matrix of bootstrap coefficient draws.
reps: Number of repetitions.
type: Bootstrap type.

Note

Naming conflict with broom::bootstrap. The broom package also exports a function called bootstrap() (for resampling data frames), with a completely different signature. If you load broom after dcce, broom::bootstrap will mask dcce::bootstrap on the search path and calls to bootstrap(fit, type = ..., reps = ...) will fail with an "unused arguments" error. To avoid the conflict you can either (a) use the namespace prefix dcce::bootstrap(fit, ...), (b) load broom before dcce so dcce ends up higher in the search path, or (c) use the conflict-free alias dcce_bootstrap(fit, ...) which is exported by dcce and has the same semantics.

Examples

set.seed(42)
df <- data.frame(
  id = rep(1:10, each = 30),
  t  = rep(1:30, 10),
  y  = rnorm(300),
  x  = rnorm(300)
)
fit <- dcce(df, "id", "t", y ~ x, model = "mg", cross_section_vars = NULL)
boot_res <- bootstrap(fit, reps = 50)
print(boot_res)
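
To make the cross-section bootstrap concrete, the base-R sketch below resamples units with replacement and recomputes a Mean Group average in each draw. It assumes the unit-level estimates are held fixed across draws and uses made-up coefficients and a 95 percent level; the package may recompute more within each replication.

```r
set.seed(42)
N <- 25
# Made-up unit-level slope estimates (one row per unit)
b_unit <- cbind(x1 = rnorm(N, 1, 0.3), x2 = rnorm(N, -0.5, 0.2))

reps <- 500
b_boot <- t(replicate(reps, {
  idx <- sample.int(N, N, replace = TRUE)   # resample units with replacement
  colMeans(b_unit[idx, , drop = FALSE])     # Mean Group estimate in this draw
}))

se_boot  <- apply(b_boot, 2, sd)                       # bootstrap standard errors
ci_lower <- apply(b_boot, 2, quantile, probs = 0.025)  # percentile CI bounds
ci_upper <- apply(b_boot, 2, quantile, probs = 0.975)
rbind(se_boot, ci_lower, ci_upper)
```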

Static CCE Estimator Internals

Description

Internal functions for the Pesaran (2006) Common Correlated Effects estimator (both Mean Group and Pooled variants).
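
A minimal sketch of the Mean Group variant, assuming one regressor and a single common factor: each unit regression is augmented with the cross-sectional averages of y and x, and the unit slopes are then averaged. The simulated design and object names are illustrative assumptions, not the package internals.

```r
set.seed(10)
N <- 20; TT <- 40
f <- rnorm(TT)                                  # unobserved common factor
Y <- X <- matrix(NA_real_, N, TT)
for (i in seq_len(N)) {
  X[i, ] <- runif(1, 0.5, 1.5) * f + rnorm(TT)
  Y[i, ] <- 1 * X[i, ] + runif(1, 0.5, 1.5) * f + rnorm(TT, sd = 0.5)
}

ybar <- colMeans(Y)                             # cross-sectional average of y
xbar <- colMeans(X)                             # cross-sectional average of x

# Unit regressions augmented with the cross-sectional averages
b_unit <- sapply(seq_len(N), function(i) {
  coef(lm(Y[i, ] ~ X[i, ] + ybar + xbar))[2]    # slope on x_it
})

b_ccemg  <- mean(b_unit)                        # CCE Mean Group estimate
se_ccemg <- sd(b_unit) / sqrt(N)                # nonparametric MG standard error
c(estimate = b_ccemg, std.error = se_ccemg)
```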

Pesaran CIPS Panel Unit Root Test

Description

Implements the cross-sectionally augmented IPS (CIPS) panel unit root test of Pesaran (2007), which allows for cross-sectional dependence through a single unobserved common factor. For each unit, a cross-sectionally augmented Dickey-Fuller (CADF) regression is run; the regression equation is given in Details.

Usage

cips_test(x, ...)

## S3 method for class 'matrix'
cips_test(x, ..., lags = 0L, trend = FALSE)

## S3 method for class 'dcce_fit'
cips_test(x, ..., lags = 0L, trend = FALSE)

## Default S3 method:
cips_test(
  x,
  ...,
  data = NULL,
  unit_index = NULL,
  time_index = NULL,
  lags = 0L,
  trend = FALSE
)

Arguments

x

A numeric vector, numeric matrix (N x T), data.frame, or dcce_fit object. If a vector, data, unit_index, and time_index must also be supplied.

...

Additional arguments passed to methods.

lags

Integer: number of lags of \Delta y to include in the CADF regression. Default 0 (pure CADF without augmentation).

trend

Logical: include a linear time trend? Default FALSE.

data

A data.frame containing the panel structure (when x is a vector).

unit_index

Character: name of the unit variable in data.

time_index

Character: name of the time variable in data.

Details

For unit i, the CADF regression with p augmentation lags (the lags argument) is

\Delta y_{it} = a_i + b_i y_{i,t-1} + c_i \bar{y}_{t-1} + d_i \Delta \bar{y}_t + \sum_{j=1}^{p} \rho_{ij} \Delta y_{i,t-j} + \sum_{j=1}^{p} \delta_{ij} \Delta \bar{y}_{t-j} + u_{it}.

The CIPS statistic is the cross-sectional average of the unit-level t-statistics for b_i = 0 (the CADF statistic). Critical values come from Pesaran (2007, Table II(b), constant case) and Pesaran (2007, Table II(c), constant + trend case). The null hypothesis is that all series contain a unit root.

Value

An object of class dcce_cips with elements:

statistic: The CIPS statistic (truncated cross-sectional average).
p_value: Approximate p-value from Pesaran (2007) critical values.
unit_stats: Per-unit truncated CADF t-statistics.
N: Number of units.
T: Time dimension.
lags: Number of augmentation lags.
trend: Whether a trend was included.

References

Pesaran, M. H. (2007). A simple panel unit root test in the presence of cross-section dependence. Journal of Applied Econometrics, 22(2), 265-312.

Examples

set.seed(1)
N <- 20; T <- 30
f <- cumsum(rnorm(T))
X <- matrix(NA, N, T)
for (i in seq_len(N)) X[i, ] <- cumsum(rnorm(T)) + 0.5 * f
cips_test(X)
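
For intuition, the unit-level CADF regressions with lags = 0 and no trend can be written out in base R as below. This sketch averages the raw t-statistics and omits the truncation applied by the documented statistic, so its value need not match cips_test(); all object names are illustrative.

```r
set.seed(1)
N <- 20; TT <- 30
f <- cumsum(rnorm(TT))
X <- matrix(NA_real_, N, TT)
for (i in seq_len(N)) X[i, ] <- cumsum(rnorm(TT)) + 0.5 * f

ybar     <- colMeans(X)        # cross-sectional average at each t
d_ybar   <- diff(ybar)         # first difference of the average, t = 2..T
ybar_lag <- ybar[-TT]          # lagged average

cadf_t <- sapply(seq_len(N), function(i) {
  dy    <- diff(X[i, ])        # first difference of unit i
  y_lag <- X[i, -TT]           # lagged level of unit i
  fit <- lm(dy ~ y_lag + ybar_lag + d_ybar)
  coef(summary(fit))["y_lag", "t value"]   # t-statistic for b_i = 0
})

cips_sketch <- mean(cadf_t)    # untruncated average of the CADF statistics
cips_sketch
```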

Extract coefficients from a dcce_fit object

Description

Extract coefficients from a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
coef(object, type = c("mg", "unit"), ...)

Arguments

object

A dcce_fit object.

type

Character: "mg" for Mean Group coefficients (default), "unit" for unit-level coefficients as a tibble.

...

Ignored.

Value

A named numeric vector (for "mg") or a tibble (for "unit").

Westerlund (2007) Panel Cointegration Tests

Description

Implements the four error-correction-based panel cointegration tests of Westerlund (2007): two group-mean statistics (Ga, Gt) and two panel statistics (Pa, Pt). The null hypothesis is no cointegration between the dependent variable and the regressors.

Usage

cointegration_test(
  data,
  unit_index,
  time_index,
  formula,
  lags = 1L,
  leads = 1L,
  test = c("ga", "gt", "pa", "pt"),
  n_bootstrap = 0L,
  seed = NULL,
  show_progress = FALSE
)

Arguments

data

A panel data.frame.

unit_index

Character scalar: unit identifier column.

time_index

Character scalar: time identifier column.

formula

Two-sided formula of the form y ~ x1 + x2 in levels (not first differences).

lags

Integer scalar: ADF lag order for \Delta y lags in the ECM regression. Default 1L.

leads

Integer scalar: number of leads of \Delta x. Default 1L.

test

Character vector: which statistics to compute. Subset of c("ga", "gt", "pa", "pt"). Default: all four.

n_bootstrap

Integer: number of bootstrap replications for p-values. Default 0L (asymptotic p-values).

seed

Integer: random seed for the bootstrap.

show_progress

Logical: print progress? Default FALSE.

Details

For each unit i, an error-correction regression is fitted

\Delta y_{it} = \delta_i d_t + \alpha_i y_{i,t-1} + \lambda_i' x_{i,t-1} + \sum_{j=1}^{p_i} \alpha_{ij} \Delta y_{i,t-j} + \sum_{j=-q_i}^{p_i} \gamma_{ij}' \Delta x_{i,t-j} + e_{it},

and the estimate \hat\alpha_i together with its standard error is used to construct the four test statistics:

Gt: group-mean of the \hat\alpha_i / SE(\hat\alpha_i).
Ga: group-mean of T \hat\alpha_i / \hat\alpha_i(1) (unnormalised form).
Pt: pooled t-statistic.
Pa: pooled alpha-statistic.

A negative, large-magnitude statistic rejects the null of no cointegration. Asymptotic p-values are obtained by linear interpolation on the critical values in Westerlund (2007, Table 3). A bootstrap path (n_bootstrap > 0) is also provided, resampling cross-sectional units with replacement to account for cross-sectional dependence.
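
As an illustration of the group-mean t-statistic, the sketch below fits a simplified unit-level error-correction regression (constant only, one lag of \Delta y, contemporaneous \Delta x, no leads) on simulated cointegrated series and averages the t-statistics on y_{i,t-1}. The simulated design and the omission of leads are assumptions of the sketch; the average is not normalised as in the published test.

```r
set.seed(5)
N <- 10; TT <- 60

t_alpha <- sapply(seq_len(N), function(i) {
  x <- cumsum(rnorm(TT))           # I(1) regressor
  y <- 1 + x + rnorm(TT)           # cointegrated with x
  dy <- diff(y); dx <- diff(x)     # element k is the change at t = k + 1
  k  <- 2:(TT - 1)                 # rows where the lagged difference exists
  fit <- lm(dy[k] ~ y[k] + x[k] + dy[k - 1] + dx[k])
  coef(summary(fit))[2, "t value"] # t-statistic on y_{i,t-1}
})

Gt_sketch <- mean(t_alpha)         # group mean of the unit t-statistics
Gt_sketch
```

Because the simulated series are cointegrated, the average should be strongly negative, which is the direction in which the tests reject.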

Value

An object of class dcce_cointegration with elements statistics (a tibble with columns test, statistic, p_value, method), lags, leads, N, T_bar, and call.

References

Westerlund, J. (2007). Testing for Error Correction in Panel Data. Oxford Bulletin of Economics and Statistics, 69(6), 709-748.

Examples

data(pwt8)
result <- cointegration_test(
  data       = pwt8,
  unit_index = "country",
  time_index = "year",
  formula    = log_rgdpo ~ log_hc + log_ck,
  test       = c("ga", "gt"),
  lags       = 1L
)
print(result)

Confidence intervals for a dcce_fit object

Description

Confidence intervals for a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
confint(
  object,
  parm = NULL,
  level = 0.95,
  type = c("mg", "lr", "adjustment"),
  ...
)

Arguments

object

A dcce_fit object.

parm

Character vector of parameter names (default: all).

level

Confidence level. Default 0.95.

type

Character: "mg" (default) for the main MG coefficients, "lr" for long-run coefficients (CS-ARDL, CS-DL, PMG), "adjustment" for the speed of adjustment (CS-ARDL, PMG).

...

Ignored.

Value

A matrix of confidence intervals with rows corresponding to parameters.

Cross-Sectional Average (CSA) Utilities

Description

Internal functions for computing cross-sectional averages and their lags, used to approximate unobserved common factors in CCE-type estimators.

CS-ARDL Estimator Internals

Description

Internal functions for the Cross-Sectionally augmented ARDL (CS-ARDL) estimator of Chudik, Mohaddes, Pesaran & Raissi (2016). Estimates an ARDL(P_y, P_x) model with cross-sectional averages and recovers long-run coefficients and the speed of adjustment via the delta method.

Details

The unit-level regression is

y_{it} = \alpha_i + \sum_{p=1}^{P_y} \phi_{ip} y_{i,t-p} + \sum_{q=0}^{P_x} \beta'_{iq} x_{i,t-q} + \delta'_{i} \bar{z}_t + e_{it}.

The long-run coefficient on x_k is

\theta_{ik} = \frac{\sum_{q=0}^{P_x} \beta_{ikq}}{1 - \sum_{p=1}^{P_y} \phi_{ip}}

and the implied speed of adjustment is

\varphi_i = -\left(1 - \sum_{p=1}^{P_y} \phi_{ip}\right).
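
A worked example of the two formulas, using made-up ARDL(2, 1) coefficients for one unit and one regressor:

```r
# Made-up short-run coefficients for illustration.
phi  <- c(0.4, 0.1)    # coefficients on y_{i,t-1} and y_{i,t-2}
beta <- c(0.3, 0.2)    # coefficients on x_{it} and x_{i,t-1}

theta  <- sum(beta) / (1 - sum(phi))   # long-run coefficient on x
adjust <- -(1 - sum(phi))              # implied speed of adjustment
c(theta = theta, adjust = adjust)
```

Here the short-run effects sum to 0.5 and the autoregressive coefficients sum to 0.5, so the long-run coefficient is 1 and the speed of adjustment is -0.5.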

Exponent of Cross-Sectional Dependence

Description

Estimates the exponent of cross-sectional dependence (alpha) using the methods of Bailey, Kapetanios & Pesaran (2016, 2019).

Usage

csd_exp(
  x,
  data = NULL,
  unit_index = NULL,
  time_index = NULL,
  use_residuals = FALSE,
  n_pca = 1L,
  test_size = 0.1,
  tuning = 0.5,
  n_bootstrap = 200L
)

Arguments

x

Either a numeric vector (variable stacked by unit), a numeric matrix (N x T), or a dcce_fit object (uses residuals).

data

A data.frame containing the panel structure. Required if x is a vector.

unit_index

Character: name of the unit variable in data.

time_index

Character: name of the time variable in data.

use_residuals

Logical: if TRUE, use the BKP (2019) residual method; if FALSE (default), use the BKP (2016) variable method.

n_pca

Integer: number of principal components. Default 1.

test_size

Numeric: significance level for thresholding. Default 0.1.

tuning

Numeric: tuning parameter for residual method threshold. Default 0.5.

n_bootstrap

Integer: bootstrap repetitions for SE (residual method only). Default 200.

Value

An object of class dcce_csd with elements alpha (estimated exponent), se (standard error), ci (confidence interval), method, N, and T_val.

Examples

set.seed(42)
# Matrix of cross-sectionally dependent data
N <- 20; T_val <- 50
f <- rnorm(T_val)
x <- matrix(NA, N, T_val)
for (i in 1:N) x[i,] <- rnorm(1) * f + rnorm(T_val, sd = 0.5)
result <- csd_exp(x, use_residuals = FALSE)
print(result)

CS-DL Estimator Internals

Description

Internal functions for the Cross-Sectionally augmented Distributed Lag (CS-DL) estimator of Chudik, Mohaddes, Pesaran & Raissi (2016). The CS-DL regression is

y_{it} = \alpha_i + w'_i x_{it} + \sum_{\ell=0}^{p_x} \phi'_{i\ell} \Delta x_{i,t-\ell} + \delta'_i \bar{z}_t + u_{it},

where the long-run coefficient w_i is identified directly as the coefficient on the level of x_{it}. This file provides helpers to augment a formula with the required \Delta x_{t-\ell} terms before the main dcce() pipeline runs.
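
The distributed-lag idea can be checked on a single simulated unit, leaving out the cross-sectional averages. In the sketch below the data come from an ARDL(1, 0) process (an assumption made for illustration), and the coefficient on the level of x in a regression that also includes current and lagged \Delta x should be close to the implied long-run effect, up to truncation error from the finite number of lags.

```r
set.seed(7)
n <- 2000; phi <- 0.4; beta <- 1
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) y[t] <- phi * y[t - 1] + beta * x[t] + rnorm(1, sd = 0.5)
theta_true <- beta / (1 - phi)      # long-run effect of x on y

p  <- 3                             # lags of the differenced x (cf. csdl_xlags)
dx <- diff(x)                       # element k is the change at t = k + 1
dx_lags <- embed(dx, p + 1)         # columns: change at t, t-1, ..., t-p
keep <- (p + 2):n                   # time points matching the rows of dx_lags
y_lvl <- y[keep]; x_lvl <- x[keep]

fit <- lm(y_lvl ~ x_lvl + dx_lags)
theta_hat <- coef(fit)[["x_lvl"]]   # coefficient on the level of x
c(theta_true = theta_true, theta_hat = theta_hat)
```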

Dynamic Common Correlated Effects Estimation

Description

Estimates heterogeneous coefficient panel data models with cross-sectional dependence using Mean Group (MG), Common Correlated Effects (CCE), Dynamic CCE (DCCE), and related estimators.

Usage

dcce(
  data,
  unit_index,
  time_index,
  formula,
  model = c("dcce", "cce", "ccep", "mg", "amg", "rcce", "ife", "pmg", "csdl", "csardl"),
  cross_section_vars = ~.,
  cross_section_lags = 0L,
  pooled_vars = NULL,
  include_constant = TRUE,
  unit_trend = FALSE,
  bias_correction = c("none", "jackknife", "recursive", "half_panel_jackknife"),
  long_run_vars = NULL,
  long_run_model = NULL,
  csdl_xlags = 3L,
  absorb = NULL,
  spatial_weights = NULL,
  fast = TRUE,
  n_cores = 1L,
  run_cd_test = FALSE,
  full_sample = FALSE,
  verbose = FALSE,
  ...
)

Arguments

data

A data.frame containing the panel data.

unit_index

Character: name of the unit (cross-section) variable.

time_index

Character: name of the time variable.

formula

A formula of the form y ~ x1 + x2. Supports L(), D(), and Lrange() operators for lags, differences, and lag ranges.

model

Character: estimator to use. One of "dcce", "cce", "ccep", "mg", "amg", "rcce", "ife", "pmg", "csdl", "csardl". Default "dcce". "ccep" is the Pooled CCE of Pesaran (2006) which constrains slopes to be identical across units. "ife" is the iterative PC estimator of Bai (2009).

cross_section_vars

A one-sided formula specifying variables for cross-sectional averages, e.g. ~ x1 + x2. Use ~ . for all RHS variables plus the dependent variable. Use NULL for no CSAs (plain MG).

cross_section_lags

Integer or named integer vector: number of lags of CSAs. Default 0. A single integer applies to all CSA variables.

pooled_vars

A one-sided formula specifying which coefficients to constrain equal across units (pooled estimation). Default NULL (all heterogeneous).

include_constant

Logical: include unit-specific intercepts? Default TRUE.

unit_trend

Logical: include unit-specific linear trends? Default FALSE.

bias_correction

Character: bias correction method. One of "none", "jackknife", "recursive", "half_panel_jackknife". The "half_panel_jackknife" implements the Chudik & Pesaran (2015) time-series half-panel jackknife that corrects the Nickell bias in dynamic CCE. Default "none".

long_run_vars

A one-sided formula specifying long-run variables (reserved for CSDL/CSARDL models). Default NULL.

long_run_model

Character: long-run model specification (reserved). Default NULL.

csdl_xlags

Integer: number of lags of \Delta x to include as short-run controls when model = "csdl". Default 3.

absorb

Either NULL (default), a character vector of column names, or a one-sided formula like ~ industry + region specifying high-dimensional fixed effects to project out of y and X before the main unit loop runs. A single factor uses the within transformation; multiple factors use the alternating projections of Guimaraes & Portugal (2010) / Correia (2016). The unit fixed effects used by CCE estimators are still kept via unit intercepts; absorb is for additional categorical effects on top of the cross-section.

spatial_weights

Optional N \times N numeric matrix of spatial weights. When supplied, the global cross-sectional averages of classical CCE are replaced with local, unit-specific weighted averages \bar y^W_{i,t} = \sum_j w_{ij} y_{j,t}. The matrix must be square with rows and columns matching the unit identifiers (row/column names are used for alignment if present); it is row-normalised automatically and the diagonal is zeroed. This enables spatial CCE estimation that respects the topology of cross-sectional dependence (geographical contiguity, trade links, etc.).

fast

Logical: use the compiled C++ (RcppArmadillo) unit-OLS fast path? Default TRUE. Falls back to pure R automatically if the compiled routines are not available.

n_cores

Integer: number of cores for parallel unit estimation. Only effective on Unix/macOS via parallel::mclapply; on Windows the argument is silently ignored. Default 1L (sequential).

run_cd_test

Logical: run the Pesaran CD test on residuals? Default FALSE.

full_sample

Logical: use the full (unbalanced) sample? Default FALSE.

verbose

Logical: print progress messages? Default FALSE.

...

Additional arguments passed to model-specific estimators.

Value

An object of class dcce_fit (and a model-specific subclass).

Examples

# Simple Mean Group estimation
df <- data.frame(
  id = rep(1:10, each = 20),
  t  = rep(1:20, 10),
  y  = rnorm(200),
  x  = rnorm(200)
)
fit <- dcce(df, unit_index = "id", time_index = "t",
            formula = y ~ x, model = "mg", cross_section_vars = NULL)
coef(fit)
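
The local averages described for spatial_weights can be illustrated without the package. The sketch below zeroes the diagonal of a made-up contiguity matrix, row-normalises it, and forms the unit-specific weighted averages; the matrix and object names are assumptions for illustration.

```r
N <- 4; TT <- 5
# Made-up contiguity matrix (1 = neighbours)
W <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 1,
              1, 1, 0, 1,
              0, 1, 1, 0), N, N, byrow = TRUE)
diag(W) <- 0                        # no self-weights
W <- W / rowSums(W)                 # row-normalise: each row sums to 1

set.seed(4)
Y <- matrix(rnorm(N * TT), N, TT)   # N x T matrix of y_{it}
ybar_W <- W %*% Y                   # row i holds the local average for unit i
ybar_W
```

Unit 1 has units 2 and 3 as neighbours, so its local average at each t is the simple mean of their values.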

Bootstrap alias that avoids the broom conflict

Description

Convenience alias for bootstrap with an unambiguous name. broom::bootstrap is a data-frame resampling helper that shares a name with dcce::bootstrap but has a completely different signature. If you load broom after dcce the broom function masks ours on the search path and calls to bootstrap(fit, type = ..., reps = ...) will fail with an "unused arguments" error. dcce_bootstrap() is identical to dcce::bootstrap() but cannot be masked by any other package.

Usage

dcce_bootstrap(
  object,
  type = c("crosssection", "wild"),
  reps = 500L,
  percentile = TRUE,
  cfresiduals = FALSE,
  seed = NULL
)

Arguments

object

A dcce_fit object.

type

Character: "crosssection" (default) or "wild".

reps

Integer: number of bootstrap repetitions. Default 500.

percentile

Logical: compute percentile CIs? Default TRUE.

cfresiduals

Logical: for wild bootstrap, use common-factor residuals instead of defactored residuals? Default FALSE.

seed

Integer: random seed for reproducibility. Default NULL.

Value

See bootstrap.

Examples

set.seed(42)
df <- data.frame(
  id = rep(1:10, each = 30),
  t  = rep(1:30, 10),
  y  = rnorm(300),
  x  = rnorm(300)
)
fit <- dcce(df, "id", "t", y ~ x, model = "mg", cross_section_vars = NULL)
dcce_bootstrap(fit, reps = 50)

Dynamic CCE Estimator Internals

Description

Internal functions specific to the Dynamic CCE (DCCE) estimator of Chudik and Pesaran (2015), including jackknife bias correction and collinearity checks on the augmented CSA matrix.

S3 Methods for dcce_fit Objects

Description

S3 Methods for dcce_fit Objects

Rolling-Window Panel Estimation

Description

Fits a sequence of dcce() models on overlapping time windows of the panel, producing a time path of coefficient estimates. Useful for detecting coefficient drift, parameter instability, or regime shifts in long panels.

Usage

dcce_rolling(
  data,
  unit_index,
  time_index,
  formula,
  model = "cce",
  window = 20L,
  step = 1L,
  min_units = 5L,
  verbose = FALSE,
  ...
)

Arguments

data

A panel data.frame.

unit_index

Character: unit identifier column.

time_index

Character: time identifier column.

formula

Two-sided model formula passed through to dcce().

model

Character: estimator to use (default "cce"). Same choices as dcce().

window

Integer: window length in time periods.

step

Integer: number of time periods to advance between windows. Default 1L.

min_units

Integer: minimum number of cross-sectional units a window must retain after NA handling to be kept. Default 5L.

verbose

Logical: print progress? Default FALSE.

...

Additional arguments passed to dcce() (e.g. cross_section_vars, cross_section_lags, fast).

Details

At each window the function subsets the panel to observations whose time index falls in [t_k, t_k + \text{window} - 1], runs dcce() with the user-supplied arguments, and collects the Mean Group coefficient vector and its standard errors. The result is returned as a dcce_rolling object containing the full list of fits and a tidy tibble of coefficients indexed by window end-date.

Value

An object of class dcce_rolling containing:

fits: List of dcce_fit objects, one per window (or NULL for windows that failed).
coefficients: A tibble with one row per (window end-date, term) combination, columns window_end, window_start, term, estimate, std.error, conf.low, conf.high.
window: The window length.
step: The step size.
n_windows: Number of successful windows.
call: The original call.

Examples

data(pwt8)
roll <- dcce_rolling(
  data       = pwt8,
  unit_index = "country",
  time_index = "year",
  formula    = d_log_rgdpo ~ log_hc + log_ck + log_ngd,
  model      = "cce",
  cross_section_vars = ~ .,
  window     = 20,
  step       = 2
)
print(roll)
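
The windowing logic itself is simple to sketch. The example below uses a plain Mean Group of unit OLS slopes as a stand-in for dcce() and simulated data; the column names follow the documented coefficients tibble, but the object is an ordinary data frame and the code is not the package implementation.

```r
set.seed(6)
df <- data.frame(id = rep(1:8, each = 30), t = rep(1:30, 8))
df$x <- rnorm(nrow(df))
df$y <- 0.5 * df$x + rnorm(nrow(df))

window <- 20L; step <- 2L
starts <- seq(min(df$t), max(df$t) - window + 1L, by = step)  # window start dates

roll_sketch <- do.call(rbind, lapply(starts, function(t0) {
  sub <- df[df$t >= t0 & df$t <= t0 + window - 1L, ]          # subset one window
  b_i <- sapply(split(sub, sub$id),
                function(d) coef(lm(y ~ x, data = d))[["x"]]) # unit slopes
  data.frame(window_start = t0, window_end = t0 + window - 1L,
             estimate = mean(b_i),
             std.error = sd(b_i) / sqrt(length(b_i)))
}))
roll_sketch
```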

Simulated Dynamic Panel Dataset

Description

A synthetic panel dataset generated from the dynamic common correlated effects DGP of Chudik and Pesaran (2015), equation (1). Generated with N = 30 cross-sectional units and T = 50 time periods. The true mean group parameters are: autoregressive coefficient 0.50, slope on x 1.00. Factor structure uses a single AR(1) common factor with persistence 0.60.

Usage

dcce_sim

Format

A data frame with 1500 rows and 4 columns:

unit: Cross-sectional unit identifier (integer, 1–30)
time: Time period (integer, 1–50)
y: Simulated dependent variable
x: Simulated regressor (cross-sectionally dependent via common factor)

Details

The true parameter values are stored in the companion object dcce_sim_truth and can be used in tests to verify estimator consistency.

References

Chudik, A. and Pesaran, M. H. (2015). Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors. Journal of Econometrics, 188(2), 393–420.

True Parameters for the Simulated Panel Dataset

Description

A named list containing the true mean group parameter values used to generate dcce_sim. Use in tests to verify that dcce() recovers parameters close to their true values.

Usage

dcce_sim_truth

Format

A named list with elements:

beta1_mg: True mean group autoregressive coefficient (~0.50)
beta2_mg: True mean group slope on x (~1.00)
N: Number of cross-sectional units (30)
T: Number of time periods (50)
seed: Random seed used for generation (20240101)

Automatic Diagnostic Workflow for Panel Data with CSD

Description

Runs the recommended pre-estimation diagnostic sequence on a panel regression specification and returns a structured report with a suggested dcce() call. The workflow performs the six steps listed in Details.

Usage

dcce_workflow(
  data,
  unit_index,
  time_index,
  formula,
  max_cr_lags = NULL,
  significance = 0.05,
  verbose = TRUE,
  n_bootstrap = 0L
)

Arguments

data

A panel data.frame.

unit_index

Character: unit identifier column.

time_index

Character: time identifier column.

formula

Two-sided formula (levels, not differences).

max_cr_lags

Integer: maximum CSA lag order to evaluate. Default NULL uses \lfloor T^{1/3} \rfloor.

significance

Numeric: significance level used for decisions. Default 0.05.

verbose

Logical: print progress as each step runs. Default TRUE.

n_bootstrap

Integer: bootstrap replications for the Westerlund p-values. Default 0L (asymptotic).

Details

1. Panel summary (N, T, balance).
2. Pesaran (2007) CIPS panel unit root test on each variable.
3. Pesaran CD test on raw residuals (pooled OLS) to check for cross-sectional dependence.
4. Westerlund (2007) cointegration test, if at least one variable is non-stationary.
5. Rank condition classifier (De Vos et al. 2024) for a baseline static CCE fit.
6. Information criterion selection of the optimal CSA lag order.

Based on the results, the function chooses a recommended estimator from c("mg", "cce", "dcce", "pmg", "csardl") and returns a printable suggested dcce() call.

Value

An object of class dcce_workflow with elements panel_summary, unit_root, csd_premodel, cointegration, rank_condition, optimal_cr_lags, recommendation, and call.

Examples

data(pwt8)
wf <- dcce_workflow(
  data       = pwt8,
  unit_index = "country",
  time_index = "year",
  formula    = log_rgdpo ~ log_hc + log_ck + log_ngd,
  verbose    = FALSE
)
print(wf)

Extract fitted values from a dcce_fit object

Description

Extract fitted values from a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
fitted(object, ...)

Arguments

object

A dcce_fit object.

...

Ignored.

Value

A numeric vector of fitted values (y-hat), or NULL if not available.

Glance at a dcce_fit object

Description

Returns a single-row tibble of model summary statistics, compatible with the broom package.

Usage

## S3 method for class 'dcce_fit'
glance(x, ...)

Arguments

x

A dcce_fit object.

...

Ignored.

Value

A single-row tibble.

Dumitrescu-Hurlin Panel Granger Causality Test

Description

Tests whether x Granger-causes y in a heterogeneous panel, following Dumitrescu & Hurlin (2012). For each unit i a bivariate VAR-type regression is fitted:

y_{it} = \alpha_i + \sum_{k=1}^{K} \gamma_{ik} y_{i,t-k} + \sum_{k=1}^{K} \beta_{ik} x_{i,t-k} + u_{it},

and the individual Wald statistic for H_0^{(i)}: \beta_{i1} = \cdots = \beta_{iK} = 0 is computed.

Usage

granger_test(data, unit_index, time_index, y, x, lags = 1L)

Arguments

data

A panel data.frame.

unit_index

Character: unit identifier column.

time_index

Character: time identifier column.

y

Character: name of the dependent variable.

x

Character: name of the potential Granger-causing variable.

lags

Integer: lag order K for both variables. Default 1.

Details

The panel statistics are:

W-bar: The cross-sectional average of the N unit-level Wald statistics.
Z-bar: Standardised version: \bar{Z} = \sqrt{N/(2K)} (\bar{W} - K) \xrightarrow{d} \mathcal{N}(0,1), where \bar{W} is W-bar.
Z-bar tilde: Small-sample adjusted version using E[W_i] and Var(W_i) from the F(K, T-3K-1) distribution.

Value

An object of class dcce_granger with elements:

W_bar: Mean Wald statistic across units.
Z_bar: Standardised statistic (large-T).
Z_bar_tilde: Small-sample adjusted statistic.
p_value_Z: p-value for Z-bar.
p_value_Zt: p-value for Z-bar tilde.
unit_wald: Named vector of per-unit Wald statistics.
N: Number of units used.
T_bar: Average time-series length.
lags: Lag order.

References

Dumitrescu, E.-I., & Hurlin, C. (2012). Testing for Granger non-causality in heterogeneous panels. Economic Modelling, 29(4), 1450–1460.

Examples

data(pwt8)
gc <- granger_test(
  data = pwt8, unit_index = "country", time_index = "year",
  y = "d_log_rgdpo", x = "log_ck", lags = 1
)
print(gc)
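
For K = 1 the W-bar and Z-bar statistics can be reproduced by hand, since the unit-level Wald statistic is then the squared t-statistic on the lagged x. The sketch below uses simulated data in which x does Granger-cause y; the design is an assumption for illustration and the small-sample adjusted statistic is not computed.

```r
set.seed(8)
N <- 20; TT <- 40; K <- 1

wald <- sapply(seq_len(N), function(i) {
  x <- rnorm(TT); y <- numeric(TT)
  for (t in 2:TT) y[t] <- 0.3 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)
  fit <- lm(y[-1] ~ y[-TT] + x[-TT])     # y_t on y_{t-1} and x_{t-1}
  coef(summary(fit))[3, "t value"]^2     # Wald statistic for K = 1
})

W_bar <- mean(wald)                      # average of the unit Wald statistics
Z_bar <- sqrt(N / (2 * K)) * (W_bar - K) # standardised statistic
c(W_bar = W_bar, Z_bar = Z_bar)
```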

Hausman-type Test: MG vs Pooled

Description

Tests the null that the pooled and MG estimators are both consistent (i.e. slopes are homogeneous and the pooled estimator is efficient) against the alternative that slopes are heterogeneous and only MG is consistent. The test statistic is

H = (\hat\beta_{MG} - \hat\beta_{pool})' [V_{MG} - V_{pool}]^{-1} (\hat\beta_{MG} - \hat\beta_{pool}),

which is asymptotically \chi^2_k under the null, where k is the number of coefficients compared.

Usage

hausman_test(object)

Arguments

object

A dcce_fit object.

Value

An object of class dcce_hausman with the test statistic, degrees of freedom, and p-value.
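
A worked example of the statistic with made-up estimates and covariance matrices (none of these numbers come from a fitted model):

```r
# Made-up inputs for illustration.
b_mg   <- c(x1 = 0.60, x2 = -0.20)
b_pool <- c(x1 = 0.50, x2 =  0.00)
V_mg   <- diag(c(0.03, 0.05))
V_pool <- diag(c(0.02, 0.01))

d  <- b_mg - b_pool                                  # difference in estimates
H  <- drop(t(d) %*% solve(V_mg - V_pool) %*% d)      # quadratic form
df <- length(d)                                      # number of coefficients
p_value <- pchisq(H, df = df, lower.tail = FALSE)
c(H = H, df = df, p_value = p_value)
```

In finite samples the difference V_{MG} - V_{pool} need not be positive definite, in which case the quadratic form can be negative or solve() can fail; how the package handles that case is not documented here.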

Information Criteria for CSA Selection

Description

Implements the Margaritella & Westerlund (2023) information and panel criteria for selecting the number of cross-sectional averages.

Usage

ic(object, models = NULL)

Arguments

object

A dcce_fit object.

models

Optional list of dcce_fit objects for PC criteria comparison.

Value

An object of class dcce_ic with IC1, IC2, PC1, PC2.

Interactive Fixed Effects (IFE) Estimator

Description

Implements the iterative principal-components estimator of Bai (2009) for panel data with interactive fixed effects:

y_{it} = \alpha_i + \beta' x_{it} + \lambda_i' f_t + u_{it},

where f_t is an r-vector of unobserved common factors and \lambda_i is unit i's vector of factor loadings. Unlike CCE (which proxies f_t via cross-sectional averages), IFE estimates f_t and \lambda_i directly via principal components on the residual matrix.

Details

The algorithm iterates between:

Given (\hat\beta, \hat\alpha_i), compute residuals \hat e_{it} and extract the first r principal components as \hat f_t, with loadings \hat\lambda_i.
Given (\hat f_t, \hat\lambda_i), re-estimate \hat\beta and \hat\alpha_i by pooled OLS on the defactored data.

until convergence. The number of factors r can be specified or selected by information criteria (BIC3 of Bai & Ng 2002).

Important: IFE estimates a common (pooled) \beta, not heterogeneous unit-specific slopes. If you need heterogeneous slopes, use CCE/DCCE instead.
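
The iteration can be sketched for a single regressor and r = 1 as follows. Unit intercepts are removed by within-demeaning, the factor component is extracted with a truncated SVD of the residual matrix, and the pooled slope is updated on the defactored data. The simulated design, tolerance, and iteration cap are assumptions of the sketch, not the package defaults.

```r
set.seed(3)
N <- 30; TT <- 40; r <- 1; beta_true <- 1
f      <- rnorm(TT)
lam    <- rnorm(N, mean = 1)
common <- outer(lam, f)                          # N x T factor component
X <- 0.5 * common + matrix(rnorm(N * TT), N, TT) # regressor correlated with it
Y <- beta_true * X + common + matrix(rnorm(N * TT, sd = 0.5), N, TT)

Yd <- Y - rowMeans(Y)                            # remove unit intercepts
Xd <- X - rowMeans(X)

beta <- sum(Xd * Yd) / sum(Xd^2)                 # pooled OLS starting value
beta_pooled <- beta
for (iter in 1:500) {
  E <- Yd - beta * Xd                            # residuals given beta
  s <- svd(E, nu = r, nv = r)                    # first r principal components
  C_hat <- s$u %*% diag(s$d[1:r], r) %*% t(s$v)  # fitted factor component
  beta_new <- sum(Xd * (Yd - C_hat)) / sum(Xd^2) # pooled OLS on defactored data
  done <- abs(beta_new - beta) < 1e-8
  beta <- beta_new
  if (done) break
}
c(pooled = beta_pooled, ife = beta, truth = beta_true)
```

Because the regressor loads on the same factor as the error, the pooled OLS start is biased, while the iterated estimate should move towards the true slope.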

Impulse Response Functions for Dynamic Panel Models

Description

Computes impulse response functions (IRFs) from a fitted dynamic panel model (DCCE, CS-ARDL, or PMG). The IRF traces the response of the dependent variable to a one-unit shock in a specific regressor over a given number of horizons, using the Mean Group ARDL coefficient estimates and the autoregressive lag polynomial.

Usage

irf(object, impulse, horizon = 10L, boot_reps = 200L, seed = NULL)

Arguments

object

A dcce_fit object from a dynamic model (must contain at least one L(y, k) term).

impulse

Character: the regressor that receives the shock.

horizon

Integer: number of periods to trace. Default 10.

boot_reps

Integer: bootstrap replications for confidence bands. 0 = no bands. Default 200.

seed

Integer: random seed.

Details

Confidence bands are computed via the cross-section bootstrap: units are resampled with replacement, the MG coefficients are recomputed, and the IRF is re-traced. The 2.5 and 97.5 percent quantiles of the bootstrap distribution form the 95 percent band.

Value

An object of class dcce_irf containing the impulse response path ($irf), optional bootstrap lower/upper bands, and metadata (impulse variable name and horizon length).

Examples

data(dcce_sim)
fit <- dcce(
  data = dcce_sim, unit_index = "unit", time_index = "time",
  formula = y ~ L(y, 1) + x,
  model = "dcce", cross_section_vars = ~ ., cross_section_lags = 3
)
ir <- irf(fit, impulse = "x", horizon = 10, boot_reps = 0)
print(ir)
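
The point path of such an IRF follows from a simple recursion on the ARDL coefficients. The sketch below treats the shock as a one-time unit impulse in the regressor, which is an assumption about the shock definition; irf_sketch is a hypothetical helper and no confidence bands are computed.

```r
# Hypothetical helper: response of y to a one-time unit impulse in x.
# phi:  coefficients on y_{t-1}, y_{t-2}, ...
# beta: coefficients on x_t, x_{t-1}, ...
irf_sketch <- function(phi, beta, horizon = 10L) {
  out <- numeric(horizon + 1L)                  # horizons 0, 1, ..., horizon
  for (h in 0:horizon) {
    direct <- if (h < length(beta)) beta[h + 1L] else 0    # distributed-lag term
    p <- seq_len(min(h, length(phi)))
    out[h + 1L] <- direct + sum(phi[p] * out[h + 1L - p])  # autoregressive part
  }
  out
}

path <- irf_sketch(phi = 0.5, beta = 1, horizon = 3)
path
```

With one autoregressive coefficient of 0.5 and an impact coefficient of 1, the response halves each period.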

marginaleffects Compatibility for dcce_fit Objects

Description

Provides S3 methods on the marginaleffects internal generics get_coef, get_vcov, get_predict, and find_predictors so that marginaleffects::avg_slopes(), marginaleffects::avg_predictions(), and marginaleffects::hypotheses() work directly on dcce_fit objects.

Details

The methods are registered dynamically in .onLoad() if the marginaleffects package is available; this keeps the dependency in Suggests rather than Imports.

Mean Group Estimator Internals

Description

Internal functions for the Pesaran & Smith (1995) Mean Group estimator.

Pedroni and Kao Panel Cointegration Tests

Description

Implements simplified versions of the Pedroni (1999, 2004) and Kao (1999) residual-based panel cointegration tests.

Usage

panel_coint_test(
  data,
  unit_index,
  time_index,
  formula,
  test = c("pedroni", "kao"),
  lags = 1L
)

Arguments

data

A panel data.frame.

unit_index

Character: unit identifier column.

time_index

Character: time identifier column.

formula

Two-sided formula in levels: y ~ x1 + x2.

test

Character: "pedroni" or "kao".

lags

Integer: ADF lag order for the residual regression. Default 1.

Details

Pedroni: Fits unit-level OLS regressions of y on x_1, \ldots, x_K, extracts the residuals, runs an ADF regression on each unit's residual series, and averages the t-statistics. Reports the group-mean t-statistic (group_t) and the group-mean rho-statistic (group_rho).

Kao: Follows the same residual-based approach but pools the residuals rather than averaging unit-level statistics. Reports a single ADF t-statistic on the demeaned pooled residuals.

Both tests take no cointegration as the null hypothesis. A large negative statistic rejects the null.
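
The group-mean logic of the Pedroni variant can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's code: the panel is simulated, adf_t() is a hypothetical helper, the ADF regression has no deterministic terms, and the averaged t-statistic is left unstandardised.

```r
set.seed(1)
N <- 10; TT <- 60

# ADF t-statistic on the lagged level of a residual series e (no deterministics)
adf_t <- function(e, lags = 1L) {
  de    <- diff(e)
  n     <- length(de)
  Z     <- embed(de, lags + 1L)      # columns: de_t, de_{t-1}, ..., de_{t-lags}
  e_lag <- e[(lags + 1L):n]          # e_{t-1}, aligned with de_t
  X     <- cbind(e_lag, Z[, -1, drop = FALSE])
  fit   <- lm(Z[, 1] ~ 0 + X)
  coef(summary(fit))[1, "t value"]   # t-statistic on e_{t-1}
}

t_i <- vapply(seq_len(N), function(i) {
  x <- cumsum(rnorm(TT))             # I(1) regressor
  y <- 1 + 2 * x + rnorm(TT)         # cointegrated with x by construction
  adf_t(residuals(lm(y ~ x)), lags = 1L)
}, numeric(1))

group_t <- mean(t_i)                 # unstandardised group-mean t-statistic
```

Because the simulated series are cointegrated by construction, the unit-level statistics are strongly negative, which is the direction in which the null is rejected.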

Value

An object of class dcce_cointegration_extra with elements test, statistics (a tibble), N, T_bar, lags.

References

Pedroni, P. (1999). Critical values for cointegration tests in heterogeneous panels with multiple regressors. Oxford Bulletin of Economics and Statistics, 61(S1), 653–670.

Pedroni, P. (2004). Panel cointegration: asymptotic and finite sample properties of pooled time series tests with an application to the PPP hypothesis. Econometric Theory, 20(3), 597–625.

Kao, C. (1999). Spurious regression and residual-based tests for cointegration in panel data. Journal of Econometrics, 90(1), 1–44.

Examples

data(pwt8)
panel_coint_test(
  data = pwt8, unit_index = "country", time_index = "year",
  formula = log_rgdpo ~ log_hc + log_ck,
  test = "pedroni", lags = 1
)

IPS and LLC Panel Unit Root Tests

Description

Implements the Im, Pesaran & Shin (2003) IPS t-bar test and the Levin, Lin & Chu (2002) LLC common-root test for panel unit roots. Neither test corrects for cross-sectional dependence; use cips_test for CSD-robust testing.

Usage

panel_ur_test(x, ...)

## S3 method for class 'matrix'
panel_ur_test(x, ..., test = c("ips", "llc"), lags = 0L, trend = FALSE)

## Default S3 method:
panel_ur_test(
  x,
  ...,
  data = NULL,
  unit_index = NULL,
  time_index = NULL,
  test = c("ips", "llc"),
  lags = 0L,
  trend = FALSE
)

Arguments

x

A numeric matrix (N x T), data.frame, or vector.

...

Additional arguments.

test

Character: "ips" or "llc".

lags

Integer: ADF lag order. Default 0.

trend

Logical: include a time trend. Default FALSE.

data, unit_index, time_index

Panel structure (when x is a vector).

Details

IPS: fits unit-level ADF regressions, averages the t-statistics, and standardises using tabulated moments from Im et al. (2003, Table 2).

LLC: fits unit-level ADF regressions to construct the adjusted series, then takes the t-statistic from a pooled regression of adjusted \Delta y on adjusted y_{t-1}; the statistic is standardised so that it is asymptotically N(0,1) under the null.
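
For reference, the IPS standardisation can be written out as below. The notation follows Im et al. (2003) and is not taken from the package source: t_{iT}(p_i) is the ADF t-statistic for unit i with lag order p_i, and the expectation and variance are the tabulated moments under the unit-root null.

```latex
\bar{t}_{NT} = \frac{1}{N}\sum_{i=1}^{N} t_{iT}(p_i), \qquad
W_{\bar t} = \frac{\sqrt{N}\,\bigl(\bar{t}_{NT} - N^{-1}\sum_{i=1}^{N} E[t_{iT}(p_i)]\bigr)}
                  {\sqrt{N^{-1}\sum_{i=1}^{N} \mathrm{Var}[t_{iT}(p_i)]}}
\;\xrightarrow{d}\; N(0,1).
```

Large negative values of W_{\bar t} reject the null that every unit has a unit root.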

Value

An object of class dcce_unit_root with elements test, statistic, p_value, N, T, lags, trend.

References

Im, K. S., Pesaran, M. H., & Shin, Y. (2003). Testing for unit roots in heterogeneous panels. Journal of Econometrics, 115(1), 53–74.

Levin, A., Lin, C.-F., & Chu, C.-S. J. (2002). Unit root tests in panel data: asymptotic and finite-sample properties. Journal of Econometrics, 108(1), 1–24.

Examples

set.seed(1)
X <- matrix(cumsum(rnorm(20 * 30)), 20, 30)
panel_ur_test(X, test = "ips")
panel_ur_test(X, test = "llc")

Panel Data Utilities

Description

Internal functions for constructing, validating, and manipulating panel data. Exported formula helpers L(), D(), and Lrange() allow lag/difference operations inside dcce() formulas.

Cross-Sectional Dependence Tests

Description

Computes cross-sectional dependence test statistics for panel data residuals. Implements the Pesaran (2015) CD test, CDw (Juodis & Reese 2022), CDw+ (Baltagi, Feng & Kao 2012), PEA (Fan et al. 2015), and CD* (Pesaran & Xie 2021).

Usage

pcd_test(x, ...)

## S3 method for class 'dcce_fit'
pcd_test(
  x,
  ...,
  test = c("pesaran", "cdw", "cdwplus", "pea", "cdstar"),
  n_reps = 500L,
  n_pca = 1L
)

## S3 method for class 'data.frame'
pcd_test(
  x,
  ...,
  unit_index = NULL,
  time_index = NULL,
  test = c("pesaran", "cdw", "pea", "cdstar"),
  n_reps = 500L,
  n_pca = 1L
)

## S3 method for class 'matrix'
pcd_test(
  x,
  ...,
  test = c("pesaran", "cdw", "pea", "cdstar"),
  n_reps = 500L,
  n_pca = 1L
)

## Default S3 method:
pcd_test(
  x,
  ...,
  data = NULL,
  unit_index = NULL,
  time_index = NULL,
  test = c("pesaran", "cdw", "cdwplus", "pea", "cdstar"),
  n_reps = 500L,
  n_pca = 1L
)

Arguments

x

One of: a numeric vector of residuals (stacked by unit), a dcce_fit object, a numeric matrix (N x T) of residuals, or a data.frame containing the panel structure.

...

Arguments passed to methods.

test

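Character vector specifying which tests to compute: one or more of "pesaran", "cdw", "cdwplus", "pea", "cdstar". The default computes every test listed in the method's signature: all five for the dcce_fit and default methods, while the data.frame and matrix signatures shown under Usage omit "cdwplus". See Pesaran (2015), Juodis & Reese (2022), Baltagi, Feng & Kao (2012), Fan et al. (2015), Pesaran & Xie (2021).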

n_reps

Integer: number of Rademacher draws for CDw. Default 500.

n_pca

Integer: number of principal components for CD* bias correction. Default 1.

unit_index

Character: name of the unit variable in data.

time_index

Character: name of the time variable in data.

data

A data.frame containing the panel structure. Required if x is a vector.

Value

An object of class dcce_cd containing:

statistics: A data.frame with columns test, statistic, p_value.
N: Number of cross-sectional units.
T_bar: Average time dimension.
rho_ij: Matrix of pairwise correlations (if retained).

Examples

set.seed(42)
df <- data.frame(
  id = rep(1:10, each = 20),
  t  = rep(1:20, 10),
  e  = rnorm(200)
)
cd <- pcd_test(df$e, data = df, unit_index = "id", time_index = "t",
               test = "pesaran")
print(cd)
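
For intuition, the Pesaran CD statistic for a balanced N x T residual matrix can be computed by hand as sqrt(2T / (N(N-1))) times the sum of the N(N-1)/2 pairwise correlations. The sketch below uses simulated residuals and base R only; it is an illustration of the textbook formula, and the package's own handling of unbalanced panels and of the other tests is not reproduced here.

```r
set.seed(42)
N <- 10; TT <- 20
E <- matrix(rnorm(N * TT), N, TT)    # N x T matrix of (simulated) residuals

R   <- cor(t(E))                     # N x N matrix of pairwise correlations
rho <- R[upper.tri(R)]               # the N(N-1)/2 distinct pairs

cd_stat <- sqrt(2 * TT / (N * (N - 1))) * sum(rho)
p_value <- 2 * pnorm(-abs(cd_stat))  # two-sided N(0,1) p-value
```

With independent residuals, as simulated here, cd_stat behaves approximately like a standard normal draw.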

Plot method for dcce_fit objects

Description

Produces coefficient distribution plots (one histogram per regressor) showing the unit-level estimates and the MG mean.

Usage

## S3 method for class 'dcce_fit'
plot(x, which = c("coef", "resid"), ...)

Arguments

x

A dcce_fit object.

which

Character: "coef" (default) for coefficient histograms, or "resid" for a residuals-versus-time plot by unit.

...

Passed to the underlying graphics call.

Value

Invisibly returns x.

Plot a dcce_irf object

Description

Plot a dcce_irf object

Usage

## S3 method for class 'dcce_irf'
plot(x, ...)

Arguments

x

A dcce_irf object.

...

Passed to plot().

Value

Invisibly returns x.

Plot a dcce_rolling coefficient path

Description

Produces one line per regressor showing the rolling-window coefficient against the window end-date, with a 95% confidence ribbon built from the unit-loop standard errors.

Usage

## S3 method for class 'dcce_rolling'
plot(x, terms = NULL, ...)

Arguments

x

A dcce_rolling object.

terms

Character vector of terms to plot. Default NULL plots all non-intercept terms.

...

Passed to the underlying graphics call.

Value

Invisibly returns x.

Pooled Mean Group (PMG) Estimator Internals

Description

Internal functions for the Pooled Mean Group (PMG) estimator of Pesaran, Shin & Smith (1999). PMG imposes common long-run coefficients across units while letting the speed of adjustment and short-run dynamics remain heterogeneous.

Details

The model is the same ARDL(p_y, p_x) as CS-ARDL, but the long-run coefficient vector \theta is pooled. This implementation uses a two-step inverse-variance weighted pooling of the unit-level long-run coefficients obtained from the CS-ARDL fit, a simplification of the concentrated maximum-likelihood estimator of Pesaran, Shin & Smith (1999) that is fast, consistent, and requires no numerical optimisation.
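
For a single long-run coefficient, inverse-variance weighted pooling reduces to a weighted mean with weights 1 / se_i^2. The sketch below uses made-up unit-level estimates theta_i and standard errors se_i purely for illustration; the package's internal implementation (for example its treatment of coefficient vectors) may differ.

```r
# Hypothetical unit-level long-run coefficients and their standard errors
theta_i <- c(0.80, 1.10, 0.95, 1.30)
se_i    <- c(0.10, 0.25, 0.15, 0.40)

w         <- 1 / se_i^2                   # inverse-variance weights
theta_pmg <- sum(w * theta_i) / sum(w)    # pooled long-run coefficient
se_pmg    <- sqrt(1 / sum(w))             # standard error of the pooled estimate
```

Precisely estimated units dominate the pooled value, and the pooled standard error is smaller than any single unit's standard error.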

Predict from a dcce_fit object

Description

Predict from a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
predict(object, newdata = NULL, type = c("response", "xb"), ...)

Arguments

object

A dcce_fit object.

newdata

Optional data.frame with new observations. When supplied, predictions are computed using the Mean Group coefficients on the structural regressors (CSAs are not applied because they depend on the full cross-section). When NULL, in-sample unit-level fitted values are returned.

type

Character: "response" (default) returns predicted y; "xb" is an alias for response (kept for compatibility with marginaleffects).

...

Ignored.

Value

A numeric vector of predictions.

Print method for dcce_boot objects

Description

Print method for dcce_boot objects

Usage

## S3 method for class 'dcce_boot'
print(x, ...)

Arguments

x

A dcce_boot object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_break object

Description

Print a dcce_break object

Usage

## S3 method for class 'dcce_break'
print(x, ...)

Arguments

x

A dcce_break object.

...

Ignored.

Value

Invisibly returns x.

Print method for dcce_cd objects

Description

Print method for dcce_cd objects

Usage

## S3 method for class 'dcce_cd'
print(x, ...)

Arguments

x

A dcce_cd object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_cips object

Description

Print a dcce_cips object

Usage

## S3 method for class 'dcce_cips'
print(x, ...)

Arguments

x

A dcce_cips object.

...

Ignored.

Value

Invisibly returns x.

Print method for dcce_cointegration

Description

Print method for dcce_cointegration

Usage

## S3 method for class 'dcce_cointegration'
print(x, ...)

Arguments

x

A dcce_cointegration object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_cointegration_extra object

Description

Print a dcce_cointegration_extra object

Usage

## S3 method for class 'dcce_cointegration_extra'
print(x, ...)

Arguments

x

A dcce_cointegration_extra object.

...

Ignored.

Value

Invisibly returns x.

Print method for dcce_csd objects

Description

Print method for dcce_csd objects

Usage

## S3 method for class 'dcce_csd'
print(x, ...)

Arguments

x

A dcce_csd object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_fit object

Description

Print a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
print(x, ...)

Arguments

x

A dcce_fit object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_granger object

Description

Print a dcce_granger object

Usage

## S3 method for class 'dcce_granger'
print(x, ...)

Arguments

x

A dcce_granger object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_hausman object

Description

Print a dcce_hausman object

Usage

## S3 method for class 'dcce_hausman'
print(x, ...)

Arguments

x

A dcce_hausman object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_irf object

Description

Print a dcce_irf object

Usage

## S3 method for class 'dcce_irf'
print(x, ...)

Arguments

x

A dcce_irf object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_rolling object

Description

Print a dcce_rolling object

Usage

## S3 method for class 'dcce_rolling'
print(x, ...)

Arguments

x

A dcce_rolling object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_swamy object

Description

Print a dcce_swamy object

Usage

## S3 method for class 'dcce_swamy'
print(x, ...)

Arguments

x

A dcce_swamy object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_unit_root object

Description

Print a dcce_unit_root object

Usage

## S3 method for class 'dcce_unit_root'
print(x, ...)

Arguments

x

A dcce_unit_root object.

...

Ignored.

Value

Invisibly returns x.

Print a dcce_workflow object

Description

Print a dcce_workflow object

Usage

## S3 method for class 'dcce_workflow'
print(x, ...)

Arguments

x

A dcce_workflow object.

...

Ignored.

Value

Invisibly returns x.

Penn World Tables Growth Panel Dataset

Description

Panel dataset from Jan Ditzen's xtdcce2 Stata package, originally derived from the Penn World Tables 8. Used in Ditzen (2018, Stata Journal 18:3, 585–617) to illustrate MG, CCE, and DCCE estimation. Contains 93 countries observed from 1960 to 2007 (balanced).

Usage

pwt8

Format

A data frame with 4464 rows and 8 columns:

id: Country numeric identifier
year: Year (1960–2007)
log_rgdpo: Log real GDP (output-side)
log_hc: Log human capital index
log_ck: Log physical capital stock
log_ngd: Log of population growth plus depreciation
country: Country identifier (character)
d_log_rgdpo: First difference of log_rgdpo (GDP growth)

Source

Downloaded from https://github.com/JanDitzen/xtdcce2.

References

Ditzen, J. (2018). Estimating dynamic common-correlated effects in Stata. The Stata Journal, 18(3), 585–617.

Rank Condition Classifier

Description

Implements the De Vos et al. (2024) rank condition classifier for CCE estimators. Classifies whether the rank condition holds (RC = 1), which ensures consistency of the CCE estimator.

Usage

rank_condition(object, criterion = c("er", "gr"))

Arguments

object

A dcce_fit object.

criterion

Character: "er" or "gr" for factor number selection. Default "er".

Details

Note: Only valid for static panels. Emits a warning if called on dynamic or PMG models.

Value

An object of class dcce_rank with elements m (number of common factors), g (rank of average factor loadings), RC (1 if rank condition holds, 0 otherwise).

Regularized CCE Estimator Internals

Description

Internal functions for the regularized CCE (rCCE) estimator of Juodis (2022). Uses SVD of cross-sectional averages to extract principal components, with the number of components selected by Ahn-Horenstein ER/GR criteria.
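
The eigenvalue-ratio (ER) rule of Ahn and Horenstein picks the number of components k that maximises the ratio of the k-th to the (k+1)-th eigenvalue. The sketch below applies it to a simulated matrix Z standing in for the cross-sectional averages; the two-factor design, the variable names, and the choice of k_max are all assumptions made for illustration rather than details of the package internals.

```r
set.seed(10)
TT <- 100; p <- 6
F_mat  <- matrix(rnorm(TT * 2), TT, 2)               # two common factors
Lambda <- rbind(rep(1, p), rep(c(1, -1), p / 2))     # orthogonal loadings
Z <- F_mat %*% Lambda + 0.1 * matrix(rnorm(TT * p), TT, p)  # stand-in for CSAs

Zc <- scale(Z, scale = FALSE)                        # centre each column
ev <- svd(Zc)$d^2 / TT                               # eigenvalues, descending

k_max <- p - 1
er    <- ev[1:k_max] / ev[2:(k_max + 1)]             # ER(k) = ev_k / ev_{k+1}
k_hat <- which.max(er)                               # selected number of components
```

With two strong factors and weak noise, the ratio is expected to peak at k = 2.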

Extract residuals from a dcce_fit object

Description

Extract residuals from a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
residuals(object, ...)

Arguments

object

A dcce_fit object.

...

Ignored.

Value

A numeric vector.

Spatial Common Correlated Effects

Description

Helpers to replace the global cross-sectional averages of classical CCE with spatially-weighted local averages. Given a row-normalised spatial weight matrix W (N \times N, zero diagonal, rows summing to one), the cross-sectional average for unit i at time t becomes

\bar{y}_{it} = \sum_{j=1}^N w_{ij} y_{jt},

i.e. a weighted average over unit i's spatial neighbours rather than a global mean across the whole panel. This allows the common factor proxy to vary across units in a way that respects the spatial topology of the data (geographical contiguity, trade links, input-output connections, etc.).

Details

The weight matrix is supplied directly by the user; dcce makes no assumptions about how it was constructed. It must be square with dimensions equal to the number of units, and rows should ideally sum to one (the helper will row-normalise silently if they do not).
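
The construction of the spatially weighted averages can be sketched in a few lines of base R. The 4-unit contiguity matrix and the data matrix Y below are toy inputs chosen for illustration, and the code does not call any dcce helper.

```r
N <- 4; TT <- 3

# Hypothetical binary contiguity matrix with a zero diagonal
W <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 1,
              1, 1, 0, 1,
              0, 1, 1, 0), N, N, byrow = TRUE)
W <- W / rowSums(W)                  # row-normalise so each row sums to one

Y <- matrix(1:(N * TT), N, TT)       # Y[i, t]: value for unit i at time t

# Local cross-sectional averages: Y_bar[i, t] = sum_j W[i, j] * Y[j, t]
Y_bar <- W %*% Y
```

Unit 1 has neighbours 2 and 3, so Y_bar[1, 1] is the mean of Y[2, 1] and Y[3, 1], that is 2.5.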

Structural Break Tests for Panels with Cross-Sectional Dependence

Description

Tests for and estimates structural breaks in heterogeneous panel data models with cross-sectional dependence. Implements a Wald-type Chow test at a known break date, a sup-Wald test for an unknown break date (with trimmed candidate set), and a sequential procedure for multiple breaks following Bai & Perron (1998) adapted to panel CCE settings.

Usage

structural_break_test(
  data,
  unit_index,
  time_index,
  formula,
  model = "cce",
  type = c("unknown", "known"),
  break_date = NULL,
  trim = 0.15,
  n_breaks = 1L,
  test_terms = NULL,
  verbose = FALSE,
  ...
)

Arguments

data

A panel data.frame.

unit_index

Character: unit identifier column.

time_index

Character: time identifier column.

formula

Two-sided formula passed through to dcce().

model

Character: base estimator (default "cce").

type

Character: "known" for a Chow test at a specific break date, "unknown" for a sup-Wald test over a trimmed window. Default "unknown".

break_date

Required when type = "known". Must match the time-index type.

trim

Numeric: trimming fraction at each end of the sample for the unknown-break-date search. Default 0.15 (standard Andrews 1993 choice).

n_breaks

Integer: maximum number of breaks to search for using the sequential Bai-Perron procedure. Default 1L.

test_terms

Character vector: which coefficients to include in the Wald statistic. Default NULL tests all slope coefficients (excluding the intercept).

verbose

Logical: print progress during the candidate search. Default FALSE.

...

Additional arguments passed to dcce() (e.g. cross_section_vars, cross_section_lags).

Details

For each candidate break date \tau the function computes

W_N(\tau) = (\hat\beta_1(\tau) - \hat\beta_2(\tau))' [V_1(\tau) + V_2(\tau)]^{-1} (\hat\beta_1(\tau) - \hat\beta_2(\tau))

where \hat\beta_1 and \hat\beta_2 are the Mean Group coefficients from running the base estimator on the pre-break and post-break sub-samples respectively, and V_1, V_2 are the corresponding MG variances. For a known break date, the statistic is asymptotically \chi^2_k under the null of no break, where k is the number of tested coefficients.

The unknown-break-date sup-Wald test reports \sup_{\tau \in [\tau_1, \tau_2]} W_N(\tau), where [\tau_1, \tau_2] is the interior of the sample after applying the trimming parameter. Approximate p-values use the Andrews (1993) large-sample distribution.
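
For a single candidate date, the Wald statistic above is a quadratic form in the difference between the two sub-sample estimates. The sketch below evaluates it for made-up coefficient vectors and variance matrices (b1, b2, V1, V2 are illustrative numbers, not package output) and attaches the chi-square p-value that applies when the break date is known.

```r
# Hypothetical pre- and post-break Mean Group estimates and their variances
b1 <- c(0.5, 1.0); V1 <- diag(c(0.01, 0.02))
b2 <- c(0.8, 0.9); V2 <- diag(c(0.01, 0.02))

d    <- b1 - b2
wald <- drop(t(d) %*% solve(V1 + V2) %*% d)   # 0.09/0.02 + 0.01/0.04 = 4.75
k    <- length(d)                             # number of tested coefficients
p_known <- pchisq(wald, df = k, lower.tail = FALSE)
```

For an unknown break date the same quantity would be computed at every candidate date in the trimmed window and the maximum taken; its p-value then comes from the Andrews (1993) distribution rather than from the chi-square.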

Value

An object of class dcce_break with elements:

type: "known" or "unknown".
statistic: The Wald (or sup-Wald) test statistic.
p_value: Approximate p-value.
df: Degrees of freedom of the Wald statistic.
break_date: Known or estimated break date.
candidates: For type = "unknown": a tibble with one row per candidate date (date, wald).
break_dates: Vector of break dates when n_breaks > 1L.
fit_pre: dcce_fit object from the pre-break sub-sample.
fit_post: dcce_fit object from the post-break sub-sample.
call: The original call.

References

Andrews, D. W. K. (1993). Tests for parameter instability and structural change with unknown change point. Econometrica, 61(4), 821–856.

Bai, J., & Perron, P. (1998). Estimating and testing linear models with multiple structural changes. Econometrica, 66(1), 47–78.

Ditzen, J., Karavias, Y., & Westerlund, J. (2024). Multiple structural breaks in interactive effects panel data models. Journal of Applied Econometrics.

Examples

data(pwt8)
brk <- structural_break_test(
  data       = pwt8,
  unit_index = "country",
  time_index = "year",
  formula    = d_log_rgdpo ~ log_hc + log_ck + log_ngd,
  model      = "mg",
  cross_section_vars = NULL,
  type       = "unknown",
  trim       = 0.20
)
print(brk)

Summary for a dcce_fit object

Description

Summary for a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
summary(object, ...)

Arguments

object

A dcce_fit object.

...

Ignored.

Value

Invisibly returns object.

Swamy Slope Heterogeneity Test

Description

Tests the null hypothesis that all slope coefficients are identical across cross-sectional units, against the alternative of heterogeneous slopes. Implements the Swamy (1970) chi-square test and the Pesaran & Yamagata (2008) standardised dispersion statistic.

Usage

swamy_test(object)

Arguments

object

A dcce_fit object (typically with model = "mg", "cce", or "dcce").

Details

The Swamy statistic is

\tilde{S} = \sum_{i=1}^N (\hat\beta_i - \hat\beta^*)' \frac{X_i' M_\tau X_i}{\hat\sigma_i^2} (\hat\beta_i - \hat\beta^*),

where \hat\beta_i is the unit-level OLS estimate, \hat\beta^* is the weighted pooled estimate, and M_\tau = I - \tau(\tau'\tau)^{-1}\tau' projects off the intercept. Under H_0 (homogeneous slopes), \tilde{S} is asymptotically \chi^2_{k(N-1)}.

Pesaran & Yamagata (2008) propose a standardised version that is asymptotically standard normal:

\tilde\Delta = \sqrt{N} \, \frac{N^{-1}\tilde{S} - k}{\sqrt{2k}}.
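
Both statistics can be reproduced by hand on a small simulated panel. The sketch below is a base-R illustration under stated assumptions: the data are simulated with homogeneous slopes, the intercept is projected out by demeaning, and \hat\sigma_i^2 is taken from the unit-level OLS residuals with T - k - 1 degrees of freedom, which may differ from the variance estimator used inside swamy_test().

```r
set.seed(7)
N <- 20; TT <- 40; k <- 2

b_i   <- matrix(NA_real_, N, k)      # unit-level OLS slopes
Q     <- vector("list", N)           # X_i' M X_i / sigma_i^2
A     <- matrix(0, k, k)
b_num <- numeric(k)

for (i in seq_len(N)) {
  X  <- matrix(rnorm(TT * k), TT, k)
  y  <- 1 + X %*% c(1, -0.5) + rnorm(TT)     # homogeneous slopes (H0 true)
  Xd <- scale(X, scale = FALSE)              # M_tau X_i: project off intercept
  yd <- y - mean(y)
  b  <- solve(crossprod(Xd), crossprod(Xd, yd))
  s2 <- sum((yd - Xd %*% b)^2) / (TT - k - 1)
  Q[[i]]   <- crossprod(Xd) / s2
  b_i[i, ] <- b
  A        <- A + Q[[i]]
  b_num    <- b_num + crossprod(Xd, yd) / s2
}

b_star <- solve(A, b_num)                    # weighted pooled estimate

S_stat <- sum(vapply(seq_len(N), function(i) {
  d <- b_i[i, ] - b_star
  drop(t(d) %*% Q[[i]] %*% d)
}, numeric(1)))

delta_stat <- sqrt(N) * (S_stat / N - k) / sqrt(2 * k)
p_swamy    <- pchisq(S_stat, df = k * (N - 1), lower.tail = FALSE)
```

Since the simulated slopes are identical across units, S_stat should be of the same order as its degrees of freedom, k(N - 1).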

Value

An object of class dcce_swamy with elements S_stat, delta_stat, df, p_swamy, p_delta, N, k.

References

Swamy, P. A. V. B. (1970). Efficient inference in a random coefficient regression model. Econometrica, 38(2), 311–323.

Pesaran, M. H., & Yamagata, T. (2008). Testing slope homogeneity in large panels. Journal of Econometrics, 142(1), 50–93.

Tidy a dcce_fit object

Description

Returns a tibble of MG coefficients with standard errors, test statistics, p-values, and confidence intervals, compatible with the broom package. For long-run estimators (CS-ARDL, CS-DL, PMG) the tibble additionally includes rows for long-run coefficients and the adjustment speed.

Usage

## S3 method for class 'dcce_fit'
tidy(x, include_lr = TRUE, ...)

Arguments

x

A dcce_fit object.

include_lr

Logical: include long-run and adjustment rows when available? Default TRUE.

...

Ignored.

Value

A tibble with columns term, estimate, std.error, statistic, p.value, conf.low, conf.high, type.

Update a dcce_fit object

Description

Update a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
update(object, formula. = NULL, ..., evaluate = TRUE)

Arguments

object

A dcce_fit object.

formula.

Optional replacement formula (use . to keep existing parts).

...

Additional arguments passed to dcce() to override the original call.

evaluate

Logical: evaluate the updated call? Default TRUE.

Value

An updated dcce_fit object (if evaluate = TRUE) or an unevaluated call.

Extract variance-covariance matrix from a dcce_fit object

Description

Extract variance-covariance matrix from a dcce_fit object

Usage

## S3 method for class 'dcce_fit'
vcov(object, ...)

Arguments

object

A dcce_fit object.

...

Ignored.

Value

A variance-covariance matrix.

Package {dcce}

dcce: Dynamic Common Correlated Effects Estimation for Panel Data

Apply absorb projection to a panel

Alternating projections for multi-factor demeaning

Within transform: demean numeric columns by a single factor

Resolve absorb argument to a list of grouping vectors

Ahn-Horenstein ER criterion

Ahn-Horenstein GR criterion

Attach the cumulative CDP to a panel as a new column

Compute the Common Dynamic Process from a pooled FD regression

Andrews (1993) sup-Wald critical values

Approximate p-value from Andrews critical values

BKP (2019) residual method for alpha

BKP (2016) variable method for alpha

Build cluster-level cross-sectional averages

Build cross-sectional averages

Build cross-sectional averages from a wider sample

Build spatial (local) cross-sectional averages

Internal: build a suggested dcce() call string