Introduction to Quantile-on-Quantile Regression

Overview

The QuantileOnQuantile package implements the Quantile-on-Quantile (QQ) regression methodology developed by Sim and Zhou (2015). This approach estimates the effect that quantiles of one variable have on quantiles of another, capturing the dependence between their distributions.

Why Quantile-on-Quantile Regression?

Traditional regression methods like OLS estimate the effect of independent variables on the conditional mean of the dependent variable. Quantile regression extends this by estimating effects on conditional quantiles. However, both approaches treat the independent variable as a single entity, ignoring the possibility that the relationship may vary depending on whether the independent variable takes extreme or moderate values.

The QQ approach addresses this limitation by:

Modeling the quantile of Y (the dependent variable) as the outcome
Examining the effect of different quantiles of X (the independent variable)
Producing coefficients indexed by both \(\theta\) (quantile of Y) and \(\tau\) (quantile of X)

This allows researchers to ask questions like:

Do large positive shocks in X affect Y differently than large negative shocks?
Does the relationship between X and Y depend on market conditions (bull vs bear)?
Is the dependence between variables stronger in the tails of their distributions?

The Original Application: Oil Prices and Stock Returns

Sim and Zhou (2015) applied this methodology to examine the relationship between oil price shocks and US stock returns. They found that:

Large negative oil price shocks (low \(\tau\)) can positively affect US equities when the stock market is performing well (high \(\theta\))
Positive oil price shocks have weak effects regardless of market conditions
The relationship is asymmetric and depends on both the nature of oil shocks and market state

Installation

# Install from CRAN (when available)
install.packages("QuantileOnQuantile")

# Install from GitHub
# install.packages("devtools")
devtools::install_github("merwanroudane/qq")

Quick Start

Basic Usage

library(QuantileOnQuantile)
#> QuantileOnQuantile v1.0.0
#> Based on: Sim & Zhou (2015) doi:10.1016/j.jbankfin.2015.01.013
#> Type ?QuantileOnQuantile for help, or vignette('introduction') for tutorial.

# Generate example data
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x * (x < 0) + rnorm(n, sd = 0.5)  # Asymmetric relationship

# Run QQ regression
result <- qq_regression(y, x, verbose = FALSE)

# Print summary
print(result)
#> 
#> Quantile-on-Quantile Regression Results
#> ========================================
#> 
#> Call:
#> qq_regression(y = y, x = x, verbose = FALSE)
#> 
#> Number of observations: 300 
#> Y quantiles: 19 levels
#> X quantiles: 19 levels
#> Total combinations: 361 
#> 
#> Coefficient Summary:
#>   Mean: 0.8204 
#>   Min:  0.6102 
#>   Max:  1.4603 
#> 
#> Significant at 0.05: 360 of 361
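
The asymmetry built into the simulated data can be checked independently of the package. The following base R sketch regenerates the same data and fits separate OLS lines to the negative and non-negative values of x. The names slope_neg and slope_pos are illustrative only. By construction the true slopes are 0.8 for x < 0 (0.5 + 0.3) and 0.5 otherwise, so the two estimates should land near those values.

```r
# Recreate the Quick Start data (same seed and formula as above)
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x * (x < 0) + rnorm(n, sd = 0.5)

# Separate OLS fits on each side of zero (illustrative check, not a QQ estimate)
slope_neg <- coef(lm(y ~ x, subset = x < 0))[["x"]]   # true slope: 0.8
slope_pos <- coef(lm(y ~ x, subset = x >= 0))[["x"]]  # true slope: 0.5

cat("Slope for x < 0: ", round(slope_neg, 3), "\n")
cat("Slope for x >= 0:", round(slope_pos, 3), "\n")
```

This split-sample comparison captures only heterogeneity in X. The QQ coefficients additionally vary across quantiles of Y.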

Summary Statistics

# Get detailed summary
summary(result)
#> 
#> Quantile-on-Quantile Regression Summary
#> =======================================
#> 
#> Data:
#>   Observations: 300 
#>   Y quantiles: 19 levels
#>   X quantiles: 19 levels
#>   Total combinations: 361 
#>   Complete results: 361 
#> 
#> Coefficient Statistics:
#>   Mean:   0.8204 
#>   Median: 0.7868 
#>   Min:    0.6102 
#>   Max:    1.4603 
#>   SD:     0.1375 
#> 
#> R-squared Statistics:
#>   Mean:   0.3289 
#>   Median: 0.3232 
#>   Min:    0.2083 
#>   Max:    0.527 
#> 
#> Significance:
#>   p < 0.05: 360 of 361 
#>   p < 0.01: 356 of 361

# Get statistics as data frame
stats <- qq_statistics(result)
print(stats)
#>                 Statistic    Value
#> 1        Mean Coefficient   0.8204
#> 2      Median Coefficient   0.7868
#> 3         Min Coefficient   0.6102
#> 4         Max Coefficient   1.4603
#> 5          SD Coefficient   0.1375
#> 6          Mean R-squared   0.3289
#> 7        Median R-squared   0.3232
#> 8           Min R-squared   0.2083
#> 9           Max R-squared   0.5270
#> 10 Significant (p < 0.05) 360.0000
#> 11          Total Results 361.0000

Visualization

The package provides several interactive visualization options using plotly.

3D Surface Plot

The 3D surface plot is the signature visualization of the QQ approach, showing how coefficients vary across both dimensions.

# Coefficient surface with MATLAB-style Jet colorscale
plot_qq_3d(result, type = "coefficient", colorscale = "Jet")

# R-squared surface with Viridis colorscale
plot_qq_3d(result, type = "rsquared", colorscale = "Viridis")

# P-value surface
plot_qq_3d(result, type = "pvalue", colorscale = "Plasma")

Available Color Scales

The package supports several color scales:

qq_colorscales()
#> 
#> Available Color Scales for QQ Regression Plots
#> ===============================================
#> 
#>   Jet        : MATLAB-style rainbow (blue -> cyan -> green -> yellow -> red)
#>   BlueRed    : Diverging scale (blue = low, red = high)
#>   Viridis    : Perceptually uniform, colorblind-friendly
#>   Plasma     : Perceptually uniform, high contrast
#> 
#> Usage:
#>   plot_qq_3d(result, colorscale = "Jet")
#>   plot_qq_heatmap(result, colorscale = "Viridis")

Jet: MATLAB-style rainbow (blue -> cyan -> green -> yellow -> red)
BlueRed: Diverging scale, useful for coefficients centered around zero
Viridis: Perceptually uniform, colorblind-friendly
Plasma: High contrast, perceptually uniform

Heatmaps

Heatmaps provide a 2D view of the results:

# Coefficient heatmap
plot_qq_heatmap(result, type = "coefficient", colorscale = "Viridis")

# R-squared heatmap
plot_qq_heatmap(result, type = "rsquared", colorscale = "Plasma")

# P-value heatmap
plot_qq_heatmap(result, type = "pvalue", colorscale = "Jet")

Contour Plot

Contour plots show level curves of the coefficient surface:

plot_qq_contour(result, colorscale = "Jet", show_labels = TRUE)

Quantile Correlation

The correlation heatmap shows the relationship between quantiles of both variables:

plot_qq_correlation(y, x, quantiles = seq(0.1, 0.9, by = 0.1))

Detailed Example: Simulating Asymmetric Relationships

Let’s create data that mimics the oil-stock relationship from the original paper:

set.seed(2015)
n <- 500

# Generate "oil shocks"
oil_shock <- rnorm(n)

# Generate "stock returns" with asymmetric response
# Thresholds are computed once, outside the loop
q_low <- quantile(oil_shock, 0.3)
q_high <- quantile(oil_shock, 0.7)

stock_return <- numeric(n)
for (i in seq_len(n)) {
  # Base return
  base_return <- 0.01

  # Asymmetric effect
  if (oil_shock[i] < q_low) {
    # Large negative oil shocks have positive effect
    effect <- -0.02 * oil_shock[i]
  } else if (oil_shock[i] > q_high) {
    # Large positive oil shocks have weak effect
    effect <- -0.005 * oil_shock[i]
  } else {
    # Moderate shocks have little effect
    effect <- -0.001 * oil_shock[i]
  }

  stock_return[i] <- base_return + effect + rnorm(1, sd = 0.04)
}

# Run QQ regression on a 9 x 9 grid (coarser than the 19 x 19 grid used in Quick Start)
result_oil <- qq_regression(
  y = stock_return, 
  x = oil_shock,
  y_quantiles = seq(0.1, 0.9, by = 0.1),
  x_quantiles = seq(0.1, 0.9, by = 0.1),
  verbose = FALSE
)

# Print summary
print(result_oil)
#> 
#> Quantile-on-Quantile Regression Results
#> ========================================
#> 
#> Call:
#> qq_regression(y = stock_return, x = oil_shock, y_quantiles = seq(0.1, 
#>     0.9, by = 0.1), x_quantiles = seq(0.1, 0.9, by = 0.1), verbose = FALSE)
#> 
#> Number of observations: 500 
#> Y quantiles: 9 levels
#> X quantiles: 9 levels
#> Total combinations: 81 
#> 
#> Coefficient Summary:
#>   Mean: -0.0234 
#>   Min:  -0.046 
#>   Max:  -0.0031 
#> 
#> Significant at 0.05: 69 of 81

Working with Results

Extracting Results

The results are stored as a data frame in the results element, with one row per combination of Y quantile and X quantile:

# Access raw results
head(result_oil$results)
#>   y_quantile x_quantile  coefficient  std_error    t_value    p_value
#> 1        0.1        0.1 -0.003145838 0.02413509 -0.1303429 0.89683985
#> 2        0.2        0.1 -0.031636163 0.02586211 -1.2232631 0.22720195
#> 3        0.3        0.1 -0.031180344 0.02435317 -1.2803401 0.20657833
#> 4        0.4        0.1 -0.023552074 0.02156900 -1.0919408 0.28030970
#> 5        0.5        0.1 -0.042512061 0.02129035 -1.9967764 0.05153707
#> 6        0.6        0.1 -0.037150726 0.01716210 -2.1646958 0.03541426
#>    r_squared              method
#> 1 0.00796267 quantile_regression
#> 2 0.02912363 quantile_regression
#> 3 0.03286816 quantile_regression
#> 4 0.04825456 quantile_regression
#> 5 0.05179744 quantile_regression
#> 6 0.07560317 quantile_regression
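
Because the results are an ordinary data frame, standard base R tools apply. The sketch below copies four columns of the six rows printed above into a stand-alone data frame (head_rows is an illustrative name) and keeps the rows significant at the 5% level. With the full object, the same subset() call could be applied to result_oil$results.

```r
# The six rows shown by head(), copied so the example runs on its own
head_rows <- data.frame(
  y_quantile  = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
  x_quantile  = rep(0.1, 6),
  coefficient = c(-0.003145838, -0.031636163, -0.031180344,
                  -0.023552074, -0.042512061, -0.037150726),
  p_value     = c(0.89683985, 0.22720195, 0.20657833,
                  0.28030970, 0.05153707, 0.03541426)
)

# Keep only the quantile pairs significant at the 5% level
significant <- subset(head_rows, p_value < 0.05)
print(significant)
```

Among these six rows only the pair with y_quantile 0.6 passes the threshold. The row for y_quantile 0.5 narrowly misses it with a p-value of about 0.052.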

# Convert to matrix format
coef_matrix <- qq_to_matrix(result_oil, type = "coefficient")
print(round(coef_matrix, 4))
#>         0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
#> 0.1 -0.0031 -0.0303 -0.0194 -0.0300 -0.0194 -0.0154 -0.0131 -0.0190 -0.0177
#> 0.2 -0.0316 -0.0373 -0.0283 -0.0265 -0.0178 -0.0127 -0.0127 -0.0168 -0.0131
#> 0.3 -0.0312 -0.0460 -0.0367 -0.0230 -0.0146 -0.0110 -0.0098 -0.0110 -0.0098
#> 0.4 -0.0236 -0.0422 -0.0326 -0.0303 -0.0168 -0.0168 -0.0144 -0.0144 -0.0128
#> 0.5 -0.0425 -0.0428 -0.0261 -0.0280 -0.0204 -0.0188 -0.0175 -0.0168 -0.0145
#> 0.6 -0.0372 -0.0372 -0.0320 -0.0308 -0.0199 -0.0193 -0.0153 -0.0152 -0.0132
#> 0.7 -0.0308 -0.0382 -0.0339 -0.0305 -0.0245 -0.0229 -0.0165 -0.0164 -0.0126
#> 0.8 -0.0307 -0.0357 -0.0341 -0.0342 -0.0252 -0.0247 -0.0221 -0.0177 -0.0147
#> 0.9 -0.0179 -0.0261 -0.0244 -0.0368 -0.0263 -0.0266 -0.0257 -0.0188 -0.0161
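
Comparing the matrix with the head() output above shows that rows correspond to Y quantiles and columns to X quantiles. The base R sketch below reproduces that layout for a 2 x 2 corner of the matrix (values copied from the output above) and locates the most negative coefficient. It uses tapply() purely for illustration and is not meant to describe how qq_to_matrix() is implemented. The names toy, toy_matrix and strongest are hypothetical.

```r
# Long-format rows for the 0.1 and 0.2 quantiles, taken from the matrix above
toy <- data.frame(
  y_quantile  = c(0.1, 0.2, 0.1, 0.2),
  x_quantile  = c(0.1, 0.1, 0.2, 0.2),
  coefficient = c(-0.0031, -0.0316, -0.0303, -0.0373)
)

# Rows = Y quantiles, columns = X quantiles (one value per cell)
toy_matrix <- tapply(toy$coefficient,
                     list(toy$y_quantile, toy$x_quantile),
                     mean)
print(toy_matrix)

# Position of the most negative coefficient
strongest <- which(toy_matrix == min(toy_matrix), arr.ind = TRUE)
cat("Most negative coefficient at Y quantile",
    rownames(toy_matrix)[strongest[1, "row"]],
    "and X quantile",
    colnames(toy_matrix)[strongest[1, "col"]], "\n")
```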

Exporting Results

# Export to CSV
qq_export(result_oil, file.path(tempdir(), "qq_results.csv"))

Customizing Quantile Grids

You can customize the quantile grid for more or less granularity:

# Coarse grid (faster computation)
result_coarse <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.2, 0.8, by = 0.2),
  x_quantiles = seq(0.2, 0.8, by = 0.2),
  verbose = FALSE
)

# Fine grid (more detail, slower)
result_fine <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.05, 0.95, by = 0.05),
  x_quantiles = seq(0.05, 0.95, by = 0.05),
  verbose = FALSE
)

cat("Coarse grid combinations:", nrow(result_coarse$results), "\n")
#> Coarse grid combinations: 16
cat("Fine grid combinations:", nrow(result_fine$results), "\n")
#> Fine grid combinations: 361
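
The number of combinations is the product of the two grid lengths, so the cost of a run can be estimated before fitting anything. A small base R check, using the same sequences as above (coarse and fine are illustrative names):

```r
# Grid sizes can be computed without running any regression
coarse <- seq(0.2, 0.8, by = 0.2)
fine   <- seq(0.05, 0.95, by = 0.05)

n_coarse <- length(coarse) * length(coarse)  # 4 x 4
n_fine   <- length(fine) * length(fine)      # 19 x 19

cat("Coarse:", n_coarse, "combinations\n")
cat("Fine:  ", n_fine, "combinations\n")
```

Halving the step size roughly quadruples the number of quantile regressions to fit.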

Methodology Details

The QQ Model

The QQ approach is based on the following model:

\[r_t^\theta = \beta^\theta(Oil_t) + \alpha^\theta r_{t-1} + v_t^\theta\]

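where \(r_t^\theta\) is the \(\theta\)-quantile of the return, \(Oil_t\) is the oil price shock, \(r_{t-1}\) is the lagged return, \(v_t^\theta\) is an error term whose \(\theta\)-quantile is zero, and \(\beta^\theta(\cdot)\) is an unknown function.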

Taking a Taylor expansion around the \(\tau\)-quantile of oil shocks (\(Oil^\tau\)):

\[\beta^\theta(Oil_t) \approx \beta_0(\theta, \tau) + \beta_1(\theta, \tau)(Oil_t - Oil^\tau)\]

The key insight is that \(\beta_0(\theta, \tau)\) and \(\beta_1(\theta, \tau)\) are doubly indexed by \(\theta\) and \(\tau\), capturing the dependence between both distributions.
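
Substituting the expansion into the model makes the estimating equation explicit. This is a direct combination of the two equations above and introduces no new symbols:

```latex
\[r_t^\theta \approx \beta_0(\theta, \tau) + \beta_1(\theta, \tau)(Oil_t - Oil^\tau) + \alpha^\theta r_{t-1} + v_t^\theta\]
```

For a fixed \(\theta\), moving \(\tau\) across the distribution of oil shocks changes the expansion point \(Oil^\tau\), and with it the local intercept \(\beta_0(\theta, \tau)\) and slope \(\beta_1(\theta, \tau)\).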

Estimation

The estimation proceeds by:

For each \(\tau\) (quantile of X): subset data where X <= quantile(X, \(\tau\))
For each \(\theta\) (quantile of Y): perform quantile regression
Extract coefficients and compute pseudo R-squared
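
The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's code: the helper names check_loss and fit_qr are hypothetical, and the quantile regression is approximated by minimising the check loss with optim(), standing in for a dedicated solver. Results from such a general-purpose optimiser are only approximate.

```r
# Check (pinball) loss summed over residuals u at quantile level theta
check_loss <- function(u, theta) sum(u * (theta - (u < 0)))

# Approximate linear quantile regression by direct minimisation.
# Stand-in solver for illustration; returns c(intercept, slope).
fit_qr <- function(y, x, theta) {
  start <- c(unname(quantile(y, theta)), 0)
  fit <- optim(start, function(b) check_loss(y - b[1] - b[2] * x, theta))
  fit$par
}

# Same simulated data as in the Quick Start section
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x * (x < 0) + rnorm(n, sd = 0.5)

# A small 3 x 3 grid keeps the sketch fast
grid <- c(0.25, 0.5, 0.75)
sketch <- expand.grid(y_quantile = grid, x_quantile = grid)
sketch$coefficient <- NA_real_

for (k in seq_len(nrow(sketch))) {
  tau   <- sketch$x_quantile[k]
  theta <- sketch$y_quantile[k]
  keep  <- x <= quantile(x, tau)                  # step 1: subset on X
  sketch$coefficient[k] <-
    fit_qr(y[keep], x[keep], theta)[2]            # steps 2-3: fit, keep slope
}

print(sketch)
```

The output has one row per (Y quantile, X quantile) pair, the same long layout as the results data frame shown earlier.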

The pseudo R-squared is computed as:

\[R^2 = 1 - \frac{\sum \rho_\theta(y - \hat{y})}{\sum \rho_\theta(y - Q_\theta(y))}\]

where \(\rho_\theta(u) = u(\theta - I(u < 0))\) is the check function.
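
The formula translates directly into base R. The function names rho and pseudo_r2 below are illustrative, not part of the package. The two small checks at the end show the boundary cases: a perfect fit gives 1, and fitted values equal to the unconditional quantile give 0.

```r
# Check function: rho_theta(u) = u * (theta - I(u < 0))
rho <- function(u, theta) u * (theta - (u < 0))

# Pseudo R-squared: 1 - (loss of the model) / (loss of the constant quantile)
pseudo_r2 <- function(y, fitted, theta) {
  1 - sum(rho(y - fitted, theta)) / sum(rho(y - quantile(y, theta), theta))
}

# Worked values at theta = 0.25:
rho(2, 0.25)    #  2 * 0.25       = 0.5 (positive residual, weight theta)
rho(-2, 0.25)   # -2 * (0.25 - 1) = 1.5 (negative residual, weight 1 - theta)

# Boundary cases on a tiny sample
y_toy <- c(1, 2, 3, 4, 5)
r2_perfect  <- pseudo_r2(y_toy, fitted = y_toy, theta = 0.5)       # 1
r2_constant <- pseudo_r2(y_toy, fitted = rep(3, 5), theta = 0.5)   # 0
```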

Comparison with Standard Methods

OLS vs Quantile Regression vs QQ

Method	What it estimates	Captures heterogeneity in…
OLS	E[Y | X]	Neither distribution
Quantile Regression	Q_theta[Y | X]	The distribution of Y
QQ Regression	Q_theta[Y | X_tau]	The distributions of both Y and X

When to Use QQ Regression

Use QQ regression when you suspect that:

The effect of X on Y varies across the distribution of Y (e.g., bull vs bear markets)
The effect differs for large vs small values of X (e.g., large vs small shocks)
There is asymmetry (e.g., positive vs negative shocks have different effects)
You want to understand the complete dependence structure between two variables

References

Sim, N. and Zhou, H. (2015). Oil Prices, US Stock Return, and the Dependence Between Their Quantiles. Journal of Banking & Finance, 55, 1-12. doi:10.1016/j.jbankfin.2015.01.013

Koenker, R. (2005). Quantile Regression. Cambridge University Press.

Koenker, R. and Xiao, Z. (2006). Quantile Autoregression. Journal of the American Statistical Association, 101, 980-990.

Session Info

sessionInfo()
#> R version 4.5.2 (2025-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26100)
#> 
#> Matrix products: default
#>   LAPACK version 3.12.1
#> 
#> locale:
#> [1] LC_COLLATE=C                   LC_CTYPE=French_France.utf8   
#> [3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C                  
#> [5] LC_TIME=French_France.utf8    
#> 
#> time zone: Europe/Paris
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] QuantileOnQuantile_1.0.3
#> 
#> loaded via a namespace (and not attached):
#>  [1] Matrix_1.7-4       gtable_0.3.6       jsonlite_2.0.0     dplyr_1.1.4       
#>  [5] compiler_4.5.2     tidyselect_1.2.1   MatrixModels_0.5-4 tidyr_1.3.2       
#>  [9] jquerylib_0.1.4    splines_4.5.2      scales_1.4.0       yaml_2.3.12       
#> [13] fastmap_1.2.0      lattice_0.22-7     ggplot2_4.0.1      R6_2.6.1          
#> [17] generics_0.1.4     knitr_1.51         htmlwidgets_1.6.4  MASS_7.3-65       
#> [21] tibble_3.3.1       bslib_0.9.0        pillar_1.11.1      RColorBrewer_1.1-3
#> [25] rlang_1.1.7        cachem_1.1.0       xfun_0.56          sass_0.4.10       
#> [29] S7_0.2.1           lazyeval_0.2.2     otel_0.2.0         viridisLite_0.4.2 
#> [33] plotly_4.12.0      cli_3.6.5          magrittr_2.0.4     digest_0.6.39     
#> [37] grid_4.5.2         rstudioapi_0.18.0  quantreg_6.1       lifecycle_1.0.5   
#> [41] vctrs_0.7.0        data.table_1.18.0  SparseM_1.84-2     evaluate_1.0.5    
#> [45] glue_1.8.0         farver_2.1.2       survival_3.8-3     purrr_1.2.1       
#> [49] httr_1.4.7         rmarkdown_2.30     pkgconfig_2.0.3    tools_4.5.2       
#> [53] htmltools_0.5.9