ClusterRandSSAdj

The goal of ClusterRandSSAdj is to perform small-sample adjustment to generalized linear model based statistics when applied to one observation per cluster data from a cluster randomized trial. The small-sample adjustment method is based on the following four papers:

Ford WP, Westgate PM. Improved standard error estimator for maintaining the validity of inference in cluster randomized trials with a small number of clusters. Biom J. 2017 May;59(3):478-495. doi: 10.1002/bimj.201600182. Epub 2017 Jan 27. PMID: 28128854.

Westgate PM, Cheng DM, Feaster DJ, Fernández S, Shoben AB, Vandergrift N. Marginal modeling in community randomized trials with rare events: Utilization of the negative binomial regression model. Clin Trials. 2022 Apr;19(2):162-171. doi: 10.1177/17407745211063479. Epub 2022 Jan 6. PMID: 34991359; PMCID: PMC9038610.

Kauermann G and Carroll RJ. A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association 2001; 96: 1387–1396.

Mancl LA and DeRouen TA. A covariance estimator for GEE with improved small-sample properties. Biometrics 2001; 57: 126–134.

Example Use

The following example demonstrates how to use ClusterRandSSAdj. This example uses the simulated cluster randomized trial dataset included in ClusterRandSSAdj. The simulated trial dataset, named cluster_rct_data, contains the following variables:

clusterID
Population_size
Log_population_size
Treatment
Baserate
Log_baserate
Covar1 (0/1)
Covar2 (1/2/3)
Outcome

Let’s get started on demonstrating how ClusterRandSSAdj can be used for analysis by first fitting a Negative Binomial model to the example cluster randomized trial data.

library(ClusterRandSSAdj)
library(MASS)
#> Warning: package 'MASS' was built under R version 4.5.3
nb_model_test <- glm.nb(Outcome ~ Treatment + Log_baserate + Covar1 + Covar2 +
                      Treatment:Covar2 + Treatment:Covar1 + offset(Log_population_size), data = cluster_rct_data)

# Summarize the model
summary(nb_model_test)
#> 
#> Call:
#> glm.nb(formula = Outcome ~ Treatment + Log_baserate + Covar1 + 
#>     Covar2 + Treatment:Covar2 + Treatment:Covar1 + offset(Log_population_size), 
#>     data = cluster_rct_data, init.theta = 416.5524967, link = log)
#> 
#> Coefficients:
#>                    Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)        -1.95521    0.18595 -10.515  < 2e-16 ***
#> Treatment1         -0.09655    0.02405  -4.014 5.97e-05 ***
#> Log_baserate       -0.26113    0.10102  -2.585 0.009738 ** 
#> Covar11             0.14411    0.04209   3.424 0.000618 ***
#> Covar22             0.29125    0.03010   9.677  < 2e-16 ***
#> Covar23            -0.05297    0.04475  -1.184 0.236537    
#> Treatment1:Covar22  0.11769    0.04293   2.741 0.006117 ** 
#> Treatment1:Covar23  0.09824    0.06155   1.596 0.110484    
#> Treatment1:Covar11 -0.02451    0.04980  -0.492 0.622624    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for Negative Binomial(416.5525) family taken to be 1)
#> 
#>     Null deviance: 667.264  on 49  degrees of freedom
#> Residual deviance:  50.036  on 41  degrees of freedom
#> AIC: 953.24
#> 
#> Number of Fisher Scoring iterations: 1
#> 
#> 
#>               Theta:  416.6 
#>           Std. Err.:  83.9 
#> 
#>  2 x log-likelihood:  -933.241

This package is designed to closely replicate the analysis results obtained from fitting a generalized linear model to a one-observation per cluster dataset using PROC GLIMMIX in SAS. As such, the following code estimates the SAS scale parameter for the Negative Binomial model and additionally calculates a Wald-type 95% confidence interval for the scale parameter:

scale_est <- 1/nb_model_test$theta
scale_95CI <- 1/(c(nb_model_test$theta - 1.96*nb_model_test$SE.theta, nb_model_test$theta + 1.96*nb_model_test$SE.theta))
scale_results <- data.frame(SAS_scale = scale_est, Lower95 = scale_95CI[2], Upper95 = scale_95CI[1])
scale_results
#>     SAS_scale    Lower95     Upper95
#> 1 0.002400658 0.00172095 0.003967772

Note: The SAS scale parameter reported for a Negative Binomial model fit in PROC GLIMMIX is the dispersion parameter for the Negative Binomial model.

In this example analysis, we apply small-sample adjustment to various statistics derived from the model fit. An initial task would be to make a small sample adjustment to the model parameters using the function ClusterRandSSAdj::ModelParmAdjEst(). The function applies a small-sample adjustment to the standard errors of the model parameters, then uses the adjusted SE to perform a t-test and construct 100(1-alpha)% confidence intervals. The following call to ModelParmAdjEst() will return 95% confidence intervals.

ModelParmAdjEst(nb_model_test, alpha=0.05)
#>             Variable SSAdj_DF SSAdj_Estimate SSAdj_StdErr SSAdj_tValue
#> 1        (Intercept)       41    -1.95521145   0.21393416   -9.1393137
#> 2         Treatment1       41    -0.09655416   0.02510516   -3.8459883
#> 3       Log_baserate       41    -0.26113376   0.11661814   -2.2392207
#> 4            Covar11       41     0.14411084   0.03944746    3.6532353
#> 5            Covar22       41     0.29125192   0.02529502   11.5142007
#> 6            Covar23       41    -0.05297101   0.04957703   -1.0684587
#> 7 Treatment1:Covar22       41     0.11768647   0.04575911    2.5718694
#> 8 Treatment1:Covar23       41     0.09823911   0.06771160    1.4508461
#> 9 Treatment1:Covar11       41    -0.02450817   0.04629484   -0.5293931
#>    SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1 1.922876e-11  -2.38726026  -1.52316264
#> 2 4.111135e-04  -0.14725506  -0.04585326
#> 3 3.062897e-02  -0.49664887  -0.02561864
#> 4 7.277014e-04   0.06444509   0.22377660
#> 5 1.999076e-14   0.24016760   0.34233625
#> 6 2.915643e-01  -0.15309385   0.04715184
#> 7 1.383719e-02   0.02527406   0.21009887
#> 8 1.544311e-01  -0.03850724   0.23498546
#> 9 5.993877e-01  -0.11800251   0.06898616

A next step in the analysis would be to apply small-sample adjustment to least square means (lsmeans) of interest. For example, consider the lsmeans from combinations of variables Treatment and Covar1. For lsmeans calculations, a right-handed formula construction is used to specify the desired lsmeans. For example, to request lsmeans for Treatment by Covar1, the right-handed formula should be ~Treatment:Covar1. As with ModelParmAdjEst(), LSMeansAdj() applies a small-sample adjustment to the standard error of each lsmean, then uses that adjusted SE to perform a t-test and construct 100(1-alpha)% confidence intervals.

LSMeansAdj(nb_model_test, formula_prt=~Treatment:Covar1, alpha=0.05)
#>   Treatment Covar1 SSAdj_DF SSAdj_Estimate SSAdj_StdErr SSAdj_tValue
#> 1         0      0       41      -2.381657   0.02506988    -95.00074
#> 2         1      0       41      -2.406236   0.02468064    -97.49489
#> 3         0      1       41      -2.237546   0.02300936    -97.24504
#> 4         1      1       41      -2.286634   0.02089463   -109.43640
#>    SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1 1.068435e-49    -2.432287    -2.331028
#> 2 3.709241e-50    -2.456080    -2.356393
#> 3 4.118915e-50    -2.284015    -2.191078
#> 4 3.308391e-52    -2.328831    -2.244436

Note: The lsmeans are reported on the log scale when using the Negative Binomial model. In the case of a Negative Binomial, the data scale is expressed as log of the expected count per analysis unit. For the cluster_rct_data dataset, the unit of analysis is a single person as indicated by the offset variable Log_population_size. To report these lsmeans in terms of events per population size, additional programming is needed. To calculate expected counts for each combination of Treatment by Covar 1, the log(expected count) values need to be exponentiated. The following code demonstrate a way to transform the log(expected count) to expected counts per 10,000 persons:

rates <- LSMeansAdj(nb_model_test, ~Treatment:Covar1, 0.05)[, c(1,2,4,8,9)]
cbind(rates[,c(1,2)], 10000*exp(rates[,c(3,4,5)]))
#>   Treatment Covar1 SSAdj_Estimate SSAdj_EstLCL SSAdj_EstUCL
#> 1         0      0       923.9733     878.3573     971.9582
#> 2         1      0       901.5398     857.7054     947.6145
#> 3         0      1      1067.2004    1018.7439    1117.9617
#> 4         1      1      1016.0795     974.0955    1059.8731

In addition to performing small-sample adjustment on the lsmeans, ClusterRandSSAdj allows users to perform small-sample adjustment on pairwise comparisons of lsmeans using ClusterRandSSAdj::LSMeansPairwiseCompAdj(). The function’s agruments are similar to those of LSMeansAdj(), with the addition of the reverse argument. For example, suppose we are interested in comparing log(expected counts) for Treatment = 1 minus Treatment = 0, when Covar1 = 0. If reverse = FALSE produces the comparison of log(expected counts) for Treatment = 0 minus Treatment = 1, when Covar1 = 0, setting reverse = TRUE will reverse the subtraction order so that the difference is oriented as desired.

lsmean_comps <- LSMeansPairwiseCompAdj(nb_model_test, ~Treatment:Covar1, 0.05, reverse=TRUE)
lsmean_comps
#>                                  contrast SSAdj_DF SSAdj_Estimate SSAdj_StdErr
#> 1 Treatment1 Covar10 - Treatment0 Covar10       41    -0.02457897   0.03236046
#> 2 Treatment0 Covar11 - Treatment0 Covar10       41     0.14411084   0.03944746
#> 3 Treatment0 Covar11 - Treatment1 Covar10       41     0.16868981   0.03604234
#> 4 Treatment1 Covar11 - Treatment0 Covar10       41     0.09502370   0.03745633
#> 5 Treatment1 Covar11 - Treatment1 Covar10       41     0.11960267   0.03991636
#> 6 Treatment1 Covar11 - Treatment0 Covar11       41    -0.04908714   0.02547114
#>   SSAdj_tValue  SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1    -0.759537 4.518766e-01  -0.08993224  0.040774308
#> 2     3.653235 7.277014e-04   0.06444509  0.223776598
#> 3     4.680324 3.116543e-05   0.09590083  0.241478788
#> 4     2.536920 1.508442e-02   0.01937911  0.170668302
#> 5     2.996332 4.622046e-03   0.03898995  0.200215387
#> 6    -1.927167 6.090758e-02  -0.10052714  0.002352864

The lsmean_comps object contains several pairwise comparisons. In this analysis we focus on two specific comparisons:

log(expected counts) for Treatment = 1 minus Treatment = 0, when Covar1 = 0, and
log(expected counts) for Treatment = 1 minus Treatment = 0, when Covar1 = 1.

In the R object, lsmeans_comps, these two comparisons are identified in the contrast variable as ‘Treatment1 Covar10 - Treatment0 Covar10’ and ‘Treatment1 Covar11 - Treatment0 Covar11’. The following code extracts the two pairwise comparisons of interest and exponentiates the results so that they are reported on the scale of rate ratios.

lsmean_comps_sub <- lsmean_comps |> dplyr::filter(contrast %in% c("Treatment1 Covar10 - Treatment0 Covar10",
                                                           "Treatment1 Covar11 - Treatment0 Covar11"))
cbind(lsmean_comps_sub[,1], exp(lsmean_comps_sub[,c(3, 7, 8)]))
#>                     lsmean_comps_sub[, 1] SSAdj_Estimate SSAdj_EstLCL
#> 1 Treatment1 Covar10 - Treatment0 Covar10      0.9757206    0.9139931
#> 2 Treatment1 Covar11 - Treatment0 Covar11      0.9520982    0.9043606
#>   SSAdj_EstUCL
#> 1     1.041617
#> 2     1.002356

Now that our analysis has made small sample adjustments to model parameter estimates, lsmeans of interest, as well as pairwise comparisons of important lsmeans, let us consider how we might make small sample adjustments to user defined statistics. In SAS, procedures like PROC GLIMMIX allow the user to define estimate statements, which estimate important and allowable linear combinations of model parameters. This feature is helpful when a user wants to calculate a function of model parameters that is not equal to any estimate produced by the lsmeans statement. The function ClusterRandSSAdj::EstimateAdj() provides analgous functionality in R enabling the user to estimate a linear combination of model parameters and apply small-sample adjustment on the estimate’s standard error.

One illustrative use of ClusterRandSSAdj::EstimateAdj() is to replicate the lsmeans calculations returned by a call to ClusterRandSSAdj::LSMeansAdj(). For example, the following call to LSMeansAdj() will calculate the lsmeans for Treatment by Covar2:

LSMeansAdj(nb_model_test, formula_prt=~Treatment:Covar2, alpha=0.05)
#>   Treatment Covar2 SSAdj_DF SSAdj_Estimate SSAdj_StdErr SSAdj_tValue
#> 1         0      1       41      -2.389029   0.01772194   -134.80626
#> 2         1      1       41      -2.497837   0.02883157    -86.63550
#> 3         0      2       41      -2.097777   0.01565701   -133.98321
#> 4         1      2       41      -2.088899   0.01713808   -121.88638
#> 5         0      3       41      -2.442000   0.04066738    -60.04812
#> 6         1      3       41      -2.452569   0.02409422   -101.79076
#>    SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1 6.565385e-56    -2.424819    -2.353239
#> 2 4.592680e-48    -2.556064    -2.439610
#> 3 8.434628e-56    -2.129397    -2.066157
#> 4 4.044275e-54    -2.123510    -2.054288
#> 5 1.374588e-41    -2.524129    -2.359870
#> 6 6.376523e-51    -2.501228    -2.403910

The following code containing a call to EstimateAdj() will replicate the lsmeans calculations for each combination of Treatment by Covar2 using a call to ClusterRandSSAdj::EstimateAdj().

contrast_matrix <- rbind(
  "logT0_1" = c(1, 0,1.937217, 0.5, 0, 0, 0, 0, 0),
  "logT1_1" = c(1, 1,1.937217, 0.5, 0, 0, 0, 0, 0.5),
  "logT0_2" = c(1, 0,1.937217, 0.5, 1, 0, 0, 0, 0),
  "logT1_2" = c(1, 1,1.937217, 0.5, 1, 0, 1, 0, 0.5),
  "logT0_3" = c(1, 0,1.937217, 0.5, 0, 1, 0, 0, 0),
  "logT1_3" = c(1, 1,1.937217, 0.5, 0, 1, 0, 1, 0.5)
)
lsmean_replicate <- EstimateAdj(nb_model_test, contrast_matrix, 0.05)
lsmean_replicate
#>     label SSAdj_Estimate SSadj_StdErr SSadj_tValue SSAdj_DF  SSAdj_Probt
#> 1 logT0_1      -2.389029   0.01772194   -134.80626       41 6.565380e-56
#> 2 logT1_1      -2.497837   0.02883157    -86.63550       41 4.592676e-48
#> 3 logT0_2      -2.097777   0.01565701   -133.98322       41 8.434580e-56
#> 4 logT1_2      -2.088899   0.01713808   -121.88639       41 4.044268e-54
#> 5 logT0_3      -2.442000   0.04066738    -60.04812       41 1.374590e-41
#> 6 logT1_3      -2.452569   0.02409422   -101.79078       41 6.376485e-51
#>   SSAdj_EstLCL SSAdj_EstUCL
#> 1    -2.424819    -2.353239
#> 2    -2.556064    -2.439610
#> 3    -2.129397    -2.066157
#> 4    -2.123510    -2.054288
#> 5    -2.524129    -2.359870
#> 6    -2.501228    -2.403910

While this may suggest that ClusterRandSSAdj::EstimateAdj() is redundant and replicates lsmeans calculations for Treatment by Covar2 that are produced by a call to ClusterRandSSAdj::LSMeansAdj(), let us continue to explore user defined linear combinations of model parameters using ClusterRandSSAdj::EstimateAdj(). For instance, consider pairwise comparisons of Treatment by Covar2 combinations, focusing on the Treatment = 1 minus Treatment = 0 comparisons within each level of Covar2. Covar2 has 3 levels, so we have 3 treatment comparisons, one in each level of Covar2. We can calculate these small-sample adjusted treatment effect for each level of Covar2 using ClusterRandSSAdj::EstimateAdj().

contrast_matrix2 <- rbind(
  "logT1vT0_1" = c(0, 1, 0, 0, 0, 0, 0, 0, 0.5),
  "logT1vT0_2" = c(0, 1, 0, 0, 0, 0, 1, 0, 0.5),
  "logT1vT0_3" = c(0, 1, 0, 0, 0, 0, 0, 1, 0.5)
)
EstAdj2 <- EstimateAdj(nb_model_test, contrast_matrix2, 0.10)
EstAdj2
#>        label SSAdj_Estimate SSadj_StdErr SSadj_tValue SSAdj_DF SSAdj_Probt
#> 1 logT1vT0_1   -0.108808244   0.03355662   -3.2425271       41  0.00235719
#> 2 logT1vT0_2    0.008878222   0.02300661    0.3858987       41  0.70156690
#> 3 logT1vT0_3   -0.010569136   0.04772098   -0.2214778       41  0.82581986
#>   SSAdj_EstLCL SSAdj_EstUCL
#> 1  -0.16527994  -0.05233655
#> 2  -0.02983910   0.04759554
#> 3  -0.09087772   0.06973945

Note, similar to the individual lsmeans calculations for Tretment by Covar2, these calculation could have been calculated by a call to ClusterRandSSAdj::LSMeansPairwiseCompAdj()

lsmean_comps2 <- LSMeansPairwiseCompAdj(nb_model_test, ~Treatment:Covar2, 0.05, reverse=TRUE)
lsmean_comps_sub2 <- lsmean_comps2 |> dplyr::filter(contrast %in% c("Treatment1 Covar21 - Treatment0 Covar21",
                                                                    "Treatment1 Covar22 - Treatment0 Covar22",
                                                                    "Treatment1 Covar23 - Treatment0 Covar23"))
lsmean_comps_sub2
#>                                  contrast SSAdj_DF SSAdj_Estimate SSAdj_StdErr
#> 1 Treatment1 Covar21 - Treatment0 Covar21       41   -0.108808244   0.03355662
#> 2 Treatment1 Covar22 - Treatment0 Covar22       41    0.008878222   0.02300661
#> 3 Treatment1 Covar23 - Treatment0 Covar23       41   -0.010569136   0.04772098
#>   SSAdj_tValue SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1   -3.2425271  0.00235719  -0.17657721  -0.04103928
#> 2    0.3858987  0.70156690  -0.03758457   0.05534102
#> 3   -0.2214778  0.82581986  -0.10694361   0.08580534

Once again, do not forget the results returned by LSMeansPairwiseCompAdj and by EstimateAdj() are on the log scale; therefore, in order to transform these statistics to effect ratios, the results in EstAdj2 (or lsmean_comps_sub2) will need to be exponentiated.

cbind(EstAdj2[,1], exp(EstAdj2[,c(2, 7, 8)]))
#>   EstAdj2[, 1] SSAdj_Estimate SSAdj_EstLCL SSAdj_EstUCL
#> 1   logT1vT0_1      0.8969024    0.8476564    0.9490094
#> 2   logT1vT0_2      1.0089177    0.9706017    1.0487464
#> 3   logT1vT0_3      0.9894865    0.9131294    1.0722288

At this point, we have reached an end of functions of linear combinations of model parameters which can be estimated using lsmeans and pairwise lsmean comparison functions such as LSMeansAdj() and LSMeansPairwiseCompAdj().

Now, let’s look at differences between a specific set of pairwise comparisons, namely differences between Treatment 1 minus Treatment 0 log(expected count) estimates between different levels of Covar2. Since Covar2 has 3 levels, we have 3 treatment comparisons within each level of Covar2.

Estimates of Treatment 1 log(expected count) minus Treatment 0 log(expected count) when Covar2 is equal to 1
Estimates of Treatment 1 log(expected count) minus Treatment 0 log(expected count) when Covar2 is equal to 2
Estimates of Treatment 1 log(expected count) minus Treatment 0 log(expected count) when Covar2 is equal to 3

Given these three treatment effect estimates, we can compare levels of Covar2 with respect to these treatment effect estimates. For example, how different is the treatment effect when Covar2 is equal to 2, relative to when Covar2 is equal to 1? These estimates are difference in differences esimates and the estimates can be used to better understand the interaction between Treatment and Covar2.

Given we have 3 treatment effects, we have 3 difference in difference estimates

Covar2 = 1 treatment effect vs. Covar2 = 2 treatment effect
Covar2 = 1 treatment effect vs. Covar2 = 3 treatment effect
Covar2 = 2 treatment effect vs. Covar2 = 3 treatment effect

As such the contrast matrix is going to be a set of three differences in lsmean differences. We can estimate the parameter combination vector using the vectors in contrast_matrix2 above.

logT1vT0_1 minus logT1vT0_2 will be:

c(0, 1, 0, 0, 0, 0, 0, 0, 0.5) - c(0, 1, 0, 0, 0, 0, 1, 0, 0.5) = c(0, 0, 0, 0, 0, 0, -1, 0, 0)

logT1vT0_1 minus logT1vT0_3 will be:

c(0, 1, 0, 0, 0, 0, 0, 0, 0.5) - c(0, 1, 0, 0, 0, 0, 0, 1, 0.5) = c(0, 0, 0, 0, 0, 0, 0, -1, 0)

logT1vT0_2 minus logT1vT0_3 will be:

c(0, 1, 0, 0, 0, 0, 1, 0, 0.5) - c(0, 1, 0, 0, 0, 0, 0, 1, 0.5) = c(0, 0, 0, 0, 0, 0, 1, -1, 0)

These estimates cannot be calculated in a call to any lsmeans ot pairwise lsmeans comparison function. A linear combination of parameter estimates function is necessary, and this is where EstimateAdj() becomes very useful.

contrast_matrix3 <- rbind(
"logT1vT0_1mlogT1vT0_2" = c(0, 0, 0, 0, 0, 0, -1,  0, 0),
"logT1vT0_1mlogT1vT0_3" = c(0, 0, 0, 0, 0, 0,  0, -1, 0),
"logT1vT0_2mlogT1vT0_3" = c(0, 0, 0, 0, 0, 0,  1, -1, 0)
)
dodEst <- EstimateAdj(nb_model_test, contrast_matrix3, 0.10)
dodEst
#>                   label SSAdj_Estimate SSadj_StdErr SSadj_tValue SSAdj_DF
#> 1 logT1vT0_1mlogT1vT0_2    -0.11768647   0.04575911   -2.5718694       41
#> 2 logT1vT0_1mlogT1vT0_3    -0.09823911   0.06771160   -1.4508461       41
#> 3 logT1vT0_2mlogT1vT0_3     0.01944736   0.04806891    0.4045725       41
#>   SSAdj_Probt SSAdj_EstLCL SSAdj_EstUCL
#> 1  0.01383719  -0.19469347  -0.04067946
#> 2  0.15443114  -0.21218947   0.01571125
#> 3  0.68789474  -0.06144675   0.10034146

The estimates in dodEst are differences in differences on the log(expected counts) scale. These differences in differences can be transformed into ratios of expected count ratios, providing an expression of interaction on the ratio scale. Specifically, if exp(differences in log(expected count)), is a ratio of expected counts, then exp(differences in log(expected count1) - differences in log(expected count2)) yields a ratio of expected count ratios.

new_lab <- c("T1vT0_1_2", "T1vT0_1_3", "T1vT0_2_3")
cbind(new_lab, exp(dodEst[,c(2, 7, 8)]))
#>     new_lab SSAdj_Estimate SSAdj_EstLCL SSAdj_EstUCL
#> 1 T1vT0_1_2      0.8889747    0.8230869    0.9601368
#> 2 T1vT0_1_3      0.9064321    0.8088114    1.0158353
#> 3 T1vT0_2_3      1.0196377    0.9404030    1.1055484

Next, let us consider another function within ClusterRandSSAdj, TypeIIIAdj(). In SAS procedures, Type III tests are produced for predictors included in a model. When PROC GLIMMIX is used, the tests are F-tests. The car::Anova() function in the car package can produce Type III tests for generalized linear models, however the tests are Chi-square tests. The TypeIIIAdj() function applies small-sample adjustment to the Type III tests as implemented in the car::Anova() function. Within a call to TypeIIIAdj(), a call to car::Anova() is made and it uses “type = ‘III’” and “test.statistic =”Wald” to approximate the F-test in SAS Type III test results.

For type III tests similar to those reported in SAS, the user needs to create a list of contrast structures for each categorical variable, where the contrast structure is “contr.sum”. Once these are defined a call to TypeIIIAdj() will produce small-sample adjusted type III test results analogous to what SAS produces however these results are based on chi-square tests instead of F-tests.

type3_lst <- list(Treatment = contr.sum, Covar1 = contr.sum, Covar2 = contr.sum)
TypeIIIAdj(nb_model_test, type3_lst) 
#>                  Df ChiSq_firores ProbChisq_firores  ChiSq_root ProbChisq_root
#> (Intercept)       1    57.8051214      2.894152e-14  72.1454728   1.999038e-17
#> Treatment         1     3.9156475      4.783858e-02   4.8534322   2.759118e-02
#> Log_baserate      1     4.5012629      3.386983e-02   5.6198682   1.775798e-02
#> Covar1            1    14.7607953      1.220468e-04  19.1344317   1.218271e-05
#> Covar2            2   398.5106303      2.914178e-87 509.6530134  2.139214e-111
#> Treatment:Covar2  2     6.0904504      4.758559e-02   7.8216659   2.002382e-02
#> Treatment:Covar1  1     0.2429786      6.220631e-01   0.3268222   5.675363e-01
#>                   SSAdjChisq SSAdjChisq_pval
#> (Intercept)       64.9752971    7.584291e-16
#> Treatment          4.3845398    3.626628e-02
#> Log_baserate       5.0605655    2.447625e-02
#> Covar1            16.9476135    3.842559e-05
#> Covar2           454.0818219    2.496808e-99
#> Treatment:Covar2   6.9560582    3.086819e-02
#> Treatment:Covar1   0.2849004    5.935072e-01

These small-sample adjusted Type III tests can be used as a gate-keeping step in which statistics for combinations of multi-level covariates are reported only if the type III test for the interaction term is statistically significant. Such a test may involve at least two degrees of freedom, thus a type III test may be appropriate. In this case, small-sample adjustment can be applied to the test using TypeIIIAdj().

The chi-square test that is used in TypeIIIAdj() makes it impossible to replicate the Type III test produced by GLIMMIX. To address this, two additional functions were developed in ClusterRandSSAdj. These functions allow the users to replicate the SAS Type III test and to estimate and apply small-sample adjustment to any linear combination of the model parameters using an F-test.

The function SAS_Lincom_Ftest() allows the user to complete an F-test on any linear combination of model parameters and to adjust the standard error of the estimate using HC3 (FIRORES), HC2 (Root), or no adjustment. One requirement for use of SAS_Lincomp_Ftest() is that the user must specify the desired linear combination.
For example, the following code replicates the Type III test from PROC GLIMMIX when analyzing cluster_rct_data in PROC GLIMMIX using the empirical=FIRORES option.


# Type III F-test for treatment parameter
L_value <- matrix(c(0, 1, 0, 0, 0, 0, 0.333333333333, 0.333333333333, 0.5), nrow=1, ncol=9)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic   F_pvalue
#> 1            1             41    3.915647 0.05458228

#The SAS type III test for Log_baserate?
L_value <- matrix(c(0, 0, 1, 0, 0, 0, 0, 0, 0), nrow=1, ncol=9)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic  F_pvalue
#> 1            1             41    4.501263 0.0399601

# Type III F-test for covar1 parameter
L_value <- matrix(c(0, 0, 0, 1, 0, 0, 0, 0, 0.5), nrow=1, ncol=9)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic     F_pvalue
#> 1            1             41     14.7608 0.0004160738

# Type III F-test for covar2 parameter
L_value <- rbind(
  c(0, 0, 0, 0, 1, 0, 0.5,   0, 0),
  c(0, 0, 0, 0, 0, 1, 0,   0.5, 0)
)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic     F_pvalue
#> 1            2             41    199.2553 7.606707e-22

#The SAS type III test for Treatment:Covar2?
L_value <- rbind(
  c(0, 0, 0, 0, 0, 0, 1, 0, 0),
  c(0, 0, 0, 0, 0, 0, 0, 1, 0)
)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic  F_pvalue
#> 1            2             41    3.045225 0.0584717

#The SAS type III test for Treatment:Covar1?
L_value <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 1), nrow=1, ncol=9)
SAS_Lincom_Ftest(nb_model_test, L_value, "HC3")
#>   Numerator_df Denominator_df F_statistic  F_pvalue
#> 1            1             41   0.2429786 0.6246928

Finally, a small-sample adjustment can be applied on any desired linear combination of model parameters.
For example, the Treatment by Covar1 interaction:

  Lincom_FtestAdj(nb_model_test, L_value)
#>   Numerator_df Denominator_df SSAdjF_statistic SSAdjF_pvalue
#> 1            1             41        0.2849004     0.5963894

Introduction

ClusterRandSSAdj

Example Use