library(sumExtras)
library(gtsummary)
library(dplyr)
# Apply the recommended JAMA theme
use_jama_theme()

If you’ve worked with gtsummary before, you’re familiar with the
typical workflow of building summary tables: creating a base table with
tbl_summary(), then progressively adding features like
overall columns, p-values, and formatting tweaks. While gtsummary’s
modular approach provides flexibility, the same sequence of functions
appears repeatedly in analysis scripts.
sumExtras streamlines this process by providing convenience functions that apply commonly used formatting patterns in a single step. The package addresses three main pain points: repetitive table styling, inconsistent display of missing values, and manual variable labeling.
This vignette will get you started with the core functions. For more specialized workflows, see:
vignette("labeling") - Comprehensive guide to automatic variable labeling across tables and plots
vignette("styling") - Advanced table styling and formatting techniques

extras() Function

The signature function of this package, extras(),
consolidates the most common table enhancements into a single step. At
minimum, it adds bold labels, removes the “Characteristic” header, and
standardizes missing value display. With default settings, it also adds
an overall column and p-values.
| | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.7 |
| Unknown | 11 | 7 | 4 | |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
| Unknown | 10 | 6 | 4 | |
| T Stage | | | | 0.9 |
| T1 | 53 (27%) | 28 (29%) | 25 (25%) | |
| T2 | 54 (27%) | 25 (26%) | 29 (28%) | |
| T3 | 43 (22%) | 22 (22%) | 21 (21%) | |
| T4 | 50 (25%) | 23 (23%) | 27 (26%) | |
| Grade | | | | 0.9 |
| I | 68 (34%) | 35 (36%) | 33 (32%) | |
| II | 68 (34%) | 32 (33%) | 36 (35%) | |
| III | 64 (32%) | 31 (32%) | 33 (32%) | |
| Tumor Response | 61 (32%) | 28 (29%) | 33 (34%) | 0.5 |
| Unknown | 7 | 3 | 4 | |
| Patient Died | 112 (56%) | 52 (53%) | 60 (59%) | 0.4 |
| Months to Death/Censor | 22.4 (15.9, 24.0) | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.14 |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

| | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.718 |
| Unknown | 11 | 7 | 4 | |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
| Unknown | 10 | 6 | 4 | |
| T Stage | | | | 0.866 |
| T1 | 53 (27%) | 28 (29%) | 25 (25%) | |
| T2 | 54 (27%) | 25 (26%) | 29 (28%) | |
| T3 | 43 (22%) | 22 (22%) | 21 (21%) | |
| T4 | 50 (25%) | 23 (23%) | 27 (26%) | |
| Grade | | | | 0.871 |
| I | 68 (34%) | 35 (36%) | 33 (32%) | |
| II | 68 (34%) | 32 (33%) | 36 (35%) | |
| III | 64 (32%) | 31 (32%) | 33 (32%) | |
| Tumor Response | 61 (32%) | 28 (29%) | 33 (34%) | 0.530 |
| Unknown | 7 | 3 | 4 | |
| Patient Died | 112 (56%) | 52 (53%) | 60 (59%) | 0.412 |
| Months to Death/Censor | 22.4 (15.9, 24.0) | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.145 |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

Both approaches produce the same result, but extras()
requires less code and ensures consistency across your analysis.
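
The comparison above can be sketched as follows. The code behind the two tables is not shown here, so the manual pipeline is an assumption: it chains standard gtsummary functions (add_overall(), add_p(), bold_labels(), modify_header()) that roughly mirror what extras() is described as doing, and it may differ in details such as p-value formatting or missing value cleanup.

```r
library(gtsummary)
library(sumExtras)

# Manual approach (assumed reconstruction): add each feature step by step
tbl_manual <- trial |>
  tbl_summary(by = trt) |>
  add_overall() |>              # overall column
  add_p() |>                    # p-values comparing treatment arms
  bold_labels() |>              # bold variable labels
  modify_header(label ~ "")     # drop the "Characteristic" header text

# extras() approach: the same enhancements in a single call
tbl_extras <- trial |>
  tbl_summary(by = trt) |>
  extras()
```

Printing tbl_manual and tbl_extras side by side is a quick way to check how closely the two approaches agree in your own session.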
You can control which features are applied using the function arguments:
| | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ |
|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) |
| Unknown | 11 | 7 | 4 |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) |
| Unknown | 10 | 6 | 4 |
| T Stage | | | |
| T1 | 53 (27%) | 28 (29%) | 25 (25%) |
| T2 | 54 (27%) | 25 (26%) | 29 (28%) |
| T3 | 43 (22%) | 22 (22%) | 21 (21%) |
| T4 | 50 (25%) | 23 (23%) | 27 (26%) |
| Grade | | | |
| I | 68 (34%) | 35 (36%) | 33 (32%) |
| II | 68 (34%) | 32 (33%) | 36 (35%) |
| III | 64 (32%) | 31 (32%) | 33 (32%) |
| Tumor Response | 61 (32%) | 28 (29%) | 33 (34%) |
| Unknown | 7 | 3 | 4 |
| Patient Died | 112 (56%) | 52 (53%) | 60 (59%) |
| Months to Death/Censor | 22.4 (15.9, 24.0) | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) |

¹ Median (Q1, Q3); n (%)

| | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) | 0.718 |
| Unknown | 7 | 4 | |
| Marker Level (ng/mL) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
| Unknown | 6 | 4 | |
| T Stage | | | 0.866 |
| T1 | 28 (29%) | 25 (25%) | |
| T2 | 25 (26%) | 29 (28%) | |
| T3 | 22 (22%) | 21 (21%) | |
| T4 | 23 (23%) | 27 (26%) | |
| Grade | | | 0.871 |
| I | 35 (36%) | 33 (32%) | |
| II | 32 (33%) | 36 (35%) | |
| III | 31 (32%) | 33 (32%) | |
| Tumor Response | 28 (29%) | 33 (34%) | 0.530 |
| Unknown | 3 | 4 | |
| Patient Died | 52 (53%) | 60 (59%) | 0.412 |
| Months to Death/Censor | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.145 |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

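
The two tables above (one without p-values, one without the overall column) are consistent with switching off individual features. A minimal sketch, assuming the pval and overall arguments accept FALSE in the same way they accept TRUE in the settings list used later in this vignette:

```r
library(gtsummary)
library(sumExtras)

# Keep the overall column but skip p-values
tbl_no_pval <- trial |>
  tbl_summary(by = trt) |>
  extras(pval = FALSE)

# Keep p-values but skip the overall column
tbl_no_overall <- trial |>
  tbl_summary(by = trt) |>
  extras(overall = FALSE)
```

See ?extras for the authoritative list of arguments and their defaults.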
# Overall column as last column (default is to set it as first)
trial |>
  tbl_summary(by = trt) |>
  extras(last = TRUE)

| | Drug A N = 98¹ | Drug B N = 102¹ | Overall N = 200¹ | p-value² |
|---|---|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) | 47 (38, 57) | 0.718 |
| Unknown | 7 | 4 | 11 | |
| Marker Level (ng/mL) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.64 (0.22, 1.41) | 0.085 |
| Unknown | 6 | 4 | 10 | |
| T Stage | | | | 0.866 |
| T1 | 28 (29%) | 25 (25%) | 53 (27%) | |
| T2 | 25 (26%) | 29 (28%) | 54 (27%) | |
| T3 | 22 (22%) | 21 (21%) | 43 (22%) | |
| T4 | 23 (23%) | 27 (26%) | 50 (25%) | |
| Grade | | | | 0.871 |
| I | 35 (36%) | 33 (32%) | 68 (34%) | |
| II | 32 (33%) | 36 (35%) | 68 (34%) | |
| III | 31 (32%) | 33 (32%) | 64 (32%) | |
| Tumor Response | 28 (29%) | 33 (34%) | 61 (32%) | 0.530 |
| Unknown | 3 | 4 | 7 | |
| Patient Died | 52 (53%) | 60 (59%) | 112 (56%) | 0.412 |
| Months to Death/Censor | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 22.4 (15.9, 24.0) | 0.145 |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

For projects with consistent table formatting requirements, you can define styling parameters once and reuse them:
# Define standard table settings for a project
standard_table_args <- list(
  pval = TRUE,
  overall = TRUE,
  last = TRUE
)

# Apply consistently across multiple tables
trial |>
  select(age, grade, stage, trt) |>
  tbl_summary(by = trt) |>
  extras(.args = standard_table_args)

| | Drug A N = 98¹ | Drug B N = 102¹ | Overall N = 200¹ | p-value² |
|---|---|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) | 47 (38, 57) | 0.718 |
| Unknown | 7 | 4 | 11 | |
| Grade | | | | 0.871 |
| I | 35 (36%) | 33 (32%) | 68 (34%) | |
| II | 32 (33%) | 36 (35%) | 68 (34%) | |
| III | 31 (32%) | 33 (32%) | 64 (32%) | |
| T Stage | | | | 0.866 |
| T1 | 28 (29%) | 25 (25%) | 53 (27%) | |
| T2 | 25 (26%) | 29 (28%) | 54 (27%) | |
| T3 | 22 (22%) | 21 (21%) | 43 (22%) | |
| T4 | 23 (23%) | 27 (26%) | 50 (25%) | |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

One subtle but important aspect of table presentation is how missing or undefined values are displayed. gtsummary tables can show various representations of missing data: “0 (NA%)”, “NA (NA)”, “NA, NA”, etc. These inconsistencies create visual clutter and make tables harder to scan.
The clean_table() function (which is called
automatically by extras()) standardizes all zero
(0 (0%)) or missing value representations to “—”:
| Characteristic | Drug A N = 98¹ | Drug B N = 102¹ |
|---|---|---|
| age | 46 (37, 60) | NA (NA, NA) |
| Unknown | 7 | 102 |
| marker | NA (NA, NA) | 0.52 (0.18, 1.21) |
| Unknown | 98 | 4 |
| T Stage | | |
| T1 | 28 (29%) | 25 (25%) |
| T2 | 25 (26%) | 29 (28%) |
| T3 | 22 (22%) | 21 (21%) |
| T4 | 23 (23%) | 27 (26%) |
| Grade | | |
| I | 35 (36%) | 33 (32%) |
| II | 32 (33%) | 36 (35%) |
| III | 31 (32%) | 33 (32%) |
| Tumor Response | 28 (29%) | 33 (34%) |
| Unknown | 3 | 4 |
| Patient Died | 52 (53%) | 60 (59%) |
| Months to Death/Censor | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) |

¹ Median (Q1, Q3); n (%)

| Characteristic | Drug A N = 98¹ | Drug B N = 102¹ |
|---|---|---|
| age | 46 (37, 60) | — |
| Unknown | 7 | 102 |
| marker | — | 0.52 (0.18, 1.21) |
| Unknown | 98 | 4 |
| T Stage | | |
| T1 | 28 (29%) | 25 (25%) |
| T2 | 25 (26%) | 29 (28%) |
| T3 | 22 (22%) | 21 (21%) |
| T4 | 23 (23%) | 27 (26%) |
| Grade | | |
| I | 35 (36%) | 33 (32%) |
| II | 32 (33%) | 36 (35%) |
| III | 31 (32%) | 33 (32%) |
| Tumor Response | 28 (29%) | 33 (34%) |
| Unknown | 3 | 4 |
| Patient Died | 52 (53%) | 60 (59%) |
| Months to Death/Censor | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) |

¹ Median (Q1, Q3); n (%)

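
A before/after pair like the one above could be produced along these lines. This is a sketch under assumptions: the code behind the tables is not shown, so trial_missing is a hypothetical data frame in which age is blanked out for Drug B and marker for Drug A, chosen to match the pattern of missing cells in the tables.

```r
library(gtsummary)
library(sumExtras)
library(dplyr)

# Hypothetical demo data: one variable entirely missing in each arm.
# ifelse() drops the label attribute, so lowercase names appear as row labels.
trial_missing <- trial |>
  mutate(
    age = ifelse(trt == "Drug B", NA, age),
    marker = ifelse(trt == "Drug A", NA, marker)
  )

# Before: cells with no data are shown as "NA (NA, NA)"
tbl_before <- trial_missing |>
  tbl_summary(by = trt)

# After: clean_table() standardizes those cells
tbl_after <- tbl_before |>
  clean_table()
```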
You can also use clean_table() independently if you
prefer to build tables step-by-step:
| Characteristic | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|---|
| age | 46 (37, 60) | 46 (37, 60) | — | |
| Unknown | 109 | 7 | 102 | |
| marker | 0.52 (0.18, 1.21) | — | 0.52 (0.18, 1.21) | |
| Unknown | 102 | 98 | 4 | |
| T Stage | | | | 0.9 |
| T1 | 53 (27%) | 28 (29%) | 25 (25%) | |
| T2 | 54 (27%) | 25 (26%) | 29 (28%) | |
| T3 | 43 (22%) | 22 (22%) | 21 (21%) | |
| T4 | 50 (25%) | 23 (23%) | 27 (26%) | |
| Grade | | | | 0.9 |
| I | 68 (34%) | 35 (36%) | 33 (32%) | |
| II | 68 (34%) | 32 (33%) | 36 (35%) | |
| III | 64 (32%) | 31 (32%) | 33 (32%) | |
| Tumor Response | 61 (32%) | 28 (29%) | 33 (34%) | 0.5 |
| Unknown | 7 | 3 | 4 | |
| Patient Died | 112 (56%) | 52 (53%) | 60 (59%) | 0.4 |
| Months to Death/Censor | 22.4 (15.9, 24.0) | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.14 |

¹ Median (Q1, Q3); n (%)
² NA; Pearson’s Chi-squared test; Wilcoxon rank sum test

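
A step-by-step build like the table above might look like the following sketch. It assumes the same kind of hypothetical input (trial_missing, with age missing for Drug B and marker missing for Drug A) and uses only standard gtsummary steps before the final clean_table() call. With data like this, add_p() may report that some tests could not be computed.

```r
library(gtsummary)
library(sumExtras)
library(dplyr)

# Hypothetical demo data with one fully missing variable per arm
trial_missing <- trial |>
  mutate(
    age = ifelse(trt == "Drug B", NA, age),
    marker = ifelse(trt == "Drug A", NA, marker)
  )

# Build the table piece by piece, then standardize missing cells at the end
tbl_stepwise <- trial_missing |>
  tbl_summary(by = trt) |>
  add_overall() |>
  add_p() |>
  clean_table()
```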
One of the most time-consuming aspects of creating publication-ready tables is labeling variables with human-readable descriptions. sumExtras provides a streamlined labeling system using data dictionaries:
# Create a simple dictionary
dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "trt", "Chemotherapy Treatment",
  "age", "Age at Enrollment (years)",
  "marker", "Marker Level (ng/mL)",
  "stage", "T Stage",
  "grade", "Tumor Grade"
)

# Apply labels automatically
trial |>
  tbl_summary(by = trt, include = c(age, grade, marker)) |>
  add_auto_labels(dictionary = dictionary) |>
  extras()

| | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.718 |
| Unknown | 11 | 7 | 4 | |
| Grade | | | | 0.871 |
| I | 68 (34%) | 35 (36%) | 33 (32%) | |
| II | 68 (34%) | 32 (33%) | 36 (35%) | |
| III | 64 (32%) | 31 (32%) | 33 (32%) | |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
| Unknown | 10 | 6 | 4 | |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

The add_auto_labels() function is intelligent and flexible:
Manual labels set in tbl_summary() always override automatic ones
tbl_regression()

The labeling system is much more powerful than this basic example suggests. You can:
apply_labels_from_dictionary()

For comprehensive coverage of these workflows and real-world
examples, see vignette("labeling").
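
The label priority rule mentioned above can be illustrated with a short sketch. The label "Patient Age" is a made-up value for illustration. The expectation, based on the rule stated above, is that it is kept for age while the remaining variables are left to automatic labeling.

```r
library(gtsummary)
library(sumExtras)

# Dictionary with the same two-column layout used earlier in this vignette
dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "age", "Age at Enrollment (years)",
  "grade", "Tumor Grade"
)

tbl_priority <- trial |>
  tbl_summary(
    by = trt,
    include = c(age, grade),
    label = list(age ~ "Patient Age")  # manual label (hypothetical text)
  ) |>
  add_auto_labels(dictionary = dictionary) |>
  extras()
```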
sumExtras is designed to work best with the JAMA compact theme. Use
use_jama_theme() to apply this theme to all gtsummary
tables in your session:
# Apply JAMA compact theme (typically done once at the beginning)
use_jama_theme()
#> Setting theme "Compact"
#> Applied JAMA compact theme to {gtsummary}

This is equivalent to calling
gtsummary::set_gtsummary_theme(gtsummary::theme_gtsummary_compact("jama"))
but provides a more convenient interface. You can reset to the default
gtsummary theme at any time with
gtsummary::reset_gtsummary_theme().
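
As a small sketch of how the theme can be scoped to part of a script: apply it, build the tables that should use it, then reset. Only functions already named in this vignette are used, and the choice of variables is arbitrary.

```r
library(gtsummary)
library(sumExtras)

# Apply the JAMA compact theme for the tables that follow
use_jama_theme()

tbl_themed <- trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  extras()

# Return to gtsummary's default theme for anything built afterwards
gtsummary::reset_gtsummary_theme()
```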
For information about matching gt table styles, creating styled group
headers, and advanced formatting techniques, see
vignette("styling").
Here’s a simple workflow demonstrating how these core functions work together:
# 1. Define your dictionary (typically done once per project)
my_dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "trt", "Chemotherapy Treatment",
  "age", "Age at Enrollment (years)",
  "marker", "Marker Level (ng/mL)",
  "stage", "T Stage",
  "grade", "Tumor Grade",
  "response", "Tumor Response"
)

# 2. Set the recommended theme (once per session)
use_jama_theme()
#> Setting theme "Compact"
#> Applied JAMA compact theme to {gtsummary}

# 3. Create a clean, labeled table with one function call
trial |>
  select(age, marker, stage, grade, response, trt) |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |>
  add_auto_labels(dictionary = my_dictionary) |>
  extras()

| | Overall N = 200¹ | Drug A N = 98¹ | Drug B N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.718 |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
| T Stage | | | | 0.866 |
| T1 | 53 (27%) | 28 (29%) | 25 (25%) | |
| T2 | 54 (27%) | 25 (26%) | 29 (28%) | |
| T3 | 43 (22%) | 22 (22%) | 21 (21%) | |
| T4 | 50 (25%) | 23 (23%) | 27 (26%) | |
| Grade | | | | 0.871 |
| I | 68 (34%) | 35 (36%) | 33 (32%) | |
| II | 68 (34%) | 32 (33%) | 36 (35%) | |
| III | 64 (32%) | 31 (32%) | 33 (32%) | |
| Tumor Response | 61 (32%) | 28 (29%) | 33 (34%) | 0.530 |

¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

That’s it! With just a few lines of code, you have a publication-ready table with automatic labeling, clean missing values, bold labels, an overall column, and p-values.
This vignette covered the essential functions to get you started quickly. For more advanced usage:
vignette("labeling") - Learn about
the complete labeling system, including cross-package workflows with
ggplot2, controlling label priority, working with pre-labeled data, and
real-world analysis examples
vignette("styling") - Explore
advanced styling techniques including group headers, background colors,
text formatting, and creating visually polished tables
For detailed information about individual functions, see their help documentation:
?extras - Main styling function
?clean_table - Missing value standardization
?add_auto_labels - Automatic variable labeling
?use_jama_theme - Apply JAMA compact theme

The package is designed to reduce repetitive code while maintaining the flexibility of gtsummary’s modular approach. Use as much or as little as fits your workflow.