Help for package irtsim

Title:

Monte Carlo Simulation-Based Sample-Size Planning for Item Response Theory

Version:

0.2.0

Description:

Provides a pipeline application programming interface (API) for Monte Carlo simulation-based sample-size planning in item response theory (IRT). Implements the 10-decision framework from Schroeders and Gnambs (2025) <doi:10.1177/25152459251314798> as a three-step workflow: specify the data-generating model with irt_design(), add study conditions with irt_study(), and run simulations with irt_simulate(). Supports one-parameter logistic (1PL), two-parameter logistic (2PL), three-parameter logistic (3PL), graded response (GRM), partial credit (PCM), and generalized partial credit (GPCM) models with missing-completely-at-random (MCAR), missing-at-random (MAR), booklet, and linking missingness mechanisms. Results include mean squared error (MSE), bias, root mean squared error (RMSE), standard error (SE), and coverage criteria with summary and plot methods.

License:

GPL (≥ 3)

URL:

https://sward1.github.io/irtsim/, https://github.com/sward1/irtsim

BugReports:

https://github.com/sward1/irtsim/issues

Depends:

R (≥ 4.1.0)

Imports:

cli, future.apply, ggplot2, mirt, rlang

Suggests:

future, knitr, R.rsp (≥ 0.46.0), rmarkdown, scales, testthat (≥ 3.0.0)

VignetteBuilder:

R.rsp, knitr

Config/testthat/edition:

Encoding:

UTF-8

Language:

en-US

RoxygenNote:

7.3.3

NeedsCompilation:

Packaged:

2026-06-27 13:45:20 UTC; stephenward

Author:

Stephen Ward [aut, cre]

Maintainer:

Stephen Ward <stephen_ward+irtsim@abhome.co>

Repository:

CRAN

Date/Publication:

2026-06-27 14:50:02 UTC

irtsim: Monte Carlo Simulation-Based Sample-Size Planning for Item Response Theory

Description

Provides a pipeline application programming interface (API) for Monte Carlo simulation-based sample-size planning in item response theory (IRT). Implements the 10-decision framework from Schroeders and Gnambs (2025) doi:10.1177/25152459251314798 as a three-step workflow: specify the data-generating model with irt_design(), add study conditions with irt_study(), and run simulations with irt_simulate(). Supports one-parameter logistic (1PL), two-parameter logistic (2PL), three-parameter logistic (3PL), graded response (GRM), partial credit (PCM), and generalized partial credit (GPCM) models with missing-completely-at-random (MCAR), missing-at-random (MAR), booklet, and linking missingness mechanisms. Results include mean squared error (MSE), bias, root mean squared error (RMSE), standard error (SE), and coverage criteria with summary and plot methods.

Author(s)

Maintainer: Stephen Ward <stephen_ward+irtsim@abhome.co>

Internal Criterion Registry

Description

Internal Criterion Registry

Usage

.get_criterion_registry()

Value

A named list of criterion configurations.

Internal Missing Mechanism Registry

Description

Internal Missing Mechanism Registry

Usage

.get_missing_registry()

Value

A named list of missing mechanism configurations.

Internal Model Registry

Description

Internal Model Registry

Usage

.get_model_registry()

Value

A named list of model configurations.

Apply Missing Data Mechanism (Internal)

Description

Takes a complete response matrix and introduces missingness according to the study specification.

Usage

apply_missing(data, study, seed = NULL, theta = NULL)

Arguments

data

Numeric matrix (N x n_items). Complete response data.

study

An irt_study object specifying the missing data mechanism.

seed

Integer. Random seed for reproducibility.

theta

Optional numeric vector of length nrow(data). Required when study$missing == "mar".

Value

Numeric matrix of same dimensions as data, with NA values introduced according to the missingness mechanism.
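The simplest mechanism, MCAR, can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's internal code: mask_mcar() and its rate argument are hypothetical names, and each cell is assumed to be dropped independently with the same probability.

```r
# Hypothetical helper (not part of irtsim): MCAR masking of a response matrix.
mask_mcar <- function(data, rate, seed = NULL) {
  stopifnot(is.matrix(data), rate >= 0, rate < 1)
  if (!is.null(seed)) set.seed(seed)
  # Each cell is dropped independently with probability `rate`
  drop <- matrix(runif(length(data)) < rate, nrow = nrow(data))
  data[drop] <- NA
  data
}

complete <- matrix(rbinom(200 * 10, 1, 0.5), nrow = 200)
observed <- mask_mcar(complete, rate = 0.2, seed = 1)
dim(observed)          # same dimensions as the input
mean(is.na(observed))  # close to the requested rate of 0.2
```

Observed cells keep their original values; only the pattern of NA entries changes.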

Build Criterion Plot (Internal)

Description

Shared plotting engine used by both plot.irt_results() and plot.summary_irt_results().

Usage

build_criterion_plot(
  summary_obj,
  criterion = "rmse",
  param = NULL,
  item = NULL,
  threshold = NULL
)

Arguments

summary_obj

A summary_irt_results object.

criterion

Character string. Criterion to plot.

param

Optional character vector. Parameter filter.

item

Optional integer vector. Item filter.

threshold

Optional numeric. Horizontal reference line.

Value

A ggplot2::ggplot object, returned invisibly.

Build NA Item Results for Non-Converged Iterations (Internal)

Description

Build NA Item Results for Non-Converged Iterations (Internal)

Usage

build_na_item_results(true_params, iteration, sample_size)

Arguments

true_params

Data frame from build_true_params().

iteration

Integer iteration number.

sample_size

Integer sample size.

Value

Data frame matching item_results schema with NA estimates.

Build True Parameter Data Frame (Internal)

Description

Creates a data frame of true parameter values from the design, used for populating the true_value column and for building NA rows on non-convergence.

Usage

build_true_params(design)

Arguments

design

An irt_design object.

Value

Data frame with columns: item, param, true_value.

Build True Parameter Data Frame for Estimation Model (Internal)

Description

Creates a data frame of true parameter values adjusted for the estimation model, accounting for misspecification. When the generation and estimation models differ, some parameters must be filled with defaults (e.g., discrimination = 1 when estimating 2PL from 1PL data) and others dropped entirely (e.g., discrimination when estimating 1PL from 2PL data).

Usage

build_true_params_for_estimation(design, estimation_model)

Arguments

design

An irt_design object specifying the generation model.

estimation_model

Character string: one of "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM" (canonical list registered in get_model_config()).

Value

Data frame with columns: item, param, true_value, matching the schema of the estimation model.

Compute Performance Criteria for a Single Parameter (Internal)

Description

Given a vector of per-iteration estimates and the true parameter value, computes bias, empirical SE, MSE, RMSE, coverage, and Monte Carlo SEs following Morris et al. (2019).

Usage

compute_criterion(estimates, true_value, ci_lower = NULL, ci_upper = NULL)

Arguments

estimates

Numeric vector of per-iteration parameter estimates. May contain NAs (non-converged iterations), which are excluded.

true_value

Single numeric value. The data-generating (true) parameter value.

ci_lower

Optional numeric vector (same length as estimates). Lower bounds of confidence intervals. If NULL, coverage is not computed.

ci_upper

Optional numeric vector (same length as estimates). Upper bounds of confidence intervals. If NULL, coverage is not computed.

Value

A named list with elements:

bias: Mean estimate minus true value.
empirical_se: Sample standard deviation of estimates (n-1 denominator).
mse: Mean squared error: mean((estimate - true_value)^2).
rmse: Root mean squared error: sqrt(mse).
coverage: Proportion of CIs containing the true value, or NULL if CIs not provided. NAs in CIs are excluded from the denominator.
mcse_bias: Monte Carlo SE of bias: empirical_se / sqrt(K), where K is the number of non-missing estimates.
mcse_mse: Monte Carlo SE of MSE: sd((est - true)^2) / sqrt(K).

References

Morris, T. P., White, I. R., & Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074–2102. doi:10.1002/sim.8086
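The formulas listed under Value can be restated in a few lines of base R. The sketch below is an illustrative re-implementation, not the package's internal code; criteria_sketch() is a hypothetical name, and coverage is omitted for brevity.

```r
# Hypothetical re-implementation for illustration; not irtsim's internal code.
criteria_sketch <- function(estimates, true_value) {
  est <- estimates[!is.na(estimates)]   # drop non-converged iterations
  K <- length(est)                      # number of usable replications
  err <- est - true_value
  emp_se <- sd(est)                     # n - 1 denominator
  mse <- mean(err^2)
  list(
    bias = mean(est) - true_value,
    empirical_se = emp_se,
    mse = mse,
    rmse = sqrt(mse),
    mcse_bias = emp_se / sqrt(K),
    mcse_mse = sd(err^2) / sqrt(K)
  )
}

out <- criteria_sketch(c(0.9, 1.1, NA, 1.2, 0.8), true_value = 1)
out$bias   # 0, up to floating-point error
out$mse    # 0.025
```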

Convert Mirt's d Parameter to IRT Difficulty b

Description

Applies the delta method to convert d (intercept) to b (difficulty) under the 2PL/GRM parameterization: b = -d / a. Propagates standard errors and confidence intervals correctly.

Usage

convert_d_to_b(a_info, d_info)

Arguments

a_info

A list with elements est, se, ci_lower, ci_upper for the discrimination parameter a. Typically from extract_one_param().

d_info

A list with elements est, se, ci_lower, ci_upper for the intercept parameter d. Typically from extract_one_param().

Details

The delta method variance for b = -d/a, treating the estimates of a and d as uncorrelated (no covariance term), is:

\text{Var}(b) = \frac{\text{Var}(d)}{a^2} + \frac{d^2 \text{Var}(a)}{a^4}

CI bounds are transformed directly: b_ci = -d_ci / a.

Value

A list with elements est, se, ci_lower, ci_upper for the converted difficulty parameter b, using delta method variance propagation.
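As a worked illustration of the variance formula in Details, the following base R sketch converts one intercept to a difficulty and propagates the standard error. The function name and arguments are hypothetical, and CI handling is left out.

```r
# Hypothetical sketch of the conversion; names are illustrative only.
d_to_b_sketch <- function(a, d, se_a, se_d) {
  b <- -d / a
  # Delta method using only Var(a) and Var(d), as in the formula in Details
  var_b <- se_d^2 / a^2 + d^2 * se_a^2 / a^4
  list(est = b, se = sqrt(var_b))
}

res <- d_to_b_sketch(a = 2, d = -1, se_a = 0.2, se_d = 0.1)
res$est  # 0.5
res$se   # sqrt(0.0025 + 0.0025) = sqrt(0.005)
```

Note that dividing by -a reverses the order of the interval: for a > 0, the lower bound of d corresponds to the upper bound of b.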

Extract a Single Parameter Estimate with SE and CI

Description

Internal helper to extract estimate, standard error, and confidence interval bounds from a single column of a mirt coefficient matrix.

Usage

extract_one_param(mat, col_name)

Arguments

mat

A coefficient matrix from mirt::coef() with rows "par", "CI_2.5", "CI_97.5" and column names matching mirt's parameter naming.

col_name

Character string: the column name to extract (e.g., "a1", "d", "d2").

Value

A list with elements:

est: The parameter estimate (from "par" row).
se: The standard error derived from CI width, or NA if CI rows absent.
ci_lower: Lower CI bound (from "CI_2.5" row), or NA if absent.
ci_upper: Upper CI bound (from "CI_97.5" row), or NA if absent.

Extract Item Parameter Estimates from a Fitted mirt Model (Internal)

Description

Pulls point estimates, SEs, and CIs from a fitted mirt object and returns them in long format matching the item_results schema.

Usage

extract_params(
  mod,
  design,
  estimation_model,
  iteration,
  sample_size,
  true_params,
  true_params_lookup,
  se = TRUE
)

Arguments

mod

A fitted mirt object.

design

An irt_design object (for true values and model type).

estimation_model

Character string: one of "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM" (the model that was fitted, which may differ from design$model; canonical list registered in get_model_config()).

iteration

Integer iteration number.

sample_size

Integer sample size.

true_params

Data frame (used for schema).

true_params_lookup

Named numeric vector whose names are keys of the form "item_param" (e.g., "Item_1_a") and whose values are the corresponding true_value entries (for O(1) lookup instead of repeated vector scans).

se

Logical. Extract standard errors and CIs? Default TRUE.

Details

Uses the list-based coef(mod) output (one matrix per item, keyed by item name). By default, coef() returns the rows par, CI_2.5, and CI_97.5; SE is derived from the CI width: SE = (upper - lower) / (2 * z_0.975).

Value

Data frame with item_results columns.
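The SE-from-CI-width rule in Details can be checked with a short base R sketch. It assumes a symmetric Wald interval; se_from_ci() is a hypothetical helper, not a package function.

```r
# Illustrative only: recover an SE from a symmetric Wald interval.
se_from_ci <- function(lower, upper, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  (upper - lower) / (2 * z)
}

est <- 1.2
se <- 0.15
ci <- est + c(-1, 1) * qnorm(0.975) * se   # build a 95% interval
se_from_ci(ci[1], ci[2])                   # 0.15
```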

Extract Theta Recovery Summary from a Fitted mirt Model (Internal)

Description

Computes EAP theta estimates and summarizes recovery via correlation and RMSE against true theta.

Usage

extract_theta_summary(mod, theta_true, iteration, sample_size)

Arguments

mod

A fitted mirt object.

theta_true

Numeric vector of true theta values.

iteration

Integer iteration number.

sample_size

Integer sample size.

Value

Single-row data frame with theta_results columns.

Fit an IRT Model (Internal)

Description

Wraps mirt::mirt() with error and convergence handling.

Usage

fit_model(data, model, se = TRUE)

Arguments

data

Numeric matrix of response data (may contain NAs).

model

Character string: one of "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM" (canonical list registered in get_model_config()).

se

Logical. Compute standard errors? Default TRUE.

Value

A list with elements model (fitted mirt object or NULL) and converged (logical).

Generate IRT Response Data (Internal)

Description

Wraps mirt::simdata() to produce a response matrix from an irt_design specification. Handles the b-to-d parameterization translation and theta generation.

Usage

generate_data(design, n, seed = NULL, theta = NULL)

Arguments

design

An irt_design object.

n

Integer. Number of respondents.

seed

Optional integer. Random seed for reproducibility. If NULL, the current RNG state drives draws (used by the parallel dispatch path in irt_simulate() so future.apply's L'Ecuyer-CMRG substreams are not clobbered by an explicit set.seed() call).

theta

Optional numeric vector of length n. Pre-generated theta values. If NULL, theta is drawn from design$theta_dist.

Value

A numeric matrix with n rows and design$n_items columns.
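To make the b-to-d translation concrete, here is a minimal base R sketch of 2PL data generation. It assumes the slope-intercept form P = plogis(a * theta + d) with d = -a * b (the inverse of b = -d / a); simulate_2pl() is a hypothetical stand-in, since the package itself delegates to mirt::simdata().

```r
# Illustrative 2PL generator in base R; not the package's implementation.
simulate_2pl <- function(a, b, theta) {
  d <- -a * b                                  # inverse of b = -d / a
  # Linear predictor: rows are persons, columns are items
  eta <- outer(theta, a) + matrix(d, length(theta), length(a), byrow = TRUE)
  p <- plogis(eta)                             # P(correct) per person and item
  matrix(rbinom(length(p), 1, p), nrow = length(theta))
}

set.seed(1)
theta <- rnorm(500)
resp <- simulate_2pl(a = c(1, 1.5, 0.8), b = c(-1, 0, 1), theta = theta)
dim(resp)       # 500 rows (respondents) by 3 columns (items)
colMeans(resp)  # items with lower b have higher proportions correct
```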

Generate Discrimination Parameters

Description

Internal helper to generate discrimination (a) parameters under a specified distribution. Currently supports log-normal.

Usage

generate_discrimination(n_items, a_dist, a_mean, a_sd)

Arguments

n_items

Positive integer: the number of items (and thus the number of discrimination values to generate).

a_dist

Character string: the distribution name. Currently only "lnorm" is supported.

a_mean

Numeric: the mean parameter for the log-normal distribution (interpreted as meanlog).

a_sd

Numeric: the standard deviation parameter for the log-normal distribution (interpreted as sdlog).

Value

A numeric vector of length n_items containing discrimination values.

Generate Theta Values from a Distribution Specification (Internal)

Description

Generate Theta Values from a Distribution Specification (Internal)

Usage

generate_theta(theta_dist, n)

Arguments

theta_dist

Character string ("normal", "uniform") or function.

n

Integer. Number of values to generate.

Value

Numeric vector of length n.
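A dispatcher of this kind might look like the sketch below. The name draw_theta() is hypothetical, and the bounds used for "uniform" are an assumption made for illustration; this help page does not state the package's actual bounds.

```r
# Hypothetical dispatcher; the uniform bounds are an assumption, not irtsim's.
draw_theta <- function(theta_dist, n) {
  # A user-supplied function takes n and returns n values
  if (is.function(theta_dist)) return(theta_dist(n))
  switch(theta_dist,
    normal  = rnorm(n),
    uniform = runif(n, min = -3, max = 3),
    stop("Unknown theta_dist: ", theta_dist)
  )
}

set.seed(1)
z <- draw_theta("normal", 100)
skewed <- draw_theta(function(n) rchisq(n, df = 3) - 3, 100)
length(z)       # 100
length(skewed)  # 100
```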

Get a Criterion Configuration from the Registry (Internal)

Description

Retrieves the metadata for a specified criterion and validates that it exists.

Usage

get_criterion_config(criterion)

Arguments

criterion

Character string: one of "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

Value

A named list with criterion-specific metadata: direction (character: "lower_is_better" or "higher_is_better"), use_abs (logical), and display_label (character).

Get a Missing Mechanism Configuration from the Registry (Internal)

Description

Retrieves the metadata for a specified missing data mechanism and validates that it exists.

Usage

get_missing_config(mechanism)

Arguments

mechanism

Character string: one of "none", "mcar", "mar", "booklet", or "linking".

Value

A named list with mechanism-specific metadata: requires_rate (logical), requires_test_design (logical), test_design_key (character or NA), print_label (character), print_style (character: "plain", "rate", or "design"), and design_unit (character or NA).

Get a Model Configuration from the Registry (Internal)

Description

Retrieves the configuration for a specified model and validates that it exists.

Usage

get_model_config(model)

Arguments

model

Character string: "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM".

Value

A named list with model-specific functions and metadata.

Create an IRT Design Specification

Description

Define the data-generating model for an IRT simulation study. This captures decisions 1–3 from the Schroeders & Gnambs (2025) framework: dimensionality, item parameters, and item type.

Usage

irt_design(model, n_items, item_params, theta_dist = "normal", n_factors = 1L)

Arguments

model

Character string specifying the IRT model. One of "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM". The canonical list is registered in get_model_config().

n_items

Positive integer. Number of items in the instrument.

item_params

A named list of item parameters. Contents depend on model:

1PL: b (numeric vector of length n_items). Discrimination is fixed at 1 for all items and added automatically.
2PL: a (discrimination, positive numeric vector or matrix) and b (difficulty, numeric vector), each of length n_items.
3PL: a, b, and c (guessing parameter, numeric vector with values in [0, 1)), each of length n_items.
GRM: a (discrimination, positive numeric vector) of length n_items and b (threshold matrix, n_items rows by n_categories - 1 columns; thresholds ordered within row).
PCM: a (numeric vector, all 1 — Rasch family) of length n_items and b (step matrix, n_items rows by n_categories - 1 columns; steps NOT required to be ordered within row).
GPCM: a (positive numeric vector) of length n_items and b (step matrix, same shape as PCM; steps NOT required to be ordered within row).

See irt_params_1pl(), irt_params_2pl(), irt_params_3pl(), irt_params_grm(), irt_params_pcm(), and irt_params_gpcm() for helpers that generate item_params lists matching each schema.

theta_dist

Either a character string ("normal" or "uniform") or a function that takes a single argument n and returns a numeric vector of length n. Defaults to "normal".

n_factors

Positive integer specifying the number of latent factors. Defaults to 1L. Currently only n_factors = 1 is supported; multidimensional IRT (n_factors > 1) is planned for v0.4.0. Passing any value other than 1 raises an error rather than silently propagating an unsupported design to the estimator.

Value

An S3 object of class irt_design (a named list) with elements model, n_items, item_params, theta_dist, and n_factors.

Examples

# 1PL (Rasch) design with 20 items
design_1pl <- irt_design(
  model = "1PL",
  n_items = 20,
  item_params = list(b = seq(-2, 2, length.out = 20))
)

# 2PL design
design_2pl <- irt_design(
  model = "2PL",
  n_items = 30,
  item_params = list(
    a = rlnorm(30, 0, 0.25),
    b = seq(-2, 2, length.out = 30)
  )
)

Compute Required Monte Carlo Replications

Description

Uses the Burton et al. (2006) formula to determine the minimum number of simulation replications needed to achieve a desired level of Monte Carlo precision.

Usage

irt_iterations(sigma, delta, alpha = 0.05)

Arguments

sigma

Positive numeric. The empirical standard error of the estimand across replications (or a pilot estimate thereof).

delta

Positive numeric. The acceptable Monte Carlo error (half-width of the MC confidence interval for the estimand).

alpha

Numeric in (0, 1). Two-sided significance level. Default 0.05 (i.e., 95 percent MC confidence).

Details

The formula is:

R = \lceil (z_{\alpha/2} \cdot \sigma / \delta)^2 \rceil

where \sigma is the empirical standard error of the estimand, \delta is the acceptable Monte Carlo error, and z_{\alpha/2} is the critical value for the desired confidence level.

Value

An integer: the minimum number of replications.

References

Burton, A., Altman, D. G., Royston, P., & Holder, R. L. (2006). The design of simulation studies in medical statistics. Statistics in Medicine, 25(24), 4279–4292. doi:10.1002/sim.2673

Examples

# How many replications keep the Monte Carlo error of the bias within 0.1
# when the empirical SE of the estimand is 0.5?
irt_iterations(sigma = 0.5, delta = 0.1)

# Tighter tolerance with 99% MC confidence
irt_iterations(sigma = 0.5, delta = 0.05, alpha = 0.01)
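The formula in Details can also be evaluated by hand in base R. The sketch below restates it with a hypothetical function name, reps_needed(), taking z as the upper alpha/2 critical value of the standard normal distribution.

```r
# Base-R restatement of the formula; `reps_needed` is a hypothetical name.
reps_needed <- function(sigma, delta, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)             # two-sided critical value
  as.integer(ceiling((z * sigma / delta)^2))
}

reps_needed(sigma = 0.5, delta = 0.1)                  # 97
reps_needed(sigma = 0.5, delta = 0.05, alpha = 0.01)   # 664
```

Halving delta roughly quadruples the required replications, because delta enters the formula squared.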

Generate 1PL Item Parameters

Description

Creates a list of difficulty (b) parameters suitable for passing to irt_design() with model = "1PL". The 1PL model is Rasch-family: every item shares the same discrimination (fixed at 1), so only b is generated here — the a = 1 contract is applied downstream in the design's validate_params step.

Usage

irt_params_1pl(
  n_items,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

b_dist

Character string for the difficulty distribution. One of "normal" or "even". Default: "normal".

b_mean

Numeric. Mean of the normal distribution for b. Only used when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of the normal distribution for b. Only used when b_dist = "normal". Default: 1.

b_range

Numeric vector of length 2. Range for evenly-spaced b values. Only used when b_dist = "even". Default: c(-2, 2).

seed

Optional integer seed for reproducibility. If NULL (default), the current RNG state is used.

Value

A named list with a single element b (numeric vector of length n_items). Note: no a is returned — 1PL fixes discrimination at 1 downstream rather than at generation time.

Examples

# Default 1PL parameters for 30 items
params <- irt_params_1pl(n_items = 30, seed = 42)

# Evenly-spaced difficulty across a wider range
params <- irt_params_1pl(n_items = 20, b_dist = "even", b_range = c(-3, 3))

Generate 2PL Item Parameters

Description

Creates a list of discrimination (a) and difficulty (b) parameters suitable for passing to irt_design().

Usage

irt_params_2pl(
  n_items,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" (log-normal) is supported. Default: "lnorm".

a_mean

Numeric. Mean of the log-normal distribution for a (i.e., meanlog). Default: 0.

a_sd

Numeric. SD of the log-normal distribution for a (i.e., sdlog). Default: 0.25.

b_dist

Character string for the difficulty distribution. One of "normal" or "even". Default: "normal".

b_mean

Numeric. Mean of the normal distribution for b. Only used when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of the normal distribution for b. Only used when b_dist = "normal". Default: 1.

b_range

Numeric vector of length 2. Range for evenly-spaced b values. Only used when b_dist = "even". Default: c(-2, 2).

seed

Optional integer seed for reproducibility. If NULL (default), the current RNG state is used.

Value

A named list with elements a (numeric vector) and b (numeric vector), each of length n_items.

Examples

# Default 2PL parameters for 30 items
params <- irt_params_2pl(n_items = 30, seed = 42)

# Evenly-spaced difficulty
params <- irt_params_2pl(n_items = 20, b_dist = "even", b_range = c(-3, 3))

Generate 3PL Item Parameters

Description

Creates a list of discrimination (a), difficulty (b), and guessing (c) parameters suitable for passing to irt_design() with model = "3PL".

Usage

irt_params_3pl(
  n_items,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  c_shape1 = 5,
  c_shape2 = 17,
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" (log-normal) is supported. Default: "lnorm".

a_mean

Numeric. meanlog for the log-normal distribution. Default: 0.

a_sd

Numeric. sdlog for the log-normal distribution. Default: 0.25.

b_dist

Character string for the difficulty distribution. One of "normal" or "even". Default: "normal".

b_mean

Numeric. Mean of the normal distribution for b. Only used when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of the normal distribution for b. Only used when b_dist = "normal". Default: 1.

b_range

Numeric vector of length 2. Range for evenly-spaced b values. Only used when b_dist = "even". Default: c(-2, 2).

c_shape1

Positive numeric. First shape parameter of the Beta distribution used to generate c. Default: 5.

c_shape2

Positive numeric. Second shape parameter. Default: 17. The default Beta(5, 17) has E[c] ≈ 0.227 and SD ≈ 0.087, consistent with typical four-option multiple-choice items.

seed

Optional integer seed for reproducibility. If NULL (default), the current RNG state is used.

Value

A named list with elements a, b, c, each a numeric vector of length n_items.

Examples

# Default 3PL parameters for 30 items
params <- irt_params_3pl(n_items = 30, seed = 42)

# Custom guessing distribution (e.g., 5-option items, lower chance level)
params <- irt_params_3pl(
  n_items = 30, c_shape1 = 4, c_shape2 = 16, seed = 42
)
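The mean and SD implied by a pair of Beta shape parameters follow from the standard Beta moment formulas, which makes it easy to check a candidate guessing distribution before using it. beta_moments() is a hypothetical helper, not part of the package.

```r
# Mean and SD of a Beta(shape1, shape2) distribution (standard formulas).
beta_moments <- function(shape1, shape2) {
  s <- shape1 + shape2
  c(mean = shape1 / s,
    sd = sqrt(shape1 * shape2 / (s^2 * (s + 1))))
}

round(beta_moments(5, 17), 3)   # mean 0.227, sd 0.087 (the defaults)
round(beta_moments(4, 16), 3)   # mean 0.2,   sd 0.087
```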

Generate GPCM Item Parameters

Description

Creates a list of discrimination (a) and step (b) parameters suitable for passing to irt_design() with model = "GPCM".

Usage

irt_params_gpcm(
  n_items,
  n_categories,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  step_dispersion = 1,
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

n_categories

Positive integer >= 2. Number of response categories per item. Produces n_categories - 1 step columns in b.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" (log-normal) is supported. Default: "lnorm".

a_mean

Numeric. meanlog for the log-normal distribution. Default: 0.

a_sd

Numeric. sdlog for the log-normal distribution. Default: 0.25.

b_dist

Character string for the item-center distribution: either "normal" (default) or "even".

b_mean

Numeric. Mean of item centers when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of item centers when b_dist = "normal". Default: 1.

b_range

Length-2 numeric vector giving the minimum and maximum item-center values. Only used when b_dist = "even". Default: c(-2, 2).

step_dispersion

Non-negative numeric. SD of the within-item step offsets, which are drawn from a normal distribution with mean 0 and SD step_dispersion and added to each item's center. Default: 1.0. A value of 0 is allowed; all steps within an item then equal the item center, which is degenerate but useful for design exploration.

seed

Optional integer seed for reproducibility.

Details

The Generalized Partial Credit Model (Muraki, 1992) belongs to the partial-credit family: as in the Partial Credit Model, step parameters within each item are NOT required to be ordered (the defining contrast with the Graded Response Model). Unlike PCM, GPCM allows per-item discrimination: a is a free positive vector rather than fixed at 1. See irt_params_pcm() for the Rasch-family alternative.

Value

A named list with elements:

a: Positive numeric vector of length n_items.
b: Numeric matrix with n_items rows and n_categories - 1 columns. Steps are NOT sorted within row.

Examples

# GPCM parameters: 15 items, 4 response categories
params <- irt_params_gpcm(n_items = 15, n_categories = 4, seed = 42)

# Tighter within-item step spread and a wider discrimination distribution
params <- irt_params_gpcm(
  n_items = 15, n_categories = 4,
  a_sd = 0.50, step_dispersion = 0.5, seed = 42
)

Generate GRM Item Parameters

Description

Creates a list of discrimination (a) and threshold (b) parameters suitable for passing to irt_design() with model = "GRM".

Usage

irt_params_grm(
  n_items,
  n_categories,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_mean = 0,
  b_sd = 1,
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

n_categories

Positive integer >= 2. Number of response categories per item. Produces n_categories - 1 threshold columns in b.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" is supported. Default: "lnorm".

a_mean

Numeric. meanlog for the log-normal distribution. Default: 0.

a_sd

Numeric. sdlog for the log-normal distribution. Default: 0.25.

b_mean

Numeric. Mean around which thresholds are centered. Default: 0.

b_sd

Numeric. SD of the base threshold distribution. Default: 1.

seed

Optional integer seed for reproducibility.

Value

A named list with elements:

a: Numeric vector of length n_items.
b: Numeric matrix with n_items rows and n_categories - 1 columns. Thresholds are ordered within each row.

Examples

# GRM parameters: 15 items, 5 response categories
params <- irt_params_grm(n_items = 15, n_categories = 5, seed = 42)
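The ordering constraint on GRM thresholds can be illustrated in base R. This sketch draws thresholds independently and sorts them within each row; that is one simple way to satisfy the constraint and is an assumption here, since the package's actual spacing rule is not documented on this page. ordered_thresholds() is a hypothetical name.

```r
# Hypothetical generator; irtsim's actual spacing rule may differ.
ordered_thresholds <- function(n_items, n_categories, b_mean = 0, b_sd = 1) {
  b <- matrix(rnorm(n_items * (n_categories - 1), b_mean, b_sd),
              nrow = n_items)
  # Sort each row so thresholds increase from left to right
  for (i in seq_len(n_items)) b[i, ] <- sort(b[i, ])
  b
}

set.seed(42)
b <- ordered_thresholds(n_items = 15, n_categories = 5)
dim(b)                                           # 15 rows by 4 columns
all(apply(b, 1, function(r) all(diff(r) > 0)))   # TRUE
```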

Generate PCM Item Parameters

Description

Creates a list of discrimination (a, fixed at 1) and step (b) parameters suitable for passing to irt_design() with model = "PCM".

Usage

irt_params_pcm(
  n_items,
  n_categories,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  step_dispersion = 1,
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

n_categories

Positive integer >= 2. Number of response categories per item. Produces n_categories - 1 step columns in b.

b_dist

Character string for the item-center distribution: either "normal" (default) or "even".

b_mean

Numeric. Mean of item centers when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of item centers when b_dist = "normal". Default: 1.

b_range

Length-2 numeric vector giving the minimum and maximum item-center values. Only used when b_dist = "even". Default: c(-2, 2).

step_dispersion

Non-negative numeric. SD of the within-item step offsets, which are drawn from a normal distribution with mean 0 and SD step_dispersion and added to each item's center. Default: 1.0, consistent with mirt::simdata's polytomous conventions and the PCM examples in Embretson & Reise (2000). A value of 0 is allowed; all steps within an item then equal the item center, which is degenerate but useful for design exploration.

seed

Optional integer seed for reproducibility.

Details

The Partial Credit Model (Masters, 1982) is a Rasch-family polytomous model: every item shares the same discrimination (fixed at 1), and the step parameters within each item are NOT required to be ordered. This is the defining contrast with the Graded Response Model — see irt_params_grm() for the ordered-threshold alternative.

Value

A named list with elements:

a: Numeric vector of length n_items, all 1 (Rasch family).
b: Numeric matrix with n_items rows and n_categories - 1 columns. Steps are NOT sorted within row.

Examples

# PCM parameters: 15 items, 4 response categories
params <- irt_params_pcm(n_items = 15, n_categories = 4, seed = 42)

# Tighter within-item step spread (steps closer to the item center)
params <- irt_params_pcm(
  n_items = 15, n_categories = 4, step_dispersion = 0.5, seed = 42
)
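The "item center plus step offsets" construction described under step_dispersion can be sketched in base R as follows. This is an illustration under stated assumptions (normal item centers only, hypothetical function name pcm_steps()), not the package's code.

```r
# Hypothetical sketch of "item center plus step offsets"; not irtsim's code.
pcm_steps <- function(n_items, n_categories, b_mean = 0, b_sd = 1,
                      step_dispersion = 1) {
  centers <- rnorm(n_items, b_mean, b_sd)
  offsets <- matrix(rnorm(n_items * (n_categories - 1), 0, step_dispersion),
                    nrow = n_items)
  list(a = rep(1, n_items),     # Rasch family: discrimination fixed at 1
       b = centers + offsets)   # centers recycle down columns: one per row
}

set.seed(42)
p <- pcm_steps(n_items = 15, n_categories = 4)
dim(p$b)   # 15 rows by 3 columns; steps are left unsorted within rows

# With step_dispersion = 0, every step equals its item center
flat <- pcm_steps(n_items = 3, n_categories = 4, step_dispersion = 0)$b
```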

Run an IRT Monte Carlo Simulation

Description

Execute a Monte Carlo simulation study based on an irt_study specification. For each iteration and sample size, data are generated, missing values applied, the IRT model is fitted, and parameter estimates are extracted and stored.

Usage

irt_simulate(
  study,
  iterations,
  seed,
  progress = TRUE,
  parallel = FALSE,
  se = TRUE,
  compute_theta = TRUE
)

Arguments

study

An irt_study object specifying the design and study conditions.

iterations

Positive integer. Number of Monte Carlo replications.

seed

Integer. Base random seed for reproducibility. Each iteration uses seed + iteration - 1.

progress

Logical. Print progress messages? Default TRUE.

parallel

Logical. Run iterations in parallel using future.apply::future_lapply()? Default FALSE. Requires users to set up a future plan (e.g., future::plan(multisession)) before calling. See Details.

se

Logical. Compute standard errors and confidence intervals for item parameter estimates? Default TRUE. Set to FALSE for significant speed improvement when only point estimates are needed (e.g., MSE, bias, RMSE criteria). When FALSE, se/ci_lower/ci_upper columns in item_results are NA.

compute_theta

Logical. Compute EAP theta estimates and recovery metrics (correlation, RMSE)? Default TRUE. Set to FALSE to skip the mirt::fscores() call when theta recovery is not needed. When FALSE, theta_cor and theta_rmse in theta_results are NA (but converged is still tracked).

Details

The returned irt_results object stores raw per-iteration estimates. Use summary.irt_results() to compute performance criteria (bias, MSE, RMSE, coverage, etc.) and plot.irt_results() to visualize results.

Parallelization

When parallel = TRUE, the Monte Carlo loop over iterations is parallelized via future.apply::future_lapply(). Each parallel task processes one iteration across all sample sizes sequentially.

Important: This function does NOT configure a future plan. Users must set their own plan before calling with parallel = TRUE:

library(future)
plan(multisession, workers = 4)  # or your preferred backend
results <- irt_simulate(study, iterations = 100, seed = 42, parallel = TRUE)

Without an explicit plan, future defaults to sequential execution (no parallelism).

Reproducibility contract

Reproducibility is guaranteed within a given dispatch mode, not across modes:

Serial mode (parallel = FALSE) uses deterministic per-cell seeds under the session's default RNG kind (Mersenne-Twister). Re-running with the same base seed reproduces identical results bit-for-bit.
Parallel mode (parallel = TRUE) delegates RNG management to future.apply::future_lapply(..., future.seed = TRUE), which assigns each iteration a formally independent L'Ecuyer-CMRG substream. Re-running with the same base seed reproduces identical results bit-for-bit across parallel runs, including across different worker counts.
Across modes, numerical results will differ because the two paths use different RNG algorithms and different seeding strategies. Both are statistically valid; the parallel path has the stronger formal guarantee of independent substreams, which is the standard for Monte Carlo work.
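The serial-mode guarantee rests on deterministic seeding. The sketch below shows the idea with the seed + iteration - 1 rule stated for the seed argument; run_iteration() is a hypothetical stand-in for one generate-fit-extract cycle, not a package function.

```r
# Illustrative only: per-iteration seeding as `seed + iteration - 1`.
run_iteration <- function(iteration, base_seed) {
  set.seed(base_seed + iteration - 1)
  mean(rnorm(50))            # stand-in for generate, fit, and extract
}

first  <- vapply(1:5, run_iteration, numeric(1), base_seed = 42)
second <- vapply(1:5, run_iteration, numeric(1), base_seed = 42)
identical(first, second)     # TRUE: same base seed, same results
```

Because each iteration sets its own seed, the result of an iteration does not depend on how many iterations ran before it.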

Progress messages are suppressed in parallel mode because workers cannot safely stream to stdout. In serial mode, messages appear every 10% of iterations; set progress = FALSE to suppress them.

Value

An S3 object of class irt_results containing:

item_results: Data frame with per-iteration item parameter estimates (columns: iteration, sample_size, item, param, true_value, estimate, se, ci_lower, ci_upper, converged).
theta_results: Data frame with per-iteration theta recovery summaries (columns: iteration, sample_size, theta_cor, theta_rmse, converged).
study: The original irt_study object.
iterations: Number of replications run.
seed: Base seed used.
elapsed: Elapsed wall-clock time in seconds.
se: Logical flag indicating whether SEs and CIs were computed.
compute_theta: Logical flag indicating whether theta recovery metrics were computed.

Examples


# Minimal example (iterations and sample sizes reduced for speed;
# use iterations >= 100 and 3+ sample sizes in practice)
design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
summary(results)
plot(results)

Define Study Conditions for an IRT Simulation

Description

Add study-level conditions to an IRT design specification. This captures decisions 4–5 from the Schroeders & Gnambs (2025) framework: sample sizes and missing data mechanism.

Usage

irt_study(
  design,
  sample_sizes,
  missing = "none",
  missing_rate = NULL,
  test_design = NULL,
  estimation_model = NULL
)

Arguments

design

An irt_design object specifying the data-generating model.

sample_sizes

Integer vector of sample sizes to evaluate. Values are coerced to integer, sorted in ascending order, and deduplicated.
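
The normalization described above can be mimicked with a base R one-liner. This is a sketch of the documented behaviour, not the package's internal code; the input vector is made up for illustration.

```r
# Hypothetical user input: unsorted, with a duplicate, stored as doubles
raw_sizes <- c(500, 100, 250, 100)

# Coerce to integer, drop duplicates, sort ascending
clean_sizes <- sort(unique(as.integer(raw_sizes)))
clean_sizes
```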

missing

Character string specifying the missing data mechanism. One of "none" (default), "mcar", "mar", "booklet", or "linking".

missing_rate

Numeric value in [0, 1) specifying the proportion of missing data. Required when missing is "mcar" or "mar"; ignored when missing is "none".
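
To make the meaning of missing_rate under MCAR concrete, the sketch below deletes each response independently with a fixed probability. It illustrates the concept only; the response matrix is simulated and the masking step is an assumption, not the package's internal implementation.

```r
set.seed(1)

# Hypothetical binary response matrix: 200 persons x 5 items
responses <- matrix(rbinom(200 * 5, size = 1, prob = 0.5), nrow = 200)

# MCAR: every cell is set to NA with the same probability,
# independent of the person, the item, and the response itself
missing_rate <- 0.2
mask <- matrix(runif(length(responses)) < missing_rate,
               nrow = nrow(responses))
responses[mask] <- NA

# Observed proportion of missing cells (close to missing_rate)
mean(is.na(responses))
```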

test_design

A list specifying the test design for structured missingness. Required when missing is "booklet" or "linking".

booklet: Must contain booklet_matrix: a binary matrix (n_booklets x n_items) where 1 indicates the item is administered.
linking: Must contain linking_matrix: a binary matrix (n_forms x n_items) where 1 indicates the item appears on the form.
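
A minimal sketch of building a booklet_matrix in base R follows. The three-booklet, six-item layout is a made-up example; the commented irt_study() call uses only the arguments documented above and assumes a matching six-item design object d.

```r
n_items <- 6

# Rows are booklets, columns are items; 1 = item is administered
booklet_matrix <- matrix(0L, nrow = 3, ncol = n_items)
booklet_matrix[1, 1:4]           <- 1L
booklet_matrix[2, 3:6]           <- 1L
booklet_matrix[3, c(1, 2, 5, 6)] <- 1L

# Sanity checks: binary entries, one column per item
stopifnot(all(booklet_matrix %in% c(0L, 1L)), ncol(booklet_matrix) == n_items)

# How many booklets contain each item
colSums(booklet_matrix)

test_design <- list(booklet_matrix = booklet_matrix)
# study <- irt_study(d, sample_sizes = c(300, 600),
#                    missing = "booklet", test_design = test_design)
```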

estimation_model

Character string specifying the IRT model to fit. One of "1PL", "2PL", "3PL", "GRM", "PCM", or "GPCM" (canonical list registered in get_model_config). If NULL (default), defaults to design$model (i.e., the generation model is also the estimation model). Set to a different model to perform misspecification studies (e.g., generate 2PL, estimate 1PL). Cross-fits are only allowed within the same response format (binary: 1PL, 2PL, 3PL; polytomous: GRM, PCM, GPCM).

Value

An S3 object of class irt_study (a named list) with elements design, missing, missing_rate, sample_sizes, test_design, and estimation_model.

Examples

# Simple study with no missing data
d <- irt_design(
  model = "1PL", n_items = 20,
  item_params = list(b = seq(-2, 2, length.out = 20))
)
study <- irt_study(d, sample_sizes = c(100, 250, 500))

# Study with MCAR missingness
study_mcar <- irt_study(d, sample_sizes = c(200, 400),
                        missing = "mcar", missing_rate = 0.2)

# Model misspecification: generate 2PL, fit 1PL
d_2pl <- irt_design(
  model = "2PL", n_items = 15,
  item_params = list(a = rlnorm(15, 0, 0.25), b = rnorm(15))
)
study_misspec <- irt_study(d_2pl, sample_sizes = c(100, 300),
                           estimation_model = "1PL")

Plot IRT Simulation Results

Description

Visualize performance criteria across sample sizes from an irt_simulate() result. Calls summary.irt_results() internally, then plots the requested criterion by sample size.

Usage

## S3 method for class 'irt_results'
plot(x, criterion = "rmse", param = NULL, item = NULL, threshold = NULL, ...)

Arguments

x

An irt_results object from irt_simulate().

criterion

Character string. Which criterion to plot. Default "rmse". Valid values: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

param

Optional character vector. Filter to specific parameter types (e.g., "a", "b", "b1").

item

Optional integer vector. Filter to specific item numbers.

threshold

Optional numeric. If provided, draws a horizontal reference line at this value.

...

Additional arguments passed to summary.irt_results().

Value

A ggplot2::ggplot object, returned invisibly.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
plot(results)
plot(results, criterion = "bias", threshold = 0.05, param = "b")

Plot Summary of IRT Simulation Results

Description

Visualize performance criteria from a summary.irt_results() object. This is a convenience method for users who already have a summary; plot.irt_results() is the primary interface.

Usage

## S3 method for class 'summary_irt_results'
plot(x, criterion = "rmse", param = NULL, item = NULL, threshold = NULL, ...)

Arguments

x

A summary_irt_results object from summary.irt_results().

criterion

Character string. Which criterion to plot. Default "rmse".

param

Optional character vector. Filter to specific parameter types.

item

Optional integer vector. Filter to specific item numbers.

threshold

Optional numeric. If provided, draws a horizontal reference line at this value.

...

Additional arguments (ignored).

Value

A ggplot2::ggplot object, returned invisibly.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)
plot(s, criterion = "rmse", threshold = 0.15)

Print an IRT Design

Description

Display a compact summary of an irt_design object, including model type, number of items, theta distribution, and parameter ranges.

Usage

## S3 method for class 'irt_design'
print(x, ...)

Arguments

x

An irt_design object.

...

Additional arguments (ignored).

Value

x, invisibly.

Examples

d <- irt_design("1PL", 10, list(b = seq(-2, 2, length.out = 10)))
print(d)

Print an IRT Simulation Result

Description

Display a compact summary of an irt_simulate() result, including model, items, sample sizes, iterations, convergence rate, and elapsed time.

Usage

## S3 method for class 'irt_results'
print(x, ...)

Arguments

x

An irt_results object.

...

Additional arguments (ignored).

Value

x, invisibly.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
print(results)

Print an IRT Study

Description

Display a compact summary of an irt_study object, including model, items, sample sizes, and missing data mechanism.

Usage

## S3 method for class 'irt_study'
print(x, ...)

Arguments

x

An irt_study object.

...

Additional arguments (ignored).

Value

x, invisibly.

Examples

d <- irt_design("1PL", 10, list(b = seq(-2, 2, length.out = 10)))
s <- irt_study(d, sample_sizes = c(100, 500))
print(s)

Print Summary of IRT Simulation Results

Description

Display item parameter criteria and theta recovery statistics from a summary.irt_results() object.

Usage

## S3 method for class 'summary_irt_results'
print(x, ...)

Arguments

x

A summary_irt_results object.

...

Additional arguments (ignored).

Value

x, invisibly.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)
print(s)

Find the Minimum Sample Size Meeting a Criterion Threshold

Description

Given a summary.irt_results() object, find the smallest sample size at which a performance criterion meets the specified threshold for each item and parameter combination.

Usage

recommended_n(object, ...)

## S3 method for class 'summary_irt_results'
recommended_n(
  object,
  criterion,
  threshold,
  param = NULL,
  item = NULL,
  aggregate = c("max", "mean", "median", "none"),
  ...
)

Arguments

object

A summary_irt_results object from summary.irt_results().

...

Additional arguments (ignored).

criterion

Character string. Which criterion to evaluate. One of: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

threshold

Positive numeric. The threshold value the criterion must meet.

param

Optional character vector. Filter to specific parameter types (e.g., "a", "b", "b1").

item

Optional integer vector. Filter to specific item numbers.

aggregate

Character. How to roll the per-item recommended sample sizes up into a single recommendation. One of "max" (default; the smallest tested N at which every item/parameter combination meets the threshold), "mean", "median", or "none" (return the per-item data frame unchanged). "mean" and "median" are rounded up via ceiling() so the recommendation never falls below the computed central tendency.

Details

For criteria where smaller is better (bias, empirical_se, mse, rmse, mcse_bias, mcse_mse), the threshold is met when the criterion value is at or below the threshold. For bias, the absolute value is used. For coverage (where higher is better), the threshold is met when coverage is at or above the threshold.
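
The threshold rule above can be sketched in base R on a small, made-up summary table. The data frame and its values are hypothetical, and the code illustrates the documented logic for a smaller-is-better criterion rather than the package's own implementation.

```r
# Hypothetical item summary: RMSE for two items at three sample sizes
item_summary <- data.frame(
  sample_size = rep(c(100, 250, 500), times = 2),
  item        = rep(1:2, each = 3),
  param       = "b",
  rmse        = c(0.30, 0.18, 0.12,
                  0.35, 0.24, 0.16)
)
threshold <- 0.20

# Keep rows where the criterion is at or below the threshold,
# then take the smallest qualifying sample size per item/parameter.
# (Combinations that never qualify are dropped here; the function
# documented above reports them as NA instead.)
meets    <- item_summary[item_summary$rmse <= threshold, ]
per_item <- aggregate(sample_size ~ item + param, data = meets, FUN = min)
per_item

# Roll up: "max" covers every item; "mean" is rounded up
n_max  <- max(per_item$sample_size)
n_mean <- ceiling(mean(per_item$sample_size))
c(max = n_max, mean = n_mean)
```

For coverage the comparison would flip to `>=`, and for bias the absolute value would be compared against the threshold.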

Value

When aggregate = "none", a data frame with columns:

item: Item number.
param: Parameter name.
recommended_n: Minimum sample size meeting the threshold, or NA if no tested sample size meets it.
criterion: The criterion used (echoed back for reference).
threshold: The threshold used (echoed back for reference).

When aggregate is "max", "mean", or "median" (the typical case), an integer scalar carrying the recommended sample size with attributes details (the per-item data frame above), aggregate, criterion, and threshold. If any item/param combination fails to meet the threshold at every tested sample size, the aggregate is NA_integer_ and a warning lists the affected combinations.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)

# Default — single recommended N (max across items) for RMSE <= 0.20
n_rec <- recommended_n(s, criterion = "rmse", threshold = 0.20)
n_rec
attr(n_rec, "details")  # per-item breakdown

# Mean / median aggregates (rounded up via ceiling)
recommended_n(s, criterion = "rmse", threshold = 0.20, aggregate = "mean")

# Legacy behavior — full per-item data frame
recommended_n(s, criterion = "rmse", threshold = 0.20, aggregate = "none")

# Minimum N for 95% coverage on difficulty parameters only
recommended_n(s, criterion = "coverage", threshold = 0.95, param = "b")

Summarize IRT Simulation Results

Description

Compute performance criteria for each sample size, item, and parameter combination from an irt_simulate() result. Criteria follow the definitions in Morris et al. (2019). Optionally, users can provide a custom callback function to compute additional item-level performance criteria (e.g., conditional reliability, external criterion SE).

Usage

## S3 method for class 'irt_results'
summary(object, criterion = NULL, param = NULL, criterion_fn = NULL, ...)

Arguments

object

An irt_results object from irt_simulate().

criterion

Optional character vector. Which criteria to include in the output. Valid values: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse". If NULL (default), all criteria are returned.

param

Optional character vector. Which parameter types to include (e.g., "a", "b", "b1"). If NULL (default), all parameters are summarized.

criterion_fn

Optional function. A user-defined callback to compute custom performance criteria. Must accept named arguments estimates (numeric vector), true_value (scalar), ci_lower (numeric), ci_upper (numeric), converged (logical), and ... (for future use). Must return a named numeric vector of length >= 1. The names become new columns in item_summary, appended after n_converged. If NULL (default), no custom criteria are computed.

...

Additional arguments (ignored).

Value

An S3 object of class summary_irt_results containing:

item_summary: Data frame with one row per sample_size × item × param combination, containing the requested criteria plus n_converged and any custom columns from criterion_fn.
theta_summary: Data frame with one row per sample_size, containing mean_cor, sd_cor, mean_rmse, sd_rmse, and n_converged.
iterations: Number of replications.
seed: Base seed used.
model: IRT model type.
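
For orientation, the sketch below computes the named criteria for a single item/parameter cell using the standard Morris et al. (2019) formulas. The estimates and standard errors are made up, and the assumption that the package applies exactly these formulas is not confirmed by this manual.

```r
# Hypothetical estimates of one parameter across 5 converged iterations
est      <- c(0.9, 1.1, 1.0, 1.2, 0.8)
se       <- rep(0.15, 5)
true_val <- 1.0
ci_lower <- est - 1.96 * se
ci_upper <- est + 1.96 * se
n_sim    <- length(est)

bias         <- mean(est) - true_val
empirical_se <- sd(est)
mse          <- mean((est - true_val)^2)
rmse         <- sqrt(mse)
coverage     <- mean(ci_lower <= true_val & true_val <= ci_upper)

# Monte Carlo standard errors of the bias and MSE estimates
mcse_bias <- empirical_se / sqrt(n_sim)
mcse_mse  <- sqrt(sum(((est - true_val)^2 - mse)^2) / (n_sim * (n_sim - 1)))

round(c(bias = bias, empirical_se = empirical_se, mse = mse, rmse = rmse,
        coverage = coverage, mcse_bias = mcse_bias, mcse_mse = mcse_mse), 4)
```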

References

Morris, T. P., White, I. R., & Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074–2102. doi:10.1002/sim.8086

Examples


# Minimal example (iterations reduced for speed; use 100+ in practice)
design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)

s <- summary(results)
s$item_summary
s$theta_summary

# Only bias and RMSE for difficulty parameters
summary(results, criterion = c("bias", "rmse"), param = "b")

# Compute custom criterion: relative bias
# (undefined when true_value is 0, as for the middle item here, so return NA)
custom_fn <- function(estimates, true_value, ci_lower, ci_upper, converged, ...) {
  valid_est <- estimates[!is.na(estimates)]
  if (true_value == 0) return(c(relative_bias = NA_real_))
  rel_bias <- (mean(valid_est) - true_value) / true_value
  c(relative_bias = rel_bias)
}
summary(results, criterion_fn = custom_fn)

# Multiple custom criteria
multi_fn <- function(estimates, true_value, ci_lower, ci_upper, converged, ...) {
  valid_est <- estimates[!is.na(estimates)]
  c(mean_est = mean(valid_est), sd_est = sd(valid_est))
}
summary(results, criterion_fn = multi_fn)

Get All Valid Criteria (Internal)

Description

Returns the names of all criteria in the registry.

Usage

valid_criteria()

Value

Character vector of criterion names, in registry order.

Get All Valid Missing Mechanisms (Internal)

Description

Returns the names of all mechanisms in the registry.

Usage

valid_missing_mechanisms()

Value

Character vector of mechanism names, in registry order.

Validate a Binary Design Matrix

Description

Internal helper that validates the standard requirements for booklet and linking design matrices: the object is a matrix, has the expected number of columns, and contains only binary values.

Usage

validate_design_matrix(mat, n_items, matrix_name)

Arguments

mat

An object to validate (should be a matrix).

n_items

Integer: the expected number of columns.

matrix_name

Character string: the name of the matrix type for error messages (e.g., "booklet_matrix", "linking_matrix").

Value

Invisibly returns NULL if all checks pass. Throws an error (via stop()) if any check fails.
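
The three checks described for this helper could look roughly like the sketch below. The function name check_binary_matrix and the error messages are hypothetical; this is not the package's source, only an illustration of the documented contract.

```r
# Hypothetical re-implementation of the documented checks
check_binary_matrix <- function(mat, n_items, matrix_name) {
  if (!is.matrix(mat)) {
    stop(matrix_name, " must be a matrix.", call. = FALSE)
  }
  if (ncol(mat) != n_items) {
    stop(matrix_name, " must have ", n_items, " columns, not ",
         ncol(mat), ".", call. = FALSE)
  }
  # NA entries fail this test because NA %in% c(0, 1) is FALSE
  if (!all(mat %in% c(0, 1))) {
    stop(matrix_name, " must contain only 0 and 1.", call. = FALSE)
  }
  invisible(NULL)
}

# Passes silently
ok <- check_binary_matrix(diag(3), n_items = 3, matrix_name = "booklet_matrix")

# Fails: wrong column count
bad <- try(check_binary_matrix(diag(3), n_items = 4,
                               matrix_name = "booklet_matrix"),
           silent = TRUE)
```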
