Title: Monte Carlo Simulation-Based Sample-Size Planning for Item Response Theory
Version: 0.1.1
Description: Provides a pipeline application programming interface (API) for Monte Carlo simulation-based sample-size planning in item response theory (IRT). Implements the 10-decision framework from Schroeders and Gnambs (2025) <doi:10.1177/25152459251314798> as a three-step workflow: specify the data-generating model with irt_design(), add study conditions with irt_study(), and run simulations with irt_simulate(). Supports one-parameter logistic (1PL), two-parameter logistic (2PL), and graded response models with missing-completely-at-random (MCAR), missing-at-random (MAR), booklet, and linking missingness mechanisms. Results include mean squared error (MSE), bias, root mean squared error (RMSE), standard error (SE), and coverage criteria with summary and plot methods.
License: GPL (>= 3)
URL: https://github.com/sward1/irtsim
BugReports: https://github.com/sward1/irtsim/issues
Depends: R (>= 4.1.0)
Imports: cli, future.apply, ggplot2, mirt, rlang
Suggests: future, knitr, R.rsp (>= 0.46.0), rmarkdown, scales, testthat (>= 3.0.0)
VignetteBuilder: R.rsp
Config/testthat/edition: 3
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.3
NeedsCompilation: no
Packaged: 2026-04-22 13:35:50 UTC; stephenward
Author: Stephen Ward [aut, cre]
Maintainer: Stephen Ward <stephen_ward+irtsim@abhome.co>
Repository: CRAN
Date/Publication: 2026-04-23 19:50:11 UTC

irtsim: Monte Carlo Simulation-Based Sample-Size Planning for Item Response Theory

Description

Provides a pipeline application programming interface (API) for Monte Carlo simulation-based sample-size planning in item response theory (IRT). Implements the 10-decision framework from Schroeders and Gnambs (2025) doi:10.1177/25152459251314798 as a three-step workflow: specify the data-generating model with irt_design(), add study conditions with irt_study(), and run simulations with irt_simulate(). Supports one-parameter logistic (1PL), two-parameter logistic (2PL), and graded response models with missing-completely-at-random (MCAR), missing-at-random (MAR), booklet, and linking missingness mechanisms. Results include mean squared error (MSE), bias, root mean squared error (RMSE), standard error (SE), and coverage criteria with summary and plot methods.

Author(s)

Maintainer: Stephen Ward stephen_ward+irtsim@abhome.co

See Also

Useful links:

https://github.com/sward1/irtsim

Report bugs at https://github.com/sward1/irtsim/issues


Internal Criterion Registry

Description

Returns the internal registry of performance-criterion configurations.

Usage

.get_criterion_registry()

Value

A named list of criterion configurations.


Internal Missing Mechanism Registry

Description

Returns the internal registry of missing-data mechanism configurations.

Usage

.get_missing_registry()

Value

A named list of missing mechanism configurations.


Internal Model Registry

Description

Returns the internal registry of IRT model configurations.

Usage

.get_model_registry()

Value

A named list of model configurations.


Apply Missing Data Mechanism (Internal)

Description

Takes a complete response matrix and introduces missingness according to the study specification.

Usage

apply_missing(data, study, seed = NULL, theta = NULL)

Arguments

data

Numeric matrix (N x n_items). Complete response data.

study

An irt_study object specifying the missing data mechanism.

seed

Integer. Random seed for reproducibility.

theta

Optional numeric vector of length nrow(data). Required when study$missing == "mar".

Value

Numeric matrix of same dimensions as data, with NA values introduced according to the missingness mechanism.
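
As a rough illustration of the MCAR case only, the base-R sketch below sets each cell to NA independently with a fixed probability. The helper name mcar_sketch and its rate argument are hypothetical and not part of the package; the package's own mechanisms (including MAR, booklet, and linking) may work differently.

```r
# Hypothetical helper (not part of irtsim): MCAR missingness on a response matrix
mcar_sketch <- function(data, rate, seed = NULL) {
  stopifnot(is.matrix(data), rate >= 0, rate < 1)
  if (!is.null(seed)) set.seed(seed)
  # Each cell becomes NA independently with probability `rate`
  mask <- matrix(runif(length(data)) < rate, nrow = nrow(data))
  data[mask] <- NA
  data
}

complete <- matrix(rbinom(200 * 10, size = 1, prob = 0.5), nrow = 200)
with_na <- mcar_sketch(complete, rate = 0.2, seed = 1)
dim(with_na)          # same dimensions as the input
mean(is.na(with_na))  # realized proportion of missing cells
```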


Build Criterion Plot (Internal)

Description

Shared plotting engine used by both plot.irt_results() and plot.summary_irt_results().

Usage

build_criterion_plot(
  summary_obj,
  criterion = "rmse",
  param = NULL,
  item = NULL,
  threshold = NULL
)

Arguments

summary_obj

A summary_irt_results object.

criterion

Character string. Criterion to plot.

param

Optional character vector. Parameter filter.

item

Optional integer vector. Item filter.

threshold

Optional numeric. Horizontal reference line.

Value

A ggplot2::ggplot object, returned invisibly.


Build NA Item Results for Non-Converged Iterations (Internal)

Description

Builds placeholder item-result rows with NA estimates for an iteration in which the model did not converge.

Usage

build_na_item_results(true_params, iteration, sample_size)

Arguments

true_params

Data frame from build_true_params().

iteration

Integer iteration number.

sample_size

Integer sample size.

Value

Data frame matching item_results schema with NA estimates.


Build True Parameter Data Frame (Internal)

Description

Creates a data frame of true parameter values from the design, used for populating the true_value column and for building NA rows on non-convergence.

Usage

build_true_params(design)

Arguments

design

An irt_design object.

Value

Data frame with columns: item, param, true_value.


Build True Parameter Data Frame for Estimation Model (Internal)

Description

Creates a data frame of true parameter values adjusted for the estimation model, accounting for misspecification. When generation and estimation models differ, some parameters may need to be filled with defaults (e.g., discrimination = 1 when estimating 2PL from 1PL data) or dropped entirely (e.g., discrimination when estimating 1PL from 2PL data).

Usage

build_true_params_for_estimation(design, estimation_model)

Arguments

design

An irt_design object specifying the generation model.

estimation_model

Character string: "1PL", "2PL", or "GRM".

Value

Data frame with columns: item, param, true_value, matching the schema of the estimation model.


Compute Performance Criteria for a Single Parameter (Internal)

Description

Given a vector of per-iteration estimates and the true parameter value, computes bias, empirical SE, MSE, RMSE, coverage, and Monte Carlo SEs following Morris et al. (2019).

Usage

compute_criterion(estimates, true_value, ci_lower = NULL, ci_upper = NULL)

Arguments

estimates

Numeric vector of per-iteration parameter estimates. May contain NAs (non-converged iterations), which are excluded.

true_value

Single numeric value. The data-generating (true) parameter value.

ci_lower

Optional numeric vector (same length as estimates). Lower bounds of confidence intervals. If NULL, coverage is not computed.

ci_upper

Optional numeric vector (same length as estimates). Upper bounds of confidence intervals. If NULL, coverage is not computed.

Value

A named list with elements:

bias

Mean estimate minus true value.

empirical_se

Sample standard deviation of estimates (n-1 denominator).

mse

Mean squared error: mean((estimate - true_value)^2).

rmse

Root mean squared error: sqrt(mse).

coverage

Proportion of CIs containing the true value, or NULL if CIs not provided. NAs in CIs are excluded from the denominator.

mcse_bias

Monte Carlo SE of bias: empirical_se / sqrt(K), where K is the number of non-missing estimates.

mcse_mse

Monte Carlo SE of MSE: sd((est - true)^2) / sqrt(K).

References

Morris, T. P., White, I. R., & Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074–2102. doi:10.1002/sim.8086
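
The definitions listed under Value can be sketched in base R as follows. This is a hypothetical re-implementation for illustration (the name criteria_sketch is an assumption), not the package's internal code.

```r
# Hypothetical re-implementation of the listed criteria; not the package's code
criteria_sketch <- function(estimates, true_value, ci_lower = NULL, ci_upper = NULL) {
  est <- estimates[!is.na(estimates)]   # drop non-converged iterations
  K <- length(est)                      # number of usable estimates
  err2 <- (est - true_value)^2

  coverage <- NULL
  if (!is.null(ci_lower) && !is.null(ci_upper)) {
    ok <- !is.na(ci_lower) & !is.na(ci_upper)   # CIs with NA leave the denominator
    coverage <- mean(ci_lower[ok] <= true_value & true_value <= ci_upper[ok])
  }

  list(
    bias = mean(est) - true_value,
    empirical_se = sd(est),             # n - 1 denominator
    mse = mean(err2),
    rmse = sqrt(mean(err2)),
    coverage = coverage,
    mcse_bias = sd(est) / sqrt(K),
    mcse_mse = sd(err2) / sqrt(K)
  )
}

est <- c(0.9, 1.1, NA, 1.0, 1.2)
res <- criteria_sketch(est, true_value = 1,
                       ci_lower = est - 0.15, ci_upper = est + 0.15)
str(res)
```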


Convert mirt's d Parameter to IRT Difficulty b

Description

Converts d (intercept) to b (difficulty) under the 2PL/GRM parameterization: b = -d / a. The standard error of b is obtained with the delta method, and confidence bounds are transformed directly.

Usage

convert_d_to_b(a_info, d_info)

Arguments

a_info

A list with elements est, se, ci_lower, ci_upper for the discrimination parameter a. Typically from extract_one_param().

d_info

A list with elements est, se, ci_lower, ci_upper for the intercept parameter d. Typically from extract_one_param().

Details

The delta method variance for b = -d/a is:

\text{Var}(b) = \frac{\text{Var}(d)}{a^2} + \frac{d^2 \text{Var}(a)}{a^4}

CI bounds are transformed directly: b_ci = -d_ci / a.

Value

A list with elements est, se, ci_lower, ci_upper for the converted difficulty parameter b, using delta method variance propagation.
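
A minimal sketch of the conversion in base R, assuming a > 0 and ignoring any covariance between a and d (as the variance formula in Details does). The function name d_to_b_sketch is hypothetical, and sorting the transformed bounds is an assumption made here because dividing by -a reverses their order.

```r
# Hypothetical sketch of the delta-method conversion; not the package's code
d_to_b_sketch <- function(a_info, d_info) {
  a <- a_info$est
  d <- d_info$est
  est <- -d / a
  # Var(b) = Var(d) / a^2 + d^2 * Var(a) / a^4
  se <- sqrt(d_info$se^2 / a^2 + d^2 * a_info$se^2 / a^4)
  # Transform the d bounds directly, then order them (assumption: a > 0)
  ci <- sort(c(-d_info$ci_lower / a, -d_info$ci_upper / a))
  list(est = est, se = se, ci_lower = ci[1], ci_upper = ci[2])
}

a_info <- list(est = 2, se = 0.1, ci_lower = 1.8, ci_upper = 2.2)
d_info <- list(est = -1, se = 0.2, ci_lower = -1.4, ci_upper = -0.6)
b_info <- d_to_b_sketch(a_info, d_info)
b_info
```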


Extract a Single Parameter Estimate with SE and CI

Description

Internal helper to extract estimate, standard error, and confidence interval bounds from a single column of a mirt coefficient matrix.

Usage

extract_one_param(mat, col_name)

Arguments

mat

A coefficient matrix from mirt::coef() with rows "par", "CI_2.5", "CI_97.5" and column names matching mirt's parameter naming.

col_name

Character string: the column name to extract (e.g., "a1", "d", "d2").

Value

A list with elements:

est

The parameter estimate (from "par" row).

se

The standard error derived from CI width, or NA if CI rows absent.

ci_lower

Lower CI bound (from "CI_2.5" row), or NA if absent.

ci_upper

Upper CI bound (from "CI_97.5" row), or NA if absent.
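
The extraction can be illustrated with a hand-built matrix standing in for one item's coefficients. The matrix values and the helper name one_param_sketch are hypothetical; the SE rule assumes a symmetric 95% interval.

```r
# Hypothetical stand-in for one item's coefficient matrix (filled column by column)
mat <- matrix(
  c(1.20, 0.90, 1.50,      # a1: par, CI_2.5, CI_97.5
    -0.40, -0.70, -0.10),  # d:  par, CI_2.5, CI_97.5
  nrow = 3,
  dimnames = list(c("par", "CI_2.5", "CI_97.5"), c("a1", "d"))
)

one_param_sketch <- function(mat, col_name) {
  has_ci <- all(c("CI_2.5", "CI_97.5") %in% rownames(mat))
  lower <- if (has_ci) mat["CI_2.5", col_name] else NA_real_
  upper <- if (has_ci) mat["CI_97.5", col_name] else NA_real_
  list(
    est = mat["par", col_name],
    # A symmetric 95% interval spans 2 * z_0.975 standard errors
    se = (upper - lower) / (2 * qnorm(0.975)),
    ci_lower = lower,
    ci_upper = upper
  )
}

a1 <- one_param_sketch(mat, "a1")
no_ci <- one_param_sketch(mat["par", , drop = FALSE], "d")  # CI rows absent
```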


Extract Item Parameter Estimates from a Fitted mirt Model (Internal)

Description

Pulls point estimates, SEs, and CIs from a fitted mirt object and returns them in long format matching the item_results schema.

Usage

extract_params(
  mod,
  design,
  estimation_model,
  iteration,
  sample_size,
  true_params,
  true_params_lookup,
  se = TRUE
)

Arguments

mod

A fitted mirt object.

design

An irt_design object (for true values and model type).

estimation_model

Character string: "1PL", "2PL", or "GRM" (the model that was fitted, which may differ from design$model).

iteration

Integer iteration number.

sample_size

Integer sample size.

true_params

Data frame (used for schema).

true_params_lookup

Named numeric vector mapping keys of the form "item_param" (e.g., "Item_1_a") to true_value (for O(1) lookup instead of repeated vector scans).

se

Logical. Extract standard errors and CIs? Default TRUE.

Details

Uses the list-based coef(mod) output (one matrix per item, keyed by item name). Default coef() returns rows par, CI_2.5, CI_97.5; SE is derived from CI width: SE = (upper - lower) / (2 * z_0.975).

Value

Data frame with item_results columns.


Extract Theta Recovery Summary from a Fitted mirt Model (Internal)

Description

Computes EAP theta estimates and summarizes recovery via correlation and RMSE against true theta.

Usage

extract_theta_summary(mod, theta_true, iteration, sample_size)

Arguments

mod

A fitted mirt object.

theta_true

Numeric vector of true theta values.

iteration

Integer iteration number.

sample_size

Integer sample size.

Value

Single-row data frame with theta_results columns.
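
The two recovery metrics can be sketched without a fitted model. Here the "estimates" are simply true values plus noise, which is an assumption for illustration; real EAP scores would come from the fitted mirt object.

```r
set.seed(1)
theta_true <- rnorm(500)
# Stand-in for EAP estimates (assumption: true value plus independent noise)
theta_hat <- theta_true + rnorm(500, sd = 0.4)

theta_summary <- data.frame(
  iteration = 1L,
  sample_size = length(theta_true),
  theta_cor = cor(theta_hat, theta_true),
  theta_rmse = sqrt(mean((theta_hat - theta_true)^2)),
  converged = TRUE
)
theta_summary
```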


Fit an IRT Model (Internal)

Description

Wraps mirt::mirt() with error and convergence handling.

Usage

fit_model(data, model, se = TRUE)

Arguments

data

Numeric matrix of response data (may contain NAs).

model

Character string: "1PL", "2PL", or "GRM".

se

Logical. Compute standard errors? Default TRUE.

Value

A list with elements model (fitted mirt object or NULL) and converged (logical).


Generate IRT Response Data (Internal)

Description

Wraps mirt::simdata() to produce a response matrix from an irt_design specification. Handles the b-to-d parameterization translation and theta generation.

Usage

generate_data(design, n, seed = NULL, theta = NULL)

Arguments

design

An irt_design object.

n

Integer. Number of respondents.

seed

Optional integer. Random seed for reproducibility. If NULL, the current RNG state drives draws (used by the parallel dispatch path in irt_simulate() so future.apply's L'Ecuyer-CMRG substreams are not clobbered by an explicit set.seed() call).

theta

Optional numeric vector of length n. Pre-generated theta values. If NULL, theta is drawn from design$theta_dist.

Value

A numeric matrix with n rows and design$n_items columns.
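
To make the b-to-d translation concrete, the following base-R sketch generates dichotomous 2PL responses directly. It is a hypothetical stand-in (simulate_2pl_sketch is not a package function) that assumes standard-normal theta; the package itself delegates generation to mirt::simdata().

```r
# Hypothetical base-R sketch of 2PL data generation; not the package's code
simulate_2pl_sketch <- function(a, b, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  theta <- rnorm(n)        # assumed standard-normal abilities
  d <- -a * b              # b-to-d translation (inverse of b = -d / a)
  # Linear predictor a_j * theta_i + d_j for every person-item pair
  eta <- outer(theta, a) + matrix(d, nrow = n, ncol = length(a), byrow = TRUE)
  p <- plogis(eta)
  matrix(rbinom(length(p), size = 1, prob = p), nrow = n)
}

x <- simulate_2pl_sketch(a = rep(1.2, 5), b = seq(-2, 2, length.out = 5),
                         n = 300, seed = 1)
dim(x)
colMeans(x)   # proportion correct falls as difficulty rises
```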


Generate Discrimination Parameters

Description

Internal helper to generate discrimination (a) parameters under a specified distribution. Currently supports log-normal.

Usage

generate_discrimination(n_items, a_dist, a_mean, a_sd)

Arguments

n_items

Positive integer: the number of items (and thus the number of discrimination values to generate).

a_dist

Character string: the distribution name. Currently only "lnorm" is supported.

a_mean

Numeric: the mean parameter for the log-normal distribution (interpreted as meanlog).

a_sd

Numeric: the standard deviation parameter for the log-normal distribution (interpreted as sdlog).

Value

A numeric vector of length n_items containing discrimination values.


Generate Theta Values from a Distribution Specification (Internal)

Description

Draws n theta values from the distribution given by theta_dist.

Usage

generate_theta(theta_dist, n)

Arguments

theta_dist

Character string ("normal", "uniform") or function.

n

Integer. Number of values to generate.

Value

Numeric vector of length n.


Get a Criterion Configuration from the Registry (Internal)

Description

Retrieves the metadata for a specified criterion and validates that it exists.

Usage

get_criterion_config(criterion)

Arguments

criterion

Character string: one of "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

Value

A named list with criterion-specific metadata: direction (character: "lower_is_better" or "higher_is_better"), use_abs (logical), and display_label (character).


Get a Missing Mechanism Configuration from the Registry (Internal)

Description

Retrieves the metadata for a specified missing data mechanism and validates that it exists.

Usage

get_missing_config(mechanism)

Arguments

mechanism

Character string: one of "none", "mcar", "mar", "booklet", or "linking".

Value

A named list with mechanism-specific metadata: requires_rate (logical), requires_test_design (logical), test_design_key (character or NA), print_label (character), print_style (character: "plain", "rate", or "design"), and design_unit (character or NA).


Get a Model Configuration from the Registry (Internal)

Description

Retrieves the configuration for a specified model and validates that it exists.

Usage

get_model_config(model)

Arguments

model

Character string: "1PL", "2PL", or "GRM".

Value

A named list with model-specific functions and metadata.


Create an IRT Design Specification

Description

Define the data-generating model for an IRT simulation study. This captures decisions 1–3 from the Schroeders & Gnambs (2025) framework: dimensionality, item parameters, and item type.

Usage

irt_design(model, n_items, item_params, theta_dist = "normal", n_factors = 1L)

Arguments

model

Character string specifying the IRT model. One of "1PL", "2PL", or "GRM".

n_items

Positive integer. Number of items in the instrument.

item_params

A named list of item parameters. Contents depend on model:

1PL

b (numeric vector of length n_items). Discrimination is fixed at 1 for all items and added automatically.

2PL

a (discrimination, positive numeric vector or matrix) and b (difficulty, numeric vector), each of length n_items.

GRM

a (discrimination, positive numeric vector) of length n_items and b (threshold matrix, n_items rows by n_categories - 1 columns).

theta_dist

Either a character string ("normal" or "uniform") or a function that takes a single argument n and returns a numeric vector of length n. Defaults to "normal".

n_factors

Positive integer specifying the number of latent factors. Defaults to 1L.

Value

An S3 object of class irt_design (a named list) with elements model, n_items, item_params, theta_dist, and n_factors.

See Also

irt_study() to add study conditions, irt_params_2pl() and irt_params_grm() to generate item parameters.

Examples

# 1PL (Rasch) design with 20 items
design_1pl <- irt_design(
  model = "1PL",
  n_items = 20,
  item_params = list(b = seq(-2, 2, length.out = 20))
)

# 2PL design
design_2pl <- irt_design(
  model = "2PL",
  n_items = 30,
  item_params = list(
    a = rlnorm(30, 0, 0.25),
    b = seq(-2, 2, length.out = 30)
  )
)


Compute Required Monte Carlo Replications

Description

Uses the Burton et al. (2006) formula to determine the minimum number of simulation replications needed to achieve a desired level of Monte Carlo precision.

Usage

irt_iterations(sigma, delta, alpha = 0.05)

Arguments

sigma

Positive numeric. The empirical standard error of the estimand across replications (or a pilot estimate thereof).

delta

Positive numeric. The acceptable Monte Carlo error (half-width of the MC confidence interval for the estimand).

alpha

Numeric in (0, 1). Two-sided significance level. Default 0.05 (i.e., 95 percent MC confidence).

Details

The formula is:

R = \lceil (z_{\alpha/2} \cdot \sigma / \delta)^2 \rceil

where \sigma is the empirical standard error of the estimand, \delta is the acceptable Monte Carlo error, and z_{\alpha/2} is the critical value for the desired confidence level.
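
The formula can be checked by hand with a few lines of base R. The function name replications_sketch is hypothetical and this is a restatement of the formula above, not the package's internal code.

```r
# Hypothetical restatement of the formula; not the package's code
replications_sketch <- function(sigma, delta, alpha = 0.05) {
  stopifnot(sigma > 0, delta > 0, alpha > 0, alpha < 1)
  z <- qnorm(1 - alpha / 2)   # two-sided critical value
  as.integer(ceiling((z * sigma / delta)^2))
}

replications_sketch(sigma = 0.5, delta = 0.1)                 # 97
replications_sketch(sigma = 0.5, delta = 0.05, alpha = 0.01)  # 664
```

Halving delta roughly quadruples the required number of replications, because delta enters the formula squared.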

Value

An integer: the minimum number of replications.

References

Burton, A., Altman, D. G., Royston, P., & Holder, R. L. (2006). The design of simulation studies in medical statistics. Statistics in Medicine, 25(24), 4279–4292. doi:10.1002/sim.2673

See Also

irt_simulate() for running the simulation with the computed number of replications.

Examples

# How many replications for a Monte Carlo error (CI half-width) of 0.1
# when the empirical SE of the estimand is 0.5?
irt_iterations(sigma = 0.5, delta = 0.1)

# Tighter tolerance with 99% MC confidence
irt_iterations(sigma = 0.5, delta = 0.05, alpha = 0.01)


Generate 2PL Item Parameters

Description

Creates a list of discrimination (a) and difficulty (b) parameters suitable for passing to irt_design().

Usage

irt_params_2pl(
  n_items,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_dist = "normal",
  b_mean = 0,
  b_sd = 1,
  b_range = c(-2, 2),
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" (log-normal) is supported. Default: "lnorm".

a_mean

Numeric. Mean of the log-normal distribution for a (i.e., meanlog). Default: 0.

a_sd

Numeric. SD of the log-normal distribution for a (i.e., sdlog). Default: 0.25.

b_dist

Character string for the difficulty distribution. One of "normal" or "even". Default: "normal".

b_mean

Numeric. Mean of the normal distribution for b. Only used when b_dist = "normal". Default: 0.

b_sd

Numeric. SD of the normal distribution for b. Only used when b_dist = "normal". Default: 1.

b_range

Numeric vector of length 2. Range for evenly-spaced b values. Only used when b_dist = "even". Default: c(-2, 2).

seed

Optional integer seed for reproducibility. If NULL (default), the current RNG state is used.

Value

A named list with elements a (numeric vector) and b (numeric vector), each of length n_items.
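
A sketch of how such a parameter list could be drawn in base R. The argument names mirror the Usage above, but the body is an assumption (in particular, that "even" means seq() across b_range), and params_2pl_sketch is a hypothetical name.

```r
# Hypothetical sketch; argument names follow irt_params_2pl(), the body is assumed
params_2pl_sketch <- function(n_items, a_mean = 0, a_sd = 0.25,
                              b_dist = c("normal", "even"),
                              b_mean = 0, b_sd = 1, b_range = c(-2, 2),
                              seed = NULL) {
  b_dist <- match.arg(b_dist)
  if (!is.null(seed)) set.seed(seed)
  a <- rlnorm(n_items, meanlog = a_mean, sdlog = a_sd)   # always positive
  b <- if (b_dist == "normal") {
    rnorm(n_items, mean = b_mean, sd = b_sd)
  } else {
    seq(b_range[1], b_range[2], length.out = n_items)    # evenly spaced
  }
  list(a = a, b = b)
}

p <- params_2pl_sketch(20, b_dist = "even", b_range = c(-3, 3), seed = 42)
range(p$b)
```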

See Also

irt_params_grm() for GRM parameters, irt_design() to use the generated parameters.

Examples

# Default 2PL parameters for 30 items
params <- irt_params_2pl(n_items = 30, seed = 42)

# Evenly-spaced difficulty
params <- irt_params_2pl(n_items = 20, b_dist = "even", b_range = c(-3, 3))


Generate GRM Item Parameters

Description

Creates a list of discrimination (a) and threshold (b) parameters suitable for passing to irt_design() with model = "GRM".

Usage

irt_params_grm(
  n_items,
  n_categories,
  a_dist = "lnorm",
  a_mean = 0,
  a_sd = 0.25,
  b_mean = 0,
  b_sd = 1,
  seed = NULL
)

Arguments

n_items

Positive integer. Number of items.

n_categories

Positive integer >= 2. Number of response categories per item. Produces n_categories - 1 threshold columns in b.

a_dist

Character string for the discrimination distribution. Currently only "lnorm" is supported. Default: "lnorm".

a_mean

Numeric. meanlog for the log-normal distribution. Default: 0.

a_sd

Numeric. sdlog for the log-normal distribution. Default: 0.25.

b_mean

Numeric. Mean around which thresholds are centered. Default: 0.

b_sd

Numeric. SD of the base threshold distribution. Default: 1.

seed

Optional integer seed for reproducibility.

Value

A named list with elements:

a

Numeric vector of length n_items.

b

Numeric matrix with n_items rows and n_categories - 1 columns. Thresholds are ordered within each row.
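
One simple way to obtain ordered thresholds is to draw them independently and sort within each item. The sketch below does that in base R; it is a hypothetical illustration (grm_params_sketch is not a package function), and the package may construct thresholds differently.

```r
# Hypothetical sketch of ordered GRM thresholds; not the package's code
grm_params_sketch <- function(n_items, n_categories, a_mean = 0, a_sd = 0.25,
                              b_mean = 0, b_sd = 1, seed = NULL) {
  stopifnot(n_categories >= 2)
  if (!is.null(seed)) set.seed(seed)
  a <- rlnorm(n_items, meanlog = a_mean, sdlog = a_sd)
  b <- matrix(rnorm(n_items * (n_categories - 1), mean = b_mean, sd = b_sd),
              nrow = n_items)
  # Sort within each row so thresholds increase from left to right
  for (i in seq_len(n_items)) b[i, ] <- sort(b[i, ])
  list(a = a, b = b)
}

g <- grm_params_sketch(n_items = 15, n_categories = 5, seed = 42)
dim(g$b)
```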

See Also

irt_params_2pl() for 2PL parameters, irt_design() to use the generated parameters.

Examples

# GRM parameters: 15 items, 5 response categories
params <- irt_params_grm(n_items = 15, n_categories = 5, seed = 42)


Run an IRT Monte Carlo Simulation

Description

Execute a Monte Carlo simulation study based on an irt_study specification. For each iteration and sample size, data are generated, missing values applied, the IRT model is fitted, and parameter estimates are extracted and stored.

Usage

irt_simulate(
  study,
  iterations,
  seed,
  progress = TRUE,
  parallel = FALSE,
  se = TRUE,
  compute_theta = TRUE
)

Arguments

study

An irt_study object specifying the design and study conditions.

iterations

Positive integer. Number of Monte Carlo replications.

seed

Integer. Base random seed for reproducibility. Each iteration uses seed + iteration - 1.

progress

Logical. Print progress messages? Default TRUE.

parallel

Logical. Run iterations in parallel using future.apply::future_lapply()? Default FALSE. Requires users to set up a future plan (e.g., future::plan(multisession)) before calling. See Details.

se

Logical. Compute standard errors and confidence intervals for item parameter estimates? Default TRUE. Set to FALSE for significant speed improvement when only point estimates are needed (e.g., MSE, bias, RMSE criteria). When FALSE, se/ci_lower/ci_upper columns in item_results are NA.

compute_theta

Logical. Compute EAP theta estimates and recovery metrics (correlation, RMSE)? Default TRUE. Set to FALSE to skip the mirt::fscores() call when theta recovery is not needed. When FALSE, theta_cor and theta_rmse in theta_results are NA (but converged is still tracked).

Details

The returned irt_results object stores raw per-iteration estimates. Use summary.irt_results() to compute performance criteria (bias, MSE, RMSE, coverage, etc.) and plot.irt_results() to visualize results.

Parallelization

When parallel = TRUE, the Monte Carlo loop over iterations is parallelized via future.apply::future_lapply(). Each parallel task processes one iteration across all sample sizes sequentially.

Important: This function does NOT configure a future plan. Users must set their own plan before calling with parallel = TRUE:

library(future)
plan(multisession, workers = 4)  # or your preferred backend
results <- irt_simulate(study, iterations = 100, seed = 42, parallel = TRUE)

Without an explicit plan, future defaults to sequential execution (no parallelism).

Reproducibility contract

Reproducibility is guaranteed within a given dispatch mode (serial or parallel), not across modes.

Progress messages are suppressed in parallel mode (workers cannot stream to stdout safely). Set progress = FALSE in serial mode to suppress messages (they appear every 10% of iterations).
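
The serial seeding rule described for the seed argument (seed + iteration - 1) can be demonstrated with a toy loop. The functions below are hypothetical stand-ins for a single Monte Carlo iteration and do not call the package; they cover the serial path only, since the parallel path relies on future.apply's RNG substreams instead of explicit seeds.

```r
# Hypothetical stand-in for one Monte Carlo iteration: returns a single estimate
one_iteration <- function(iteration, base_seed) {
  set.seed(base_seed + iteration - 1)   # per-iteration seed rule
  mean(rnorm(100))
}

run_serial <- function(iterations, base_seed) {
  vapply(seq_len(iterations), one_iteration, numeric(1), base_seed = base_seed)
}

first <- run_serial(5, base_seed = 42)
second <- run_serial(5, base_seed = 42)
identical(first, second)   # TRUE: same base seed, same draws
```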

Value

An S3 object of class irt_results containing:

item_results

Data frame with per-iteration item parameter estimates (columns: iteration, sample_size, item, param, true_value, estimate, se, ci_lower, ci_upper, converged).

theta_results

Data frame with per-iteration theta recovery summaries (columns: iteration, sample_size, theta_cor, theta_rmse, converged).

study

The original irt_study object.

iterations

Number of replications run.

seed

Base seed used.

elapsed

Elapsed wall-clock time in seconds.

se

Logical flag indicating whether SEs and CIs were computed.

compute_theta

Logical flag indicating whether theta recovery metrics were computed.

See Also

irt_study() for specifying study conditions, summary.irt_results() and plot.irt_results() for analyzing output, irt_iterations() for determining the number of replications.

Examples


# Minimal example (iterations and sample sizes reduced for speed;
# use iterations >= 100 and 3+ sample sizes in practice)
design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
summary(results)
plot(results)



Define Study Conditions for an IRT Simulation

Description

Add study-level conditions to an IRT design specification. This captures decisions 4–5 from the Schroeders & Gnambs (2025) framework: sample sizes and missing data mechanism.

Usage

irt_study(
  design,
  sample_sizes,
  missing = "none",
  missing_rate = NULL,
  test_design = NULL,
  estimation_model = NULL
)

Arguments

design

An irt_design object specifying the data-generating model.

sample_sizes

Integer vector of sample sizes to evaluate. Values are coerced to integer, sorted in ascending order, and deduplicated.

missing

Character string specifying the missing data mechanism. One of "none" (default), "mcar", "mar", "booklet", or "linking".

missing_rate

Numeric value in [0, 1) specifying the proportion of missing data. Required when missing is "mcar" or "mar"; ignored when missing is "none".

test_design

A list specifying the test design for structured missingness. Required when missing is "booklet" or "linking".

booklet

Must contain booklet_matrix: a binary matrix (n_booklets x n_items) where 1 indicates the item is administered.

linking

Must contain linking_matrix: a binary matrix (n_forms x n_items) where 1 indicates the item appears on the form.

estimation_model

Character string specifying the IRT model to fit. One of "1PL", "2PL", or "GRM". If NULL (default), defaults to design$model (i.e., the generation model is also the estimation model). Set to a different model to perform misspecification studies (e.g., generate 2PL, estimate 1PL). Cross-fits are only allowed within the same response format (binary: 1PL, 2PL; polytomous: GRM).

Value

An S3 object of class irt_study (a named list) with elements design, missing, missing_rate, sample_sizes, test_design, and estimation_model.
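
For the structured mechanisms, the shape of test_design can be illustrated with a small booklet matrix. The masking step below is a hypothetical illustration of the resulting missingness pattern; in particular, the rule for assigning respondents to booklets is an assumption, not the package's documented behavior.

```r
# 3 booklets x 6 items; 1 = item administered in that booklet
booklet_matrix <- rbind(
  c(1, 1, 1, 1, 0, 0),
  c(0, 0, 1, 1, 1, 1),
  c(1, 1, 0, 0, 1, 1)
)
test_design <- list(booklet_matrix = booklet_matrix)

# Hypothetical illustration of the missingness this design implies
set.seed(1)
n <- 9
responses <- matrix(rbinom(n * 6, size = 1, prob = 0.5), nrow = n)
booklet <- rep(seq_len(nrow(booklet_matrix)), length.out = n)  # assumed assignment
responses[booklet_matrix[booklet, ] == 0] <- NA
rowSums(!is.na(responses))   # every respondent sees 4 of the 6 items
```

A list of this form would be supplied as test_design together with missing = "booklet".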

See Also

irt_design() for the design specification, irt_simulate() to run the simulation.

Examples

# Simple study with no missing data
d <- irt_design(
  model = "1PL", n_items = 20,
  item_params = list(b = seq(-2, 2, length.out = 20))
)
study <- irt_study(d, sample_sizes = c(100, 250, 500))

# Study with MCAR missingness
study_mcar <- irt_study(d, sample_sizes = c(200, 400),
                        missing = "mcar", missing_rate = 0.2)

# Model misspecification: generate 2PL, fit 1PL
d_2pl <- irt_design(
  model = "2PL", n_items = 15,
  item_params = list(a = rlnorm(15, 0, 0.25), b = rnorm(15))
)
study_misspec <- irt_study(d_2pl, sample_sizes = c(100, 300),
                           estimation_model = "1PL")


Plot IRT Simulation Results

Description

Visualize performance criteria across sample sizes from an irt_simulate() result. Calls summary.irt_results() internally, then plots the requested criterion by sample size.

Usage

## S3 method for class 'irt_results'
plot(x, criterion = "rmse", param = NULL, item = NULL, threshold = NULL, ...)

Arguments

x

An irt_results object from irt_simulate().

criterion

Character string. Which criterion to plot. Default "rmse". Valid values: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

param

Optional character vector. Filter to specific parameter types (e.g., "a", "b", "b1").

item

Optional integer vector. Filter to specific item numbers.

threshold

Optional numeric. If provided, draws a horizontal reference line at this value.

...

Additional arguments passed to summary.irt_results().

Value

A ggplot2::ggplot object, returned invisibly.

See Also

summary.irt_results() for the underlying criteria, recommended_n() for sample-size recommendations.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
plot(results)
plot(results, criterion = "bias", threshold = 0.05, param = "b")



Plot Summary of IRT Simulation Results

Description

Visualize performance criteria from a summary.irt_results() object. This is a convenience method for users who already have a summary; plot.irt_results() is the primary interface.

Usage

## S3 method for class 'summary_irt_results'
plot(x, criterion = "rmse", param = NULL, item = NULL, threshold = NULL, ...)

Arguments

x

A summary_irt_results object from summary.irt_results().

criterion

Character string. Which criterion to plot. Default "rmse".

param

Optional character vector. Filter to specific parameter types.

item

Optional integer vector. Filter to specific item numbers.

threshold

Optional numeric. If provided, draws a horizontal reference line at this value.

...

Additional arguments (ignored).

Value

A ggplot2::ggplot object, returned invisibly.

See Also

plot.irt_results(), summary.irt_results()

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)
plot(s, criterion = "rmse", threshold = 0.15)



Print an IRT Design

Description

Display a compact summary of an irt_design object, including model type, number of items, theta distribution, and parameter ranges.

Usage

## S3 method for class 'irt_design'
print(x, ...)

Arguments

x

An irt_design object.

...

Additional arguments (ignored).

Value

x, invisibly.

See Also

irt_design()

Examples

d <- irt_design("1PL", 10, list(b = seq(-2, 2, length.out = 10)))
print(d)


Print an IRT Simulation Result

Description

Display a compact summary of an irt_simulate() result, including model, items, sample sizes, iterations, convergence rate, and elapsed time.

Usage

## S3 method for class 'irt_results'
print(x, ...)

Arguments

x

An irt_results object.

...

Additional arguments (ignored).

Value

x, invisibly.

See Also

irt_simulate()

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
print(results)



Print an IRT Study

Description

Display a compact summary of an irt_study object, including model, items, sample sizes, and missing data mechanism.

Usage

## S3 method for class 'irt_study'
print(x, ...)

Arguments

x

An irt_study object.

...

Additional arguments (ignored).

Value

x, invisibly.

See Also

irt_study()

Examples

d <- irt_design("1PL", 10, list(b = seq(-2, 2, length.out = 10)))
s <- irt_study(d, sample_sizes = c(100, 500))
print(s)


Print Summary of IRT Simulation Results

Description

Display item parameter criteria and theta recovery statistics from a summary.irt_results() object.

Usage

## S3 method for class 'summary_irt_results'
print(x, ...)

Arguments

x

A summary_irt_results object.

...

Additional arguments (ignored).

Value

x, invisibly.

See Also

summary.irt_results()

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)
print(s)



Find the Smallest Sample Size Meeting a Criterion Threshold

Description

Given a summary.irt_results() object, find the smallest sample size at which a performance criterion meets the specified threshold for each item and parameter combination.

Usage

recommended_n(object, ...)

## S3 method for class 'summary_irt_results'
recommended_n(object, criterion, threshold, param = NULL, item = NULL, ...)

Arguments

object

A summary_irt_results object from summary.irt_results().

...

Additional arguments (ignored).

criterion

Character string. Which criterion to evaluate. One of: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse".

threshold

Positive numeric. The threshold value the criterion must meet.

param

Optional character vector. Filter to specific parameter types (e.g., "a", "b", "b1").

item

Optional integer vector. Filter to specific item numbers.

Details

For criteria where smaller is better (bias, empirical_se, mse, rmse, mcse_bias, mcse_mse), the threshold is met when the criterion value is at or below the threshold. For bias, the absolute value is used. For coverage (where higher is better), the threshold is met when coverage is at or above the threshold.
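
The decision rule can be sketched for a single item and parameter as follows. The table values are invented for illustration and smallest_n_sketch is a hypothetical helper, not the package's implementation.

```r
# Hypothetical summary table for one item/parameter (values invented)
item_summary <- data.frame(
  sample_size = c(100, 250, 500, 1000),
  rmse        = c(0.31, 0.22, 0.14, 0.10),
  coverage    = c(0.91, 0.93, 0.95, 0.96)
)

smallest_n_sketch <- function(tbl, criterion, threshold) {
  value <- tbl[[criterion]]
  if (criterion == "bias") value <- abs(value)   # bias is judged on its absolute value
  # Coverage must reach the threshold; all other criteria must not exceed it
  met <- if (criterion == "coverage") value >= threshold else value <= threshold
  if (any(met)) min(tbl$sample_size[met]) else NA
}

smallest_n_sketch(item_summary, "rmse", 0.20)       # 500
smallest_n_sketch(item_summary, "coverage", 0.95)   # 500
smallest_n_sketch(item_summary, "rmse", 0.05)       # NA
```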

Value

A data frame with columns:

item

Item number.

param

Parameter name.

recommended_n

Minimum sample size meeting the threshold, or NA if no tested sample size meets it.

criterion

The criterion used (echoed back for reference).

threshold

The threshold used (echoed back for reference).

See Also

summary.irt_results() for computing criteria, plot.irt_results() for visualization.

Examples


design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)
s <- summary(results)

# Minimum N for RMSE <= 0.20 on all items
recommended_n(s, criterion = "rmse", threshold = 0.20)

# Minimum N for 95% coverage on difficulty parameters only
recommended_n(s, criterion = "coverage", threshold = 0.95, param = "b")



Summarize IRT Simulation Results

Description

Compute performance criteria for each sample size, item, and parameter combination from an irt_simulate() result. Criteria follow the definitions in Morris et al. (2019). Optionally, users can supply a custom callback function that computes additional item-level performance criteria (e.g., conditional reliability, external criterion SE).
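For orientation, the Morris et al. (2019) performance measures named in this topic can be computed by hand for a single parameter. The following is a minimal base R sketch on simulated stand-in data. All values (true_value, n_rep, se_hat) are hypothetical, the Wald-type intervals are an assumption, and the package's internal computation may differ in detail.

```r
set.seed(1)
true_value <- 0.5    # hypothetical true parameter value
n_rep      <- 200    # hypothetical number of replications

# Stand-in for per-replication estimates and their confidence intervals
estimates <- rnorm(n_rep, mean = true_value, sd = 0.2)
se_hat    <- rep(0.2, n_rep)                  # assumed model-based SEs
ci_lower  <- estimates - qnorm(0.975) * se_hat
ci_upper  <- estimates + qnorm(0.975) * se_hat

# Performance measures as defined in Morris et al. (2019)
bias         <- mean(estimates) - true_value
empirical_se <- sd(estimates)
mse          <- mean((estimates - true_value)^2)
rmse         <- sqrt(mse)
coverage     <- mean(ci_lower <= true_value & true_value <= ci_upper)

# Monte Carlo standard errors of the bias and MSE estimates
mcse_bias <- empirical_se / sqrt(n_rep)
mcse_mse  <- sqrt(sum(((estimates - true_value)^2 - mse)^2) /
                    (n_rep * (n_rep - 1)))

round(c(bias = bias, empirical_se = empirical_se, mse = mse, rmse = rmse,
        coverage = coverage, mcse_bias = mcse_bias, mcse_mse = mcse_mse), 3)
```

The Monte Carlo standard errors shrink with the square root of the number of replications, which is why the examples in this manual recommend more iterations in practice than they use for speed.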

Usage

## S3 method for class 'irt_results'
summary(object, criterion = NULL, param = NULL, criterion_fn = NULL, ...)

Arguments

object

An irt_results object from irt_simulate().

criterion

Optional character vector. Which criteria to include in the output. Valid values: "bias", "empirical_se", "mse", "rmse", "coverage", "mcse_bias", "mcse_mse". If NULL (default), all criteria are returned.

param

Optional character vector. Which parameter types to include (e.g., "a", "b", "b1"). If NULL (default), all parameters are summarized.

criterion_fn

Optional function. A user-defined callback to compute custom performance criteria. Must accept named arguments estimates (numeric vector), true_value (scalar), ci_lower (numeric), ci_upper (numeric), converged (logical), and ... (for future use). Must return a named numeric vector of length >= 1. The names become new columns in item_summary, appended after n_converged. If NULL (default), no custom criteria are computed.

...

Additional arguments (ignored).
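Because criterion_fn must accept a fixed set of named arguments and return a named numeric vector, it can help to call a callback directly on made-up inputs before passing it to summary(). The sketch below does this for a hypothetical callback, ci_fn, that returns the median absolute error and the mean confidence-interval width. It assumes that ci_lower and ci_upper are vectors aligned with estimates, which the argument description does not state explicitly.

```r
# Hypothetical callback: median absolute error and mean CI width.
ci_fn <- function(estimates, true_value, ci_lower, ci_upper, converged, ...) {
  keep <- !is.na(estimates)   # drop replications without an estimate
  c(
    median_abs_error = median(abs(estimates[keep] - true_value)),
    mean_ci_width    = mean(ci_upper[keep] - ci_lower[keep], na.rm = TRUE)
  )
}

# Direct call on made-up inputs to check the shape of the return value
out <- ci_fn(
  estimates  = c(0.4, 0.6, NA, 0.5),
  true_value = 0.5,
  ci_lower   = c(0.1, 0.3, NA, 0.2),
  ci_upper   = c(0.7, 0.9, NA, 0.8),
  converged  = c(TRUE, TRUE, FALSE, TRUE)
)

# The documented contract: a named numeric vector of length >= 1
stopifnot(is.numeric(out), length(out) >= 1, !is.null(names(out)))
out
```

Once the return shape checks out, the callback would be passed as summary(results, criterion_fn = ci_fn), and its names would become columns of item_summary.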

Value

An S3 object of class summary_irt_results containing:

item_summary

Data frame with one row per sample_size × item × param combination, containing the requested criteria plus n_converged and any custom columns from criterion_fn.

theta_summary

Data frame with one row per sample_size, containing mean_cor, sd_cor, mean_rmse, sd_rmse, and n_converged.

iterations

Number of replications.

seed

Base seed used.

model

IRT model type.

References

Morris, T. P., White, I. R., & Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074–2102. doi:10.1002/sim.8086

See Also

irt_simulate() for running simulations, plot.irt_results() for visualization, recommended_n() for sample-size recommendations.

Examples


# Minimal example (iterations reduced for speed; use 100+ in practice)
design <- irt_design(
  model = "1PL", n_items = 5,
  item_params = list(b = seq(-2, 2, length.out = 5))
)
study <- irt_study(design, sample_sizes = c(200, 500))
results <- irt_simulate(study, iterations = 10, seed = 42)

s <- summary(results)
s$item_summary
s$theta_summary

# Only bias and RMSE for difficulty parameters
summary(results, criterion = c("bias", "rmse"), param = "b")

# Compute custom criterion: relative bias
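custom_fn <- function(estimates, true_value, ci_lower, ci_upper, converged, ...) {
  valid_est <- estimates[!is.na(estimates)]
  # Relative bias is undefined when the true value is 0 (here, the item with b = 0)
  if (true_value == 0) return(c(relative_bias = NA_real_))
  rel_bias <- (mean(valid_est) - true_value) / true_value
  c(relative_bias = rel_bias)
}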
summary(results, criterion_fn = custom_fn)

# Multiple custom criteria
multi_fn <- function(estimates, true_value, ci_lower, ci_upper, converged, ...) {
  valid_est <- estimates[!is.na(estimates)]
  c(mean_est = mean(valid_est), sd_est = sd(valid_est))
}
summary(results, criterion_fn = multi_fn)



Get All Valid Criteria (Internal)

Description

Returns the names of all criteria in the registry.

Usage

valid_criteria()

Value

Character vector of criterion names, in registry order.


Get All Valid Missing Mechanisms (Internal)

Description

Returns the names of all mechanisms in the registry.

Usage

valid_missing_mechanisms()

Value

Character vector of mechanism names, in registry order.


Validate a Binary Design Matrix

Description

Internal helper that checks whether a booklet or linking design matrix meets the standard requirements: the object is a matrix, it has the expected number of columns (n_items), and it contains only binary values.

Usage

validate_design_matrix(mat, n_items, matrix_name)

Arguments

mat

An object to validate (should be a matrix).

n_items

Integer: the expected number of columns.

matrix_name

Character string: the name of the matrix type for error messages (e.g., "booklet_matrix", "linking_matrix").

Value

Invisibly returns NULL if all checks pass. Throws an error (via stop()) if any check fails.
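The three checks listed above can be sketched in base R as follows. This is a hypothetical re-implementation for illustration only: the function name check_binary_matrix, the error messages, and the reading of "binary" as 0/1 are assumptions, and irtsim's internal code may differ.

```r
# Hypothetical validator mirroring the documented checks; not irtsim's own code.
check_binary_matrix <- function(mat, n_items, matrix_name) {
  if (!is.matrix(mat)) {
    stop(sprintf("`%s` must be a matrix.", matrix_name), call. = FALSE)
  }
  if (ncol(mat) != n_items) {
    stop(sprintf("`%s` must have %d columns, not %d.",
                 matrix_name, as.integer(n_items), ncol(mat)), call. = FALSE)
  }
  # Assumption: "binary" means every cell is 0 or 1, with no missing values
  if (anyNA(mat) || !all(mat %in% c(0, 1))) {
    stop(sprintf("`%s` must contain only 0 and 1.", matrix_name), call. = FALSE)
  }
  invisible(NULL)
}

# A 2-row, 4-column matrix of 0/1 values passes silently
booklets <- rbind(c(1, 1, 0, 0),
                  c(0, 0, 1, 1))
check_binary_matrix(booklets, n_items = 4, matrix_name = "booklet_matrix")

# A column-count mismatch raises an error; capture its message for inspection
bad <- tryCatch(
  check_binary_matrix(booklets, n_items = 3, matrix_name = "booklet_matrix"),
  error = function(e) conditionMessage(e)
)
bad
```

Passing matrix_name into the message lets one helper serve both booklet and linking designs while still telling the user which argument failed.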