DGEAR

Differential Gene Expression Analysis with R

The goal of DGEAR is to help researchers identify differentially expressed genes (DEGs) from microarray gene expression data or RNA-seq count data in the simplest way possible. DGEAR runs five independent statistical tests, combines their results using a majority voting strategy, and reports DEGs that are consistently flagged across methods.


How it works

DGEAR applies five statistical tests to every gene in your expression matrix:

| Test | Function |
|------|----------|
| Welch two-sample t-test | perform_t_test() |
| One-way ANOVA | perform_anova() |
| Dunnett’s test | perform_dunnett_test() |
| Half’s modified t-test | perform_h_test() |
| Wilcoxon-Mann-Whitney U-test | perform_wilcox_test() |

Each test independently flags a gene as significant when its Benjamini-Hochberg (BH) adjusted p-value is at most α. A gene is reported as a DEG when at least votting_cutoff of the 5 tests flag it. The p-values from all five tests are also combined per gene using Fisher’s method (metapod::parallelFisher).
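
The voting and combination steps can be sketched in base R. This is a minimal illustration under stated assumptions, not DGEAR's internal code: the p-value matrix p_raw is made up, and Fisher's statistic is computed directly with pchisq() rather than through metapod::parallelFisher.

```r
# Made-up raw p-values: 4 hypothetical genes (rows) x 5 tests (columns).
p_raw <- matrix(
  c(0.001, 0.002, 0.004, 0.003, 0.010,
    0.001, 0.300, 0.002, 0.400, 0.003,
    0.200, 0.500, 0.700, 0.040, 0.900,
    0.600, 0.800, 0.300, 0.700, 0.500),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4),
                  c("t_test", "anova", "dunnett", "half_t", "wilcoxon"))
)

alpha          <- 0.05
votting_cutoff <- 3

# BH adjustment is applied separately within each test (column-wise).
p_adj <- apply(p_raw, 2, p.adjust, method = "BH")

# One vote per test whose adjusted p-value is at or below alpha.
votes  <- rowSums(p_adj <= alpha)
is_deg <- votes >= votting_cutoff

# Fisher's method: -2 * sum(log p) follows a chi-squared distribution
# with 2k degrees of freedom, where k is the number of combined tests.
fisher_stat <- -2 * rowSums(log(p_raw))
fisher_p    <- pchisq(fisher_stat, df = 2 * ncol(p_raw), lower.tail = FALSE)

data.frame(votes, is_deg, fisher_p)
```

With these toy values, gene1 collects 5 votes and gene2 collects 3, so both pass a cutoff of 3, while gene3 and gene4 collect none.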


Installation

Install the released version from CRAN:

install.packages("DGEAR")

Or install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("koushikbardhan2000/DGEAR_0.2.1")

DGEAR depends on two packages, DescTools and metapod. Install them once if needed:

# DescTools is on CRAN:
install.packages("DescTools")

# metapod is on Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("metapod")

Quick start

library(DGEAR)

# Load the built-in example dataset (10 genes × 20 samples)
# Columns 1–10 = control, columns 11–20 = experiment
# Genes 1–5 are true DEGs (expression ~13× higher in control)
data("gene_exp_data")
head(gene_exp_data)

Run the full ensemble analysis

result <- DGEAR(
  dataframe      = gene_exp_data,
  con1           = 1,
  con2           = 10,
  exp1           = 11,
  exp2           = 20,
  alpha          = 0.05,
  votting_cutoff = 3        # gene must be flagged by at least 3/5 tests
)

# DEGs identified by majority voting
result$DEGs

# Full table: per-test BH FDRs, Fisher-combined FDR, log2FC, ensemble score
head(result$FDR_Table)

# Concise summary table
head(result$Results_Table)

# Raw output from each individual test
names(result$IndividualTests)   # "t_test" "anova" "dunnett" "half_t" "wilcoxon"

Run individual tests

Each test function can also be called on its own and returns a Table and a DEGs data.frame:

perform_t_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_anova(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_dunnett_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_h_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_wilcox_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
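
To make the per-gene idea concrete, the sketch below runs a Welch t-test and a Wilcoxon test row by row using only base R. It assumes log2-scale data with control samples in columns 1 to 10 and experimental samples in columns 11 to 20; the simulated matrix expr and the variable names are hypothetical, and the package functions may differ in their details and return values.

```r
set.seed(1)

# Simulated log2-scale matrix: 6 hypothetical genes x 20 samples.
expr <- matrix(rnorm(6 * 20, mean = 8, sd = 0.5), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:20)))

con_cols <- 1:10    # control samples
exp_cols <- 11:20   # experimental samples

# Make the first three genes clearly higher in the control group.
expr[1:3, con_cols] <- expr[1:3, con_cols] + 3

# t.test() defaults to var.equal = FALSE, i.e. the Welch variant.
welch_p <- apply(expr, 1, function(x) t.test(x[con_cols], x[exp_cols])$p.value)

# Wilcoxon-Mann-Whitney test on the same split.
wilcox_p <- apply(expr, 1, function(x) wilcox.test(x[con_cols], x[exp_cols])$p.value)

# BH adjustment across genes, one test at a time.
welch_fdr  <- p.adjust(welch_p,  method = "BH")
wilcox_fdr <- p.adjust(wilcox_p, method = "BH")

data.frame(welch_fdr, wilcox_fdr)
```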

Using a GEO SOFT annotation file (optional)

Gene annotation is entirely optional. When no annotation is supplied, row names are used as gene identifiers. To map probe IDs to gene symbols using a GEO SOFT family file:

annot <- read.delim("GSExxxxx_family.soft")

result <- DGEAR(
  dataframe      = your_data,
  con1           = 1, con2 = 8,
  exp1           = 9, exp2 = 21,
  alpha          = 0.05,
  votting_cutoff = 3,
  annot_df       = annot        # <-- optional annotation table
)
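
Conceptually, annotation amounts to a lookup from probe ID to gene symbol. The sketch below shows that lookup with base R's match(). The column names ID and Symbol, the probe IDs, and the gene names are all hypothetical; the layout DGEAR actually expects in annot_df is not specified here.

```r
# Hypothetical annotation table: one row per probe.
annot_toy <- data.frame(
  ID     = c("probe_1", "probe_2", "probe_3"),
  Symbol = c("GeneA", "GeneB", NA),
  stringsAsFactors = FALSE
)

# Hypothetical probe IDs reported as DEGs; probe_4 has no annotation row.
deg_ids <- c("probe_2", "probe_1", "probe_3", "probe_4")

# Look up each DEG's symbol; unmatched probes come back as NA.
symbols <- annot_toy$Symbol[match(deg_ids, annot_toy$ID)]

# Fall back to the probe ID when no symbol is available.
labels <- ifelse(is.na(symbols), deg_ids, symbols)
labels
```

Falling back to the probe ID keeps every DEG identifiable even when the annotation table is incomplete.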

Data format

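DGEAR expects a plain data.frame or matrix where rows are genes and columns are samples. Row names serve as gene identifiers, control samples occupy columns con1 to con2, and experimental samples occupy columns exp1 to exp2.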

Raw intensity / count values are automatically log2-transformed when the data appear to be on a linear scale (no manual transformation needed).
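
One widely used way to decide whether data are on a linear scale is the quantile heuristic found in GEO2R-generated scripts. The sketch below wraps it in a hypothetical helper, maybe_log2(); whether DGEAR applies this exact rule is an assumption.

```r
# Hypothetical helper: log2-transform only if the values look linear-scale.
maybe_log2 <- function(x) {
  x  <- as.matrix(x)
  qx <- as.numeric(quantile(x, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))

  # Linear-scale data tend to have a large 99th percentile or a wide range.
  looks_linear <- (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)

  if (looks_linear) {
    x[which(x <= 0)] <- NaN   # log2 is undefined for non-positive values
    x <- log2(x)
  }
  x
}

# Raw intensities span thousands, so they are transformed.
raw_vals <- matrix(c(120, 3500, 800, 15000, 60, 980), nrow = 2)
maybe_log2(raw_vals)

# Values already on a log2 scale are returned unchanged.
log_vals <- matrix(c(6.9, 11.8, 9.6, 13.9, 5.9, 9.9), nrow = 2)
maybe_log2(log_vals)
```

Note that this rule turns zeros into NaN, so for count data a pseudocount such as log2(x + 1) is a common alternative.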


Run the interactive demo

A fully annotated walkthrough script is bundled with the package:

source(system.file("examples", "DGEAR_demo.R", package = "DGEAR"))

Function reference

| Function | Description |
|----------|-------------|
| DGEAR() | Main ensemble function: runs all five tests and returns DEGs |
| perform_t_test() | Welch two-sample t-test |
| perform_anova() | One-way ANOVA |
| perform_dunnett_test() | Dunnett’s multiple comparison test |
| perform_h_test() | Half’s modified t-test |
| perform_wilcox_test() | Wilcoxon-Mann-Whitney U-test |

Authors

License

MIT © Koushik Bardhan