Differential Gene Expression Analysis with R
The goal of DGEAR is to help researchers identify differentially expressed genes (DEGs) from microarray gene expression data or RNA-seq count data in the simplest way possible. DGEAR runs five independent statistical tests, combines their results using a majority voting strategy, and reports DEGs that are consistently flagged across methods.
DGEAR applies five statistical tests to every gene in your expression matrix:
| Test | Function |
|---|---|
| Welch two-sample t-test | perform_t_test() |
| One-way ANOVA | perform_anova() |
| Dunnett’s test | perform_dunnett_test() |
| Half’s modified t-test | perform_h_test() |
| Wilcoxon-Mann-Whitney U-test | perform_wilcox_test() |
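To make the "one test per gene" idea concrete, here is a minimal base-R sketch that runs two of the listed tests row by row on a toy matrix and BH-adjusts the results. It does not call DGEAR or reproduce its internals; the matrix `expr` and the names `control_cols` and `experiment_cols` are made up for illustration.

```r
# Toy expression matrix: 6 genes x 8 samples (simulated, not gene_exp_data)
set.seed(1)
expr <- matrix(rnorm(6 * 8, mean = 8, sd = 0.5), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:8)))
# Shift the first two genes upward in the experiment group (columns 5-8)
expr[1:2, 5:8] <- expr[1:2, 5:8] + 3

control_cols    <- 1:4
experiment_cols <- 5:8

# Welch two-sample t-test per gene (t.test defaults to var.equal = FALSE)
p_t <- apply(expr, 1, function(x) {
  t.test(x[control_cols], x[experiment_cols])$p.value
})

# Wilcoxon-Mann-Whitney test per gene
p_w <- apply(expr, 1, function(x) {
  wilcox.test(x[control_cols], x[experiment_cols])$p.value
})

# Benjamini-Hochberg adjustment across genes, done separately for each test
fdr <- data.frame(
  t_test   = p.adjust(p_t, method = "BH"),
  wilcoxon = p.adjust(p_w, method = "BH")
)
print(fdr)
```

Each column of `fdr` plays the role of one test's vote input: a gene counts as flagged by that test when its adjusted p-value is at or below α.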
Each test independently flags a gene as significant (BH-adjusted p ≤
α). A gene is reported as a DEG when it is flagged by at least
votting_cutoff out of 5 tests. Combined p-values across all
five tests are also computed using Fisher’s method
(metapod::parallelFisher).
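The voting rule and the Fisher combination can be sketched in a few lines of base R. This is an illustration under stated assumptions, not DGEAR's own code: the p-value matrix is invented, Fisher's statistic is computed by hand with `pchisq()` rather than through metapod, and the same matrix is used for both steps because which p-values DGEAR passes to Fisher's method is not stated here.

```r
# Hypothetical per-test p-values: 4 genes (rows) x 5 tests (columns)
fdr <- matrix(
  c(0.001, 0.004, 0.020, 0.030, 0.060,   # gene1: 4 of 5 tests <= 0.05
    0.200, 0.040, 0.300, 0.500, 0.045,   # gene2: 2 of 5
    0.010, 0.010, 0.010, 0.010, 0.010,   # gene3: 5 of 5
    0.700, 0.900, 0.600, 0.800, 0.950),  # gene4: 0 of 5
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4),
                  c("t_test", "anova", "dunnett", "half_t", "wilcoxon"))
)

alpha <- 0.05
votting_cutoff <- 3   # spelled as in the DGEAR() argument

# One vote per test whose p-value is at or below alpha
votes <- rowSums(fdr <= alpha)
degs  <- names(votes)[votes >= votting_cutoff]

# Fisher's method: -2 * sum(log p) is chi-squared with 2k degrees of freedom
# under the null, assuming the k p-values are independent
k <- ncol(fdr)
fisher_stat <- -2 * rowSums(log(fdr))
fisher_p    <- pchisq(fisher_stat, df = 2 * k, lower.tail = FALSE)

print(votes)
print(degs)
print(signif(fisher_p, 3))
```

With a cutoff of 3, only gene1 and gene3 pass the vote. Raising `votting_cutoff` to 5 would keep gene3 alone, which shows how the cutoff trades sensitivity for agreement across tests.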
Install the released version from CRAN:
install.packages("DGEAR")

Or install the development version from GitHub:
# install.packages("remotes")
remotes::install_github("koushikbardhan2000/DGEAR_0.2.1")

DGEAR depends on two packages. Install them once if needed:
# DescTools is on CRAN:
install.packages("DescTools")
# metapod is on Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("metapod")

library(DGEAR)
# Load the built-in example dataset (10 genes × 20 samples)
# Columns 1–10 = control, columns 11–20 = experiment
# Genes 1–5 are true DEGs (expression ~13× higher in control)
data("gene_exp_data")
head(gene_exp_data)

result <- DGEAR(
  dataframe = gene_exp_data,
  con1 = 1,
  con2 = 10,
  exp1 = 11,
  exp2 = 20,
  alpha = 0.05,
  votting_cutoff = 3  # gene must be flagged by at least 3/5 tests
)
# DEGs identified by majority voting
result$DEGs
# Full table: per-test BH FDRs, Fisher-combined FDR, log2FC, ensemble score
head(result$FDR_Table)
# Concise summary table
head(result$Results_Table)
# Raw output from each individual test
names(result$IndividualTests)  # "t_test" "anova" "dunnett" "half_t" "wilcoxon"

Each test function can also be called on its own and returns a Table and a DEGs data.frame:
perform_t_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_anova(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_dunnett_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_h_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)
perform_wilcox_test(gene_exp_data, con1 = 1, con2 = 10, exp1 = 11, exp2 = 20)

Gene annotation is entirely optional. When no annotation is supplied, row names are used as gene identifiers. To map probe IDs to gene symbols using a GEO SOFT family file:
annot <- read.delim("GSExxxxx_family.soft")
result <- DGEAR(
  dataframe = your_data,
  con1 = 1, con2 = 8,
  exp1 = 9, exp2 = 21,
  alpha = 0.05,
  votting_cutoff = 3,
  annot_df = annot  # optional annotation table
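)

DGEAR expects a plain data.frame or matrix where:
- rows are genes (or probes) and columns are samples
- control samples occupy a contiguous block of columns (con1 to con2)
- experiment samples occupy a contiguous block of columns (exp1 to exp2)

Raw intensity / count values are automatically log2-transformed when the data appear to be on a linear scale (no manual transformation needed).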
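How a linear-scale check of this kind might work can be sketched as follows. The rule DGEAR actually applies is not documented in this README, so the helper `log2_if_linear()`, its `max_log_value` threshold, and the `+ 1` offset are all hypothetical choices made for illustration.

```r
# Hypothetical helper (not a DGEAR function): guess whether values are on a
# linear scale and, if so, log2-transform them.
log2_if_linear <- function(x, max_log_value = 50) {
  x <- as.matrix(x)
  # log2-scale intensities rarely exceed about 20, so a large upper quantile
  # suggests raw intensities or counts
  looks_linear <- quantile(x, 0.99, na.rm = TRUE) > max_log_value
  if (looks_linear) log2(x + 1) else x   # + 1 guards against log2(0)
}

raw            <- matrix(c(120, 3500, 0, 48000, 910, 15), nrow = 2)
already_logged <- matrix(c(6.9, 11.8, 0.1, 15.6, 9.8, 3.9), nrow = 2)

log2_if_linear(raw)             # values are transformed
log2_if_linear(already_logged)  # values are returned unchanged
```

A quantile is used instead of the maximum so that a single outlying value does not decide the outcome. If your data sit near such a threshold, transforming them yourself and checking the value range before calling DGEAR() removes the ambiguity.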
A fully annotated walkthrough script is bundled with the package:
source(system.file("examples", "DGEAR_demo.R", package = "DGEAR"))

| Function | Description |
|---|---|
| DGEAR() | Main ensemble function: runs all five tests and returns DEGs |
| perform_t_test() | Welch two-sample t-test |
| perform_anova() | One-way ANOVA |
| perform_dunnett_test() | Dunnett’s multiple comparison test |
| perform_h_test() | Half’s modified t-test |
| perform_wilcox_test() | Wilcoxon-Mann-Whitney U-test |
MIT © Koushik Bardhan