Package: BatchQC 2.9.0

Yaoan Leng

BatchQC: Batch Effects Quality Control Software

Sequencing and microarray samples are often collected or processed in multiple batches or at different times. This often introduces technical biases that can lead to incorrect results in downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment is needed and how correction should be applied before proceeding with downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect correction approaches to the data, so the user can quickly see the benefits of each method. BatchQC is developed as a Shiny app. The output is organized into multiple tabs, each featuring an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Authors: Jessica Anderson [aut], W. Evan Johnson [aut, fnd], Yaoan Leng [ctb, cre], Solaiappan Manimaran [aut], Heather Selby [ctb], Claire Ruberman [ctb], Kwame Okrah [ctb], Hector Corrada Bravo [ctb], Michael Silverstein [ctb], Regan Conrad [ctb], Zhaorong Li [ctb], Evan Holmes [ctb], Solomon Joseph [ctb], Howard Fan [ctb], Sean Lu [ctb]

# Install 'BatchQC' in R:
install.packages('BatchQC', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/wejlab/batchqc/issues

On Bioconductor: BatchQC-2.9.0 (bioc 3.24), BatchQC-2.8.0 (bioc 3.23)

Keywords: batcheffect, geneexpression, graphandnetwork, microarray, normalization, principalcomponent, sequencing, software, visualization, qualitycontrol, rnaseq, preprocessing, differentialexpression, immunooncology

9.35 score, 11 stars, 61 scripts, 494 downloads, 8 mentions, 40 exports, 236 dependencies

Last updated from: c30deb77cf. Checks: 1 NOTE, 9 OK. Indexed: yes.

Target                 Result  Time
bioc-checks            NOTE    280
linux-devel-x86_64     OK      567
source / vignettes     OK      652
linux-release-x86_64   OK      612
macos-release-arm64    OK      540
macos-oldrel-arm64     OK      446
windows-devel          OK      607
windows-release        OK      695
windows-oldrel         OK      617
wasm-release           OK      247

Exports: batch_correct, batch_design, BatchQC, batchqc_explained_variation, bisect, bladder_data_upload, color_palette, compute_aic, compute_lambda, confound_metrics, cor_props, covariates_not_confounded, cramers_v, DE_analyze, dendrogram_alpha_numeric_check, dendrogram_color_palette, dendrogram_plotter, EV_plotter, EV_table, goodness_of_fit_nb, heatmap_num_to_char_converter, heatmap_plotter, is_design_balanced, kBET, normalize_SE, PCA_plotter, plot_kBET, process_dendrogram, pval_plotter, pval_summary, ratio_plotter, run_kBET, run_lambda, std_pearson_corr_coef, summarized_experiment, summary_stats_EV_table, tb_data_upload, umap, variation_ratios, volcano_plot

Dependencies: abind, annotate, AnnotationDbi, askpass, assorthead, backports, base64enc, beachmat, BH, Biobase, BiocGenerics, BiocNeighbors, BiocParallel, BiocSingular, Biostrings, bit, bit64, bitops, blob, blockmodeling, bluster, boot, brio, broom, bslib, cachem, callr, car, carData, caTools, cellranger, Ckmeans.1d.dp, cli, clipr, cluster, codetools, colorspace, commonmark, conflicted, corrplot, cowplot, cpp11, crayon, curl, data.table, DBI, dbplyr, DelayedArray, Deriv, desc, DESeq2, diffobj, digest, doBy, dplyr, dqrng, dtplyr, EBSeq, edgeR, evaluate, farver, fastmap, FNN, fontawesome, forcats, forecast, formatR, Formula, fracdiff, fs, futile.logger, futile.options, gargle, genefilter, generics, GenomicRanges, ggdendro, ggnewscale, ggplot2, ggpubr, ggrepel, ggsci, ggsignif, glue, googledrive, googlesheets4, gplots, gridExtra, gtable, gtools, Harman, haven, here, highr, hms, htmltools, httpuv, httr, ids, igraph, IRanges, irlba, isoband, jquerylib, jsonlite, KEGGREST, KernSmooth, knitr, labeling, lambda.r, later, lattice, lifecycle, limma, lme4, lmtest, locfit, lubridate, magrittr, MASS, Matrix, MatrixGenerics, MatrixModels, matrixStats, memoise, metapod, mgcv, microbenchmark, mime, minqa, modelr, NCmisc, nlme, nloptr, nnet, numDeriv, openssl, otel, pbkrtest, pheatmap, pillar, pkgbuild, pkgconfig, pkgload, plyr, png, polynom, praise, prettyunits, processx, progress, promises, ps, purrr, quantreg, R6, ragg, rappdirs, rbibutils, RColorBrewer, Rcpp, RcppArmadillo, RcppEigen, RcppTOML, Rdpack, reader, readr, readxl, reformulas, rematch, rematch2, reprex, reshape2, reticulate, rlang, rmarkdown, rprojroot, RSpectra, RSQLite, rstatix, rstudioapi, rsvd, rvest, S4Arrays, S4Vectors, S7, sass, ScaledMatrix, scales, scran, scuttle, selectr, Seqinfo, shiny, shinyjs, shinythemes, SingleCellExperiment, sitmo, snow, sourcetools, SparseArray, SparseM, statmod, stringi, stringr, SummarizedExperiment, survival, sva, sys, systemfonts, testthat, textshaping, tibble, tidyr, tidyselect, tidyverse, timechange, timeDate, tinytex, tzdb, umap, urca, utf8, uuid, vctrs, viridisLite, vroom, waldo, withr, xfun, XML, xml2, xtable, XVector, yaml, zoo

BatchQC Examples

Rendered from BatchQC_examples.Rmd using knitr::rmarkdown on Jun 10 2026.

Last update: 2025-11-21
Started: 2023-08-03

Introduction to BatchQC

Rendered from BatchQC_Intro.Rmd using knitr::rmarkdown on Jun 10 2026.

Last update: 2026-02-03
Started: 2022-04-29

Readme and manuals

Help Manual

Help page | Topics
Boxplots for the distribution of AIC for each method | AIC_boxplots
Batch Correct: This function allows you to add a batch-corrected count matrix to the SE object | batch_correct
This function allows you to make a batch design matrix | batch_design
Batch and Condition indicator for signature data | batch_indicator
Run BatchQC shiny app | BatchQC
Returns a list of explained variation by batch and condition combinations | batchqc_explained_variation
bisect - a generic bisection function | bisect
Bladder data upload: This function uploads the Bladder data set from the bladderbatch package. The dataset is bladder cancer microarray data with 22,283 gene expression features. It has 57 bladder samples with 3 metadata variables (batch, outcome and cancer). It contains 5 batches, 3 cancer types (cancer, biopsy, control), and 5 outcomes (Biopsy, mTCC, sTCC-CIS, sTCC+CIS, and Normal). Batch 1 contains only cancer, 2 has cancer and controls, 3 has only controls, 4 contains only biopsy, and 5 contains cancer and biopsy | bladder_data_upload
This function returns BMI data that comes from the data in the paper "Comparing tuberculosis gene signatures in malnourished individuals using the TBSignatureProfiler". Subject IDs were matched as shown in "github.com/jessmcc22/BatchQCv2_Manuscript/blob/devel/R/subjectID_match.R" | BMI_data
Helper function to save variables as factors if not already factors | check_valid_input
Color palette | color_palette
ComBat Correction: This function applies ComBat correction to your summarized experiment object | ComBat_correction
ComBat-Seq Correction: This function applies ComBat-seq correction to your summarized experiment object | ComBat_seq_correction
This function creates the commentary recommendation when there are more than 20 samples. | commentary
Compute the AIC for the lognormal (ComBat) model, the negative binomial (ComBat-seq) model and the Voom model | compute_aic
Compute the lambda index for determining a need for batch correction | compute_lambda
Combine std. Pearson correlation coefficient and Cramer's V | confound_metrics
This function allows you to calculate correlation properties | cor_props
Returns list of covariates not confounded by batch; helper function for explained variation and for populating shiny app condition options | covariates_not_confounded
This function allows you to calculate Cramer's V | cramers_v
Differential Expression Analysis | DE_analyze
Dendrogram alpha or numeric checker | dendrogram_alpha_numeric_check
Dendrogram color palette | dendrogram_color_palette
Dendrogram Plot | dendrogram_plotter
This function calculates the goodness of fit of DESeq2 for larger sample sizes (intended for more than 150 samples). | DESeq_large_analysis
This function calculates the goodness of fit of DESeq2 for small sample sizes (intended for fewer than 20 samples). | DESeq2_small_size
This function calculates the goodness of fit of edgeR for larger sample sizes (intended for more than 150 samples). | edgeR_large_analysis
This function calculates the goodness of fit of edgeR for small sample sizes (intended for 20 or fewer samples). | edgeR_small_size
This function allows you to plot explained variation | EV_plotter
EV Table: Returns table with percent variation explained for specified number of genes | EV_table
Helper function to get residuals | get.res
This function calculates goodness-of-fit p-values for all genes by looking at how the NB model from edgeR or DESeq2 fits the data | goodness_of_fit_nb
Harman Correction: This function applies Harman correction to a summarized experiment object and reconstructs the data back into the original feature space | Harman_correction
Heatmap numeric to character converter | heatmap_num_to_char_converter
Heatmap Plotter | heatmap_plotter
Check if the experimental design is balanced or unbalanced | is_design_balanced
kBET - k-nearest neighbour batch effect test | kBET
Limma Correction: This function applies limma batch correction to your provided assay | limma_correction
BMI and matched sample names for TB data | merged_IDs
This function creates a histogram from the negative binomial goodness-of-fit adjusted p-values. | nb_histogram
This function determines the proportion of p-values below a specific value and compares it to the previously determined threshold | nb_proportion
This function allows you to add a normalized count matrix to the SE object | normalize_SE
This function allows you to plot PCA | PCA_plotter
This function performs DESeq on the permuted dataset. | permuted_DESeq
This function performs edgeR on the permuted dataset adjusted p-values. | permuted_edgeR
This function formats the PCA plot using ggplot | plot_data
kBET Rejection Plotter | plot_kBET
Create potential min_distance values for exploratory analysis based on the value of spread | possible_distances
Create a vector of possible nearest neighbor values from 5, 15, 25, 50, and 100 | possible_k_neighbors
Preprocess assay data | preprocess
Process Dendrogram | process_dendrogram
Protein data with 39 protein expression levels | protein_data
Batch and Condition indicator for protein expression data | protein_sample_info
P-value Plotter: This function allows you to plot p-values of explained variation | pval_plotter
Returns summary table for p-values of explained variation | pval_summary
This function allows you to plot ratios of explained variation | ratio_plotter
Helper function that contains the code to run the lognormal, voom, and negative binomial AIC models for 'compute_aic' | run_AIC_models
kBET rejection rate | run_kBET
Provide a recommendation on batch correction based on lambda calculation | run_lambda
Signature data with 1600 gene expression levels | signature_data
Calculate a standardized Pearson correlation coefficient | std_pearson_corr_coef
This function creates a summarized experiment object from count and metadata files uploaded by the user | summarized_experiment
Summary Stats EV Table: Returns table with min, 1st quartile, mean, 2nd quartile, and max for each variable in the explained variation boxplot | summary_stats_EV_table
sva Correction: This function applies sva correction to a summarized experiment object (implementation adapted from sva::psva) | sva_correction
svaseq Correction: This function applies sva correction to a summarized experiment object with count based RNA-seq data | svaseq_correction
TB data upload: This function uploads the TB data set from the curatedTBData package. | tb_data_upload
Create a umap plot; wrapper function for umap package plus custom plotting | umap
Creates Ratios of batch to variable variation statistic | variation_ratios
Volcano plot | volcano_plot
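
Several help pages above (cramers_v, std_pearson_corr_coef, confound_metrics) concern how strongly batch is confounded with a condition. The base R sketch below illustrates the idea using the textbook definitions of Cramer's V and the standardized Pearson contingency coefficient. It is not BatchQC's implementation: the function names sketch_cramers_v and sketch_std_pearson are hypothetical, the example vectors are made up, and whether BatchQC uses exactly these formulas is an assumption. Both helpers assume each variable has at least two levels.

```r
# Hypothetical helpers (not part of BatchQC); base R only.

# Chi-squared statistic of the batch-by-condition contingency table.
# Warnings about small expected counts are suppressed for this toy example.
sketch_chi2 <- function(tab) {
  unname(suppressWarnings(chisq.test(tab, correct = FALSE)$statistic))
}

# Cramer's V: sqrt(chi2 / (n * (k - 1))), with k = min(rows, cols).
sketch_cramers_v <- function(batch, condition) {
  tab <- table(batch, condition)
  k <- min(dim(tab))
  sqrt(sketch_chi2(tab) / (sum(tab) * (k - 1)))
}

# Pearson contingency coefficient sqrt(chi2 / (n + chi2)),
# divided by its maximum sqrt((k - 1) / k) so that it lies in [0, 1].
sketch_std_pearson <- function(batch, condition) {
  tab <- table(batch, condition)
  chi2 <- sketch_chi2(tab)
  k <- min(dim(tab))
  sqrt(chi2 / (sum(tab) + chi2)) / sqrt((k - 1) / k)
}

# Made-up design with two batches of four samples each.
batch      <- rep(c("b1", "b2"), each = 4)
balanced   <- rep(c("case", "control"), times = 4)  # both conditions in each batch
confounded <- rep(c("case", "control"), each = 4)   # condition identical to batch

v_balanced   <- sketch_cramers_v(batch, balanced)
v_confounded <- sketch_cramers_v(batch, confounded)
c_confounded <- sketch_std_pearson(batch, confounded)
```

In this toy design both metrics are 0 when every batch holds the same mix of conditions and 1 when condition is fully determined by batch. Values near 1 indicate that batch and condition effects cannot be separated from the data alone.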