| Title: | Exposure-Aware Multi-Omics Risk Modeling |
|---|---|
| Description: | ExpoRiskR provides tools for exposure-aware multi-omics risk modeling in translational and environmental health studies. The package aligns sample identifiers across exposure and multi-omics blocks, performs lightweight preprocessing, and fits exposure-adjusted association models to build interpretable microbe–metabolite networks. It also computes simple exposure perturbation summaries and generates publication-ready visualizations. Workflows support both matrix-based inputs and SummarizedExperiment objects. |
| Authors: | Prem Prashant Chaudhary [aut, cre] (ORCID: <https://orcid.org/0000-0002-3467-8608>) |
| Maintainer: | Prem Prashant Chaudhary <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0 |
| Built: | 2026-05-30 08:29:57 UTC |
| Source: | https://github.com/bioc/ExpoRiskR |
Ensures that microbiome, metabolome, exposures, and metadata all refer to the same
set of samples in the same order. Sample IDs are taken from rownames of matrices/
data.frames, or from a column in meta if id_col is provided.
align_omics(microbiome, metabolome, exposures, meta, id_col = NULL, strict = TRUE)

| Argument | Description |
|---|---|
| microbiome | Matrix/data.frame of samples x microbes. |
| metabolome | Matrix/data.frame of samples x metabolites. |
| exposures | Matrix/data.frame of samples x exposures. |
| meta | data.frame of sample-level metadata (must include the outcome used in later steps). |
| id_col | Optional column name in meta that holds the sample IDs. |
| strict | If TRUE, errors if any block has samples not found in the others. If FALSE, keeps only the samples common to all blocks and drops the rest. |

A list with aligned microbiome, metabolome, exposures, meta, and sample_id.
set.seed(4)
d <- generate_dummy_exporisk(n = 20, p_micro = 6, p_metab = 8, p_expo = 3)
aligned <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
names(aligned)
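The alignment rule described above (same samples, same order, with strict controlling whether mismatches are an error or are dropped) can be sketched in base R. This is a minimal illustration, not the package's implementation: align_blocks() is a hypothetical helper, and it assumes every block carries sample IDs as rownames.

```r
# Hypothetical helper (not part of ExpoRiskR): align blocks by their rownames.
align_blocks <- function(blocks, strict = TRUE) {
  ids <- lapply(blocks, rownames)
  common <- Reduce(intersect, ids)          # IDs present in every block
  if (strict) {
    same <- vapply(ids, function(x) setequal(x, common), logical(1))
    if (!all(same)) {
      stop("Blocks with samples missing elsewhere: ",
           paste(names(blocks)[!same], collapse = ", "))
    }
  }
  # Reorder every block to the shared ID order
  aligned <- lapply(blocks, function(b) b[common, , drop = FALSE])
  c(aligned, list(sample_id = common))
}

A <- matrix(1:6, nrow = 3, dimnames = list(c("s1", "s2", "s3"), c("m1", "m2")))
B <- matrix(1:8, nrow = 4, dimnames = list(c("s3", "s1", "s2", "s4"), c("k1", "k2")))

# strict = FALSE: keep the intersection and drop the extra sample s4
res <- align_blocks(list(microbiome = A, metabolome = B), strict = FALSE)
res$sample_id

# strict = TRUE: the extra sample in B is reported as an error
strict_error <- tryCatch(
  align_blocks(list(microbiome = A, metabolome = B), strict = TRUE),
  error = function(e) conditionMessage(e)
)
strict_error
```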
Convenience wrapper to (i) align microbiome, metabolome, and exposures by sample ID
and (ii) return two SummarizedExperiment objects (microbiome + metabolome)
that share the same colData (meta + exposures). This is useful for
Bioconductor-style workflows.
Inputs microbiome, metabolome, exposures are expected to be
sample-by-feature matrices (or coercible to matrices). Sample IDs are taken from
rownames when present; otherwise from meta[[id_col]].
align_omics_se(microbiome, metabolome, exposures, meta, id_col = "sample_id", strict = TRUE)

| Argument | Description |
|---|---|
| microbiome | Matrix/data.frame (samples x microbes). |
| metabolome | Matrix/data.frame (samples x metabolites). |
| exposures | Matrix/data.frame (samples x exposures). |
| meta | Data.frame with sample metadata, including the id_col column. |
| id_col | Column name in meta that holds the sample IDs. |
| strict | If TRUE (default), require that all blocks contain the same sample IDs; otherwise subset to the intersection. |

A list with:
se_microbiome: SummarizedExperiment for microbiome (features x samples)
se_metabolome: SummarizedExperiment for metabolome (features x samples)
exposures: aligned numeric matrix (samples x exposures)
meta: aligned meta data.frame
sample_ids: character vector of aligned sample IDs
set.seed(7)
d <- generate_dummy_exporisk(n = 12, p_micro = 5, p_metab = 6, p_expo = 3)
out <- align_omics_se(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
out$se_microbiome
out$se_metabolome
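The inputs are samples x features, while the returned SummarizedExperiment objects are features x samples, so a transpose is involved. The sketch below shows that orientation change on toy data. It is an illustration under stated assumptions rather than the wrapper's actual code: the assay name abundance and the toy column names are made up, and the constructor call only runs if the SummarizedExperiment package is installed.

```r
# Toy sample-by-feature matrix and metadata; all names here are illustrative.
X <- matrix(c(5, 0, 2, 7, 1, 3, 0, 4, 9, 6, 2, 1), nrow = 4,
            dimnames = list(paste0("s", 1:4), paste0("taxon", 1:3)))
meta <- data.frame(sample_id = rownames(X), outcome = c(0, 1, 0, 1))

# SummarizedExperiment stores features in rows and samples in columns
assay_mat <- t(X)
stopifnot(identical(colnames(assay_mat), meta$sample_id))

if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(abundance = assay_mat),
    colData = S4Vectors::DataFrame(meta, row.names = meta$sample_id)
  )
  print(dim(se))  # features x samples
}
```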
For each (microbe, metabolite) pair, fits a linear model of the form
metabolite ~ microbe + exposures (+ covariates, if supplied)
and uses the microbe coefficient as the edge weight.
This is a minimal (MVP), interpretable approach suitable for Bioconductor submission.
build_exposure_network(X, Y, E, covar = NULL, fdr = 0.1, max_pairs = 5000, seed = NULL)

| Argument | Description |
|---|---|
| X | Numeric matrix (samples x microbes). |
| Y | Numeric matrix (samples x metabolites). |
| E | Numeric matrix (samples x exposures). |
| covar | Optional data.frame of sample-level covariates (rows = samples). |
| fdr | FDR threshold for keeping edges (BH-adjusted p-value). |
| max_pairs | Maximum number of (microbe, metabolite) pairs to test (for speed). If NULL, tests all pairs (may be slow). |
| seed | Optional random seed used only when pairs are subsampled to max_pairs. |

A list with:
edges: data.frame of significant edges (microbe, metabolite, weight, p_value, fdr)
graph: igraph object (bipartite)
meta: list of settings and counts
set.seed(1)
d <- generate_dummy_exporisk(n = 30, p_micro = 10, p_metab = 12, p_expo = 4)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
net <- build_exposure_network(pr$X, pr$Y, pr$E, fdr = 0.5, max_pairs = 120, seed = 1)
utils::head(net$edges)
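To make the per-pair modeling concrete, the following base R sketch fits one exposure-adjusted linear model per (microbe, metabolite) pair, takes the microbe coefficient as the weight, and applies a Benjamini-Hochberg correction. It is a hedged illustration on simulated data, not the package's code: fit_pair() is a hypothetical helper and the model form (metabolite regressed on one microbe plus all exposures) is an assumption drawn from the description above.

```r
set.seed(1)
n <- 40
E <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("e1", "e2")))
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("microbe", 1:3)))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, paste0("metab", 1:2)))
Y[, 1] <- Y[, 1] + 1.5 * X[, 1]   # plant one true association

pairs <- expand.grid(microbe = colnames(X), metabolite = colnames(Y),
                     stringsAsFactors = FALSE)

# Hypothetical helper: microbe coefficient and p-value, adjusted for exposures
fit_pair <- function(mi, me) {
  dat <- data.frame(y = Y[, me], x = X[, mi], E)
  co <- summary(lm(y ~ ., data = dat))$coefficients
  c(co["x", "Estimate"], co["x", "Pr(>|t|)"])
}

stats <- vapply(seq_len(nrow(pairs)),
                function(i) fit_pair(pairs$microbe[i], pairs$metabolite[i]),
                numeric(2))
edges <- data.frame(pairs, weight = stats[1, ], p_value = stats[2, ])
edges$fdr <- p.adjust(edges$p_value, method = "BH")

# Keep only pairs passing the FDR threshold
edges[edges$fdr <= 0.1, ]
```

Testing every pair scales as the product of the two feature counts, which is presumably why the function offers max_pairs as a speed control.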
Builds a reference network using all exposures, then for each exposure j builds a network leaving out exposure j, and computes a perturbation score based on differences in edge weights for a subset of tested pairs.
This is an MVP perturbation metric designed to be interpretable and fast enough for simulated/demo datasets.
exposure_perturbation_score(X, Y, E, covar = NULL, fdr = 0.2, max_pairs = 3000, seed = 1)

| Argument | Description |
|---|---|
| X | Microbiome matrix (samples x microbes). |
| Y | Metabolome matrix (samples x metabolites). |
| E | Exposures matrix (samples x exposures). |
| covar | Optional covariates data.frame. |
| fdr | FDR threshold passed to build_exposure_network(). |
| max_pairs | Number of pairs to test per network build (speed control). |
| seed | Random seed. |

A data.frame with exposure, perturbation_score, n_edges_ref, n_edges_drop.
set.seed(2)
d <- generate_dummy_exporisk(n = 30, p_micro = 10, p_metab = 12, p_expo = 4)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
scores <- exposure_perturbation_score(pr$X, pr$Y, pr$E, fdr = 0.5, max_pairs = 120, seed = 1)
scores
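The leave-one-exposure-out idea can be illustrated in base R. The sketch below is an assumption-laden stand-in, not the package's metric: pair_weights() is a hypothetical helper, the score is taken to be the mean absolute change in microbe coefficients, and no FDR filtering or pair subsampling is applied.

```r
set.seed(2)
n <- 60
E <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("e1", "e2", "e3")))
X <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("mic1", "mic2")))
Y <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("met1", "met2")))
# e1 drives both mic1 and met1, so it confounds that pair; e2 and e3 are noise
X[, 1] <- X[, 1] + E[, 1]
Y[, 1] <- Y[, 1] + 2 * E[, 1]

# Hypothetical helper: microbe coefficient for every pair, adjusted for E_adj
pair_weights <- function(X, Y, E_adj) {
  w <- matrix(NA_real_, ncol(X), ncol(Y),
              dimnames = list(colnames(X), colnames(Y)))
  for (i in seq_len(ncol(X))) {
    for (j in seq_len(ncol(Y))) {
      design <- cbind(microbe = X[, i], E_adj)
      w[i, j] <- coef(lm(Y[, j] ~ design))[2]  # 2nd coefficient = microbe
    }
  }
  w
}

w_ref <- pair_weights(X, Y, E)  # reference: all exposures included
scores <- vapply(seq_len(ncol(E)), function(k) {
  w_drop <- pair_weights(X, Y, E[, -k, drop = FALSE])  # leave exposure k out
  mean(abs(w_ref - w_drop))
}, numeric(1))
names(scores) <- colnames(E)
sort(scores, decreasing = TRUE)
```

In this simulation, dropping e1 removes the adjustment for a real confounder, so the microbe coefficients shift most for that exposure.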
Creates a reproducible toy dataset for demonstrating ExpoRiskR workflows: exposures (E), microbiome-like positive features (X), metabolome-like positive features (Y), and a binary disease outcome.
If seed is provided, reproducibility is ensured locally without modifying
the global RNG state.
generate_dummy_exporisk(n = 120, p_micro = 50, p_metab = 80, p_expo = 10, n_signal = 6, seed = NULL)

| Argument | Description |
|---|---|
| n | Number of samples. |
| p_micro | Number of microbiome features. |
| p_metab | Number of metabolomics features. |
| p_expo | Number of exposure variables. |
| n_signal | Number of truly associated features per block. |
| seed | Optional random seed for reproducible simulation. |

A list with matrices: microbiome, metabolome, exposures; and meta data.frame.
d <- generate_dummy_exporisk(n = 20, p_micro = 6, p_metab = 8, p_expo = 3, seed = 1)
str(d)
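Seeding locally without touching the global RNG state is commonly done by saving .Random.seed, calling set.seed(), and restoring the saved state on exit. The base R sketch below shows that pattern. with_local_seed() is a hypothetical helper written for illustration; whether the package uses this exact mechanism is an assumption.

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then restore the
# caller's RNG state so the global random stream is left untouched.
with_local_seed <- function(seed, expr) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) old_seed <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  force(expr)  # lazy argument: evaluated here, after set.seed()
}

set.seed(123)
before <- .Random.seed
a <- with_local_seed(1, rnorm(3))
b <- with_local_seed(1, rnorm(3))
after <- .Random.seed

identical(a, b)           # same seed gives the same draws
identical(before, after)  # global RNG state is unchanged
```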
Plots a bipartite igraph network returned by build_exposure_network().
Uses base igraph plotting (no extra dependencies).
plot_exposure_network(net, file = NULL, width = 10, height = 7, dpi = 300, layout = "layout_with_fr", max_label_nodes = 30)

| Argument | Description |
|---|---|
| net | A list returned by build_exposure_network(). |
| file | Optional output filename. If provided, saves a PNG (recommended). |
| width, height | Plot device size (in inches) when saving. |
| dpi | DPI when saving PNG. |
| layout | Layout function name passed to igraph. Default "layout_with_fr". |
| max_label_nodes | Maximum number of nodes to label (largest by degree). Default 30. |

Invisibly returns net$graph.
d <- generate_dummy_exporisk(seed = 1, n = 12, p_micro = 5, p_metab = 6, p_expo = 3)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
net <- build_exposure_network(pr$X, pr$Y, pr$E, fdr = 0.95, max_pairs = 120, seed = 1)
plot_exposure_network(net)
Plot exposure perturbation ranking
plot_exposure_ranking(scores, top_n = 20)

| Argument | Description |
|---|---|
| scores | A data.frame from exposure_perturbation_score(). |
| top_n | Show only top N exposures (default 20). Use NULL for all. |

A ggplot object.
d <- generate_dummy_exporisk(n = 30, p_micro = 10, p_metab = 12, p_expo = 4)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
scores <- exposure_perturbation_score(pr$X, pr$Y, pr$E, fdr = 0.5, max_pairs = 120, seed = 1)
plot_exposure_ranking(scores)
Fits a logistic regression (outcome ~ exposures) and ranks exposures by the absolute value of their standardized coefficients.
plot_feature_importance(E, outcome, top_n = 25)

| Argument | Description |
|---|---|
| E | Numeric matrix (samples x exposures). |
| outcome | Binary vector (0/1), length = nrow(E). |
| top_n | Number of top exposures to show. |

A ggplot object.
d <- generate_dummy_exporisk(seed = 1, n = 20, p_micro = 6, p_metab = 8, p_expo = 4)
outcome <- d$meta$outcome
names(outcome) <- d$meta$sample_id
p <- plot_feature_importance(E = d$exposures, outcome = outcome, top_n = 10)
print(p)
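The ranking itself can be reproduced without plotting. The base R sketch below z-scores the exposures, fits a logistic regression with glm(), and sorts by absolute coefficient. It is a minimal illustration on simulated data: the exposure names and effect sizes are made up, and standardizing by z-scoring the inputs is an assumption about how the coefficients are made comparable.

```r
set.seed(3)
n <- 200
E <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("expo", 1:4)))
E[, 2] <- E[, 2] * 10   # deliberately put one exposure on a larger scale
outcome <- rbinom(n, 1, plogis(1.5 * E[, 1] + 0.05 * E[, 2]))

# z-score each exposure so coefficients are comparable across scales
Ez <- scale(E)
fit <- glm(outcome ~ ., data = data.frame(outcome = outcome, Ez),
           family = binomial())

beta <- coef(fit)[-1]                          # drop the intercept
importance <- sort(abs(beta), decreasing = TRUE)
head(importance, 3)
```

Without the z-scoring step, the coefficient for expo2 would be shrunk by its larger scale and the ranking would reflect units rather than effect strength.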
Fits outcome ~ exposures and shows per-exposure contribution for one sample based on standardized coefficients and standardized exposure values.
plot_individual_risk_profile(sample_id, E, outcome, top_n = 20)

| Argument | Description |
|---|---|
| sample_id | Sample ID (must be in rownames(E)). |
| E | Numeric matrix (samples x exposures) with rownames. |
| outcome | Binary vector (0/1), named by sample IDs or in the same row order as E. |
| top_n | Number of top contributing exposures to display. |

A ggplot object.
d <- generate_dummy_exporisk(seed = 1, n = 20, p_micro = 6, p_metab = 8, p_expo = 4)
outcome <- d$meta$outcome
names(outcome) <- d$meta$sample_id
sid <- rownames(d$exposures)[1]
p <- plot_individual_risk_profile(sample_id = sid, E = d$exposures, outcome = outcome, top_n = 10)
print(p)
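One natural reading of "contribution" is the product of each standardized coefficient and the sample's standardized exposure value, which sums (with the intercept) to the sample's log-odds. The sketch below shows that decomposition in base R. Treat it as an assumption about the calculation rather than the function's exact code; the exposure names pm25, lead and noise are invented for the example.

```r
set.seed(4)
n <- 150
E <- matrix(rnorm(n * 3), n, 3,
            dimnames = list(paste0("s", seq_len(n)), c("pm25", "lead", "noise")))
outcome <- rbinom(n, 1, plogis(E[, "pm25"] - 0.8 * E[, "lead"]))

Ez <- scale(E)
fit <- glm(outcome ~ ., data = data.frame(outcome = outcome, Ez),
           family = binomial())
beta <- coef(fit)[colnames(Ez)]

# Per-exposure contribution for one sample: coefficient x standardized value
sid <- "s1"
contrib <- beta * Ez[sid, ]
contrib[order(abs(contrib), decreasing = TRUE)]

# The intercept plus the contributions recovers the sample's linear predictor
linear_predictor <- unname(coef(fit)[1] + sum(contrib))
all.equal(linear_predictor, unname(predict(fit, type = "link")[sid]))
```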
Builds a reference network using all samples, then repeatedly bootstraps samples with replacement, rebuilds the network, and computes Jaccard overlap between edge sets.
plot_network_stability(X, Y, E, n_boot = 50, fdr = 0.2, max_pairs = 2000, seed = NULL)

| Argument | Description |
|---|---|
| X | Numeric matrix (samples x microbes). |
| Y | Numeric matrix (samples x metabolites). |
| E | Numeric matrix (samples x exposures). |
| n_boot | Number of bootstrap resamples. |
| fdr | FDR threshold passed to build_exposure_network(). |
| max_pairs | Maximum pairs passed to build_exposure_network(). |
| seed | Optional seed controlling bootstrap resampling only. |

A ggplot object.
d <- generate_dummy_exporisk(seed = 1, n = 20, p_micro = 8, p_metab = 10, p_expo = 4)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
p <- plot_network_stability(pr$X, pr$Y, pr$E, n_boot = 2, fdr = 0.95, max_pairs = 120, seed = 1)
print(p)
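The bootstrap-and-Jaccard procedure can be sketched in base R. To stay self-contained, the example swaps in a simple correlation cutoff for the network builder, so edge_set() is a hypothetical stand-in and not build_exposure_network(). Treating two empty edge sets as fully overlapping is also an assumption.

```r
# Jaccard overlap between two edge sets (character vectors of edge labels)
jaccard <- function(a, b) {
  if (length(a) == 0 && length(b) == 0) return(1)  # assumed convention
  length(intersect(a, b)) / length(union(a, b))
}

# Stand-in edge detector: pairs whose absolute correlation exceeds a cutoff
edge_set <- function(X, Y, cutoff = 0.4) {
  r <- cor(X, Y)
  hits <- which(abs(r) > cutoff, arr.ind = TRUE)
  paste(rownames(r)[hits[, 1]], colnames(r)[hits[, 2]], sep = "--")
}

set.seed(5)
n <- 80
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("mic", 1:4)))
Y <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, paste0("met", 1:3)))
Y[, 1] <- Y[, 1] + 2 * X[, 1]   # one strong, stable edge

ref <- edge_set(X, Y)            # reference network on all samples
overlap <- vapply(seq_len(20), function(b) {
  idx <- sample.int(n, replace = TRUE)   # bootstrap resample of samples
  jaccard(ref, edge_set(X[idx, ], Y[idx, ]))
}, numeric(1))
summary(overlap)
```

Values near 1 indicate that the resampled networks recover nearly the same edges as the reference; values near 0 indicate an unstable edge set.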
Plot disease risk stratification ROC curves (MVP)
plot_risk_roc(X, Y, E, outcome, edges, top_edges = 200)

| Argument | Description |
|---|---|
| X | Microbiome matrix (samples x features) |
| Y | Metabolome matrix (samples x features) |
| E | Exposures matrix (samples x features) |
| outcome | Binary vector (0/1) |
| edges | Network edges data.frame |
| top_edges | Number of strongest edges for network feature |

A ggplot object
d <- generate_dummy_exporisk(seed = 1, n = 25, p_micro = 8, p_metab = 10, p_expo = 4)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
net <- build_exposure_network(pr$X, pr$Y, pr$E, fdr = 0.95, max_pairs = 150, seed = 1)
outcome <- d$meta$outcome
names(outcome) <- d$meta$sample_id
p <- plot_risk_roc(pr$X, pr$Y, pr$E, outcome = outcome, edges = net$edges, top_edges = 30)
print(p)
Lightweight preprocessing for MVP and Bioconductor-friendly workflows. Converts inputs to numeric matrices, checks sample alignment, optionally imputes missing values, applies log1p transforms, and scales features.
prep_omics(microbiome, metabolome, exposures, log1p_micro = TRUE, log1p_metab = TRUE, z_expo = TRUE, scale_omics = TRUE, na_action = c("error", "impute"))

| Argument | Description |
|---|---|
| microbiome | Matrix/data.frame of samples x microbes. |
| metabolome | Matrix/data.frame of samples x metabolites. |
| exposures | Matrix/data.frame of samples x exposures. |
| log1p_micro | If TRUE (default), apply log1p to microbiome. |
| log1p_metab | If TRUE (default), apply log1p to metabolome. |
| z_expo | If TRUE (default), z-score exposures. |
| scale_omics | If TRUE (default), center/scale microbiome and metabolome features. |
| na_action | What to do with NA values: "error" (default) or "impute". |

A list with processed matrices: X, Y, E.
set.seed(1)
d <- generate_dummy_exporisk(n = 20, p_micro = 6, p_metab = 8, p_expo = 3)
al <- align_omics(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
pr <- prep_omics(al$microbiome, al$metabolome, al$exposures)
str(pr)
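The preprocessing steps listed above (numeric coercion, NA handling, log1p, scaling) can be sketched for a single block in base R. This is a rough illustration, not the package's code: prep_block() is a hypothetical helper, per-feature median imputation is an assumed strategy since the imputation method is not stated here, and the order of the steps is likewise an assumption.

```r
# Hypothetical single-block preprocessing helper (not part of ExpoRiskR)
prep_block <- function(m, log1p_transform = TRUE, scale_features = TRUE,
                       na_action = c("error", "impute")) {
  na_action <- match.arg(na_action)
  m <- as.matrix(m)
  storage.mode(m) <- "double"
  if (anyNA(m)) {
    if (na_action == "error") {
      stop("Missing values found; set na_action = \"impute\" to fill them.")
    }
    # Assumption: fill each feature's NAs with that feature's median
    for (j in seq_len(ncol(m))) {
      m[is.na(m[, j]), j] <- median(m[, j], na.rm = TRUE)
    }
  }
  if (log1p_transform) m <- log1p(m)   # log(1 + x), safe for zero counts
  if (scale_features) m <- scale(m)    # center and scale each feature
  m
}

counts <- matrix(c(0, 3, 10, NA, 5, 1, 2, 8), nrow = 4,
                 dimnames = list(paste0("s", 1:4), c("taxonA", "taxonB")))

X <- prep_block(counts, na_action = "impute")
round(colMeans(X), 10)   # features are centered after scaling
```

With the default na_action = "error", the same call stops instead of imputing, which makes missing data an explicit decision rather than a silent one.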
Preprocess SummarizedExperiment-based omics blocks and exposures
prep_omics_se(aligned, assay_micro = NULL, assay_metab = NULL, ...)

| Argument | Description |
|---|---|
| aligned | Output from align_omics_se() or align_omics(). |
| assay_micro | Assay name for microbiome SE (default: first assay). |
| assay_metab | Assay name for metabolome SE (default: first assay). |
| ... | Passed to prep_omics(). |

A list with preprocessed matrices: X, Y, E.
set.seed(8)
d <- generate_dummy_exporisk(n = 12, p_micro = 5, p_metab = 6, p_expo = 3)
aligned <- align_omics_se(d$microbiome, d$metabolome, d$exposures, d$meta, id_col = "sample_id", strict = TRUE)
se2 <- prep_omics_se(aligned)
se2