| Title: | scGraphVerse: A Gene Network Analysis Package |
|---|---|
| Description: | A package for inferring, comparing, and visualizing gene networks from single-cell RNA sequencing data. It integrates multiple methods (GENIE3, GRNBoost2, ZILGM, PCzinb, and JRF) for robust network inference, supports consensus building across methods or datasets, and provides tools for evaluating regulatory structure and community similarity. GRNBoost2 requires Python package 'arboreto' which can be installed using init_py(install_missing = TRUE). This package includes adapted functions from ZILGM (Park et al., 2021), JRF (Petralia et al., 2015), and learn2count (Nguyen et al. 2023) packages with proper attribution under GPL-2 license. |
| Authors: | Francesco Cecere [aut, cre] (ORCID: <https://orcid.org/0000-0002-0329-0870>), Annamaria Carissimo [aut], Daniela De Canditiis [aut], Claudia Angelini [aut, fnd] |
| Maintainer: | Francesco Cecere <[email protected]> |
| License: | GPL-3 + file LICENSE |
| Version: | 1.3.0 |
| Built: | 2026-05-30 06:48:58 UTC |
| Source: | https://github.com/bioc/scGraphVerse |
Constructs a SummarizedExperiment container for multiple gene regulatory network adjacency matrices with shared gene space (p×p matrices).
build_network_se( networks, networkData = NULL, geneData = NULL, metadata = list() )build_network_se( networks, networkData = NULL, geneData = NULL, metadata = list() )
networks |
A list of adjacency matrices (all must have same dimensions) |
networkData |
A DataFrame with metadata for each network |
geneData |
Optional. A DataFrame with gene-level annotations |
metadata |
Optional. List of global metadata |
A SummarizedExperiment object where each assay is a p×p network
# Example 1: Building SE from a list of adjacency matrices data("toy_adj_matrix") # Create a list of network matrices net_list <- list( network1 = toy_adj_matrix, network2 = toy_adj_matrix ) # Build SummarizedExperiment network_se <- build_network_se(net_list) network_se # Example 2: Using with inferred networks data("toy_counts") networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # generate_adjacency() internally uses build_network_se() wadj_se <- generate_adjacency(networks) wadj_se# Example 1: Building SE from a list of adjacency matrices data("toy_adj_matrix") # Create a list of network matrices net_list <- list( network1 = toy_adj_matrix, network2 = toy_adj_matrix ) # Build SummarizedExperiment network_se <- build_network_se(net_list) network_se # Example 2: Using with inferred networks data("toy_counts") networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # generate_adjacency() internally uses build_network_se() wadj_se <- generate_adjacency(networks) wadj_se
Compares a consensus network to a reference network and classifies edges as True Positives (TP), False Positives (FP), or False Negatives (FN).
classify_edges(consensus_matrix, reference_matrix = NULL, use_stringdb = TRUE)classify_edges(consensus_matrix, reference_matrix = NULL, use_stringdb = TRUE)
consensus_matrix |
A SummarizedExperiment object representing the consensus network. |
reference_matrix |
A SummarizedExperiment object
representing the reference (ground truth) network. If |
use_stringdb |
Logical. If TRUE and |
If reference_matrix is NULL and use_stringdb = TRUE,
this function queries STRINGdb to generate a human high-confidence
(score > 900) physical interaction network.
A list containing:
tp_edges: Character vector of True Positive edges
fp_edges: Character vector of False Positive edges
fn_edges: Character vector of False Negative edges
consensus_graph: igraph object of consensus network
reference_graph: igraph object of reference network
edge_colors: Color vector for TP (red) and FN (blue) edges
use_stringdb: Logical indicating if STRINGdb was used
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # classify_edges expects SummarizedExperiment objects edge_class <- classify_edges(consensus, ref_se)data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # classify_edges expects SummarizedExperiment objects edge_class <- classify_edges(consensus, ref_se)
Detects gene communities within an adjacency network using one or two community detection methods, and performs pathway enrichment for each detected community.
community_path( adj_matrix, methods = "louvain", pathway_db = "KEGG", organism = c("human", "mouse"), genes_path = 5, plot = TRUE, verbose = TRUE, method_params = list(), comparison_params = list(), BPPARAM = BiocParallel::bpparam() )community_path( adj_matrix, methods = "louvain", pathway_db = "KEGG", organism = c("human", "mouse"), genes_path = 5, plot = TRUE, verbose = TRUE, method_params = list(), comparison_params = list(), BPPARAM = BiocParallel::bpparam() )
adj_matrix |
A square adjacency matrix or a SummarizedExperiment object containing a single adjacency matrix. Row and column names must correspond to gene symbols. |
methods |
A character vector of one or two community detection
methods supported by robin. If two are given, performance is
compared and the best is selected. Default: |
pathway_db |
Character string specifying the pathway database to use:
|
organism |
Character string specifying the organism: |
genes_path |
Integer. Minimum number of genes per community to run
enrichment analysis. Default: |
plot |
Logical. If |
verbose |
Logical. If |
method_params |
List of parameters for community detection methods. Common parameters include:
|
comparison_params |
List of parameters for robin comparison:
|
BPPARAM |
BiocParallel backend for parallel processing.
Default: |
If two methods are provided, the function uses
robinCompare and selects the method with higher AUC. Pathway
enrichment is done via clusterProfiler (KEGG) or via
ReactomePA (Reactome). Communities smaller than
genes_path are excluded.
A list with elements:
communities: List with best_method and a named
vector of community membership per gene.
pathways: List of enrichment results per community
(only for communities meeting size threshold).
graph: The igraph object with community
annotations.
data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # community_path now accepts SummarizedExperiment objects directly comm_cons <- community_path(consensus)data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # community_path now accepts SummarizedExperiment objects directly comm_cons <- community_path(consensus)
Convenience wrapper that computes community assignment metrics,
topological properties, and optionally visualizes the comparison.
For more control, use the individual functions:
compute_community_metrics,
compute_topology_metrics, and
plot_community_comparison.
community_similarity(control_output, predicted_list, plot = TRUE)community_similarity(control_output, predicted_list, plot = TRUE)
control_output |
A list output from |
predicted_list |
A list of lists, each output from |
plot |
Logical. If TRUE, displays plots immediately. If FALSE, no plots are displayed. Default: TRUE. |
This function requires the igraph package. If
plot = TRUE, the fmsb package is also required.
Community similarity is measured using variation of information (VI),
normalized mutual information (NMI), and adjusted Rand index (ARI).
A list containing:
community_metrics: A data frame with VI, NMI, and ARI
scores for each prediction.
topology_measures: A data frame with raw topological
metrics for each prediction.
control_topology: A list of raw topological metrics for
the ground truth network.
compute_community_metrics,
compute_topology_metrics,
plot_community_comparison
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) sim_score <- community_similarity(comm_truth, list(comm_cons))data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) sim_score <- community_similarity(comm_truth, list(comm_cons))
Convenience wrapper that classifies edges and visualizes the comparison
between consensus and reference networks. For more control, use the
individual functions: classify_edges and
plot_network_comparison.
compare_consensus( consensus_matrix, reference_matrix = NULL, false_plot = FALSE )compare_consensus( consensus_matrix, reference_matrix = NULL, false_plot = FALSE )
consensus_matrix |
A SummarizedExperiment object representing the consensus network. |
reference_matrix |
Optional. A SummarizedExperiment obj
representing the reference network. If |
false_plot |
Logical. If |
If no reference_matrix is provided, STRINGdb is queried
to generate a high-confidence physical interaction network.
A ggplot object visualizing the comparison.
Requires ggraph and ggplot2. If reference_matrix
is NULL, also requires STRINGdb. If false_plot = TRUE,
requires patchwork.
classify_edges, plot_network_comparison
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # Compare consensus to reference compare_consensus( consensus, reference_matrix = ref_se, false_plot = FALSE )data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # Compare consensus to reference compare_consensus( consensus, reference_matrix = ref_se, false_plot = FALSE )
Calculates community assignment similarity between a reference community structure and one or more predicted structures using variation of information (VI), normalized mutual information (NMI), and adjusted Rand index (ARI).
compute_community_metrics(control_output, predicted_list)compute_community_metrics(control_output, predicted_list)
control_output |
A list output from |
predicted_list |
A list of lists, each output from |
This function requires the igraph package. Lower VI values indicate better similarity (VI = 0 is perfect match). Higher NMI and ARI values indicate better similarity (both range 0-1).
A data frame with columns VI, NMI, and ARI for each prediction. Row names indicate which prediction (e.g., "Predicted_1").
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) metrics <- compute_community_metrics(comm_truth, list(comm_cons))data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) metrics <- compute_community_metrics(comm_truth, list(comm_cons))
Calculates topological properties (modularity, number of communities, density, transitivity) for one or more networks.
compute_topology_metrics(control_output, predicted_list)compute_topology_metrics(control_output, predicted_list)
control_output |
A list output from |
predicted_list |
A list of lists, each output from |
This function requires the igraph package. Topological metrics include:
Modularity: Quality of community division
Communities: Number of detected communities
Density: Proportion of actual vs. possible edges
Transitivity: Clustering coefficient
A list containing:
topology_measures: A data frame with Modularity,Communities,
Density, and Transitivity for each prediction.
control_topology: A numeric vector of the same metrics
for the control network.
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) topo <- compute_topology_metrics(comm_truth, list(comm_cons))data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) topo <- compute_topology_metrics(comm_truth, list(comm_cons))
Builds a consensus adjacency matrix from networks stored in a
SummarizedExperiment using one of three methods:
"vote", "union", or "INet".
create_consensus( adj_matrix_list, method = c("vote", "union", "INet"), weighted_list = NULL, theta = 0.04, threshold = 0.5, ncores = 1, tolerance = 0.1, nitermax = 50, verbose = FALSE )create_consensus( adj_matrix_list, method = c("vote", "union", "INet"), weighted_list = NULL, theta = 0.04, threshold = 0.5, ncores = 1, tolerance = 0.1, nitermax = 50, verbose = FALSE )
adj_matrix_list |
A SummarizedExperiment object containing binary adjacency matrices (square, 0/1) with identical dimensions and matching row/column names, or a list of such matrices. |
method |
Character string specifying the consensus strategy. One of:
|
weighted_list |
A SummarizedExperiment object containing
weighted adjacency matrices (required if |
theta |
Numeric. Tuning parameter passed to |
threshold |
Numeric between 0 and 1. Threshold for "vote" and
"INet" methods. Default is |
ncores |
Integer. Number of CPU cores to use when |
tolerance |
Numeric. Tolerance for differences between similar graphs
in INet method. Default is |
nitermax |
Integer. Maximum number of iterations for INet algorithm.
Default is |
verbose |
Logical. If TRUE, display verbose output for INet method.
Default is |
Consensus construction depends on the selected method:
Counts the presence of each edge across all
matrices and includes edges supported by at least
threshold × N matrices.
Includes any edge that appears in any matrix.
Multiplies binary matrices by corresponding
weighted matrices, normalizes the results, and applies
consensusNet to generate a consensus network.
For "INet", both binary and weighted adjacency matrices must be provided with matching dimensions.
A SummarizedExperiment object with a single assay containing the consensus adjacency matrix (binary or weighted, depending on the method). Metadata includes consensus method and parameters.
data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") head(consensus)data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") head(consensus)
Converts a list of count matrices, Seurat objects, or SingleCellExperiment objects into a MultiAssayExperiment for integrated network inference.
create_mae(datasets, colData = NULL, ...)create_mae(datasets, colData = NULL, ...)
datasets |
A named list of datasets. Each element can be:
|
colData |
Optional. A DataFrame with metadata for each experiment. If NULL, automatically generated from list names. |
... |
Additional arguments (currently unused) |
A MultiAssayExperiment object with:
experiments: List of SingleCellExperiment objects
colData: Metadata for each experiment/condition
# Load the example MAE data("toy_counts") # Extract the list of SingleCellExperiment objects sce_list <- MultiAssayExperiment::experiments(toy_counts) sce_list <- as.list(sce_list) # Create a new MAE from the SCE list mae <- create_mae(sce_list) mae# Load the example MAE data("toy_counts") # Extract the list of SingleCellExperiment objects sce_list <- MultiAssayExperiment::experiments(toy_counts) sce_list <- as.list(sce_list) # Create a new MAE from the SCE list mae <- create_mae(sce_list) mae
Applies a cutoff to weighted adjacency matrices using a percentile
estimated from shuffled versions of the original expression matrices.
Supports inference methods "GENIE3", "GRNBoost2",
and "JRF".
cutoff_adjacency( count_matrices, weighted_adjm_list, n, method = c("GENIE3", "GRNBoost2", "JRF"), quantile_threshold = 0.99, weight_function = "mean", nCores = 1, grnboost_modules = NULL, debug = FALSE )cutoff_adjacency( count_matrices, weighted_adjm_list, n, method = c("GENIE3", "GRNBoost2", "JRF"), quantile_threshold = 0.99, weight_function = "mean", nCores = 1, grnboost_modules = NULL, debug = FALSE )
count_matrices |
A MultiAssayExperiment object containing expression data from multiple experiments or conditions. |
weighted_adjm_list |
A SummarizedExperiment object containing weighted adjacency matrices (one per experiment) to threshold. |
n |
Integer. Number of shuffled replicates generated per original expression matrix. |
method |
Character string. One of |
quantile_threshold |
Numeric. The quantile used to define the cutoff.
Default is |
weight_function |
Character string or function used to symmetrize
adjacency matrices ( |
nCores |
Integer. Number of CPU cores to use for parallelization. Default is the number of workers in the current BiocParallel backend. Note: JRF uses C implementation and does not use this parameter. |
grnboost_modules |
Python modules needed for |
debug |
Logical. If |
For each input expression matrix, n shuffled versions are
generated by randomly permuting each gene’s expression across cells.
Network inference is performed on the shuffled matrices, and a cutoff
is determined as the specified quantile (quantile_threshold) of
the resulting edge weights. The original weighted adjacency matrices
are then thresholded using these estimated cutoffs.
Parallelization is handled via BiocParallel.
The methods are based on:
GENIE3: Random Forest-based inference (Huynh-Thu et al., 2010).
GRNBoost2: Gradient boosting trees using arboreto (Moerman et al., 2019).
JRF: Joint Random Forests across multiple conditions (Petralia et al., 2015).
A SummarizedExperiment object where each assay is a binary (thresholded) adjacency matrix corresponding to an input weighted matrix. Metadata includes cutoff values and method parameters.
data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]])data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]])
Downloads an RDS file from a specified URL and reads its contents into R. We used it for https://www.singlecellatlas.org
download_Atlas(file_url)download_Atlas(file_url)
file_url |
Character; URL of the RDS file to download. |
This function uses httr to perform the download. The RDS file is read directly from a raw connection without saving to disk. An internet connection is required.
If the download fails (e.g., invalid URL, server error), an informative error message is returned.
An R object loaded from the downloaded RDS file.
url <- "https://zenodo.org/records/15511027/files/sce_obj.rds?download=1" atlas_data <- download_Atlas(url)url <- "https://zenodo.org/records/15511027/files/sce_obj.rds?download=1" atlas_data <- download_Atlas(url)
Extracts expression data from a MultiAssayExperiment object, modifies cell identifiers by appending a unique experiment index (e.g., "-m1", "-m2", etc.), and merges the datasets into a single object.
earlyj(input_list, rowg = TRUE)earlyj(input_list, rowg = TRUE)
input_list |
A MultiAssayExperiment object containing expression data from multiple experiments or conditions. |
rowg |
Logical. If |
For matrices, this function optionally transposes the input before combining.
For Seurat and SingleCellExperiment objects, only features
(genes) common across all input datasets are retained before merging.
The cell names are suffixed with "-m1", "-m2", etc., according to their
original list position. The result is returned as a MultiAssayExperiment
with a single merged experiment.
A MultiAssayExperiment object containing a single merged experiment with modified (unique) cell names.
data(toy_counts) merged_mae <- earlyj(toy_counts) merged_maedata(toy_counts) merged_mae <- earlyj(toy_counts) merged_mae
Query PubMed for literature evidence supporting predicted gene–gene interactions.
edge_mining( predicted_list, ground_truth, delay = 1, query_field = "Title/Abstract", query_edge_types = c("TP", "FP", "FN"), max_retries = 10, BPPARAM = BiocParallel::bpparam() )edge_mining( predicted_list, ground_truth, delay = 1, query_field = "Title/Abstract", query_edge_types = c("TP", "FP", "FN"), max_retries = 10, BPPARAM = BiocParallel::bpparam() )
predicted_list |
A list of predicted adjacency matrices (row and column names are gene symbols), or a SummarizedExperiment object containing adjacency matrices. |
ground_truth |
A 0/1 adjacency matrix with row and column names. |
delay |
Numeric. Seconds to wait between consecutive queries (default = 1). |
query_field |
Character. PubMed search field. Options: "Title/Abstract" (default), "Title", "Abstract". |
query_edge_types |
Character vector. Edge types to query: c("TP", "FP", "FN") (default all). |
max_retries |
Integer. Max retries for PubMed queries (default = 10). |
BPPARAM |
A BiocParallel parameter object. Default = bpparam(). |
This function compares predicted adjacency matrices against a ground truth matrix, identifies edge types (TP, FP, FN), and queries PubMed for each gene pair. Returns counts of hits, PMIDs, and query status.
A named list of data.frames. Each data.frame has columns:
First gene in interaction
Second gene
One of "TP", "FP", or "FN"
Number of PubMed hits
Comma-separated PubMed IDs or NA
One of "hits_found", "no_hits", or "error"
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") head(consensus) em <- edge_mining(consensus, toy_adj_matrix, query_edge_types = "TP")data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) consensus <- create_consensus(binary_se, method = "union") head(consensus) em <- edge_mining(consensus, toy_adj_matrix, query_edge_types = "TP")
Constructs adjacency matrices from a list of data frames (network edge lists) and returns them in a SummarizedExperiment object.
generate_adjacency(df_list, nCores = 1)generate_adjacency(df_list, nCores = 1)
df_list |
A list of data frames. Each data frame must have three columns:
|
nCores |
Integer. Number of CPU cores to use for parallel processing. Defaults to the number of available workers from the current BiocParallel backend. |
The function first identifies all unique genes across all data frames to define the matrix dimensions. For each interaction table, it places the corresponding weights at the appropriate gene-pair positions. Parallelization is handled by BiocParallel for improved performance on multiple datasets.
Missing weights (NA) are ignored during construction. Only
gene pairs matching the global gene list are inserted.
A SummarizedExperiment object where each assay is a square numeric adjacency matrix (p×p genes). Diagonal entries are set to zero (no self-interactions).
data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) # returns SummarizedExperiment head(wadj_se[[1]])data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) # returns SummarizedExperiment head(wadj_se[[1]])
Infers weighted gene regulatory networks (GRNs) from one or more
expression matrices using different inference methods:
"GENIE3", "GRNBoost2", "ZILGM",
"JRF", or "PCzinb".
infer_networks( count_matrices_list, method = c("GENIE3", "GRNBoost2", "ZILGM", "JRF", "PCzinb"), adjm = NULL, nCores = 1, grnboost_modules = NULL, genie3_params = list(), grnboost2_params = list(), zilgm_params = list(), jrf_params = list(), pczinb_params = list(), verbose = FALSE )infer_networks( count_matrices_list, method = c("GENIE3", "GRNBoost2", "ZILGM", "JRF", "PCzinb"), adjm = NULL, nCores = 1, grnboost_modules = NULL, genie3_params = list(), grnboost2_params = list(), zilgm_params = list(), jrf_params = list(), pczinb_params = list(), verbose = FALSE )
count_matrices_list |
A MultiAssayExperiment object containing expression data from multiple experiments or conditions. |
method |
Character string. Inference method to use. One of:
|
adjm |
Optional. Reference adjacency matrix for matching dimensions
when using |
nCores |
Integer. Number of CPU cores to use for parallelization. Default: 1. |
grnboost_modules |
Python modules required for |
genie3_params |
List of parameters for GENIE3 method:
|
grnboost2_params |
List of parameters for GRNBoost2 method:
|
zilgm_params |
List of parameters for ZILGM method:
|
jrf_params |
List of parameters for JRF method:
|
pczinb_params |
List of parameters for PCzinb method:
|
verbose |
Logical. If TRUE, display messages. Default: FALSE. |
Each expression matrix is preprocessed automatically depending
on its object type (Seurat, SingleCellExperiment, or
plain matrix).
Parallelization behavior:
GENIE3: No external parallelization; internal
nCores parameter controls computation.
ZILGM: Uses nCores parameter for internal
parallelization.
GRNBoost2 and PCzinb: Parallelized across matrices using BiocParallel.
JRF: Joint modeling of all matrices together using optimized C implementation.
Methods are based on:
GENIE3: Random Forest-based inference (Huynh-Thu et al., 2010).
GRNBoost2: Gradient boosting trees using arboreto (Moerman et al., 2019).
ZILGM: Zero-Inflated Graphical Models for scRNA-seq (Zhang et al., 2021).
JRF: Joint Random Forests across multiple conditions (Petralia et al., 2015).
PCzinb: Pairwise correlation under ZINB models (Nguyen et al., 2023).
A list of inferred networks:
For "GENIE3", "GRNBoost2", "ZILGM",
and "PCzinb", a list of inferred network objects (edge
lists or adjacency matrices).
For "JRF", a list of data frames with inferred edge
lists for each condition or dataset.
data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]])data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]])
Sets up the Python environment and lazily loads modules required for
running GRNBoost2: arboreto, pandas, and numpy.
Automatically installs missing Python packages if requested.
init_py( python_path = "/usr/bin/python3", required = TRUE, install_missing = FALSE, install_method = "auto", verbose = TRUE )init_py( python_path = "/usr/bin/python3", required = TRUE, install_missing = FALSE, install_method = "auto", verbose = TRUE )
python_path |
Character string. Path to the Python executable,
e.g., |
required |
Logical. If |
install_missing |
Logical. If |
install_method |
Character string. Installation method when
|
verbose |
Logical. If |
Uses reticulate to bind R to the specified Python
interpreter and lazily import modules needed for GRNBoost2. If
install_missing = TRUE, automatically installs the 'arboreto'
package using the specified method if not found.
A list with three Python module objects:
arboreto: GRNBoost2 algorithm module.
pandas: Data handling module.
numpy: Numerical operations module.
# Initialize Python environment (handles missing modules gracefully) tryCatch( { modules <- init_py(required = FALSE) }, error = function(e) { message("Python environment not available: ", e$message) } )# Initialize Python environment (handles missing modules gracefully) tryCatch( { modules <- init_py(required = FALSE) }, error = function(e) { message("Python environment not available: ", e$message) } )
This function performs structure learning for count data using various PC algorithms adapted for different distributional assumptions including Poisson, Negative Binomial, and Zero-Inflated Negative Binomial models.
PCzinb( x, method = c("poi", "nb", "zinb0", "zinb1"), alpha = NULL, maxcard = 2, extend = TRUE, nCores = 1, whichAssay = "processed", ... )PCzinb( x, method = c("poi", "nb", "zinb0", "zinb1"), alpha = NULL, maxcard = 2, extend = TRUE, nCores = 1, whichAssay = "processed", ... )
x |
A matrix of count data (n × p), SummarizedExperiment, or SingleCellExperiment object. For matrix input, rows are samples and columns are genes. |
method |
The algorithm used to estimate the graph: |
alpha |
The significance level of the tests. Default: 2 * pnorm(nrow(x)^0.2, lower.tail = FALSE). |
maxcard |
The upper bound of the cardinality of the conditional sets K. Default: 2. |
extend |
If TRUE, considers the union of the tests; if FALSE, considers the intersection. Default: TRUE. |
nCores |
Number of cores for parallel processing. Default: 1. |
whichAssay |
The assay to use as input (for SummarizedExperiment or SingleCellExperiment objects). Default: "processed". |
... |
Additional arguments (currently unused). |
PCzinb performs structure learning using PC algorithms for count data. Different methods handle different distributional assumptions:
poi: Poisson distribution
nb: Negative Binomial distribution
zinb0: Zero-Inflated NB with structure only in mean parameter
zinb1: Zero-Inflated NB with structure in both mean and
zero-inflation parameters
For SummarizedExperiment and SingleCellExperiment inputs, if the specified
whichAssay is "processed" but not found, the function will use the first
assay and issue a warning recommending QPtransform().
If x is a matrix: the estimated adjacency matrix of the graph
If x is a SummarizedExperiment: the object with adjacency matrix
stored in metadata as adj_mat
If x is a SingleCellExperiment: the object with adjacency matrix stored as rowPair
# Matrix input mat <- matrix(rpois(50, 5), nrow = 10) PCzinb(mat, method = "poi") # SummarizedExperiment input library(SummarizedExperiment) se <- SummarizedExperiment(matrix(rpois(50, 5), ncol = 10)) se_result <- PCzinb(se, method = "poi") # SingleCellExperiment input library(SingleCellExperiment) sce <- SingleCellExperiment(matrix(rpois(50, 5), ncol = 10)) sce_result <- PCzinb(sce, method = "poi") rowPair(sce_result)# Matrix input mat <- matrix(rpois(50, 5), nrow = 10) PCzinb(mat, method = "poi") # SummarizedExperiment input library(SummarizedExperiment) se <- SummarizedExperiment(matrix(rpois(50, 5), ncol = 10)) se_result <- PCzinb(se, method = "poi") # SingleCellExperiment input library(SingleCellExperiment) sce <- SingleCellExperiment(matrix(rpois(50, 5), ncol = 10)) sce_result <- PCzinb(sce, method = "poi") rowPair(sce_result)
Creates visualization plots for community assignment metrics and topological properties comparison.
plot_community_comparison( community_metrics, topology_measures, control_topology )plot_community_comparison( community_metrics, topology_measures, control_topology )
community_metrics |
A data frame with VI, NMI, and ARI scores
(output from |
topology_measures |
A data frame with Modularity, Communities,
Density, and Transitivity (from |
control_topology |
A named numeric vector of control network
topology metrics (from |
This function requires the fmsb package for radar chart visualization. The radar plot shows normalized community similarity metrics. The bar plots compare raw topological properties between predicted and control networks.
Invisibly returns NULL. Displays a radar plot for community metrics and bar plots for topology comparison.
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) comm_metrics <- compute_community_metrics(comm_truth, list(comm_cons)) topo_metrics <- compute_topology_metrics(comm_truth, list(comm_cons)) plot_community_comparison( comm_metrics, topo_metrics$topology_measures, topo_metrics$control_topology )data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) consensus <- create_consensus(binary_se, method = "union") comm_cons <- community_path(consensus) comm_truth <- community_path(toy_adj_matrix) comm_metrics <- compute_community_metrics(comm_truth, list(comm_cons)) topo_metrics <- compute_topology_metrics(comm_truth, list(comm_cons)) plot_community_comparison( comm_metrics, topo_metrics$topology_measures, topo_metrics$control_topology )
Creates visualization plots comparing consensus and reference networks, showing True Positives (TP), False Negatives (FN), and optionally False Positives (FP) edges.
plot_network_comparison(edge_classification, show_fp = FALSE)plot_network_comparison(edge_classification, show_fp = FALSE)
edge_classification |
A list output from |
show_fp |
Logical. If TRUE, displays False Positive edges in a separate plot. Default: FALSE. |
This function requires the ggraph and ggplot2
packages. If show_fp = TRUE, the patchwork package is
also required.
The plots differentiate:
TP/CE (True Positives/Confirmed Edges): Red edges present in both
FN/ME (False Negatives/Missing Edges): Blue edges in reference only
FP/EE (False Positives/Extra Edges): Edges in consensus only
If STRINGdb was used for reference, labels are CE/ME/EE. Otherwise, TP/FN/FP.
A ggplot object visualizing the comparison. If
show_fp = TRUE, a combined plot using patchwork is returned.
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # Classify edges edge_class <- classify_edges(consensus, ref_se) # Plot comparison plot_network_comparison(edge_class, show_fp = FALSE)data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff (returns SummarizedExperiment) binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1 ) # Create consensus (returns SummarizedExperiment) consensus <- create_consensus(binary_se, method = "union") # Wrap reference matrix in SummarizedExperiment ref_se <- build_network_se(list(reference = toy_adj_matrix)) # Classify edges edge_class <- classify_edges(consensus, ref_se) # Plot comparison plot_network_comparison(edge_class, show_fp = FALSE)
Generates and arranges multiple graph visualizations from a list of adjacency matrices. Each matrix is converted into an undirected igraph object and visualized using a force-directed layout via ggraph.
plotg(adj_matrix_list)plotg(adj_matrix_list)
adj_matrix_list |
A list of square, symmetric adjacency matrices with zeros on the diagonal (no self-loops), or a SummarizedExperiment object containing such matrices as assays. Each matrix represents an undirected graph. |
Each adjacency matrix is validated to ensure it is square and symmetric. Disconnected nodes (degree zero) are removed prior to visualization. Graphs are visualized with a force-directed layout using ggraph, and multiple plots are arranged into a grid with gridExtra.
Each subplot title includes the graph index, number of nodes, and number of edges.
A grid of plots displaying all valid graphs in the input list.
This function requires the following packages: igraph, ggraph, and gridExtra. If any are missing, an informative error will be thrown.
data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) plotg(binary_se)data(toy_counts) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) head(binary_se[[1]]) plotg(binary_se)
Computes and visualizes Receiver Operating Characteristic (ROC) curves for predicted adjacency matrices stored in a SummarizedExperiment object, compared against a binary ground truth network.
plotROC(predicted_se, ground_truth, plot_title, is_binary = FALSE)plotROC(predicted_se, ground_truth, plot_title, is_binary = FALSE)
predicted_se |
A SummarizedExperiment object containing
predicted adjacency matrices as assays. Each matrix must share dimnames
with |
ground_truth |
A square binary matrix indicating true interactions (1)
in the upper triangle. Must match dims and names of each assay in
|
plot_title |
Character string. Title for the ROC plot. |
is_binary |
Logical. If |
For binary matrices, a single TPR/FPR point is computed per matrix. For weighted ones, a full ROC curve is computed from continuous scores. Diagonals are ignored; symmetry is not enforced.
A list with:auc: data frame of AUC per matrix.plot: the ROC plot (via ggplot2).
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # plotROC now accepts SummarizedExperiment directly roc_res <- plotROC( swadj_se, toy_adj_matrix, plot_title = "ROC Curve: GENIE3", is_binary = FALSE ) roc_res$plot auc_joint <- roc_res$aucdata(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate and symmetrize adjacency matrices (returns SummarizedExperiment) wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # plotROC now accepts SummarizedExperiment directly roc_res <- plotROC( swadj_se, toy_adj_matrix, plot_title = "ROC Curve: GENIE3", is_binary = FALSE ) roc_res$plot auc_joint <- roc_res$auc
Computes classification metrics by comparing predicted adjacency matrices to a ground truth binary network and visualizes the performance via a radar (spider) plot.
pscores(ground_truth, predicted_list, zero_diag = TRUE)pscores(ground_truth, predicted_list, zero_diag = TRUE)
ground_truth |
A square binary adjacency matrix representing the ground truth network. Values must be 0 or 1. Only the upper triangle is used for evaluation. |
predicted_list |
A list of predicted adjacency matrices to evaluate,
or a SummarizedExperiment object containing such matrices
as assays. Each matrix must have the same dimensions and row/column
names as |
zero_diag |
Logical. If |
For each predicted matrix, the confusion matrix is computed using the upper triangle (non-self edges). Metrics including True Positive Rate (TPR), False Positive Rate (FPR), Precision, F1-score, and Matthews Correlation Coefficient (MCC) are calculated.
A radar plot is automatically generated summarizing the key scores across matrices.
A list with one element:Statistics: Data frame of evaluation metrics (TP, TN, FP, FN,
TPR, FPR, Precision, F1, MCC) for each predicted matrix.
Requires the fmsb, dplyr, and tidyr packages.
data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) pscores_data <- pscores(toy_adj_matrix, binary_se)data(toy_counts) data(toy_adj_matrix) # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean") # Apply cutoff binary_se <- cutoff_adjacency( count_matrices = toy_counts, weighted_adjm_list = swadj_se, n = 1, method = "GENIE3", quantile_threshold = 0.95, nCores = 1, debug = TRUE ) pscores_data <- pscores(toy_adj_matrix, binary_se)
Identifies and returns the top n most highly expressed genes across
all cells or within a specific cell type. Supports objects of class
Seurat, SingleCellExperiment, or a numeric
expression matrix (genes × cells).
selgene( object, top_n, cell_type = NULL, cell_type_col = "cell_type", assay = NULL, remove_mt = FALSE, remove_rib = FALSE )selgene( object, top_n, cell_type = NULL, cell_type_col = "cell_type", assay = NULL, remove_mt = FALSE, remove_rib = FALSE )
object |
A Seurat object, SingleCellExperiment object, or numeric matrix (genes × cells). |
top_n |
Integer. Number of top expressed genes to return. |
cell_type |
Optional string. If provided, filters the expression matrix to only include cells of this type. |
cell_type_col |
Character. Name of the column in metadata (Seurat
|
assay |
Character. For SingleCellExperiment objects only. Name of the
assay to use. If |
remove_mt |
Logical. If |
remove_rib |
Logical. If |
The function assumes that log-normalized values are available in the
"data" slot (for Seurat objects) or the "logcounts" assay
(for SingleCellExperiment). If raw counts are provided as a matrix, no
transformation is applied.
Optional filtering is available to exclude mitochondrial genes
("^MT-") and ribosomal genes ("^RP[SL]"), which may otherwise
dominate the top expressed genes.
A character vector of the top n most highly expressed gene
names.
When using a Seurat object, the function retrieves the log-normalized data
from the default assay's "data" slot. For SingleCellExperiment, it
uses the specified assay (default is "logcounts"). For matrices, no
checks or transformations are applied, and subsetting by cell type is not
supported.
Mitochondrial and ribosomal gene removal is based on regular expressions
matching gene names. These should follow standard naming conventions (e.g.,
MT-ND1, RPL13A, RPS6).
data(toy_counts) genes <- selgene( object = toy_counts[[1]], top_n = 5, cell_type = "T_cells", cell_type_col = "CELL_TYPE", remove_rib = TRUE, remove_mt = TRUE, assay = "counts" )data(toy_counts) genes <- selgene( object = toy_counts[[1]], top_n = 5, cell_type = "T_cells", cell_type_col = "CELL_TYPE", remove_rib = TRUE, remove_mt = TRUE, assay = "counts" )
Constructs weighted and binary adj matrices for physical protein-protein interactions using a POST request to the STRING database API.
stringdb_adjacency( genes, species = 9606, required_score = 400, keep_all_genes = TRUE, verbose = TRUE )stringdb_adjacency( genes, species = 9606, required_score = 400, keep_all_genes = TRUE, verbose = TRUE )
genes |
A character vector of gene symbols or identifiers, e.g.,
|
species |
Integer. NCBI taxonomy ID of the species. Default is
|
required_score |
Integer in \[0,1000\]. Minimum confidence score for
interactions. Default is |
keep_all_genes |
Logical. If |
verbose |
Logical. If |
This function:
Maps input genes to STRING internal IDs.
Uses a POST request to retrieve physical protein-protein interactions from STRING.
Builds a weighted adjacency matrix using the STRING combined score.
Builds a binary adjacency matrix indicating presence/absence.
Genes not mapped to STRING are optionally retained as zero rows/columns if
keep_all_genes = TRUE.
A list containing:
weighted: A square numeric adjacency matrix with scores as
weights.
binary: A corresponding binary (0/1) adjacency matrix.
Requires packages: STRINGdb, httr, jsonlite.
data(toy_counts) genes <- selgene( object = toy_counts[[1]], top_n = 5, cell_type = "T_cells", cell_type_col = "CELL_TYPE", remove_rib = TRUE, remove_mt = TRUE, assay = "counts" ) str_res <- stringdb_adjacency( genes = genes, species = 9606, required_score = 900, keep_all_genes = FALSE ) wadj_truth <- str_res$weighted toy_adj_matrix <- str_res$binarydata(toy_counts) genes <- selgene( object = toy_counts[[1]], top_n = 5, cell_type = "T_cells", cell_type_col = "CELL_TYPE", remove_rib = TRUE, remove_mt = TRUE, assay = "counts" ) str_res <- stringdb_adjacency( genes = genes, species = 9606, required_score = 900, keep_all_genes = FALSE ) wadj_truth <- str_res$weighted toy_adj_matrix <- str_res$binary
Symmetrizes each adjacency matrix in a SummarizedExperiment by ensuring entries (i, j) and (j, i) are identical, using a specified combination function.
symmetrize(matrix_list, weight_function = "mean", nCores = 1)symmetrize(matrix_list, weight_function = "mean", nCores = 1)
matrix_list |
A SummarizedExperiment object containing adjacency matrices to symmetrize. |
weight_function |
Character string or function. Method to combine
entries (i, j) and (j, i). Options include |
nCores |
Integer. Number of CPU cores to use for parallel processing. Defaults to the number of available workers in the current BiocParallel backend. |
For each pair of off-diagonal elements (i, j) and (j, i):
If one value is zero, the non-zero value is used.
If both are non-zero, they are combined using the specified
weight_function.
Diagonal entries are preserved as-is and not modified.
Parallelization is managed via BiocParallel for improved performance.
A SummarizedExperiment object with symmetrized
adjacency matrices, where for each matrix for
all .
data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean")data("toy_counts") # Infer networks (toy_counts is already a MultiAssayExperiment) networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 ) head(networks[[1]]) # Generate adjacency matrices wadj_se <- generate_adjacency(networks) swadj_se <- symmetrize(wadj_se, weight_function = "mean")
An adjacency matrix generated using real dataset and STRINGdb for demonstrating functions in the scGraphVerse package.
data(toy_adj_matrix)data(toy_adj_matrix)
A matrix of dimension 10x10.
data(toy_adj_matrix) str(toy_adj_matrix)data(toy_adj_matrix) str(toy_adj_matrix)
A simulated dataset generated using zinb_simdata() for demonstrating network inference functions in the scGraphVerse package.
data(toy_counts)data(toy_counts)
A MultiAssayExperiment object containing 3 SingleCellExperiment objects (experiments), each with 10 genes x 40 cells. The data was simulated using a known ground truth network (toy_adj_matrix) with zero-inflated negative binomial distributions.
data(toy_counts) toy_counts # Access individual experiments MultiAssayExperiment::experiments(toy_counts) # Use directly with infer_networks networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 )data(toy_counts) toy_counts # Access individual experiments MultiAssayExperiment::experiments(toy_counts) # Use directly with infer_networks networks <- infer_networks( count_matrices_list = toy_counts, method = "GENIE3", nCores = 1 )
Simulates one or more count matrices following a zero-inflated negative binomial (ZINB) distribution, incorporating gene-gene interaction structures and cell-specific sequencing depth variation.
zinb_simdata( n, p, B, mu_range, mu_noise, theta, pi, kmat = 1, depth_range = NA )zinb_simdata( n, p, B, mu_range, mu_noise, theta, pi, kmat = 1, depth_range = NA )
n |
Integer. Number of cells (samples) in each simulated matrix. |
p |
Integer. Number of genes (features) in each simulated matrix. |
B |
A symmetric binary adjacency matrix (0/1) defining gene-gene connectivity. Row and column names correspond to gene names. |
mu_range |
List of numeric vectors (length 2 each). Range of gene expression means for each simulated matrix. |
mu_noise |
Numeric vector. Mean of background noise for each matrix. |
theta |
Numeric vector. Dispersion parameters of the negative
binomial distribution for each matrix. Smaller |
pi |
Numeric vector. Probability of excess zeros ( |
kmat |
Integer. Number of count matrices to simulate. Default is 1. |
depth_range |
Numeric vector of length 2 or |
Each simulated matrix:
Generates gene expression values based on a ZINB model.
Modulates expression using the adjacency matrix B.
Applies random sequencing depth scaling if
depth_range is provided.
Useful for benchmarking single-cell RNA-seq network inference methods with dropout events and network structure.
A list containing kmat matrices. Each matrix has:
Rows representing cells (cell_1, ..., cell_n).
Columns representing genes (rownames(B)).
Count values following a ZINB distribution.
data(toy_adj_matrix) nodes <- nrow(toy_adj_matrix) sims <- zinb_simdata( n = 50, p = nodes, B = toy_adj_matrix, mu_range = list(c(1, 4), c(1, 7), c(1, 10)), mu_noise = c(1, 3, 5), theta = c(1, 0.7, 0.5), pi = c(0.2, 0.2, 0.2), kmat = 3, depth_range = c(0.8 * nodes * 3, 1.2 * nodes * 3) )data(toy_adj_matrix) nodes <- nrow(toy_adj_matrix) sims <- zinb_simdata( n = 50, p = nodes, B = toy_adj_matrix, mu_range = list(c(1, 4), c(1, 7), c(1, 10)), mu_noise = c(1, 3, 5), theta = c(1, 0.7, 0.5), pi = c(0.2, 0.2, 0.2), kmat = 3, depth_range = c(0.8 * nodes * 3, 1.2 * nodes * 3) )