Package 'MSstatsBioNet'

Title: Network Analysis for MS-based Proteomics Experiments
Description: A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques.
Authors: Anthony Wu [aut, cre] (ORCID: <https://orcid.org/0009-0001-7391-9902>), Olga Vitek [aut] (ORCID: <https://orcid.org/0000-0003-1728-1104>)
Maintainer: Anthony Wu <[email protected]>
License: file LICENSE
Version: 1.5.1
Built: 2026-06-16 18:07:43 UTC
Source: https://github.com/bioc/MSstatsBioNet

Help Index


Populate Entity Information in Data Frame

Description

Initialises the three entity grounding columns and dispatches to the appropriate populator: the INDRA cogex path for UniProt-based inputs, the Gilda grounding path for name-based inputs (HGNC name / metabolite).

Usage

.populateEntityInformationInDataFrame(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

A character string specifying the type of protein ID.

Value

A data frame with populated entity grounding columns.


Populate entity grounding columns via Gilda

Description

Grounds each GlobalProtein text through Gilda. For "Hgnc_Name" the response is filtered to HGNC candidates (and restricted to human via the organism filter); for "Metabolite" every grounding namespace Gilda returns is kept. Multi-grounded inputs are semicolon-joined and positionally aligned across all three Entity columns.

Usage

.populateEntityInformationWithGilda(df, proteinIdType)

Arguments

df

A data frame with a GlobalProtein column.

proteinIdType

One of "Hgnc_Name" or "Metabolite".

Value

The data frame with EntityNamespace, EntityId, EntityName set for resolved rows.
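
The semicolon-joined, positionally aligned layout described above can be unpacked into one row per grounding candidate with base R. The following is a minimal sketch, not part of the package: the helper split_field, the key column, and all identifiers and names in the toy data frame are placeholders made up for illustration.

```r
# Illustrative rows only: identifiers and names are placeholders,
# not real Gilda output.
annotated <- data.frame(
  GlobalProtein   = c("geneA", "compoundB"),
  EntityNamespace = c("HGNC", "CHEBI;PUBCHEM"),
  EntityId        = c("1097", "28748;0000"),
  EntityName      = c("nameA", "nameB1;nameB2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a semicolon-joined column into a list of pieces.
split_field <- function(x) strsplit(x, ";", fixed = TRUE)
ns  <- split_field(annotated$EntityNamespace)
ids <- split_field(annotated$EntityId)
nms <- split_field(annotated$EntityName)

# Positional alignment means every row has the same number of pieces
# in all three Entity columns.
stopifnot(all(lengths(ns) == lengths(ids)), all(lengths(ns) == lengths(nms)))

# One row per grounding candidate.
long <- data.frame(
  GlobalProtein   = rep(annotated$GlobalProtein, lengths(ns)),
  EntityNamespace = unlist(ns),
  EntityId        = unlist(ids),
  EntityName      = unlist(nms),
  stringsAsFactors = FALSE
)

# "namespace:identifier" key, the format used by force_include_other.
long$key <- paste(long$EntityNamespace, long$EntityId, sep = ":")
long
```

The i-th namespace, identifier, and name of a row always describe the same candidate, so splitting all three columns on ";" and recombining them by position loses no information.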


Populate entity grounding columns via INDRA cogex APIs

Description

Converts UniprotId to an HGNC id via the INDRA cogex endpoint, then looks up the canonical HGNC name. Sets EntityNamespace = "HGNC" for any row whose UniProt resolved.

Usage

.populateEntityInformationWithIndraCogex(df)

Arguments

df

A data frame with a populated UniprotId column.

Value

The data frame with EntityNamespace, EntityId, EntityName set for resolved rows.


Populate Kinase Info in Data Frame

Description

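Populates kinase information for each row of the data frame. When proteinIdType is "Metabolite", the gene-only flag is set to NA and no API call is made.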

Usage

.populateKinaseInfoInDataFrame(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

The proteinIdType supplied by the caller. Gene-only flags are NA (no API call) when this is "Metabolite".

Value

A data frame with populated kinase information.


Populate Phosphatase Info in Data Frame

Description

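Populates phosphatase information for each row of the data frame. When proteinIdType is "Metabolite", the gene-only flag is set to NA and no API call is made.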

Usage

.populatePhophataseInfoInDataFrame(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

The proteinIdType supplied by the caller. Gene-only flags are NA (no API call) when this is "Metabolite".

Value

A data frame with populated phosphatase information.


Populate Transcription Factor Info in Data Frame

Description

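Populates transcription factor information for each row of the data frame. When proteinIdType is "Metabolite", the gene-only flag is set to NA and no API call is made.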

Usage

.populateTranscriptionFactorInfoInDataFrame(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

The proteinIdType supplied by the caller. Gene-only flags are NA (no API call) when this is "Metabolite".

Value

A data frame with populated transcription factor information.


Populate Uniprot IDs in Data Frame

Description

Populates Uniprot IDs in the data frame, interpreting the protein identifiers according to proteinIdType.

Usage

.populateUniprotIdsInDataFrame(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

A character string specifying the type of protein ID.

Value

A data frame with populated Uniprot IDs.


Validate Annotate Protein Info Input

Description

Validates the df and proteinIdType inputs of annotateProteinInfoFromIndra and throws an error if validation fails.

Usage

.validateAnnotateProteinInfoFromIndraInput(df, proteinIdType)

Arguments

df

A data frame containing protein information.

proteinIdType

The proteinIdType supplied by the caller.

Value

None. Throws an error if validation fails.


Annotate Protein Information from INDRA

Description

This function standardizes entity identifiers from protein, compound, or gene inputs to a unified namespace using ID conversion from INDRA cogex or Gilda grounding.

Usage

annotateProteinInfoFromIndra(df, proteinIdType)

Arguments

df

output of groupComparison function's comparisonResult table. Must contain a Protein column whose values are interpreted according to proteinIdType.

proteinIdType

A character string specifying the type of analyte identifier in the Protein column. One of "Uniprot", "Uniprot_Mnemonic", "Hgnc_Name", or "Metabolite". The "Metabolite" value treats inputs as metabolite names and grounds them through Gilda, keeping whatever namespace Gilda returns (CHEBI / PUBCHEM / CHEMBL / ...).

Value

A data frame with the following columns:

Protein

Character. The original identifier from the input.

GlobalProtein

Character. The input identifier with the PTM site suffix (typically _<amino acid><site number>, e.g. _S148) stripped, used as the grounding key.

UniprotId

Character. The Uniprot ID of the protein, or NA for "Hgnc_Name" and "Metabolite" inputs.

EntityNamespace

Character. The grounding namespace (e.g. "HGNC", "CHEBI"). When a single input grounds to multiple candidates, namespaces are semicolon-joined and positionally aligned with EntityId and EntityName.

EntityId

Character. The bare grounding identifier within its namespace (e.g. "1097" for HGNC, "28748" for CHEBI). Semicolon-joined when multi-grounded.

EntityName

Character. The canonical display name from the grounding source. Semicolon-joined when multi-grounded.

IsTranscriptionFactor

Logical. NA for proteinIdType == "Metabolite".

IsKinase

Logical. NA for proteinIdType == "Metabolite".

IsPhosphatase

Logical. NA for proteinIdType == "Metabolite".

Examples

df <- data.frame(Protein = c("CLH1_HUMAN"))
annotated_df <- annotateProteinInfoFromIndra(df, "Uniprot_Mnemonic")
head(annotated_df)
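
The GlobalProtein column is described as the input identifier with its PTM site suffix removed. A minimal sketch of that kind of stripping, assuming the suffix is one or more _<amino acid><site number> groups at the end of the identifier; strip_ptm_site and its regular expression are hypothetical, and the pattern the package actually uses may differ.

```r
# Hypothetical helper: drop trailing PTM site groups such as "_S148".
strip_ptm_site <- function(x) sub("(_[A-Z][0-9]+)+$", "", x)

ids <- c("P12345_S148", "P12345_S148_T150", "CLH1_HUMAN", "P12345")
stripped <- strip_ptm_site(ids)
stripped
```

The pattern requires a single letter followed by digits after each underscore, so a UniProt mnemonic such as CLH1_HUMAN is left unchanged.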

Bootstrap the topic decomposition to find each topic's robust top words

Description

Refits the NMF topic model on many bootstrap resamples of the papers and reports, for every topic, how reliably each word stays among the topic's top terms. This separates words that genuinely characterise a topic from words that only surface in a single lucky fit, and lets you see how the top-word lists change with and without the PPI view (run once per include_ppi).

Usage

bootstrapTopicModels(
  subnetwork,
  n_boot = 50,
  n_topics = 5,
  include_ppi = TRUE,
  n_top_terms = 10,
  min_term_count = 2,
  max_iter = 200,
  tol = 1e-04,
  seed = 1
)

Arguments

subnetwork

list with nodes and edges data.frames, e.g. the output of getSubnetworkFromIndra.

n_boot

number of bootstrap resamples. Default 50.

n_topics

number of topics (rank of the factorization). Default 5.

include_ppi

logical; factorize the PPI/edge view jointly with the text (TRUE, default) or use paper words only (FALSE). See decomposeSubnetworkByTopic.

n_top_terms

number of top words that define a topic's "top list" in each resample (the cutoff for the selection-frequency tally). Default 10.

min_term_count

minimum corpus frequency for a word to be kept when building the text matrix. Default 2.

max_iter

maximum number of NMF multiplicative-update iterations. Default 200.

tol

relative-change tolerance for NMF early stopping. Default 1e-4.

seed

random seed for the reference fit, the resampling, and each bootstrap NMF. Default 1.

Details

Because topic indices are arbitrary across fits (label switching), each resample's topics are first aligned to a reference fit on the full data by cosine similarity of their topic-word vectors. The papers are resampled with replacement; the word vocabulary is held fixed (from the full data) so topics remain comparable across resamples, and the NMF seed is held fixed so the variability reported reflects data resampling rather than random initialization.
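
The alignment step can be illustrated with a small base R sketch. It assumes topic-word matrices with one row per topic and uses a greedy one-to-one matching on cosine similarity; cosine_sim and align_topics are hypothetical names, and the matching strategy used inside the package may differ.

```r
# Cosine similarity between the rows of A and the rows of B.
cosine_sim <- function(A, B) {
  An <- A / sqrt(rowSums(A^2))
  Bn <- B / sqrt(rowSums(B^2))
  An %*% t(Bn)
}

# Reorder the rows of H_boot so that row i best matches row i of H_ref.
# Greedy matching: repeatedly take the most similar unmatched pair.
align_topics <- function(H_ref, H_boot) {
  sim  <- cosine_sim(H_ref, H_boot)
  k    <- nrow(H_ref)
  perm <- integer(k)
  for (i in seq_len(k)) {
    best <- which(sim == max(sim), arr.ind = TRUE)[1, ]
    perm[best[["row"]]] <- best[["col"]]
    sim[best[["row"]], ] <- -Inf   # reference topic is now matched
    sim[, best[["col"]]] <- -Inf   # resample topic is now matched
  }
  H_boot[perm, , drop = FALSE]
}

# Toy reference fit (3 topics x 4 words) and a resample whose topics
# are the same vectors in a different order.
H_ref  <- rbind(c(5, 1, 0, 0), c(0, 4, 3, 0), c(0, 0, 1, 6))
H_boot <- H_ref[c(3, 1, 2), ]
aligned <- align_topics(H_ref, H_boot)
aligned
```

After alignment, row i of every resample refers to the same topic as row i of the reference fit, so per-topic tallies across resamples are meaningful.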

Value

A list with

include_ppi, n_boot, n_topics

the settings used.

topTerms

named list topic_1 ... topic_k. Each is a data.frame sorted by selection_freq, with columns term, selection_freq (fraction of resamples in which the word was among this topic's top n_top_terms), and mean_weight (mean within-topic word weight across resamples). A word with selection_freq near 1 is a stable signature of the topic.

reference

named list of the top n_top_terms words per topic from the single full-data fit, for comparison.
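
The selection_freq column reported in topTerms can be reproduced by hand from per-resample top-word lists. A minimal sketch, assuming the topics have already been aligned; the word lists are made up and the variable names are illustrative only.

```r
# Top-3 words of one (aligned) topic in four hypothetical resamples.
top_lists <- list(
  c("kinase", "repair", "damage"),
  c("kinase", "damage", "cycle"),
  c("kinase", "repair", "cycle"),
  c("kinase", "damage", "repair")
)
n_boot <- length(top_lists)

# Fraction of resamples in which each word made the top list.
counts <- table(unlist(top_lists))
stability <- data.frame(
  term           = names(counts),
  selection_freq = as.vector(counts) / n_boot,
  stringsAsFactors = FALSE
)
stability <- stability[order(-stability$selection_freq, stability$term), ]
stability
```

Here "kinase" appears in every resample and so has a selection frequency of 1, while "cycle" appears in only half of them.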

Note

Beta feature: This function is experimental and the API may change without notice in future versions.

See Also

decomposeSubnetworkByTopic, compareTopicModels

Examples

## Not run: 
input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)

# Top words with PPIs included vs. words only:
boot_ppi  <- bootstrapTopicModels(subnetwork, include_ppi = TRUE)
boot_text <- bootstrapTopicModels(subnetwork, include_ppi = FALSE)

head(boot_ppi$topTerms$topic_1)     # robust signature words for topic 1
boot_text$topTerms$topic_1

## End(Not run)

Test whether including PPIs changes topic structure beyond random chance

Description

Quantifies how much the topic decomposition produced by decomposeSubnetworkByTopic changes when the PPI/edge view is included (include_ppi = TRUE) versus excluded (include_ppi = FALSE), and separates that change from the run-to-run variability that NMF produces just from its random initialization.

Usage

compareTopicModels(
  subnetwork,
  seeds = seq_len(20),
  n_topics = 5,
  unit = c("edges", "papers"),
  min_term_count = 2,
  max_iter = 200,
  tol = 1e-04
)

Arguments

subnetwork

list with nodes and edges data.frames, e.g. the output of getSubnetworkFromIndra.

seeds

integer vector of NMF seeds to fit (at least 2). Default 1:20.

n_topics

number of topics (rank of the factorization). Default 5.

unit

either "edges" (compare edge-to-topic assignments, the default, matching the subnetworks the decomposition returns) or "papers" (compare paper-to-topic assignments).

min_term_count

minimum corpus frequency for a word to be kept when building the text matrix. Default 2.

max_iter

maximum number of NMF multiplicative-update iterations. Default 200.

tol

relative-change tolerance for NMF early stopping. Default 1e-4.

Details

NMF converges to a local optimum that depends on the random seed, so a single joint-vs-text comparison conflates the real effect of the PPI view with optimization noise. This function instead refits both modes across many seeds and compares three distributions of partition agreement (Adjusted Rand Index, ARI):

within_joint

ARI between pairs of joint runs (different seeds) — how much the joint solution wobbles on its own.

within_text

ARI between pairs of text-only runs — the same for the text-only solution.

between

ARI between the joint and text-only run at the same seed. Because both modes draw W and H_text from the same seeded stream, a matched seed gives both modes an identical initialization, so this isolates the effect of adding the PPI view from the starting point.

If the between-mode ARI is systematically lower than the within-mode ARIs, the PPI view changes the topic structure more than chance would — a one-sided Wilcoxon rank-sum test (between < within) puts a p-value on it. If the between distribution sits inside the within distributions, the apparent difference is just optimization noise.

The expensive, network-bound steps (evidence extraction, abstract fetching, matrix construction) run once; only the NMF is repeated per seed.
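
To make the comparison concrete, the sketch below computes an Adjusted Rand Index in base R and runs the one-sided rank-sum test on made-up ARI values. adjusted_rand_index is a hypothetical helper written for this illustration, and the numbers are not results from any dataset.

```r
# Adjusted Rand Index between two partitions of the same items.
adjusted_rand_index <- function(a, b) {
  tab   <- table(a, b)
  comb2 <- function(n) n * (n - 1) / 2
  sum_ij   <- sum(comb2(tab))
  sum_a    <- sum(comb2(rowSums(tab)))
  sum_b    <- sum(comb2(colSums(tab)))
  expected <- sum_a * sum_b / comb2(length(a))
  max_idx  <- (sum_a + sum_b) / 2
  # Note: undefined (NaN) when both partitions put every item in one cluster.
  (sum_ij - expected) / (max_idx - expected)
}

# Identical partitions up to relabelling agree perfectly.
ari_same <- adjusted_rand_index(c(1, 1, 1, 2, 2, 2), c(2, 2, 2, 1, 1, 1))

# Made-up ARI distributions for illustration.
within_joint <- c(0.71, 0.78, 0.69, 0.82)
within_text  <- c(0.74, 0.66, 0.80, 0.77)
between      <- c(0.31, 0.42, 0.28, 0.37, 0.45)

# One-sided test: is between-mode agreement lower than within-mode agreement?
test <- wilcox.test(between, c(within_joint, within_text), alternative = "less")
test$p.value
```

Because ARI is invariant to how clusters are numbered, no topic alignment is needed before comparing two partitions.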

Value

A list with

unit

the comparison unit used.

seeds

the seeds fitted.

n_topics

the effective number of topics.

ari

list of numeric vectors within_joint, within_text, and between (matched seeds).

summary

data.frame of median/mean ARI and count per comparison.

test

the wilcox.test object comparing the between distribution against the pooled within distributions (alternative = "less").

consensus

list of consensus (co-membership) matrices, joint and text, across seeds.

dispersion

named numeric vector of consensus dispersion coefficients (1 = identical clustering across all seeds).

partitions

list of the raw per-seed partitions, joint and text, for further inspection.
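
The consensus and dispersion entries can be understood through a small sketch. It assumes the consensus matrix is the average co-membership across runs and uses one common definition of the dispersion coefficient, the mean of 4 * (C - 0.5)^2 over all entries; the formula used inside the package may differ, and consensus_dispersion is a hypothetical helper.

```r
# partitions: list of integer vectors, one cluster label per item per run.
consensus_dispersion <- function(partitions) {
  # 1 where two items share a cluster in a run, 0 otherwise.
  co_membership <- function(p) outer(p, p, "==") * 1
  consensus <- Reduce(`+`, lapply(partitions, co_membership)) / length(partitions)
  # Equals 1 when every entry is 0 or 1 (all runs agree), lower otherwise.
  dispersion <- mean(4 * (consensus - 0.5)^2)
  list(consensus = consensus, dispersion = dispersion)
}

# Three toy runs over five items; the third run moves item 4.
runs   <- list(c(1, 1, 2, 2, 3), c(2, 2, 1, 1, 3), c(1, 1, 2, 3, 3))
res    <- consensus_dispersion(runs)
stable <- consensus_dispersion(list(c(1, 1, 2), c(2, 2, 1)))

res$consensus
res$dispersion
stable$dispersion
```

Cluster labels are arbitrary, so the first two runs count as identical; only the third run pulls the dispersion below 1.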

Note

Beta feature: This function is experimental and the API may change without notice in future versions.

See Also

decomposeSubnetworkByTopic

Examples

## Not run: 
input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
cmp <- compareTopicModels(subnetwork, seeds = 1:20, n_topics = 5)
cmp$summary
cmp$test            # p < 0.05 => PPI changes topics beyond chance
cmp$dispersion      # how stable each mode is across seeds

## End(Not run)

Render a Cytoscape network visualisation

Description

Creates an interactive network diagram powered by Cytoscape.js and the dagre layout algorithm. Nodes can carry log fold-change (logFC) values which are mapped to a blue-grey-red colour gradient. PTM (post-translational modification) site information is shown as small satellite nodes and edge overlaps are surfaced as hover tooltips.

Usage

cytoscapeNetwork(
  nodes,
  edges = data.frame(),
  displayLabelType = "id",
  nodeFontSize = 12,
  layoutOptions = NULL,
  width = NULL,
  height = NULL,
  elementId = NULL
)

Arguments

nodes

Data frame with at minimum an id column. Optional columns: logFC (numeric), entityName (character; may be semicolon-joined for multi-grounded rows), entityId (character), Site (character, underscore-separated PTM site list).

edges

Data frame with columns source, target, interaction. Optional: site, evidenceLink.

displayLabelType

"id" (default) or "entityName" – controls which column is used as the visible node label.

nodeFontSize

Font size (px) for node labels. Default 12.

layoutOptions

Named list of dagre layout options to override the defaults (e.g. list(rankDir = "LR")).

width, height

Widget dimensions passed to createWidget.

elementId

Optional explicit HTML element id.

Value

An htmlwidget object that renders in R Markdown, Shiny, or the RStudio Viewer pane.

Examples

## Not run: 
nodes <- data.frame(
  id    = c("TP53", "MDM2", "CDKN1A"),
  logFC = c(1.5, -0.8, 2.1),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  source      = c("TP53",  "MDM2"),
  target      = c("MDM2",  "TP53"),
  interaction = c("Activation", "Inhibition"),
  stringsAsFactors = FALSE
)
cytoscapeNetwork(nodes, edges)

## End(Not run)

Shiny output binding for cytoscapeNetwork

Description

Creates a Shiny output binding for a Cytoscape network visualization, allowing the network to be rendered within Shiny applications.

Usage

cytoscapeNetworkOutput(outputId, width = "100%", height = "500px")

Arguments

outputId

output variable to read from

width, height

Must be a valid CSS unit (like "100%", "400px", "auto") or a number, which will be coerced to a string and have "px" appended.

Value

A Shiny output binding for a Cytoscape network visualization.

Examples

## Not run: 
library(shiny)

ui <- fluidPage(
  cytoscapeNetworkOutput("cytoNetwork")
)

server <- function(input, output, session) {
  output$cytoNetwork <- renderCytoscapeNetwork({
    nodes <- data.frame(
      id = c("TP53", "MDM2", "CDKN1A"),
      logFC = c(1.5, -0.8, 2.1),
      stringsAsFactors = FALSE
    )
    edges <- data.frame(
      source = c("TP53", "MDM2"),
      target = c("MDM2", "TP53"),
      interaction = c("Activation", "Inhibition"),
      stringsAsFactors = FALSE
    )
    cytoscapeNetwork(nodes, edges)
  })
}

shinyApp(ui, server)

## End(Not run)

Decompose a subnetwork into topic-specific subnetworks via joint NMF

Description

Takes a subnetwork (the output of getSubnetworkFromIndra) and splits it into a list of smaller, topic-specific subnetworks discovered with unsupervised non-negative matrix factorization (NMF).

Usage

decomposeSubnetworkByTopic(
  subnetwork,
  n_topics = 5,
  edge_topic_cutoff = 0.2,
  n_top_terms = 10,
  min_term_count = 2,
  max_iter = 200,
  tol = 1e-04,
  seed = 1,
  include_ppi = TRUE
)

Arguments

subnetwork

list with nodes and edges data.frames, e.g. the output of getSubnetworkFromIndra.

n_topics

number of topics (rank of the factorization). Default 5.

edge_topic_cutoff

numeric in [0, 1]; an edge is added to a topic's subnetwork when the topic carries at least this share of the edge's total loading. Each edge is always included in at least its highest-loading topic. Default 0.2.

n_top_terms

number of top words to report per topic. Default 10.

min_term_count

minimum corpus frequency for a word to be kept when building X_text. Default 2.

max_iter

maximum number of NMF multiplicative-update iterations. Default 200.

tol

relative-change tolerance for NMF early stopping. Default 1e-4.

seed

random seed for NMF initialization. Default 1.

include_ppi

logical; if TRUE (default) the PPI/edge matrix is factorized jointly with the text matrix via a shared basis. If FALSE, NMF is run on the paper-word matrix only and edge-topic loadings are derived afterwards by folding edge counts onto the text-learned topics, so the PPIs do not influence the topics themselves.
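
The effect of edge_topic_cutoff can be sketched on a toy loading matrix. The example assumes edge-topic loadings are stored as a topics x edges matrix and that a topic's share is its loading divided by the edge's total loading; the matrix values and edge names are made up.

```r
# Toy edge-topic loadings: rows are topics, columns are edges.
H_edges <- matrix(
  c(0.90, 0.10, 0.50,
    0.05, 0.60, 0.40,
    0.05, 0.30, 0.10),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    paste0("topic_", 1:3),
    c("A_B_Activation", "B_C_Inhibition", "A_C_Complex")
  )
)
edge_topic_cutoff <- 0.2

# Share of each edge's total loading carried by each topic.
share <- sweep(H_edges, 2, colSums(H_edges), "/")

# Soft assignment: keep an edge in every topic that meets the cutoff.
keep <- share >= edge_topic_cutoff

# Guard: every edge stays in its highest-loading topic, which matters
# when the cutoff is high enough that no topic reaches it.
top_topic <- max.col(t(share), ties.method = "first")
keep[cbind(top_topic, seq_len(ncol(share)))] <- TRUE

# Edges assigned to each topic.
edges_by_topic <- lapply(rownames(keep), function(t) colnames(keep)[keep[t, ]])
names(edges_by_topic) <- rownames(keep)
edges_by_topic
```

In this toy case the third edge lands in both topic_1 and topic_2, which is the overlapping assignment the cutoff is meant to allow.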

Details

The procedure is:

  1. For every edge, the supporting INDRA evidence is retrieved and the PubMed abstract of each referenced PMID is fetched. Papers (PMIDs) are the shared unit of analysis.

  2. Two matrices are built that share the same rows (papers): X_text (papers x words, term counts from the abstracts) and X_edges (papers x unique source_target_interaction combinations, evidence-sentence counts).

  3. NMF learns a basis matrix W (papers x topics). When include_ppi = TRUE (the default) a joint NMF learns a single shared W such that $X_{text} \approx W H_{text}$ and $X_{edges} \approx W H_{edges}$, tying each learned topic to both a set of words and a set of edges. When include_ppi = FALSE the factorization uses only X_text ($X_{text} \approx W H_{text}$); the PPI evidence is excluded from the modeling and edge-topic loadings are instead derived afterwards by folding the edge counts onto the text-learned topics ($H_{edges} = W^\top X_{edges}$). This lets you compare topic structure with and without the PPI view.

  4. Each topic becomes its own subnetwork: an edge is included in a topic when that topic carries at least edge_topic_cutoff of the edge's loading (soft, overlapping assignment), and nodes are restricted to those touched by the kept edges.
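
Step 3 can be sketched in base R with standard multiplicative updates. This is an illustration under stated assumptions, not the package's implementation: it treats the joint fit as plain NMF on the column-bound matrix [X_text, X_edges] with equal weight on both views, and nmf_mu is a hypothetical helper. The real code may weight or normalise the views differently.

```r
# NMF with multiplicative updates for the squared-error objective.
nmf_mu <- function(X, k, max_iter = 200, tol = 1e-4, seed = 1) {
  set.seed(seed)
  eps <- 1e-9                      # avoids division by zero
  W <- matrix(runif(nrow(X) * k), nrow(X), k)
  H <- matrix(runif(k * ncol(X)), k, ncol(X))
  prev <- Inf
  for (iter in seq_len(max_iter)) {
    H <- H * (t(W) %*% X) / (t(W) %*% W %*% H + eps)
    W <- W * (X %*% t(H)) / (W %*% H %*% t(H) + eps)
    loss <- sum((X - W %*% H)^2)
    # Early stopping on the relative change of the objective.
    if (is.finite(prev) && abs(prev - loss) / max(prev, eps) < tol) break
    prev <- loss
  }
  list(W = W, H = H, loss = loss)
}

# Toy data: 6 papers, 5 words, 3 edges (random counts).
set.seed(42)
X_text  <- matrix(rpois(6 * 5, lambda = 2), nrow = 6)
X_edges <- matrix(rpois(6 * 3, lambda = 1), nrow = 6)

# Joint mode: one shared W explains both views.
joint     <- nmf_mu(cbind(X_text, X_edges), k = 2)
text_cols <- seq_len(ncol(X_text))
H_text    <- joint$H[, text_cols, drop = FALSE]
H_edges   <- joint$H[, -text_cols, drop = FALSE]

# Text-only mode: fit on words, then fold the edge counts onto the topics.
text_only      <- nmf_mu(X_text, k = 2)
H_edges_folded <- t(text_only$W) %*% X_edges
```

In joint mode the edge counts shape W during fitting, whereas in text-only mode H_edges_folded is computed after the fact and cannot influence the topics.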

Value

A list of length n_topics, named topic_1 ... topic_k. Each element is a topic-specific subnetwork: a list with

nodes

nodes data.frame restricted to the topic's edges.

edges

edges data.frame for the topic, with an added topicWeight column (the edge's topic share).

topic

the topic index.

topTerms

character vector of the topic's top words.

pmids

PMIDs whose strongest topic loading is this topic.

The full factorization (W, H_text, H_edges, etc.) is attached as the "nmf" attribute of the returned list.

Note

Beta feature: This function is experimental and the API may change without notice in future versions.

See Also

getSubnetworkFromIndra, filterSubnetworkByContext

Examples

## Not run: 
input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
topics <- decomposeSubnetworkByTopic(subnetwork, n_topics = 5)
topics$topic_1$topTerms
exportNetworkToHTML(topics$topic_1$nodes, topics$topic_1$edges)

## End(Not run)

Delete an edge from a network edges data frame

Description

Removes the row(s) from an edges data frame that match the given source, target, and interaction values. This is the programmatic counterpart of the interactive Ctrl+click / right-click edge deletion available in cytoscapeNetwork.

Usage

deleteEdgeFromNetwork(edges, source, target, interaction)

Arguments

edges

Data frame with at minimum columns source, target, and interaction.

source

Character. The source node identifier of the edge to remove.

target

Character. The target node identifier of the edge to remove.

interaction

Character. The interaction type of the edge to remove.

Value

The edges data frame with the matching row(s) removed.

Examples

edges <- data.frame(
  source      = c("TP53",  "MDM2",  "CDKN1A"),
  target      = c("MDM2",  "TP53",  "TP53"),
  interaction = c("Activation", "Inhibition", "Activation"),
  stringsAsFactors = FALSE
)
deleteEdgeFromNetwork(edges, "MDM2", "TP53", "Inhibition")

Export network data with Cytoscape visualization

Description

Convenience function that takes nodes and edges data directly and creates both the configuration and HTML export in one step.

Usage

exportNetworkToHTML(
  nodes,
  edges,
  filename = "network_visualization.html",
  displayLabelType = "id",
  nodeFontSize = 12,
  ...
)

Arguments

nodes

Data frame with at minimum an id column. Optional columns: logFC (numeric), entityName (character; may be semicolon-joined for multi-grounded rows), entityId (character), Site (character, underscore-separated PTM site list).

edges

Data frame with columns source, target, interaction. Optional: site, evidenceLink.

filename

Output HTML filename

displayLabelType

"id" (default) or "entityName" – controls which column is used as the visible node label.

nodeFontSize

Font size (px) for node labels. Default 12.

...

Additional arguments passed to exportCytoscapeToHTML()

Value

Invisibly returns the file path of the created HTML file


Filter a subnetwork by contextual relevance

Description

Fetches PubMed abstracts for evidence PMIDs, scores each abstract against a user-supplied query, and returns only the nodes, edges, and evidence rows whose abstracts meet the scoring cutoff.

Usage

filterSubnetworkByContext(
  nodes,
  edges,
  query,
  cutoff = NULL,
  method = c("tag_count", "cosine")
)

Arguments

nodes

A dataframe of network nodes.

edges

A dataframe of network edges with columns: source, target, interaction, site, evidenceLink, stmt_hash.

query

For method = "tag_count": a character vector of tags, e.g. c("CHEK1", "DNA damage", "DNA damage repair"). For method = "cosine": a single character string.

cutoff

Numeric threshold applied to the chosen scoring method.

  • "tag_count": integer >= 0; abstracts must contain at least this many tags. Max possible value is length(query). Default 1.

  • "cosine": numeric in [-1, 1]; abstracts must score >= this value. Default 0.10.

method

One of "tag_count" (default) or "cosine".

Details

Two scoring methods are available, controlled by the method argument:

"tag_count" (default)

Counts how many tags from query appear as substrings in the abstract (case-insensitive). The score for each abstract is an integer in [0, length(query)]. Set cutoff to the minimum number of tags that must appear - e.g. cutoff = 2 keeps abstracts that mention at least 2 of your tags. query must be a character vector of tags when using this method.

"cosine"

Scores abstracts using TF-IDF cosine similarity against query. Scores are in [-1, 1] (in practice [0, 1] for text). Set cutoff to a decimal threshold - e.g. cutoff = 0.10. query should be a single character string; expand it with synonyms and related terms for better recall under exact token matching.
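
The "tag_count" scoring rule is simple enough to sketch directly. The helper below is a hypothetical re-implementation based only on the description above (case-insensitive substring matching, one point per tag); the abstracts are made up.

```r
# Count how many tags occur as case-insensitive substrings of each abstract.
tag_count <- function(abstracts, tags) {
  vapply(abstracts, function(a) {
    hits <- vapply(
      tags,
      function(tag) grepl(tolower(tag), tolower(a), fixed = TRUE),
      logical(1)
    )
    sum(hits)
  }, integer(1), USE.NAMES = FALSE)
}

abstracts <- c(
  "CHEK1 is activated after DNA damage.",
  "CHEK1 coordinates DNA damage repair and checkpoint signalling.",
  "Lipid metabolism in adipose tissue."
)
tags <- c("CHEK1", "DNA damage", "DNA damage repair")

scores <- tag_count(abstracts, tags)
keep   <- scores >= 2   # cutoff = 2: at least two tags must appear
scores
keep
```

Because matching is by substring, an abstract containing "DNA damage repair" also matches the shorter tag "DNA damage", so overlapping tags each add to the score.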

Value

A named list with three elements:

nodes

Filtered nodes dataframe (only nodes present in kept edges)

edges

Filtered edges dataframe

evidence

Dataframe with columns: source, target, interaction, site, evidenceLink, stmt_hash, text, pmid, score. The score column contains tag counts (integer) or cosine similarities (numeric) depending on the method used.

Note

Beta feature: This function is experimental and the API may change without notice in future versions.


Get subnetwork from INDRA database

Description

Using differential abundance results from MSstats, this function retrieves a subnetwork of protein interactions from the INDRA database.

Usage

getSubnetworkFromIndra(
  input,
  protein_level_data = NULL,
  pvalueCutoff = NULL,
  statement_types = NULL,
  paper_count_cutoff = 1,
  evidence_count_cutoff = 1,
  correlation_cutoff = 0.3,
  sources_filter = NULL,
  logfc_cutoff = NULL,
  force_include_other = NULL,
  filter_by_curation = FALSE,
  filter_by_ptm_site = FALSE,
  include_infinite_fc = FALSE,
  direction = c("both", "up", "down")
)

Arguments

input

output of groupComparison function's comparisonResult table, annotated by annotateProteinInfoFromIndra. Must contain Protein, EntityNamespace, and EntityId columns (and typically also EntityName, log2FC, adj.pvalue). When an analyte grounds to multiple candidates the three Entity* columns are semicolon-joined and positionally aligned.

protein_level_data

output of the dataProcess function's ProteinLevelData table, which contains a list of proteins and their corresponding abundances. Used for annotating correlation information and applying correlation cutoffs.

pvalueCutoff

p-value cutoff for filtering. Default is NULL, i.e. no filtering

statement_types

list of interaction types to filter on. Equivalent to statement type in INDRA. Default is NULL.

paper_count_cutoff

number of papers to filter on. Default is 1.

evidence_count_cutoff

number of evidence sentences to filter on for each paper. For example, one paper may have 5 sentences describing the same interaction while another has 1. Default is 1.

correlation_cutoff

if protein_level_data is not NULL, apply a cutoff for edges with correlation less than this value. Default is 0.3.

sources_filter

filtering only on specific sources. Default is no filter, i.e. NULL. Otherwise, should be a character vector, e.g. c('reach', 'medscan').

logfc_cutoff

absolute log fold change cutoff for filtering proteins. Only proteins with |logFC| greater than this value will be retained. Default is NULL, i.e. no logFC filtering.

force_include_other

character vector of identifiers to include in the network, regardless of whether those IDs are in the input data. Should be formatted as "namespace:identifier", e.g. "HGNC:1234" or "CHEBI:4911".

filter_by_curation

logical, whether to filter out statements that have been curated as incorrect in INDRA. Default is FALSE.

filter_by_ptm_site

logical, whether to filter edges based on whether the site information from INDRA matches with the PTM site in the input. Default is FALSE. Only applicable for differential PTM abundance results.

include_infinite_fc

logical, whether to include proteins with infinite log fold change (i.e. proteins that are only detected in one condition). Default is FALSE.

direction

Character string specifying the direction of regulation to include. One of "both" (default), "up" (upregulated only), or "down" (downregulated only).
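
How the input-level filters combine can be sketched on a toy table. The helper filter_input below is hypothetical and rests on assumptions not stated above: that pvalueCutoff is compared against adj.pvalue with a strict "<", and that "up" and "down" mean positive and negative log2FC. The function's internal order and exact comparisons may differ.

```r
# Hypothetical re-implementation of the row filters, for illustration only.
filter_input <- function(input, pvalueCutoff = NULL, logfc_cutoff = NULL,
                         include_infinite_fc = FALSE,
                         direction = c("both", "up", "down")) {
  direction <- match.arg(direction)
  keep <- rep(TRUE, nrow(input))
  if (!is.null(pvalueCutoff)) keep <- keep & input$adj.pvalue < pvalueCutoff
  if (!is.null(logfc_cutoff)) keep <- keep & abs(input$log2FC) > logfc_cutoff
  if (!include_infinite_fc)   keep <- keep & is.finite(input$log2FC)
  if (direction == "up")      keep <- keep & input$log2FC > 0
  if (direction == "down")    keep <- keep & input$log2FC < 0
  input[keep, , drop = FALSE]
}

# Made-up differential abundance results.
input <- data.frame(
  Protein    = c("P1", "P2", "P3", "P4"),
  log2FC     = c(1.8, -2.2, 0.3, Inf),
  adj.pvalue = c(0.01, 0.03, 0.20, 0.001),
  stringsAsFactors = FALSE
)

up_only <- filter_input(input, pvalueCutoff = 0.05, logfc_cutoff = 1,
                        direction = "up")
up_with_inf <- filter_input(input, pvalueCutoff = 0.05, logfc_cutoff = 1,
                            include_infinite_fc = TRUE, direction = "up")
up_only$Protein
up_with_inf$Protein
```

With the defaults (NULL cutoffs, direction = "both") only rows with infinite fold change are dropped, which mirrors include_infinite_fc = FALSE being the default.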

Value

list of 2 data.frames, nodes and edges

Examples

input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
head(subnetwork$nodes)
head(subnetwork$edges)

Preview network in browser

Description

Generates a temporary HTML file for the network visualization and opens it in the default web browser for quick preview.

Usage

previewNetworkInBrowser(
  nodes,
  edges,
  displayLabelType = "id",
  nodeFontSize = 12
)

Arguments

nodes

Data frame with at minimum an id column. Optional columns: logFC (numeric), entityName (character; may be semicolon-joined for multi-grounded rows), entityId (character), Site (character, underscore-separated PTM site list).

edges

Data frame with columns source, target, interaction. Optional: site, evidenceLink.

displayLabelType

"id" (default) or "entityName" – controls which column is used as the visible node label.

nodeFontSize

Font size (px) for node labels. Default 12.

Value

Invisibly returns the file path of the temporary HTML file.

Examples

## Not run: 
nodes <- data.frame(id = c("A", "B", "C"))
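# edges needs source, target, and interaction columns
edges <- data.frame(
  source      = c("A", "B"),
  target      = c("B", "C"),
  interaction = c("Activation", "Inhibition")
)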
previewNetworkInBrowser(nodes, edges)

## End(Not run)

Render a Cytoscape network in a Shiny application

Description

Renders a Cytoscape network visualization within a Shiny application. Use it in the server function together with cytoscapeNetworkOutput in the UI.

Usage

renderCytoscapeNetwork(expr, env = parent.frame())

Arguments

expr

An expression that generates an HTML widget (or a promise of an HTML widget).

env

The environment in which to evaluate expr.

Value

A rendered Cytoscape network widget for use in Shiny applications.

Examples

## Not run: 
library(shiny)
library(MSstatsBioNet)

ui <- fluidPage(
  cytoscapeNetworkOutput("cytoNetwork")
)

server <- function(input, output, session) {
  output$cytoNetwork <- renderCytoscapeNetwork({
    nodes <- data.frame(
      id    = c("TP53", "MDM2", "CDKN1A"),
      logFC = c(1.5, -0.8, 2.1),
      stringsAsFactors = FALSE
    )
    edges <- data.frame(
      source      = c("TP53",  "MDM2"),
      target      = c("MDM2",  "TP53"),
      interaction = c("Activation", "Inhibition"),
      stringsAsFactors = FALSE
    )
    cytoscapeNetwork(nodes, edges)
  })
}

shinyApp(ui, server)

## End(Not run)