| Title: | Network Analysis for MS-based Proteomics Experiments |
|---|---|
| Description: | A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques. |
| Authors: | Anthony Wu [aut, cre] (ORCID: <https://orcid.org/0009-0001-7391-9902>), Olga Vitek [aut] (ORCID: <https://orcid.org/0000-0003-1728-1104>) |
| Maintainer: | Anthony Wu <[email protected]> |
| License: | file LICENSE |
| Version: | 1.5.1 |
| Built: | 2026-05-28 14:49:46 UTC |
| Source: | https://github.com/bioc/MSstatsBioNet |
This function populates the HGNC IDs in the data frame based on the Uniprot IDs.
.populateHgncIdsInDataFrame(df, proteinIdType)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |
| proteinIdType | A character string specifying the type of protein ID. It must be one of "Uniprot", "Uniprot_Mnemonic", or "Hgnc_Name". |

A data frame with populated HGNC IDs.
This function populates the HGNC names in the data frame based on the HGNC IDs.
.populateHgncNamesInDataFrame(df)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |

A data frame with populated HGNC names.
This function populates the kinase information in the data frame based on the HGNC names.
.populateKinaseInfoInDataFrame(df)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |

A data frame with populated kinase information.
This function populates the phosphatase information in the data frame based on the HGNC names.
.populatePhophataseInfoInDataFrame(df)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |

A data frame with populated phosphatase information.
This function populates the transcription factor information in the data frame based on the HGNC names.
.populateTranscriptionFactorInfoInDataFrame(df)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |

A data frame with populated transcription factor information.
This function populates the Uniprot IDs in the data frame based on the protein ID type.
.populateUniprotIdsInDataFrame(df, proteinIdType)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |
| proteinIdType | A character string specifying the type of protein ID. It can be either "Uniprot" or "Uniprot_Mnemonic". |

A data frame with populated Uniprot IDs.
This function validates the input data frame for the annotateProteinInfoFromIndra function.
.validateAnnotateProteinInfoFromIndraInput(df)

| Argument | Description |
|---|---|
| df | A data frame containing protein information. |

None. Throws an error if validation fails.
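The "throw an error if validation fails" contract can be sketched in base R as below. This is a minimal illustration, not the package's code: the function name `validate_protein_df` is hypothetical, and the two checks (input is a data frame, a `Protein` column is present) are assumptions inferred from the usage example for annotateProteinInfoFromIndra.

```r
# Hypothetical validator: the checks are assumptions, not the package's actual rules.
validate_protein_df <- function(df) {
  if (!is.data.frame(df)) {
    stop("Input must be a data frame.")
  }
  if (!"Protein" %in% colnames(df)) {
    stop("Input data frame must contain a 'Protein' column.")
  }
  # Nothing useful to return on success, mirroring the "None" return value.
  invisible(NULL)
}

# A valid input passes silently.
validate_protein_df(data.frame(Protein = "CLH1_HUMAN"))

# An invalid input raises an error, captured here as a message string.
result <- tryCatch(
  validate_protein_df(data.frame(Gene = "CLTC")),
  error = function(e) conditionMessage(e)
)
print(result)
```

Failing early with a descriptive message keeps later lookup steps from producing confusing errors on malformed input.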
This function annotates a data frame with protein information from INDRA.
annotateProteinInfoFromIndra(df, proteinIdType)

| Argument | Description |
|---|---|
| df | output of |
| proteinIdType | A character string specifying the type of protein ID. It must be one of "Uniprot", "Uniprot_Mnemonic", or "Hgnc_Name". |

A data frame with the following columns:
Character. The original protein identifier.
Character. The Uniprot ID of the protein.
Character. The HGNC ID of the protein.
Character. The HGNC name of the protein.
Logical. Indicates if the protein is a transcription factor.
Logical. Indicates if the protein is a kinase.
Logical. Indicates if the protein is a phosphatase.
df <- data.frame(Protein = c("CLH1_HUMAN"))
annotated_df <- annotateProteinInfoFromIndra(df, "Uniprot_Mnemonic")
head(annotated_df)
Creates an interactive network diagram powered by Cytoscape.js and the dagre layout algorithm. Nodes can carry log fold-change (logFC) values which are mapped to a blue-grey-red colour gradient. PTM (post-translational modification) site information is shown as small satellite nodes and edge overlaps are surfaced as hover tooltips.
cytoscapeNetwork(
  nodes,
  edges = data.frame(),
  displayLabelType = "id",
  nodeFontSize = 12,
  layoutOptions = NULL,
  width = NULL,
  height = NULL,
  elementId = NULL
)

| Argument | Description |
|---|---|
| nodes | Data frame with at minimum an |
| edges | Data frame with columns |
| displayLabelType | |
| nodeFontSize | Font size (px) for node labels. Default |
| layoutOptions | Named list of dagre layout options to override the defaults (e.g. |
| width, height | Widget dimensions passed to |
| elementId | Optional explicit HTML element id. |

An htmlwidget object that renders in R Markdown, Shiny, or
the RStudio Viewer pane.
## Not run:
nodes <- data.frame(
  id = c("TP53", "MDM2", "CDKN1A"),
  logFC = c(1.5, -0.8, 2.1),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  source = c("TP53", "MDM2"),
  target = c("MDM2", "TP53"),
  interaction = c("Activation", "Inhibition"),
  stringsAsFactors = FALSE
)
cytoscapeNetwork(nodes, edges)
## End(Not run)
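The logFC-to-colour mapping described for cytoscapeNetwork can be approximated in base R as follows. This is only a sketch under stated assumptions: the helper `logfc_to_colour`, the exact palette endpoints, and the symmetric clamp at `max_abs` are all hypothetical, and the widget itself may compute colours differently.

```r
# Sketch only: palette endpoints and the clamp at max_abs are assumptions.
logfc_to_colour <- function(logFC, max_abs = 2) {
  ramp <- grDevices::colorRamp(c("blue", "grey", "red"))
  # Clamp to [-max_abs, max_abs], then rescale to [0, 1] so that 0 maps to grey.
  clamped <- pmax(pmin(logFC, max_abs), -max_abs)
  scaled <- (clamped + max_abs) / (2 * max_abs)
  rgb_matrix <- ramp(scaled)
  grDevices::rgb(
    rgb_matrix[, 1], rgb_matrix[, 2], rgb_matrix[, 3],
    maxColorValue = 255
  )
}

nodes <- data.frame(
  id = c("TP53", "MDM2", "CDKN1A"),
  logFC = c(1.5, -0.8, 2.1),
  stringsAsFactors = FALSE
)
nodes$colour <- logfc_to_colour(nodes$logFC)
print(nodes)
```

Clamping keeps a single extreme fold change from washing out the rest of the gradient; the cutoff of 2 is arbitrary here.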
Creates a Shiny output binding for a Cytoscape network visualization, allowing the network to be rendered within Shiny applications.
cytoscapeNetworkOutput(outputId, width = "100%", height = "500px")

| Argument | Description |
|---|---|
| outputId | output variable to read from |
| width, height | Must be a valid CSS unit (like |

A Shiny output binding for a Cytoscape network visualization.
## Not run:
library(shiny)
library(MSstatsBioNet)

ui <- fluidPage(
  cytoscapeNetworkOutput("cytoNetwork")
)

server <- function(input, output, session) {
  output$cytoNetwork <- renderCytoscapeNetwork({
    nodes <- data.frame(
      id = c("TP53", "MDM2", "CDKN1A"),
      logFC = c(1.5, -0.8, 2.1),
      stringsAsFactors = FALSE
    )
    edges <- data.frame(
      source = c("TP53", "MDM2"),
      target = c("MDM2", "TP53"),
      interaction = c("Activation", "Inhibition"),
      stringsAsFactors = FALSE
    )
    cytoscapeNetwork(nodes, edges)
  })
}

shinyApp(ui, server)
## End(Not run)
Removes the row(s) from an edges data frame that match the given
source, target, and interaction values. This is
the programmatic counterpart of the interactive Ctrl+click / right-click
edge deletion available in cytoscapeNetwork.
deleteEdgeFromNetwork(edges, source, target, interaction)

| Argument | Description |
|---|---|
| edges | Data frame with at minimum columns |
| source | Character. The source node identifier of the edge to remove. |
| target | Character. The target node identifier of the edge to remove. |
| interaction | Character. The interaction type of the edge to remove. |

The edges data frame with the matching row(s) removed.
edges <- data.frame(
  source = c("TP53", "MDM2", "CDKN1A"),
  target = c("MDM2", "TP53", "TP53"),
  interaction = c("Activation", "Inhibition", "Activation"),
  stringsAsFactors = FALSE
)
deleteEdgeFromNetwork(edges, "MDM2", "TP53", "Inhibition")
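For readers who want to see the matching logic spelled out, the row removal described above can be sketched in base R. The helper `drop_edge` is a hypothetical stand-in written for illustration; it is not the package's implementation and assumes the three columns are plain character vectors without missing values.

```r
# Illustrative stand-in, not the package's implementation.
drop_edge <- function(edges, source, target, interaction) {
  # A row is removed only when all three fields match.
  matches <- edges$source == source &
    edges$target == target &
    edges$interaction == interaction
  edges[!matches, , drop = FALSE]
}

edges <- data.frame(
  source = c("TP53", "MDM2", "CDKN1A"),
  target = c("MDM2", "TP53", "TP53"),
  interaction = c("Activation", "Inhibition", "Activation"),
  stringsAsFactors = FALSE
)
remaining <- drop_edge(edges, "MDM2", "TP53", "Inhibition")
print(remaining)
```

Requiring all three fields to match means the reverse edge (TP53 to MDM2) and edges with a different interaction type are left in place.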
Convenience function that takes nodes and edges data directly and creates both the configuration and HTML export in one step.
exportNetworkToHTML(
  nodes,
  edges,
  filename = "network_visualization.html",
  displayLabelType = "id",
  nodeFontSize = 12,
  ...
)

| Argument | Description |
|---|---|
| nodes | Data frame with at minimum an |
| edges | Data frame with columns |
| filename | Output HTML filename |
| displayLabelType | |
| nodeFontSize | Font size (px) for node labels. Default |
| ... | Additional arguments passed to exportCytoscapeToHTML() |

Invisibly returns the file path of the created HTML file
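A possible usage sketch for exportNetworkToHTML is shown below, reusing the toy nodes and edges from the other examples in this manual. The choice to write into `tempdir()` and the `requireNamespace()` guard are additions made for illustration, so the snippet can be sourced even where the package is not installed.

```r
nodes <- data.frame(
  id = c("TP53", "MDM2", "CDKN1A"),
  logFC = c(1.5, -0.8, 2.1),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  source = c("TP53", "MDM2"),
  target = c("MDM2", "TP53"),
  interaction = c("Activation", "Inhibition"),
  stringsAsFactors = FALSE
)

# Write to a temporary directory so the working directory is left untouched.
out_file <- file.path(tempdir(), "network_visualization.html")

# Only call the package function when MSstatsBioNet is installed.
if (requireNamespace("MSstatsBioNet", quietly = TRUE)) {
  MSstatsBioNet::exportNetworkToHTML(nodes, edges, filename = out_file)
}
```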
Fetches PubMed abstracts for evidence PMIDs, scores each abstract against a user-supplied query, and returns only the nodes, edges, and evidence rows whose abstracts meet the scoring cutoff.
filterSubnetworkByContext(
  nodes,
  edges,
  query,
  cutoff = NULL,
  method = c("tag_count", "cosine")
)

| Argument | Description |
|---|---|
| nodes | A dataframe of network nodes. |
| edges | A dataframe of network edges with columns: source, target, interaction, site, evidenceLink, stmt_hash. |
| query | For |
| cutoff | Numeric threshold applied to the chosen scoring method. |
| method | One of |

Two scoring methods are available, controlled by the method argument:

"tag_count" (default): Counts how many tags from query appear as substrings in the abstract (case-insensitive). The score for each abstract is an integer in [0, length(query)]. Set cutoff to the minimum number of tags that must appear; for example, cutoff = 2 keeps abstracts that mention at least 2 of your tags. query must be a character vector of tags when using this method.

"cosine": Scores abstracts using TF-IDF cosine similarity against query. Scores are in [-1, 1] (in practice [0, 1] for text). Set cutoff to a decimal threshold; for example, cutoff = 0.10. query should be a single character string; expand it with synonyms and related terms for better recall under exact token matching.
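
The two scoring ideas above can be illustrated with a small base R sketch. Everything here is an assumption made for illustration: the helper names (`tag_count_score`, `tokenize`, `tfidf_cosine_score`), the tokenizer, and the smoothed IDF formula are hypothetical, and the package may tokenize and weight terms differently.

```r
# Tag counting: number of tags found as case-insensitive substrings.
tag_count_score <- function(abstracts, tags) {
  lower_tags <- tolower(tags)
  vapply(tolower(abstracts), function(text) {
    sum(vapply(lower_tags, function(tag) grepl(tag, text, fixed = TRUE), logical(1)))
  }, integer(1), USE.NAMES = FALSE)
}

# Assumed tokenizer: lower-case, split on anything that is not a letter or digit.
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  tokens[nzchar(tokens)]
}

# TF-IDF cosine similarity between one query string and each abstract.
tfidf_cosine_score <- function(abstracts, query) {
  docs <- lapply(c(query, abstracts), tokenize)
  vocab <- unique(unlist(docs))
  # Term-frequency matrix: one row per document (query first), one column per term.
  tf <- do.call(rbind, lapply(docs, function(d) {
    as.numeric(table(factor(d, levels = vocab)))
  }))
  # Smoothed inverse document frequency (an assumed variant).
  idf <- log((1 + nrow(tf)) / (1 + colSums(tf > 0))) + 1
  tfidf <- sweep(tf, 2, idf, `*`)
  norms <- sqrt(rowSums(tfidf^2))
  sims <- as.numeric(tfidf[-1, , drop = FALSE] %*% tfidf[1, ]) / (norms[-1] * norms[1])
  sims[is.nan(sims)] <- 0  # empty documents get a score of 0
  sims
}

abstracts <- c(
  "TP53 is phosphorylated in breast cancer cells.",
  "Yeast metabolism under heat stress."
)
tag_scores <- tag_count_score(abstracts, c("tp53", "cancer", "phosphorylat"))
cosine_scores <- tfidf_cosine_score(abstracts, "tp53 cancer phosphorylation")

# Apply a cutoff the same way the cutoff argument is described: keep scores >= cutoff.
keep_by_tags <- tag_scores >= 2
print(tag_scores)
print(cosine_scores)
print(keep_by_tags)
```

Note how the tag "phosphorylat" matches "phosphorylated" under substring matching, while the cosine variant only credits exact token overlap. This is why expanding the query with synonyms helps the cosine method.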
A named list with three elements:

| Element | Description |
|---|---|
| nodes | Filtered nodes dataframe (only nodes present in kept edges) |
| edges | Filtered edges dataframe |
| evidence | Dataframe with columns: source, target, interaction, site, evidenceLink, stmt_hash, text, pmid, score. The |

Using differential abundance results from MSstats, this function retrieves a subnetwork of protein interactions from the INDRA database.
getSubnetworkFromIndra(
  input,
  protein_level_data = NULL,
  pvalueCutoff = NULL,
  statement_types = NULL,
  paper_count_cutoff = 1,
  evidence_count_cutoff = 1,
  correlation_cutoff = 0.3,
  sources_filter = NULL,
  logfc_cutoff = NULL,
  force_include_other = NULL,
  filter_by_curation = FALSE,
  filter_by_ptm_site = FALSE,
  include_infinite_fc = FALSE,
  direction = c("both", "up", "down")
)

| Argument | Description |
|---|---|
| input | output of |
| protein_level_data | output of the |
| pvalueCutoff | p-value cutoff for filtering. Default is NULL, i.e. no filtering. |
| statement_types | list of interaction types to filter on. Equivalent to statement type in INDRA. Default is NULL. |
| paper_count_cutoff | number of papers to filter on. Default is 1. |
| evidence_count_cutoff | number of pieces of evidence to filter on for each paper. For example, a paper may have 5 sentences describing the same interaction versus 1 sentence. Default is 1. |
| correlation_cutoff | if protein_level_data is not NULL, filter out edges with correlation less than this cutoff. Default is 0.3. |
| sources_filter | filter on specific sources only. Default is NULL, i.e. no filter. Otherwise, should be a character vector, e.g. c('reach', 'medscan'). |
| logfc_cutoff | absolute log fold change cutoff for filtering proteins. Only proteins with \|logFC\| greater than this value will be retained. Default is NULL, i.e. no logFC filtering. |
| force_include_other | character vector of identifiers to include in the network, regardless of whether those IDs are in the input data. Should be formatted as "namespace:identifier", e.g. "HGNC:1234" or "CHEBI:4911". |
| filter_by_curation | logical, whether to filter out statements that have been curated as incorrect in INDRA. Default is FALSE. |
| filter_by_ptm_site | logical, whether to filter edges based on whether the site information from INDRA matches the PTM site in the input. Default is FALSE. Only applicable for differential PTM abundance results. |
| include_infinite_fc | logical, whether to include proteins with infinite log fold change (i.e. proteins that are only detected in one condition). Default is FALSE. |
| direction | Character string specifying the direction of regulation to include. One of |

A list of two data frames: nodes and edges.
input <- data.table::fread(system.file(
  "extdata/groupComparisonModel.csv",
  package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
head(subnetwork$nodes)
head(subnetwork$edges)
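To make the interplay of pvalueCutoff, logfc_cutoff, include_infinite_fc, and direction more concrete, here is a rough base R sketch of protein-level filtering on a toy table. It is not the package's code: the helper `filter_proteins` is hypothetical, the column names (`Protein`, `log2FC`, `adj.pvalue`) and the use of the adjusted p-value with a strict `<` comparison are assumptions, and the real function additionally queries INDRA.

```r
# Toy stand-in for differential abundance results; column names are assumed.
input <- data.frame(
  Protein = c("P1", "P2", "P3", "P4", "P5"),
  log2FC = c(1.8, -2.2, 0.2, Inf, -0.9),
  adj.pvalue = c(0.01, 0.03, 0.60, 0.00, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring four of the documented arguments.
filter_proteins <- function(df, pvalueCutoff = NULL, logfc_cutoff = NULL,
                            include_infinite_fc = FALSE,
                            direction = c("both", "up", "down")) {
  direction <- match.arg(direction)
  keep <- rep(TRUE, nrow(df))
  # Drop proteins detected in only one condition unless explicitly requested.
  if (!include_infinite_fc) keep <- keep & is.finite(df$log2FC)
  # NULL cutoffs mean "no filtering", as in the argument descriptions.
  if (!is.null(pvalueCutoff)) keep <- keep & df$adj.pvalue < pvalueCutoff
  if (!is.null(logfc_cutoff)) keep <- keep & abs(df$log2FC) > logfc_cutoff
  if (direction == "up") keep <- keep & df$log2FC > 0
  if (direction == "down") keep <- keep & df$log2FC < 0
  df[keep, , drop = FALSE]
}

filtered <- filter_proteins(input, pvalueCutoff = 0.05, logfc_cutoff = 1)
down_only <- filter_proteins(input, pvalueCutoff = 0.05, logfc_cutoff = 1,
                             direction = "down")
print(filtered$Protein)
print(down_only$Protein)
```

With these toy values, P4 is excluded because its fold change is infinite, and P5 passes the p-value cutoff but not the absolute logFC cutoff.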
Generates a temporary HTML file for the network visualization and opens it in the default web browser for quick preview.
previewNetworkInBrowser(
  nodes,
  edges,
  displayLabelType = "id",
  nodeFontSize = 12
)

| Argument | Description |
|---|---|
| nodes | Data frame with at minimum an |
| edges | Data frame with columns |
| displayLabelType | |
| nodeFontSize | Font size (px) for node labels. Default |

Invisibly returns the file path of the temporary HTML file.
## Not run:
nodes <- data.frame(id = c("A", "B", "C"))
edges <- data.frame(source = c("A", "B"), target = c("B", "C"))
previewNetworkInBrowser(nodes, edges)
## End(Not run)
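The general "write a temporary HTML file, then open it" pattern that previewNetworkInBrowser describes can be sketched with base R utilities. The HTML body below is a placeholder and the `interactive()` guard is an assumption added for illustration; the package's actual output and behaviour may differ.

```r
# Create a uniquely named temporary HTML file.
html_file <- tempfile(pattern = "network_preview_", fileext = ".html")

# Placeholder content standing in for the real network visualization.
writeLines(
  c(
    "<!DOCTYPE html>",
    "<html><body>",
    "<p>Network preview placeholder</p>",
    "</body></html>"
  ),
  con = html_file
)

# Open a browser only in interactive sessions so scripts and checks stay quiet.
if (interactive()) {
  utils::browseURL(html_file)
}

# Return the path invisibly, mirroring the documented return value.
invisible(html_file)
```

Files created with `tempfile()` live in the session's temporary directory and are removed when the R session ends, which suits a quick preview.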
Renders a Cytoscape network visualization within a Shiny application.
renderCytoscapeNetwork(expr, env = parent.frame())

| Argument | Description |
|---|---|
| expr | An expression that generates an HTML widget (or a promise of an HTML widget). |
| env | The environment in which to evaluate |

A rendered Cytoscape network widget for use in Shiny applications.
## Not run:
library(shiny)
library(MSstatsBioNet)

ui <- fluidPage(
  cytoscapeNetworkOutput("cytoNetwork")
)

server <- function(input, output, session) {
  output$cytoNetwork <- renderCytoscapeNetwork({
    nodes <- data.frame(
      id = c("TP53", "MDM2", "CDKN1A"),
      logFC = c(1.5, -0.8, 2.1),
      stringsAsFactors = FALSE
    )
    edges <- data.frame(
      source = c("TP53", "MDM2"),
      target = c("MDM2", "TP53"),
      interaction = c("Activation", "Inhibition"),
      stringsAsFactors = FALSE
    )
    cytoscapeNetwork(nodes, edges)
  })
}

shinyApp(ui, server)
## End(Not run)