Package 'MSstatsBioNet' reference manual

Title:	Network Analysis for MS-based Proteomics Experiments
Description:	A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques.
Authors:	Anthony Wu [aut, cre], Olga Vitek [aut]
Maintainer:	Anthony Wu <[email protected]>
License:	file LICENSE
Version:	0.99.9
Built:	2025-02-22 03:26:57 UTC
Source:	https://github.com/bioc/MSstatsBioNet

Populate HGNC IDs in Data Frame

Description

This function populates the HGNC IDs in the data frame based on the Uniprot IDs.

Usage

.populateHgncIdsInDataFrame(df)

Arguments

`df`	A data frame containing protein information.

Value

A data frame with populated HGNC IDs.

Populate HGNC Names in Data Frame

Description

This function populates the HGNC names in the data frame based on the HGNC IDs.

Usage

.populateHgncNamesInDataFrame(df)

Arguments

`df`	A data frame containing protein information.

Value

A data frame with populated HGNC names.

Populate Kinase Info in Data Frame

Description

This function populates the kinase information in the data frame based on the HGNC names.

Usage

.populateKinaseInfoInDataFrame(df)

Arguments

`df`	A data frame containing protein information.

Value

A data frame with populated kinase information.

Populate Phosphatase Info in Data Frame

Description

This function populates the phosphatase information in the data frame based on the HGNC names.

Usage

.populatePhophataseInfoInDataFrame(df)

Arguments

`df`	A data frame containing protein information.

Value

A data frame with populated phosphatase information.

Populate Transcription Factor Info in Data Frame

Description

This function populates the transcription factor information in the data frame based on the HGNC names.

Usage

.populateTranscriptionFactorInfoInDataFrame(df)

Arguments

`df`	A data frame containing protein information.

Value

A data frame with populated transcription factor information.

Populate Uniprot IDs in Data Frame

Description

This function populates the Uniprot IDs in the data frame based on the protein ID type.

Usage

.populateUniprotIdsInDataFrame(df, proteinIdType)

Arguments

`df`	A data frame containing protein information.
`proteinIdType`	A character string specifying the type of protein ID. It can be either "Uniprot" or "Uniprot_Mnemonic".

Value

A data frame with populated Uniprot IDs.

Validate Annotate Protein Info Input

Description

This function validates the input data frame for the annotateProteinInfoFromIndra function.

Usage

.validateAnnotateProteinInfoFromIndraInput(df)

Arguments

`df`	A data frame containing protein information.

Value

None. Throws an error if validation fails.
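
The error-on-failure contract described above can be sketched as follows. This is a minimal illustration only: `validate_protein_df` is a hypothetical helper, and the two checks it performs (input is a data frame, a `Protein` column exists) are assumptions based on the documented examples, not the package's actual validation rules.

```r
# Hypothetical validator; the real checks performed by
# .validateAnnotateProteinInfoFromIndraInput() may differ.
validate_protein_df <- function(df) {
    if (!is.data.frame(df)) {
        stop("Input must be a data frame.")
    }
    if (!"Protein" %in% colnames(df)) {
        stop("Input data frame must contain a 'Protein' column.")
    }
    # Nothing useful to return: success is signalled by the absence of an error.
    invisible(NULL)
}

# Valid input: returns NULL silently.
ok <- validate_protein_df(data.frame(Protein = "CLH1_HUMAN"))

# Invalid input: capture the error message instead of aborting.
msg <- tryCatch(
    validate_protein_df(data.frame(Gene = "CLTC")),
    error = function(e) conditionMessage(e)
)
```

Wrapping the call in `tryCatch` is one way for calling code to report a validation failure without stopping a longer pipeline.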

Annotate Protein Information from Indra

Description

This function annotates a data frame with protein information from INDRA.

Usage

annotateProteinInfoFromIndra(df, proteinIdType)

Arguments

`df`	output of `groupComparison` function's comparisonResult table, which contains a list of proteins and their corresponding p-values, logFCs, along with additional HGNC ID and HGNC name columns
`proteinIdType`	A character string specifying the type of protein ID. It can be either "Uniprot" or "Uniprot_Mnemonic".

Value

A data frame with the following columns:

Protein: Character. The original protein identifier.
UniprotID: Character. The Uniprot ID of the protein.
HgncID: Character. The HGNC ID of the protein.
HgncName: Character. The HGNC name of the protein.
IsTranscriptionFactor: Logical. Indicates if the protein is a transcription factor.
IsKinase: Logical. Indicates if the protein is a kinase.
IsPhosphatase: Logical. Indicates if the protein is a phosphatase.

Examples

df <- data.frame(Protein = c("CLH1_HUMAN"))
annotated_df <- annotateProteinInfoFromIndra(df, "Uniprot_Mnemonic")
head(annotated_df)
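
Once annotated, the logical columns listed under Value can be used to subset proteins by class. The sketch below builds a toy data frame with the documented columns instead of calling `annotateProteinInfoFromIndra`, so that it runs without network access; every identifier and flag in it is invented for illustration.

```r
# Toy stand-in for the output of annotateProteinInfoFromIndra().
# All identifiers and flags below are invented for illustration.
annotated_df <- data.frame(
    Protein = c("PROTA_HUMAN", "PROTB_HUMAN", "PROTC_HUMAN"),
    UniprotID = c("X00001", "X00002", "X00003"),
    HgncID = c("1", "2", "3"),
    HgncName = c("GENEA", "GENEB", "GENEC"),
    IsTranscriptionFactor = c(FALSE, TRUE, FALSE),
    IsKinase = c(TRUE, FALSE, FALSE),
    IsPhosphatase = c(FALSE, FALSE, TRUE),
    stringsAsFactors = FALSE
)

# Keep only proteins flagged as kinases. which() drops rows where the
# flag is NA instead of returning NA-filled rows.
kinases <- annotated_df[which(annotated_df$IsKinase), ]

# Count how many proteins fall into each annotation class.
class_counts <- colSums(
    annotated_df[, c("IsTranscriptionFactor", "IsKinase", "IsPhosphatase")],
    na.rm = TRUE
)
```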

Get subnetwork from INDRA database

Description

Using differential abundance results from MSstats, this function retrieves a subnetwork of protein interactions from the INDRA database.

Usage

getSubnetworkFromIndra(
  input,
  protein_level_data = NULL,
  pvalueCutoff = NULL,
  statement_types = c("IncreaseAmount", "DecreaseAmount"),
  paper_count_cutoff = 1,
  evidence_count_cutoff = 1,
  correlation_cutoff = 0.3
)

Arguments

`input`	output of `groupComparison` function's comparisonResult table, which contains a list of proteins and their corresponding p-values, logFCs, along with additional HGNC ID and HGNC name columns
`protein_level_data`	output of the `dataProcess` function's ProteinLevelData table, which contains a list of proteins and their corresponding abundances. Used for annotating correlation information and applying correlation cutoffs.
`pvalueCutoff`	p-value cutoff for filtering. Default is NULL, i.e. no filtering
`statement_types`	list of interaction types to filter on. Equivalent to statement type in INDRA. Default is c("IncreaseAmount", "DecreaseAmount").
`paper_count_cutoff`	number of papers to filter on. Default is 1.
`evidence_count_cutoff`	number of pieces of evidence to filter on for each paper. For example, one paper may contain 5 sentences describing the same interaction, while another contains only 1. Default is 1.
`correlation_cutoff`	if `protein_level_data` is not NULL, filter out edges whose correlation is less than this cutoff. Default is 0.3.

Value

A list of two data frames, `nodes` and `edges`.

Examples

input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
head(subnetwork$nodes)
head(subnetwork$edges)
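
The effect of `correlation_cutoff` can be illustrated on toy data without querying INDRA. The sketch below is an assumption-laden illustration, not the package's implementation: the column names `Protein`, `originalRUN` and `LogIntensities` mimic the MSstats ProteinLevelData table, the abundances and edges are invented, Pearson correlation is assumed, and the raw (not absolute) correlation is compared against the cutoff because that is how the argument description reads.

```r
# Toy protein-level table (invented values, assumed column names).
protein_level_data <- data.frame(
    Protein = rep(c("A", "B", "C"), each = 4),
    originalRUN = rep(paste0("run", 1:4), times = 3),
    LogIntensities = c(
        10, 11, 12, 13,  # protein A
        20, 22, 24, 26,  # protein B, rises together with A
        5, 3, 6, 2       # protein C, tends to move against A
    ),
    stringsAsFactors = FALSE
)

# Toy edge list between the proteins above.
edges <- data.frame(
    source = c("A", "A"),
    target = c("B", "C"),
    stringsAsFactors = FALSE
)

# Build a run-by-protein abundance matrix.
abundance <- with(
    protein_level_data,
    tapply(LogIntensities, list(originalRUN, Protein), mean)
)

# Pairwise Pearson correlations between proteins across runs.
cors <- cor(abundance, use = "pairwise.complete.obs")

# Look up the correlation for each edge by (source, target) name.
edges$correlation <- cors[cbind(edges$source, edges$target)]

# Keep edges whose correlation reaches the cutoff.
correlation_cutoff <- 0.3
filtered_edges <- edges[edges$correlation >= correlation_cutoff, ]
```

In this toy example the A-B edge survives and the A-C edge is dropped because its correlation is negative. Whether the package treats strong negative correlations as passing the cutoff is not stated in this manual.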

Create visualization of network

Description

Use results from INDRA to generate a visualization of a network in Cytoscape Desktop. Note that the Cytoscape Desktop app must be open for this function to work.

Usage

visualizeNetworks(
  nodes,
  edges,
  pvalueCutoff = 0.05,
  logfcCutoff = 0.5,
  node_label_column = "id",
  main_targets = c()
)

Arguments

`nodes`	data frame of nodes consisting of columns id (character), pvalue (number), logFC (number)
`edges`	dataframe of edges consisting of columns source (character), target (character), interaction (character), evidenceCount (number), evidenceLink (character)
`pvalueCutoff`	p-value cutoff for coloring significant proteins. Default is 0.05
`logfcCutoff`	log fold change cutoff for coloring significant proteins. Default is 0.5
`node_label_column`	The column of the nodes dataframe to use as the node label. Default is "id". "hgncName" can be used for gene name.
`main_targets`	character vector of main targets to highlight with a different node shape. Default is an empty vector c(). IDs of main targets should match the column used by the node_label_column parameter.

Value

A Cytoscape visualization of the subnetwork.

Examples

input <- data.table::fread(system.file(
    "extdata/groupComparisonModel.csv",
    package = "MSstatsBioNet"
))
subnetwork <- getSubnetworkFromIndra(input)
visualizeNetworks(subnetwork$nodes, subnetwork$edges)
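
The `nodes` and `edges` arguments can also be assembled by hand, provided they carry the columns listed under Arguments. The sketch below builds such tables from invented values and shows one plausible reading of how `pvalueCutoff` and `logfcCutoff` might separate significant from non-significant nodes. The rule used here (p-value below the cutoff and absolute logFC above the cutoff) is an assumption, as are the `status` column and the placeholder evidence URLs. The call to `visualizeNetworks` is left commented out because it needs a running Cytoscape Desktop session.

```r
# Hand-built node table with the documented columns (invented values).
nodes <- data.frame(
    id = c("P1", "P2", "P3"),
    pvalue = c(0.01, 0.20, 0.03),
    logFC = c(1.2, 2.0, -0.8),
    stringsAsFactors = FALSE
)

# Hand-built edge table with the documented columns (invented values,
# placeholder URLs).
edges <- data.frame(
    source = c("P1", "P3"),
    target = c("P2", "P1"),
    interaction = c("IncreaseAmount", "DecreaseAmount"),
    evidenceCount = c(3, 1),
    evidenceLink = c(
        "https://example.org/evidence/1",
        "https://example.org/evidence/2"
    ),
    stringsAsFactors = FALSE
)

# Assumed significance rule, using the documented default cutoffs.
pvalueCutoff <- 0.05
logfcCutoff <- 0.5
significant <- nodes$pvalue < pvalueCutoff & abs(nodes$logFC) > logfcCutoff

# Hypothetical helper column: direction of change for significant nodes.
nodes$status <- ifelse(
    !significant, "not significant",
    ifelse(nodes$logFC > 0, "up", "down")
)

# With Cytoscape Desktop open, the tables could then be passed on:
# visualizeNetworks(nodes, edges, pvalueCutoff = 0.05, logfcCutoff = 0.5)
```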
