| Title: | A phosphoproteomics data analysis package with an interactive ShinyApp |
|---|---|
| Description: | To facilitate and streamline phosphoproteomics data analysis, we developed SmartPhos, an R package for the pre-processing, quality control, and exploratory analysis of phosphoproteomics data generated by MaxQuant and Spectronaut. The package can be used either through the R command line or through an interactive ShinyApp called SmartPhos Explorer. The package contains methods such as normalization and normalization correction, transformation, imputation, batch effect correction, PCA, heatmap, differential expression, time-series clustering, gene set enrichment analysis, and kinase activity inference. |
| Authors: | Shubham Agrawal [aut, cre] (ORCID: <https://orcid.org/0009-0005-2630-9342>), Junyan Lu [aut] (ORCID: <https://orcid.org/0000-0002-9211-0746>) |
| Maintainer: | Shubham Agrawal <[email protected]> |
| License: | GPL-3 |
| Version: | 1.3.0 |
| Built: | 2026-05-30 09:39:02 UTC |
| Source: | https://github.com/bioc/SmartPhos |
addZeroTime adds a zero timepoint to a specific treatment's data
subset.
    addZeroTime(data, condition, treat, zeroTreat, timeRange)
data |
A |
condition |
|
treat |
|
zeroTreat |
|
timeRange |
|
The function performs the following steps:
Subsets the data for the specified treatment and time range.
Subsets the data for the zero timepoint of the specified zero treatment.
Combines the assays from the treatment and zero timepoint subsets.
Updates the column data to reflect the combined treatment.
Returns a SummarizedExperiment object with the combined data.
A SummarizedExperiment object with the zero timepoint added to
the specified treatment's data.
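The subsetting and combining steps above can be sketched on a plain matrix and a sample annotation table. This is a minimal base R illustration, not the package implementation: the helper `add_zero_time_sketch`, the annotation columns `treatment` and `timepoint`, and the zero label `"0min"` are all assumptions made for the example.

```r
# Hypothetical helper mirroring the listed steps on base R objects.
# `mat` is features x samples; `ann` has one row per sample, in the same order.
add_zero_time_sketch <- function(mat, ann, treat, zeroTreat, timeRange,
                                 zeroLabel = "0min") {
  # Samples of the chosen treatment within the requested time range
  keepTreat <- ann$treatment == treat & ann$timepoint %in% timeRange
  # Zero-timepoint samples of the treatment that supplies the baseline
  keepZero <- ann$treatment == zeroTreat & ann$timepoint == zeroLabel
  # Combine the two column subsets
  combined <- cbind(mat[, keepTreat, drop = FALSE],
                    mat[, keepZero, drop = FALSE])
  annOut <- rbind(ann[keepTreat, , drop = FALSE],
                  ann[keepZero, , drop = FALSE])
  # Relabel the borrowed baseline so that it belongs to `treat`
  annOut$treatment <- treat
  list(assay = combined, colData = annOut)
}

ann <- data.frame(
  treatment = c("EGF", "EGF", "ctrl", "ctrl"),
  timepoint = c("20min", "40min", "0min", "20min"),
  row.names = paste0("s", 1:4),
  stringsAsFactors = FALSE
)
mat <- matrix(1:8, nrow = 2, dimnames = list(c("f1", "f2"), rownames(ann)))
res <- add_zero_time_sketch(mat, ann, treat = "EGF", zeroTreat = "ctrl",
                            timeRange = c("20min", "40min"))
colnames(res$assay)
```

The control sample at a non-zero timepoint (`s4`) is left out, and the borrowed zero-timepoint sample is relabelled as belonging to the treatment.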
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    # Call the function
    addZeroTime(se, condition = "treatment", treat = "EGF",
                zeroTreat = "1stCrtl", timeRange = c("20min","40min", "6h"))
decoupleR
calcKinaseScore calculates kinase activity scores based on input data
and a specified network of regulatory relationships
(decoupler network).
    calcKinaseScore(
      resTab,
      decoupler_network,
      corrThreshold = 0.9,
      statType = c("stat", "log2FC"),
      nPerm = 100
    )
resTab |
A |
decoupler_network |
A |
corrThreshold |
A |
statType |
A |
nPerm |
A |
The function performs the following steps:
Removes duplicate rows based on the site column.
Filters the data to include only those sites present in the target
column of the decoupler network.
Prepares the input table based on the specified statType.
Intersects the input table with the decoupler network to find
common regulons.
Checks for correlated regulons and filters out those exceeding the correlation threshold.
Calculates kinase activity using a weighted mean approach.
Processes the results to handle NA values and formats the output.
A data frame with kinase activity scores, including columns
for 'source', 'score', and 'p_value'.
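The de-duplication, filtering, and weighted-mean steps can be illustrated in base R. The sketch below assumes that the weighted mean is `sum(mor * stat) / sum(abs(mor))` per kinase, which is one common definition and may differ from what the package computes; the correlation filter and the permutation-based p-values are not reproduced, and the toy values are invented for the example.

```r
# Toy site-level statistics (one site is duplicated on purpose)
resTab <- data.frame(
  site = c("EGFR_Y1172", "EGFR_Y1197", "EGFR_S1166", "EGFR_S1166",
           "GAB1_Y259", "ADD1_S586"),
  stat = c(-10, -6, 6, 6, 6, 5),
  stringsAsFactors = FALSE
)
# Toy kinase-substrate network with mode of regulation (mor)
network <- data.frame(
  source = c("ABL1", "ABL1", "ABL1", "CDK2", "CDK2"),
  target = c("EGFR_Y1172", "EGFR_Y1197", "EGFR_S1166",
             "GAB1_Y259", "ADD1_S586"),
  mor = 1,
  stringsAsFactors = FALSE
)

# Remove duplicated sites, then keep only sites present in the network
resTab <- resTab[!duplicated(resTab$site), ]
resTab <- resTab[resTab$site %in% network$target, ]

# Attach the statistic to every kinase-substrate edge
edges <- merge(network, resTab, by.x = "target", by.y = "site")

# Assumed weighted mean per kinase: sum(mor * stat) / sum(|mor|)
score <- vapply(split(edges, edges$source),
                function(e) sum(e$mor * e$stat) / sum(abs(e$mor)),
                numeric(1))
score
```

With all `mor` values equal to 1, the score reduces to the plain mean of the statistics of each kinase's substrates.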
    resTab <- data.frame(
      site = c("EGFR_Y1172", "EGFR_Y1197", "EGFR_S1166", "ROCK2_S1374",
               "WASL_Y256", "GAB1_Y259", "ADD1_S586", "EPHA2_Y772",
               "PRKDC_T2638", "PRKDC_T2609", "PRKDC_S2612"),
      stat = c(-10.038770, -5.945562, 5.773384, -7.303834, 5.585326,
               5.971104, 5.199119, -5.169500, 5.130228, 5.407387, 4.493933),
      log2FC = c(-2.6113343, -2.4858615, 1.0056629, -1.1561780, 1.6421145,
                 2.0296634, 1.3766283, -0.8531656, 1.0742881, 1.0042942,
                 1.0608129)
    )
    decoupler_network <- data.frame(
      source = c(rep("ABL1", 5), rep("CDK2", 6)),
      mor = c(rep(1, 11)),
      target = c("EGFR_Y1172", "EGFR_Y1197", "EGFR_S1166", "ROCK2_S1374",
                 "WASL_Y256", "GAB1_Y259", "ADD1_S586", "EPHA2_Y772",
                 "PRKDC_T2638", "PRKDC_T2609", "PRKDC_S2612"),
      likelihood = c(rep(1, 11))
    )
    # Call the function
    calcKinaseScore(resTab, decoupler_network)
checkRatioMat checks the ratio matrix for samples that do not
have sufficient overlap of phospho-peptides between enriched (PP) and
unenriched (FP) samples.
    checkRatioMat(ratioMat, minOverlap = 3)
ratioMat |
A numeric |
minOverlap |
A |
A character vector of sample names that do not meet the
overlap criteria.
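One way to picture the check is to count, per sample, how many phospho-peptides have a usable PP/FP ratio. The sketch below assumes that "overlap" means the number of non-missing entries in a column of the ratio matrix and that samples strictly below `minOverlap` are reported; both are assumptions, and `check_ratio_sketch` is a hypothetical helper rather than the package function.

```r
# Hypothetical re-implementation of the idea on a toy ratio matrix
check_ratio_sketch <- function(ratioMat, minOverlap = 3) {
  # Number of peptides with a non-missing ratio in each sample (column)
  nOverlap <- colSums(!is.na(ratioMat))
  # Names of the samples that fall below the required overlap
  names(nOverlap)[nOverlap < minOverlap]
}

ratioMat <- cbind(
  s1 = c(0.5, 1.2, NA, 0.8),
  s2 = c(NA, NA, 0.3, NA),
  s3 = c(1.1, NA, 0.9, 0.7)
)
flagged <- check_ratio_sketch(ratioMat, minOverlap = 3)
flagged
```

Here only `s2` is flagged, because it has a single non-missing ratio.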
clusterEnrich performs enrichment analysis on gene clusters, using
Fisher's Exact Test to determine the significance of enrichment for each
cluster.
    clusterEnrich(
      clusterTab,
      se,
      inputSet,
      reference = NULL,
      ptm = FALSE,
      adj = "BH",
      filterP = 0.05,
      ifFDR = FALSE
    )
clusterTab |
A |
se |
A |
inputSet |
A |
reference |
A |
ptm |
|
adj |
|
filterP |
|
ifFDR |
|
The function first retrieves or computes the reference set of genes or PTM
sites. It then performs enrichment analysis for each cluster using the
runFisher function.
The results are filtered based on the p-value threshold and adjusted for
multiple testing if ifFDR is TRUE. The function generates a
dot plot where the size and color of the points represent the significance
of enrichment.
A list containing two elements:
'table': A data frame with enrichment results for each
cluster and pathway.
'plot': A ggplot2 object showing the significance of
enrichment for each pathway across clusters.
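The core of a cluster-versus-pathway test is a 2x2 contingency table evaluated with Fisher's Exact Test. The sketch below shows this for one cluster and one pathway using `stats::fisher.test`; the gene names, the one-sided alternative, and the extra p-values passed to `p.adjust` are illustrative assumptions, and the internal `runFisher` function may be organised differently.

```r
reference <- paste0("G", 1:20)          # background: all measured genes
cluster   <- paste0("G", 1:6)           # genes assigned to one cluster
pathway   <- paste0("G", c(1:4, 15))    # genes annotated to one pathway

# Cross-tabulate cluster membership against pathway membership
inCluster <- factor(reference %in% cluster, levels = c(TRUE, FALSE))
inPathway <- factor(reference %in% pathway, levels = c(TRUE, FALSE))
tab <- table(inCluster, inPathway)

# One-sided test: is the pathway over-represented in the cluster?
ft <- fisher.test(tab, alternative = "greater")
ft$p.value

# Multiple-testing adjustment across all tested pairs (other p-values invented)
padj <- p.adjust(c(ft$p.value, 0.2, 0.8), method = "BH")
```

Four of the six cluster genes belong to the pathway, compared with one of the fourteen remaining genes, so the test reports a small p-value.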
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    seProcess <- preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
    result <- addZeroTime(seProcess, condition = "treatment", treat = "EGF",
                          zeroTreat = "1stCrtl", timeRange = c("20min","40min", "6h"))
    # Get the numeric matrix
    exprMat <- SummarizedExperiment::assay(result)
    # Call the clustering function
    clust <- clusterTS(x = exprMat, k = 3)
    genesetPath <- appDir <- system.file("shiny-app/geneset", package = "SmartPhos")
    inGMT <- piano::loadGSC(paste0(genesetPath, "/Cancer_Hallmark.gmt"),type="gmt")
    # Call the function
    clusterEnrich(clust$cluster, seProcess, inGMT)
clusterTS performs clustering on time-series data and generates plots
for visualization.
    clusterTS(x, k = 5, pCut = NULL, twoCondition = FALSE)
x |
A numeric |
k |
A |
pCut |
A |
twoCondition |
A |
The function performs the following steps:
Sets a seed for reproducibility.
Removes rows with missing values.
Performs clustering using fuzzy C-means.
Filters clusters based on the probability cutoff if provided.
Generates plots for visualizing clustering results.
A list containing:
cluster |
A |
plot |
A |
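To make the steps listed above concrete, here is a small hand-written fuzzy C-means loop in base R followed by a membership cutoff. It is a stand-in for illustration only: the function `fcm_sketch`, the fuzzifier `m = 2`, the fixed iteration count, and the seed value are assumptions, and the package most likely relies on an established clustering implementation instead.

```r
# Hypothetical minimal fuzzy C-means; returns the membership matrix
fcm_sketch <- function(x, k, m = 2, iter = 50, seed = 12345) {
  set.seed(seed)                                    # reproducible start
  x <- x[stats::complete.cases(x), , drop = FALSE]  # drop rows with NA
  centers <- x[sample(nrow(x), k), , drop = FALSE]  # random initial centers
  for (i in seq_len(iter)) {
    # Squared Euclidean distance of every row to every center (n x k)
    d2 <- sapply(seq_len(k),
                 function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- pmax(d2, 1e-12)                           # avoid division by zero
    # Membership update, normalised so that each row sums to 1
    u <- (1 / d2)^(1 / (m - 1))
    u <- u / rowSums(u)
    # Center update: membership-weighted mean of the rows
    w <- u^m
    centers <- (t(w) %*% x) / colSums(w)
  }
  rownames(u) <- rownames(x)
  u
}

x <- rbind(a = c(0, 1, 2), b = c(0.1, 1.1, 2.1), c = c(NA, 1, 2),
           d = c(2, 1, 0), e = c(2.1, 0.9, 0.1))
u <- fcm_sketch(x, k = 2)

# Probability cutoff: keep features whose best membership reaches pCut
pCut <- 0.6
kept <- rownames(u)[apply(u, 1, max) >= pCut]
```

Row `c` contains a missing value and is removed before clustering, so it never appears in the membership matrix.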
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    seProcess <- preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
    result <- addZeroTime(seProcess, condition = "treatment", treat = "EGF",
                          zeroTreat = "1stCrtl", timeRange = c("20min","40min", "6h"))
    # Get the numeric matrix
    exprMat <- assay(result)
    # Call the function
    clusterTS(x = exprMat, k = 3)
A sample of Data Dependent Acquisition (DDA) mass spectrometry data from MaxQuant.
    data(dda_example)
An S4 object of class MultiAssayExperiment.
A MultiAssayExperiment object containing a sample of DDA mass
spectrometry data from MaxQuant.
    data(dda_example)
A sample of Data Independent Acquisition (DIA) mass spectrometry data from Spectronaut.
    data(dia_example)
An S4 object of class MultiAssayExperiment.
A MultiAssayExperiment object containing a sample of DIA mass
spectrometry data from Spectronaut.
    data(dia_example)
enrichDifferential performs enrichment analysis on differentially
expressed genes and phospho-sites for either pathway or phospho-specific
enrichment, depending on the input parameters. It supports multiple
statistical methods such as PAGE and GSEA for pathway enrichment and a
Kolmogorov-Smirnov approach for phospho-enrichment.
    enrichDifferential(
      dea,
      type = c("Pathway enrichment", "Phospho-signature enrichment"),
      gsaMethod = c("PAGE", "GSEA"),
      geneSet,
      ptmSet,
      statType = c("stat", "log2FC"),
      nPerm = 100,
      sigLevel = 0.05,
      ifFDR = FALSE
    )
dea |
A |
type |
A |
gsaMethod |
A |
geneSet |
A gene set collection to use for pathway enrichment. |
ptmSet |
A post-translational modification (PTM) set database for phospho-enrichment analysis. |
statType |
A |
nPerm |
A |
sigLevel |
A |
ifFDR |
A |
The 'enrichDifferential' function performs either pathway enrichment or phospho-enrichment analysis based on the 'type' parameter. For pathway enrichment, it uses either the PAGE or GSEA method with a provided gene set collection. For phospho-enrichment, it uses a Kolmogorov-Smirnov test with a PTM set database. Results can be filtered by significance level and optionally adjusted for FDR.
A data frame containing the results of the enrichment analysis, including columns such as the gene set name, statistical significance, and adjusted p-values.
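The idea behind a Kolmogorov-Smirnov based enrichment can be shown with `stats::ks.test`: the statistics of the sites in a signature are compared with those of all other sites. The sketch below uses invented, deterministic values and a plain two-sample test; how the package ranks sites, handles direction, or computes permutation p-values is not reproduced here.

```r
# Toy per-site statistics: 50 background sites and 10 signature sites
stat <- c(seq(-2, 2, length.out = 50), seq(3, 4, length.out = 10))
names(stat) <- paste0("site", seq_along(stat))

# Hypothetical PTM signature made of the last ten sites
signature <- names(stat)[51:60]
inSet <- names(stat) %in% signature

# Two-sample KS test: signature sites versus all remaining sites
ks <- ks.test(stat[inSet], stat[!inSet])
ks$statistic
ks$p.value
```

Because every signature site has a larger statistic than every background site, the two empirical distributions do not overlap and the test strongly rejects equality.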
    library(SummarizedExperiment)
    library(piano)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    # Preprocess the phosphoproteome assay
    result <- preprocessPhos(se, normalize = TRUE)
    # Call the function to perform differential expression analysis
    dea <- performDifferentialExp(se = result, assay = "Intensity",
                                  method = "limma", reference = "1stCrtl",
                                  target = "EGF", condition = "treatment")
    # Load the gene set
    genesetPath <- appDir <- system.file("shiny-app/geneset", package = "SmartPhos")
    inGMT <- loadGSC(paste0(genesetPath,"/Cancer_Hallmark.gmt"),type="gmt")
    # Call the function
    resTab <- enrichDifferential(dea = dea$resDE, type = "Pathway enrichment",
                                 gsaMethod = "PAGE", geneSet = inGMT,
                                 statType = "stat", nPerm = 200,
                                 sigLevel = 0.05, ifFDR = FALSE)
generateInputTable generates an input table for proteomic and
phosphoproteomic analysis by reading files from a specified folder.
    generateInputTable(rawFolder, batchAsFolder = FALSE)
rawFolder |
A |
batchAsFolder |
A |
The function performs the following steps:
Optionally treats subdirectories as separate batches.
Reads the summary file containing experimental information.
Generates unique experimental IDs based on batch and sample names.
Processes file paths for full proteome and phosphoproteome data.
Creates a combined input table with file names, sample names, search types, batches, and IDs.
A data.frame with columns fileName, sample, searchType, batch,
and id that can be used as input for further analysis.
generateInputTable_DIA generates an input table for DIA analysis by
reading files from a specified folder.
    generateInputTable_DIA(rawFolder)
rawFolder |
A |
The function performs the following steps:
Reads the summary file containing experimental information.
Generates unique experimental IDs based on sample type, treatment, timepoint, and replicate.
Processes file paths for full proteome and phosphoproteome data.
Creates a combined input table with file names, search types, and IDs.
A data.frame with columns fileName, searchType, and id that
can be used as input for further analysis.
getDecouplerNetwork loads the kinase-substrate interaction network for
a specified species from pre-defined files.
    getDecouplerNetwork(speciesRef = c("Homo sapiens", "Mus musculus"))
speciesRef |
A |
A data frame containing the kinase-substrate interaction
network for the specified species.
    # Load the human kinase-substrate interaction network
    getDecouplerNetwork("Homo sapiens")
    # Load the mouse kinase-substrate interaction network
    getDecouplerNetwork("Mus musculus")
getOneSymbol extracts the last gene symbol from a semicolon-separated
list of gene symbols.
    getOneSymbol(Gene)
Gene |
A |
This function processes a character vector where each element consists of gene symbols separated by semicolons. It splits each element by semicolons and extracts the last gene symbol from the resulting list. The output is a character vector of these last gene symbols.
A character vector containing the last gene symbol from each
element of the input vector.
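The described behaviour (split on semicolons, keep the last piece) can be reproduced in a few lines of base R. The helper name `get_last_symbol` and the input values are made up for the illustration; this is a sketch of the idea rather than the package source.

```r
# Hypothetical equivalent: split each element on ";" and keep the last symbol
get_last_symbol <- function(Gene) {
  vapply(strsplit(Gene, ";", fixed = TRUE),
         function(parts) parts[length(parts)],
         character(1))
}

symbols <- get_last_symbol(c("EGFR;ERBB1", "TP53", "A;B;C"))
symbols
# returns "ERBB1" "TP53" "C"
```

Elements without a semicolon are returned unchanged, since the only symbol is also the last one.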
getRatioMatrix calculates the ratio matrix of phosphoproteome data
from a MultiAssayExperiment object.
    getRatioMatrix(maeData, normalization = FALSE, getAdjustedPP = FALSE)
maeData |
A |
normalization |
A |
getAdjustedPP |
A |
A numeric matrix representing the ratio of intensity of PP
(phosphoproteome) data to FP (full proteome) data.
    # Load multiAssayExperiment object
    data("dia_example")
    # Call the function
    getRatioMatrix(dia_example, normalization = TRUE)
A prior knowledge database about the known kinase-phosphosite interactions for Homo sapiens
    data(Homo_sapien_kinase_substrate_network)
a data.frame object
A data frame containing the information about the known
kinase-phosphosite interactions for Homo sapiens.
    data(Homo_sapien_kinase_substrate_network)
intensityBoxPlot creates a boxplot for the Intensity data of a given
gene or feature, with optional subject-specific lines.
    intensityBoxPlot(se, id, symbol)
se |
A |
id |
|
symbol |
|
This function generates a boxplot for the intensity data of a specified gene
or feature from a SummarizedExperiment (SE) object. The plot shows the
distribution of normalized intensities across different groups specified in
the comparison column of the SE object.
The function can handle both grouped data and repeated measures:
- If the SE object does not contain a subjectID column, the function
plots a standard boxplot grouped by the comparison column.
- If the SE object contains a subjectID column, the function adds
lines connecting the points for each subject across the groups, providing a
visual indication of subject-specific changes.
The boxplot is customized with various aesthetic elements, such as box
width, transparency, point size, axis labels, and title formatting.
A ggplot2 object representing the boxplot of the intensity
data.
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dda_example")
    # Get SummarizedExperiment object
    se <- dda_example[["Proteome"]]
    colData(se) <- colData(dda_example)
    # Preprocess the proteome assay
    result <- preprocessProteome(se, normalize = TRUE)
    # Call the function to perform differential expression analysis
    de <- performDifferentialExp(se = result, assay = "Intensity",
                                 method = "limma", reference = "1stCrtl",
                                 target = "EGF", condition = "treatment")
    # Plot the box plot for the given id and symbol
    intensityBoxPlot(de$seSub, "p99", "PPP6C")
makeSmartPhosDirectory creates a directory for the SmartPhos shiny
app, and copies the necessary Shiny app files into the newly created
directory.
    makeSmartPhosDirectory(path)
path |
A |
The function first creates the main directory at the specified path and a
subdirectory named "save" for storing MultiAssayExperiment objects.
It then locates the Shiny application files from the SmartPhos package and
copies them into the new directory.
None (invisible NULL). The function creates the necessary directories and copies files.
    makeSmartPhosDirectory("shinyApp")
medianNorm normalizes the columns of a matrix by either the median or
the mean.
    medianNorm(x, method = "median")
x |
A |
method |
A |
The function performs the following steps:
If the method is "median", it calculates the median of each
column and adjusts by the overall median of these medians.
If the method is "mean", it calculates the mean of each column
and adjusts by the overall mean of these means.
It constructs a matrix of these adjusted values and subtracts it from the original matrix to normalize the columns.
A numeric matrix with normalized columns.
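The arithmetic described in the steps above is small enough to write out in base R. The sketch below follows that description directly; `median_norm_sketch` is a hypothetical helper, and details such as the handling of missing values (`na.rm = TRUE`) are assumptions.

```r
# Hypothetical re-implementation of the described column normalization
median_norm_sketch <- function(x, method = "median") {
  if (method == "median") {
    center <- apply(x, 2, median, na.rm = TRUE)   # per-column medians
    shift <- center - median(center)              # relative to overall median
  } else {
    center <- colMeans(x, na.rm = TRUE)           # per-column means
    shift <- center - mean(center)                # relative to overall mean
  }
  # Repeat each column's shift down the rows, then subtract from the input
  x - matrix(shift, nrow = nrow(x), ncol = ncol(x), byrow = TRUE)
}

x <- cbind(a = c(1, 2, 3), b = c(11, 12, 13), c = c(21, 22, 23))
normed <- median_norm_sketch(x, method = "median")
apply(normed, 2, median)
```

After normalization all three columns share the same median (12, the median of the original column medians 2, 12, and 22), while the spread within each column is unchanged.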
    # Example usage:
    x <- matrix(rnorm(20), nrow=5, ncol=4)
    medianNorm(x, method = "median")
mscale scales and centers each row of a matrix, with options for using
mean or median, standard deviation or mean absolute deviation, and censoring
extreme values.
    mscale(x, center = TRUE, scale = TRUE, censor = NULL, useMad = FALSE)
x |
A |
center |
|
scale |
|
censor |
A |
useMad |
|
The function allows for flexible scaling and centering of the rows of a
matrix:
If both center and scale are TRUE, rows are
centered and scaled.
If only center is TRUE, rows are centered but not
scaled.
If only scale is TRUE, rows are scaled but not
centered.
If neither center nor scale is TRUE, the
original matrix is returned.
The function can also censor extreme values, either symmetrically or
asymmetrically, based on the censor parameter.
A scaled and centered numeric matrix with the same dimensions
as the input matrix 'x'.
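Row-wise centering, scaling, and censoring can be sketched as follows. This simplified version uses only the mean and standard deviation and supports only a symmetric censor value; the helper `row_scale_sketch` is hypothetical, and the median and mean-absolute-deviation variant and asymmetric censoring of `mscale` are not covered.

```r
# Hypothetical simplified row scaler (mean / sd only, symmetric censoring)
row_scale_sketch <- function(x, center = TRUE, scale = TRUE, censor = NULL) {
  out <- t(apply(x, 1, function(y) {
    if (center) y <- y - mean(y, na.rm = TRUE)   # center the row
    if (scale)  y <- y / sd(y, na.rm = TRUE)     # scale the row
    y
  }))
  if (!is.null(censor)) {
    # Clamp extreme values to the interval [-censor, censor]
    out[out > censor] <- censor
    out[out < -censor] <- -censor
  }
  out
}

sample_matrix <- matrix(c(1:15), nrow = 3, byrow = TRUE)
z <- row_scale_sketch(sample_matrix)
zCensored <- row_scale_sketch(sample_matrix, censor = 1)
range(zCensored)
```

Each row of `z` has mean 0 and standard deviation 1, and `zCensored` additionally limits the scaled values to the range from -1 to 1.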
    # Create a sample matrix (3 rows by 5 columns)
    sample_matrix <- matrix(c(1:15), nrow = 3, byrow = TRUE)
    # Scale and center the matrix using the default settings
    mscale(sample_matrix, center = TRUE, scale = TRUE)
    # Only center the matrix without scaling
    mscale(sample_matrix, center = TRUE, scale = FALSE)
    # Only scale the matrix without centering
    mscale(sample_matrix, center = FALSE, scale = TRUE)
A prior knowledge database about the known kinase-phosphosite interactions for Mus musculus
    data(Mus_musculus_kinase_substrate_network)
a data.frame object
A data frame containing the information about the known
kinase-phosphosite interactions for Mus musculus.
    data(Mus_musculus_kinase_substrate_network)
normByFullProteome normalizes the phosphoproteome data by the
corresponding full proteome data in a MultiAssayExperiment object.
The "Phosphoproteome" assay
in the MultiAssayExperiment will be replaced by the ratio.
    normByFullProteome(mae, replace = TRUE)
mae |
A |
replace |
|
The function performs the following steps:
Checks if both phosphoproteome and proteome assays are present in
the MultiAssayExperiment object.
Extracts the phosphoproteome and proteome assays along with the sample annotations.
Matches the samples between the phosphoproteome and proteome assays.
Normalizes the phosphoproteome data by dividing it by the corresponding proteome data.
Replaces the phosphoproteome assay in the MultiAssayExperiment
object or adds the normalized data as a new assay, depending on the
replace parameter.
A MultiAssayExperiment object with the normalized
phosphoproteome data.
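The sample-matching and division steps can be illustrated with two small matrices. The sketch assumes that the full-proteome rows have already been mapped to the phosphosites (the real mapping from sites to proteins is not shown) and that the values are on a linear scale; all names and numbers are invented.

```r
# Toy phosphoproteome (PP) and matched full proteome (FP) intensities
pp <- matrix(c(8, 4, 6, 9), nrow = 2,
             dimnames = list(c("site1", "site2"), c("A", "B")))
fp <- matrix(c(3, 2, 2, 4), nrow = 2,
             dimnames = list(c("site1", "site2"), c("B", "A")))

# Match samples by name so that columns line up regardless of their order
shared <- intersect(colnames(pp), colnames(fp))

# Normalize: phospho intensity divided by the corresponding protein intensity
ratio <- pp[, shared, drop = FALSE] / fp[, shared, drop = FALSE]
ratio
```

Indexing both matrices by `shared` is what keeps sample `A` from being divided by sample `B` when the column orders differ.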
    # load mae object
    data("dia_example")
    # call the function
    normByFullProteome(dia_example)
performCombinedNormalization performs combined normalization on
proteome and phosphoproteome data from a MultiAssayExperiment object.
    performCombinedNormalization(maeData)
maeData |
A |
The function performs the following steps:
Extracts the count matrices for Full Proteome (FP) samples.
Combines the proteome and phosphoproteome data into a single matrix.
Removes rows with all NA values.
Performs median normalization and log2 transformation on the combined matrix.
A numeric matrix with normalized and log2-transformed data.
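The combine, filter, and normalize steps can be sketched on toy matrices. The order used here (log2 transformation first, then median centering of columns) and the toy data are assumptions made for the illustration and may not match the package internals.

```r
# Toy proteome and phosphoproteome intensities for the same two samples
fp <- matrix(c(100, 200, NA, 400, 800, NA), nrow = 3,
             dimnames = list(paste0("prot", 1:3), c("s1", "s2")))
pp <- matrix(c(10, 20, 40, 80), nrow = 2,
             dimnames = list(paste0("site", 1:2), c("s1", "s2")))

# Combine both data types into a single matrix
combined <- rbind(fp, pp)

# Remove rows in which every value is missing
combined <- combined[rowSums(!is.na(combined)) > 0, , drop = FALSE]

# log2-transform, then shift each column so that the column medians agree
logMat <- log2(combined)
colMed <- apply(logMat, 2, median, na.rm = TRUE)
normed <- sweep(logMat, 2, colMed - median(colMed))
normed
```

Sample `s2` is four times as intense as `s1` throughout, so after normalization the two columns become identical.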
    # Load multiAssayExperiment object
    data("dia_example")
    # Call the function
    performCombinedNormalization(dia_example)
performDifferentialExp performs differential expression analysis on a
given SummarizedExperiment object using either the limma or
ProDA method.
    performDifferentialExp(
      se,
      assay,
      method = c("limma", "ProDA"),
      condition = NULL,
      reference,
      target,
      refTime = NULL,
      targetTime = NULL,
      pairedTtest = FALSE
    )
se |
A |
assay |
A |
method |
A |
condition |
A |
reference |
A |
target |
A |
refTime |
A |
targetTime |
A |
pairedTtest |
A |
This function is designed to facilitate differential expression analysis on
a SummarizedExperiment (SE) object. The function allows users to
specify various parameters to tailor the analysis to their specific
experimental setup.
The main steps of the function are as follows:
1. Sample Selection: Based on the provided condition,
reference, and target arguments, the function identifies the
relevant samples for the analysis. If time points (refTime and
targetTime) are provided, it further refines the sample selection.
2. Subsetting the SE Object: The SE object is subsetted to include only the
selected samples. A new column comparison is added to the
colData, indicating whether each sample belongs to the reference or
target group.
3. Design Matrix Construction: The function constructs a design matrix for
the differential expression analysis. If the SE object contains a
subjectID column, this is included in the design to account for
repeated measures or paired samples.
4. Differential Expression Analysis: Depending on the specified
method, the function performs the differential expression analysis
using either the limma or ProDA package:
- Limma: The function fits a linear model to the expression data
and applies empirical Bayes moderation to the standard errors. The
results are then extracted and formatted.
- ProDA: The function fits a probabilistic dropout model to the
expression data and tests for differential expression. The results are
then extracted and formatted.
5. Result Formatting: The differential expression results are merged with the metadata from the SE object, and the resulting table is formatted into a tibble. The table includes columns for log2 fold change (log2FC), test statistic (stat), p-value (pvalue), adjusted p-value (padj), and gene/feature ID (ID).
The function returns a list containing the formatted differential
expression results and the subsetted SE object. This allows users to further
explore or visualize the results as needed.
A list containing:
resDE |
A |
seSub |
A |
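The design matrix step can be illustrated with `stats::model.matrix`. The sketch below builds a toy sample table with a `comparison` column and a `subjectID` column and includes the subject term only when that column exists; the exact formula used inside the package is an assumption.

```r
# Toy sample annotation: two subjects, each measured in both groups
colData <- data.frame(
  comparison = factor(c("reference", "target", "reference", "target"),
                      levels = c("reference", "target")),
  subjectID = factor(c("p1", "p1", "p2", "p2"))
)

# Include subjectID as a blocking factor when it is available
design <- if ("subjectID" %in% colnames(colData)) {
  model.matrix(~ subjectID + comparison, data = colData)
} else {
  model.matrix(~ comparison, data = colData)
}
colnames(design)
```

The coefficient `comparisontarget` then represents the target-versus-reference effect after accounting for subject. With limma, such a matrix is typically passed to `lmFit()`, followed by `eBayes()` and `topTable()`.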
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dda_example")
    # Get SummarizedExperiment object
    se <- dda_example[["Proteome"]]
    colData(se) <- colData(dda_example)
    # Preprocess the proteome assay
    result <- preprocessProteome(se, normalize = TRUE)
    # Call the function to perform differential expression analysis
    performDifferentialExp(se = result, assay = "Intensity", method = "limma",
                           reference = "1stCrtl", target = "EGF",
                           condition = "treatment")
plotAdjustmentResults generates plots to visualize the results of
phosphoproteome adjustment.
    plotAdjustmentResults(maeData, normalization = FALSE)
maeData |
A |
normalization |
A |
The function performs the following steps:
Checks if the adjustment factor is present in the sample annotation.
Calculates the ratio matrix before and after adjustment.
Creates a trend line plot for features present in all samples.
Creates box plots of the PP/FP ratios and phosphorylation intensities before and after adjustment.
A list containing:
ratioTrendPlot |
A |
ratioBoxplot |
A |
ppBoxplot |
A |
plotHeatmap generates a heatmap for intensity assay for different
conditions, including top variants, differentially expressed genes, and
selected time series clusters.
    plotHeatmap(
      type = c("Top variant", "Differentially expressed", "Selected time series cluster"),
      se,
      data = NULL,
      top = 100,
      cutCol = 1,
      cutRow = 1,
      clustCol = TRUE,
      clustRow = TRUE,
      annotationCol = NULL,
      title = NULL
    )
type |
A |
se |
A |
data |
An optional |
top |
A |
cutCol |
A |
cutRow |
A |
clustCol |
A |
clustRow |
A |
annotationCol |
A |
title |
A |
This function creates a heatmap using the Intensity assay from a
SummarizedExperiment object. The heatmap can show the top variants
based on standard deviation, differentially expressed genes, or selected time
series clusters. Row normalization is performed, and the heatmap can include
annotations based on specified metadata columns.
A pheatmap object showing the heatmap of Intensity data.
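Selecting the top variants by standard deviation and normalizing rows, as described above, can be sketched in base R before any plotting. The matrix values are invented, and row normalization is assumed here to mean centering and scaling each row to unit standard deviation.

```r
# Toy intensity matrix: four features with increasing spread
mat <- rbind(f1 = 1:5, f2 = (1:5) * 0.1, f3 = (1:5) * 10, f4 = (1:5) * 2)

# Rank features by standard deviation and keep the `top` most variable ones
top <- 2
sds <- apply(mat, 1, sd, na.rm = TRUE)
topMat <- mat[order(sds, decreasing = TRUE)[seq_len(top)], , drop = FALSE]

# Row normalization: center and scale every selected feature
scaled <- t(scale(t(topMat)))
rownames(scaled)
```

The resulting matrix is what a heatmap function would receive; here it contains the two most variable features, `f3` and `f4`.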
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    # Generate the imputed assay
    result <- preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
    # Plot heatmap for top variant
    plotHeatmap(type = "Top variant", top = 10, se = result, cutCol = 2)
plotIntensity generates boxplots of assay intensities for each sample
in a SummarizedExperiment object. Optionally, the boxplots can be
colored based on a specified metadata column. The function handles missing
values by filtering them out before plotting.
    plotIntensity(se, colorByCol = "none")
se |
A |
colorByCol |
A |
A ggplot2 object showing boxplots of intensities for each
sample.
    library(SummarizedExperiment)
    # Load multiAssayExperiment object
    data("dia_example")
    # Get SummarizedExperiment object
    se <- dia_example[["Phosphoproteome"]]
    colData(se) <- colData(dia_example)
    # Preprocess the phosphoproteome assay
    result <- preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
    # Call the plotting function
    plotIntensity(result, colorByCol = "replicate")
plotKinaseDE generates a bar plot of the top kinases associated
with the differentially expressed genes based on their scores.
plotKinaseDE(scoreTab, nTop = 10, pCut = 0.05)
scoreTab |
A |
nTop |
A |
pCut |
A |
The function performs the following steps:
Adds a column for significance based on the p-value cutoff.
Adds a column for the sign of the score.
Filters out kinases with a score of 0.
Selects the top nTop kinases by absolute score for each sign
of the score.
Creates a bar plot with the selected kinases.
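The selection logic in these steps can be sketched in base R. This is an illustrative sketch, not the package's implementation: the helper name `selectTopKinases`, the added columns `significance` and `sign`, and the use of `<=` for the p-value cutoff are assumptions, while `source`, `score`, and `p_value` follow the column names used in the package example.

```r
# Hypothetical helper mirroring the documented steps (not the package code)
selectTopKinases <- function(scoreTab, nTop = 10, pCut = 0.05) {
  # Flag significance using the p-value cutoff (assumed to be inclusive)
  scoreTab$significance <- ifelse(scoreTab$p_value <= pCut,
                                  paste0("p <= ", pCut),
                                  paste0("p > ", pCut))
  # Record the sign of each score
  scoreTab$sign <- sign(scoreTab$score)
  # Drop kinases with a score of exactly 0
  scoreTab <- scoreTab[scoreTab$score != 0, , drop = FALSE]
  # Keep the nTop kinases with the largest absolute score within each sign
  bySign <- split(scoreTab, scoreTab$sign)
  top <- lapply(bySign, function(d) {
    d <- d[order(abs(d$score), decreasing = TRUE), , drop = FALSE]
    head(d, nTop)
  })
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

scoreTab <- data.frame(
  source = c("Kinase1", "Kinase2", "Kinase3", "Kinase4"),
  score = c(2.3, -1.5, 0, 3.1),
  p_value = c(0.01, 0.2, 0.05, 0.03)
)
topTab <- selectTopKinases(scoreTab, nTop = 1)
```

With `nTop = 1`, one kinase is kept per direction, so a table like `topTab` would hold at most two rows before plotting.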
A ggplot2 object representing the bar plot of kinase score.
# Example usage:
scoreTab <- data.frame(
  source = c("Kinase1", "Kinase2", "Kinase3", "Kinase4"),
  score = c(2.3, -1.5, 0, 3.1),
  p_value = c(0.01, 0.2, 0.05, 0.03)
)
plotKinaseDE(scoreTab, nTop = 3, pCut = 0.05)
plotKinaseTimeSeries creates a heatmap to visualize the result of
kinase activity inference for time-series clustering, with significant
activity changes marked.
plotKinaseTimeSeries(scoreTab, pCut = 0.05, clusterName = "cluster1")
scoreTab |
A |
pCut |
A |
clusterName |
A |
The heatmap shows kinase activity scores over different time points. Significant activities (based on the specified p-value threshold) are marked with an asterisk (*). The color gradient represents the activity score, with blue indicating low activity, red indicating high activity, and white as the midpoint.
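The asterisk marking can be illustrated with a small base R sketch. The `label` column and the inclusive `<=` comparison are assumptions made for illustration; the other column names follow the package example.

```r
# Toy kinase activity table; column names follow the package example
scoreTab <- data.frame(
  timepoint = c("0h", "1h", "2h"),
  source = "KinaseA",
  score = c(-1.2, 0.4, 1.8),
  p_value = c(0.01, 0.30, 0.04)
)
pCut <- 0.05
# Assumed rule: mark entries whose p-value is at or below the cutoff
scoreTab$label <- ifelse(scoreTab$p_value <= pCut, "*", "")
```

A label column of this kind could then be drawn on top of the tiles, for example with `ggplot2::geom_text()` over `ggplot2::geom_tile()` and a diverging fill such as `ggplot2::scale_fill_gradient2(low = "blue", mid = "white", high = "red")`; whether the package uses exactly these layers is not stated here.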
A ggplot2 object representing the heatmap of kinase activity
score.
# Example usage:
scoreTab <- data.frame(
  timepoint = rep(c("0h", "1h", "2h"), each = 3),
  source = rep(c("KinaseA", "KinaseB", "KinaseC"), times = 3),
  score = runif(9, -2, 2),
  p_value = runif(9, 0, 0.1)
)
plotKinaseTimeSeries(scoreTab)
plotLogRatio generates a boxplot of the log2 ratio of intensities of
phosphoproteome to full proteome data from a MultiAssayExperiment
object.
plotLogRatio(maeData, normalization = FALSE)
maeData |
A |
normalization |
A |
A ggplot2 object representing the boxplot of the log2 ratios.
# Load multiAssayExperiment object
data("dia_example")
# Call the function
plotLogRatio(dia_example, normalization = TRUE)
plotMissing generates a bar plot showing the completeness (percentage
of non-missing values) for each sample in a SummarizedExperiment
object.
plotMissing(se)
se |
A |
This function calculates the percentage of non-missing values for each sample
in the provided SummarizedExperiment object. It then generates a bar
plot where each bar represents a sample, and the height of the bar
corresponds to the completeness (percentage of non-missing values) of that
sample.
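The completeness calculation described above amounts to a per-column mean of non-missing indicators. A minimal base R sketch on a toy matrix follows; for a real object the matrix would presumably come from `SummarizedExperiment::assay(se)`, and the names `mat` and `completeness` are illustrative only.

```r
# Toy intensity matrix: rows are features, columns are samples
mat <- matrix(c(1, NA, 3, 4,
                NA, NA, 2, 5,
                7, 8, 9, 1),
              nrow = 4,
              dimnames = list(paste0("site", 1:4), c("s1", "s2", "s3")))
# Percentage of non-missing values per sample
completeness <- colMeans(!is.na(mat)) * 100
```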
A ggplot2 object showing the percentage of completeness for
each sample.
library(SummarizedExperiment)
# Load multiAssayExperiment object
data("dda_example")
# Get SummarizedExperiment object
se <- dda_example[["Phosphoproteome"]]
colData(se) <- colData(dda_example)
# Call the function
plotMissing(se)
plotPCA generates a PCA plot using the results from a PCA analysis
and a SummarizedExperiment object. The points on the plot can be
colored and shaped based on metadata.
plotPCA(pca, se, xaxis = "PC1", yaxis = "PC2", color = "none", shape = "none")
pca |
A PCA result object, typically obtained from |
se |
A |
xaxis |
A |
yaxis |
A |
color |
A |
shape |
A |
This function creates a PCA plot using the scores from a PCA result object
and metadata from a SummarizedExperiment object. The x-axis and y-axis
can be customized to display different principal components, and the points
can be optionally colored and shaped based on specified metadata columns.
A ggplot2 object showing the PCA plot.
# Load multiAssayExperiment object
data("dia_example")
# Get SummarizedExperiment object
se <- dia_example[["Phosphoproteome"]]
SummarizedExperiment::colData(se) <- SummarizedExperiment::colData(dia_example)
# Generate the imputed assay
result <- preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
# Perform PCA
pcaResult <- stats::prcomp(
  t(SummarizedExperiment::assays(result)[["imputed"]]),
  center = TRUE, scale. = TRUE
)
# Plot PCA results
plotPCA(pca = pcaResult, se = result, color = "treatment")
plotTimeSeries plots time series data for a given gene or phospho site
from a given SummarizedExperiment object, allowing different types of
plots such as expression, log fold change, or two-condition expression.
plotTimeSeries(
  se,
  type = c("expression", "logFC", "two-condition expression"),
  geneID,
  symbol,
  condition,
  treatment,
  refTreat,
  addZero = FALSE,
  zeroTreat = NULL,
  timerange
)
se |
A |
type |
|
geneID |
|
symbol |
|
condition |
|
treatment |
|
refTreat |
|
addZero |
|
zeroTreat |
|
timerange |
|
This function generates time series plots for a specified gene or feature
from a SummarizedExperiment (SE) object. The type of plot can be one
of the following:
- "expression": Plots normalized expression levels over time.
- "logFC": Plots log fold change (logFC) over time, comparing a treatment to
a reference treatment.
- "two-condition expression": Plots normalized expression levels over time
for two conditions.
The function can add a zero time point if specified and handles data with and without subject-specific information. The plot includes points for each time point and a summary line representing the mean value.
The x-axis represents time, and the y-axis represents the selected metric (normalized expression or logFC). The plot is customized with various aesthetic elements, such as point size, line type, axis labels, and title formatting.
A ggplot2 object representing the time series plot.
library(SummarizedExperiment)
# Load multiAssayExperiment object
data("dda_example")
# Get SummarizedExperiment object
se <- dda_example[["Proteome"]]
colData(se) <- colData(dda_example)
# Preprocess the proteome assay
result <- preprocessProteome(se, normalize = TRUE)
# Plot the expression of a specific gene over time
timerange <- unique(se$timepoint)
plotTimeSeries(result, type = "expression", geneID = "p18",
               symbol = "TMEM238", condition = "treatment",
               treatment = "EGF", timerange = timerange)
plotVolcano generates a volcano plot to visualize differential
expression results.
plotVolcano(tableDE, pFilter = 0.05, fcFilter = 0.5)
tableDE |
A |
pFilter |
A |
fcFilter |
A |
This function creates a volcano plot where differentially expressed genes are categorized as 'Up', 'Down', or 'Not Sig' based on the provided p-value and log2 fold-change thresholds. Points on the plot are color-coded to indicate their expression status.
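The categorization can be sketched as follows. This is a hedged illustration rather than the package's code: the helper name `classifyDE` and the column names `log2FC` and `pvalue` are assumptions, as is the choice of inclusive comparisons at the thresholds.

```r
# Hypothetical helper; column names log2FC and pvalue are assumptions
classifyDE <- function(tableDE, pFilter = 0.05, fcFilter = 0.5) {
  status <- rep("Not Sig", nrow(tableDE))
  sig <- tableDE$pvalue <= pFilter
  # Significant and above the positive fold-change threshold
  status[sig & tableDE$log2FC >= fcFilter] <- "Up"
  # Significant and below the negative fold-change threshold
  status[sig & tableDE$log2FC <= -fcFilter] <- "Down"
  factor(status, levels = c("Up", "Down", "Not Sig"))
}

tableDE <- data.frame(
  ID = c("g1", "g2", "g3", "g4"),
  log2FC = c(1.2, -0.8, 0.1, 2.0),
  pvalue = c(0.001, 0.01, 0.0001, 0.3)
)
tableDE$expression <- classifyDE(tableDE)
```

Note that `g3` is highly significant but has a small fold change, and `g4` has a large fold change but is not significant, so both fall into 'Not Sig' under this rule.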
A ggplot2 object representing the volcano plot.
library(SummarizedExperiment)
# Load multiAssayExperiment object
data("dda_example")
# Get SummarizedExperiment object
se <- dda_example[["Proteome"]]
colData(se) <- colData(dda_example)
# Preprocess the proteome assay
result <- preprocessProteome(se, normalize = TRUE)
# Call the function to perform differential expression analysis
de <- performDifferentialExp(se = result, assay = "Intensity",
                             method = "limma", reference = "1stCrtl",
                             target = "EGF", condition = "treatment")
# Plot the volcano plot from the result
plotVolcano(de$resDE)
preprocessPhos preprocesses phosphoproteome data stored in a
SummarizedExperiment object by performing filtering, transformation,
normalization, imputation, and batch effect removal.
preprocessPhos(
  seData,
  filterList = NULL,
  missCut = 50,
  transform = c("log2", "vst", "none"),
  normalize = FALSE,
  getFP = FALSE,
  removeOutlier = NULL,
  assayName = NULL,
  batch = NULL,
  scaleFactorTab = NULL,
  impute = c("none", "QRILC", "MLE", "bpca", "missForest", "MinDet"),
  verbose = FALSE
)
seData |
A |
filterList |
A |
missCut |
|
transform |
|
normalize |
|
getFP |
|
removeOutlier |
|
assayName |
|
batch |
|
scaleFactorTab |
|
impute |
|
verbose |
|
A SummarizedExperiment object with preprocessed
phosphoproteome data.
library(SummarizedExperiment)
# Load multiAssayExperiment object
data("dia_example")
# Get SummarizedExperiment object
se <- dia_example[["Phosphoproteome"]]
colData(se) <- colData(dia_example)
# Call the function
preprocessPhos(seData = se, normalize = TRUE, impute = "QRILC")
preprocessProteome preprocesses proteome data stored in a
SummarizedExperiment object by performing filtering, transformation,
normalization, imputation, and batch effect removal.
preprocessProteome(
  seData,
  filterList = NULL,
  missCut = 50,
  transform = c("log2", "vst", "none"),
  normalize = FALSE,
  getPP = FALSE,
  removeOutlier = NULL,
  impute = c("none", "QRILC", "MLE", "bpca", "missForest", "MinDet"),
  batch = NULL,
  verbose = FALSE,
  scaleFactorTab = NULL
)
seData |
A |
filterList |
A |
missCut |
|
transform |
|
normalize |
|
getPP |
|
removeOutlier |
|
impute |
|
batch |
|
verbose |
|
scaleFactorTab |
|
A SummarizedExperiment object with preprocessed proteome data.
library(SummarizedExperiment)
# Load multiAssayExperiment object
data("dia_example")
# Get SummarizedExperiment object
se <- dia_example[["Proteome"]]
colData(se) <- colData(dia_example)
# Call the function
preprocessProteome(seData = se, normalize = TRUE, impute = "QRILC")
readExperiment reads and processes DDA (Data-Dependent Acquisition)
phosphoproteomic and proteomic data from a given file table, and returns a
MultiAssayExperiment object.
readExperiment(
  fileTable,
  localProbCut = 0.75,
  scoreDiffCut = 5,
  fdrCut = 0.1,
  scoreCut = 10,
  pepNumCut = 1,
  ifLFQ = TRUE,
  annotation_col = c(),
  verbose = FALSE
)
fileTable |
A |
localProbCut |
|
scoreDiffCut |
|
fdrCut |
|
scoreCut |
|
pepNumCut |
|
ifLFQ |
|
annotation_col |
A |
verbose |
|
The function performs the following steps:
Reads and processes the phosphoproteomic data using the
readPhosphoExperiment function.
Reads and processes the proteomic data using the
readProteomeExperiment function.
Prepares the sample annotation table.
Constructs and returns a MultiAssayExperiment object
containing the processed data.
A MultiAssayExperiment object containing the processed
phosphoproteomic and proteomic data from a DDA experiment.
# Example usage:
file1 <- system.file("extdata", "phosDDA_1.xls", package = "SmartPhos")
file2 <- system.file("extdata", "proteomeDDA_1.xls", package = "SmartPhos")
# Create fileTable
fileTable <- data.frame(
  searchType = c("phosphoproteome", "proteome"),
  fileName = c(file1, file2),
  sample = c("Sample1", "sample1"),
  id = c("s1", "s2")
)
# Call the function
readExperiment(fileTable, localProbCut = 0.75, scoreDiffCut = 5,
               fdrCut = 0.1, scoreCut = 10, pepNumCut = 1,
               ifLFQ = TRUE, annotation_col = c("id"))
readExperimentDIA reads and processes DIA (Data-Independent
Acquisition) data for both phosphoproteome and proteome experiments, and
constructs a MultiAssayExperiment object.
readExperimentDIA(
  fileTable,
  localProbCut = 0.75,
  annotation_col = c(),
  onlyReviewed = TRUE,
  normalizeByProtein = FALSE,
  verbose = FALSE
)
fileTable |
A |
localProbCut |
|
annotation_col |
A |
onlyReviewed |
A |
normalizeByProtein |
|
verbose |
|
The function performs the following steps:
Reads and processes phosphoproteomic data using
readPhosphoExperimentDIA.
Reads and processes proteomic data using
readProteomeExperimentDIA.
Prepares sample annotations based on the provided fileTable and annotation_col.
Constructs a MultiAssayExperiment object with the processed
data and sample annotations.
The readPhosphoExperimentDIA and readProteomeExperimentDIA
functions are used to read and filter the data for phosphoproteome and
proteome experiments, respectively, and they must be available in the
environment.
A MultiAssayExperiment object containing the processed
phosphoproteome and proteome data.
# Example usage:
file1 <- system.file("extdata", "phosDIA_1.xls", package = "SmartPhos")
file2 <- system.file("extdata", "proteomeDIA_1.xls", package = "SmartPhos")
# Create fileTable
fileTable <- data.frame(
  searchType = c("phosphoproteome", "proteome", "proteome"),
  fileName = c(file1, file2, file2),
  id = c("Sample_1", "sample1", "sample2"),
  outputID = c("s1", "s2", "s3")
)
# Call the function
readExperimentDIA(fileTable, localProbCut = 0.75, annotation_col = c("id"),
                  onlyReviewed = FALSE, normalizeByProtein = FALSE)
readOnePhos reads phosphorylation data from an input table, filters it
based on localization probability, score difference, and intensity, and
returns the filtered data for a specific sample.
readOnePhos(
  inputTab,
  sampleName,
  localProbCut = 0.75,
  scoreDiffCut = 5,
  multiMap
)
inputTab |
A |
sampleName |
A |
localProbCut |
A |
scoreDiffCut |
A |
multiMap |
A |
The function filters the input phosphorylation data based on three criteria: localization probability, score difference, and intensity. Only rows that meet or exceed the specified cutoffs for these criteria and have non-zero intensity are retained. The filtered data is then returned with a unique identifier for each row.
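The three filtering criteria can be illustrated on a toy table. This is a sketch under stated assumptions, not the package's implementation: the helper `filterPhosSites`, the column names (`protein`, `residue`, `position`, `localProb`, `scoreDiff`, `intensity`), and the identifier format are all hypothetical, since the actual input columns depend on the search-engine output.

```r
# Hypothetical helper; all column names are illustrative assumptions
filterPhosSites <- function(tab, localProbCut = 0.75, scoreDiffCut = 5) {
  keep <- tab$intensity > 0 &
    tab$localProb >= localProbCut &
    tab$scoreDiff >= scoreDiffCut
  # Treat rows with missing values in any criterion as failing the filter
  keep[is.na(keep)] <- FALSE
  out <- tab[keep, , drop = FALSE]
  # Assumed identifier format: protein, then residue and position
  out$rowName <- paste0(out$protein, "_", out$residue, out$position)
  out
}

tab <- data.frame(
  protein = c("P1", "P2", "P3", "P4"),
  residue = c("S", "T", "Y", "S"),
  position = c(10, 25, 7, 99),
  localProb = c(0.90, 0.60, 0.80, 0.99),
  scoreDiff = c(12, 20, 3, 8),
  intensity = c(1000, 500, 800, 0)
)
filtered <- filterPhosSites(tab)
```

In this toy input only the first site survives: the second fails the localization probability cutoff, the third fails the score difference cutoff, and the fourth has zero intensity.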
A data.frame containing the filtered phosphorylation data for
the specified sample, with columns for intensity, Uniprot ID, gene name,
position within proteins, amino acid residue, and sequence window.
readOnePhosDIA reads and processes phosphorylation data for a single
sample from a DIA experiment, applying filters for localization probability
and removing duplicates if specified.
readOnePhosDIA(inputTab, sampleName, localProbCut = 0.75, removeDup = FALSE)
inputTab |
A |
sampleName |
A |
localProbCut |
A |
removeDup |
A |
This function processes phosphorylation data for a single sample by filtering based on localization probability and non-zero intensity. It handles multiplicity by summarizing intensities and optionally removes duplicates. The resulting data is returned as a data.table with unique identifiers.
A data.table containing the processed phosphorylation data
for the specified sample.
readOneProteom reads and processes proteomics data for a single
sample, applying filters for peptide count and optionally using LFQ
quantification. It returns a data.table with useful columns and
unique identifiers.
readOneProteom(inputTab, sampleName, pepNumCut = 1, ifLFQ = TRUE)
inputTab |
A |
sampleName |
A |
pepNumCut |
A |
ifLFQ |
A |
This function processes proteomics data for a single sample by filtering based on the number of peptides and optionally using LFQ quantification. It ensures that unique identifiers are created for each protein, and removes rows with missing or zero quantification values.
A data.table with the processed proteomics data, including
columns for intensity, Uniprot ID, peptide counts, and gene names.
readOneProteomDIA reads and processes data from a single DIA
proteomics sample, applying filtering and data transformation steps.
readOneProteomDIA(inputTab, sampleName)
inputTab |
A |
sampleName |
A |
This function processes DIA proteomics data for a single sample by filtering out rows with non-quantitative data, converting character values to numeric, and renaming columns for consistency. It also ensures that each protein group has a unique identifier.
A data.table containing the processed data with columns for
intensity, UniProt ID, and gene name.
readPhosphoExperiment reads and processes phosphorylation experiment
data from multiple files, filtering based on localization probability and
score difference, and constructs a SummarizedExperiment object.
readPhosphoExperiment(fileTable, localProbCut = 0.75, scoreDiffCut = 5)
fileTable |
A |
localProbCut |
A |
scoreDiffCut |
A |
This function reads phosphorylation data from multiple files as specified in
fileTable, filters the data based on localization probability and score
difference, and removes reverse and potential contaminant entries. It
constructs an intensity matrix and annotation data, which are then used to
create a SummarizedExperiment object.
A SummarizedExperiment object containing the processed
phosphorylation data.
file1 <- system.file("extdata", "phosDDA_1.xls", package = "SmartPhos")
file2 <- system.file("extdata", "proteomeDDA_1.xls", package = "SmartPhos")
# Create fileTable
fileTable <- data.frame(
  searchType = c("phosphoproteome", "proteome"),
  fileName = c(file1, file2),
  sample = c("Sample1", "sample1"),
  id = c("s1", "s2")
)
# Call the function
readPhosphoExperiment(fileTable, localProbCut = 0.75, scoreDiffCut = 5)
readPhosphoExperimentDIA reads and processes phosphorylation data
from DIA experiments, applying filters for localization probability, and
optionally including only reviewed proteins. It constructs a
SummarizedExperiment object.
readPhosphoExperimentDIA(
  fileTable,
  localProbCut = 0.75,
  onlyReviewed = TRUE,
  showProgressBar = FALSE
)
fileTable |
A |
localProbCut |
A |
onlyReviewed |
A |
showProgressBar |
A |
This function processes phosphorylation data from DIA experiments by
filtering based on localization probability and non-zero intensity,
handling multiplicity, and optionally including only reviewed proteins.
The resulting data is returned as a SummarizedExperiment object with
annotations and an intensity matrix.
A SummarizedExperiment object containing the processed
phosphorylation data.
file <- system.file("extdata", "phosDIA_1.xls", package = "SmartPhos")
fileTable <- data.frame(searchType = "phosphoproteome", fileName = file,
                        id = c("Sample_1"))
readPhosphoExperimentDIA(fileTable, localProbCut = 0.75,
                         onlyReviewed = FALSE, showProgressBar = FALSE)
readProteomeExperiment reads and processes proteomics data from
multiple samples, applying various quality filters, and returns a
SummarizedExperiment object.
readProteomeExperiment(
  fileTable,
  fdrCut = 0.1,
  scoreCut = 10,
  pepNumCut = 1,
  ifLFQ = TRUE
)
fileTable |
A |
fdrCut |
A |
scoreCut |
A |
pepNumCut |
A |
ifLFQ |
A |
This function processes proteomics data by filtering based on FDR, score,
and peptide count, and optionally using LFQ quantification. It aggregates
the data from multiple samples and constructs a SummarizedExperiment
object.
A SummarizedExperiment object containing the processed
proteomics data.
file1 <- system.file("extdata", "phosDDA_1.xls", package = "SmartPhos")
file2 <- system.file("extdata", "proteomeDDA_1.xls", package = "SmartPhos")
# Create fileTable
fileTable <- data.frame(
  searchType = c("phosphoproteome", "proteome"),
  fileName = c(file1, file2),
  sample = c("Sample1", "sample1"),
  id = c("s1", "s2")
)
# Call the function
readProteomeExperiment(fileTable, fdrCut = 0.1, scoreCut = 10,
                       pepNumCut = 1, ifLFQ = TRUE)
readProteomeExperimentDIA reads and processes DIA (Data-Independent
Acquisition) proteome data from multiple files and constructs a
SummarizedExperiment object.
readProteomeExperimentDIA(fileTable, showProgressBar = FALSE)
fileTable |
A |
showProgressBar |
|
A SummarizedExperiment object containing the processed
proteome data.
The function performs the following steps:
Filters the 'fileTable' to include only rows where 'searchType' is "proteome".
For each file specified in 'fileTable', reads the data using 'data.table::fread'.
Removes rows where the 'PG.ProteinGroups' column is NA or empty.
Processes each sample in parallel using 'BiocParallel::bplapply', applying the 'readOneProteomDIA' function to filter and clean the data for each sample.
Combines the processed data from all files.
Constructs a matrix of intensities with rows corresponding to proteins and columns corresponding to samples.
Constructs a 'SummarizedExperiment' object with the intensity matrix and protein annotations.
The readOneProteomDIA function is used to read and filter the data for
each individual sample, and it must be available in the environment.
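The step that turns per-sample tables into a protein-by-sample intensity matrix can be sketched in base R. The per-sample column names (`id`, `Intensity`) and object names below are assumptions for illustration; the package itself reads files with `data.table::fread` and processes samples with `BiocParallel::bplapply`, which this sketch does not reproduce.

```r
# Per-sample tables as they might look after cleaning (column names assumed)
sampleTabs <- list(
  sample1 = data.frame(id = c("P1", "P2"), Intensity = c(100, 200)),
  sample2 = data.frame(id = c("P2", "P3"), Intensity = c(150, 300))
)
# Combine into one long table, tagging each row with its sample
longTab <- do.call(rbind, lapply(names(sampleTabs), function(s) {
  cbind(sampleTabs[[s]], sample = s)
}))
# Fill a protein-by-sample matrix; unobserved combinations stay NA
proteins <- sort(unique(as.character(longTab$id)))
samples <- names(sampleTabs)
intMat <- matrix(NA_real_, nrow = length(proteins), ncol = length(samples),
                 dimnames = list(proteins, samples))
idx <- cbind(as.character(longTab$id), as.character(longTab$sample))
intMat[idx] <- longTab$Intensity
```

A matrix like `intMat` could then be wrapped with `SummarizedExperiment::SummarizedExperiment(assays = list(Intensity = intMat))`, with protein annotations supplied as row data.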
file <- system.file("extdata", "proteomeDIA_1.xls", package = "SmartPhos")
fileTable <- data.frame(searchType = "proteome", fileName = file,
                        id = c("sample1", "sample2"))
readProteomeExperimentDIA(fileTable)
runFisher performs Fisher's Exact Test to determine the enrichment of
a set of genes within reference gene sets.
runFisher(genes, reference, inputSet, ptm = FALSE)
genes |
A |
reference |
A |
inputSet |
A |
ptm |
|
The function can operate in two modes: standard gene sets and PTM-specific gene sets. For PTM-specific gene sets, additional filtering and processing are performed.
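For the standard gene set mode, the underlying test can be sketched with `stats::fisher.test` on a 2x2 contingency table per set. This is a minimal sketch under assumptions, not the package's code: the helper name `fisherSketch`, the output column names, the one-sided alternative, and the Benjamini-Hochberg adjustment are all illustrative choices, and the PTM-specific mode is not covered.

```r
# Hypothetical helper: one-sided Fisher's exact test for each gene set
fisherSketch <- function(genes, reference, geneSets) {
  reference <- unique(reference)
  genes <- intersect(unique(genes), reference)
  rows <- lapply(names(geneSets), function(nm) {
    setGenes <- intersect(geneSets[[nm]], reference)
    hits <- intersect(genes, setGenes)
    inSetHit <- length(hits)                      # in list, in set
    inSetNot <- length(setGenes) - inSetHit       # not in list, in set
    outSetHit <- length(genes) - inSetHit         # in list, not in set
    outSetNot <- length(reference) - inSetHit - inSetNot - outSetHit
    tab <- matrix(c(inSetHit, inSetNot, outSetHit, outSetNot), nrow = 2)
    data.frame(Name = nm, Gene.number = inSetHit,
               Set.size = length(setGenes),
               pval = fisher.test(tab, alternative = "greater")$p.value,
               Genes = paste(hits, collapse = ";"))
  })
  res <- do.call(rbind, rows)
  # Adjust for multiple testing across sets (BH is an assumed choice)
  res$padj <- p.adjust(res$pval, method = "BH")
  res[order(res$pval), ]
}

reference <- paste0("g", 1:20)
genes <- paste0("g", 1:5)
geneSets <- list(setA = paste0("g", 1:4), setB = paste0("g", 10:12))
res <- fisherSketch(genes, reference, geneSets)
```

Here `setA` lies entirely within the gene list and gets a small p-value, while `setB` has no overlap and is not enriched.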
A data frame with the results of the Fisher's Exact Test,
including the gene set name, the number of genes in the set, set size,
p-value, adjusted p-value, and the genes in the set.
library(SummarizedExperiment)
library(piano)
# Load multiAssayExperiment object
data("dda_example")
# Get SummarizedExperiment object
se <- dda_example[["Proteome"]]
colData(se) <- colData(dda_example)
# Preprocess the proteome assay
result <- preprocessProteome(se, normalize = TRUE)
# Call the function to perform differential expression analysis
de <- performDifferentialExp(se = result, assay = "Intensity",
                             method = "limma", reference = "1stCrtl",
                             target = "EGF", condition = "treatment")
genesList <- unique(de$resDE$Gene)
referenceList <- unique(SummarizedExperiment::rowData(result)$Gene)
genesetPath <- appDir <- system.file("shiny-app/geneset", package = "SmartPhos")
inGMT <- loadGSC(paste0(genesetPath, "/Cancer_Hallmark.gmt"), type = "gmt")
# Run the function
runFisher(genes = genesList, reference = referenceList, inputSet = inGMT)
runGSEAforPhospho performs Gene Set Enrichment Analysis (GSEA) for
phosphorylation data.
runGSEAforPhospho(
  geneStat,
  ptmSetDb,
  nPerm,
  weight = 1,
  correl.type = c("rank", "symm.rank", "z.score"),
  statistic = c("Kolmogorov-Smirnov", "area.under.RES"),
  min.overlap = 5
)
geneStat |
A |
ptmSetDb |
A |
nPerm |
A |
weight |
A |
correl.type |
A |
statistic |
A |
min.overlap |
A |
This function runs GSEA on phosphorylation data to identify enriched PTM sets. It calculates enrichment scores and p-values for each set, normalizes the scores, and adjusts p-values for multiple testing.
A tibble with enrichment scores and associated statistics for
each PTM set.
runPhosphoAdjustment performs phospho adjustment on a
MultiAssayExperiment object to normalize the phosphoproteome data.
runPhosphoAdjustment(
  maeData,
  normalization = FALSE,
  minOverlap = 3,
  completeness = 0,
  ncore = 1
)
maeData |
A |
normalization |
A |
minOverlap |
A |
completeness |
A |
ncore |
A |
The function performs the following steps:
Defines an optimization function to minimize the sum of squared differences between pairs of samples.
Calculates the ratio matrix of phosphoproteome to full proteome data.
Subsets features based on completeness criteria.
Performs a sanity check to identify and exclude problematic samples.
Sets initial values for the adjustment factor based on column medians.
Estimates the adjustment factor using parallel optimization.
Adjusts the phosphoproteome measurements using the estimated adjustment factor.
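One possible reading of these steps can be sketched on toy log2 data in base R. Everything here is an assumption made for illustration: the objective `pairwiseSS`, the subtraction of a per-sample factor on the log scale, and the use of `stats::optim` with BFGS in place of the package's parallel optimization. The sketch also omits the completeness filter and the sanity check.

```r
# Toy log2 intensities: rows are matched features, columns are samples
sampleNames <- c("s1", "s2", "s3")
phos <- matrix(c(10, 11, 12,
                 12, 13, 14,
                 9, 10, 11),
               nrow = 3, dimnames = list(paste0("f", 1:3), sampleNames))
prot <- matrix(rep(c(8, 9, 10), 3),
               nrow = 3, dimnames = list(paste0("f", 1:3), sampleNames))
# Ratio of phosphoproteome to full proteome (a difference on the log2 scale)
ratioMat <- phos - prot

# Assumed objective: sum of squared differences between all sample pairs
pairwiseSS <- function(adj, ratioMat) {
  adjusted <- sweep(ratioMat, 2, adj)  # subtract one factor per column
  pairs <- combn(ncol(adjusted), 2)
  sum(apply(pairs, 2, function(p) {
    sum((adjusted[, p[1]] - adjusted[, p[2]])^2, na.rm = TRUE)
  }))
}

# Initial values from column medians, then numerical optimization
init <- apply(ratioMat, 2, median, na.rm = TRUE)
before <- pairwiseSS(rep(0, ncol(ratioMat)), ratioMat)
fit <- optim(init, pairwiseSS, ratioMat = ratioMat, method = "BFGS")
# Apply the estimated factors to the phosphoproteome measurements
phosAdjusted <- sweep(phos, 2, fit$par)
```

In this toy case the column medians already remove all between-sample differences, so the optimizer has nothing left to improve; with real data the medians only serve as a starting point.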
A MultiAssayExperiment object with adjusted phosphoproteome
data.
runSmartPhos launches the SmartPhos Shiny application,
which provides an interactive interface for analyzing phosphoproteomic data.
runSmartPhos()
The runSmartPhos function locates the Shiny app directory
within the SmartPhos package and launches the application.
If the app directory cannot be found, the function will stop and prompt the
user to re-install the SmartPhos package.
The function does not return a value; it starts the Shiny application
for SmartPhos.
# To run the SmartPhos Shiny application, simply call:
# runSmartPhos()
splineFilter filters an expression matrix based on spline
models fitted to time-series data, optionally considering treatment and
subject ID.
splineFilter(
  exprMat,
  subjectID = NULL,
  time,
  df,
  pCut = 0.5,
  ifFDR = FALSE,
  treatment = NULL,
  refTreatment = NULL
)
exprMat |
A |
subjectID |
|
time |
A |
df |
A |
pCut |
A |
ifFDR |
A |
treatment |
|
refTreatment |
|
The function performs the following steps:
Converts time points from minutes to hours if both units are present.
Removes rows with missing values from the expression matrix.
Constructs a design matrix for the spline model, optionally including subject IDs and treatments.
Fits a linear model using the design matrix and performs empirical Bayes moderation.
Extracts significant features based on the specified p-value or FDR cutoff.
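The core idea of filtering features by a spline fit over time can be sketched with the `splines` package that ships with R. Note the substitution: this sketch fits an ordinary `lm` per feature and uses its overall F-test, whereas the function described above fits a linear model with empirical Bayes moderation. The helper name `splineFilterSketch` is hypothetical, and subject IDs, treatments, unit conversion, and FDR adjustment are left out.

```r
library(splines)

# Hypothetical helper: per-feature lm instead of a moderated model fit
splineFilterSketch <- function(exprMat, time, df, pCut = 0.5) {
  # Remove features with missing values
  exprMat <- exprMat[stats::complete.cases(exprMat), , drop = FALSE]
  # Natural cubic spline basis over time
  basis <- ns(time, df = df)
  pVals <- apply(exprMat, 1, function(y) {
    fstat <- summary(lm(y ~ basis))$fstatistic
    pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
  })
  # Keep features whose time trend passes the p-value cutoff
  exprMat[pVals <= pCut, , drop = FALSE]
}

time <- 0:7
exprMat <- rbind(
  trend = c(0.1, 0.9, 4.2, 8.8, 16.1, 25.2, 35.9, 49.1),
  flat  = c(1, -1, 1, -1, 1, -1, 1, -1),
  gap   = c(1, NA, 3, 4, 5, 6, 7, 8)
)
filtered <- splineFilterSketch(exprMat, time = time, df = 2, pCut = 0.05)
```

The `gap` feature is dropped for its missing value, and the alternating `flat` feature shows no smooth dependence on time, so only `trend` remains.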
A filtered expression matrix containing only the features that
meet the significance criteria.
A high-quality, manually curated protein sequence database that provides a high level of annotation (such as the description of the function of a protein, the structure of its domains, post-translational modifications, and variants), a minimal level of redundancy, and a high level of integration with other databases.
data(swissProt)
An object of class "tbl_df" (tidy table).
A data frame or tibble containing high-level
annotations for manually curated proteins.
data(swissProt)