Package 'PRONE' reference manual

Title:	The PROteomics Normalization Evaluator
Description:	High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed.
Authors:	Lis Arend [aut, cre]
Maintainer:	Lis Arend <[email protected]>
License:	GPL (>= 3)
Version:	1.1.8
Built:	2025-03-22 09:23:18 UTC
Source:	https://github.com/bioc/PRONE

Apply other thresholds to DE results

Description

Apply other thresholds to DE results

Usage

apply_thresholds(
  de_res,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)
apply_thresholds(
  de_res,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)

Arguments

`de_res`	data table resulting of run_DE
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values

Value

data table updating the Change column with the newly applied thresholds

Examples

data(tuberculosis_TMT_de_res)
de_res <- apply_thresholds(tuberculosis_TMT_de_res, logFC = FALSE,
                           p_adj = TRUE, alpha = 0.01)

data(tuberculosis_TMT_de_res)
de_res <- apply_thresholds(tuberculosis_TMT_de_res, logFC = FALSE,
                           p_adj = TRUE, alpha = 0.01)

Outlier detection via POMA R Package

Description

Outlier detection via POMA R Package

Usage

detect_outliers_POMA(
  se,
  ain = "log2",
  condition = NULL,
  method = "euclidean",
  type = "median",
  group = TRUE,
  coeff = 1.5
)
detect_outliers_POMA(
  se,
  ain = "log2",
  condition = NULL,
  method = "euclidean",
  type = "median",
  group = TRUE,
  coeff = 1.5
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which data type should be used (default raw)
`condition`	Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`method`	String specifying the method that should be used to calculate the distance matrix
`type`	String specifying the type of distance calculation to centroid or spatial median
`group`	String specifying if the outlier detection should be performed multi-variate (with conditions) or on the complete data set
`coeff`	This value corresponds to the classical 1.5 in Q3 + 1.5 * IQR formula to detect outliers. By changing this value, the permissiveness in outlier detection will change.

Value

list of two ggplot objects and a data.table with outlier samples

Examples

data(tuberculosis_TMT_se)
poma_res <- detect_outliers_POMA(tuberculosis_TMT_se, ain="raw",
                                 condition = NULL, method="euclidean",
                                 type="median", group=TRUE, coeff = 1.5)
data(tuberculosis_TMT_se)
poma_res <- detect_outliers_POMA(tuberculosis_TMT_se, ain="raw",
                                 condition = NULL, method="euclidean",
                                 type="median", group=TRUE, coeff = 1.5)

EigenMS Normalization

Description

EigenMS fits an analysis of variance model to estimate the effects of the experimental factors on the data using the knowledge about the experimental design, and then applies singular value decomposition to identify systematic trends contributing to significant variation not explained by the experimental factors Log2-scaled data should be used as input (on_raw = FALSE).

Usage

eigenMSNorm(se, ain = "log2", aout = "EigenMS", on_raw = FALSE)
eigenMSNorm(se, ain = "log2", aout = "EigenMS", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the EigenMS normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- eigenMSNorm(tuberculosis_TMT_se, ain = "log2",
                                    aout = "EigenMS", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- eigenMSNorm(tuberculosis_TMT_se, ain = "log2",
                                    aout = "EigenMS", on_raw = FALSE)

Export the SummarizedExperiment object, the meta data, and the normalized data.

Description

Export the SummarizedExperiment object, the meta data, and the normalized data.

Usage

export_data(se, out_dir, ain = NULL)
export_data(se, out_dir, ain = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`out_dir`	Path of output directory
`ain`	Vector of strings which assay should be downloaded (default NULL). If NULL then all assays of the se object are saved.

Value

Nothing

Examples

data(tuberculosis_TMT_se)
## Not run: export_data(tuberculosis_TMT_se, out_dir = "data/",
            ain = c("IRS_on_RobNorm", "IRS_on_Median"))
## End(Not run)

data(tuberculosis_TMT_se)
## Not run: export_data(tuberculosis_TMT_se, out_dir = "data/",
            ain = c("IRS_on_RobNorm", "IRS_on_Median"))
## End(Not run)

Extract consensus DE candidates

Description

Extract consensus DE candidates

Usage

extract_consensus_DE_candidates(
  de_res,
  ain = NULL,
  comparisons = NULL,
  norm_thr = 0.8,
  per_comparison = FALSE
)
extract_consensus_DE_candidates(
  de_res,
  ain = NULL,
  comparisons = NULL,
  norm_thr = 0.8,
  per_comparison = FALSE
)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)
`norm_thr`	Threshold for the number of normalization methods that must agree on a DE candidate
`per_comparison`	Logical indicating if the consensus should be calculated per comparison

Value

data table with consensus DE candidates

Examples

data(tuberculosis_TMT_de_res)
extract_consensus_DE_candidates(tuberculosis_TMT_de_res, ain = NULL,
                  comparisons = NULL, norm_thr = 0.8, per_comparison = TRUE)

data(tuberculosis_TMT_de_res)
extract_consensus_DE_candidates(tuberculosis_TMT_de_res, ain = NULL,
                  comparisons = NULL, norm_thr = 0.8, per_comparison = TRUE)

Extract the DE results from eBayes fit of perform_limma function.

Description

Extract the DE results from eBayes fit of perform_limma function.

Usage

extract_limma_DE(
  fit,
  comparisons,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)
extract_limma_DE(
  fit,
  comparisons,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)

Arguments

`fit`	eBayes object resulting from perform_limma method
`comparisons`	Vector of comparisons that are performed in the DE analysis (from specify_comparisons method)
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values

Value

Data table with limma DE results

Remove proteins with NAs in all samples

Description

Remove proteins with NAs in all samples

Usage

filter_out_complete_NA_proteins(se)
filter_out_complete_NA_proteins(se)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberulosis_TMT_se <- filter_out_complete_NA_proteins(tuberculosis_TMT_se)

data(tuberculosis_TMT_se)
tuberulosis_TMT_se <- filter_out_complete_NA_proteins(tuberculosis_TMT_se)

Filter proteins based on their NA pattern using a specific threshold

Description

Filter proteins based on their NA pattern using a specific threshold

Usage

filter_out_NA_proteins_by_threshold(se, thr = 0.8)
filter_out_NA_proteins_by_threshold(se, thr = 0.8)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`thr`	Threshold for the minimum fraction of valid values allowed for any protein

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_NA_proteins_by_threshold(tuberculosis_TMT_se,
                                                       thr = 0.8)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_NA_proteins_by_threshold(tuberculosis_TMT_se,
                                                       thr = 0.8)

Remove proteins by their ID

Description

Remove proteins by their ID

Usage

filter_out_proteins_by_ID(se, protein_ids)
filter_out_proteins_by_ID(se, protein_ids)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`protein_ids`	Vector of protein IDs that should be kept

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_proteins_by_ID(tuberculosis_TMT_se,
                                protein_ids = c("P0A8V2", "P0A8V2"))

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_proteins_by_ID(tuberculosis_TMT_se,
                                protein_ids = c("P0A8V2", "P0A8V2"))

Remove proteins by value in specific column

Description

Remove proteins by value in specific column

Usage

filter_out_proteins_by_value(se, column_name = "Reverse", values = c("+"))
filter_out_proteins_by_value(se, column_name = "Reverse", values = c("+"))

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`column_name`	name of column of which proteins with a specific value should be removed
`values`	value of the column defining the proteins that should be removed

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_proteins_by_value(tuberculosis_TMT_se,
                                 column_name = "Reverse", values = c("+"))

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- filter_out_proteins_by_value(tuberculosis_TMT_se,
                                 column_name = "Reverse", values = c("+"))

Function to get a long data table of all intensities of all kind of normalization

Description

Function to get a long data table of all intensities of all kind of normalization

Usage

get_complete_dt(se, ain = NULL)
get_complete_dt(se, ain = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which assay should be used as input (default NULL) If NULL then all normalization of the SummarizedExperiment object are plotted next to each other (except raw).

Value

data table

Function to get a long data table of all PCA1 and PCA2 values of all kind of normalization

Description

Function to get a long data table of all PCA1 and PCA2 values of all kind of normalization

Usage

get_complete_pca_dt(se, ain = NULL)
get_complete_pca_dt(se, ain = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which assay should be used as input (default NULL) If NULL then all normalization of the SummarizedExperiment object are plotted next to each other (except raw).

Value

data table

Function returning some values on the numbers of NA in the data

Description

Function returning some values on the numbers of NA in the data

Usage

get_NA_overview(se, ain = "log2")
get_NA_overview(se, ain = "log2")

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which data type should be used (default raw)

Value

list with total amount of values in the data, amount of NA values, and the percentage of NAs

Examples

data(tuberculosis_TMT_se)
get_NA_overview(tuberculosis_TMT_se, ain="log2")
data(tuberculosis_TMT_se)
get_NA_overview(tuberculosis_TMT_se, ain="log2")

Function to return available normalization methods' identifier names

Description

Function to return available normalization methods' identifier names

Usage

get_normalization_methods()
get_normalization_methods()

Value

Vector of normalization methods

Examples

get_normalization_methods()
  

get_normalization_methods()

Get overview table of DE results

Description

Get overview table of DE results

Usage

get_overview_DE(de_res)
get_overview_DE(de_res)

Arguments

de_res

data table resulting of run_DE

Value

data table of numbers of DE proteins per comparison and per normalization method

Examples

data(tuberculosis_TMT_de_res)
get_overview_DE(tuberculosis_TMT_de_res)

data(tuberculosis_TMT_de_res)
get_overview_DE(tuberculosis_TMT_de_res)

Get proteins by value in specific column

Description

Get proteins by value in specific column

Usage

get_proteins_by_value(se, column_name = "Reverse", values = c("+"))
get_proteins_by_value(se, column_name = "Reverse", values = c("+"))

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`column_name`	name of column of which proteins with a specific value should be identified
`values`	value of the column defining the proteins that should be identified

Value

vector of protein IDs

Examples

data(tuberculosis_TMT_se)
proteins <- get_proteins_by_value(tuberculosis_TMT_se,
                      column_name = "Potential.contaminant", values = c("+"))

data(tuberculosis_TMT_se)
proteins <- get_proteins_by_value(tuberculosis_TMT_se,
                      column_name = "Potential.contaminant", values = c("+"))

Get performance metrics of DE results of spike-in data set.

Description

Get performance metrics of DE results of spike-in data set.

Usage

get_spiked_stats_DE(se, de_res)
get_spiked_stats_DE(se, de_res)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE

Value

data table with multiple performance metrics of the DE results

Examples

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)

Total Intensity Normalization

Description

Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the median or mean of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).

Usage

globalIntNorm(
  se,
  ain = "raw",
  aout = "GlobalMedian",
  type = "median",
  on_raw = TRUE
)
globalIntNorm(
  se,
  ain = "raw",
  aout = "GlobalMedian",
  type = "median",
  on_raw = TRUE
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`type`	String whether to use median or mean to calculate the scaling factor
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalIntNorm(tuberculosis_TMT_se, ain = "raw",
                                           aout = "GlobalMedian",
                                           type = "median",
                                           on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalIntNorm(tuberculosis_TMT_se, ain = "raw",
                                           aout = "GlobalMedian",
                                           type = "median",
                                           on_raw = TRUE)

Total Intensity Normalization Using the Mean for the Calculation of Scaling Factors

Description

Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the mean of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).

Usage

globalMeanNorm(se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)
globalMeanNorm(se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalMeanNorm(tuberculosis_TMT_se, ain = "raw",
                              aout = "GlobalMean", on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalMeanNorm(tuberculosis_TMT_se, ain = "raw",
                              aout = "GlobalMean", on_raw = TRUE)

Total Intensity Normalization Using the Median for the Calculation of Scaling Factors

Description

Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the median of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).

Usage

globalMedianNorm(se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)
globalMedianNorm(se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalMedianNorm(tuberculosis_TMT_se, ain = "raw",
                              aout = "GlobalMedian", on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- globalMedianNorm(tuberculosis_TMT_se, ain = "raw",
                              aout = "GlobalMedian", on_raw = TRUE)

Method to impute SummarizedExperiment. This method performs a mixed imputation on the proteins. It uses a k-nearest neighbor imputation for proteins with missing values at random (MAR) and imputes missing values by random draws from a left-shifted Gaussian distribution for proteins with missing values not at random (MNAR).

Description

Method to impute SummarizedExperiment. This method performs a mixed imputation on the proteins. It uses a k-nearest neighbor imputation for proteins with missing values at random (MAR) and imputes missing values by random draws from a left-shifted Gaussian distribution for proteins with missing values not at random (MNAR).

Usage

impute_se(se, ain = NULL, condition = NULL)
impute_se(se, ain = NULL, condition = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics dataset
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	name of column of colData(se) representing the conditions of the data

Value

SummarizedExperiment with imputed intensities

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se,
                          column = "Label", values = c("1.HC_Pool1"))
tuberculosis_TMT_se <- impute_se(tuberculosis_TMT_se, ain = NULL,
                          condition = NULL)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se,
                          column = "Label", values = c("1.HC_Pool1"))
tuberculosis_TMT_se <- impute_se(tuberculosis_TMT_se, ain = NULL,
                          condition = NULL)

Internal Reference Scaling Normalization

Description

IRS makes different measurements of the same thing all exactly the same and puts all of the intensities on the same scale. Raw data should be taken as input (on_raw = TRUE)

Usage

irsNorm(se, ain = "raw", aout = "IRS", on_raw = TRUE)
irsNorm(se, ain = "raw", aout = "IRS", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the IRS normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- irsNorm(tuberculosis_TMT_se, ain = "raw",
                                aout = "IRS", on_raw = TRUE)
data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- irsNorm(tuberculosis_TMT_se, ain = "raw",
                                aout = "IRS", on_raw = TRUE)

limma::removeBatchEffects (limBE)

Description

Log2-scaled data should be used as input (on_raw = FALSE).

Usage

limmaNorm(se, ain = "log2", aout = "limBE", on_raw = FALSE)
limmaNorm(se, ain = "log2", aout = "limBE", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the limBE normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- limmaNorm(tuberculosis_TMT_se, ain = "log2",
                                      aout = "limBE", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- limmaNorm(tuberculosis_TMT_se, ain = "log2",
                                      aout = "limBE", on_raw = FALSE)

Load real-world proteomics data into a SummarizedExperiment

Description

Load real-world proteomics data into a SummarizedExperiment

Usage

load_data(
  data,
  md,
  protein_column = "Protein.IDs",
  gene_column = "Gene.Names",
  ref_samples = NULL,
  batch_column = NULL,
  condition_column = NULL,
  label_column = NULL
)
load_data(
  data,
  md,
  protein_column = "Protein.IDs",
  gene_column = "Gene.Names",
  ref_samples = NULL,
  batch_column = NULL,
  condition_column = NULL,
  label_column = NULL
)

Arguments

`data`	tabular data table with rows = proteins and columns = samples (such as proteinGroups.txt of MaxQuant)
`md`	experimental design table (requires a column named "Column" for the column names of the sample intensities in data)
`protein_column`	name of the column in data containing the protein IDs
`gene_column`	name of the column in data containing the gene names
`ref_samples`	reference samples if TMT experiment provided (names of samples)
`batch_column`	name of the column in md defining the batches
`condition_column`	name of the column in md defining the condition (can still be changed afterwards)
`label_column`	name of the column in md containing simple sample names (for visualization)

Value

SummarizedExperiment object

Examples

data_path <- readPRONE_example("tuberculosis_protein_intensities.csv")
md_path <- readPRONE_example("tuberculosis_metadata.csv")
data <- read.csv(data_path)
md <- read.csv(md_path)
md$Column <- stringr::str_replace_all(md$Column, " ", ".")
ref_samples <- md[md$Group == "ref",]$Column
se <- load_data(data, md, protein_column = "Protein.IDs",
               gene_column = "Gene.names", ref_samples = ref_samples,
               batch_column = "Pool", condition_column = "Group",
               label_column = "Label")

data_path <- readPRONE_example("tuberculosis_protein_intensities.csv")
md_path <- readPRONE_example("tuberculosis_metadata.csv")
data <- read.csv(data_path)
md <- read.csv(md_path)
md$Column <- stringr::str_replace_all(md$Column, " ", ".")
ref_samples <- md[md$Group == "ref",]$Column
se <- load_data(data, md, protein_column = "Protein.IDs",
               gene_column = "Gene.names", ref_samples = ref_samples,
               batch_column = "Pool", condition_column = "Group",
               label_column = "Label")

Load spike-in proteomics data into a SummarizedExperiment

Description

Load spike-in proteomics data into a SummarizedExperiment

Usage

load_spike_data(
  data,
  md,
  spike_column,
  spike_value,
  spike_concentration,
  protein_column = "Protein.IDs",
  gene_column = "Gene.Names",
  ref_samples = NULL,
  batch_column = NULL,
  condition_column = NULL,
  label_column = NULL
)
load_spike_data(
  data,
  md,
  spike_column,
  spike_value,
  spike_concentration,
  protein_column = "Protein.IDs",
  gene_column = "Gene.Names",
  ref_samples = NULL,
  batch_column = NULL,
  condition_column = NULL,
  label_column = NULL
)

Arguments

`data`	tabular data table with rows = proteins and columns = samples (such as proteinGroups.txt of MaxQuant)
`md`	experimental design table (requires a column named "Column" for the column names of the sample intensities in data)
`spike_column`	name of the column specifying which proteins are the spike-ins
`spike_value`	String value specifying the spike-in proteins in the spike-in column
`spike_concentration`	name of the column in md defining the spike-in concentration per sample
`protein_column`	name of the column in data containing the protein IDs
`gene_column`	name of the column in data containing the gene names
`ref_samples`	reference samples if TMT experiment provided (names of samples)
`batch_column`	name of the column in md defining the batches
`condition_column`	name of the column in md defining the condition (can still be changed afterwards)
`label_column`	name of the column in md containing simple sample names (for visualization)

Value

SummarizedExperiment object

Examples

data_path <- readPRONE_example("Ecoli_human_MaxLFQ_protein_intensities.csv")
md_path <- readPRONE_example("Ecoli_human_MaxLFQ_metadata.csv")
data <- read.csv(data_path)
md <- read.csv(md_path)
mixed <- grepl("Homo sapiens.*Escherichia|Escherichia.*Homo sapiens", data$Fasta.headers)
data <- data[!mixed,]
data$Spiked <- rep("HUMAN", nrow(data))
data$Spiked[grepl("ECOLI", data$Fasta.headers)] <- "ECOLI"
se <- load_spike_data(data, md, spike_column = "Spiked", spike_value = "ECOLI", 
        spike_concentration = "Concentration", protein_column = "Protein.IDs", 
        gene_column = "Gene.names", ref_samples = NULL, batch_column = NULL, 
        condition_column = "Condition", label_column = "Label")

data_path <- readPRONE_example("Ecoli_human_MaxLFQ_protein_intensities.csv")
md_path <- readPRONE_example("Ecoli_human_MaxLFQ_metadata.csv")
data <- read.csv(data_path)
md <- read.csv(md_path)
mixed <- grepl("Homo sapiens.*Escherichia|Escherichia.*Homo sapiens", data$Fasta.headers)
data <- data[!mixed,]
data$Spiked <- rep("HUMAN", nrow(data))
data$Spiked[grepl("ECOLI", data$Fasta.headers)] <- "ECOLI"
se <- load_spike_data(data, md, spike_column = "Spiked", spike_value = "ECOLI", 
        spike_concentration = "Concentration", protein_column = "Protein.IDs", 
        gene_column = "Gene.names", ref_samples = NULL, batch_column = NULL, 
        condition_column = "Condition", label_column = "Label")

Cyclic Loess Normalization of limma

Description

Two samples of the data are MA transformed and normalized at a time, and all pairs of samples are iterated through. Log2-scaled data should be taken as input (on_raw = FALSE).

Usage

loessCycNorm(se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
loessCycNorm(se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the loessCyc normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- loessCycNorm(tuberculosis_TMT_se, ain = "log2",
                                      aout = "LoessCyc", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- loessCycNorm(tuberculosis_TMT_se, ain = "log2",
                                      aout = "LoessCyc", on_raw = FALSE)

Fast Loess Normalization of limma

Description

Using mean intensities over all the samples as its reference A sample. Log2-scaled data should be used as input (on_raw = FALSE).

Usage

loessFNorm(se, ain = "log2", aout = "LoessF", on_raw = FALSE)
loessFNorm(se, ain = "log2", aout = "LoessF", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the LoessF normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- loessFNorm(tuberculosis_TMT_se, ain = "log2",
                                    aout = "LoessCyc", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- loessFNorm(tuberculosis_TMT_se, ain = "log2",
                                    aout = "LoessCyc", on_raw = FALSE)

Mean Normalization

Description

The intensity of each protein group in a given sample is divided by the mean of the intensities of all protein groups in that sample and then multiplied by the mean of mean of sum of intensities of all protein groups in all samples.

Usage

meanNorm(se, ain = "raw", aout = "Mean", on_raw = TRUE)
meanNorm(se, ain = "raw", aout = "Mean", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean whether normalized should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the mean normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- meanNorm(tuberculosis_TMT_se, ain = "raw",
                          aout = "Mean", on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- meanNorm(tuberculosis_TMT_se, ain = "raw",
                          aout = "Mean", on_raw = TRUE)

Median Absolute Deviation Normalization

Description

Substracts the median and divides the data by the median absolute deviation (MAD). Log2-scaled data should be used as input (on_raw = FALSE).

Usage

medianAbsDevNorm(se, ain = "log2", aout = "MAD", on_raw = FALSE)
medianAbsDevNorm(se, ain = "log2", aout = "MAD", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scale data

Value

SummarizedExperiment containing the MAD normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- medianAbsDevNorm(tuberculosis_TMT_se, ain = "log2",
                                              aout = "MAD", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- medianAbsDevNorm(tuberculosis_TMT_se, ain = "log2",
                                              aout = "MAD", on_raw = FALSE)

Median Normalization

Description

The intensity of each protein group in a given sample is divided by the median of the intensities of all protein groups in that sample and then multiplied by the mean of median of sum of intensities of all protein groups in all samples.

Usage

medianNorm(se, ain = "raw", aout = "Median", on_raw = TRUE)
medianNorm(se, ain = "raw", aout = "Median", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the median normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- medianNorm(tuberculosis_TMT_se, ain = "raw",
                          aout = "Median", on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- medianNorm(tuberculosis_TMT_se, ain = "raw",
                          aout = "Median", on_raw = TRUE)

Normalize SummarizedExperiment object using single normalization methods or specified combinations of normalization methods

Description

Normalize SummarizedExperiment object using single normalization methods or specified combinations of normalization methods

Usage

normalize_se(
  se,
  methods,
  combination_pattern = "_on_",
  on_raw = NULL,
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)
normalize_se(
  se,
  methods,
  combination_pattern = "_on_",
  on_raw = NULL,
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`methods`	Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods())
`combination_pattern`	String specifying how normalization methods are combined. For instance, methods = c("IRS", "Median_on_IRS"), combination_pattern = "_on_".
`on_raw`	Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications).
`gamma.0`	Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1.
`reduce_correlation_by`	If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc.
`NormicsVSN_quantile`	The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers.
`top_x`	Number of reference proteins extracted for the calculation of parameters
`VSN_quantile`	Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression (see vsn2 lts.quantile)

Value

SummarizedExperiment object with normalized data saved as assays

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se(tuberculosis_TMT_se,
           methods = c("IRS_on_GlobalMedian", "IRS_on_Median",
           "limBE_on_NormicsVSN"), on_raw = NULL,
           combination_pattern = "_on_", gamma.0 = 0.1,
           reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50,
           VSN_quantile = 0.9)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se(tuberculosis_TMT_se,
           methods = c("IRS_on_GlobalMedian", "IRS_on_Median",
           "limBE_on_NormicsVSN"), on_raw = NULL,
           combination_pattern = "_on_", gamma.0 = 0.1,
           reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50,
           VSN_quantile = 0.9)

Normalize SummarizedExperiment object using combinations of normalization methods

Description

Normalize SummarizedExperiment object using combinations of normalization methods

Usage

normalize_se_combination(
  se,
  methods,
  ains,
  on_raw = NULL,
  combination_pattern = "_on_",
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)
normalize_se_combination(
  se,
  methods,
  ains,
  on_raw = NULL,
  combination_pattern = "_on_",
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`methods`	Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods())
`ains`	Vector of assays of SummarizedExperiment object to apply the normalization methods (e.g. if you want to perform Median normalization on IRS-normalized data)
`on_raw`	Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications).
`combination_pattern`	String to give name to combination of methods (e.g. IRS_on_Median –> "_on_")
`gamma.0`	Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1.
`reduce_correlation_by`	If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc.
`NormicsVSN_quantile`	The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers.
`top_x`	Number of reference proteins extracted for the calculation of parameters
`VSN_quantile`	Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression. (see vsn2 lts.quantile)

Value

SummarizedExperiment object with normalized data saved as assays

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se_combination(tuberculosis_TMT_se,
          methods = c("Median","NormicsVSN"), ains = c("IRS"), on_raw = NULL,
          combination_pattern = "_on_", gamma.0 = 0.1,
          reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50,
          VSN_quantile = 0.9)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se_combination(tuberculosis_TMT_se,
          methods = c("Median","NormicsVSN"), ains = c("IRS"), on_raw = NULL,
          combination_pattern = "_on_", gamma.0 = 0.1,
          reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50,
          VSN_quantile = 0.9)

Normalize SummarizedExperiment object using different normalization methods

Description

Normalize SummarizedExperiment object using different normalization methods

Usage

normalize_se_single(
  se,
  methods = NULL,
  on_raw = NULL,
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)
normalize_se_single(
  se,
  methods = NULL,
  on_raw = NULL,
  gamma.0 = 0.1,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  top_x = 50,
  VSN_quantile = 0.9
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`methods`	Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods())
`on_raw`	Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications).
`gamma.0`	Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1.
`reduce_correlation_by`	If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc.
`NormicsVSN_quantile`	The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers.
`top_x`	Number of reference proteins extracted for the calculation of parameters
`VSN_quantile`	Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression. (see vsn2 lts.quantile)

Value

SummarizedExperiment object with normalized data saved as assays

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se_single(tuberculosis_TMT_se,
          methods = c("RobNorm", "Median", "NormicsVSN","VSN"),
          on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1,
          NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normalize_se_single(tuberculosis_TMT_se,
          methods = c("RobNorm", "Median", "NormicsVSN","VSN"),
          on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1,
          NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)

Normics Normalization (Normics using VSN or using Median)

Description

Log2-scaled data should be used as input (on_raw = FALSE).

Usage

normicsNorm(
  se,
  ain = "raw",
  aout = "NormicsVSN",
  method = "NormicsVSN",
  on_raw = TRUE,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  TMT_ratio = FALSE,
  top_x = 50
)
normicsNorm(
  se,
  ain = "raw",
  aout = "NormicsVSN",
  method = "NormicsVSN",
  on_raw = TRUE,
  reduce_correlation_by = 1,
  NormicsVSN_quantile = 0.8,
  TMT_ratio = FALSE,
  top_x = 50
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`method`	String specifying the method to use (NORMICS or NORMICSmedian)
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data
`reduce_correlation_by`	If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc.
`NormicsVSN_quantile`	The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers.
`TMT_ratio`	Indicates if the data involves Tandem Mass Tag (TMT) ratio-based measurements (common in proteomics). If TRUE, the method may handle the data differently.
`top_x`	Number of reference proteins extracted for the calculation of parameters

Value

SummarizedExperiment containing the NormicsVSN/NormicsMedian normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normicsNorm(tuberculosis_TMT_se, ain = "raw",
                                aout = "NormicsVSN", method = "NormicsVSN",
                                on_raw = TRUE)


data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- normicsNorm(tuberculosis_TMT_se, ain = "raw",
                                aout = "NormicsVSN", method = "NormicsVSN",
                                on_raw = TRUE)

Perform DEqMS

Description

Perform DEqMS

Usage

perform_DEqMS(
  fit,
  se,
  DEqMS_PSMs_column = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)
perform_DEqMS(
  fit,
  se,
  DEqMS_PSMs_column = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05
)

Arguments

`fit`	eBayes object resulting from perform_limma method
`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`DEqMS_PSMs_column`	String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted.
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values

Value

data.table of DE results

Fitting a linear model using limma

Description

Fitting a linear model using limma

Usage

perform_limma(
  data,
  condition_vector,
  comparisons,
  covariate = NULL,
  trend = TRUE,
  robust = TRUE
)
perform_limma(
  data,
  condition_vector,
  comparisons,
  covariate = NULL,
  trend = TRUE,
  robust = TRUE
)

Arguments

`data`	Data table of intensities (rows = proteins, cols = samples)
`condition_vector`	Vector of experimental design specifying the condition(s) to compare
`comparisons`	Vector of comparisons that are performed in the DE analysis (from specify_comparisons method)
`covariate`	String specifying which column to include as covariate into limma
`trend`	logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance.
`robust`	logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?

Value

eBayes object

Performing ROTS

Description

Performing ROTS

Usage

perform_ROTS(
  data,
  condition,
  comparisons,
  condition_name,
  coldata,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500
)
perform_ROTS(
  data,
  condition,
  comparisons,
  condition_name,
  coldata,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500
)

Arguments

`data`	Data table of intensities (rows = proteins, cols = samples)
`condition`	Vector of experimental design specifying the condition(s) to compare
`comparisons`	Vector of comparisons that are performed in the DE analysis (from specify_comparisons method)
`condition_name`	String of name of condition in colData
`coldata`	colData of the SummarizedExperiment
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values
`B`	Number of bootstrapping for ROTS
`K`	Number of top-ranked features for reproducibility optimization

Value

Data table with DE results

Plot the distributions of the normalized data as boxplots

Description

Plot the distributions of the normalized data as boxplots

Usage

plot_boxplots(
  se,
  ain = NULL,
  color_by = NULL,
  label_by = NULL,
  facet_norm = TRUE,
  ncol = 3
)
plot_boxplots(
  se,
  ain = NULL,
  color_by = NULL,
  label_by = NULL,
  facet_norm = TRUE,
  ncol = 3
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)
`facet_norm`	Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned.
`ncol`	Number of columns in plot (for faceting)

Value

if facet_norm = TRUE, ggplot object, else list of ggplot objects

Examples

data(tuberculosis_TMT_se)
plot_boxplots(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL,
              facet_norm = TRUE, ncol = 3)
plot_boxplots(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Pool",
              label_by = "Label", facet_norm = FALSE)

data(tuberculosis_TMT_se)
plot_boxplots(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL,
              facet_norm = TRUE, ncol = 3)
plot_boxplots(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Pool",
              label_by = "Label", facet_norm = FALSE)

Barplot showing the number of samples per condition

Description

Barplot showing the number of samples per condition

Usage

plot_condition_overview(se, condition = NULL)
plot_condition_overview(se, condition = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_condition_overview(tuberculosis_TMT_se, condition = NULL)
data(tuberculosis_TMT_se)
plot_condition_overview(tuberculosis_TMT_se, condition = NULL)

Plot the densities of the normalized data

Description

Plot the densities of the normalized data

Usage

plot_densities(se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3)
plot_densities(se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`facet_norm`	Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned.
`ncol`	Number of columns in plot (for faceting)

Value

if facet_norm = TRUE, ggplot object, else list of ggplot objects

Examples

data(tuberculosis_TMT_se)
plot_densities(tuberculosis_TMT_se, ain = NULL, color_by = NULL,
                          facet_norm = TRUE, ncol = 3)
plot_densities(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"),
                          color_by = "Label",
              facet_norm = FALSE)

data(tuberculosis_TMT_se)
plot_densities(tuberculosis_TMT_se, ain = NULL, color_by = NULL,
                          facet_norm = TRUE, ncol = 3)
plot_densities(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"),
                          color_by = "Label",
              facet_norm = FALSE)

Boxplot of log fold changes of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.

Description

Boxplot of log fold changes of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.

Usage

plot_fold_changes_spiked(se, de_res, condition, ain = NULL, comparisons = NULL)
plot_fold_changes_spiked(se, de_res, condition, ain = NULL, comparisons = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

ggplot object

Examples

data(spike_in_se)
data(spike_in_de_res)
plot_fold_changes_spiked(spike_in_se, spike_in_de_res,
                         condition = "Condition", ain = NULL,
                         comparisons = NULL)

data(spike_in_se)
data(spike_in_de_res)
plot_fold_changes_spiked(spike_in_se, spike_in_de_res,
                         condition = "Condition", ain = NULL,
                         comparisons = NULL)

Plot a heatmap of the sample intensities with optional column annotations for a selection of normalization methods

Description

Plot a heatmap of the sample intensities with optional column annotations for a selection of normalization methods

Usage

plot_heatmap(
  se,
  ain = NULL,
  color_by = c("Group", "Pool"),
  label_by = NULL,
  only_refs = FALSE
)
plot_heatmap(
  se,
  ain = NULL,
  color_by = c("Group", "Pool"),
  label_by = NULL,
  only_refs = FALSE
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`color_by`	Vector of strings specifying the columns to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bars added.)
`label_by`	String specifying the column in the metadata used to label the samples for the UpSet plot
`only_refs`	Logical, if TRUE, only reference samples (ComRef) are included in the plot

Value

list of ggplot objects

Examples

data(tuberculosis_TMT_se)
plot_heatmap(tuberculosis_TMT_se, ain = c("log2"), color_by = NULL,
             label_by = NULL, only_refs = FALSE)

data(tuberculosis_TMT_se)
plot_heatmap(tuberculosis_TMT_se, ain = c("log2"), color_by = NULL,
             label_by = NULL, only_refs = FALSE)

Heatmap of DE results

Description

Heatmap of DE results

Usage

plot_heatmap_DE(
  se,
  de_res,
  ain,
  comparison,
  condition = NULL,
  label_by = NULL,
  pvalue_column = "adj.P.Val",
  col_vector = NULL
)
plot_heatmap_DE(
  se,
  de_res,
  ain,
  comparison,
  condition = NULL,
  label_by = NULL,
  pvalue_column = "adj.P.Val",
  col_vector = NULL
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set (including the normalized intensities)
`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparison`	String of comparison (must be a valid comparison saved in de_res)
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)
`pvalue_column`	column name of p-values in de_res
`col_vector`	Vector of colors to use for the heatmap. If NULL, default colors are used.

Value

list of ComplexHeatmaps for each method

Examples

data(tuberculosis_TMT_se)
data(tuberculosis_TMT_de_res)
plot_heatmap_DE(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = NULL,
                comparison = "PTB-HC",
                condition = NULL, label_by = NULL,
                pvalue_column = "adj.P.Val", col_vector = NULL)

data(tuberculosis_TMT_se)
data(tuberculosis_TMT_de_res)
plot_heatmap_DE(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = NULL,
                comparison = "PTB-HC",
                condition = NULL, label_by = NULL,
                pvalue_column = "adj.P.Val", col_vector = NULL)

Plot histogram of the spike-in and background protein intensities per condition.

Description

Plot histogram of the spike-in and background protein intensities per condition.

Usage

plot_histogram_spiked(se, condition = NULL)
plot_histogram_spiked(se, condition = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)

Value

ggplot object

Examples

data(spike_in_se)
plot_histogram_spiked(spike_in_se, condition = NULL)

data(spike_in_se)
plot_histogram_spiked(spike_in_se, condition = NULL)

Plot number of identified spike-in proteins per sample.

Description

Plot number of identified spike-in proteins per sample.

Usage

plot_identified_spiked_proteins(se, color_by = NULL, label_by = NULL)
plot_identified_spiked_proteins(se, color_by = NULL, label_by = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)#'

Value

ggplot object

Examples

data(spike_in_se)
plot_identified_spiked_proteins(spike_in_se, color_by = NULL,
                                label_by = NULL)

data(spike_in_se)
plot_identified_spiked_proteins(spike_in_se, color_by = NULL,
                                label_by = NULL)

Functional enrichment analysis for analyzing the DE results of different normalization methods and biologically interpreting the results

Description

Functional enrichment analysis for analyzing the DE results of different normalization methods and biologically interpreting the results

Usage

plot_intersection_enrichment(
  se,
  de_res,
  comparison,
  ain = NULL,
  id_column = "Gene.Names",
  organism = "hsapiens",
  source = "KEGG",
  signif_thr = 0.05
)
plot_intersection_enrichment(
  se,
  de_res,
  comparison,
  ain = NULL,
  id_column = "Gene.Names",
  organism = "hsapiens",
  source = "KEGG",
  signif_thr = 0.05
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE
`comparison`	String of comparison (must be a valid comparison saved in de_res)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`id_column`	String specifying the column of the rowData of the SummarizedExperiment object which includes the gene names
`organism`	Organism name (gprofiler parameter)
`source`	Data source to use (gprofiler parameter, example: KEGG)
`signif_thr`	Significance threshold

Value

list of ggplot objects

Examples

data(tuberculosis_TMT_se)
data(tuberculosis_TMT_de_res)
plot_intersection_enrichment(tuberculosis_TMT_se, tuberculosis_TMT_de_res,
                ain = c("IRS_on_Median", "IRS_on_RobNorm", "RobNorm"),
                comparison = "Rx-TBL", id_column = "Gene.Names",
                organism = "hsapiens", source = "GO:BP", signif_thr = 0.05)

data(tuberculosis_TMT_se)
data(tuberculosis_TMT_de_res)
plot_intersection_enrichment(tuberculosis_TMT_se, tuberculosis_TMT_de_res,
                ain = c("IRS_on_Median", "IRS_on_RobNorm", "RobNorm"),
                comparison = "Rx-TBL", id_column = "Gene.Names",
                organism = "hsapiens", source = "GO:BP", signif_thr = 0.05)

Plot intragroup correlation of the normalized data

Description

Plot intragroup correlation of the normalized data

Usage

plot_intragroup_correlation(
  se,
  ain = NULL,
  condition = NULL,
  method = "pearson"
)
plot_intragroup_correlation(
  se,
  ain = NULL,
  condition = NULL,
  method = "pearson"
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`method`	String specifying the method for correlation calculation (pearson, spearman or kendall)

Value

ggplot object (boxplot)

Examples

data(tuberculosis_TMT_se)
plot_intragroup_correlation(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, method = "pearson")

data(tuberculosis_TMT_se)
plot_intragroup_correlation(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, method = "pearson")

Plot intragroup pooled coefficient of variation (PCV) of the normalized data

Description

Plot intragroup pooled coefficient of variation (PCV) of the normalized data

Usage

plot_intragroup_PCV(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PCV(se, ain = NULL, condition = NULL, diff = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`diff`	Boolean indicating whether to visualize the reduction of intragroup variation (PCV) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PCV) for each normalization method (FALSE).

Value

ggplot object (boxplot)

Examples

data(tuberculosis_TMT_se)
plot_intragroup_PCV(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

data(tuberculosis_TMT_se)
plot_intragroup_PCV(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

Plot intragroup pooled estimate of variance (PEV) of the normalized data

Description

Plot intragroup pooled estimate of variance (PEV) of the normalized data

Usage

plot_intragroup_PEV(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PEV(se, ain = NULL, condition = NULL, diff = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`diff`	Boolean indicating whether to visualize the reduction of intragroup variation (PEV) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PEV) for each normalization method (FALSE).

Value

ggplot object (boxplot)

Examples

data(tuberculosis_TMT_se)
plot_intragroup_PEV(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

data(tuberculosis_TMT_se)
plot_intragroup_PEV(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

Plot intragroup pooled median absolute deviation (PMAD) of the normalized data

Description

Plot intragroup pooled median absolute deviation (PMAD) of the normalized data

Usage

plot_intragroup_PMAD(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PMAD(se, ain = NULL, condition = NULL, diff = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`diff`	Boolean indicating whether to visualize the reduction of intragroup variation (PMAD) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PMAD) for each normalization method (FALSE).

Value

ggplot object (boxplot)

Examples

data(tuberculosis_TMT_se)
plot_intragroup_PMAD(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

data(tuberculosis_TMT_se)
plot_intragroup_PMAD(tuberculosis_TMT_se, ain = NULL,
                            condition = NULL, diff = FALSE)

Jaccard similarity heatmap of DE proteins of the different normalization methods

Description

Jaccard similarity heatmap of DE proteins of the different normalization methods

Usage

plot_jaccard_heatmap(
  de_res,
  ain = NULL,
  comparisons = NULL,
  plot_type = "single"
)
plot_jaccard_heatmap(
  de_res,
  ain = NULL,
  comparisons = NULL,
  plot_type = "single"
)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)
`plot_type`	String indicating whether to plot a single plot per comparison ("single"), facet by comparison ("facet_comp"), or include all comparisons in a single plot ("all")

Value

ggplot object (list of objects if plot_type == "single)

Examples

data(tuberculosis_TMT_de_res)
plot_jaccard_heatmap(tuberculosis_TMT_de_res, ain = NULL,
                    comparisons = NULL, plot_type = "all")

data(tuberculosis_TMT_de_res)
plot_jaccard_heatmap(tuberculosis_TMT_de_res, ain = NULL,
                    comparisons = NULL, plot_type = "all")

Line plot of number of true and false positives when applying different logFC thresholds

Description

Line plot of number of true and false positives when applying different logFC thresholds

Usage

plot_logFC_thresholds_spiked(
  se,
  de_res,
  condition,
  ain = NULL,
  comparisons = NULL,
  nrow = 2,
  alpha = 0.05
)
plot_logFC_thresholds_spiked(
  se,
  de_res,
  condition,
  ain = NULL,
  comparisons = NULL,
  nrow = 2,
  alpha = 0.05
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)
`nrow`	number of rows for facet wrap
`alpha`	threshold for adjusted p-values

Value

list of ggplot objects

Examples

data(spike_in_se)
data(spike_in_de_res)
plot_logFC_thresholds_spiked(spike_in_se, spike_in_de_res,
                             condition = "Condition", ain = NULL,
                             comparisons = NULL, nrow = 2, alpha = 0.05)

data(spike_in_se)
data(spike_in_de_res)
plot_logFC_thresholds_spiked(spike_in_se, spike_in_de_res,
                             condition = "Condition", ain = NULL,
                             comparisons = NULL, nrow = 2, alpha = 0.05)

Boxplots of intensities of specific markers

Description

Boxplots of intensities of specific markers

Usage

plot_markers_boxplots(
  se,
  markers,
  ain = NULL,
  id_column = "Protein.IDs",
  color_by = NULL,
  shape_by = NULL,
  facet_norm = TRUE,
  facet_marker = FALSE
)
plot_markers_boxplots(
  se,
  markers,
  ain = NULL,
  id_column = "Protein.IDs",
  color_by = NULL,
  shape_by = NULL,
  facet_norm = TRUE,
  facet_marker = FALSE
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`markers`	Vector of the IDs of the markers to plot
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`id_column`	String specifying the column of the rowData of the SummarizedExperiment object which includes the IDs of the markers
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`shape_by`	String specifying the column to shape the samples (If NULL or "No", no shaping of samples is done.)
`facet_norm`	Boolean indicating whether to facet by normalization method (TRUE) or not (FALSE)
`facet_marker`	Boolean indicating whether to facet by comparison (TRUE) or not (FALSE). Only valid if facet_norm = FALSE.

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_markers_boxplots(tuberculosis_TMT_se, markers = c("Q7Z7F0", "Q13790"),
                     ain = c("log2"), id_column = "Protein.IDs",
                     color_by = NULL,
                     shape_by = "Pool",
                     facet_norm = FALSE,
                     facet_marker = TRUE)

data(tuberculosis_TMT_se)
plot_markers_boxplots(tuberculosis_TMT_se, markers = c("Q7Z7F0", "Q13790"),
                     ain = c("log2"), id_column = "Protein.IDs",
                     color_by = NULL,
                     shape_by = "Pool",
                     facet_norm = FALSE,
                     facet_marker = TRUE)

Plot the intensity distribution of proteins with and without NAs

Description

Plot the intensity distribution of proteins with and without NAs

Usage

plot_NA_density(se)
plot_NA_density(se)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_NA_density(tuberculosis_TMT_se)

data(tuberculosis_TMT_se)
plot_NA_density(tuberculosis_TMT_se)

Plot protein identification overlap (x = identified in number of Samples, y=number of proteins)

Description

Plot protein identification overlap (x = identified in number of Samples, y=number of proteins)

Usage

plot_NA_frequency(se)
plot_NA_frequency(se)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_NA_frequency(tuberculosis_TMT_se)

data(tuberculosis_TMT_se)
plot_NA_frequency(tuberculosis_TMT_se)

Plot heatmap of the NA pattern

Description

Plot heatmap of the NA pattern

Usage

plot_NA_heatmap(
  se,
  color_by = NULL,
  label_by = NULL,
  cluster_samples = TRUE,
  cluster_proteins = TRUE,
  show_row_dend = TRUE,
  show_column_dend = FALSE,
  col_vector = NULL
)
plot_NA_heatmap(
  se,
  color_by = NULL,
  label_by = NULL,
  cluster_samples = TRUE,
  cluster_proteins = TRUE,
  show_row_dend = TRUE,
  show_column_dend = FALSE,
  col_vector = NULL
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)
`cluster_samples`	Boolean. TRUE if samples should be clustered, else FALSE.
`cluster_proteins`	Boolean. TRUE if proteins should be clustered, else FALSE.
`show_row_dend`	Boolean. TRUE if row dendrogram should be shown.
`show_column_dend`	Boolean. TRUE if column dendrogram should be shown.
`col_vector`	Vector of colors for the color bar. If NULL, default colors are used.

Value

ComplexHeatmap plot (only showing proteins with at least one missing value)

Examples

data(tuberculosis_TMT_se)
plot_NA_heatmap(tuberculosis_TMT_se, color_by = NULL,
                label_by = NULL, cluster_samples = TRUE,
                cluster_proteins = TRUE, show_row_dend = TRUE,
                show_column_dend = FALSE,
                col_vector = NULL)

data(tuberculosis_TMT_se)
plot_NA_heatmap(tuberculosis_TMT_se, color_by = NULL,
                label_by = NULL, cluster_samples = TRUE,
                cluster_proteins = TRUE, show_row_dend = TRUE,
                show_column_dend = FALSE,
                col_vector = NULL)

Plot number of non-zero proteins per sample

Description

Plot number of non-zero proteins per sample

Usage

plot_nr_prot_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
plot_nr_prot_samples(se, ain = "raw", color_by = NULL, label_by = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which data type should be used (default raw)
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_nr_prot_samples(tuberculosis_TMT_se, ain="raw", color_by = "Group",
                     label_by = "Label")

data(tuberculosis_TMT_se)
plot_nr_prot_samples(tuberculosis_TMT_se, ain="raw", color_by = "Group",
                     label_by = "Label")

Overview plots of DE results

Description

Overview plots of DE results

Usage

plot_overview_DE_bar(
  de_res,
  ain = NULL,
  comparisons = NULL,
  plot_type = "single"
)
plot_overview_DE_bar(
  de_res,
  ain = NULL,
  comparisons = NULL,
  plot_type = "single"
)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)
`plot_type`	String indicating whether to plot a single plot per comparison ("single"), facet by comparison ("facet_comp"), stack the number of DE per comparison ("stacked"), or stack the number of DE per comparison but facet by up- and down-regulated ("facet_regulation")

Value

list of ggplot objects or single object if plot_type = facet or stacked

Examples

data(tuberculosis_TMT_de_res)
plot_overview_DE_bar(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL,
                     plot_type = "facet_regulation")

data(tuberculosis_TMT_de_res)
plot_overview_DE_bar(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL,
                     plot_type = "facet_regulation")

Overview heatmap plot of DE results

Description

Overview heatmap plot of DE results

Usage

plot_overview_DE_tile(de_res, ain = NULL, comparisons = NULL)
plot_overview_DE_tile(de_res, ain = NULL, comparisons = NULL)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)

Value

ggplot object

Examples

data(tuberculosis_TMT_de_res)
plot_overview_DE_tile(tuberculosis_TMT_de_res, ain = NULL,
                      comparisons = NULL)

data(tuberculosis_TMT_de_res)
plot_overview_DE_tile(tuberculosis_TMT_de_res, ain = NULL,
                      comparisons = NULL)

PCA plot of the normalized data

Description

PCA plot of the normalized data

Usage

plot_PCA(
  se,
  ain = NULL,
  color_by = NULL,
  label_by = NULL,
  shape_by = NULL,
  facet_norm = TRUE,
  facet_by = NULL,
  ellipse = FALSE,
  ncol = 3
)
plot_PCA(
  se,
  ain = NULL,
  color_by = NULL,
  label_by = NULL,
  shape_by = NULL,
  facet_norm = TRUE,
  facet_by = NULL,
  ellipse = FALSE,
  ncol = 3
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)
`shape_by`	String specifying the column to shape the samples (If NULL or "No", no shaping of samples is done.)
`facet_norm`	Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned. However, then you can also facet by any column of the metadata.
`facet_by`	String specifying the column to facet the samples (If facet = FALSE, the plot will not be faceted by the normalization methods, but instead a list of plots of each normalization method is returned. Then, the PCA plot can be faceted by any column of the metadata, for instance by "Batch". If facet_by is NULL or "No", no faceting is performed.)
`ellipse`	Boolean to indicate if ellipses should be drawn
`ncol`	Number of columns in plot (for faceting)

Value

if facet_norm = TRUE, ggplot object, else list of ggplot objects

Examples

data(tuberculosis_TMT_se)
plot_PCA(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL,
         shape_by = "Pool",
         facet_norm = TRUE, ncol = 3)
plot_PCA(tuberculosis_TMT_se, ain = c("IRS_on_RobNorm"), color_by = "Group",
              label_by = "Label", facet_norm = FALSE, facet_by = "Pool")

data(tuberculosis_TMT_se)
plot_PCA(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL,
         shape_by = "Pool",
         facet_norm = TRUE, ncol = 3)
plot_PCA(tuberculosis_TMT_se, ain = c("IRS_on_RobNorm"), color_by = "Group",
              label_by = "Label", facet_norm = FALSE, facet_by = "Pool")

Plot profiles of the spike-in and background proteins using the log2 average protein intensities as a function of the different concentrations.

Description

Plot profiles of the spike-in and background proteins using the log2 average protein intensities as a function of the different concentrations.

Usage

plot_profiles_spiked(se, xlab = "Concentration")
plot_profiles_spiked(se, xlab = "Concentration")

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`xlab`	String for the x-label of the plot

Value

ggplot object

Examples

data(spike_in_se)
plot_profiles_spiked(spike_in_se, xlab = "Concentration")

data(spike_in_se)
plot_profiles_spiked(spike_in_se, xlab = "Concentration")

Boxplot of p-values of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.

Description

Boxplot of p-values of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.

Usage

plot_pvalues_spiked(se, de_res, ain = NULL, comparisons = NULL)
plot_pvalues_spiked(se, de_res, ain = NULL, comparisons = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

ggplot object

Examples

data(spike_in_se)
data(spike_in_de_res)
plot_pvalues_spiked(spike_in_se, spike_in_de_res, ain = NULL,
                    comparisons = NULL)

data(spike_in_se)
data(spike_in_de_res)
plot_pvalues_spiked(spike_in_se, spike_in_de_res, ain = NULL,
                    comparisons = NULL)

Plot ROC curve and barplot of AUC values for each method for a specific comparion or for all comparisons

Description

Plot ROC curve and barplot of AUC values for each method for a specific comparion or for all comparisons

Usage

plot_ROC_AUC_spiked(se, de_res, ain = NULL, comparisons = NULL)
plot_ROC_AUC_spiked(se, de_res, ain = NULL, comparisons = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

list of ggplot objects

Examples

data(spike_in_se)
data(spike_in_de_res)
plot_ROC_AUC_spiked(spike_in_se, spike_in_de_res)

data(spike_in_se)
data(spike_in_de_res)
plot_ROC_AUC_spiked(spike_in_se, spike_in_de_res)

Heatmap of performance metrics for spike-in data sets

Description

Heatmap of performance metrics for spike-in data sets

Usage

plot_stats_spiked_heatmap(
  stats,
  ain = NULL,
  comparisons = NULL,
  metrics = c("Accuracy", "Precision", "F1Score")
)
plot_stats_spiked_heatmap(
  stats,
  ain = NULL,
  comparisons = NULL,
  metrics = c("Accuracy", "Precision", "F1Score")
)

Arguments

`stats`	data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)
`metrics`	vector of Strings specifying the metrics (must be colnames of stats)

Value

ggplot object

Examples

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_stats_spiked_heatmap(stats, ain = NULL, comparisons = NULL,
                          metrics = c("F1Score", "Accuracy"))

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_stats_spiked_heatmap(stats, ain = NULL, comparisons = NULL,
                          metrics = c("F1Score", "Accuracy"))

Plot total protein intensity per sample

Description

Plot total protein intensity per sample

Usage

plot_tot_int_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
plot_tot_int_samples(se, ain = "raw", color_by = NULL, label_by = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	String which data type should be used (default raw)
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.)
`label_by`	String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)

Value

list of a ggplot object and the dataframe of outliers

Examples

data(tuberculosis_TMT_se)
plot_tot_int_samples(tuberculosis_TMT_se, ain="raw", color_by = NULL,
                     label_by = NULL)

data(tuberculosis_TMT_se)
plot_tot_int_samples(tuberculosis_TMT_se, ain="raw", color_by = NULL,
                     label_by = NULL)

Barplot of true and false positives for specific comparisons and normalization methods

Description

Barplot of true and false positives for specific comparisons and normalization methods

Usage

plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)

Arguments

`stats`	data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

ggplot object (barplot)

Examples

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)

Boxplot of true and false positives for specific comparisons and normalization methods

Description

Boxplot of true and false positives for specific comparisons and normalization methods

Usage

plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)

Arguments

`stats`	data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

ggplot object (barplot)

Examples

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)

Scatterplot of true positives and false positives (median with errorbars as Q1, and Q3) for all comparisons

Description

Scatterplot of true positives and false positives (median with errorbars as Q1, and Q3) for all comparisons

Usage

plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)

Arguments

`stats`	data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE)
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats)
`comparisons`	Vector of comparisons (must be valid comparisons saved in stats)

Value

ggplot object

Examples

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)

data(spike_in_se)
data(spike_in_de_res)
stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)

Create an UpSet Plot from SummarizedExperiment Data

Description

This function generates an UpSet plot from a given SummarizedExperiment object. It allows for the visualization of overlaps between sets defined by a specific column in the metadata. The function supports subsetting to reference samples and customizable color mapping.

Usage

plot_upset(
  se,
  color_by = NULL,
  label_by = NULL,
  mb.ratio = c(0.7, 0.3),
  only_refs = FALSE
)
plot_upset(
  se,
  color_by = NULL,
  label_by = NULL,
  mb.ratio = c(0.7, 0.3),
  only_refs = FALSE
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`color_by`	String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used.)
`label_by`	String specifying the column in the metadata used to label the samples for the UpSet plot
`mb.ratio`	A numeric vector of length 2, specifying the barplot and matrix area ratios
`only_refs`	Logical, if TRUE, only reference samples (ComRef) are included in the plot

Value

ggplot object

Examples

data(tuberculosis_TMT_se)
plot_upset(tuberculosis_TMT_se, color_by = NULL, label_by = NULL,
           mb.ratio = c(0.7, 0.3), only_refs = FALSE)

data(tuberculosis_TMT_se)
plot_upset(tuberculosis_TMT_se, color_by = NULL, label_by = NULL,
           mb.ratio = c(0.7, 0.3), only_refs = FALSE)

Upset plots of DE results of the different normalization methods

Description

Upset plots of DE results of the different normalization methods

Usage

plot_upset_DE(
  de_res,
  ain = NULL,
  comparisons = NULL,
  min_degree = 2,
  plot_type = "single"
)
plot_upset_DE(
  de_res,
  ain = NULL,
  comparisons = NULL,
  min_degree = 2,
  plot_type = "single"
)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)
`min_degree`	Minimal degree of an intersection for it to be included
`plot_type`	String indicating whether to plot a single plot per comparison ("single") or stack the number of DE per comparison ("stacked)

Value

list of plots and intersection tables (split by comparison if plot_type == "single)

Examples

data(tuberculosis_TMT_de_res)
plot_upset_DE(tuberculosis_TMT_de_res,
              ain = c("IRS_on_RobNorm", "IRS_on_Median"),
              comparisons = NULL, min_degree = 2,
              plot_type = "stacked")

data(tuberculosis_TMT_de_res)
plot_upset_DE(tuberculosis_TMT_de_res,
              ain = c("IRS_on_RobNorm", "IRS_on_Median"),
              comparisons = NULL, min_degree = 2,
              plot_type = "stacked")

Volcano plots of DE results

Description

Volcano plots of DE results

Usage

plot_volcano_DE(
  de_res,
  ain = NULL,
  comparisons = NULL,
  facet_norm = TRUE,
  facet_comparison = FALSE
)
plot_volcano_DE(
  de_res,
  ain = NULL,
  comparisons = NULL,
  facet_norm = TRUE,
  facet_comparison = FALSE
)

Arguments

`de_res`	data table resulting of run_DE
`ain`	Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res)
`comparisons`	Vector of comparisons (must be valid comparisons saved in de_res)
`facet_norm`	Boolean indicating whether to facet by normalization method (TRUE) or not (FALSE)
`facet_comparison`	Boolean indicating whether to facet by comparison (TRUE) or not (FALSE). Only valid if facet_norm = FALSE.

Value

list of ggplot objects

Examples

data(tuberculosis_TMT_de_res)
plot_volcano_DE(tuberculosis_TMT_de_res, ain = NULL,
                comparisons = NULL, facet_norm = TRUE,
                facet_comparison = FALSE)

data(tuberculosis_TMT_de_res)
plot_volcano_DE(tuberculosis_TMT_de_res, ain = NULL,
                comparisons = NULL, facet_norm = TRUE,
                facet_comparison = FALSE)

Quantile Normalization of preprocessCore package.

Description

Forces distributions of the samples to be the same on the basis of the quantiles of the samples by replacing each protein of a sample with the mean of the corresponding quantile. Log2-scaled data should be taken as input (on_raw = FALSE)

Usage

quantileNorm(se, ain = "log2", aout = "Quantile", on_raw = FALSE)
quantileNorm(se, ain = "log2", aout = "Quantile", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the quantile normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- quantileNorm(tuberculosis_TMT_se, ain = "log2",
                              aout = "Quantile", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- quantileNorm(tuberculosis_TMT_se, ain = "log2",
                              aout = "Quantile", on_raw = FALSE)

Helper function to read example data

Description

Helper function to read example data

Usage

readPRONE_example(path = NULL)
readPRONE_example(path = NULL)

Arguments

path

NULL to get all example data set files, otherwise specify the file name

Value

If path=NULL a character vector with the file names, otherwise the path to the specific file

Examples

readPRONE_example()
readPRONE_example()

Remove normalization assays from a SummarizedExperiment object

Description

Remove normalization assays from a SummarizedExperiment object

Usage

remove_assays_from_SE(se, assays_to_remove)
remove_assays_from_SE(se, assays_to_remove)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`assays_to_remove`	Character vector of assay names to remove from the SummarizedExperiment object

Value

SummarizedExperiment object with the normalization assays removed

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_assays_from_SE(tuberculosis_TMT_se,
                                 assays_to_remove = c("IRS_on_RobNorm"))

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_assays_from_SE(tuberculosis_TMT_se,
                                 assays_to_remove = c("IRS_on_RobNorm"))

Remove outliers samples detected by the detect_outliers_POMA function

Description

Remove outliers samples detected by the detect_outliers_POMA function

Usage

remove_POMA_outliers(se, poma_res_outliers)
remove_POMA_outliers(se, poma_res_outliers)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`poma_res_outliers`	Outliers data.table returned by the detect_outliers_POMA function

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
poma_res <- detect_outliers_POMA(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_POMA_outliers(tuberculosis_TMT_se, poma_res$outliers)

data(tuberculosis_TMT_se)
poma_res <- detect_outliers_POMA(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_POMA_outliers(tuberculosis_TMT_se, poma_res$outliers)

Remove reference samples of SummarizedExperiment object (reference samples specified during loading)

Description

Remove reference samples of SummarizedExperiment object (reference samples specified during loading)

Usage

remove_reference_samples(se)
remove_reference_samples(se)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_reference_samples(tuberculosis_TMT_se)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_reference_samples(tuberculosis_TMT_se)

Remove samples with specific value in column manually

Description

Remove samples with specific value in column manually

Usage

remove_samples_manually(se, column, values)
remove_samples_manually(se, column, values)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`column`	String specifying the column of the meta data (samples with the specified value in this column will be removed)
`values`	Vector of Strings specifying the value for the removal of samples (samples with this value in the specified column will be removed)

Value

filtered SummarizedExperiment object

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se,
                                column = "Label", values = c("1.HC_Pool1"))

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se,
                                column = "Label", values = c("1.HC_Pool1"))

Cyclic Linear Regression Normalization on MA Transformed Data

Description

No reference, but MA transformation and normalization of samples is done pairwise between two samples with A = average of two samples and M = difference. The process is iterated through all samples pairs. Log2 data should be taken as input (on_raw = FALSE).

Usage

rlrMACycNorm(
  se,
  ain = "log2",
  aout = "RlrMACyc",
  on_raw = FALSE,
  iterations = 3
)
rlrMACycNorm(
  se,
  ain = "log2",
  aout = "RlrMACyc",
  on_raw = FALSE,
  iterations = 3
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data
`iterations`	Number of cyclic iterations to be performed

Value

SummarizedExperiment containing the RlrMACyc normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrMACycNorm(tuberculosis_TMT_se, ain = "log2",
                            aout = "RlrMACyc", on_raw = FALSE, iterations=3)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrMACycNorm(tuberculosis_TMT_se, ain = "log2",
                            aout = "RlrMACyc", on_raw = FALSE, iterations=3)

Linear Regression Normalization on MA Transformed Data

Description

Similar to Rlr, but data are MA transformed before normalization, (A = median sample, M = difference of that sample to A). Log2 data should be taken as input (on_raw = FALSE).

Usage

rlrMANorm(se, ain = "log2", aout = "RlrMA", on_raw = FALSE)
rlrMANorm(se, ain = "log2", aout = "RlrMA", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the RlrMA normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrMANorm(tuberculosis_TMT_se, ain = "log2",
                                  aout = "RlrMA", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrMANorm(tuberculosis_TMT_se, ain = "log2",
                                  aout = "RlrMA", on_raw = FALSE)

Robust Linear Regression Normalization of NormalyzerDE.

Description

Uses median values over all samples as reference sample to which all the other samples in the data are normalized to. Log2 data should be taken as input (on_raw = FALSE).

Usage

rlrNorm(se, ain = "log2", aout = "Rlr", on_raw = FALSE)
rlrNorm(se, ain = "log2", aout = "Rlr", on_raw = FALSE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the rlr normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrNorm(tuberculosis_TMT_se, ain = "log2",
                                            aout = "Rlr", on_raw = FALSE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- rlrNorm(tuberculosis_TMT_se, ain = "log2",
                                            aout = "Rlr", on_raw = FALSE)

RobNorm Normalization

Description

Log2-scaled data should be used as input (on_raw = FALSE).

Usage

robnormNorm(se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)
robnormNorm(se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data
`gamma.0`	Numeric representing the exponent of the weighted density. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1.

Value

SummarizedExperiment containing the RobNorm normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- robnormNorm(tuberculosis_TMT_se, ain = "log2",
                            aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- robnormNorm(tuberculosis_TMT_se, ain = "log2",
                            aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)

Run DE analysis of a selection of normalized data sets

Description

Run DE analysis of a selection of normalized data sets

Usage

run_DE(
  se,
  comparisons,
  ain = NULL,
  condition = NULL,
  DE_method = "limma",
  covariate = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500,
  trend = TRUE,
  robust = TRUE,
  DEqMS_PSMs_column = NULL
)
run_DE(
  se,
  comparisons,
  ain = NULL,
  condition = NULL,
  DE_method = "limma",
  covariate = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500,
  trend = TRUE,
  robust = TRUE,
  DEqMS_PSMs_column = NULL
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`comparisons`	Vector of comparisons that are performed in the DE analysis (from specify_comparisons method)
`ain`	Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other.
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`DE_method`	String specifying which DE method should be applied (limma, ROTS, DEqMS)
`covariate`	String specifying which column to include as covariate into limma
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values
`B`	Number of bootstrapping for ROTS
`K`	Number of top-ranked features for reproducibility optimization
`trend`	logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance.
`robust`	logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?
`DEqMS_PSMs_column`	String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted.

Value

Data table of DE results of selected normalized data sets

Examples

data(tuberculosis_TMT_se)
comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL,
                            sep = NULL, control = NULL)
de_res <- run_DE(tuberculosis_TMT_se, comparisons,
              ain = NULL, condition = NULL, DE_method = "limma",
              logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE,
              alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE)

data(tuberculosis_TMT_se)
comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL,
                            sep = NULL, control = NULL)
de_res <- run_DE(tuberculosis_TMT_se, comparisons,
              ain = NULL, condition = NULL, DE_method = "limma",
              logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE,
              alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE)

Run DE analysis on a single normalized data set

Description

Run DE analysis on a single normalized data set

Usage

run_DE_single(
  se,
  method,
  comparisons,
  condition = NULL,
  DE_method = "limma",
  covariate = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500,
  trend = TRUE,
  robust = TRUE,
  DEqMS_PSMs_column = NULL
)
run_DE_single(
  se,
  method,
  comparisons,
  condition = NULL,
  DE_method = "limma",
  covariate = NULL,
  logFC = TRUE,
  logFC_up = 1,
  logFC_down = -1,
  p_adj = TRUE,
  alpha = 0.05,
  B = 100,
  K = 500,
  trend = TRUE,
  robust = TRUE,
  DEqMS_PSMs_column = NULL
)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`method`	String specifying which assay should be used as input
`comparisons`	Vector of comparisons that are performed in the DE analysis (from specify_comparisons method)
`condition`	column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`DE_method`	String specifying which DE method should be applied (limma, ROTS, DEqMS)
`covariate`	String specifying which column to include as covariate into limma
`logFC`	Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE)
`logFC_up`	Upper log2 fold change threshold (dividing into up regulated)
`logFC_down`	Lower log2 fold change threshold (dividing into down regulated)
`p_adj`	Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE)
`alpha`	Threshold for adjusted p-values or p-values
`B`	Number of bootstrapping for ROTS
`K`	Number of top-ranked features for reproducibility optimization
`trend`	logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance.
`robust`	logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?
`DEqMS_PSMs_column`	String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted.

Value

Data table of DE results

Create vector of comparisons for DE analysis (either by single condition (sep = NULL) or by combined condition)

Description

Create vector of comparisons for DE analysis (either by single condition (sep = NULL) or by combined condition)

Usage

specify_comparisons(se, condition = NULL, sep = NULL, control = NULL)
specify_comparisons(se, condition = NULL, sep = NULL, control = NULL)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`condition`	Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken)
`sep`	Separator that separates both groups in the condition vector (NULL if condition composed only of single group)
`control`	String of control samples (how the control condition is named) (NULL if no control sample)

Value

Vector of comparisons for DE analysis

Examples

data(tuberculosis_TMT_se)
comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL,
                            sep = NULL, control = NULL)

data(tuberculosis_TMT_se)
comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL,
                            sep = NULL, control = NULL)

Additional function of the DEqMS package

Description

Additional function of the DEqMS package

Usage

spectraCounteBayes_DEqMS(fit, coef_col)
spectraCounteBayes_DEqMS(fit, coef_col)

Arguments

`fit`	linear model from function perform_limma
`coef_col`	an integer vector indicating the column(s) of fit$coefficients for which the function is to be performed. if not specified, all coefficients are used.

Value

list object

Example data.table of DE results of a spike-in proteomics data set

Description

A data.table containing the DE results of the spike_in_se data set (limma, logFC > 1, logFC < -1, p.adj < 0.05)

Usage

data(spike_in_de_res)
data(spike_in_de_res)

Format

An object of class data.table (inherits from data.frame) with 7500 rows and 10 columns.

Source

Jürgen Cox, Marco Y. Hein, Christian A. Luber, Igor Paron, Nagarjuna Nagaraj, and Matthias Mann.Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ. Molecular & Cellular Proteomics 13.9 (Sept. 2014), pp. 2513–2526. <https://doi.org/10.1074/mcp.M113.031591>.

Example SummarizedExperiment of a spike-in proteomics data set

Description

A SummarizedExperiment containing the raw and log2-scaled data of 301 proteins measured in 20 samples. Due to size restriction, we only included the relevant columns of the original proteinGroups.txt of MaxQuant.

Usage

data(spike_in_se)
data(spike_in_se)

Format

An object of class SummarizedExperiment with 1500 rows and 6 columns.

Source

Subset SummarizedExperiment object by normalization assays

Description

Subset SummarizedExperiment object by normalization assays

Usage

subset_SE_by_norm(se, ain)
subset_SE_by_norm(se, ain)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomics data set
`ain`	Character vector of assay names to keep in the SummarizedExperiment object

Value

SummarizedExperiment object with only the selected normalization assays

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- subset_SE_by_norm(tuberculosis_TMT_se,
                          ain = c("raw", "log2", "IRS_on_RobNorm"))

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- subset_SE_by_norm(tuberculosis_TMT_se,
                          ain = c("raw", "log2", "IRS_on_RobNorm"))

Weighted Trimmed Mean of M Values (TMM) Normalization of edgeR package.

Description

Raw data should be taken as input (on_raw = TRUE).

Usage

tmmNorm(se, ain = "raw", aout = "TMM", on_raw = TRUE)
tmmNorm(se, ain = "raw", aout = "TMM", on_raw = TRUE)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data

Value

SummarizedExperiment containing the TMM normalized data as assay (on log2 scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- tmmNorm(tuberculosis_TMT_se, ain = "raw",
                                  aout = "TMM", on_raw = TRUE)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- tmmNorm(tuberculosis_TMT_se, ain = "raw",
                                  aout = "TMM", on_raw = TRUE)

Example data.table of DE results of a real-world proteomics data set

Description

A data.table containing the DE results of the tuberculosis_TMT_se data set (limma, logFC > 1, logFC < -1, p.adj < 0.05)

Usage

data(tuberculosis_TMT_de_res)
data(tuberculosis_TMT_de_res)

Format

An object of class data.table (inherits from data.frame) with 9030 rows and 9 columns.

Source

Biadglegne et al. Mycobacterium tuberculosis Affects Protein and Lipid Content of Circulating Exosomes in Infected Patients Depending on Tuberculosis Disease State. Biomedicines 10.4 (Mar. 2022), p. 783. doi: 10.3390/ biomedicines10040783.

Example SummarizedExperiment of a real-world proteomics data set

Description

A SummarizedExperiment containing the raw and log2-scaled data of 301 proteins measured in 20 samples

Usage

data(tuberculosis_TMT_se)
data(tuberculosis_TMT_se)

Format

An object of class SummarizedExperiment with 301 rows and 20 columns.

Source

Variance Stabilization Normalization of limma package.

Description

Raw data should be taken as input (on_raw = TRUE).

Usage

vsnNorm(se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)
vsnNorm(se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)

Arguments

`se`	SummarizedExperiment containing all necessary information of the proteomic dataset
`ain`	String which assay should be used as input
`aout`	String which assay should be used to save normalized data
`on_raw`	Boolean specifying whether normalization should be performed on raw or log2-scaled data
`VSN_quantile`	Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression (see vsn2 lts.quantile)

Value

SummarizedExperiment containing the vsn normalized data as assay (on log2-scale)

Examples

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- vsnNorm(tuberculosis_TMT_se, ain = "raw",
                            aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)

data(tuberculosis_TMT_se)
tuberculosis_TMT_se <- vsnNorm(tuberculosis_TMT_se, ain = "raw",
                            aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)

Package 'PRONE'

Help Index

Apply other thresholds to DE results

Description

Usage

Arguments

Value

Examples

Outlier detection via POMA R Package

Description

Usage

Arguments

Value

Examples

EigenMS Normalization

Description

Usage

Arguments

Value

Examples

Export the SummarizedExperiment object, the meta data, and the normalized data.

Description

Usage

Arguments

Value

Examples

Extract consensus DE candidates

Description

Usage

Arguments

Value

Examples

Extract the DE results from eBayes fit of perform_limma function.

Description

Usage

Arguments

Value

Remove proteins with NAs in all samples

Description

Usage

Arguments

Value

Examples

Filter proteins based on their NA pattern using a specific threshold

Description

Usage

Arguments

Value

Examples

Remove proteins by their ID

Description

Usage

Arguments

Value

Examples

Remove proteins by value in specific column

Description

Usage

Arguments

Value

Examples

Function to get a long data table of all intensities of all kind of normalization

Description

Usage

Arguments

Value

Function to get a long data table of all PCA1 and PCA2 values of all kind of normalization

Description

Usage

Arguments

Value

Function returning some values on the numbers of NA in the data

Description

Usage

Arguments

Value

Examples

Function to return available normalization methods' identifier names

Description

Usage