Title: | The PROteomics Normalization Evaluator |
---|---|
Description: | High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed. |
Authors: | Lis Arend [aut, cre] |
Maintainer: | Lis Arend <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.1.0 |
Built: | 2024-10-31 03:45:33 UTC |
Source: | https://github.com/bioc/PRONE |
Apply other thresholds to DE results
apply_thresholds( de_res, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
apply_thresholds( de_res, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
de_res |
data table resulting of run_DE |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
data table updating the Change column with the newly applied thresholds
data(tuberculosis_TMT_de_res) de_res <- apply_thresholds(tuberculosis_TMT_de_res, logFC = FALSE, p_adj = TRUE, alpha = 0.01)
data(tuberculosis_TMT_de_res) de_res <- apply_thresholds(tuberculosis_TMT_de_res, logFC = FALSE, p_adj = TRUE, alpha = 0.01)
Outlier detection via POMA R Package
detect_outliers_POMA( se, ain = "log2", condition = NULL, method = "euclidean", type = "median", group = TRUE, coeff = 1.5 )
detect_outliers_POMA( se, ain = "log2", condition = NULL, method = "euclidean", type = "median", group = TRUE, coeff = 1.5 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which data type should be used (default raw) |
condition |
Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
method |
String specifying the method that should be used to calculate the distance matrix |
type |
String specifying the type of distance calculation to centroid or spatial median |
group |
String specifying if the outlier detection should be performed multi-variate (with conditions) or on the complete data set |
coeff |
This value corresponds to the classical 1.5 in Q3 + 1.5 * IQR formula to detect outliers. By changing this value, the permissiveness in outlier detection will change. |
list of two ggplot objects and a data.table with outlier samples
data(tuberculosis_TMT_se) poma_res <- detect_outliers_POMA(tuberculosis_TMT_se, ain="raw", condition = NULL, method="euclidean", type="median", group=TRUE, coeff = 1.5)
data(tuberculosis_TMT_se) poma_res <- detect_outliers_POMA(tuberculosis_TMT_se, ain="raw", condition = NULL, method="euclidean", type="median", group=TRUE, coeff = 1.5)
EigenMS fits an analysis of variance model to estimate the effects of the experimental factors on the data using the knowledge about the experimental design, and then applies singular value decomposition to identify systematic trends contributing to significant variation not explained by the experimental factors Log2-scaled data should be used as input (on_raw = FALSE).
eigenMSNorm(se, ain = "log2", aout = "EigenMS", on_raw = FALSE)
eigenMSNorm(se, ain = "log2", aout = "EigenMS", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the EigenMS normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- eigenMSNorm(tuberculosis_TMT_se, ain = "log2", aout = "EigenMS", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- eigenMSNorm(tuberculosis_TMT_se, ain = "log2", aout = "EigenMS", on_raw = FALSE)
Export the SummarizedExperiment object, the meta data, and the normalized data.
export_data(se, out_dir, ain = NULL)
export_data(se, out_dir, ain = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
out_dir |
Path of output directory |
ain |
Vector of strings which assay should be downloaded (default NULL). If NULL then all assays of the se object are saved. |
Nothing
data(tuberculosis_TMT_se) ## Not run: export_data(tuberculosis_TMT_se, out_dir = "data/", ain = c("IRS_on_RobNorm", "IRS_on_Median")) ## End(Not run)
data(tuberculosis_TMT_se) ## Not run: export_data(tuberculosis_TMT_se, out_dir = "data/", ain = c("IRS_on_RobNorm", "IRS_on_Median")) ## End(Not run)
Extract consensus DE candidates
extract_consensus_DE_candidates( de_res, ain = NULL, comparisons = NULL, norm_thr = 0.8, per_comparison = FALSE )
extract_consensus_DE_candidates( de_res, ain = NULL, comparisons = NULL, norm_thr = 0.8, per_comparison = FALSE )
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
norm_thr |
Threshold for the number of normalization methods that must agree on a DE candidate |
per_comparison |
Logical indicating if the consensus should be calculated per comparison |
data table with consensus DE candidates
data(tuberculosis_TMT_de_res) extract_consensus_DE_candidates(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, norm_thr = 0.8, per_comparison = TRUE)
data(tuberculosis_TMT_de_res) extract_consensus_DE_candidates(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, norm_thr = 0.8, per_comparison = TRUE)
Extract the DE results from eBayes fit of perform_limma function.
extract_limma_DE( fit, comparisons, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
extract_limma_DE( fit, comparisons, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
fit |
eBayes object resulting from perform_limma method |
comparisons |
Vector of comparisons that are performed in the DE analysis (from specify_comparisons method) |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
Data table with limma DE results
Remove proteins with NAs in all samples
filter_out_complete_NA_proteins(se)
filter_out_complete_NA_proteins(se)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberulosis_TMT_se <- filter_out_complete_NA_proteins(tuberculosis_TMT_se)
data(tuberculosis_TMT_se) tuberulosis_TMT_se <- filter_out_complete_NA_proteins(tuberculosis_TMT_se)
Filter proteins based on their NA pattern using a specific threshold
filter_out_NA_proteins_by_threshold(se, thr = 0.8)
filter_out_NA_proteins_by_threshold(se, thr = 0.8)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
thr |
Threshold for the minimum fraction of valid values allowed for any protein |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_NA_proteins_by_threshold(tuberculosis_TMT_se, thr = 0.8)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_NA_proteins_by_threshold(tuberculosis_TMT_se, thr = 0.8)
Remove proteins by their ID
filter_out_proteins_by_ID(se, protein_ids)
filter_out_proteins_by_ID(se, protein_ids)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
protein_ids |
Vector of protein IDs that should be kept |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_proteins_by_ID(tuberculosis_TMT_se, protein_ids = c("P0A8V2", "P0A8V2"))
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_proteins_by_ID(tuberculosis_TMT_se, protein_ids = c("P0A8V2", "P0A8V2"))
Remove proteins by value in specific column
filter_out_proteins_by_value(se, column_name = "Reverse", values = c("+"))
filter_out_proteins_by_value(se, column_name = "Reverse", values = c("+"))
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
column_name |
name of column of which proteins with a specific value should be removed |
values |
value of the column defining the proteins that should be removed |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_proteins_by_value(tuberculosis_TMT_se, column_name = "Reverse", values = c("+"))
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- filter_out_proteins_by_value(tuberculosis_TMT_se, column_name = "Reverse", values = c("+"))
Function to get a long data table of all intensities of all kind of normalization
get_complete_dt(se, ain = NULL)
get_complete_dt(se, ain = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which assay should be used as input (default NULL) If NULL then all normalization of the SummarizedExperiment object are plotted next to each other (except raw). |
data table
Function to get a long data table of all PCA1 and PCA2 values of all kind of normalization
get_complete_pca_dt(se, ain = NULL)
get_complete_pca_dt(se, ain = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which assay should be used as input (default NULL) If NULL then all normalization of the SummarizedExperiment object are plotted next to each other (except raw). |
data table
Function returning some values on the numbers of NA in the data
get_NA_overview(se, ain = "log2")
get_NA_overview(se, ain = "log2")
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which data type should be used (default raw) |
list with total amount of values in the data, amount of NA values, and the percentage of NAs
data(tuberculosis_TMT_se) get_NA_overview(tuberculosis_TMT_se, ain="log2")
data(tuberculosis_TMT_se) get_NA_overview(tuberculosis_TMT_se, ain="log2")
Function to return available normalization methods' identifier names
get_normalization_methods()
get_normalization_methods()
Vector of normalization methods
get_normalization_methods()
get_normalization_methods()
Get overview table of DE results
get_overview_DE(de_res)
get_overview_DE(de_res)
de_res |
data table resulting of run_DE |
data table of numbers of DE proteins per comparison and per normalization method
data(tuberculosis_TMT_de_res) get_overview_DE(tuberculosis_TMT_de_res)
data(tuberculosis_TMT_de_res) get_overview_DE(tuberculosis_TMT_de_res)
Get proteins by value in specific column
get_proteins_by_value(se, column_name = "Reverse", values = c("+"))
get_proteins_by_value(se, column_name = "Reverse", values = c("+"))
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
column_name |
name of column of which proteins with a specific value should be identified |
values |
value of the column defining the proteins that should be identified |
vector of protein IDs
data(tuberculosis_TMT_se) proteins <- get_proteins_by_value(tuberculosis_TMT_se, column_name = "Potential.contaminant", values = c("+"))
data(tuberculosis_TMT_se) proteins <- get_proteins_by_value(tuberculosis_TMT_se, column_name = "Potential.contaminant", values = c("+"))
Get performance metrics of DE results of spike-in data set.
get_spiked_stats_DE(se, de_res)
get_spiked_stats_DE(se, de_res)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
data table with multiple performance metrics of the DE results
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res)
Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the median or mean of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).
globalIntNorm( se, ain = "raw", aout = "GlobalMedian", type = "median", on_raw = TRUE )
globalIntNorm( se, ain = "raw", aout = "GlobalMedian", type = "median", on_raw = TRUE )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
type |
String whether to use median or mean to calculate the scaling factor |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalIntNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMedian", type = "median", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalIntNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMedian", type = "median", on_raw = TRUE)
Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the mean of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).
globalMeanNorm(se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)
globalMeanNorm(se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalMeanNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalMeanNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMean", on_raw = TRUE)
Intensities of each variable in a sample are divided with the sum of intensities of all variables in the sample and multiplied with the median of sum of intensities of all variables in all samples. Raw data should be taken as input (on_raw = TRUE).
globalMedianNorm(se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)
globalMedianNorm(se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the total intensity normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalMedianNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- globalMedianNorm(tuberculosis_TMT_se, ain = "raw", aout = "GlobalMedian", on_raw = TRUE)
Method to impute SummarizedExperiment. This method performs a mixed imputation on the proteins. It uses a k-nearest neighbor imputation for proteins with missing values at random (MAR) and imputes missing values by random draws from a left-shifted Gaussian distribution for proteins with missing values not at random (MNAR).
impute_se(se, ain = NULL, condition = NULL)
impute_se(se, ain = NULL, condition = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics dataset |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
name of column of colData(se) representing the conditions of the data |
SummarizedExperiment with imputed intensities
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se, column = "Label", values = c("1.HC_Pool1")) tuberculosis_TMT_se <- impute_se(tuberculosis_TMT_se, ain = NULL, condition = NULL)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se, column = "Label", values = c("1.HC_Pool1")) tuberculosis_TMT_se <- impute_se(tuberculosis_TMT_se, ain = NULL, condition = NULL)
IRS makes different measurements of the same thing all exactly the same and puts all of the intensities on the same scale. Raw data should be taken as input (on_raw = TRUE)
irsNorm(se, ain = "raw", aout = "IRS", on_raw = TRUE)
irsNorm(se, ain = "raw", aout = "IRS", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the IRS normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- irsNorm(tuberculosis_TMT_se, ain = "raw", aout = "IRS", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- irsNorm(tuberculosis_TMT_se, ain = "raw", aout = "IRS", on_raw = TRUE)
Log2-scaled data should be used as input (on_raw = FALSE).
limmaNorm(se, ain = "log2", aout = "limBE", on_raw = FALSE)
limmaNorm(se, ain = "log2", aout = "limBE", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the limBE normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- limmaNorm(tuberculosis_TMT_se, ain = "log2", aout = "limBE", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- limmaNorm(tuberculosis_TMT_se, ain = "log2", aout = "limBE", on_raw = FALSE)
Load real-world proteomics data into a SummarizedExperiment
load_data( data, md, protein_column = "Protein.IDs", gene_column = "Gene.Names", ref_samples = NULL, batch_column = NULL, condition_column = NULL, label_column = NULL )
load_data( data, md, protein_column = "Protein.IDs", gene_column = "Gene.Names", ref_samples = NULL, batch_column = NULL, condition_column = NULL, label_column = NULL )
data |
tabular data table with rows = proteins and columns = samples (such as proteinGroups.txt of MaxQuant) |
md |
experimental design table (requires a column named "Column" for the column names of the sample intensities in data) |
protein_column |
name of the column in data containing the protein IDs |
gene_column |
name of the column in data containing the gene names |
ref_samples |
reference samples if TMT experiment provided (names of samples) |
batch_column |
name of the column in md defining the batches |
condition_column |
name of the column in md defining the condition (can still be changed afterwards) |
label_column |
name of the column in md containing simple sample names (for visualization) |
SummarizedExperiment object
data_path <- readPRONE_example("tuberculosis_protein_intensities.csv") md_path <- readPRONE_example("tuberculosis_metadata.csv") data <- read.csv(data_path) md <- read.csv(md_path) md$Column <- stringr::str_replace_all(md$Column, " ", ".") ref_samples <- md[md$Group == "ref",]$Column se <- load_data(data, md, protein_column = "Protein.IDs", gene_column = "Gene.names", ref_samples = ref_samples, batch_column = "Pool", condition_column = "Group", label_column = "Label")
data_path <- readPRONE_example("tuberculosis_protein_intensities.csv") md_path <- readPRONE_example("tuberculosis_metadata.csv") data <- read.csv(data_path) md <- read.csv(md_path) md$Column <- stringr::str_replace_all(md$Column, " ", ".") ref_samples <- md[md$Group == "ref",]$Column se <- load_data(data, md, protein_column = "Protein.IDs", gene_column = "Gene.names", ref_samples = ref_samples, batch_column = "Pool", condition_column = "Group", label_column = "Label")
Load spike-in proteomics data into a SummarizedExperiment
load_spike_data( data, md, spike_column, spike_value, spike_concentration, protein_column = "Protein.IDs", gene_column = "Gene.Names", ref_samples = NULL, batch_column = NULL, condition_column = NULL, label_column = NULL )
load_spike_data( data, md, spike_column, spike_value, spike_concentration, protein_column = "Protein.IDs", gene_column = "Gene.Names", ref_samples = NULL, batch_column = NULL, condition_column = NULL, label_column = NULL )
data |
tabular data table with rows = proteins and columns = samples (such as proteinGroups.txt of MaxQuant) |
md |
experimental design table (requires a column named "Column" for the column names of the sample intensities in data) |
spike_column |
name of the column specifying which proteins are the spike-ins |
spike_value |
String value specifying the spike-in proteins in the spike-in column |
spike_concentration |
name of the column in md defining the spike-in concentration per sample |
protein_column |
name of the column in data containing the protein IDs |
gene_column |
name of the column in data containing the gene names |
ref_samples |
reference samples if TMT experiment provided (names of samples) |
batch_column |
name of the column in md defining the batches |
condition_column |
name of the column in md defining the condition (can still be changed afterwards) |
label_column |
name of the column in md containing simple sample names (for visualization) |
SummarizedExperiment object
data_path <- readPRONE_example("Ecoli_human_MaxLFQ_protein_intensities.csv") md_path <- readPRONE_example("Ecoli_human_MaxLFQ_metadata.csv") data <- read.csv(data_path) md <- read.csv(md_path) mixed <- grepl("Homo sapiens.*Escherichia|Escherichia.*Homo sapiens", data$Fasta.headers) data <- data[!mixed,] data$Spiked <- rep("HUMAN", nrow(data)) data$Spiked[grepl("ECOLI", data$Fasta.headers)] <- "ECOLI" se <- load_spike_data(data, md, spike_column = "Spiked", spike_value = "ECOLI", spike_concentration = "Concentration", protein_column = "Protein.IDs", gene_column = "Gene.names", ref_samples = NULL, batch_column = NULL, condition_column = "Condition", label_column = "Label")
data_path <- readPRONE_example("Ecoli_human_MaxLFQ_protein_intensities.csv") md_path <- readPRONE_example("Ecoli_human_MaxLFQ_metadata.csv") data <- read.csv(data_path) md <- read.csv(md_path) mixed <- grepl("Homo sapiens.*Escherichia|Escherichia.*Homo sapiens", data$Fasta.headers) data <- data[!mixed,] data$Spiked <- rep("HUMAN", nrow(data)) data$Spiked[grepl("ECOLI", data$Fasta.headers)] <- "ECOLI" se <- load_spike_data(data, md, spike_column = "Spiked", spike_value = "ECOLI", spike_concentration = "Concentration", protein_column = "Protein.IDs", gene_column = "Gene.names", ref_samples = NULL, batch_column = NULL, condition_column = "Condition", label_column = "Label")
Two samples of the data are MA transformed and normalized at a time, and all pairs of samples are iterated through. Log2-scaled data should be taken as input (on_raw = FALSE).
loessCycNorm(se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
loessCycNorm(se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the loessCyc normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- loessCycNorm(tuberculosis_TMT_se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- loessCycNorm(tuberculosis_TMT_se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
Using mean intensities over all the samples as its reference A sample. Log2-scaled data should be used as input (on_raw = FALSE).
loessFNorm(se, ain = "log2", aout = "LoessF", on_raw = FALSE)
loessFNorm(se, ain = "log2", aout = "LoessF", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the LoessF normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- loessFNorm(tuberculosis_TMT_se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- loessFNorm(tuberculosis_TMT_se, ain = "log2", aout = "LoessCyc", on_raw = FALSE)
The intensity of each protein group in a given sample is divided by the mean of the intensities of all protein groups in that sample and then multiplied by the mean of mean of sum of intensities of all protein groups in all samples.
meanNorm(se, ain = "raw", aout = "Mean", on_raw = TRUE)
meanNorm(se, ain = "raw", aout = "Mean", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean whether normalized should be performed on raw or log2-scaled data |
SummarizedExperiment containing the mean normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- meanNorm(tuberculosis_TMT_se, ain = "raw", aout = "Mean", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- meanNorm(tuberculosis_TMT_se, ain = "raw", aout = "Mean", on_raw = TRUE)
Substracts the median and divides the data by the median absolute deviation (MAD). Log2-scaled data should be used as input (on_raw = FALSE).
medianAbsDevNorm(se, ain = "log2", aout = "MAD", on_raw = FALSE)
medianAbsDevNorm(se, ain = "log2", aout = "MAD", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scale data |
SummarizedExperiment containing the MAD normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- medianAbsDevNorm(tuberculosis_TMT_se, ain = "log2", aout = "MAD", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- medianAbsDevNorm(tuberculosis_TMT_se, ain = "log2", aout = "MAD", on_raw = FALSE)
The intensity of each protein group in a given sample is divided by the median of the intensities of all protein groups in that sample and then multiplied by the mean of median of sum of intensities of all protein groups in all samples.
medianNorm(se, ain = "raw", aout = "Median", on_raw = TRUE)
medianNorm(se, ain = "raw", aout = "Median", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the median normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- medianNorm(tuberculosis_TMT_se, ain = "raw", aout = "Median", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- medianNorm(tuberculosis_TMT_se, ain = "raw", aout = "Median", on_raw = TRUE)
Normalize SummarizedExperiment object using single normalization methods or specified combinations of normalization methods
normalize_se( se, methods, combination_pattern = "_on_", on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
normalize_se( se, methods, combination_pattern = "_on_", on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
methods |
Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods()) |
combination_pattern |
String specifying how normalization methods are combined. For instance, methods = c("IRS", "Median_on_IRS"), combination_pattern = "_on_". |
on_raw |
Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications). |
gamma.0 |
Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1. |
reduce_correlation_by |
If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc. |
NormicsVSN_quantile |
The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers. |
top_x |
Number of reference proteins extracted for the calculation of parameters |
VSN_quantile |
Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression (see vsn2 lts.quantile) |
SummarizedExperiment object with normalized data saved as assays
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se(tuberculosis_TMT_se, methods = c("IRS_on_GlobalMedian", "IRS_on_Median", "limBE_on_NormicsVSN"), on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se(tuberculosis_TMT_se, methods = c("IRS_on_GlobalMedian", "IRS_on_Median", "limBE_on_NormicsVSN"), on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
Normalize SummarizedExperiment object using combinations of normalization methods
normalize_se_combination( se, methods, ains, on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
normalize_se_combination( se, methods, ains, on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
methods |
Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods()) |
ains |
Vector of assays of SummarizedExperiment object to apply the normalization methods (e.g. if you want to perform Median normalization on IRS-normalized data) |
on_raw |
Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications). |
combination_pattern |
String to give name to combination of methods (e.g. IRS_on_Median –> "_on_") |
gamma.0 |
Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1. |
reduce_correlation_by |
If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc. |
NormicsVSN_quantile |
The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers. |
top_x |
Number of reference proteins extracted for the calculation of parameters |
VSN_quantile |
Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression. (see vsn2 lts.quantile) |
SummarizedExperiment object with normalized data saved as assays
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se_combination(tuberculosis_TMT_se, methods = c("Median","NormicsVSN"), ains = c("IRS"), on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se_combination(tuberculosis_TMT_se, methods = c("Median","NormicsVSN"), ains = c("IRS"), on_raw = NULL, combination_pattern = "_on_", gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
Normalize SummarizedExperiment object using different normalization methods
normalize_se_single( se, methods = NULL, on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
normalize_se_single( se, methods = NULL, on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
methods |
Vector of normalization methods to apply for normalizing the proteomics data of the SummarizedExperiment object (identifier of normalization methods can be retrieved using get_normalization_methods()) |
on_raw |
Logical indicating if the normalization should be performed on the raw data or on log2-transformed data. If on_raw = NULL(default), the normalization is performed on the default method specific on_raw setting (suggestion based on publications). |
gamma.0 |
Numeric representing the exponent of the weighted density of RobNorm normalization. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1. |
reduce_correlation_by |
If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc. |
NormicsVSN_quantile |
The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers. |
top_x |
Number of reference proteins extracted for the calculation of parameters |
VSN_quantile |
Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression. (see vsn2 lts.quantile) |
SummarizedExperiment object with normalized data saved as assays
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se_single(tuberculosis_TMT_se, methods = c("RobNorm", "Median", "NormicsVSN","VSN"), on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normalize_se_single(tuberculosis_TMT_se, methods = c("RobNorm", "Median", "NormicsVSN","VSN"), on_raw = NULL, gamma.0 = 0.1, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, top_x = 50, VSN_quantile = 0.9)
Log2-scaled data should be used as input (on_raw = FALSE).
normicsNorm( se, ain = "raw", aout = "NormicsVSN", method = "NormicsVSN", on_raw = TRUE, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, TMT_ratio = FALSE, top_x = 50 )
normicsNorm( se, ain = "raw", aout = "NormicsVSN", method = "NormicsVSN", on_raw = TRUE, reduce_correlation_by = 1, NormicsVSN_quantile = 0.8, TMT_ratio = FALSE, top_x = 50 )
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
method |
String specifying the method to use (NORMICS or NORMICSmedian) |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
reduce_correlation_by |
If the data is too big for the computation of the params, increase this parameter by 2,3,4.... The whole data will still be normalized, but the params are calculated on every second row etc. |
NormicsVSN_quantile |
The quantile that is used for the resistant least trimmed sum of squares regression. A value of 0.8 means focusing on the central 80% of the data, reducing the influence of outliers. |
TMT_ratio |
Indicates if the data involves Tandem Mass Tag (TMT) ratio-based measurements (common in proteomics). If TRUE, the method may handle the data differently. |
top_x |
Number of reference proteins extracted for the calculation of parameters |
SummarizedExperiment containing the NormicsVSN/NormicsMedian normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normicsNorm(tuberculosis_TMT_se, ain = "raw", aout = "NormicsVSN", method = "NormicsVSN", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- normicsNorm(tuberculosis_TMT_se, ain = "raw", aout = "NormicsVSN", method = "NormicsVSN", on_raw = TRUE)
Perform DEqMS
perform_DEqMS( fit, se, DEqMS_PSMs_column = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
perform_DEqMS( fit, se, DEqMS_PSMs_column = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05 )
fit |
eBayes object resulting from perform_limma method |
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
DEqMS_PSMs_column |
String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted. |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
data.table of DE results
Fitting a linear model using limma
perform_limma( data, condition_vector, comparisons, covariate = NULL, trend = TRUE, robust = TRUE )
perform_limma( data, condition_vector, comparisons, covariate = NULL, trend = TRUE, robust = TRUE )
data |
Data table of intensities (rows = proteins, cols = samples) |
condition_vector |
Vector of experimental design specifying the condition(s) to compare |
comparisons |
Vector of comparisons that are performed in the DE analysis (from specify_comparisons method) |
covariate |
String specifying which column to include as covariate into limma |
trend |
logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance. |
robust |
logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances? |
eBayes object
Performing ROTS
perform_ROTS( data, condition, comparisons, condition_name, coldata, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500 )
perform_ROTS( data, condition, comparisons, condition_name, coldata, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500 )
data |
Data table of intensities (rows = proteins, cols = samples) |
condition |
Vector of experimental design specifying the condition(s) to compare |
comparisons |
Vector of comparisons that are performed in the DE analysis (from specify_comparisons method) |
condition_name |
String of name of condition in colData |
coldata |
colData of the SummarizedExperiment |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
B |
Number of bootstrapping for ROTS |
K |
Number of top-ranked features for reproducibility optimization |
Data table with DE results
Plot the distributions of the normalized data as boxplots
plot_boxplots( se, ain = NULL, color_by = NULL, label_by = NULL, facet_norm = TRUE, ncol = 3 )
plot_boxplots( se, ain = NULL, color_by = NULL, label_by = NULL, facet_norm = TRUE, ncol = 3 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
facet_norm |
Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned. |
ncol |
Number of columns in plot (for faceting) |
if facet_norm = TRUE, ggplot object, else list of ggplot objects
data(tuberculosis_TMT_se) plot_boxplots(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL, facet_norm = TRUE, ncol = 3) plot_boxplots(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Pool", label_by = "Label", facet_norm = FALSE)
data(tuberculosis_TMT_se) plot_boxplots(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL, facet_norm = TRUE, ncol = 3) plot_boxplots(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Pool", label_by = "Label", facet_norm = FALSE)
Barplot showing the number of samples per condition
plot_condition_overview(se, condition = NULL)
plot_condition_overview(se, condition = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
ggplot object
data(tuberculosis_TMT_se) plot_condition_overview(tuberculosis_TMT_se, condition = NULL)
data(tuberculosis_TMT_se) plot_condition_overview(tuberculosis_TMT_se, condition = NULL)
Plot the densities of the normalized data
plot_densities(se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3)
plot_densities(se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
facet_norm |
Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned. |
ncol |
Number of columns in plot (for faceting) |
if facet_norm = TRUE, ggplot object, else list of ggplot objects
data(tuberculosis_TMT_se) plot_densities(tuberculosis_TMT_se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3) plot_densities(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Label", facet_norm = FALSE)
data(tuberculosis_TMT_se) plot_densities(tuberculosis_TMT_se, ain = NULL, color_by = NULL, facet_norm = TRUE, ncol = 3) plot_densities(tuberculosis_TMT_se, ain = c("log2", "IRS_on_RobNorm"), color_by = "Label", facet_norm = FALSE)
Boxplot of log fold changes of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.
plot_fold_changes_spiked(se, de_res, condition, ain = NULL, comparisons = NULL)
plot_fold_changes_spiked(se, de_res, condition, ain = NULL, comparisons = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
ggplot object
data(spike_in_se) data(spike_in_de_res) plot_fold_changes_spiked(spike_in_se, spike_in_de_res, condition = "Condition", ain = NULL, comparisons = NULL)
data(spike_in_se) data(spike_in_de_res) plot_fold_changes_spiked(spike_in_se, spike_in_de_res, condition = "Condition", ain = NULL, comparisons = NULL)
Plot a heatmap of the sample intensities with optional column annotations for a selection of normalization methods
plot_heatmap( se, ain = NULL, color_by = c("Group", "Pool"), label_by = NULL, only_refs = FALSE )
plot_heatmap( se, ain = NULL, color_by = c("Group", "Pool"), label_by = NULL, only_refs = FALSE )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
color_by |
Vector of strings specifying the columns to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bars added.) |
label_by |
String specifying the column in the metadata used to label the samples for the UpSet plot |
only_refs |
Logical, if TRUE, only reference samples (ComRef) are included in the plot |
list of ggplot objects
data(tuberculosis_TMT_se) plot_heatmap(tuberculosis_TMT_se, ain = c("log2"), color_by = NULL, label_by = NULL, only_refs = FALSE)
data(tuberculosis_TMT_se) plot_heatmap(tuberculosis_TMT_se, ain = c("log2"), color_by = NULL, label_by = NULL, only_refs = FALSE)
Heatmap of DE results
plot_heatmap_DE( se, de_res, ain, comparison, condition = NULL, label_by = NULL, pvalue_column = "adj.P.Val", col_vector = NULL )
plot_heatmap_DE( se, de_res, ain, comparison, condition = NULL, label_by = NULL, pvalue_column = "adj.P.Val", col_vector = NULL )
se |
SummarizedExperiment containing all necessary information of the proteomics data set (including the normalized intensities) |
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparison |
String of comparison (must be a valid comparison saved in de_res) |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
pvalue_column |
column name of p-values in de_res |
col_vector |
Vector of colors to use for the heatmap. If NULL, default colors are used. |
list of ComplexHeatmaps for each method
data(tuberculosis_TMT_se) data(tuberculosis_TMT_de_res) plot_heatmap_DE(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = NULL, comparison = "PTB-HC", condition = NULL, label_by = NULL, pvalue_column = "adj.P.Val", col_vector = NULL)
data(tuberculosis_TMT_se) data(tuberculosis_TMT_de_res) plot_heatmap_DE(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = NULL, comparison = "PTB-HC", condition = NULL, label_by = NULL, pvalue_column = "adj.P.Val", col_vector = NULL)
Plot histogram of the spike-in and background protein intensities per condition.
plot_histogram_spiked(se, condition = NULL)
plot_histogram_spiked(se, condition = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
ggplot object
data(spike_in_se) plot_histogram_spiked(spike_in_se, condition = NULL)
data(spike_in_se) plot_histogram_spiked(spike_in_se, condition = NULL)
Plot number of identified spike-in proteins per sample.
plot_identified_spiked_proteins(se, color_by = NULL, label_by = NULL)
plot_identified_spiked_proteins(se, color_by = NULL, label_by = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.)#' |
ggplot object
data(spike_in_se) plot_identified_spiked_proteins(spike_in_se, color_by = NULL, label_by = NULL)
data(spike_in_se) plot_identified_spiked_proteins(spike_in_se, color_by = NULL, label_by = NULL)
Intersect top N enrichment terms per normalization method
plot_intersection_enrichment( se, de_res, ain = NULL, comparisons = NULL, id_column = "Gene.Names", organism = "hsapiens", per_comparison = TRUE, sources = c("GO:BP", "GO:MF", "GO:CC"), top = 10 )
plot_intersection_enrichment( se, de_res, ain = NULL, comparisons = NULL, id_column = "Gene.Names", organism = "hsapiens", per_comparison = TRUE, sources = c("GO:BP", "GO:MF", "GO:CC"), top = 10 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
id_column |
String specifying the column of the rowData of the SummarizedExperiment object which includes the gene names |
organism |
Organism name (gprofiler parameter) |
per_comparison |
Boolean specifying whether the enrichment analysis should be performed per comparison (TRUE) or on all given comparisons together (FALSE) |
sources |
Vector of data sources to use (gprofiler parameter) |
top |
Number of enrichment terms to extract for each normalization method |
list of ggplot objects or single ggplot object
data(tuberculosis_TMT_se) data(tuberculosis_TMT_de_res) plot_intersection_enrichment(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = c("IRS_on_RobNorm", "IRS_on_Median"), comparisons = NULL, id_column = "Gene.Names", organism = "hsapiens", per_comparison = TRUE, sources = c("GO:BP", "GO:MF", "GO:CC"), top = 10)
data(tuberculosis_TMT_se) data(tuberculosis_TMT_de_res) plot_intersection_enrichment(tuberculosis_TMT_se, tuberculosis_TMT_de_res, ain = c("IRS_on_RobNorm", "IRS_on_Median"), comparisons = NULL, id_column = "Gene.Names", organism = "hsapiens", per_comparison = TRUE, sources = c("GO:BP", "GO:MF", "GO:CC"), top = 10)
Plot intragroup correlation of the normalized data
plot_intragroup_correlation( se, ain = NULL, condition = NULL, method = "pearson" )
plot_intragroup_correlation( se, ain = NULL, condition = NULL, method = "pearson" )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
method |
String specifying the method for correlation calculation (pearson, spearman or kendall) |
ggplot object (boxplot)
data(tuberculosis_TMT_se) plot_intragroup_correlation(tuberculosis_TMT_se, ain = NULL, condition = NULL, method = "pearson")
data(tuberculosis_TMT_se) plot_intragroup_correlation(tuberculosis_TMT_se, ain = NULL, condition = NULL, method = "pearson")
Plot intragroup pooled coefficient of variation (PCV) of the normalized data
plot_intragroup_PCV(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PCV(se, ain = NULL, condition = NULL, diff = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
diff |
Boolean indicating whether to visualize the reduction of intragroup variation (PCV) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PCV) for each normalization method (FALSE). |
ggplot object (boxplot)
data(tuberculosis_TMT_se) plot_intragroup_PCV(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
data(tuberculosis_TMT_se) plot_intragroup_PCV(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
Plot intragroup pooled estimate of variance (PEV) of the normalized data
plot_intragroup_PEV(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PEV(se, ain = NULL, condition = NULL, diff = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
diff |
Boolean indicating whether to visualize the reduction of intragroup variation (PEV) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PEV) for each normalization method (FALSE). |
ggplot object (boxplot)
data(tuberculosis_TMT_se) plot_intragroup_PEV(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
data(tuberculosis_TMT_se) plot_intragroup_PEV(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
Plot intragroup pooled median absolute deviation (PMAD) of the normalized data
plot_intragroup_PMAD(se, ain = NULL, condition = NULL, diff = FALSE)
plot_intragroup_PMAD(se, ain = NULL, condition = NULL, diff = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
diff |
Boolean indicating whether to visualize the reduction of intragroup variation (PMAD) compared to "log" (TRUE) or a normal boxplot of intragroup variation (PMAD) for each normalization method (FALSE). |
ggplot object (boxplot)
data(tuberculosis_TMT_se) plot_intragroup_PMAD(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
data(tuberculosis_TMT_se) plot_intragroup_PMAD(tuberculosis_TMT_se, ain = NULL, condition = NULL, diff = FALSE)
Jaccard similarity heatmap of DE proteins of the different normalization methods
plot_jaccard_heatmap( de_res, ain = NULL, comparisons = NULL, plot_type = "single" )
plot_jaccard_heatmap( de_res, ain = NULL, comparisons = NULL, plot_type = "single" )
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
plot_type |
String indicating whether to plot a single plot per comparison ("single"), facet by comparison ("facet_comp"), or include all comparisons in a single plot ("all") |
ggplot object (list of objects if plot_type == "single)
data(tuberculosis_TMT_de_res) plot_jaccard_heatmap(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, plot_type = "all")
data(tuberculosis_TMT_de_res) plot_jaccard_heatmap(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, plot_type = "all")
Line plot of number of true and false positives when applying different logFC thresholds
plot_logFC_thresholds_spiked( se, de_res, condition, ain = NULL, comparisons = NULL, nrow = 2, alpha = 0.05 )
plot_logFC_thresholds_spiked( se, de_res, condition, ain = NULL, comparisons = NULL, nrow = 2, alpha = 0.05 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
nrow |
number of rows for facet wrap |
alpha |
threshold for adjusted p-values |
list of ggplot objects
data(spike_in_se) data(spike_in_de_res) plot_logFC_thresholds_spiked(spike_in_se, spike_in_de_res, condition = "Condition", ain = NULL, comparisons = NULL, nrow = 2, alpha = 0.05)
data(spike_in_se) data(spike_in_de_res) plot_logFC_thresholds_spiked(spike_in_se, spike_in_de_res, condition = "Condition", ain = NULL, comparisons = NULL, nrow = 2, alpha = 0.05)
Boxplots of intensities of specific markers
plot_markers_boxplots( se, markers, ain = NULL, id_column = "Protein.IDs", color_by = NULL, shape_by = NULL, facet_norm = TRUE, facet_marker = FALSE )
plot_markers_boxplots( se, markers, ain = NULL, id_column = "Protein.IDs", color_by = NULL, shape_by = NULL, facet_norm = TRUE, facet_marker = FALSE )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
markers |
Vector of the IDs of the markers to plot |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
id_column |
String specifying the column of the rowData of the SummarizedExperiment object which includes the IDs of the markers |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
shape_by |
String specifying the column to shape the samples (If NULL or "No", no shaping of samples is done.) |
facet_norm |
Boolean indicating whether to facet by normalization method (TRUE) or not (FALSE) |
facet_marker |
Boolean indicating whether to facet by comparison (TRUE) or not (FALSE). Only valid if facet_norm = FALSE. |
ggplot object
data(tuberculosis_TMT_se) plot_markers_boxplots(tuberculosis_TMT_se, markers = c("Q7Z7F0", "Q13790"), ain = c("log2"), id_column = "Protein.IDs", color_by = NULL, shape_by = "Pool", facet_norm = FALSE, facet_marker = TRUE)
data(tuberculosis_TMT_se) plot_markers_boxplots(tuberculosis_TMT_se, markers = c("Q7Z7F0", "Q13790"), ain = c("log2"), id_column = "Protein.IDs", color_by = NULL, shape_by = "Pool", facet_norm = FALSE, facet_marker = TRUE)
Plot the intensity distribution of proteins with and without NAs
plot_NA_density(se)
plot_NA_density(se)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ggplot object
data(tuberculosis_TMT_se) plot_NA_density(tuberculosis_TMT_se)
data(tuberculosis_TMT_se) plot_NA_density(tuberculosis_TMT_se)
Plot protein identification overlap (x = identified in number of Samples, y=number of proteins)
plot_NA_frequency(se)
plot_NA_frequency(se)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ggplot object
data(tuberculosis_TMT_se) plot_NA_frequency(tuberculosis_TMT_se)
data(tuberculosis_TMT_se) plot_NA_frequency(tuberculosis_TMT_se)
Plot heatmap of the NA pattern
plot_NA_heatmap( se, color_by = NULL, label_by = NULL, cluster_samples = TRUE, cluster_proteins = TRUE, show_row_dend = TRUE, show_column_dend = FALSE, col_vector = NULL )
plot_NA_heatmap( se, color_by = NULL, label_by = NULL, cluster_samples = TRUE, cluster_proteins = TRUE, show_row_dend = TRUE, show_column_dend = FALSE, col_vector = NULL )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
cluster_samples |
Boolean. TRUE if samples should be clustered, else FALSE. |
cluster_proteins |
Boolean. TRUE if proteins should be clustered, else FALSE. |
show_row_dend |
Boolean. TRUE if row dendrogram should be shown. |
show_column_dend |
Boolean. TRUE if column dendrogram should be shown. |
col_vector |
Vector of colors for the color bar. If NULL, default colors are used. |
ComplexHeatmap plot (only showing proteins with at least one missing value)
data(tuberculosis_TMT_se) plot_NA_heatmap(tuberculosis_TMT_se, color_by = NULL, label_by = NULL, cluster_samples = TRUE, cluster_proteins = TRUE, show_row_dend = TRUE, show_column_dend = FALSE, col_vector = NULL)
data(tuberculosis_TMT_se) plot_NA_heatmap(tuberculosis_TMT_se, color_by = NULL, label_by = NULL, cluster_samples = TRUE, cluster_proteins = TRUE, show_row_dend = TRUE, show_column_dend = FALSE, col_vector = NULL)
Plot number of non-zero proteins per sample
plot_nr_prot_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
plot_nr_prot_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which data type should be used (default raw) |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
ggplot object
data(tuberculosis_TMT_se) plot_nr_prot_samples(tuberculosis_TMT_se, ain="raw", color_by = "Group", label_by = "Label")
data(tuberculosis_TMT_se) plot_nr_prot_samples(tuberculosis_TMT_se, ain="raw", color_by = "Group", label_by = "Label")
Overview plots of DE results
plot_overview_DE_bar( de_res, ain = NULL, comparisons = NULL, plot_type = "single" )
plot_overview_DE_bar( de_res, ain = NULL, comparisons = NULL, plot_type = "single" )
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
plot_type |
String indicating whether to plot a single plot per comparison ("single"), facet by comparison ("facet_comp"), stack the number of DE per comparison ("stacked"), or stack the number of DE per comparison but facet by up- and down-regulated ("facet_regulation") |
list of ggplot objects or single object if plot_type = facet or stacked
data(tuberculosis_TMT_de_res) plot_overview_DE_bar(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, plot_type = "facet_regulation")
data(tuberculosis_TMT_de_res) plot_overview_DE_bar(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, plot_type = "facet_regulation")
Overview heatmap plot of DE results
plot_overview_DE_tile(de_res, ain = NULL, comparisons = NULL)
plot_overview_DE_tile(de_res, ain = NULL, comparisons = NULL)
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
ggplot object
data(tuberculosis_TMT_de_res) plot_overview_DE_tile(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL)
data(tuberculosis_TMT_de_res) plot_overview_DE_tile(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL)
PCA plot of the normalized data
plot_PCA( se, ain = NULL, color_by = NULL, label_by = NULL, shape_by = NULL, facet_norm = TRUE, facet_by = NULL, ellipse = FALSE, ncol = 3 )
plot_PCA( se, ain = NULL, color_by = NULL, label_by = NULL, shape_by = NULL, facet_norm = TRUE, facet_by = NULL, ellipse = FALSE, ncol = 3 )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
shape_by |
String specifying the column to shape the samples (If NULL or "No", no shaping of samples is done.) |
facet_norm |
Boolean specifying whether to facet by normalization methods (default TRUE). If FALSE, list of plots of the different normalized data is returned. However, then you can also facet by any column of the metadata. |
facet_by |
String specifying the column to facet the samples (If facet = FALSE, the plot will not be faceted by the normalization methods, but instead a list of plots of each normalization method is returned. Then, the PCA plot can be faceted by any column of the metadata, for instance by "Batch". If facet_by is NULL or "No", no faceting is performed.) |
ellipse |
Boolean to indicate if ellipses should be drawn |
ncol |
Number of columns in plot (for faceting) |
if facet_norm = TRUE, ggplot object, else list of ggplot objects
data(tuberculosis_TMT_se) plot_PCA(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL, shape_by = "Pool", facet_norm = TRUE, ncol = 3) plot_PCA(tuberculosis_TMT_se, ain = c("IRS_on_RobNorm"), color_by = "Group", label_by = "Label", facet_norm = FALSE, facet_by = "Pool")
data(tuberculosis_TMT_se) plot_PCA(tuberculosis_TMT_se, ain = NULL, color_by = NULL, label_by = NULL, shape_by = "Pool", facet_norm = TRUE, ncol = 3) plot_PCA(tuberculosis_TMT_se, ain = c("IRS_on_RobNorm"), color_by = "Group", label_by = "Label", facet_norm = FALSE, facet_by = "Pool")
Plot profiles of the spike-in and background proteins using the log2 average protein intensities as a function of the different concentrations.
plot_profiles_spiked(se, xlab = "Concentration")
plot_profiles_spiked(se, xlab = "Concentration")
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
xlab |
String for the x-label of the plot |
ggplot object
data(spike_in_se) plot_profiles_spiked(spike_in_se, xlab = "Concentration")
data(spike_in_se) plot_profiles_spiked(spike_in_se, xlab = "Concentration")
Boxplot of p-values of spike-in and background proteins for specific normalization methods and comparisons. The ground truth (calculated based on the concentrations of the spike-ins) is shown as a horizontal line.
plot_pvalues_spiked(se, de_res, ain = NULL, comparisons = NULL)
plot_pvalues_spiked(se, de_res, ain = NULL, comparisons = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
ggplot object
data(spike_in_se) data(spike_in_de_res) plot_pvalues_spiked(spike_in_se, spike_in_de_res, ain = NULL, comparisons = NULL)
data(spike_in_se) data(spike_in_de_res) plot_pvalues_spiked(spike_in_se, spike_in_de_res, ain = NULL, comparisons = NULL)
Plot ROC curve and barplot of AUC values for each method for a specific comparion or for all comparisons
plot_ROC_AUC_spiked(se, de_res, ain = NULL, comparisons = NULL)
plot_ROC_AUC_spiked(se, de_res, ain = NULL, comparisons = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
list of ggplot objects
data(spike_in_se) data(spike_in_de_res) plot_ROC_AUC_spiked(spike_in_se, spike_in_de_res)
data(spike_in_se) data(spike_in_de_res) plot_ROC_AUC_spiked(spike_in_se, spike_in_de_res)
Heatmap of performance metrics for spike-in data sets
plot_stats_spiked_heatmap( stats, ain = NULL, comparisons = NULL, metrics = c("Accuracy", "Precision", "F1Score") )
plot_stats_spiked_heatmap( stats, ain = NULL, comparisons = NULL, metrics = c("Accuracy", "Precision", "F1Score") )
stats |
data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
metrics |
vector of Strings specifying the metrics (must be colnames of stats) |
ggplot object
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_stats_spiked_heatmap(stats, ain = NULL, comparisons = NULL, metrics = c("F1Score", "Accuracy"))
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_stats_spiked_heatmap(stats, ain = NULL, comparisons = NULL, metrics = c("F1Score", "Accuracy"))
Plot total protein intensity per sample
plot_tot_int_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
plot_tot_int_samples(se, ain = "raw", color_by = NULL, label_by = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
String which data type should be used (default raw) |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used. If "No", no color bar added.) |
label_by |
String specifying the column to label the samples (If NULL, the labels column of the SummarizedExperiment object is used. If "No", no labeling of samples done.) |
list of a ggplot object and the dataframe of outliers
data(tuberculosis_TMT_se) plot_tot_int_samples(tuberculosis_TMT_se, ain="raw", color_by = NULL, label_by = NULL)
data(tuberculosis_TMT_se) plot_tot_int_samples(tuberculosis_TMT_se, ain="raw", color_by = NULL, label_by = NULL)
Barplot of true and false positives for specific comparisons and normalization methods
plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)
stats |
data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
ggplot object (barplot)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_bar(stats, ain = NULL, comparisons = NULL)
Boxplot of true and false positives for specific comparisons and normalization methods
plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)
stats |
data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
ggplot object (barplot)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_box(stats, ain = NULL, comparisons = NULL)
Scatterplot of true positives and false positives (median with errorbars as Q1, and Q3) for all comparisons
plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)
plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)
stats |
data table with multiple metrics of the DE results (resulting of get_spiked_stats_DE) |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in stats) |
comparisons |
Vector of comparisons (must be valid comparisons saved in stats) |
ggplot object
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)
data(spike_in_se) data(spike_in_de_res) stats <- get_spiked_stats_DE(spike_in_se, spike_in_de_res) plot_TP_FP_spiked_scatter(stats, ain = NULL, comparisons = NULL)
This function generates an UpSet plot from a given SummarizedExperiment object. It allows for the visualization of overlaps between sets defined by a specific column in the metadata. The function supports subsetting to reference samples and customizable color mapping.
plot_upset( se, color_by = NULL, label_by = NULL, mb.ratio = c(0.7, 0.3), only_refs = FALSE )
plot_upset( se, color_by = NULL, label_by = NULL, mb.ratio = c(0.7, 0.3), only_refs = FALSE )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
color_by |
String specifying the column to color the samples (If NULL, the condition column of the SummarizedExperiment object is used.) |
label_by |
String specifying the column in the metadata used to label the samples for the UpSet plot |
mb.ratio |
A numeric vector of length 2, specifying the barplot and matrix area ratios |
only_refs |
Logical, if TRUE, only reference samples (ComRef) are included in the plot |
ggplot object
data(tuberculosis_TMT_se) plot_upset(tuberculosis_TMT_se, color_by = NULL, label_by = NULL, mb.ratio = c(0.7, 0.3), only_refs = FALSE)
data(tuberculosis_TMT_se) plot_upset(tuberculosis_TMT_se, color_by = NULL, label_by = NULL, mb.ratio = c(0.7, 0.3), only_refs = FALSE)
Upset plots of DE results of the different normalization methods
plot_upset_DE( de_res, ain = NULL, comparisons = NULL, min_degree = 2, plot_type = "single" )
plot_upset_DE( de_res, ain = NULL, comparisons = NULL, min_degree = 2, plot_type = "single" )
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
min_degree |
Minimal degree of an intersection for it to be included |
plot_type |
String indicating whether to plot a single plot per comparison ("single") or stack the number of DE per comparison ("stacked) |
list of plots and intersection tables (split by comparison if plot_type == "single)
data(tuberculosis_TMT_de_res) plot_upset_DE(tuberculosis_TMT_de_res, ain = c("IRS_on_RobNorm", "IRS_on_Median"), comparisons = NULL, min_degree = 2, plot_type = "stacked")
data(tuberculosis_TMT_de_res) plot_upset_DE(tuberculosis_TMT_de_res, ain = c("IRS_on_RobNorm", "IRS_on_Median"), comparisons = NULL, min_degree = 2, plot_type = "stacked")
Volcano plots of DE results
plot_volcano_DE( de_res, ain = NULL, comparisons = NULL, facet_norm = TRUE, facet_comparison = FALSE )
plot_volcano_DE( de_res, ain = NULL, comparisons = NULL, facet_norm = TRUE, facet_comparison = FALSE )
de_res |
data table resulting of run_DE |
ain |
Vector of strings of normalization methods to visualize (must be valid normalization methods saved in de_res) |
comparisons |
Vector of comparisons (must be valid comparisons saved in de_res) |
facet_norm |
Boolean indicating whether to facet by normalization method (TRUE) or not (FALSE) |
facet_comparison |
Boolean indicating whether to facet by comparison (TRUE) or not (FALSE). Only valid if facet_norm = FALSE. |
list of ggplot objects
data(tuberculosis_TMT_de_res) plot_volcano_DE(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, facet_norm = TRUE, facet_comparison = FALSE)
data(tuberculosis_TMT_de_res) plot_volcano_DE(tuberculosis_TMT_de_res, ain = NULL, comparisons = NULL, facet_norm = TRUE, facet_comparison = FALSE)
Forces distributions of the samples to be the same on the basis of the quantiles of the samples by replacing each protein of a sample with the mean of the corresponding quantile. Log2-scaled data should be taken as input (on_raw = FALSE)
quantileNorm(se, ain = "log2", aout = "Quantile", on_raw = FALSE)
quantileNorm(se, ain = "log2", aout = "Quantile", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the quantile normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- quantileNorm(tuberculosis_TMT_se, ain = "log2", aout = "Quantile", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- quantileNorm(tuberculosis_TMT_se, ain = "log2", aout = "Quantile", on_raw = FALSE)
Helper function to read example data
readPRONE_example(path = NULL)
readPRONE_example(path = NULL)
path |
NULL to get all example data set files, otherwise specify the file name |
If path=NULL a character vector with the file names, otherwise the path to the specific file
readPRONE_example()
readPRONE_example()
Remove normalization assays from a SummarizedExperiment object
remove_assays_from_SE(se, assays_to_remove)
remove_assays_from_SE(se, assays_to_remove)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
assays_to_remove |
Character vector of assay names to remove from the SummarizedExperiment object |
SummarizedExperiment object with the normalization assays removed
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_assays_from_SE(tuberculosis_TMT_se, assays_to_remove = c("IRS_on_RobNorm"))
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_assays_from_SE(tuberculosis_TMT_se, assays_to_remove = c("IRS_on_RobNorm"))
Remove outliers samples detected by the detect_outliers_POMA function
remove_POMA_outliers(se, poma_res_outliers)
remove_POMA_outliers(se, poma_res_outliers)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
poma_res_outliers |
Outliers data.table returned by the detect_outliers_POMA function |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) poma_res <- detect_outliers_POMA(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_POMA_outliers(tuberculosis_TMT_se, poma_res$outliers)
data(tuberculosis_TMT_se) poma_res <- detect_outliers_POMA(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_POMA_outliers(tuberculosis_TMT_se, poma_res$outliers)
Remove reference samples of SummarizedExperiment object (reference samples specified during loading)
remove_reference_samples(se)
remove_reference_samples(se)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_reference_samples(tuberculosis_TMT_se)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_reference_samples(tuberculosis_TMT_se)
Remove samples with specific value in column manually
remove_samples_manually(se, column, values)
remove_samples_manually(se, column, values)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
column |
String specifying the column of the meta data (samples with the specified value in this column will be removed) |
values |
Vector of Strings specifying the value for the removal of samples (samples with this value in the specified column will be removed) |
filtered SummarizedExperiment object
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se, column = "Label", values = c("1.HC_Pool1"))
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- remove_samples_manually(tuberculosis_TMT_se, column = "Label", values = c("1.HC_Pool1"))
No reference, but MA transformation and normalization of samples is done pairwise between two samples with A = average of two samples and M = difference. The process is iterated through all samples pairs. Log2 data should be taken as input (on_raw = FALSE).
rlrMACycNorm( se, ain = "log2", aout = "RlrMACyc", on_raw = FALSE, iterations = 3 )
rlrMACycNorm( se, ain = "log2", aout = "RlrMACyc", on_raw = FALSE, iterations = 3 )
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
iterations |
Number of cyclic iterations to be performed |
SummarizedExperiment containing the RlrMACyc normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrMACycNorm(tuberculosis_TMT_se, ain = "log2", aout = "RlrMACyc", on_raw = FALSE, iterations=3)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrMACycNorm(tuberculosis_TMT_se, ain = "log2", aout = "RlrMACyc", on_raw = FALSE, iterations=3)
Similar to Rlr, but data are MA transformed before normalization, (A = median sample, M = difference of that sample to A). Log2 data should be taken as input (on_raw = FALSE).
rlrMANorm(se, ain = "log2", aout = "RlrMA", on_raw = FALSE)
rlrMANorm(se, ain = "log2", aout = "RlrMA", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the RlrMA normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrMANorm(tuberculosis_TMT_se, ain = "log2", aout = "RlrMA", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrMANorm(tuberculosis_TMT_se, ain = "log2", aout = "RlrMA", on_raw = FALSE)
Uses median values over all samples as reference sample to which all the other samples in the data are normalized to. Log2 data should be taken as input (on_raw = FALSE).
rlrNorm(se, ain = "log2", aout = "Rlr", on_raw = FALSE)
rlrNorm(se, ain = "log2", aout = "Rlr", on_raw = FALSE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the rlr normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrNorm(tuberculosis_TMT_se, ain = "log2", aout = "Rlr", on_raw = FALSE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- rlrNorm(tuberculosis_TMT_se, ain = "log2", aout = "Rlr", on_raw = FALSE)
Log2-scaled data should be used as input (on_raw = FALSE).
robnormNorm(se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)
robnormNorm(se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
gamma.0 |
Numeric representing the exponent of the weighted density. When the sample size is small, the fitted population of some proteins could be locally trapped such that the variance of those proteins was very small under a large gamma. To avoid this, a small gamma is recommended. When sample size smaller than 40, then set gamma to 0.5 or 0.1. |
SummarizedExperiment containing the RobNorm normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- robnormNorm(tuberculosis_TMT_se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- robnormNorm(tuberculosis_TMT_se, ain = "log2", aout = "RobNorm", on_raw = FALSE, gamma.0 = 0.1)
Run DE analysis of a selection of normalized data sets
run_DE( se, comparisons, ain = NULL, condition = NULL, DE_method = "limma", covariate = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE, DEqMS_PSMs_column = NULL )
run_DE( se, comparisons, ain = NULL, condition = NULL, DE_method = "limma", covariate = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE, DEqMS_PSMs_column = NULL )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
comparisons |
Vector of comparisons that are performed in the DE analysis (from specify_comparisons method) |
ain |
Vector of strings which assay should be used as input (default NULL). If NULL then all normalization of the se object are plotted next to each other. |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
DE_method |
String specifying which DE method should be applied (limma, ROTS, DEqMS) |
covariate |
String specifying which column to include as covariate into limma |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
B |
Number of bootstrapping for ROTS |
K |
Number of top-ranked features for reproducibility optimization |
trend |
logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance. |
robust |
logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances? |
DEqMS_PSMs_column |
String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted. |
Data table of DE results of selected normalized data sets
data(tuberculosis_TMT_se) comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL, sep = NULL, control = NULL) de_res <- run_DE(tuberculosis_TMT_se, comparisons, ain = NULL, condition = NULL, DE_method = "limma", logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE)
data(tuberculosis_TMT_se) comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL, sep = NULL, control = NULL) de_res <- run_DE(tuberculosis_TMT_se, comparisons, ain = NULL, condition = NULL, DE_method = "limma", logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE)
Run DE analysis on a single normalized data set
run_DE_single( se, method, comparisons, condition = NULL, DE_method = "limma", covariate = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE, DEqMS_PSMs_column = NULL )
run_DE_single( se, method, comparisons, condition = NULL, DE_method = "limma", covariate = NULL, logFC = TRUE, logFC_up = 1, logFC_down = -1, p_adj = TRUE, alpha = 0.05, B = 100, K = 500, trend = TRUE, robust = TRUE, DEqMS_PSMs_column = NULL )
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
method |
String specifying which assay should be used as input |
comparisons |
Vector of comparisons that are performed in the DE analysis (from specify_comparisons method) |
condition |
column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
DE_method |
String specifying which DE method should be applied (limma, ROTS, DEqMS) |
covariate |
String specifying which column to include as covariate into limma |
logFC |
Boolean specifying whether to apply a logFC threshold (TRUE) or not (FALSE) |
logFC_up |
Upper log2 fold change threshold (dividing into up regulated) |
logFC_down |
Lower log2 fold change threshold (dividing into down regulated) |
p_adj |
Boolean specifying whether to apply a threshold on adjusted p-values (TRUE) or on raw p-values (FALSE) |
alpha |
Threshold for adjusted p-values or p-values |
B |
Number of bootstrapping for ROTS |
K |
Number of top-ranked features for reproducibility optimization |
trend |
logical, should an intensity-dependent trend be allowed for the prior variance? If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance. |
robust |
logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances? |
DEqMS_PSMs_column |
String specifying which column name to use for DEqMS (default NULL). Any column of the rowData(se) is accepted. |
Data table of DE results
Create vector of comparisons for DE analysis (either by single condition (sep = NULL) or by combined condition)
specify_comparisons(se, condition = NULL, sep = NULL, control = NULL)
specify_comparisons(se, condition = NULL, sep = NULL, control = NULL)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
condition |
Column name of condition (if NULL, condition saved in SummarizedExperiment will be taken) |
sep |
Separator that separates both groups in the condition vector (NULL if condition composed only of single group) |
control |
String of control samples (how the control condition is named) (NULL if no control sample) |
Vector of comparisons for DE analysis
data(tuberculosis_TMT_se) comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL, sep = NULL, control = NULL)
data(tuberculosis_TMT_se) comparisons <- specify_comparisons(tuberculosis_TMT_se, condition = NULL, sep = NULL, control = NULL)
Additional function of the DEqMS package
spectraCounteBayes_DEqMS(fit, coef_col)
spectraCounteBayes_DEqMS(fit, coef_col)
fit |
linear model from function perform_limma |
coef_col |
an integer vector indicating the column(s) of fit$coefficients for which the function is to be performed. if not specified, all coefficients are used. |
list object
A data.table
containing the DE results of the spike_in_se data set (limma, logFC > 1, logFC < -1, p.adj < 0.05)
data(spike_in_de_res)
data(spike_in_de_res)
An object of class data.table
(inherits from data.frame
) with 7500 rows and 10 columns.
Jürgen Cox, Marco Y. Hein, Christian A. Luber, Igor Paron, Nagarjuna Nagaraj, and Matthias Mann.Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ. Molecular & Cellular Proteomics 13.9 (Sept. 2014), pp. 2513–2526. <https://doi.org/10.1074/mcp.M113.031591>.
A SummarizedExperiment
containing the raw and log2-scaled data of 301 proteins measured in 20 samples. Due to size restriction, we only included the relevant columns of the original proteinGroups.txt of MaxQuant.
data(spike_in_se)
data(spike_in_se)
An object of class SummarizedExperiment
with 1500 rows and 6 columns.
Jürgen Cox, Marco Y. Hein, Christian A. Luber, Igor Paron, Nagarjuna Nagaraj, and Matthias Mann.Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ. Molecular & Cellular Proteomics 13.9 (Sept. 2014), pp. 2513–2526. <https://doi.org/10.1074/mcp.M113.031591>.
Subset SummarizedExperiment object by normalization assays
subset_SE_by_norm(se, ain)
subset_SE_by_norm(se, ain)
se |
SummarizedExperiment containing all necessary information of the proteomics data set |
ain |
Character vector of assay names to keep in the SummarizedExperiment object |
SummarizedExperiment object with only the selected normalization assays
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- subset_SE_by_norm(tuberculosis_TMT_se, ain = c("raw", "log2", "IRS_on_RobNorm"))
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- subset_SE_by_norm(tuberculosis_TMT_se, ain = c("raw", "log2", "IRS_on_RobNorm"))
Raw data should be taken as input (on_raw = TRUE).
tmmNorm(se, ain = "raw", aout = "TMM", on_raw = TRUE)
tmmNorm(se, ain = "raw", aout = "TMM", on_raw = TRUE)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
SummarizedExperiment containing the TMM normalized data as assay (on log2 scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- tmmNorm(tuberculosis_TMT_se, ain = "raw", aout = "TMM", on_raw = TRUE)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- tmmNorm(tuberculosis_TMT_se, ain = "raw", aout = "TMM", on_raw = TRUE)
A data.table
containing the DE results of the tuberculosis_TMT_se data set (limma, logFC > 1, logFC < -1, p.adj < 0.05)
data(tuberculosis_TMT_de_res)
data(tuberculosis_TMT_de_res)
An object of class data.table
(inherits from data.frame
) with 9030 rows and 9 columns.
Biadglegne et al. Mycobacterium tuberculosis Affects Protein and Lipid Content of Circulating Exosomes in Infected Patients Depending on Tuberculosis Disease State. Biomedicines 10.4 (Mar. 2022), p. 783. doi: 10.3390/ biomedicines10040783.
A SummarizedExperiment
containing the raw and log2-scaled data of 301 proteins measured in 20 samples
data(tuberculosis_TMT_se)
data(tuberculosis_TMT_se)
An object of class SummarizedExperiment
with 301 rows and 20 columns.
Biadglegne et al. Mycobacterium tuberculosis Affects Protein and Lipid Content of Circulating Exosomes in Infected Patients Depending on Tuberculosis Disease State. Biomedicines 10.4 (Mar. 2022), p. 783. doi: 10.3390/ biomedicines10040783.
Raw data should be taken as input (on_raw = TRUE).
vsnNorm(se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)
vsnNorm(se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)
se |
SummarizedExperiment containing all necessary information of the proteomic dataset |
ain |
String which assay should be used as input |
aout |
String which assay should be used to save normalized data |
on_raw |
Boolean specifying whether normalization should be performed on raw or log2-scaled data |
VSN_quantile |
Numeric of length 1. The quantile that is used for the resistant least trimmed sum of squares regression (see vsn2 lts.quantile) |
SummarizedExperiment containing the vsn normalized data as assay (on log2-scale)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- vsnNorm(tuberculosis_TMT_se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)
data(tuberculosis_TMT_se) tuberculosis_TMT_se <- vsnNorm(tuberculosis_TMT_se, ain = "raw", aout = "VSN", on_raw = TRUE, VSN_quantile = 0.9)