--- title: "LimROTS: A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Analysis of Proteomics and Metabolomics Data" bibliography: references.bib output: BiocStyle::html_document: fig_height: 7 fig_width: 7 toc: true toc_float: true toc_depth: 3 number_sections: true vignette: > %\VignetteIndexEntry{LimROTS} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: markdown: wrap: 72 --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Introduction Differential expression analysis is a prevalent method utilised in the examination of diverse biological data. The reproducibility-optimized test statistic (ROTS) @rots has been developed with a modified t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups, as shown by the f-statistic. Focusing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples. Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilizing the limma empirical bayes @limma methodology to simulate more complex experimental designs. # Algorithm overview The **LimROTS** approach initially uses the limma package @limma to simulate the intensity data of proteins and metabolites. A linear model is subsequently fitted using the design matrix. Empirical Bayes variance shrinking is then implemented. To obtain the moderated t-statistics (or f-statistics), the adjusted standard error $SE_{post} = \sqrt{s^2_{\text{post}}} \times \text{unscaled SD}$ for each feature is computed, along with the regression coefficient for each feature (indicating the impact of variations in the experimental settings). Then, by adapting a reproducibility-optimized technique known as ROTS @rots to establish an optimality based on the largest overlap of top-ranked features within group-preserving bootstrap datasets (refer to @elo2008reproducibility for further information on the reproducibility-optimization). Finally based on the optimized parameters ${\alpha1}$ and ${\alpha2}$ this equation used to calculates the final statistics: $$t{\alpha(p)} = \frac{\beta_{(p)}}{\alpha1 + \alpha2 \times \text{SEpost}_{(p)}}$$ where $t_{\alpha(p)}$ is the final statistics for each feature, ${\beta_{(p)}}$ is the coefficient, and $SEpost_{(p)}$ is the adjusted standard error. LimROTS generates p-values from permutation samples using the implementation available in `qvalue` package @qvalue, along with internal implementation of FDR adapted from ROTS package @rots. Additionally, the `qvalue` package is used to calculate q-values, were the proportion of true null p-values is set to the bootstrap method. We recommend using permutation-derived p-values and qvalues. # Computational Power The number of samples, features, bootstrap iterations, and `k`, which denotes the top list size for ranking, are the four primary elements that determine the amount of computing resources required for the optimisation process in LimROTS. It is therefore advised to use at least 4 workers to execute LimROTS since it uses a parallel processing (using `BiocParallel` @BiocParallel) implementation for the bootstrapping step. 
The optimisation process itself is sequential and can be time-consuming, depending on the value of `K`; it is planned to be revised to make the upcoming LimROTS version faster.

# Parameter Description for LimROTS Main Function

The `LimROTS()` function is the primary interface for end users, providing the core functionality of the analysis implemented in the `LimROTS` package. `LimROTS()` takes several parameters, which must be specified correctly to obtain the desired output. For a detailed description of all parameters and their usage, refer to the function's help page by typing `?LimROTS` in the R console.

# UPS1 Case Study

To demonstrate the ability of LimROTS to detect true negatives in complex scenarios, we use DIA proteomics data from a UPS1-spiked E. coli protein mixture @GOTTI2022107829. The dataset includes 48 samples, 24 analyzed with Spectronaut and 24 with ScaffoldDIA, with a total of 1656 proteins. Eight different UPS1 concentrations were used (0.1 to 50 fmol), grouped into two categories: low concentrations (0.1–2.5 fmol, labeled Conc. 2; 12 samples from each software) and high concentrations (5–50 fmol, labeled Conc. 1; 12 samples from each software). Synthetic, unbalanced fake batches were assigned to the samples, with the following distribution:

```{r echo=FALSE, message=FALSE, results='hide'}
library(LimROTS, quietly = TRUE)

data("UPS1.Case4")
```

```{r echo=FALSE}
table(UPS1.Case4$fake.batch, UPS1.Case4$Conc.)
```

Additionally, 100 E. coli proteins were randomly selected, and an effect size of 10 was added to each of them in only one of the fake batches. The expected outcome is that only the UPS1 human proteins will be identified as truly significant, while none of the remaining proteins should show significant differences between the concentration groups. This scenario resembles a real-world case where the experiment involves unbalanced batch assignments or, for instance, an uneven gender ratio among the samples.

LimROTS accepts a `SummarizedExperiment` object containing all the metadata needed to run the model. In this example we import the `UPS1.Case4` data available in LimROTS; the original source of the dataset is @GOTTI2022107829.

```{r, message=FALSE}
# Load necessary packages
library(LimROTS, quietly = TRUE)
library(BiocParallel, quietly = TRUE)
library(ggplot2, quietly = TRUE)
library(SummarizedExperiment, quietly = TRUE)
```

```{r}
# Load the dataset
data("UPS1.Case4")
print(UPS1.Case4)
```

### Run LimROTS

**Before running LimROTS, the seed should be set for reproducibility.**

```{r}
set.seed(1234, kind = "default", sample.kind = "default")
```

```{r}
# Set metadata and formula for LimROTS analysis
meta.info <- c("Conc.", "tool", "fake.batch")
niter <- 100 # Number of bootstrap samples
K <- 100 # Set the value for K based on the data size
K <- floor(K)
group.name <- "Conc."
formula.str <- "~ 0 + Conc. + tool + fake.batch" # Formula for group comparison

# Run LimROTS analysis with trend and robust settings enabled
UPS1.Case4 <- LimROTS(
    x = UPS1.Case4,
    niter = niter,
    K = K,
    meta.info = meta.info,
    BPPARAM = NULL,
    group.name = group.name,
    formula.str = formula.str,
    trend = TRUE,
    robust = TRUE,
    permutating.group = FALSE
)
```
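LimROTS appends its results to the `rowData` and `metadata` of the input object. As a minimal sketch (not evaluated here; the `qvalue` and `corrected.logfc` columns are those produced by LimROTS and used later in this vignette), the top-ranked features can be previewed as follows:

```{r, eval=FALSE}
# Preview the most significant features, ranked by q-value
limrots.res <- data.frame(rowData(UPS1.Case4))
head(limrots.res[order(limrots.res$qvalue), c("corrected.logfc", "qvalue")])
```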
**NOTE:** In this instance, we set both the number of bootstrap iterations (`niter`) and the number of top-ranked features used for reproducibility optimization (`K`) to 100 in order to minimize the example's run-time. For actual analyses, it is advisable to use a larger number of bootstrap iterations (e.g., 1000). We also recommend using at least 4 workers. On Windows, the user can define a `BPPARAM` with more than 2 workers as follows:

```{r}
# BPPARAM <- SnowParam(workers = 4)
# Commented out in the vignette to ensure the limit set by the
# R_CHECK_LIMIT_CORES environment variable is not exceeded.
```

On Linux and macOS:

```{r}
# BPPARAM <- MulticoreParam(workers = 4)
# Commented out in the vignette to ensure the limit set by the
# R_CHECK_LIMIT_CORES environment variable is not exceeded.
```

The `BPPARAM` object can then be passed to `LimROTS()`. The results from the `LimROTS` function are appended to the `SummarizedExperiment` object used, in this case `UPS1.Case4`.

### Volcano Plot with ggplot2

Using a volcano plot and highlighting the human UPS1 proteins at a 5% q-value threshold, it is evident that LimROTS correctly identified the majority of the true positive proteins while detecting only a limited number of the simulated E. coli proteins.

```{r}
# Create a data frame from the LimROTS results
limrots.result.df <- data.frame(rowData(UPS1.Case4),
    row.names = rownames(UPS1.Case4)
)

# Mark proteins as true positives (HUMAN UPS1 proteins)
limrots.result.df$TP <- ifelse(grepl("HUMAN", limrots.result.df$GeneID),
    "HUMAN_TP", "ECOLI_FP"
)

# Create a volcano plot
ggplot(limrots.result.df, aes(
    x = corrected.logfc, y = -log10(qvalue),
    color = factor(TP)
)) +
    geom_point(alpha = 0.8) +
    theme_bw() +
    labs(
        title = "Volcano Plot", x = "Log Fold Change", y = "-Log10 q.value",
        color = "True Positive"
    ) +
    scale_color_manual(values = c("grey", "red")) +
    geom_hline(yintercept = -log10(0.05), linetype = "dashed", color = "blue")
```

### Quality Control Plots

LimROTS generates permutation-based p-values along with an FDR estimate. Additionally, the `qvalue` package is used to calculate q-values and Benjamini-Hochberg adjusted p-values based on the permutation-derived p-values. These can be used as quality control for the LimROTS results. We recommend using the permutation-derived p-values and q-values, though they should generally be very similar to the FDR and the Benjamini-Hochberg adjusted p-values.

```{r results="hide", message=FALSE, warning=FALSE}
## Quality Control Plots

# Plot of q-values
plot(metadata(UPS1.Case4)[["q_values"]])

# Histogram of q-values
hist(metadata(UPS1.Case4)[["q_values"]])

# Summary of q-values
print(summary(metadata(UPS1.Case4)[["q_values"]]))
```

# Comparison of LimROTS, limma, and ROTS

Before comparing LimROTS with limma and ROTS, we need to install the required packages: limma, ROTS, and caret (used for the comparison).

```{r, eval=FALSE}
if (!require("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

if (!require(limma)) {
    BiocManager::install("limma")
}

if (!require(ROTS)) {
    BiocManager::install("ROTS")
}

if (!require(caret)) {
    install.packages("caret")
}
```

## Run ROTS

```{r}
library(ROTS)

groups <- as.numeric(UPS1.Case4$Conc.)
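# Note: standard ROTS accepts only the group labels, so the technical
# covariates used above by LimROTS (tool and fake.batch) cannot be
# included in this model.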
rots.result <- ROTS(
    data = assay(UPS1.Case4), groups = groups, B = niter,
    K = K, seed = 1234
)

rots.result <- data.frame(
    proteins = row.names(rots.result$data),
    logFC = rots.result$logfc, FDR = rots.result$FDR
)
```

## Run limma

```{r}
library(limma)

design.matrix <- model.matrix(formula(formula.str), data = colData(UPS1.Case4))
fit <- lmFit(assay(UPS1.Case4), design.matrix)
cont_matrix <- makeContrasts("Conc.1-Conc.2", levels = design.matrix)
fit2 <- contrasts.fit(fit, cont_matrix)
fit.ebayes <- eBayes(fit2, trend = TRUE, robust = TRUE)
limma.result <- topTable(fit.ebayes, coef = "Conc.1-Conc.2", number = Inf)
```

## Generating a Confusion Matrix for Comparison

```{r}
library(caret, quietly = TRUE, warn.conflicts = TRUE)

# True positives are the human UPS1 proteins
TP <- elementMetadata(UPS1.Case4)[["GeneID"]]
TP <- TP[grepl("HUMAN", TP)]

# LimROTS predictions
predictions_limrots <- elementMetadata(UPS1.Case4)[["qvalue"]] < 0.05
predictions_limrots <- factor(predictions_limrots, levels = c(TRUE, FALSE))
true_labels_limrots <- ifelse(rownames(UPS1.Case4) %in% TP, TRUE, FALSE)
true_labels_limrots <- factor(true_labels_limrots, levels = c(TRUE, FALSE))
conf_matrix_limrots <- confusionMatrix(
    predictions_limrots, true_labels_limrots
)

# ROTS predictions
predictions_rots <- rots.result$FDR < 0.05
predictions_rots <- factor(predictions_rots, levels = c(TRUE, FALSE))
true_labels_rots <- ifelse(rots.result$proteins %in% TP, TRUE, FALSE)
true_labels_rots <- factor(true_labels_rots, levels = c(TRUE, FALSE))
conf_matrix_rots <- confusionMatrix(predictions_rots, true_labels_rots)

# limma predictions
predictions_limma <- limma.result$adj.P.Val < 0.05
predictions_limma <- factor(predictions_limma, levels = c(TRUE, FALSE))
true_labels_limma <- ifelse(row.names(limma.result) %in% TP, TRUE, FALSE)
true_labels_limma <- factor(true_labels_limma, levels = c(TRUE, FALSE))
conf_matrix_limma <- confusionMatrix(predictions_limma, true_labels_limma)
```

## Summarizing the Comparison Results

```{r}
library(ggplot2)

# Extract selected performance metrics from a caret confusion matrix
extract_metrics <- function(conf_matrix, method_name) {
    metrics <- c(
        conf_matrix$byClass["Sensitivity"],
        conf_matrix$byClass["Specificity"],
        conf_matrix$byClass["Pos Pred Value"],
        conf_matrix$byClass["Neg Pred Value"],
        conf_matrix$byClass["F1"],
        conf_matrix$byClass["Balanced Accuracy"]
    )
    data.frame(
        Method = method_name, Metric = names(metrics),
        Value = as.numeric(metrics)
    )
}

metrics_limrots <- extract_metrics(conf_matrix_limrots, "LimROTS")
metrics_rots <- extract_metrics(conf_matrix_rots, "ROTS")
metrics_limma <- extract_metrics(conf_matrix_limma, "limma")

all_metrics <- do.call(rbind, list(
    metrics_limrots, metrics_rots,
    metrics_limma
))

ggplot(all_metrics, aes(x = Metric, y = Value, fill = Method)) +
    geom_bar(stat = "identity", position = "dodge") +
    theme_bw() +
    labs(
        title = "Comparison of Performance: Case Study",
        y = "Value", x = "Metric"
    ) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    theme(
        plot.title = element_text(face = "bold", color = "black"),
        axis.title = element_text(face = "bold", color = "black"),
        axis.text = element_text(face = "bold", color = "black"),
        axis.ticks = element_line(color = "black")
    )
```

Based on this evaluation, LimROTS emerges as the most suitable method overall when considering the balance across all performance metrics. It achieves a strong combination of sensitivity, specificity, predictive values, F1 score, and balanced accuracy, suggesting that it can reliably identify true positives and true negatives while maintaining consistent predictive performance.
In contrast, while ROTS and limma excel in certain metrics such as sensitivity, their lower specificity and predictive values indicate limitations in broader applicability. Therefore, LimROTS appears to provide the most consistent and balanced performance for general use cases.

```{r}
sessionInfo()
```

# References