Package 'RUCova' reference manual

Title:	Removes unwanted covariance from mass cytometry data
Description:	Mass cytometry enables the simultaneous measurement of dozens of protein markers at the single-cell level, producing high dimensional datasets that provide deep insights into cellular heterogeneity and function. However, these datasets often contain unwanted covariance introduced by technical variations, such as differences in cell size, staining efficiency, and instrument-specific artifacts, which can obscure biological signals and complicate downstream analysis. This package addresses this challenge by implementing a robust framework of linear models designed to identify and remove these sources of unwanted covariance. By systematically modeling and correcting for technical noise, the package enhances the quality and interpretability of mass cytometry data, enabling researchers to focus on biologically relevant signals.
Authors:	Rosario Astaburuaga-García [aut, cre] (ORCID: <https://orcid.org/0000-0003-1179-4080>)
Maintainer:	Rosario Astaburuaga-García <[email protected]>
License:	GPL-3
Version:	1.5.0
Built:	2026-07-03 23:46:56 UTC
Source:	https://github.com/bioc/RUCova

Calculate mean of normalised highest BC per cell

Description

Calculates, for each cell, the mean of the n_bc highest normalised barcoding (BC) signals.

Usage

calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc, q = 0.95)

Arguments

sce

A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function.

name_assay

A string specifying the name of the assay including the BC channels in linear scale. Default is "counts".

bc_channels

Vector specifying the names of the BC channels

n_bc

Number of barcoding isotopes per cell. n_bc = 3 for the Fluidigm kit.

q

Quantile for normalisation. Default is 0.95.

Value

The SingleCellExperiment object with an extra column "mean_BC" in the corresponding assay.

Examples

sce <- RUCova::sce
bc_channels <- c(c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di"),
  c("Dead_cells_194Pt", "Dead_cells_198Pt")
)
sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
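
The quantity described above can be illustrated on a toy matrix in base R. This is only a sketch under stated assumptions, not the RUCova implementation: the helper name mean_top_bc is hypothetical, and the exact order of operations (asinh, division of each channel by its q-th quantile, then averaging the n_bc largest values per cell) is assumed from the argument descriptions.

```r
# Hypothetical helper for illustration; NOT the RUCova implementation.
# counts: numeric matrix in linear scale, rows = BC channels, columns = cells.
mean_top_bc <- function(counts, n_bc, q = 0.95) {
  x <- asinh(counts)
  # q-th quantile of each channel (one value per row)
  channel_q <- apply(x, 1, stats::quantile, probs = q, names = FALSE)
  # a vector of length nrow(x) is recycled down the columns,
  # so row i is divided by channel_q[i]
  x_norm <- x / channel_q
  # per cell: mean of the n_bc largest normalised signals
  apply(x_norm, 2, function(v) mean(sort(v, decreasing = TRUE)[seq_len(n_bc)]))
}

set.seed(1)
toy <- matrix(rpois(6 * 5, lambda = 50), nrow = 6,
              dimnames = list(paste0("BC", 1:6), paste0("cell", 1:5)))
mean_bc <- mean_top_bc(toy, n_bc = 3)
```

The result is one value per cell, which is the shape a per-cell surrogate such as "mean_BC" needs.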

Calculate mean of normalised Iridium isotopes

Description

Calculates, for each cell, the mean of the normalised Iridium (DNA) channel signals.

Usage

calc_mean_DNA(sce, name_assay = "counts", dna_channels, q)

Arguments

sce

A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function.

name_assay

A string specifying the name of the assay including the DNA channels in linear scale.

dna_channels

Vector specifying the names of the DNA channels

q

Quantile for normalisation.

Value

The SingleCellExperiment object with an extra column "mean_DNA" in the corresponding assay.

Examples

sce <- RUCova::sce
dna_channels <- c("DNA_191Ir", "DNA_193Ir")
sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)

Get Pearson correlation coefficients between markers on a double triangular matrix for comparison (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric matrix.

Description

Get Pearson correlation coefficients between markers on a double triangular matrix for comparison (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric matrix.

Usage

compare_corr(
  sce,
  name_assay_before = "counts",
  name_assay_after = NULL,
  name_reduced_dim = NULL
)

Arguments

sce

A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay_before". Asinh transformation is applied within the function.

name_assay_before

A string specifying the name of the assay before RUCova (with original counts in linear scale).

name_assay_after

A string specifying the name of the assay after RUCova (with regressed counts in linear scale). Default is NULL.

name_reduced_dim

A string specifying the name of the dimensionality reduction data stored under reducedDim(). If "PCA", then PCs will be included in the heatmap.

Value

A matrix with Pearson correlation coefficients.

Examples

sce <- RUCova::sce
bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di",
"Dead_cells_194Pt", "Dead_cells_198Pt")
sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
dna_channels <- c("DNA_191Ir", "DNA_193Ir")
sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
# Markers:
m <- c("pH3","IdU","Cyclin_D1","Cyclin_B1", "Ki.67","pRb","pH2A.X","p.p53","p.p38",
"pChk2","pCDC25c","cCasp3","cPARP","pAkt","pAkt_T308","pMEK1.2","pERK1.2","pS6","p4e.BP1",
"pSmad1.8","pSmad2.3","pNFkB","IkBa", "CXCL1","Lamin_B1", "pStat1","pStat3", "YAP","NICD")
# SUCs:
x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x, 
apply_asinh_SUCs = TRUE,  model = "interaction", center_SUCs = "across_samples", 
col_name_sample = "line", name_assay_after = "counts_interaction")
compare_corr(sce[,sce$line == "Cal33"], name_assay_before = "counts", name_assay_after = "counts_interaction")
heatmap_compare_corr(sce[,sce$line == "Cal33"], name_assay_before = "counts", name_assay_after = "counts_interaction")
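
The double triangular layout can be reproduced on simulated data with base R alone. The sketch below is illustrative and does not call RUCova: the variable names are made up, the "after" data are simply residuals from regressing each toy marker on one simulated shared factor, and markers sit in columns here (as stats::cor expects), whereas in a SingleCellExperiment they are rows.

```r
set.seed(42)
n <- 200
shared <- rnorm(n)  # simulated technical factor common to all toy markers
before <- cbind(m1 = shared + rnorm(n),
                m2 = shared + rnorm(n),
                m3 = shared + rnorm(n))
# crude stand-in for a corrected assay: residuals after regressing on `shared`
after <- apply(before, 2, function(y) stats::resid(stats::lm(y ~ shared)))

cor_before <- stats::cor(before, method = "pearson")
cor_after  <- stats::cor(after,  method = "pearson")

# lower triangle (and diagonal): before; upper triangle: after
combined <- cor_before
combined[upper.tri(combined)] <- cor_after[upper.tri(cor_after)]
```

Reading the matrix across its diagonal then gives a direct before/after comparison for each marker pair. If no corrected data exist, leaving combined equal to cor_before yields the symmetric case.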

Plot Pearson correlation coefficients between markers on a double triangular heatmap (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric heatmap.

Description

Plot Pearson correlation coefficients between markers on a double triangular heatmap (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric heatmap.

Usage

heatmap_compare_corr(
  sce,
  name_assay_before = "counts",
  name_assay_after = NULL,
  name_reduced_dim = NULL
)

Arguments

sce

A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay_before". Asinh transformation is applied within the function.

name_assay_before

A string specifying the name of the assay before RUCova (with original counts in linear scale).

name_assay_after

A string specifying the name of the assay after RUCova (with regressed counts in linear scale). Default is NULL.

name_reduced_dim

A string specifying the name of the dimensionality reduction data stored under reducedDim(). If "PCA", then PCs will be included in the heatmap.

Value

A heatmap with Pearson correlation coefficients.

Examples

sce <- RUCova::sce
bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di",
"Dead_cells_194Pt", "Dead_cells_198Pt")
sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
dna_channels <- c("DNA_191Ir", "DNA_193Ir")
sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
# Markers:
m <- c("pH3","IdU","Cyclin_D1","Cyclin_B1", "Ki.67","pRb","pH2A.X","p.p53","p.p38",
"pChk2","pCDC25c","cCasp3","cPARP","pAkt","pAkt_T308","pMEK1.2","pERK1.2","pS6","p4e.BP1",
"pSmad1.8","pSmad2.3","pNFkB","IkBa", "CXCL1","Lamin_B1", "pStat1","pStat3", "YAP","NICD")
# SUCs:
x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x, 
apply_asinh_SUCs = TRUE,  model = "interaction", center_SUCs = "across_samples", 
col_name_sample = "line", name_assay_after = "counts_interaction")
heatmap_compare_corr(sce[,sce$line == "Cal33"], name_assay_before = "counts", name_assay_after = "counts_interaction")

HNSCC data set

Description

A tibble containing mass cytometry data of single-cell marker signals in linear scale (rows = cells, columns = markers and metadata). The data set should be clean, meaning that beads, debris, doublets, and dead cells have been excluded and single cells have been demultiplexed (important if you want to adapt the linear fits to the samples). This example data set consists of 8 Head-and-Neck Squamous Cell Carcinoma (HNSCC) lines in irradiated (10 Gy) and control (0 Gy) conditions (Figure 2 and Figure 3 in the manuscript).

Usage

HNSCC_data

Format

A data frame with 108649 rows and 59 variables:

CXCL1: Signal for the CXCL1 marker, a cytokine associated with inflammatory responses.
Ce140Di: Signal for the Cerium 140 marker, typically used as a control or calibration marker.
Cyclin_B1: Marker for the G2/M cell cycle phase regulator Cyclin B1.
Cyclin_D1: Marker for the G1 cell cycle phase regulator Cyclin D1.
DNA_191Ir, DNA_193Ir: Signals for DNA-intercalating markers labeled with Iridium isotopes, used to assess nuclear content.
Dead_cells_194Pt, Dead_cells_195Pt, Dead_cells_196Pt, Dead_cells_198Pt: Signals for dead cell markers labeled with Platinum isotopes, used to identify and exclude dead cells.
Event_length: Measure of event duration in the mass cytometer, used for quality control.
GDF15: Signal for GDF15, a marker associated with stress and inflammation.
IdU: Signal for IdU (Iododeoxyuridine), used to assess DNA synthesis.
IkBa: Signal for IkBa, an inhibitor of the NF-kB signaling pathway.
Ki.67: Signal for Ki-67, a marker of cell proliferation.
Lamin_B1: Signal for Lamin B1, a nuclear lamina protein.
NICD: Signal for the Notch Intracellular Domain, a marker of Notch signaling activity.
Pd102Di, Pd104Di, Pd105Di, Pd106Di, Pd108Di, Pd110Di: Signals for Palladium isotopes, used as barcodes for cell multiplexing.
Pt194Di_norm: Normalized signal for Platinum 194 isotope.
Time: Timestamp for the acquisition of each event.
YAP: Signal for Yes-associated protein (YAP), a transcriptional co-activator in the Hippo pathway.
bc: Barcode channel used for multiplexing.
beadsOut: Indicator for exclusion due to bead contamination.
cCasp3: Signal for cleaved Caspase-3, a marker of apoptosis.
cPARP: Signal for cleaved PARP, another marker of apoptosis.
dose: Treatment condition: either 0Gy (control) or 10Gy (irradiated).
irradiated: Indicator for whether the sample was irradiated.
line: HNSCC cell lines included in the data: "Cal27", "Cal33", "UPCISCC099", "UPCISCC131", "UTSCC16A", "UDSCC2", "UPCISCC154", and "VUSCC147".
lowPt: Indicator of low Platinum signal.
p.p38, p.p53, p4e.BP1: Signals for phosphorylated p38, p53, and 4E-BP1, markers of stress response and translation regulation.
pAkt, pAkt_T308: Signals for phosphorylated Akt, a marker of PI3K/Akt pathway activity.
pCDC25c, pChk2: Signals for phosphorylated CDC25c and Chk2, markers of DNA damage response.
pERK1.2: Signal for phosphorylated ERK1/2, a marker of MAPK pathway activity.
pH2A.X, pH3: Signals for phosphorylated H2A.X and H3, markers of DNA damage and mitosis, respectively.
pMEK1.2, pNFkB: Signals for phosphorylated MEK1/2 and NF-kB, markers of MAPK and inflammatory signaling.
pRb: Signal for phosphorylated Rb, a marker of cell cycle regulation.
pS6: Signal for phosphorylated S6, a marker of protein synthesis.
pSmad1.8, pSmad2.3: Signals for phosphorylated Smad1/8 and Smad2/3, markers of TGF-beta signaling.
pStat1, pStat3: Signals for phosphorylated Stat1 and Stat3, markers of JAK/STAT pathway activity.
pan_Akt: Total Akt protein signal.
replicate: Identifier for technical replicates.
singlets: Indicator for singlet events, excluding doublets.
total_ERK: Total ERK protein signal.

Remove unwanted covariance

Usage

rucova(
  sce,
  name_assay_before = "counts",
  markers,
  SUCs = c("mean_DNA", "mean_BC", "total_ERK", "pan_Akt"),
  name_reduced_dim = "PCA",
  apply_asinh_SUCs = TRUE,
  model = "interaction",
  col_name_sample = "line",
  center_SUCs = "across_samples",
  keep_offset = TRUE,
  name_assay_after = "counts_rucova"
)

Arguments

sce

A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay_before". Asinh transformation is applied within the function.

name_assay_before

A string specifying the name of the assay before RUCova (with original counts in linear scale).

markers

Vector of marker names to normalise, y (in linear scale).

SUCs

Vector of surrogates of unwanted covariance to use for normalisation, x (in linear scale).

name_reduced_dim

A string specifying the name of the dimensionality reduction result in the SingleCellExperiment sce.

apply_asinh_SUCs

Apply (TRUE) or not (FALSE) asinh transformation to the SUCs. TRUE if SUCs are the measured surrogates, FALSE if SUCs are PCs.

model

A character: "simple", "offset" or "interaction" defining the model.

col_name_sample

A character indicating the column name in the colData of sce defining each sample.

center_SUCs

A character, "across_samples" or "per_sample", defining how to center the SUCs at zero.

keep_offset

Keep (TRUE) or not (FALSE) the offset intercept between samples.

name_assay_after

A string specifying the name of the assay after RUCova (with regressed counts in linear scale).

Value

The input SingleCellExperiment object with an additional assay (name_assay_after) and a list in the metadata containing all the model details.

Examples

sce <- RUCova::sce
bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di", 
"Dead_cells_194Pt", "Dead_cells_198Pt")
sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
dna_channels <- c("DNA_191Ir", "DNA_193Ir")
sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
# Markers:
m <- c("pH3","IdU","Cyclin_D1","Cyclin_B1", "Ki.67","pRb","pH2A.X","p.p53","p.p38","pChk2",
"pCDC25c","cCasp3","cPARP","pAkt","pAkt_T308","pMEK1.2","pERK1.2","pS6","p4e.BP1","pSmad1.8",
"pSmad2.3","pNFkB","IkBa", "CXCL1","Lamin_B1", "pStat1","pStat3", "YAP","NICD")
# SUCs:
x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x, 
apply_asinh_SUCs = TRUE,  model = "interaction", center_SUCs = "across_samples", 
col_name_sample = "line", name_assay_after = "counts_interaction")
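
To make the three model options more concrete, the sketch below fits one plausible reading of them with stats::lm on simulated data. It is an assumption, not the package's code, that "simple" shares intercept and slope across samples, "offset" adds a sample-specific intercept, and "interaction" also lets the slope vary by sample. The data frame, column names, and the single toy SUC are all hypothetical.

```r
set.seed(7)
n <- 300
d <- data.frame(
  sample = factor(rep(c("A", "B"), each = n / 2)),
  suc    = rnorm(n)  # one toy surrogate, treated as already asinh-transformed
)
true_slope <- ifelse(d$sample == "A", 0.5, 1.0)
d$marker <- 2 + (d$sample == "B") + true_slope * d$suc + rnorm(n, sd = 0.3)

# centre the SUC at zero across all samples (cf. center_SUCs = "across_samples")
d$suc_c <- d$suc - mean(d$suc)

# assumed correspondence between model names and formulas
fits <- list(
  simple      = lm(marker ~ suc_c,          data = d),
  offset      = lm(marker ~ suc_c + sample, data = d),
  interaction = lm(marker ~ suc_c * sample, data = d)
)

# remove only the SUC-dependent terms of the interaction fit,
# leaving intercepts (and hence between-sample offsets) in place
X <- model.matrix(fits$interaction)
b <- coef(fits$interaction)
suc_terms <- grep("suc_c", names(b), value = TRUE)  # "suc_c", "suc_c:sampleB"
corrected <- d$marker - drop(X[, suc_terms, drop = FALSE] %*% b[suc_terms])
```

In this toy setting the corrected marker is uncorrelated with the SUC within each sample, while the difference in mean level between samples A and B is retained, which is the behaviour one would expect from keeping the offset.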

SingleCellExperiment Object with HNSCC Data Set

Description

A SingleCellExperiment object containing mass cytometry data of single-cell marker signals. The data includes one or more assays (e.g., "counts") with signals in linear scale. Rows represent markers, and columns represent cells. The data is clean, excluding calibration beads, debris, doublets, and dead cells. Single cells are demultiplexed, which is important for adapting linear fits to samples. Metadata such as samples and treatment conditions are stored in the colData.

Usage

sce

Format

A SingleCellExperiment object with the following components:

assays: One or more assays, such as "counts", containing the marker signals.
colData: Column metadata, including cell annotations such as cell_id, line, and dose.
rowData: Row metadata, including marker annotations.
