| Title: | Removes unwanted covariance from mass cytometry data |
|---|---|
| Description: | Mass cytometry enables the simultaneous measurement of dozens of protein markers at the single-cell level, producing high dimensional datasets that provide deep insights into cellular heterogeneity and function. However, these datasets often contain unwanted covariance introduced by technical variations, such as differences in cell size, staining efficiency, and instrument-specific artifacts, which can obscure biological signals and complicate downstream analysis. This package addresses this challenge by implementing a robust framework of linear models designed to identify and remove these sources of unwanted covariance. By systematically modeling and correcting for technical noise, the package enhances the quality and interpretability of mass cytometry data, enabling researchers to focus on biologically relevant signals. |
| Authors: | Rosario Astaburuaga-García [aut, cre] (ORCID: <https://orcid.org/0000-0003-1179-4080>) |
| Maintainer: | Rosario Astaburuaga-García <[email protected]> |
| License: | GPL-3 |
| Version: | 1.5.0 |
| Built: | 2026-05-30 09:53:09 UTC |
| Source: | https://github.com/bioc/RUCova |
Calculated mean of normalised highest BC per cell
calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc, q = 0.95)
| Argument | Description |
|---|---|
| sce | A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function. |
| name_assay | A string specifying the name of the assay including the BC channels in linear scale. Default is "counts". |
| bc_channels | Vector specifying the names of the BC channels. |
| n_bc | Number of barcoding isotopes per cell. n_bc = 3 for the Fluidigm kit. |
| q | Quantile for normalisation. Default is 0.95. |
The SingleCellExperiment object with an extra column "mean_BC" in the corresponding assay.
    sce <- RUCova::sce
    bc_channels <- c(c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di"),
                     c("Dead_cells_194Pt", "Dead_cells_198Pt"))
    sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
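The idea behind a "mean of normalised highest BC per cell" can be sketched in base R. This is only an illustration on a toy matrix, not the package's implementation: the helper `mean_highest_bc()`, the toy data, the plain `asinh()` without cofactor, and the per-channel division by the quantile are all assumptions.

```r
# Toy linear-scale barcode signals: 5 cells (rows) x 4 barcode channels (columns).
# All names and values are made up for illustration.
set.seed(1)
bc <- matrix(rexp(20, rate = 0.05), nrow = 5,
             dimnames = list(paste0("cell", 1:5), paste0("BC", 1:4)))

# Hypothetical helper, not part of RUCova.
mean_highest_bc <- function(bc, n_bc, q = 0.95) {
  bc_asinh <- asinh(bc)                                   # assumed: plain asinh, no cofactor
  q_per_channel <- apply(bc_asinh, 2, quantile, probs = q)
  bc_norm <- sweep(bc_asinh, 2, q_per_channel, "/")       # scale each channel by its quantile
  # Per cell: average the n_bc largest normalised values.
  apply(bc_norm, 1, function(v) mean(sort(v, decreasing = TRUE)[seq_len(n_bc)]))
}

mean_bc <- mean_highest_bc(bc, n_bc = 2)
```

Averaging only the `n_bc` highest channels restricts the summary to the isotopes that actually label the cell, so unoccupied barcode channels do not dilute it.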
Calculated mean of normalised Iridium isotopes
calc_mean_DNA(sce, name_assay = "counts", dna_channels, q)
| Argument | Description |
|---|---|
| sce | A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function. |
| name_assay | A string specifying the name of the assay including the DNA channels in linear scale. |
| dna_channels | Vector specifying the names of the DNA channels. |
| q | Quantile for normalisation. |
The SingleCellExperiment object with an extra column "mean_DNA" in the corresponding assay.
    sce <- RUCova::sce
    dna_channels <- c("DNA_191Ir", "DNA_193Ir")
    sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
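As a rough picture of what a mean of normalised Iridium channels could look like, the following base R sketch transforms two toy DNA channels, scales each by its own quantile, and averages them per cell. The helper `mean_norm_dna()`, the toy values, and the exact normalisation are assumptions made for illustration, not the package's code.

```r
# Toy linear-scale DNA signals: 6 cells x 2 Iridium channels (values are made up).
set.seed(2)
dna <- cbind(DNA_191Ir = rexp(6, rate = 0.01),
             DNA_193Ir = rexp(6, rate = 0.005))

# Hypothetical helper, not part of RUCova.
mean_norm_dna <- function(dna, q = 0.95) {
  dna_asinh <- asinh(dna)                                  # assumed: plain asinh, no cofactor
  q_per_channel <- apply(dna_asinh, 2, quantile, probs = q)
  # Scaling puts both isotopes on a comparable range before averaging.
  rowMeans(sweep(dna_asinh, 2, q_per_channel, "/"))
}

mean_dna <- mean_norm_dna(dna)
```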
Get Pearson correlation coefficients between markers as a double triangular matrix for comparison (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric matrix.
compare_corr(sce, name_assay_before = "counts", name_assay_after = NULL, name_reduced_dim = NULL)
| Argument | Description |
|---|---|
| sce | A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function. |
| name_assay_before | A string specifying the name of the assay before RUCova (with original counts in linear scale). |
| name_assay_after | A string specifying the name of the assay after RUCova (with regressed counts in linear scale). Default is NULL. |
| name_reduced_dim | A string specifying the name of the dimensionality reduction data stored under |
A matrix with Pearson correlation coefficients.
    sce <- RUCova::sce
    bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di",
                     "Dead_cells_194Pt", "Dead_cells_198Pt")
    sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
    dna_channels <- c("DNA_191Ir", "DNA_193Ir")
    sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
    # Markers:
    m <- c("pH3", "IdU", "Cyclin_D1", "Cyclin_B1", "Ki.67", "pRb", "pH2A.X", "p.p53", "p.p38",
           "pChk2", "pCDC25c", "cCasp3", "cPARP", "pAkt", "pAkt_T308", "pMEK1.2", "pERK1.2",
           "pS6", "p4e.BP1", "pSmad1.8", "pSmad2.3", "pNFkB", "IkBa", "CXCL1", "Lamin_B1",
           "pStat1", "pStat3", "YAP", "NICD")
    # SUCs:
    x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
    sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x,
                          apply_asinh_SUCs = TRUE, model = "interaction",
                          center_SUCs = "across_samples", col_name_sample = "line",
                          name_assay_after = "counts_interaction")
    compare_corr(sce[, sce$line == "Cal33"], name_assay_before = "counts",
                 name_assay_after = "counts_interaction")
    heatmap_compare_corr(sce[, sce$line == "Cal33"], name_assay_before = "counts",
                         name_assay_after = "counts_interaction")
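The double triangular layout described for this function can be reproduced on toy data with base R alone. In this sketch the "after" data are simply residuals from regressing each toy marker on a shared nuisance variable; the helper `double_triangle()`, the variable `size`, and the markers `A`, `B`, `C` are hypothetical and stand in for whatever the package computes internally.

```r
set.seed(3)
n <- 200
size <- rnorm(n)                               # hypothetical shared nuisance factor
before <- cbind(A = size + rnorm(n, sd = 0.5),
                B = size + rnorm(n, sd = 0.5),
                C = size + rnorm(n, sd = 0.5))
# Stand-in for a corrected assay: residuals after regressing out the nuisance factor.
after <- apply(before, 2, function(y) residuals(lm(y ~ size)))

# Hypothetical helper, not part of RUCova.
double_triangle <- function(before, after) {
  cor_before <- cor(before, method = "pearson")
  cor_after  <- cor(after,  method = "pearson")
  out <- cor_before                            # keeps lower triangle and diagonal
  out[upper.tri(out)] <- cor_after[upper.tri(cor_after)]
  out
}

m <- double_triangle(before, after)
```

Passing the same matrix twice, as in `double_triangle(before, before)`, yields an ordinary symmetric correlation matrix, which mirrors the documented behaviour when no corrected assay is supplied.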
Plot Pearson correlation coefficients between markers on a double triangular heatmap (lower triangle: before RUCova, upper triangle: after RUCova). If RUCova has not been applied, the output is a symmetric heatmap.
heatmap_compare_corr(sce, name_assay_before = "counts", name_assay_after = NULL, name_reduced_dim = NULL)
| Argument | Description |
|---|---|
| sce | A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay". Asinh transformation is applied within the function. |
| name_assay_before | A string specifying the name of the assay before RUCova (with original counts in linear scale). |
| name_assay_after | A string specifying the name of the assay after RUCova (with regressed counts in linear scale). Default is NULL. |
| name_reduced_dim | A string specifying the name of the dimensionality reduction data stored under |
A heatmap with Pearson correlation coefficients.
    sce <- RUCova::sce
    bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di",
                     "Dead_cells_194Pt", "Dead_cells_198Pt")
    sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
    dna_channels <- c("DNA_191Ir", "DNA_193Ir")
    sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
    # Markers:
    m <- c("pH3", "IdU", "Cyclin_D1", "Cyclin_B1", "Ki.67", "pRb", "pH2A.X", "p.p53", "p.p38",
           "pChk2", "pCDC25c", "cCasp3", "cPARP", "pAkt", "pAkt_T308", "pMEK1.2", "pERK1.2",
           "pS6", "p4e.BP1", "pSmad1.8", "pSmad2.3", "pNFkB", "IkBa", "CXCL1", "Lamin_B1",
           "pStat1", "pStat3", "YAP", "NICD")
    # SUCs:
    x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
    sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x,
                          apply_asinh_SUCs = TRUE, model = "interaction",
                          center_SUCs = "across_samples", col_name_sample = "line",
                          name_assay_after = "counts_interaction")
    heatmap_compare_corr(sce[, sce$line == "Cal33"], name_assay_before = "counts",
                         name_assay_after = "counts_interaction")
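Tile-based heatmaps are usually drawn from long-format data, one row per matrix cell. The base R sketch below reshapes a small, hand-written double triangular matrix into that form and labels which triangle each value belongs to. The matrix values and the `triangle` column are illustrative assumptions; how the package builds its plot internally is not stated in this documentation.

```r
# Hand-written toy matrix: lower triangle = "before", upper triangle = "after".
m <- matrix(c(1.0, 0.1, 0.0,
              0.8, 1.0, 0.2,
              0.6, 0.7, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# One row per (row marker, column marker) pair; columns Var1, Var2, r.
long <- as.data.frame(as.table(m), responseName = "r")

# Label each cell by its position relative to the diagonal.
i <- as.integer(long$Var1)   # row index
j <- as.integer(long$Var2)   # column index
long$triangle <- ifelse(i > j, "before", ifelse(i < j, "after", "diagonal"))
```

A data frame of this shape can be handed to any tile-style plotting routine, with `Var1` and `Var2` as axes and `r` as fill.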
A tibble containing mass cytometry data of single-cell marker signals (rows = cells, columns = markers and metadata) in linear scale. The data set should be clean: beads, debris, doublets, and dead cells have been excluded, and single cells have been demultiplexed (important if you want to adapt the linear fits to the samples). This example data set consists of 8 Head-and-Neck Squamous Cell Carcinoma (HNSCC) lines in irradiated (10 Gy) and control (0 Gy) conditions (Figure 2 and Figure 3 in the manuscript).
HNSCC_data
A data frame with 108649 rows and 59 variables:
Signal for the CXCL1 marker, a cytokine associated with inflammatory responses.
Signal for the Cerium 140 marker, typically used as a control or calibration marker.
Marker for the G2/M cell cycle phase regulator Cyclin B1.
Marker for the G1 cell cycle phase regulator Cyclin D1.
Signals for DNA-intercalating markers labeled with Iridium isotopes, used to assess nuclear content.
Signals for dead cell markers labeled with Platinum isotopes, used to identify and exclude dead cells.
Measure of event duration in the mass cytometer, used for quality control.
Signal for GDF15, a marker associated with stress and inflammation.
Signal for IdU (Iododeoxyuridine), used to assess DNA synthesis.
Signal for IkBa, an inhibitor of the NF-kB signaling pathway.
Signal for Ki-67, a marker of cell proliferation.
Signal for Lamin B1, a nuclear lamina protein.
Signal for the Notch Intracellular Domain, a marker of Notch signaling activity.
Signals for Palladium isotopes, used as barcodes for cell multiplexing.
Normalized signal for Platinum 194 isotope.
Timestamp for the acquisition of each event.
Signal for Yes-associated protein (YAP), a transcriptional co-activator in the Hippo pathway.
Barcode channel used for multiplexing.
Indicator for exclusion due to bead contamination.
Signal for cleaved Caspase-3, a marker of apoptosis.
Signal for cleaved PARP, another marker of apoptosis.
Treatment condition: either 0Gy (control) or 10Gy (irradiated).
Indicator for whether the sample was irradiated.
HNSCC cell lines included in the data: "Cal27", "Cal33", "UPCISCC099", "UPCISCC131", "UTSCC16A", "UDSCC2", "UPCISCC154", and "VUSCC147".
Indicator of low Platinum signal.
Signals for phosphorylated p38, p53, and 4E-BP1, markers of stress response and translation regulation.
Signals for phosphorylated Akt, a marker of PI3K/Akt pathway activity.
Signals for phosphorylated CDC25c and Chk2, markers of DNA damage response.
Signal for phosphorylated ERK1/2, a marker of MAPK pathway activity.
Signals for phosphorylated H2A.X and H3, markers of DNA damage and mitosis, respectively.
Signals for phosphorylated MEK1/2 and NF-kB, markers of MAPK and inflammatory signaling.
Signal for phosphorylated Rb, a marker of cell cycle regulation.
Signal for phosphorylated S6, a marker of protein synthesis.
Signals for phosphorylated Smad1/8 and Smad2/3, markers of TGF-beta signaling.
Signals for phosphorylated Stat1 and Stat3, markers of JAK/STAT pathway activity.
Total Akt protein signal.
Identifier for technical replicates.
Indicator for singlet events, excluding doublets.
Total ERK protein signal.
rucova(sce, name_assay_before = "counts", markers, SUCs = c("mean_DNA", "mean_BC", "total_ERK", "pan_Akt"), name_reduced_dim = "PCA", apply_asinh_SUCs = TRUE, model = "interaction", col_name_sample = "line", center_SUCs = "across_samples", keep_offset = TRUE, name_assay_after = "counts_rucova")
| Argument | Description |
|---|---|
| sce | A SingleCellExperiment object with markers and SUCs in linear scale stored in the assay "name_assay_before". Asinh transformation is applied within the function. |
| name_assay_before | A string specifying the name of the assay before RUCova (with original counts in linear scale). |
| markers | Vector of marker names to normalise, y (in linear scale). |
| SUCs | Vector of surrogates of unwanted covariance to use for normalisation, x (in linear scale). |
| name_reduced_dim | A string specifying the name of the dimensionality reduction result in the SingleCellExperiment sce. |
| apply_asinh_SUCs | Apply (TRUE) or not (FALSE) asinh transformation to the SUCs. TRUE if SUCs are the measured surrogates, FALSE if SUCs are PCs. |
| model | A character: "simple", "offset" or "interaction" defining the model. |
| col_name_sample | A character indicating the column name in the colData of sce defining each sample. |
| center_SUCs | A character "across_samples" or "per_sample" defining how to center the SUCs at zero. |
| keep_offset | Keep (TRUE) or not (FALSE) the offset intercept between samples. |
| name_assay_after | A string specifying the name of the assay after RUCova (with regressed counts in linear scale). |
The input SingleCellExperiment object with an additional assay (name_assay_after) and a list in the metadata containing all the model details.
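The two centering options named for `center_SUCs` can be pictured with a few numbers. The base R sketch below assumes that "across_samples" subtracts one global mean and "per_sample" subtracts each sample's own mean; that reading of the option names, like the toy vectors, is an assumption rather than something this documentation spells out.

```r
# Toy SUC values (already on the transformed scale) for two samples.
x <- c(1, 2, 3, 11, 12, 13)
sample_id <- rep(c("s1", "s2"), each = 3)

# Assumed meaning of "across_samples": one global mean for all cells.
x_across <- x - mean(x)

# Assumed meaning of "per_sample": each sample is centred on its own mean.
x_per <- x - ave(x, sample_id)   # ave() returns the group mean for every element
```

With global centering the difference between samples stays in the SUC, while per-sample centering removes it, so the choice determines whether between-sample shifts in a surrogate can be used to explain marker differences.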
    sce <- RUCova::sce
    bc_channels <- c("Pd102Di", "Pd104Di", "Pd105Di", "Pd106Di", "Pd108Di", "Pd110Di",
                     "Dead_cells_194Pt", "Dead_cells_198Pt")
    sce <- RUCova::calc_mean_BC(sce, name_assay = "counts", bc_channels, n_bc = 4, q = 0.95)
    dna_channels <- c("DNA_191Ir", "DNA_193Ir")
    sce <- RUCova::calc_mean_DNA(sce, name_assay = "counts", dna_channels, q = 0.95)
    # Markers:
    m <- c("pH3", "IdU", "Cyclin_D1", "Cyclin_B1", "Ki.67", "pRb", "pH2A.X", "p.p53", "p.p38",
           "pChk2", "pCDC25c", "cCasp3", "cPARP", "pAkt", "pAkt_T308", "pMEK1.2", "pERK1.2",
           "pS6", "p4e.BP1", "pSmad1.8", "pSmad2.3", "pNFkB", "IkBa", "CXCL1", "Lamin_B1",
           "pStat1", "pStat3", "YAP", "NICD")
    # SUCs:
    x <- c("total_ERK", "pan_Akt", "mean_DNA", "mean_BC")
    sce <- RUCova::rucova(sce = sce, name_assay_before = "counts", markers = m, SUCs = x,
                          apply_asinh_SUCs = TRUE, model = "interaction",
                          center_SUCs = "across_samples", col_name_sample = "line",
                          name_assay_after = "counts_interaction")
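To give a feel for what the three `model` options might correspond to, the sketch below fits three ordinary `lm()` formulas to one toy marker and one toy SUC, then removes the fitted SUC contribution for the simplest case. The mapping of "simple", "offset" and "interaction" onto these formulas is an assumption made for illustration, as are all variable names and simulated values; the package's actual fitting code may differ.

```r
set.seed(4)
n <- 300
sample_id <- factor(rep(c("s1", "s2", "s3"), each = n / 3))
suc <- rnorm(n) + as.integer(sample_id)        # hypothetical surrogate, transformed scale
y <- 2 + 0.8 * suc + 0.5 * as.integer(sample_id) + rnorm(n, sd = 0.3)  # toy marker

x <- suc - mean(suc)                           # centred across samples

# Assumed correspondence between model names and formulas.
fits <- list(
  simple      = lm(y ~ x),                     # one slope, one intercept
  offset      = lm(y ~ x + sample_id),         # one slope, sample-specific intercepts
  interaction = lm(y ~ x * sample_id)          # sample-specific slopes and intercepts
)

# Remove the fitted SUC contribution but keep the intercept (simple model only).
fit <- fits$simple
y_corrected <- residuals(fit) + coef(fit)[["(Intercept)"]]
```

Because the SUC is centred, the intercept equals the marker mean, so the corrected values keep the original average level while being uncorrelated with the surrogate.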
A SingleCellExperiment object containing mass cytometry data of single-cell marker signals.
The data includes one or more assays (e.g., "counts") with signals in linear scale.
Rows represent markers, and columns represent cells. The data is clean,
excluding calibration beads, debris, doublets, and dead cells. Single cells are demultiplexed,
which is important for adapting linear fits to samples. Metadata such as samples and treatment conditions
are stored in the colData.
sce
A SingleCellExperiment object with the following components:
One or more assays, such as "counts", containing the marker signals.
Column metadata, including cell annotations such as cell_id, line, and dose.
Row metadata, including marker annotations.
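The layout described here (markers in rows, cells in columns, annotations alongside) is the transpose of a cells-by-columns table. The base R sketch below splits a tiny made-up table into a markers x cells matrix and a per-cell annotation table; the values and the choice of two markers are purely illustrative.

```r
# Toy cells-by-columns table with metadata and two marker columns (values are made up).
cells <- data.frame(
  cell_id = 1:4,
  line    = c("Cal27", "Cal27", "Cal33", "Cal33"),
  dose    = c("0Gy", "10Gy", "0Gy", "10Gy"),
  pH3     = c(3.1, 0.4, 12.0, 1.7),
  Ki.67   = c(25.0, 18.2, 40.3, 9.9)
)
marker_cols <- c("pH3", "Ki.67")

# Assay-style matrix: markers in rows, cells in columns.
counts <- t(as.matrix(cells[, marker_cols]))
colnames(counts) <- cells$cell_id

# Per-cell annotations, one row per column of `counts`.
col_meta <- cells[, setdiff(names(cells), marker_cols)]
```

These two pieces correspond to what the `SingleCellExperiment()` constructor from the SingleCellExperiment package accepts as `assays = list(counts = counts)` and `colData = col_meta`.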