Title: | Combinatorial Polyfunctionality Analysis of Single Cells |
---|---|
Description: | COMPASS is a statistical framework that enables unbiased analysis of antigen-specific T-cell subsets. COMPASS uses a Bayesian hierarchical framework to model all observed cell-subsets and select the most likely to be antigen-specific while regularizing the small cell counts that often arise in multi-parameter space. The model provides a posterior probability of specificity for each cell subset and each sample, which can be used to profile a subject's immune response to external stimuli such as infection or vaccination. |
Authors: | Lynn Lin, Kevin Ushey, Greg Finak, Ravio Kolde (pheatmap) |
Maintainer: | Greg Finak <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.45.0 |
Built: | 2024-11-29 05:13:39 UTC |
Source: | https://github.com/bioc/COMPASS |
This package implements a model for the analysis of polyfunctionality in single-cell cytometry experiments. The model effectively identifies combinations of markers that are differentially expressed between samples of cells subjected to different stimulations.
COMPASSContainer
, for information on getting your
cytometry data into a suitable format for use with COMPASS
,
COMPASS
, for the main model fitting routine.
Returns the categories matrix in a COMPASSResult
object.
categories(x, counts)
categories(x, counts)
x |
A |
counts |
Boolean; if |
Compute the number of cells expressing a particular combination of markers for each sample.
CellCounts(data, combinations)
CellCounts(data, combinations)
data |
Either a |
combinations |
A list of 'combinations', used to denote the subsets of interest. See the examples for usage. |
set.seed(123) ## generate 10 simulated matrices of flow data K <- 6 ## number of markers data <- replicate(10, simplify=FALSE, { m <- matrix( rnorm(1E4 * K, 2000, 1000 ), ncol=K ) m[m < 2500] <- 0 colnames(m) <- c("IL2", "IL4", "IL6", "Mip1B", "IFNg", "TNFa") return(m) }) names(data) <- sample(letters, 10) head( data[[1]] ) ## generate counts over all available combinations of markers in data str(CellCounts(data)) ## 64 columns, as all 2^6 combinations expressed ## generate marginal counts combos <- list(1, 2, 3, 4, 5, 6) ## marginal cell counts cc <- CellCounts(data, combos) ## a base R way of doing the same thing f <- function(data) { do.call(rbind, lapply(data, function(x) apply(x, 2, function(x) sum(x > 0)))) } cc2 <- f(data) ## check that they're identical stopifnot(identical( unname(cc), unname(cc2) )) ## We can also generate cell counts by expressing various combinations ## of markers (names) in the data. ## count cells expressing IL2 or IL4 CellCounts(data, "IL2|IL4") ## count cells expressing IL2, IL4 or IL6 CellCounts(data, "IL2|IL4|IL6") ## counts for each of IL2, IL4, IL6 (marginally) CellCounts(data, c("IL2", "IL4", "IL6")) ## counts for cells that are IL2 positive and IL4 negative CellCounts(data, "IL2 & !IL4") ## expressing the same intent with indices CellCounts(data, list(c(1, -2))) ## all possible combinations str(CellCounts(data, Combinations(6))) ## can also call on COMPASSContainers data(COMPASS) CellCounts(CC, "M1&M2")
set.seed(123) ## generate 10 simulated matrices of flow data K <- 6 ## number of markers data <- replicate(10, simplify=FALSE, { m <- matrix( rnorm(1E4 * K, 2000, 1000 ), ncol=K ) m[m < 2500] <- 0 colnames(m) <- c("IL2", "IL4", "IL6", "Mip1B", "IFNg", "TNFa") return(m) }) names(data) <- sample(letters, 10) head( data[[1]] ) ## generate counts over all available combinations of markers in data str(CellCounts(data)) ## 64 columns, as all 2^6 combinations expressed ## generate marginal counts combos <- list(1, 2, 3, 4, 5, 6) ## marginal cell counts cc <- CellCounts(data, combos) ## a base R way of doing the same thing f <- function(data) { do.call(rbind, lapply(data, function(x) apply(x, 2, function(x) sum(x > 0)))) } cc2 <- f(data) ## check that they're identical stopifnot(identical( unname(cc), unname(cc2) )) ## We can also generate cell counts by expressing various combinations ## of markers (names) in the data. ## count cells expressing IL2 or IL4 CellCounts(data, "IL2|IL4") ## count cells expressing IL2, IL4 or IL6 CellCounts(data, "IL2|IL4|IL6") ## counts for each of IL2, IL4, IL6 (marginally) CellCounts(data, c("IL2", "IL4", "IL6")) ## counts for cells that are IL2 positive and IL4 negative CellCounts(data, "IL2 & !IL4") ## expressing the same intent with indices CellCounts(data, list(c(1, -2))) ## all possible combinations str(CellCounts(data, Combinations(6))) ## can also call on COMPASSContainers data(COMPASS) CellCounts(CC, "M1&M2")
Given an intenger n
, generate all binary combinations of
n
elements, transformed to indices. This is primarily for use
with the CellCounts
function, but may be useful for
users in some scenarios.
Combinations(n)
Combinations(n)
n |
An integer. |
Combinations(3)
Combinations(3)
This function fits the COMPASS
model.
COMPASS( data, treatment, control, subset = NULL, category_filter = function(x) colSums(x > 5) > 2, filter_lowest_frequency = 0, filter_specific_markers = NULL, model = "discrete", iterations = 40000, replications = 8, keep_original_data = FALSE, verbose = TRUE, dropDegreeOne = FALSE, init_with_fisher = FALSE, run_model_or_return_data = "run_model", ... )
COMPASS( data, treatment, control, subset = NULL, category_filter = function(x) colSums(x > 5) > 2, filter_lowest_frequency = 0, filter_specific_markers = NULL, model = "discrete", iterations = 40000, replications = 8, keep_original_data = FALSE, verbose = TRUE, dropDegreeOne = FALSE, init_with_fisher = FALSE, run_model_or_return_data = "run_model", ... )
data |
An object of class |
treatment |
An R expression, evaluated within the metadata, that
returns |
control |
An R expression, evaluated within the metadata, that
returns |
subset |
An expression used to subset the data. We keep only the samples
for which the expression evaluates to |
category_filter |
A filter for the categories that are generated. This is a
function that will be applied to the treatment counts matrix generated from
the intensities. Only categories meeting the |
filter_lowest_frequency |
A number specifying how many of the least expressed markers should be removed. |
filter_specific_markers |
Similar to |
model |
A string denoting which model to fit; currently, only
the discrete model ( |
iterations |
The number of iterations (per 'replication') to perform. |
replications |
The number of 'replications' to perform. In order to conserve memory, we only keep the model estimates from the last replication. |
keep_original_data |
Keep the original |
verbose |
Boolean; if |
dropDegreeOne |
Boolean; if |
init_with_fisher |
Boolean;initialize from fisher's exact test. Any subset and subject with lower 95 Otherwise initialize very subject and subset as a responder except those where ps <= pu. |
run_model_or_return_data |
|
... |
Other arguments; currently unused. |
A COMPASSResult
is a list with the following components:
fit |
A list of various fitted parameters resulting from the
|
data |
The data used as input to the |
orig |
If |
The fit
component is a list with the following components:
alpha_s |
The hyperparameter shared across all subjects under the
stimulated condition. It is updated through the |
A_alphas |
The acceptance rate of |
alpha_u |
The hyperparameter shared across all subjects under the
unstimulated condition. It is updated through the |
A_alphau |
The acceptance rate of |
gamma |
An array of dimensions |
mean_gamma |
A matrix of mean response rates. Each cell denotes
the mean response of individual |
A_gamma |
The acceptance rate for the gamma. Each element
corresponds to the number of times an individual's |
categories |
The category matrix, showing which categories entered the model. |
model |
The type of model called. |
posterior |
Posterior measures from the sample fit. |
call |
The matched call used to generate the model fit. |
The data
component is a list with the following components:
n_s |
The counts matrix for stimulated samples. |
n_u |
The counts matrix for unstimulated samples. |
counts_s |
Total cell counts for stimulated samples. |
counts_u |
Total cell counts for unstimulated samples. |
categories |
The categories matrix used to define which categories will enter the model. |
meta |
The metadata. Note that only individual-level metadata will be kept; sample-specific metadata is dropped. |
sample_id |
The name of the vector in the metadata used to identify the samples. |
individual_id |
The name of the vector in the metadata used to identify the individuals. |
The orig
component (included if keep_original_data
is
TRUE
) is the COMPASSContainer
object used in the model
fit.
The category filter is used to exclude categories (combinations of
markers expressed for a particular cell) that are expressed very rarely.
It is applied to the treatment
counts matrix, which is a
N
samples by K
categories matrix. Those categories which
are mostly unexpressed can be excluded here. For example, the default
criteria,
category_filter=function(x) colSums(x > 5) > 2
indicates that we should only retain categories for which at least three samples had at least six cells expressing that particular combination of markers.
COMPASSContainer
, for constructing the data object
required by COMPASS
data(COMPASS) ## loads the COMPASSContainer 'CC' fit <- COMPASS(CC, category_filter=NULL, treatment=trt == "Treatment", control=trt == "Control", verbose=FALSE, iterations=100 ## set higher for a real analysis )
data(COMPASS) ## loads the COMPASSContainer 'CC' fit <- COMPASS(CC, category_filter=NULL, treatment=trt == "Treatment", control=trt == "Control", verbose=FALSE, iterations=100 ## set higher for a real analysis )
This function generates the data container suitable for use with
COMPASS
.
COMPASSContainer( data, counts, meta, individual_id, sample_id, countFilterThreshold = 0 )
COMPASSContainer( data, counts, meta, individual_id, sample_id, countFilterThreshold = 0 )
data |
A list of matrices. Each matrix |
counts |
A named integer vector of the cell counts(of the parent population) for each
sample in |
meta |
A |
individual_id |
The name of the vector in |
sample_id |
The name of the vector in |
countFilterThreshold |
Numeric; if the number of parent cells is less than this threshold, we remove that file. Default is 0, which means filter is disabled. |
The names
attributes for the data
and counts
objects passed should match.
A COMPASSContainer
returns a list made up of the same
components as input the model, but checks and sanitizes the supplied data
to ensure that it conforms to the expectations outlied above.
set.seed(123) n <- 10 ## number of samples k <- 3 ## number of markers ## generate some sample data sid_vec <- paste0("sid_", 1:n) ## sample ids; unique names used to denote samples iid_vec <- rep_len( paste0("iid_", 1:(n/2) ), n ) ## individual ids ## generate n matrices of 'cell intensities' data <- replicate(n, { nrow <- round(runif(1) * 1E2 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) vals[ vals < 2000 ] <- 0 output <- matrix(vals, nrow, ncol) output <- output[ apply(output, 1, sum) > 0, ] colnames(output) <- paste0("M", 1:k) return(output) }) names(data) <- sid_vec ## make a sample metadata data.frame meta <- data.frame( sid=sid_vec, iid=iid_vec, trt=rep( c("Control", "Treatment"), each=5 ) ) ## generate an example total counts ## recall that cells not expressing any marker are not included ## in the 'data' matrices counts <- sapply(data, nrow) + round( rnorm(n, 1E3, 1E2) ) counts <- setNames( as.integer(counts), names(counts) ) ## insert everything into a COMPASSContainer CC <- COMPASSContainer(data, counts, meta, "iid", "sid")
set.seed(123) n <- 10 ## number of samples k <- 3 ## number of markers ## generate some sample data sid_vec <- paste0("sid_", 1:n) ## sample ids; unique names used to denote samples iid_vec <- rep_len( paste0("iid_", 1:(n/2) ), n ) ## individual ids ## generate n matrices of 'cell intensities' data <- replicate(n, { nrow <- round(runif(1) * 1E2 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) vals[ vals < 2000 ] <- 0 output <- matrix(vals, nrow, ncol) output <- output[ apply(output, 1, sum) > 0, ] colnames(output) <- paste0("M", 1:k) return(output) }) names(data) <- sid_vec ## make a sample metadata data.frame meta <- data.frame( sid=sid_vec, iid=iid_vec, trt=rep( c("Control", "Treatment"), each=5 ) ) ## generate an example total counts ## recall that cells not expressing any marker are not included ## in the 'data' matrices counts <- sapply(data, nrow) + round( rnorm(n, 1E3, 1E2) ) counts <- setNames( as.integer(counts), names(counts) ) ## insert everything into a COMPASSContainer CC <- COMPASSContainer(data, counts, meta, "iid", "sid")
This dataset contains simulated data for an intracellular cytokine staining experiment. In this data set, we have paired samples from five individuals, with each pair of samples being subjected to either a 'Control' condition of a 'Treatment' condition.
Please see COMPASSContainer
for more information on the
components of this object.
The dataset is exported as CC
, which is a short alias
for COMPASS
Container
.
This code expects a GatingSet
or GatingSetList
.
It expects a regular expression for the node name
(i.e. '/4\+$' would match '/4+' in a node name with the plus
sign at the end of the string. Alternatively, you can supply a
partial path.
The user must supply the 'individual_id', which has the default value suitable for the data we commonly see.
'sample_id' is the 'rownames' of 'pData' of 'GatingSet'.
Sometimes the child node names don't match the marker names exactly.
This function will try to make some guesses about how to match these up.
The filter.fun
parameter is a function that does some regular expression string
substitution to try and clean up the node names by removing
various symobls that are often added to gates, {+/-}. The user can provide their
own function to do string cleanup.
Counts are extracted as well as metadata and single cell data, and these are fed into the
COMPASSContainer
constructor.
COMPASSContainerFromGatingSet( gs = NULL, node = NULL, filter.fun = NULL, individual_id = "PTID", mp = NULL, matchmethod = c("Levenshtein", "regex"), markers = NA, swap = FALSE, countFilterThreshold = 5000 )
COMPASSContainerFromGatingSet( gs = NULL, node = NULL, filter.fun = NULL, individual_id = "PTID", mp = NULL, matchmethod = c("Levenshtein", "regex"), markers = NA, swap = FALSE, countFilterThreshold = 5000 )
gs |
a |
node |
a |
filter.fun |
a |
individual_id |
a |
mp |
a |
matchmethod |
a |
markers |
a |
swap |
a |
countFilterThreshold |
|
There is likely not sufficient error checking.
## Not run: ## gs is a GatingSet from flowWorkspace COMPASSContainerFromGatingSet(gs, "4+") ## End(Not run)
## Not run: ## gs is a GatingSet from flowWorkspace COMPASSContainerFromGatingSet(gs, "4+") ## End(Not run)
This is used for setting an informative description used in the Shiny application.
COMPASSDescription(x) COMPASSDescription(x) <- value
COMPASSDescription(x) COMPASSDescription(x) <- value
x |
A |
value |
A set of paragraphs describing the experiment, as a character vector. |
Information about the COMPASS
results will be auto-generated.
Returns a table of counts and parent counts for each cell subset in a COMPASS fit.
COMPASSfitToCountsTable( x, idcol = NULL, parent = NULL, drop = NULL, stimName = NULL )
COMPASSfitToCountsTable( x, idcol = NULL, parent = NULL, drop = NULL, stimName = NULL )
x |
|
idcol |
unquote variable name in the metadata for the subject id. |
parent |
character name of the parent population for this model fit. e.g. "CD4" |
drop |
numeric vector indicating the columns in the metadata to drop from the output. Usually sample-specific columns rather than subject specific columns. |
stimName |
the name of the stimulation |
Diagnostic of a set of COMPASS Models.
COMPASSMCMCDiagnosis(x)
COMPASSMCMCDiagnosis(x)
x |
a list of compass model fits of the same data with the same number of iterations, different seeds. Run some mcmc diagnostics on a series of COMPASS model fits. Assuming the input is a list of model fits for the same data with the same number of iterations and different seeds. Run Gelman's Rhat diagnostics on the alpha_s and alpha_u hyperparameter chains, treating each model as an independent chain. Rhat should be near 1 but rarely are in practice. Very large values may be a concern. The method returns an average model, by averaging the mean_gamma matrices (equally weighted since each input has the same number of iterations). This mean model should be better then any of the individual models. It can be plotted via "plot(result$mean_model)". |
These functions can be used for accessing data within a COMPASSResult
.
Gamma(x) MeanGamma(x)
Gamma(x) MeanGamma(x)
x |
A |
This dataset represents the result of fitting the COMPASS
model on the
accompanying dataset CC, as exported by data(COMPASS)
.
Please see the vignette (vignette("COMPASS")
) for more details on
how to interact with a COMPASS
fit.
The model is fit as follows, using the COMPASSContainer
CC
.
CR <- COMPASS(CC, treatment=trt == "Treatment", control=trt == "Control", iterations=1000 )
The dataset is exported as CR, which is a short alias
for COMPASS
Result
.
Please see COMPASS
for more information on the output from
a COMPASS
model fit.
Computes the functionality score for each observation from the gamma matrix
of a COMPASS model fit. The scores are normalized according to the total
number of possible subsets that could be observed (2^M - 1
).
FunctionalityScore(x, n, markers = NULL) ## S3 method for class 'COMPASSResult' FunctionalityScore(x, n, markers = NULL) ## Default S3 method: FunctionalityScore(x, n, markers = NULL)
FunctionalityScore(x, n, markers = NULL) ## S3 method for class 'COMPASSResult' FunctionalityScore(x, n, markers = NULL) ## Default S3 method: FunctionalityScore(x, n, markers = NULL)
x |
An object of class |
n |
The number of markers included in an experiment. It is inferred
from the data when |
markers |
The set of markers for which to compute a Functionality score. By default uses all markers. Must match names returned by |
A numeric vector of functionality scores.
The null category is implicitly dropped when computing the functionality
score for a COMPASS
result; this is not true for the regular matrix
method.
FunctionalityScore(CR)
FunctionalityScore(CR)
Get a data.table of counts of polyfunctional subsets
getCounts(object)
getCounts(object)
object |
An object of class |
getCounts(CR)
getCounts(CR)
This function extracts thresholded intensities for children of a
node node
, as specified through the map
argument.
GetThresholdedIntensities(gs, node, map)
GetThresholdedIntensities(gs, node, map)
gs |
A |
node |
The name, or path, of a single node in a
|
map |
A |
map
should be an R list
, mapping node names (as
specified in the gating hierarchy of the gating set) to channel
names (as specified in either the desc
or name
columns of the parameters of the associated flowFrame
s
in the GatingSet
).
A list
with two components:
data |
A |
counts |
A named vector of total cell counts at the node |
if (require("flowWorkspace")&require("flowCore")&require("tidyr")) { ## Generate an example GatingSet that could be used with COMPASS ## We then pull out the 'data' and 'counts' components that could ## be used within a COMPASSContainer n <- 10 ## number of samples k <- 4 ## number of markers sid_vec <- paste0("sid_", 1:n) ## sample ids; unique names used to denote samples iid_vec <- rep_len( paste0("iid_", 1:(n/10) ), n ) ## individual ids marker_names <- c("TNFa", "IL2", "IL4", "IL6") ## Generate n sets of 'flow' data -- a list of matrices, each row ## is a cell, each column is fluorescence intensities on a particular ## channel / marker data <- replicate(n, { nrow <- round(runif(1) * 1E4 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) output <- matrix(vals, nrow, ncol) colnames(output) <- marker_names return(output) }) names(data) <- sid_vec ## Put it into a GatingSet fs <- flowSet( lapply(data, flowFrame) ) gs <- GatingSet(fs) ## Add some dummy metadata meta <- pData(gs) meta$PTID <- 1:10 pData(gs) <- meta gate <- rectangleGate( list(TNFa=c(-Inf,Inf))) gs_pop_add(gs, gate, parent="root", name="dummy") ## Add dummy gate ## Make some gates, and apply them invisible(lapply(marker_names, function(marker) { .gate <- setNames( list( c( rexp(1, runif(1, 1E-5, 1E-3)), Inf) ), marker ) gate <- rectangleGate(.gate=.gate) gs_pop_add(gs, gate, parent="dummy", name=paste0(marker, "+")) })) recompute(gs) ## Map node names to channel names map=list( "TNFa+"="TNFa", "IL2+"="IL2", "IL4+"="IL4", "IL6+"="IL6" ) ## Pull out the data as a COMPASS-friendly dataset node <- "dummy" map <- map system.time( output <- GetThresholdedIntensities(gs, "dummy", map) ) system.time( output <- COMPASSContainerFromGatingSet(gs, "dummy", individual_id="PTID") ) str(output) }
if (require("flowWorkspace")&require("flowCore")&require("tidyr")) { ## Generate an example GatingSet that could be used with COMPASS ## We then pull out the 'data' and 'counts' components that could ## be used within a COMPASSContainer n <- 10 ## number of samples k <- 4 ## number of markers sid_vec <- paste0("sid_", 1:n) ## sample ids; unique names used to denote samples iid_vec <- rep_len( paste0("iid_", 1:(n/10) ), n ) ## individual ids marker_names <- c("TNFa", "IL2", "IL4", "IL6") ## Generate n sets of 'flow' data -- a list of matrices, each row ## is a cell, each column is fluorescence intensities on a particular ## channel / marker data <- replicate(n, { nrow <- round(runif(1) * 1E4 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) output <- matrix(vals, nrow, ncol) colnames(output) <- marker_names return(output) }) names(data) <- sid_vec ## Put it into a GatingSet fs <- flowSet( lapply(data, flowFrame) ) gs <- GatingSet(fs) ## Add some dummy metadata meta <- pData(gs) meta$PTID <- 1:10 pData(gs) <- meta gate <- rectangleGate( list(TNFa=c(-Inf,Inf))) gs_pop_add(gs, gate, parent="root", name="dummy") ## Add dummy gate ## Make some gates, and apply them invisible(lapply(marker_names, function(marker) { .gate <- setNames( list( c( rexp(1, runif(1, 1E-5, 1E-3)), Inf) ), marker ) gate <- rectangleGate(.gate=.gate) gs_pop_add(gs, gate, parent="dummy", name=paste0(marker, "+")) })) recompute(gs) ## Map node names to channel names map=list( "TNFa+"="TNFa", "IL2+"="IL2", "IL4+"="IL4", "IL6+"="IL6" ) ## Pull out the data as a COMPASS-friendly dataset node <- "dummy" map <- map system.time( output <- GetThresholdedIntensities(gs, "dummy", map) ) system.time( output <- COMPASSContainerFromGatingSet(gs, "dummy", individual_id="PTID") ) str(output) }
Returns the markers associated with an experiment.
markers(object)
markers(object)
object |
An R object. |
Inspired by reshape2:::melt
, we melt data.frame
s and
matrix
s. This function is built for speed.
melt_(data, ...) ## S3 method for class 'data.frame' melt_( data, id.vars, measure.vars, variable.name = "variable", ..., value.name = "value" ) ## S3 method for class 'matrix' melt_(data, ...)
melt_(data, ...) ## S3 method for class 'data.frame' melt_( data, id.vars, measure.vars, variable.name = "variable", ..., value.name = "value" ) ## S3 method for class 'matrix' melt_(data, ...)
data |
The |
... |
Arguments passed to other methods. |
id.vars |
Vector of id variables. Can be integer (variable position)
or string (variable name). If blank, we use all variables not in |
measure.vars |
Vector of measured variables. Can be integer (variable position)
or string (variable name). If blank, we use all variables not in |
variable.name |
Name of variable used to store measured variable names. |
value.name |
Name of variable used to store values. |
If items to be stacked are not of the same internal type, they will be
promoted in the order logical
> integer
> numeric
>
character
.
This function merges two COMPASSContainers
.
## S3 method for class 'COMPASSContainer' merge(x, y, ...)
## S3 method for class 'COMPASSContainer' merge(x, y, ...)
x |
A |
y |
A |
... |
other arguments passed to 'COMPASSContainer' call. |
## Chop the example COMPASSContainer into two, then merge it back together CC1 <- subset(CC, trt == "Control") CC2 <- subset(CC, trt == "Treatment") merged <- merge(CC1, CC2) res <- identical(CC, merge(CC1, CC2)) ## should return TRUE in this case stopifnot( isTRUE(res) )
## Chop the example COMPASSContainer into two, then merge it back together CC1 <- subset(CC, trt == "Control") CC2 <- subset(CC, trt == "Treatment") merged <- merge(CC1, CC2) res <- identical(CC, merge(CC1, CC2)) ## should return TRUE in this case stopifnot( isTRUE(res) )
Functions for getting and setting the metadata associated with an object.
metadata(x) ## S3 method for class 'COMPASSContainer' metadata(x) ## S3 method for class 'COMPASSResult' metadata(x) metadata(x) <- value ## S3 replacement method for class 'COMPASSContainer' metadata(x) <- value
metadata(x) ## S3 method for class 'COMPASSContainer' metadata(x) ## S3 method for class 'COMPASSResult' metadata(x) metadata(x) <- value ## S3 replacement method for class 'COMPASSContainer' metadata(x) <- value
x |
An R object. |
value |
An R object appropriate for storing metadata in object |
A function to draw clustered heatmaps where one has better control over some graphical parameters such as cell size, etc.
pheatmap( mat, color = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100), kmeans_k = NA, breaks = NA, border_color = "grey60", cellwidth = NA, cellheight = NA, scale = "none", cluster_rows = TRUE, cluster_cols = TRUE, clustering_distance_rows = "euclidean", clustering_distance_cols = "euclidean", clustering_method = "complete", treeheight_row = ifelse(cluster_rows, 50, 0), treeheight_col = ifelse(cluster_cols, 50, 0), legend = TRUE, legend_breaks = NA, legend_labels = NA, annotation = NA, annotation_colors = NA, annotation_legend = TRUE, drop_levels = TRUE, show_rownames = TRUE, show_colnames = TRUE, main = NA, fontsize = 10, fontsize_row = fontsize, fontsize_col = fontsize, display_numbers = FALSE, number_format = "%.2f", fontsize_number = 0.8 * fontsize, filename = NA, width = NA, height = NA, row_annotation = NA, row_annotation_legend = TRUE, row_annotation_colors = NA, cytokine_annotation = NA, headerplot = NA, polar = FALSE, order_by_max_functionality = TRUE, ... )
pheatmap( mat, color = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100), kmeans_k = NA, breaks = NA, border_color = "grey60", cellwidth = NA, cellheight = NA, scale = "none", cluster_rows = TRUE, cluster_cols = TRUE, clustering_distance_rows = "euclidean", clustering_distance_cols = "euclidean", clustering_method = "complete", treeheight_row = ifelse(cluster_rows, 50, 0), treeheight_col = ifelse(cluster_cols, 50, 0), legend = TRUE, legend_breaks = NA, legend_labels = NA, annotation = NA, annotation_colors = NA, annotation_legend = TRUE, drop_levels = TRUE, show_rownames = TRUE, show_colnames = TRUE, main = NA, fontsize = 10, fontsize_row = fontsize, fontsize_col = fontsize, display_numbers = FALSE, number_format = "%.2f", fontsize_number = 0.8 * fontsize, filename = NA, width = NA, height = NA, row_annotation = NA, row_annotation_legend = TRUE, row_annotation_colors = NA, cytokine_annotation = NA, headerplot = NA, polar = FALSE, order_by_max_functionality = TRUE, ... )
mat |
numeric matrix of the values to be plotted. |
color |
vector of colors used in heatmap. |
kmeans_k |
the number of kmeans clusters to make, if we want to agggregate the rows before drawing heatmap. If NA then the rows are not aggregated. |
breaks |
a sequence of numbers that covers the range of values in mat and is one element longer than color vector. Used for mapping values to colors. Useful, if needed to map certain values to certain colors, to certain values. If value is NA then the breaks are calculated automatically. |
border_color |
color of cell borders on heatmap, use NA if no border should be drawn. |
cellwidth |
individual cell width in points. If left as NA, then the values depend on the size of plotting window. |
cellheight |
individual cell height in points. If left as NA, then the values depend on the size of plotting window. |
scale |
character indicating if the values should be centered and scaled in
either the row direction or the column direction, or none. Corresponding values are
|
cluster_rows |
boolean values determining if rows should be clustered, |
cluster_cols |
boolean values determining if columns should be clustered. |
clustering_distance_rows |
distance measure used in clustering rows. Possible
values are |
clustering_distance_cols |
distance measure used in clustering columns. Possible values the same as for clustering_distance_rows. |
clustering_method |
clustering method used. Accepts the same values as
|
treeheight_row |
the height of a tree for rows, if these are clustered. Default value 50 points. |
treeheight_col |
the height of a tree for columns, if these are clustered. Default value 50 points. |
legend |
logical to determine if legend should be drawn or not. |
legend_breaks |
vector of breakpoints for the legend. |
legend_labels |
vector of labels for the |
annotation |
data frame that specifies the annotations shown on top of the columns. Each row defines the features for a specific column. The columns in the data and rows in the annotation are matched using corresponding row and column names. Note that color schemes takes into account if variable is continuous or discrete. |
annotation_colors |
list for specifying annotation track colors manually. It is possible to define the colors for only some of the features. Check examples for details. |
annotation_legend |
boolean value showing if the legend for annotation tracks should be drawn. |
drop_levels |
logical to determine if unused levels are also shown in the legend |
show_rownames |
boolean specifying if column names are be shown. |
show_colnames |
boolean specifying if column names are be shown. |
main |
the title of the plot |
fontsize |
base fontsize for the plot |
fontsize_row |
fontsize for rownames (Default: fontsize) |
fontsize_col |
fontsize for colnames (Default: fontsize) |
display_numbers |
logical determining if the numeric values are also printed to the cells. |
number_format |
format strings (C printf style) of the numbers shown in cells.
For example " |
fontsize_number |
fontsize of the numbers displayed in cells |
filename |
file path where to save the picture. Filetype is decided by the extension in the path. Currently following formats are supported: png, pdf, tiff, bmp, jpeg. Even if the plot does not fit into the plotting window, the file size is calculated so that the plot would fit there, unless specified otherwise. |
width |
manual option for determining the output file width in inches. |
height |
manual option for determining the output file height in inches. |
row_annotation |
data frame that specifies the annotations shown on the rows. Each row defines the features for a specific row. The rows in the data and rows in the annotation are matched using corresponding row names.The category labels are given by the data frame column names. |
row_annotation_legend |
same interpretation as the column parameters. |
row_annotation_colors |
same interpretation as the column parameters |
cytokine_annotation |
A |
headerplot |
is a list with two components, order and data. Order tells how to reorder the columns of the matrix. |
polar |
Boolean; if |
order_by_max_functionality |
Boolean; re-order the cytokine labels by maximum functionality? |
... |
graphical parameters for the text used in plot. Parameters passed to
|
The function also allows to aggregate the rows using kmeans clustering. This is advisable if number of rows is so big that R cannot handle their hierarchical clustering anymore, roughly more than 1000. Instead of showing all the rows separately one can cluster the rows in advance and show only the cluster centers. The number of clusters can be tuned with parameter kmeans_k.
Invisibly a list of components
tree_row
the clustering of rows as hclust
object
tree_col
the clustering of columns as hclust
object
kmeans
the kmeans clustering of rows if parameter kmeans_k
was
specified
Original version by Raivo Kolde <[email protected]>, with modifications by Greg Finak <[email protected]> and Kevin Ushey <[email protected]>.
# Generate some data test = matrix(rnorm(200), 20, 10) test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3 test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2 test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4 colnames(test) = paste("Test", 1:10, sep = "") rownames(test) = paste("Gene", 1:20, sep = "") # Draw heatmaps pheatmap(test) pheatmap(test, kmeans_k = 2) pheatmap(test, scale = "row", clustering_distance_rows = "correlation") pheatmap(test, color = colorRampPalette(c("navy", "white", "firebrick3"))(50)) pheatmap(test, cluster_row = FALSE) pheatmap(test, legend = FALSE) pheatmap(test, display_numbers = TRUE) pheatmap(test, display_numbers = TRUE, number_format = "%.1e") pheatmap(test, cluster_row = FALSE, legend_breaks = -1:4, legend_labels = c("0", "1e-4", "1e-3", "1e-2", "1e-1", "1")) pheatmap(test, cellwidth = 15, cellheight = 12, main = "Example heatmap") #pheatmap(test, cellwidth = 15, cellheight = 12, fontsize = 8, filename = "test.pdf") # Generate column annotations annotation = data.frame(Var1 = factor(1:10 %% 2 == 0, labels = c("Class1", "Class2")), Var2 = 1:10) annotation$Var1 = factor(annotation$Var1, levels = c("Class1", "Class2", "Class3")) rownames(annotation) = paste("Test", 1:10, sep = "") pheatmap(test, annotation = annotation) pheatmap(test, annotation = annotation, annotation_legend = FALSE) pheatmap(test, annotation = annotation, annotation_legend = FALSE, drop_levels = FALSE) # Specify colors Var1 = c("navy", "darkgreen") names(Var1) = c("Class1", "Class2") Var2 = c("lightgreen", "navy") ann_colors = list(Var1 = Var1, Var2 = Var2) #Specify row annotations row_ann <- data.frame(foo=gl(2,nrow(test)/2),`Bar`=relevel(gl(2,nrow(test)/2),"2")) rownames(row_ann)<-rownames(test) pheatmap(test, annotation = annotation, annotation_legend = FALSE, drop_levels = FALSE,row_annotation = row_ann) #Using cytokine annotations M<-matrix(rnorm(8*20),ncol=8) row_annotation<-data.frame(A=gl(4,nrow(M)/4),B=gl(4,nrow(M)/4)) eg<-expand.grid(factor(c(0,1)),factor(c(0,1)),factor(c(0,1))) colnames(eg)<-c("IFNg","TNFa","IL2") rownames(eg)<-apply(eg,1,function(x)paste0(x,collapse="")) rownames(M)<-1:nrow(M) colnames(M)<-rownames(eg) cytokine_annotation=eg pheatmap(M,annotation=annotation,row_annotation=row_annotation, annotation_legend=TRUE,row_annotation_legend=TRUE, cluster_rows=FALSE,cytokine_annotation=cytokine_annotation,cluster_cols=FALSE) # Specifying clustering from distance matrix drows = dist(test, method = "minkowski") dcols = dist(t(test), method = "minkowski") pheatmap(test, clustering_distance_rows = drows, clustering_distance_cols = dcols)
# Generate some data test = matrix(rnorm(200), 20, 10) test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3 test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2 test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4 colnames(test) = paste("Test", 1:10, sep = "") rownames(test) = paste("Gene", 1:20, sep = "") # Draw heatmaps pheatmap(test) pheatmap(test, kmeans_k = 2) pheatmap(test, scale = "row", clustering_distance_rows = "correlation") pheatmap(test, color = colorRampPalette(c("navy", "white", "firebrick3"))(50)) pheatmap(test, cluster_row = FALSE) pheatmap(test, legend = FALSE) pheatmap(test, display_numbers = TRUE) pheatmap(test, display_numbers = TRUE, number_format = "%.1e") pheatmap(test, cluster_row = FALSE, legend_breaks = -1:4, legend_labels = c("0", "1e-4", "1e-3", "1e-2", "1e-1", "1")) pheatmap(test, cellwidth = 15, cellheight = 12, main = "Example heatmap") #pheatmap(test, cellwidth = 15, cellheight = 12, fontsize = 8, filename = "test.pdf") # Generate column annotations annotation = data.frame(Var1 = factor(1:10 %% 2 == 0, labels = c("Class1", "Class2")), Var2 = 1:10) annotation$Var1 = factor(annotation$Var1, levels = c("Class1", "Class2", "Class3")) rownames(annotation) = paste("Test", 1:10, sep = "") pheatmap(test, annotation = annotation) pheatmap(test, annotation = annotation, annotation_legend = FALSE) pheatmap(test, annotation = annotation, annotation_legend = FALSE, drop_levels = FALSE) # Specify colors Var1 = c("navy", "darkgreen") names(Var1) = c("Class1", "Class2") Var2 = c("lightgreen", "navy") ann_colors = list(Var1 = Var1, Var2 = Var2) #Specify row annotations row_ann <- data.frame(foo=gl(2,nrow(test)/2),`Bar`=relevel(gl(2,nrow(test)/2),"2")) rownames(row_ann)<-rownames(test) pheatmap(test, annotation = annotation, annotation_legend = FALSE, drop_levels = FALSE,row_annotation = row_ann) #Using cytokine annotations M<-matrix(rnorm(8*20),ncol=8) row_annotation<-data.frame(A=gl(4,nrow(M)/4),B=gl(4,nrow(M)/4)) eg<-expand.grid(factor(c(0,1)),factor(c(0,1)),factor(c(0,1))) colnames(eg)<-c("IFNg","TNFa","IL2") rownames(eg)<-apply(eg,1,function(x)paste0(x,collapse="")) rownames(M)<-1:nrow(M) colnames(M)<-rownames(eg) cytokine_annotation=eg pheatmap(M,annotation=annotation,row_annotation=row_annotation, annotation_legend=TRUE,row_annotation_legend=TRUE, cluster_rows=FALSE,cytokine_annotation=cytokine_annotation,cluster_cols=FALSE) # Specifying clustering from distance matrix drows = dist(test, method = "minkowski") dcols = dist(t(test), method = "minkowski") pheatmap(test, clustering_distance_rows = drows, clustering_distance_cols = dcols)
This function can be used to visualize the mean probability of response; that is, the probability that there is a difference in response between samples subjected to the 'treatment' condition, and samples subjected to the 'control' condition.
## S3 method for class 'COMPASSResult' plot( x, y, subset = NULL, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, must_express = NULL, row_annotation, palette = colorRampPalette(brewer.pal(10, "Purples"))(20), show_rownames = FALSE, show_colnames = FALSE, measure = NULL, order_by = FunctionalityScore, order_by_max_functionality = TRUE, markers = NULL, ... )
## S3 method for class 'COMPASSResult' plot( x, y, subset = NULL, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, must_express = NULL, row_annotation, palette = colorRampPalette(brewer.pal(10, "Purples"))(20), show_rownames = FALSE, show_colnames = FALSE, measure = NULL, order_by = FunctionalityScore, order_by_max_functionality = TRUE, markers = NULL, ... )
x |
An object of class |
y |
This argument gets passed to |
subset |
An R expression, evaluated within the metadata, used to determine which individuals should be kept. |
threshold |
A numeric threshold for filtering under-expressed
categories. Any categories with mean score < |
minimum_dof |
The minimum degree of functionality for the categories to be plotted. |
maximum_dof |
The maximum degree of functionality for the categories to be plotted. |
must_express |
A character vector of markers that should be included
in each subset plotted. For example, |
row_annotation |
A vector of names, pulled from the metadata, to be used for row annotation. |
palette |
The colour palette to be used. |
show_rownames |
Boolean; if |
show_colnames |
Boolean; if |
measure |
Optional. By default, we produce a heatmap of the mean
gammas produced in a model fit. We can override this by supplying a
matrix of suitable dimension as well; these can be generated with
the |
order_by |
Order rows within a group. This should be a function;
e.g. |
order_by_max_functionality |
Order columns by functionality within each degree subset.
to |
markers |
specifies a subset of markers to plot. default is NULL, which means all markers. |
... |
Optional arguments passed to |
The plot as a grid
object (grob
). It can be redrawn
with e.g. grid::grid.draw()
.
## visualize the mean probability of reponse plot(CR) ## visualize the proportion of cells belonging to a category plot(CR, measure=PosteriorPs(CR))
## visualize the mean probability of reponse plot(CR) ## visualize the proportion of cells belonging to a category plot(CR, measure=PosteriorPs(CR))
This function can be used to visualize the mean probability of response – that is, the probability that there is a difference in response between samples subjected to the 'treatment' condition, and samples subjected to the 'control' condition.
plot2( x, y, subset, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, must_express = NULL, row_annotation = NULL, palette = NA, show_rownames = FALSE, show_colnames = FALSE, ... )
plot2( x, y, subset, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, must_express = NULL, row_annotation = NULL, palette = NA, show_rownames = FALSE, show_colnames = FALSE, ... )
x |
An object of class |
y |
An object of class |
subset |
An R expression, evaluated within the metadata, used to determine which individuals should be kept. |
threshold |
A numeric threshold for filtering under-expressed
categories. Any categories with mean score < |
minimum_dof |
The minimum degree of functionality for the categories to be plotted. |
maximum_dof |
The maximum degree of functionality for the categories to be plotted. |
must_express |
A character vector of markers that should be included
in each subset plotted. For example, |
row_annotation |
A vector of names, pulled from the metadata, to be used for row annotation. |
palette |
The colour palette to be used. |
show_rownames |
Boolean; if |
show_colnames |
Boolean; if |
... |
Optional arguments passed to |
The plot as a grid
object (grob
). It can be redrawn
with e.g. grid::grid.draw()
.
This function can be used to visualize the mean probability of response; that is, the probability that there is a difference in response between samples subjected to the 'treatment' condition, and samples subjected to the 'control' condition. This version is used for plotting multiple COMPASSResult objects. The COMPASS runs must all use the same markers. The code is heavily based on the plot.COMPASSResult and plot2 functions. Not all options from plot.COMPASSResult are supported yet.
plotCOMPASSResultStack( x, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, row_annotation, variable, palette = colorRampPalette(brewer.pal(9, "Purples"))(20), show_rownames = FALSE, ... )
plotCOMPASSResultStack( x, threshold = 0.01, minimum_dof = 1, maximum_dof = Inf, row_annotation, variable, palette = colorRampPalette(brewer.pal(9, "Purples"))(20), show_rownames = FALSE, ... )
x |
A named list of objects of class |
threshold |
A numeric threshold for filtering under-expressed
categories. Any categories with mean score < |
minimum_dof |
The minimum degree of functionality for the categories to be plotted. |
maximum_dof |
The maximum degree of functionality for the categories to be plotted. |
row_annotation |
A vector of names, pulled from the metadata, to be used for row annotation. |
variable |
What to call the variable that is different across the objects in x. |
palette |
The colour palette to be used. |
show_rownames |
Boolean; if |
... |
Optional arguments passed to |
The plot as a grid
object (grob
). It can be redrawn
with e.g. grid::grid.draw()
.
## Not run: # allCompassResults is a list of 4 COMPASSResults names(allCompassResults) <- c("Antigen 85A", "CFP-10", "CMV", "ESAT-6") plotCOMPASSResultStack(allCompassResults, row_annotation = c("Antigen", "PATIENT ID", "Time"), variable = "Antigen", show_rownames = FALSE, main = "Heatmap of Mean Probability of Response to Antigens, CD8+", fontsize = 14, fontsize_row = 13, fontsize_col = 11) ## End(Not run)
## Not run: # allCompassResults is a list of 4 COMPASSResults names(allCompassResults) <- c("Antigen 85A", "CFP-10", "CMV", "ESAT-6") plotCOMPASSResultStack(allCompassResults, row_annotation = c("Antigen", "PATIENT ID", "Time"), variable = "Antigen", show_rownames = FALSE, main = "Heatmap of Mean Probability of Response to Antigens, CD8+", fontsize = 14, fontsize_row = 13, fontsize_col = 11) ## End(Not run)
Computes the Polyfunctionality score for each observation from the
gamma matrix of a COMPASS
model fit. The scores are normalized to
one.
PolyfunctionalityScore(x, degree, n, markers = NULL) ## S3 method for class 'COMPASSResult' PolyfunctionalityScore(x, degree, n, markers = NULL) ## Default S3 method: PolyfunctionalityScore(x, degree, n, markers = NULL)
PolyfunctionalityScore(x, degree, n, markers = NULL) ## S3 method for class 'COMPASSResult' PolyfunctionalityScore(x, degree, n, markers = NULL) ## Default S3 method: PolyfunctionalityScore(x, degree, n, markers = NULL)
x |
An object of class |
degree |
A vector of weights. If missing, this is the 'degree of functionality', that is, the number of markers expressed in a particular category. |
n |
The total number of markers. This is inferred when |
markers |
A |
A numeric vector of polyfunctionality scores.
PolyfunctionalityScore(CR)
PolyfunctionalityScore(CR)
These functions can be used to retrieve different posterior measures
from a COMPASS
fit object.
Posterior(x) PosteriorDiff(x) PosteriorLogDiff(x) PosteriorPs(x) PosteriorPu(x)
Posterior(x) PosteriorDiff(x) PosteriorLogDiff(x) PosteriorPs(x) PosteriorPu(x)
x |
An object of class |
The posterior items retrieved are described as follows::
PosteriorPs
:The posterior estimate of the proportion of cells in the stimulated sample.
PosteriorPu:
The posterior estimate of the proportio of cells in the unstimulated sample.
PosteriorDiff
:The difference in posterior proportions, as described above.
PosteriorLogDiff
:The difference in the log posterior proportions, as described above.
Posterior(CR) PosteriorPs(CR) PosteriorPu(CR) PosteriorDiff(CR) PosteriorLogDiff(CR)
Posterior(CR) PosteriorPs(CR) PosteriorPu(CR) PosteriorDiff(CR) PosteriorLogDiff(CR)
This function prints a COMPASSContainer
object, giving basic
information about the object and the data it encapsulates.
## S3 method for class 'COMPASSContainer' print(x, ...)
## S3 method for class 'COMPASSContainer' print(x, ...)
x |
An object of class |
... |
Optional arguments passed to |
print(CC)
print(CC)
This function prints basic information about the model fit by a
COMPASS
call.
## S3 method for class 'COMPASSResult' print(x, ...)
## S3 method for class 'COMPASSResult' print(x, ...)
x |
An object of class |
... |
Optional arguments; currently unused. |
print(CR)
print(CR)
Compute a response probability based on the selected markers, evaluating the probability
that a subject exhibits a response of size degree
or greater.
i.e., the probability of at least degree
markers exhibiting an antigen specific response.
Response(x, markers, degree, max.prob, at_least_n) ## S3 method for class 'COMPASSResult' Response(x, markers = NULL, degree = 1, max.prob = FALSE, at_least_n = NULL)
Response(x, markers, degree, max.prob, at_least_n) ## S3 method for class 'COMPASSResult' Response(x, markers = NULL, degree = 1, max.prob = FALSE, at_least_n = NULL)
x |
a |
markers |
a |
degree |
the |
max.prob |
|
at_least_n |
|
The response is computed from the sampled Gamma matrix, subsetting on the selected markers, and
A vector
of response probabilities for each subject.
Response(CR, markers = c("M1","M2","M3"), degree = 2)
Response(CR, markers = c("M1","M2","M3"), degree = 2)
This function extracts the functionality and polyfunctionality scores from a COMPASS result merged with the sample metadata table, accounting for any dropped samples due to filtering.
scores(x, markers = NULL)
scores(x, markers = NULL)
x |
A |
markers |
A |
scores(CR)
scores(CR)
Returns a boolean vector indexing cell populations in cellpops
that match
the pattern for boolean combinations of markers
.
select_compass_pops(cellpops, markers)
select_compass_pops(cellpops, markers)
cellpops |
|
markers |
|
If markers A, B, C, D make up the population names in cellpops
and they
the names match the pattern e.g. "A+B-C+D+,Count" (typical of exports from some gating tools),
then markers
should be a vector of markers in the same order they appear in cellpops
.
A boolean vector indexing cellpops
with TRUE
for populations matchin
the pattern.
translate_marker_names
#Generate some population names markers = LETTERS[1:4] pos = c("+","-") popnames = apply(expand.grid(pos,pos,pos,pos),1, function(x)paste(paste(paste(markers,x,sep=""), collapse=""),",Count",sep="")) popnames = sample(c(popnames,paste(paste(markers,sample(c("+","-"), length(markers),replace=TRUE),sep=""),",Count",sep=""))) popnames[select_compass_pops(popnames,LETTERS[1:4])]
#Generate some population names markers = LETTERS[1:4] pos = c("+","-") popnames = apply(expand.grid(pos,pos,pos,pos),1, function(x)paste(paste(paste(markers,x,sep=""), collapse=""),",Count",sep="")) popnames = sample(c(popnames,paste(paste(markers,sample(c("+","-"), length(markers),replace=TRUE),sep=""),",Count",sep=""))) popnames[select_compass_pops(popnames,LETTERS[1:4])]
This function takes a COMPASSResult
object, and generates
a local Shiny application for visualizing the results.
shinyCOMPASS( x, dir = NULL, meta.vars, facet1 = "None", facet2 = "None", facet3 = "None", main = "Heatmap of Ag-Specificity Posterior Probabilities", stimulation = NULL, launch = TRUE, ... )
shinyCOMPASS( x, dir = NULL, meta.vars, facet1 = "None", facet2 = "None", facet3 = "None", main = "Heatmap of Ag-Specificity Posterior Probabilities", stimulation = NULL, launch = TRUE, ... )
x |
An object of class |
dir |
A location to write out the |
meta.vars |
A character vector of column names that should be used for potential facetting in the Shiny app. By default, we take all metadata variables; you may want to limit this if you know certain variables are not of interest. |
facet1 , facet2 , facet3
|
Default values for facets in the Shiny app. Each should be the name of a single vector in the metadata. |
main |
A title to give to the heatmap and subset histogram plots. |
stimulation |
The name of the stimulation applied. If this is |
launch |
Boolean; if |
... |
Optional arguments passed to |
shinyCOMPASSDeps
, for identifying packages that you
need in order to run the Shiny application.
if (interactive()) { oldOpt <- getOption("example.ask") options(example.ask=FALSE) on.exit( options(example.ask=oldOpt) ) shinyCOMPASS(CR) options(example.ask=TRUE) }
if (interactive()) { oldOpt <- getOption("example.ask") options(example.ask=FALSE) on.exit( options(example.ask=oldOpt) ) shinyCOMPASS(CR) options(example.ask=TRUE) }
This function can be used to identify the packages still needed in order to launch the Shiny app.
shinyCOMPASSDeps(verbose = TRUE)
shinyCOMPASSDeps(verbose = TRUE)
verbose |
Boolean; if |
shinyCOMPASSDeps()
shinyCOMPASSDeps()
This function fits the COMPASS
model from a user-provided set of
stimulated / unstimulated matrices. See the NOTE for important details.
SimpleCOMPASS( n_s, n_u, meta, individual_id, iterations = 10000, replications = 8, verbose = TRUE, seed = 100 )
SimpleCOMPASS( n_s, n_u, meta, individual_id, iterations = 10000, replications = 8, verbose = TRUE, seed = 100 )
n_s |
The cell counts for stimulated cells. |
n_u |
The cell counts for unstimulated cells. |
meta |
A |
individual_id |
The name of the vector in |
iterations |
The number of iterations (per 'replication') to perform. |
replications |
The number of 'replications' to perform. In order to conserve memory, we only keep the model estimates from the last replication. |
verbose |
Boolean; if |
seed |
A seed for the random number generator. Defaults to 100. |
A list
with class COMPASSResult
with two components,
the fit
containing parameter estimates and parameter acceptance
rates, and data
containing the generated data used as input for
the model.
n_s and n_u counts matrices should contain ALL 2^M possible combinations of markers, even if they are 0 for some combinations. The code expects the marker combinations to be named in the following way:
"M1&M2&!M3"
means the combination represents cells expressing marker "M1" and "M2" and not "M3". For 3 markers, there should be 8 such combinations, such that n_s and n_u have 8 columns.
set.seed(123) n <- 10 ## number of subjects k <- 3 ## number of markers ## generate some sample data iid_vec <- paste0("iid_", 1:n) # Subject id data <- replicate(2*n, { nrow <- round(runif(1) * 1E4 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) vals[ vals < 2000 ] <- 0 output <- matrix(vals, nrow, ncol) output <- output[ apply(output, 1, sum) > 0, ] colnames(output) <- paste0("M", 1:k) return(output) }) meta <- cbind(iid=iid_vec, data.frame(trt=rep( c("Control", "Treatment"), each=n/2 ))) ## generate counts for n_s, n_u n_s <- CellCounts( data[1:n], Combinations(k) ) n_u <- CellCounts( data[(n+1):(2*n)], Combinations(k) ) rownames(n_s) = unique(meta$iid) rownames(n_u) = rownames(n_s) ## A smaller number of iterations is used here for running speed; ## prefer using more iterations for a real fit scr = SimpleCOMPASS(n_s, n_u, meta, "iid", iterations=1000)
set.seed(123) n <- 10 ## number of subjects k <- 3 ## number of markers ## generate some sample data iid_vec <- paste0("iid_", 1:n) # Subject id data <- replicate(2*n, { nrow <- round(runif(1) * 1E4 + 1000) ncol <- k vals <- rexp( nrow * ncol, runif(1, 1E-5, 1E-3) ) vals[ vals < 2000 ] <- 0 output <- matrix(vals, nrow, ncol) output <- output[ apply(output, 1, sum) > 0, ] colnames(output) <- paste0("M", 1:k) return(output) }) meta <- cbind(iid=iid_vec, data.frame(trt=rep( c("Control", "Treatment"), each=n/2 ))) ## generate counts for n_s, n_u n_s <- CellCounts( data[1:n], Combinations(k) ) n_u <- CellCounts( data[(n+1):(2*n)], Combinations(k) ) rownames(n_s) = unique(meta$iid) rownames(n_u) = rownames(n_s) ## A smaller number of iterations is used here for running speed; ## prefer using more iterations for a real fit scr = SimpleCOMPASS(n_s, n_u, meta, "iid", iterations=1000)
Use this function to subset a COMPASSContainer
.
## S3 method for class 'COMPASSContainer' subset(x, subset, ...)
## S3 method for class 'COMPASSContainer' subset(x, subset, ...)
x |
A |
subset |
A logical expression, evaluated within the metadata, indicating which entries to keep. |
... |
other arguments passed to 'COMPASSContainer' call. |
subset(CC, iid == "iid_1")
subset(CC, iid == "iid_1")
This function prints summary information about a COMPASSContainer
object – the number of samples, basic information about the metadata,
and so on.
## S3 method for class 'COMPASSContainer' summary(object, ...)
## S3 method for class 'COMPASSContainer' summary(object, ...)
object |
An object of class |
... |
Optional arguments; currently ignored. |
summary(CC)
summary(CC)
This function prints basic information about the model fit by a
COMPASS
call.
## S3 method for class 'COMPASSResult' summary(object, ...)
## S3 method for class 'COMPASSResult' summary(object, ...)
object |
An object of class |
... |
Optional arguments; currently unused. |
print(CR)
print(CR)
This function is used to compute total cell counts, per individual,
from a COMPASSContainer
.
TotalCellCounts(data, subset, aggregate = TRUE)
TotalCellCounts(data, subset, aggregate = TRUE)
data |
A |
subset |
An expression, evaluated within the metadata, defining
the subset of |
aggregate |
Boolean; if |
TotalCellCounts(CC, trt == "Treatment") TotalCellCounts(CC, trt == "Control") TotalCellCounts(CC)
TotalCellCounts(CC, trt == "Treatment") TotalCellCounts(CC, trt == "Control") TotalCellCounts(CC)
Translate boolean population names from format exported by common software tools to a format used by COMPASS.
translate_marker_names(cellpops)
translate_marker_names(cellpops)
cellpops |
|
character
vector of cell population names used by COMPASS
select_compass_pops
#Generate marker names markers = LETTERS[1:4] pos = c("+","-") popnames = apply(expand.grid(pos,pos,pos,pos),1, function(x) paste(paste(paste(markers,x,sep=""), collapse=""),",Count",sep="")) popnames = sample(c(popnames, paste(paste(markers,sample(c("+","-"), length(markers),replace=TRUE),sep=""), ",Count",sep=""))) popnames = popnames[select_compass_pops(popnames,LETTERS[1:4])] #Translate translate_marker_names(popnames)
#Generate marker names markers = LETTERS[1:4] pos = c("+","-") popnames = apply(expand.grid(pos,pos,pos,pos),1, function(x) paste(paste(paste(markers,x,sep=""), collapse=""),",Count",sep="")) popnames = sample(c(popnames, paste(paste(markers,sample(c("+","-"), length(markers),replace=TRUE),sep=""), ",Count",sep=""))) popnames = popnames[select_compass_pops(popnames,LETTERS[1:4])] #Translate translate_marker_names(popnames)
Transpose a matrix-like list.
transpose_list(x)
transpose_list(x)
x |
An R list. |
l <- list( 1:3, 4:6, 7:9 ) stopifnot( identical( transpose_list( transpose_list(l) ), l ) )
l <- list( 1:3, 4:6, 7:9 ) stopifnot( identical( transpose_list( transpose_list(l) ), l ) )
Generate all possible unique combinations of x
.
Primarily used as a helper function for CellCounts
, but may
be occasionally useful to the end user.
UniqueCombinations(x, as.matrix) ## S3 method for class 'COMPASSContainer' UniqueCombinations(x, as.matrix = FALSE) ## Default S3 method: UniqueCombinations(x, as.matrix = FALSE)
UniqueCombinations(x, as.matrix) ## S3 method for class 'COMPASSContainer' UniqueCombinations(x, as.matrix = FALSE) ## Default S3 method: UniqueCombinations(x, as.matrix = FALSE)
x |
Either a |
as.matrix |
Boolean; if |
UniqueCombinations(CC)
UniqueCombinations(CC)