Title: | Clustering for Flow Cytometry |
---|---|
Description: | Robust model-based clustering using a t-mixture model with Box-Cox transformation. Note: users should have GSL installed. Windows users: 'consult the README file available in the inst directory of the source distribution for necessary configuration instructions'. |
Authors: | Raphael Gottardo, Kenneth Lo <[email protected]>, Greg Finak <[email protected]> |
Maintainer: | Greg Finak <[email protected]>, Mike Jiang <[email protected]> |
License: | MIT |
Version: | 3.45.0 |
Built: | 2024-10-30 07:52:44 UTC |
Source: | https://github.com/bioc/flowClust |
Robust model-based clustering using a mixture model with Box-Cox
transformation.
Package: | flowClust |
Type: | Package |
Version: | 2.0.0 |
Depends: | R(>= 2.5.0), methods, mnormt, mclust, ellipse, Biobase, flowCore |
Collate: | SetClasses.R SetMethods.R plot.R flowClust.R SimulateMixture.R |
biocViews: | Clustering, Statistics, Visualization |
License: | Artistic-2.0 |
Built: | R 2.6.1; universal-apple-darwin8.10.1; 2008-03-26 20:54:42; unix |
Box-Cox Transformation
Grid of Density Values for the
Fitted Mixture Model with Box-Cox Transformation
Density of the Multivariate Distribution
with Box-Cox Tranformation
Density of the
Multivariate Mixture Distribution with Box-Cox Tranformation
Robust Model-based Clustering for Flow Cytometry
1-D Density Plot (Histogram) of Clustering Results
Cluster Assignment Based on Clustering Results
Various Functions for Retrieving Information from Clustering Results
Scatterplot of Clustering Results
Contour or Image Plot of Clustering Results
Scatterplot / 1-D Density Plot of Filtering (Clustering) Results
Reverse Box-Cox Transformation
Showing or Modifying the Rule used to Identify Outliers
Show Method for flowClust
/ tmixFilterResult
Object
Show Method for
tmixFilter
Object
Random
Generation from a Mixture Model with Box-Cox Transformation
Splitting Data Based on Clustering Results
Subsetting Data Based on Clustering Results
Summary Method for
flowClust
Object
Creating Filters and Filtering Flow Cytometry Data
Further information is available in the vignette.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Maintainer: Raphael Gottardo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
Extract the bayesian information criterion for a flowClust fit.
## S3 method for class 'flowClustList' BIC(object, ...) ## S3 method for class 'flowClust' BIC(object, ...)
## S3 method for class 'flowClustList' BIC(object, ...) ## S3 method for class 'flowClust' BIC(object, ...)
object |
|
... |
other arguments. Currently not used. |
vector
of BIC or ICL values
This function performs Box-Cox transformation on the inputted data matrix.
box(data, lambda)
box(data, lambda)
data |
A numeric vector, matrix or data frame of observations. Negative data values are permitted. |
lambda |
The transformation to be applied to the data. If negative
data values are present, |
To allow for negative data values, a slightly modified version of the original Box-Cox (1964) is used here. This modified version originated from Bickel and Doksum (1981), taking the following form:
When negative data values are involved, the
transformation parameter, , has to be positive
in order to avoid discontinuity across zero.
A numeric vector, matrix or data frame of the same dimension as
data
is returned.
Bickel, P. J. and Doksum, K. A. (1981) An Analysis of Transformations Revisited. J. Amer. Statist. Assoc. 76(374), 296-311.
Box, G. E. P. and Cox, D. R. (1964) An Analysis of Transformations. J. R. Statist. Soc. B 26, 211-252.
data(rituximab) library(flowCore) data <- exprs(rituximab) summary(data) # Transform data using Box-Cox with lambda=0.3 dataTrans <- box(data, 0.3) # Reverse transform data; this should return back to the original rituximab data summary(rbox(dataTrans, 0.3))
data(rituximab) library(flowCore) data <- exprs(rituximab) summary(data) # Transform data using Box-Cox with lambda=0.3 dataTrans <- box(data, 0.3) # Reverse transform data; this should return back to the original rituximab data summary(rbox(dataTrans, 0.3))
Various functions are available to retrieve the information criteria
(criterion
), the posterior probabilities of clustering memberships
(
posterior
), the “weights”
(
importance
), the uncertainty (uncertainty
), and the estimates
of the cluster proportions, means and variances (getEstimates
)
resulted from the clustering (filtering) operation.
criterion(object, ...) ## S4 method for signature 'flowClust' criterion(object, type = "BIC") ## S4 method for signature 'flowClustList' criterion(object, type = "BIC", max = FALSE, show.K = FALSE) criterion(object) <- value ## S4 replacement method for signature 'flowClustList,character' criterion(object) <- value posterior(object, assign = FALSE) importance(object, assign = FALSE) uncertainty(object) getEstimates(object, data)
criterion(object, ...) ## S4 method for signature 'flowClust' criterion(object, type = "BIC") ## S4 method for signature 'flowClustList' criterion(object, type = "BIC", max = FALSE, show.K = FALSE) criterion(object) <- value ## S4 replacement method for signature 'flowClustList,character' criterion(object) <- value posterior(object, assign = FALSE) importance(object, assign = FALSE) uncertainty(object) getEstimates(object, data)
object |
Object returned from |
... |
Further arguments. Currently this is |
type , value
|
A character string stating the criterion used to choose the
best model. May take either |
max |
whether |
show.K |
whether |
assign |
A logical value. If |
data |
A numeric vector, matrix, data frame of observations, or object
of class |
These functions are written to retrieve various slots contained in the
object returned from the clustering operation. criterion
is to
retrieve object@BIC
, object@ICL
or object@logLike
. It
replacement method modifies object@index
and object@criterion
to select the best model according to the desired criterion.
posterior
and importance
provide a means to conveniently
retrieve information stored in object@z
and object@u
respectively. uncertainty
is to retrieve object@uncertainty
.
getEstimates
is to retrieve information stored in object@mu
(transformed back to the original scale) and object@w
; when the data
object is provided, an approximate variance estimate (on the original scale,
obtained by performing one M-step of the EM algorithm without taking the
Box-Cox transformation) will also be computed.
Denote by the number of clusters,
the number of
observations, and
the number of variables. For
posterior
and
importance
, a matrix of size is returned if
assign=FALSE
(default). Otherwise, a vector of size is
outputted.
uncertainty
always outputs a vector of size .
getEstimates
returns a list with named elements, proportions
,
locations
and, if the data object is provided, dispersion
.
proportions
is a vector of size and contains the estimates of
the
cluster proportions.
locations
is a matrix of size
and contains the estimates of the
mean
vectors transformed back to the original scale (i.e.,
rbox(object@mu,
object@lambda)
). dispersion
is an array of dimensions , containing the approximate estimates of the
covariance matrices on the original scale.
When object@nu=Inf
, the Mahalanobis distances instead of the
“weights” are stored in object@u
. Hence, importance
will retrieve information corresponding to the Mahalanobis distances.
the assign
argument is set to TRUE
, only the quantities
corresponding to assigned observations will be returned. Quantities
corresponding to unassigned observations (outliers and filtered
observations) will be reported as NA
. Hence, A change in the rule to
call outliers will incur a change in the number of NA
values
returned.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This function computes the densities at the inputted points of the
multivariate distribution with Box-Cox transformation.
dmvt(x, mu, sigma, nu, lambda, log = FALSE)
dmvt(x, mu, sigma, nu, lambda, log = FALSE)
x |
A matrix or data frame of size |
mu |
A numeric vector of length |
sigma |
A matrix of size |
nu |
The degrees of freedom used for the |
lambda |
The Box-Cox transformation parameter. If missing, the
conventional |
log |
A logical value. If |
A list with the following components:
value |
A vector of
length |
md |
A vector of length
|
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
This function computes the densities at the inputted points of the
multivariate mixture distribution with Box-Cox transformation.
dmvtmix(x, w, mu, sigma, nu, lambda, object, subset, include, log = FALSE)
dmvtmix(x, w, mu, sigma, nu, lambda, object, subset, include, log = FALSE)
x |
A matrix or data frame of size |
w |
A numeric vector of length |
mu |
A matrix of size |
sigma |
An array of size |
nu |
A numeric vector of length |
lambda |
The Box-Cox transformation parameter. If missing, the
conventional |
object |
An optional argument. If provided, it's an object returned
from |
subset |
An optional argument. If provided, it's a numeric vector
indicating which variables are selected for computing the densities. If
|
include |
An optional argument. If provided, it's a numeric vector specifying which clusters are included for computing the densities. |
log |
A logical value. If |
A vector of length containing the density values.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
This function performs automated clustering for identifying cell populations in flow cytometry data. The approach is based on the tmixture model with the Box-Cox transformation, which provides a unified framework to handle outlier identification and data transformation simultaneously.
flowClust( x, expName = "Flow Experiment", varNames = NULL, K, nu = 4, lambda = 1, trans = 1, min.count = 10, max.count = 10, min = NULL, max = NULL, randomStart = 0, prior = NULL, usePrior = "no", criterion = "BIC", ... )
flowClust( x, expName = "Flow Experiment", varNames = NULL, K, nu = 4, lambda = 1, trans = 1, min.count = 10, max.count = 10, min = NULL, max = NULL, randomStart = 0, prior = NULL, usePrior = "no", criterion = "BIC", ... )
x |
A numeric vector, matrix, data frame of observations, or object of
class |
expName |
A character string giving the name of the experiment. |
varNames |
A character vector specifying the variables (columns) to be included in clustering. When it is left unspecified, all the variables will be used. |
K |
An integer vector indicating the numbers of clusters. |
nu |
The degrees of freedom used for the |
lambda |
The initial transformation to be applied to the data. |
trans |
A numeric indicating whether the Box-Cox transformation parameter is estimated from the data. May take 0 (no estimation), 1 (estimation, default) or 2 (cluster-specific estimation). |
min.count |
An integer specifying the threshold count for filtering
data points from below. The default is 10, meaning that if 10 or more data
points are smaller than or equal to |
max.count |
An integer specifying the threshold count for filtering
data points from above. Interpretation is similar to that of
|
min |
The lower boundary set for data filtering. Note that it is a vector of length equal to the number of variables (columns), implying that a different value can be set as per each variable. |
max |
The upper boundary set for data filtering. Interpretation is
similar to that of |
randomStart |
A numeric value indicating how many times a random parition of the data is generated for initialization. The default is 0, meaning that a deterministic partition based on kmeans clustering is used. A value of 10 means random partitions of the data will be generated, each of which is followed by a short EM run. The partition leading to the highest likelihood value will be adopted to be the initial partition for the eventual long EM run. |
prior |
The specification of the prior. Used if usePrior="yes" |
usePrior |
Argument specifying whether or not the prior will be used. Can be "yes","no","vague". A vague prior will be automatically specified if usePrior="vague" |
criterion |
A character string stating the criterion used to choose the
best model. May take either |
... |
other arguments: B: The maximum number of EM iterations.Default is 500. tol: The tolerance used to assess the convergence of the EM. default is 1e-5. nu.est: A numeric indicating whether level: A numeric value between 0 and 1 specifying the threshold quantile level used to call a point an outlier. The default is 0.9, meaning that any point outside the 90% quantile region will be called an outlier. u.cutoff: Another criterion used to identify outliers. If this is
z.cutoff: A numeric value between 0 and 1 underlying a criterion which may
be used together with B.init: The maximum number of EM iterations following each random partition in random initialization. Default is the same as B. tol.init: The tolerance used as the stopping criterion for the short EM runs in random initialization. Default is 1e-2. seed: An integer giving the seed number used when
control: An argument reserved for internal use. |
Estimation of the unknown parameters (including the Box-Cox parameter) is done via an Expectation-Maximization (EM) algorithm. At each EM iteration, Brent's algorithm is used to find the optimal value of the Box-Cox transformation parameter. Conditional on the transformation parameter, all other estimates can be obtained in closed form. Please refer to Lo et al. (2008) for more details.
The flowClust package makes extensive use of the GSL as well as BLAS. If an optimized BLAS library is provided when compiling the package, the flowClust package will be able to run multi-threaded processes.
Various operations have been defined for the object returned from
flowClust
.
In addition, to facilitate the integration with the flowCore package
for processing flow cytometry data, the flowClust
operation can be
done through a method pair (tmixFilter
and
filter
) such that various methods defined in
flowCore can be applied on the object created from the filtering
operation.
If K
is of length 1, the function returns an object of class
flowClust
containing the following slots, where is the number
of clusters,
is the number of observations and
is the number
of variables:
expName |
Content of the |
varNames |
Content of the |
K |
An integer showing the number of clusters. |
w |
A vector of length |
mu |
A matrix of size |
sigma |
An array of dimension |
lambda |
The Box-Cox transformation parameter estimate. |
nu |
The
degrees of freedom for the |
z |
A matrix of size
|
u |
A
matrix of size |
label |
A
vector of size |
uncertainty |
A vector of size |
ruleOutliers |
A numeric vector of size 3, storing the
rule used to call outliers. The first element is 0 if the criterion is set
by the |
flagOutliers |
A
logical vector of size |
rm.min |
Number of points filtered from below. |
rm.max |
Number of points filtered from above. |
logLike |
The log-likelihood of the fitted mixture model. |
BIC |
The Bayesian Information Criterion for the fitted mixture model. |
ICL |
The Integrated Completed Likelihood for the fitted mixture model. |
If K
has a length >1, the function returns an object of class
flowClustList
. Its data part is a list with the same length as
K
, each element of which is a flowClust
object corresponding
to a specific number of clusters. In addition, the resultant
flowClustList
object contains the following slots:
index
An integer giving the index of the list element corresponding
to the best model as selected by criterion
.criterion
The
criterion used to choose the best model – either "BIC"
or
"ICL"
.
Note that when a flowClustList
object is used in place of a
flowClust
object, in most cases the list element corresponding to the
best model will be extracted and passed to the method/function call.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
summary
,
plot
,
density
,
hist
, Subset
,
split
, ruleOutliers
, Map
,
SimulateMixture
library(flowCore) data(rituximab) ### cluster the data using FSC.H and SSC.H res1 <- flowClust(rituximab, varNames=c("FSC.H", "SSC.H"), K=1) ### remove outliers before proceeding to the second stage # %in% operator returns a logical vector indicating whether each # of the observations lies within the cluster boundary or not rituximab2 <- rituximab[rituximab %in% res1,] # a shorthand for the above line rituximab2 <- rituximab[res1,] # this can also be done using the Subset method rituximab2 <- Subset(rituximab, res1) ### cluster the data using FL1.H and FL3.H (with 3 clusters) res2 <- flowClust(rituximab2, varNames=c("FL1.H", "FL3.H"), K=3) show(res2) summary(res2) # to demonstrate the use of the split method split(rituximab2, res2) split(rituximab2, res2, population=list(sc1=c(1,2), sc2=3)) # to show the cluster assignment of observations table(Map(res2)) # to show the cluster centres (i.e., the mean parameter estimates # transformed back to the original scale) getEstimates(res2)$locations ### demonstrate the use of various plotting methods # a scatterplot plot(res2, data=rituximab2, level=0.8) plot(res2, data=rituximab2, level=0.8, include=c(1,2), grayscale=TRUE, pch.outliers=2) # a contour / image plot res2.den <- density(res2, data=rituximab2) plot(res2.den) plot(res2.den, scale="sqrt", drawlabels=FALSE) plot(res2.den, type="image", nlevels=100) plot(density(res2, include=c(1,2), from=c(0,0), to=c(400,600))) # a histogram (1-D density) plot hist(res2, data=rituximab2, subset="FL1.H") ### to demonstrate the use of the ruleOutliers method summary(res2) # change the rule to call outliers ruleOutliers(res2) <- list(level=0.95) # augmented cluster boundaries lead to fewer outliers summary(res2) # the following line illustrates how to select a subset of data # to perform cluster analysis through the min and max arguments; # also note the use of level to specify a rule to call outliers # other than the default flowClust(rituximab2, varNames=c("FL1.H", "FL3.H"), K=3, B=100, min=c(0,0), max=c(400,800), level=0.95, z.cutoff=0.5)
library(flowCore) data(rituximab) ### cluster the data using FSC.H and SSC.H res1 <- flowClust(rituximab, varNames=c("FSC.H", "SSC.H"), K=1) ### remove outliers before proceeding to the second stage # %in% operator returns a logical vector indicating whether each # of the observations lies within the cluster boundary or not rituximab2 <- rituximab[rituximab %in% res1,] # a shorthand for the above line rituximab2 <- rituximab[res1,] # this can also be done using the Subset method rituximab2 <- Subset(rituximab, res1) ### cluster the data using FL1.H and FL3.H (with 3 clusters) res2 <- flowClust(rituximab2, varNames=c("FL1.H", "FL3.H"), K=3) show(res2) summary(res2) # to demonstrate the use of the split method split(rituximab2, res2) split(rituximab2, res2, population=list(sc1=c(1,2), sc2=3)) # to show the cluster assignment of observations table(Map(res2)) # to show the cluster centres (i.e., the mean parameter estimates # transformed back to the original scale) getEstimates(res2)$locations ### demonstrate the use of various plotting methods # a scatterplot plot(res2, data=rituximab2, level=0.8) plot(res2, data=rituximab2, level=0.8, include=c(1,2), grayscale=TRUE, pch.outliers=2) # a contour / image plot res2.den <- density(res2, data=rituximab2) plot(res2.den) plot(res2.den, scale="sqrt", drawlabels=FALSE) plot(res2.den, type="image", nlevels=100) plot(density(res2, include=c(1,2), from=c(0,0), to=c(400,600))) # a histogram (1-D density) plot hist(res2, data=rituximab2, subset="FL1.H") ### to demonstrate the use of the ruleOutliers method summary(res2) # change the rule to call outliers ruleOutliers(res2) <- list(level=0.95) # augmented cluster boundaries lead to fewer outliers summary(res2) # the following line illustrates how to select a subset of data # to perform cluster analysis through the min and max arguments; # also note the use of level to specify a rule to call outliers # other than the default flowClust(rituximab2, varNames=c("FL1.H", "FL3.H"), K=3, B=100, min=c(0,0), max=c(400,800), level=0.95, z.cutoff=0.5)
generate the curve that reflects the tmixture fitting outcome
flowClust.den(x, obj, subset, include)
flowClust.den(x, obj, subset, include)
x |
the numeric vector represents the x coordinates in plot |
obj |
the flowClust object |
subset |
An integer indicating which variable is selected for the plot.
Alternatively, a character string containing the name of the variable is
allowed if |
include |
A numeric vector specifying which clusters are shown on the plot. By default, all clusters are included. |
This method constructs the flowDens
object which is used to generate
a contour or image plot.
## S4 method for signature 'flowClust' density( x, data = NULL, subset = c(1, 2), include = 1:(x@K), npoints = c(100, 100), from = NULL, to = NULL ) ## S4 method for signature 'flowClustList' density( x, data = NULL, subset = c(1, 2), include = 1:(x@K), npoints = c(100, 100), from = NULL, to = NULL )
## S4 method for signature 'flowClust' density( x, data = NULL, subset = c(1, 2), include = 1:(x@K), npoints = c(100, 100), from = NULL, to = NULL ) ## S4 method for signature 'flowClustList' density( x, data = NULL, subset = c(1, 2), include = 1:(x@K), npoints = c(100, 100), from = NULL, to = NULL )
x |
Object returned from |
data |
A matrix, data frame of observations, or object of class
|
subset |
A numeric vector of length two indicating which two variables
are selected for the scatterplot. Alternatively, a character vector
containing the names of the two variables is allowed if |
include |
A numeric vector specifying which clusters are included to compute the density values. By default, all clusters are included. |
npoints |
A numeric vector of size two specifying the number of grid
points in |
from |
A numeric vector of size two specifying the coordinates of the
lower left point of the grid square. Note that, if this (and |
to |
A numeric vector of size two specifying the co-ordinates of the upper right point of the grid square. |
The flowDens
object returned is to be passed to the plot
method for generating a contour or image plot.
An object of class flowDens
containing the following slots is
constructed:
dx |
A numeric vector of length |
dy |
A numeric vector of
length |
value |
A matrix of size |
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
This method generates a one-dimensional density plot for the specified dimension (variable) based on the robust model-based clustering results. A histogram of the actual data or cluster assignment is optional for display.
## S3 method for class 'flowClust' hist( x, data = NULL, subset = 1, include = 1:(x@K), histogram = TRUE, labels = TRUE, ylab = "Density", main = NULL, col = NULL, pch = 20, cex = 0.6, ... ) ## S3 method for class 'flowClustList' hist(x, ...)
## S3 method for class 'flowClust' hist( x, data = NULL, subset = 1, include = 1:(x@K), histogram = TRUE, labels = TRUE, ylab = "Density", main = NULL, col = NULL, pch = 20, cex = 0.6, ... ) ## S3 method for class 'flowClustList' hist(x, ...)
x |
Object returned from |
data |
A numeric vector, matrix, data frame of observations, or object
of class |
subset |
An integer indicating which variable is selected for the plot.
Alternatively, a character string containing the name of the variable is
allowed if |
include |
A numeric vector specifying which clusters are shown on the plot. By default, all clusters are included. |
histogram |
A logical value indicating whether a histogram of the actual data is made in addition to the density plot or not. |
labels |
A logical value indicating whether information about cluster assignment is shown or not. |
ylab |
Labels for the |
main |
Title of the plot. |
col |
Colors of the plotting characters displaying the cluster
assignment (if |
pch |
Plotting character used to show the cluster assignment. |
cex |
Size of the plotting character showing the cluster assignment. |
... |
other arguments xlim The range of ylim The range of breaks Content to be passed to the ... Further arguments passed to if |
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method performs cluster assignment according to the posterior probabilities of clustering memberships resulted from the clustering (filtering) operations. Outliers identified will be left unassigned by default.
Map(f, ...) ## S4 method for signature 'flowClust' Map(f, rm.outliers = TRUE, ...) ## S4 method for signature 'flowClustList' Map(f, rm.outliers = TRUE, ...)
Map(f, ...) ## S4 method for signature 'flowClust' Map(f, rm.outliers = TRUE, ...) ## S4 method for signature 'flowClustList' Map(f, rm.outliers = TRUE, ...)
f |
|
... |
Further arguments to be passed to or from other methods. |
rm.outliers |
A logical value indicating whether outliers will be left unassigned or not. |
A numeric vector of size (the number of observations)
indicating to which cluster each observation is assigned. Unassigned
observations will be labelled as
NA
.
Even if rm.outliers
is set to FALSE
, NA
may still
appear in the resultant vector due to the filtered observations; see the
descriptions about the min.count
, max.count
, min
and
max
arguments of flowClust
.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method generates scatterplot revealing the cluster assignment, cluster boundaries according to the specified percentile as well as supplemental information like outliers or filtered observations.
plot(x, y, ...) ## S4 method for signature 'flowClust,missing' plot( x, data, subset = c(1, 2), ellipse = T, show.outliers = T, show.rm = F, include = 1:(x@K), main = NULL, grayscale = F, col = (if (grayscale) gray(1/4) else 2:(length(include) + 1)), pch = ".", cex = 0.6, col.outliers = gray(3/4), pch.outliers = ".", cex.outliers = cex, col.rm = 1, pch.rm = 1, cex.rm = 0.6, ecol = 1, elty = 1, level = NULL, u.cutoff = NULL, z.cutoff = NULL, npoints = 100, add = F, ... ) ## S4 method for signature 'flowClustList,missing' plot( x, data, subset = c(1, 2), ellipse = T, show.outliers = T, show.rm = F, include = 1:(x@K), main = NULL, grayscale = F, col = (if (grayscale) gray(1/4) else 2:(length(include) + 1)), pch = ".", cex = 0.6, col.outliers = gray(3/4), pch.outliers = ".", cex.outliers = cex, col.rm = 1, pch.rm = 1, cex.rm = 0.6, ecol = 1, elty = 1, level = NULL, u.cutoff = NULL, z.cutoff = NULL, npoints = 501, add = F, ... )
plot(x, y, ...) ## S4 method for signature 'flowClust,missing' plot( x, data, subset = c(1, 2), ellipse = T, show.outliers = T, show.rm = F, include = 1:(x@K), main = NULL, grayscale = F, col = (if (grayscale) gray(1/4) else 2:(length(include) + 1)), pch = ".", cex = 0.6, col.outliers = gray(3/4), pch.outliers = ".", cex.outliers = cex, col.rm = 1, pch.rm = 1, cex.rm = 0.6, ecol = 1, elty = 1, level = NULL, u.cutoff = NULL, z.cutoff = NULL, npoints = 100, add = F, ... ) ## S4 method for signature 'flowClustList,missing' plot( x, data, subset = c(1, 2), ellipse = T, show.outliers = T, show.rm = F, include = 1:(x@K), main = NULL, grayscale = F, col = (if (grayscale) gray(1/4) else 2:(length(include) + 1)), pch = ".", cex = 0.6, col.outliers = gray(3/4), pch.outliers = ".", cex.outliers = cex, col.rm = 1, pch.rm = 1, cex.rm = 0.6, ecol = 1, elty = 1, level = NULL, u.cutoff = NULL, z.cutoff = NULL, npoints = 501, add = F, ... )
x |
Object returned from |
y |
missing |
... |
Further graphical parameters passed to the generic function
|
data |
A matrix, data frame of observations, or object of class
|
subset |
A numeric vector of length two indicating which two variables
are selected for the scatterplot. Alternatively, a character vector
containing the names of the two variables is allowed if |
ellipse |
A logical value indicating whether the cluster boundary is to
be drawn or not. If |
show.outliers |
A logical value indicating whether outliers will be explicitly shown or not. |
show.rm |
A logical value indicating whether filtered observations will be shown or not. |
include |
A numeric vector specifying which clusters will be shown on the plot. By default, all clusters are included. |
main |
Title of the plot. |
grayscale |
A logical value specifying if a grayscale plot is desired. This argument takes effect only if the default values of relevant graphical arguments are taken. |
col |
Color(s) of the plotting characters. May specify a different color for each cluster. |
pch |
Plotting character(s) of the plotting characters. May specify a different character for each cluster. |
cex |
Size of the plotting characters. May specify a different size for each cluster. |
col.outliers |
Color of the plotting characters denoting outliers. |
pch.outliers |
Plotting character(s) used to denote outliers. May specify a different character for each cluster. |
cex.outliers |
Size of the plotting characters used to denote outliers. May specify a different size for each cluster. |
col.rm |
Color of the plotting characters denoting filtered observations. |
pch.rm |
Plotting character used to denote filtered observations. |
cex.rm |
Size of the plotting character used to denote filtered observations. |
ecol |
Color(s) of the lines representing the cluster boundaries. May specify a different color for each cluster. |
elty |
Line type(s) drawing the cluster boundaries. May specify a different line type for each cluster. |
level , u.cutoff , z.cutoff
|
These three optional arguments specify the
rule used to identify outliers. By default, all of them are left
unspecified, meaning that the rule stated in |
npoints |
The number of points used to draw each cluster boundary. |
add |
A logical value. If |
The cluster boundaries need not be elliptical since Box-Cox transformation has been performed.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method makes use of the flowDens
object returned by
density
to generate a contour or image
plot.
## S4 method for signature 'flowDens,missing' plot( x, type = c("contour", "image"), nlevels = 30, scale = c("raw", "log", "sqrt"), color = c("rainbow", "heat.colors", "terrain.colors", "topo.colors", "cm.colors", "gray"), xlab = colnames(x@dx), ylab = colnames(x@dy), ... )
## S4 method for signature 'flowDens,missing' plot( x, type = c("contour", "image"), nlevels = 30, scale = c("raw", "log", "sqrt"), color = c("rainbow", "heat.colors", "terrain.colors", "topo.colors", "cm.colors", "gray"), xlab = colnames(x@dx), ylab = colnames(x@dy), ... )
x |
The |
type |
Either |
nlevels |
An integer to specify the number of contour levels or colors shown in the plot. |
scale |
If |
color |
A string containing the name of the function used to generate the desired list of colors. |
xlab , ylab
|
Labels for the |
... |
Other arguments to be passed to |
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
Depending on the dimensions specified, this method generates either a scatterplot or a one-dimensional density plot (histogram) based on the robust model-based clustering results.
## S4 method for signature 'flowFrame,tmixFilterResult' plot(x, y, z = NULL, ...) ## S4 method for signature 'flowFrame,tmixFilterResultList' plot(x, y, z = NULL, ...)
## S4 method for signature 'flowFrame,tmixFilterResult' plot(x, y, z = NULL, ...) ## S4 method for signature 'flowFrame,tmixFilterResultList' plot(x, y, z = NULL, ...)
x |
Object of class |
y |
Object of class |
z |
A character vector of length one or two containing the name(s) of
the variable(s) selected for the plot. If it is of length two, a
scatterplot will be generated. If it is of length one, a 1-D density plot
will be made. If it is unspecified, the first one/two variable(s) listed in
|
... |
All optional arguments passed to the
|
This plot
method is designed such that it resembles the
argument list of the plot
method defined in the flowCore
package. The actual implementation is done through the
plot
or hist
method defined for a flowClust
object.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This function performs back transformation on Box-Cox transformed data.
rbox(data, lambda)
rbox(data, lambda)
data |
A numeric vector, matrix or data frame of observations. |
lambda |
The Box-Cox transformation applied which results in the inputted data matrix. |
A numeric vector, matrix or data frame of the same dimension as
data
is returned.
Please refer to the documentation for box
for details about the
Box-Cox transformation in use.
A flow cytometry dataset produced in a drug-screening project to identify agents that would enhance the anti-lymphoma activity of Rituximab, a therapeutic monoclonal antibody. Cells were stained with anti-BrdU FITC and the DNA binding dye 7-AAD.
An object of class flowFrame
with 1545 cells (rows) and the
following eight variables (columns):
FSC-Height
Side Scatter
Anti-BrdU FITC
Channel not used
7 AAD
Channel not used
Channel not used
Time
Gasparetto, M., Gentry, T., Sebti, S., O'Bryan, E., Nimmanapalli, R., Blaskovich, M. A., Bhalla, K., Rizzieri, D., Haaland, P., Dunne, J. and Smith, C. (2004) Identification of compounds that enhance the anti-lymphoma activity of rituximab using flow cytometric high-content screening. J. Immunol. Methods 292, 59-71.
This method shows or modifies the rule used to identify outliers.
ruleOutliers(object) ## S4 method for signature 'flowClust' ruleOutliers(object) ## S4 method for signature 'flowClustList' ruleOutliers(object) ruleOutliers(object) <- value ## S4 replacement method for signature 'flowClust,list' ruleOutliers(object) <- value
ruleOutliers(object) ## S4 method for signature 'flowClust' ruleOutliers(object) ## S4 method for signature 'flowClustList' ruleOutliers(object) ruleOutliers(object) <- value ## S4 replacement method for signature 'flowClust,list' ruleOutliers(object) <- value
object |
|
value |
A list object with one or more of the following named elements:
|
The replacement method modifies object@ruleOutliers
(or
object[[k]]@ruleOutliers
if object
is of class
flowClustList
or tmixFilterResultList
) AND updates the logical
vector object@flagOutliers
(or object[[k]]@ruleOutliers
)
according to the new rule.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method lists out the slots contained in a flowClust
object.
## S4 method for signature 'flowClust' show(object) ## S4 method for signature 'flowClustList' show(object) ## S4 method for signature 'tmixFilter' show(object) ## S4 method for signature 'tmixFilterResult' show(object) ## S4 method for signature 'tmixFilterResultList' show(object)
## S4 method for signature 'flowClust' show(object) ## S4 method for signature 'flowClustList' show(object) ## S4 method for signature 'tmixFilter' show(object) ## S4 method for signature 'tmixFilterResult' show(object) ## S4 method for signature 'tmixFilterResultList' show(object)
object |
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
This method splits data according to results of the clustering (filtering) operation. Outliers identified will be removed by default.
split(x, f, drop = FALSE, ...) ## S4 method for signature 'data.frame,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'matrix,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'vector,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,tmixFilterResult' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'data.frame,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'matrix,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'vector,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,tmixFilterResultList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... )
split(x, f, drop = FALSE, ...) ## S4 method for signature 'data.frame,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'matrix,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'vector,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,flowClust' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,tmixFilterResult' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'data.frame,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'matrix,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'vector,flowClustList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... ) ## S4 method for signature 'flowFrame,tmixFilterResultList' split( x, f, drop = FALSE, population = NULL, split = NULL, rm.outliers = TRUE, ... )
x |
A numeric vector, matrix, data frame of observations, or object of
class |
f |
Object returned from |
drop |
A logical value indicating whether to coerce a column matrix
into a vector, if applicable. Default is |
... |
Further arguments to be passed to or from other methods. |
population |
An optional argument which specifies how to split the
data. If specified, it takes a list object with named or unnamed elements
each of which is a numeric vector specifying which clusters are included.
If this argument is left unspecified, the data object will be split into
|
split |
This argument is deprecated. Should use |
rm.outliers |
A logical value indicating whether outliers are removed or not. |
A list object with elements each of which is a subset of x
and also retains the same class as x
. If the split
argument
is specified with a list of named elements, those names will be used to name
the corresponding elements in the resultant list object.
split(x, f, drop=FALSE, population=NULL, split=NULL, rm.outliers=TRUE, ...)
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method returns a subset of data upon the removal of outliers identified from the clustering (filtering) operations.
## S4 method for signature 'flowFrame,flowClust' Subset(x, subset, ...) ## S4 method for signature 'flowFrame,tmixFilterResult' Subset(x, subset, ...) ## S4 method for signature 'data.frame,flowClust' Subset(x, subset, ...) ## S4 method for signature 'matrix,flowClust' Subset(x, subset, ...) ## S4 method for signature 'vector,flowClust' Subset(x, subset, ...) ## S4 method for signature 'ANY,flowClustList' Subset(x, subset, ...) ## S4 method for signature 'flowFrame,tmixFilterResultList' Subset(x, subset, ...)
## S4 method for signature 'flowFrame,flowClust' Subset(x, subset, ...) ## S4 method for signature 'flowFrame,tmixFilterResult' Subset(x, subset, ...) ## S4 method for signature 'data.frame,flowClust' Subset(x, subset, ...) ## S4 method for signature 'matrix,flowClust' Subset(x, subset, ...) ## S4 method for signature 'vector,flowClust' Subset(x, subset, ...) ## S4 method for signature 'ANY,flowClustList' Subset(x, subset, ...) ## S4 method for signature 'flowFrame,tmixFilterResultList' Subset(x, subset, ...)
x |
A numeric vector, matrix, data frame of observations, or object of
class |
subset |
Object returned from |
... |
Further arguments to be passed to or from other methods. |
An object which is a subset of x
. It also retains the same
class as x
.
Subset(x, subset, ...)
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
This method prints out various characteristics of the model fitted via robust model-based clustering.
summary(object, ...) ## S4 method for signature 'flowClust' summary(object) ## S4 method for signature 'flowClustList' summary(object)
summary(object, ...) ## S4 method for signature 'flowClust' summary(object) ## S4 method for signature 'flowClustList' summary(object)
object |
|
... |
not used |
Various characteristics of the fitted model will be given under the following five categories: Experiment Information, Clustering Summary, Transformation Parameter, Information Criteria, and Data Quality. Under Data Quality, information about data filtering, outliers, and uncertainty is given.
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
The tmixFilter
function creates a filter object which is then passed
to the filter
method that performs filtering on a flowFrame
object. This method pair is provided to let flowClust integrate with
the flowCore package.
tmixFilter(filterId = "tmixFilter", parameters = "", ...) ## S4 method for signature 'ANY,flowClust' x %in% table ## S4 method for signature 'flowFrame,tmixFilterResult' x %in% table ## S4 method for signature 'flowFrame,tmixFilter' x %in% table ## S4 method for signature 'ANY,tmixFilterResult' x %in% table ## S4 method for signature 'ANY,flowClustList' x %in% table ## S4 method for signature 'ANY,tmixFilterResultList' x %in% table ## S4 method for signature 'flowFrame,flowClust' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,tmixFilterResult' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,flowClustList' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,tmixFilterResultList' x[i, j, ..., drop = FALSE] ## S4 method for signature 'tmixFilterResultList,ANY' x[[i, j, ..., exact = TRUE]] ## S4 method for signature 'tmixFilterResultList' length(x) ## S4 method for signature 'tmixFilterResult,tmixFilter' summarizeFilter(result, filter)
tmixFilter(filterId = "tmixFilter", parameters = "", ...) ## S4 method for signature 'ANY,flowClust' x %in% table ## S4 method for signature 'flowFrame,tmixFilterResult' x %in% table ## S4 method for signature 'flowFrame,tmixFilter' x %in% table ## S4 method for signature 'ANY,tmixFilterResult' x %in% table ## S4 method for signature 'ANY,flowClustList' x %in% table ## S4 method for signature 'ANY,tmixFilterResultList' x %in% table ## S4 method for signature 'flowFrame,flowClust' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,tmixFilterResult' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,flowClustList' x[i, j, ..., drop = FALSE] ## S4 method for signature 'flowFrame,tmixFilterResultList' x[i, j, ..., drop = FALSE] ## S4 method for signature 'tmixFilterResultList,ANY' x[[i, j, ..., exact = TRUE]] ## S4 method for signature 'tmixFilterResultList' length(x) ## S4 method for signature 'tmixFilterResult,tmixFilter' summarizeFilter(result, filter)
filterId |
A character string that identifies the filter created. |
parameters |
A character vector specifying the variables to be used in
filtering. When it is left unspecified, all the variables of the
|
... |
Other arguments passed to the The The The If If |
x |
flowFrame |
table |
tmixFilterResult |
i |
tmixFilterResult or tmixFilterResultList |
j , drop , exact
|
not used |
result |
tmixFilterResult |
filter |
tmixFilter |
Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.
flowClust
,
summary
,
plot
,
density
,
hist
, Subset
,
split
, ruleOutliers
, Map
### The example below largely resembles the one in the flowClust ### man page. The main purpose here is to demonstrate how the ### entire cluster analysis can be done in a fashion highly ### integrated into flowCore. data(rituximab) library(flowCore) ### create a filter object s1filter <- tmixFilter("s1", c("FSC.H", "SSC.H"), K=1) ### cluster the data using FSC.H and SSC.H res1 <- filter(rituximab, s1filter) ### remove outliers before proceeding to the second stage # %in% operator returns a logical vector indicating whether each # of the observations lies inside the gate or not rituximab2 <- rituximab[rituximab %in% res1,] # a shorthand for the above line rituximab2 <- rituximab[res1,] # this can also be done using the Subset method rituximab2 <- Subset(rituximab, res1) ### cluster the data using FL1.H and FL3.H (with 3 clusters) s2filter <- tmixFilter("s2", c("FL1.H", "FL3.H"), K=3) res2 <- filter(rituximab2, s2filter) show(s2filter) show(res2) summary(res2) # to demonstrate the use of the split method split(rituximab2, res2) split(rituximab2, res2, population=list(sc1=c(1,2), sc2=3)) # to show the cluster assignment of observations table(Map(res2)) # to show the cluster centres (i.e., the mean parameter estimates # transformed back to the original scale) and proportions getEstimates(res2) ### demonstrate the use of various plotting methods # a scatterplot plot(rituximab2, res2, level=0.8) plot(rituximab2, res2, level=0.8, include=c(1,2), grayscale=TRUE, pch.outliers=2) # a contour / image plot res2.den <- density(res2, data=rituximab2) plot(res2.den) plot(res2.den, scale="sqrt", drawlabels=FALSE) plot(res2.den, type="image", nlevels=100) plot(density(res2, include=c(1,2), from=c(0,0), to=c(400,600))) # a histogram (1-D density) plot plot(rituximab2, res2, "FL1.H") ### to demonstrate the use of the ruleOutliers method summary(res2) # change the rule to call outliers ruleOutliers(res2) <- list(level=0.95) # augmented cluster boundaries lead to fewer outliers summary(res2) # the following line illustrates how to select a subset of data # to perform cluster analysis through the min and max arguments; # also note the use of level to specify a rule to call outliers # other than the default s2t <- tmixFilter("s2t", c("FL1.H", "FL3.H"), K=3, B=100, min=c(0,0), max=c(400,800), level=0.95, z.cutoff=0.5) filter(rituximab2, s2t)
### The example below largely resembles the one in the flowClust ### man page. The main purpose here is to demonstrate how the ### entire cluster analysis can be done in a fashion highly ### integrated into flowCore. data(rituximab) library(flowCore) ### create a filter object s1filter <- tmixFilter("s1", c("FSC.H", "SSC.H"), K=1) ### cluster the data using FSC.H and SSC.H res1 <- filter(rituximab, s1filter) ### remove outliers before proceeding to the second stage # %in% operator returns a logical vector indicating whether each # of the observations lies inside the gate or not rituximab2 <- rituximab[rituximab %in% res1,] # a shorthand for the above line rituximab2 <- rituximab[res1,] # this can also be done using the Subset method rituximab2 <- Subset(rituximab, res1) ### cluster the data using FL1.H and FL3.H (with 3 clusters) s2filter <- tmixFilter("s2", c("FL1.H", "FL3.H"), K=3) res2 <- filter(rituximab2, s2filter) show(s2filter) show(res2) summary(res2) # to demonstrate the use of the split method split(rituximab2, res2) split(rituximab2, res2, population=list(sc1=c(1,2), sc2=3)) # to show the cluster assignment of observations table(Map(res2)) # to show the cluster centres (i.e., the mean parameter estimates # transformed back to the original scale) and proportions getEstimates(res2) ### demonstrate the use of various plotting methods # a scatterplot plot(rituximab2, res2, level=0.8) plot(rituximab2, res2, level=0.8, include=c(1,2), grayscale=TRUE, pch.outliers=2) # a contour / image plot res2.den <- density(res2, data=rituximab2) plot(res2.den) plot(res2.den, scale="sqrt", drawlabels=FALSE) plot(res2.den, type="image", nlevels=100) plot(density(res2, include=c(1,2), from=c(0,0), to=c(400,600))) # a histogram (1-D density) plot plot(rituximab2, res2, "FL1.H") ### to demonstrate the use of the ruleOutliers method summary(res2) # change the rule to call outliers ruleOutliers(res2) <- list(level=0.95) # augmented cluster boundaries lead to fewer outliers summary(res2) # the following line illustrates how to select a subset of data # to perform cluster analysis through the min and max arguments; # also note the use of level to specify a rule to call outliers # other than the default s2t <- tmixFilter("s2t", c("FL1.H", "FL3.H"), K=3, B=100, min=c(0,0), max=c(400,800), level=0.95, z.cutoff=0.5) filter(rituximab2, s2t)