Title: | Statistical methods for the analysis of flow cytometry data |
---|---|
Description: | Methods and functionality to analyse flow data that is beyond the basic infrastructure provided by the flowCore package. |
Authors: | Florian Hahne, Nishant Gopalakrishnan, Alireza Hadj Khodabakhshi, Chao-Jen Wong, Kyongryun Lee |
Maintainer: | Greg Finak <[email protected]>, Mike Jiang <[email protected]> |
License: | Artistic-2.0 |
Version: | 4.19.0 |
Built: | 2024-12-29 07:10:41 UTC |
Source: | https://github.com/bioc/flowStats |
Functions, methods and classes implementing algorithmns for statistical analysis of flow cytometry data. This involves mostly data normalization and automated gating.
Package: | flowStats |
Type: | Package |
Version: | 1.0 |
License: | Artistic-2.0 |
Lazyload: | yes |
Florian Hahne
Maintainers: Mike Jiang <[email protected]>, Jake Wagner <[email protected]>
These methods are copied from flowViz to eliminate its dependency on curv1Filter and curv2Filter
## S4 method for signature 'curv1Filter,character' addName(x, name, data, ...) ## S4 method for signature 'curv1Filter,logical' addName(x, name, data, ...) ## S4 method for signature 'curv2Filter,character' addName(x, name, data, ...) ## S4 method for signature 'curv2Filter,logical' addName(x, name, data, ...)
## S4 method for signature 'curv1Filter,character' addName(x, name, data, ...) ## S4 method for signature 'curv1Filter,logical' addName(x, name, data, ...) ## S4 method for signature 'curv2Filter,character' addName(x, name, data, ...) ## S4 method for signature 'curv2Filter,logical' addName(x, name, data, ...)
x |
curv1Filter, curv2Filter |
name |
character or logical. Names can be generated by the filter or by the user. |
data |
flowFrame |
... |
other arguments |
The methods are called for their side effects. No value is returned.
This function tries to fit a single norm2Filter based on a rough
preselection of the data. This function is considered internal. Please
use the API provided by lymphGate
.
autoGate(x, ..., scale = 2.5)
autoGate(x, ..., scale = 2.5)
x |
An object of class |
... |
Named arguments or a list of the ranges used for the
initial rough preselection. This gets passed on to
|
scale |
The |
The flowSet
is first filtered using a rectangleGate
and
the norm2Filter
is subsequently fitted to the remaining subset.
A list with items:
x |
The filtered |
n2gate |
The |
n2gateResults |
The
|
Florian Hahne
library(flowCore) data(GvHD) flowStats:::autoGate(GvHD[10:15], "FSC-H"=c(100,500), "SSC-H"=c(0, 400))
library(flowCore) data(GvHD) flowStats:::autoGate(GvHD[10:15], "FSC-H"=c(100,500), "SSC-H"=c(0, 400))
A data frame containing the sub-populations of ITN dataset corresponding
to the high-density areas on "FSC" and "SSC" channels. This dataset is
yielded by backGating
on channel CD3, CD8, and CD4 of the
ITN
sample data.
data(BackGating)
data(BackGating)
Results from executing the following code:
library(flowCore) data(ITN)
flowStats:::backGating(ITN, xy=c("FSC", "SSC"), channels=c("CD3", "CD8", "CD4"))
The bins generated by probability binning a control data set can be applied to a test data set to perfom statistical comparisions by methods such as the Chi-squared test or the probability binning statistic.
binByRef(binRes, data)
binByRef(binRes, data)
binRes |
The result generated by calling teh |
data |
An object of class
|
An enviroment containing the matrices for each bin of the test data set
Nishant Gopalakrishnan
library(flowCore) data(GvHD) resCtrl<-proBin(GvHD[[1]],200) resSample<-binByRef(resCtrl,GvHD[[2]]) ls(resSample)
library(flowCore) data(GvHD) resCtrl<-proBin(GvHD[[1]],200) resSample<-binByRef(resCtrl,GvHD[[2]]) ls(resSample)
This function calculates the Probability binning metric proposed by
Baggerly et al. The function utilizes the data binned using the
proBin
and binByRef
functions.
calcPBChiSquare(ctrlRes,sampRes,ctrlCount,sampCount)
calcPBChiSquare(ctrlRes,sampRes,ctrlCount,sampCount)
ctrlRes |
The result generated by calling the |
sampRes |
The result generated by calling the |
ctrlCount |
The number of events in the control sample |
sampCount |
The number of events in the test sample being compared |
A list containing the statistic, p.value, observed, expected counts and the residuals
Nishant Gopalakrishnan
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]][,c("FSC-H","SSC-H","Time")],200) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H","Time"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]][,c("FSC-H","SSC-H","Time")]) ctrlCount<-nrow(GvHD[[1]]) sampCount<-nrow(GvHD[[16]]) stat<-calcPBChiSquare(resCtrl,resSample,ctrlCount,sampCount)
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]][,c("FSC-H","SSC-H","Time")],200) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H","Time"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]][,c("FSC-H","SSC-H","Time")]) ctrlCount<-nrow(GvHD[[1]]) sampCount<-nrow(GvHD[[16]]) stat<-calcPBChiSquare(resCtrl,resSample,ctrlCount,sampCount)
This function calculates the Pearsons chi-squared statistic for comparing
data binned using the proBin
and binByRef
functions.Internally, the
function utilizes the chisq.test function.
calcPearsonChi(ctrlRes,sampRes)
calcPearsonChi(ctrlRes,sampRes)
ctrlRes |
The result generated by calling the |
sampRes |
The result generated by calling the |
A list containing the statistic, p.value, observed, expected counts and the residuals
Nishant Gopalakrishnan
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]][,c("FSC-H","SSC-H","Time")],200) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H","Time"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]][,c("FSC-H","SSC-H","Time")]) stat<-calcPearsonChi(resCtrl,resSample)
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]][,c("FSC-H","SSC-H","Time")],200) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H","Time"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]][,c("FSC-H","SSC-H","Time")]) stat<-calcPearsonChi(resCtrl,resSample)
Class and constructor for data-driven
filter
objects
that selects high-density regions in one dimension.
curv1Filter(x, bwFac=1.2, gridsize=rep(401, 2), filterId="defaultCurv1Filter")
curv1Filter(x, bwFac=1.2, gridsize=rep(401, 2), filterId="defaultCurv1Filter")
x |
Character giving the name of the measurement parameter on which the filter is supposed to work on. This can also be a list containing a single character scalar for programmatic access. |
filterId |
An optional parameter that sets the |
bwFac , gridsize
|
Numerics of length 1 and 2, respectively, used
to set the |
Areas of high local density in one dimensions are identified by
detecting significant curvature regions. See Duong, T. and
Cowling, A. and Koch, I. and Wand, M.P., Computational Statistics and
Data Analysis 52/9, 2008 for details. The constructor
curv1Filter
is a convenience function for object
instantiation. Evaluating a curv1Filter
results in potentially
multiple sub-populations, an hence in an object of class
multipleFilterResult
. Accordingly, curv1Filters
can be used to split flow cytometry data sets.
Returns a curv1Filter
object for use in filtering
flowFrame
s or other flow cytometry objects.
Class "parameterFilter"
, directly.
Class "concreteFilter"
, by class
parameterFilter
, distance 2.
Class "filter"
, by class parameterFilter
,
distance 3.
bwFac
:Object of class "numeric"
. The bandwidth
factor used for smoothing of the density estimate.
gridsize
:Object of class "numeric"
. The size
of the bins used for density estimation.
parameters
:Object of class "character"
,
describing the parameter used to filter the flowFrame
.
filterId
:Object of class "character"
,
referencing the filter.
Objects can be created by calls of the form new("curvFilter",
...)
or using the constructor curv1Filter
. Using the
constructor is the recommended way of object instantiation:
signature(x = "flowFrame", table =
"curv1Filter")
: The workhorse used to evaluate the filter on
data. This is usually not called directly by the user, but
internally by calls to the filter
methods.
signature(object = "curv1Filter")
: Print
information about the filter.
See the documentation in the
flowViz
package for plotting of
curv1Filters
.
Florian Hahne
curv2Filter
,
flowFrame
, flowSet
,
filter
for evaluation
of curv1Filters
and split
for splitting of flow
cytometry data sets based on that.
library(flowStats) library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line curv1Filter("FSC-H", filterId="myCurv1Filter", bwFac=2) ## To facilitate programmatic construction we also have the following c1f <- curv1Filter(filterId="myCurv1Filter", x=list("FSC-H"), bwFac=2) ## Filtering using curv1Filter fres <- filter(dat, c1f) fres summary(fres) names(fres) ## The result of curv1 filtering are multiple sub-populations ## and we can split our data set accordingly split(dat, fres) ## We can limit the splitting to one or several sub-populations split(dat, fres, population="rest") split(dat, fres, population=list(keep=c("peak 2", "peak 3")))
library(flowStats) library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line curv1Filter("FSC-H", filterId="myCurv1Filter", bwFac=2) ## To facilitate programmatic construction we also have the following c1f <- curv1Filter(filterId="myCurv1Filter", x=list("FSC-H"), bwFac=2) ## Filtering using curv1Filter fres <- filter(dat, c1f) fres summary(fres) names(fres) ## The result of curv1 filtering are multiple sub-populations ## and we can split our data set accordingly split(dat, fres) ## We can limit the splitting to one or several sub-populations split(dat, fres, population="rest") split(dat, fres, population=list(keep=c("peak 2", "peak 3")))
Class and constructor for data-driven filter
objects
that selects high-density regions in two dimensions.
curv2Filter(x, y, filterId="defaultCurv2Filter", bwFac=1.2, gridsize=rep(151, 2))
curv2Filter(x, y, filterId="defaultCurv2Filter", bwFac=1.2, gridsize=rep(151, 2))
x , y
|
Characters giving the names of the measurement parameter
on which the filter is supposed to work on. |
filterId |
An optional parameter that sets the |
bwFac , gridsize
|
Numerics of length 1 and 2, respectively, used
to set the |
Areas of high local density in two dimensions are identified by
detecting significant curvature regions. See Duong, T. and
Cowling, A. and Koch, I. and Wand, M.P., Computational Statistics and
Data Analysis 52/9, 2008 for details. The constructor
curv2Filter
is a convenience function for object
instantiation. Evaluating a curv2Filter
results in potentially
multiple sub-populations, an hence in an object of class
multipleFilterResult
. Accordingly, curv2Filters
can be used to split flow cytometry data sets.
Returns a curv2Filter
object for use in filtering
flowFrame
s or other flow cytometry objects.
Class "parameterFilter"
, directly.
Class "concreteFilter"
, by class
parameterFilter
, distance 2.
Class "filter"
, by class parameterFilter
,
distance 3.
bwFac
:Object of class "numeric"
. The bandwidth
factor used for smoothing of the density estimate.
gridsize
:Object of class "numeric"
. The size
of the bins used for density estimation.
parameters
:Object of class "character"
,
describing the parameters used to filter the flowFrame
.
filterId
:Object of class "character"
,
referencing the filter.
Objects can be created by calls of the form new("curv2Filter",
...)
or using the constructor curv2Filter
. The constructor
is the recommended way of object instantiation:
signature(x = "flowFrame", table =
"curv2Filter")
: The workhorse used to evaluate the filter on
data. This is usually not called directly by the user, but
internally by calls to the filter
methods.
signature(object = "curv2Filter")
: Print
information about the filter.
See the documentation in the
flowViz
package for plotting of
curv2Filters
.
Florian Hahne
curv1Filter
,
flowFrame
,
flowSet
, filter
for evaluation
of curv2Filters
and split
for splitting of flow
cytometry data sets based on that.
library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line curv2Filter("FSC-H", "SSC-H", filterId="myCurv2Filter") ## To facilitate programmatic construction we also have the following c2f <- curv2Filter(filterId="myCurv2Filter", x=list("FSC-H", "SSC-H"), bwFac=2) c2f <- curv2Filter(filterId="myCurv2Filter", x=c("FSC-H", "SSC-H"), bwFac=2) ## Filtering using curv2Filter fres <- filter(dat, c2f) fres summary(fres) names(fres) ## The result of curv2 filtering are multiple sub-populations ## and we can split our data set accordingly split(dat, fres) ## We can limit the splitting to one or several sub-populations split(dat, fres, population="rest") split(dat, fres, population=list(keep=c("area 2", "area 3"))) curv2Filter("FSC-H", "SSC-H", filterId="test filter")
library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line curv2Filter("FSC-H", "SSC-H", filterId="myCurv2Filter") ## To facilitate programmatic construction we also have the following c2f <- curv2Filter(filterId="myCurv2Filter", x=list("FSC-H", "SSC-H"), bwFac=2) c2f <- curv2Filter(filterId="myCurv2Filter", x=c("FSC-H", "SSC-H"), bwFac=2) ## Filtering using curv2Filter fres <- filter(dat, c2f) fres summary(fres) names(fres) ## The result of curv2 filtering are multiple sub-populations ## and we can split our data set accordingly split(dat, fres) ## We can limit the splitting to one or several sub-populations split(dat, fres, population="rest") split(dat, fres, population=list(keep=c("area 2", "area 3"))) curv2Filter("FSC-H", "SSC-H", filterId="test filter")
Parse the output of
curv1Filter
and find modes
and midpoints of the high-density regions. This function is considered
to be internal.
curvPeaks(x, dat, borderQuant = 0.01, n = 201, from, to, densities=NULL)
curvPeaks(x, dat, borderQuant = 0.01, n = 201, from, to, densities=NULL)
x |
A
|
dat |
The corresponding
|
borderQuant |
A numeric in |
n , from , to
|
Arguments are passed on to
|
densities |
The optional y values of the density estimate computed for the respective data. |
A list with items
peaks |
x and y locations of the modes of the regions in the density estimates. |
regions |
the left and right margins of the regions. |
midpoints |
the mean of |
regPoints |
x and y locations of the outline of the significant density regions. |
densFuns |
an approximation function of the density estimate |
Florian Hahne
library(flowCore) data(GvHD) tmp <- filter(GvHD[[10]], curv1Filter("FSC-H")) res <- flowStats:::curvPeaks(tmp, exprs(GvHD[[10]])[, "FSC-H"])
library(flowCore) data(GvHD) tmp <- filter(GvHD[[10]], curv1Filter("FSC-H")) res <- flowStats:::curvPeaks(tmp, exprs(GvHD[[10]])[, "FSC-H"])
The function tries to find a reasonable split point between the two
hypothetical cell populations "positive" and "negative". This function
is considered internal, please use the API provided by
rangeGate
.
density1d(x, stain, alpha = "min", sd = 2, plot = FALSE, borderQuant = 0.1, absolute = TRUE, inBetween = FALSE, refLine=NULL,rare=FALSE,bwFac=1.2 ,sig=NULL,peakNr=NULL, ...)
density1d(x, stain, alpha = "min", sd = 2, plot = FALSE, borderQuant = 0.1, absolute = TRUE, inBetween = FALSE, refLine=NULL,rare=FALSE,bwFac=1.2 ,sig=NULL,peakNr=NULL, ...)
x |
|
stain |
A character scalar giving the flow parameter for which to compute the separation. |
alpha |
A tuning parameter that controls the location of the
split point between the two populations. This has to be a numeric in
the range |
sd |
For the case where there is only a single population, the
algorithm falls back to esitmating the mode of this population and a
robust measure of the variance of it distribution. The |
plot |
Create a plot of the results of the computation. |
borderQuant |
Usualy the instrument is set up in a way that the
positive population is somewhere on the high end of the measurement
range and the negative population is on the low end. This parameter
allows to disregard populations with mean values in the extreme
quantiles of the data range. It's value should be in the range
|
absolute |
Logical controling whether to classify a population
(positive or negative) relative to the theoretical measurment range
of the instrument or the actual range of the data. This can be set
to |
inBetween |
Force the algorithm to put the separator in between two peaks. If there are more than two peaks, this argument is ignored. |
refLine |
Either |
rare |
Either |
bwFac |
The bandwidth for smoothing the density estimate. User-tunable |
sig |
a value of c(NULL,"L","R"),when sig is not NULL,use the half (left or right) of signal to estimate the std and mean. |
peakNr |
when peakNr is not NULL,drop the less significant peaks by their heights |
... |
Further arguments. |
The algorithm first tries to identify high density regions in the
data. If the input is a flowSet
, density regions will be
computed on the collapsed data, hence it should have been normalized
before (see warpSet
for one possible normalization
technique). The high density regions are then clasified as positive
and negative populations, based on their mean value in the theoretical
(or absolute if argument absolute=TRUE
) measurement range. In
case there are only two high-density regions the lower one is usually
clasified as the negative populations, however the heuristics in the
algorithm will force the classification towards a positive population
if the mean value is already very high. The absolute
and
borderQuant
arguments can be used to control this
behaviour. The split point between populations will be drawn at the
value of mimimum local density between the two populations, or, if the
alpha
argument is used, somewhere between the two populations
where the value of alpha forces the point to be closer to the negative
(0 - 0.5
) or closer to the positive population (0.5 -
1
).
If there is only a single high-density region, the algorithm will fall
back to estimating the mode of the distribution
(hubers
) and a robust measure of it's variance
and, in combination with the sd
argument, set the split point
somewhere in the right or left tail, depending on the classification
of the region.
For more than two populations, the algorithm will still classify each population into positive and negative and compute the split point between those clusteres, similar to the two population case.
A numeric indicating the split point between positive and negative populations.
Florian Hahne
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL3-H"=asinh(`FL3-H`)) d <- flowStats:::density1d(dat, "FL4-H", plot=TRUE) if(require(flowViz)) densityplot(~`FL4-H`, dat, refline=d) ## tweaking the location flowStats:::density1d(dat, "FL4-H", plot=TRUE, alpha=0.8) ## only a single population flowStats:::density1d(dat, "FL3-H", plot=TRUE) flowStats:::density1d(dat, "FL3-H", plot=TRUE, sd=2)
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL3-H"=asinh(`FL3-H`)) d <- flowStats:::density1d(dat, "FL4-H", plot=TRUE) if(require(flowViz)) densityplot(~`FL4-H`, dat, refline=d) ## tweaking the location flowStats:::density1d(dat, "FL4-H", plot=TRUE, alpha=0.8) ## only a single population flowStats:::density1d(dat, "FL3-H", plot=TRUE) flowStats:::density1d(dat, "FL3-H", plot=TRUE, sd=2)
convert ellipse from cov/mu to points used to plot priors
ellipse(cov, centre, level = 0.95)
ellipse(cov, centre, level = 0.95)
The version of fdPar from fda 2.4.0 because the new API changes the output. (specifically res$fd$coefs) and thus breaks the landmarkreg call.
fdPar(fdobj = NULL, Lfdobj = NULL, lambda = 0, estimate = TRUE, penmat = NULL)
fdPar(fdobj = NULL, Lfdobj = NULL, lambda = 0, estimate = TRUE, penmat = NULL)
fdobj |
functional data object, functional basis object, a functional parameter object or a matrix. If it a matrix, it is replaced by fd(fdobj). If class(fdobj) == 'basisfd', it is converted to an object of class fd with a coefficient matrix consisting of a single column of zeros. |
Lfdobj |
either a nonnegative integer or a linear differential operator object.
If NULL, Lfdobj depends on
|
lambda |
a nonnegative real number specifying the amount of smoothing to be applied to the estimated functional parameter. |
estimate |
not currently used. |
penmat |
a roughness penalty matrix. Including this can eliminate the need to compute this matrix over and over again in some types of calculations. |
Generate a prior specification based on a flowClust model This function generates a prior specification based on a flowClust fit object It can be passed to a second round of flowClust() with usePrior="yes" The prior could be estimated from a single sample, for example, and then used to speed up the convergence for other samples.
flowClust2Prior(x, kappa, Nt = NULL, addCluster = NULL)
flowClust2Prior(x, kappa, Nt = NULL, addCluster = NULL)
x |
a flowClust fit object |
kappa |
is the fraction of equivalent observations by which to weight this prior relative to the flowClust model. |
Nt |
the number of total equivalent observation |
addCluster |
not currently supported |
We construct a singlet gate by applying a robust linear model with
rlm
. By default, we model the forward-scatter height
(FSC-H)as a function of forward-scatter area (FSC-A). If sidescatter
is given, forward-scatter height is as a function of area
+
sidescatter
+ sidescatter / area
.
gate_singlet( x, area = "FSC-A", height = "FSC-H", sidescatter = NULL, prediction_level = 0.99, subsample_pct = NULL, wider_gate = FALSE, filterId = "singlet", maxit = 5, ... ) singletGate( x, area = "FSC-A", height = "FSC-H", sidescatter = NULL, prediction_level = 0.99, subsample_pct = NULL, wider_gate = FALSE, filterId = "singlet", maxit = 5, ... )
gate_singlet( x, area = "FSC-A", height = "FSC-H", sidescatter = NULL, prediction_level = 0.99, subsample_pct = NULL, wider_gate = FALSE, filterId = "singlet", maxit = 5, ... ) singletGate( x, area = "FSC-A", height = "FSC-H", sidescatter = NULL, prediction_level = 0.99, subsample_pct = NULL, wider_gate = FALSE, filterId = "singlet", maxit = 5, ... )
x |
a |
area |
character giving the channel name that records the signal intensity as peak area |
height |
character giving the channel name that records the signal intensity as peak heightchannel name of height |
sidescatter |
character giving an optional channel name for the sidescatter signal. By default, ignored. |
prediction_level |
a numeric value between 0 and 1 specifying the level to use for the prediction bands |
subsample_pct |
a numeric value between 0 and 1 indicating the percentage
of observations that should be randomly selected from |
wider_gate |
logical value. If |
filterId |
the name for the filter that is returned |
maxit |
the limit on the number of IWLS iterations passed to |
... |
additional arguments passed to |
Because rlm
relies on iteratively reweighted least
squares (IRLS), the runtime to construct a singlet gate is dependent in part
on the number of observations in x
. To improve the runtime, we provide
an option to subsample randomly a subset of x
. A percentage of
observations to subsample can be given in subsample_pct
. By default, no
subsampling is applied.
a polygonGate
object with the singlet gate
## Not run: # fr is a flowFrame sg <- gate_singlet(fr, area = "FSC-A", height = "FSC-H") sg # plot the gate xyplot(`FSC-H` ~ `FSC-A`, fr, filter = sg) ## End(Not run)
## Not run: # fr is a flowFrame sg <- gate_singlet(fr, area = "FSC-A", height = "FSC-H") sg # plot the gate xyplot(`FSC-H` ~ `FSC-A`, fr, filter = sg) ## End(Not run)
This funciton normalizes a set of flow cytometry data samples by identifying and aligning the high density regions (landmarks or peaks) for each channel. The data of each channel is shifted in such a way that the identified high density regions are moved to fixed locations called base landmarks.
gaussNorm (flowset, channel.names, max.lms=2, base.lms=NULL, peak.density.thr=0.05, peak.distance.thr=0.05, debug=FALSE, fname='')
gaussNorm (flowset, channel.names, max.lms=2, base.lms=NULL, peak.density.thr=0.05, peak.distance.thr=0.05, debug=FALSE, fname='')
flowset |
A |
channel.names |
A character vector of flow parameters in |
max.lms |
A numeric vector of the maximum number of base landmarks to be used for normalizing each channel. If it has only one value that will be used as the maximum number of base landmarks for all the channels. |
base.lms |
A list of vector for each channel that contains the base landmarks for normalizing that channel. If not specified the base landmarks are computed from the set of extracted landmarks. |
peak.density.thr |
The peaks with density value less than "peak.density.thr times maximum peak density" are discarded. |
peak.distance.thr |
The sequences of peaks that are located closer than "peak.distance.thr times range of data" are identified. Then for each sequence only one peak (the one with the highest intensity value) is used as a landmark. In other words no two landmarks are located closer than "peak.distance.thr times range of data" to each other. |
debug |
Logical. Forces the function to draw before and
after normalization plots for each sample. The plot of the i-th sample is stored in |
fname |
The pre- and post- normalization plots of the i-th sample is
stored in |
Normalization is archived in three phases: (i) identifying high-density regions
(landmarks) for each flowFrame
in the flowSet
for a single channel; (ii) computing the best
matching between the landmarks and a set of fixed reference
landmarks for each channel called base landmarks; (iii) manipulating
the data of each channel in such a way that each landmark is moved to
its matching base landmark.
Please note that this normalization is on a channel-by-channel
basis. Multiple channels are normalized in a loop.
A list with items flowset
: normalized flowSet
.
confidence
: a confidence measure of the normalization procedure.
Alireza Hadj Khodabakhshi
library(flowCore) data(ITN) dat <- transform(ITN, "CD4"=asinh(CD4), "CD3"=asinh(CD3), "CD8"=asinh(CD8)) lg <- lymphGate(dat, channels=c("CD3", "SSC"), preselection="CD4",scale=1.5) dat <- Subset(dat, lg) datr <- gaussNorm(dat, "CD8")$flowset if(require(flowViz)){ d1 <- densityplot(~CD8, dat, main="original", filter=curv1Filter("CD8")) d2 <- densityplot(~CD8, datr, main="normalized", filter=curv1Filter("CD8")) plot(d1, split=c(1,1,2,1)) plot(d2, split=c(2,1,2,1), newpage=FALSE) }
library(flowCore) data(ITN) dat <- transform(ITN, "CD4"=asinh(CD4), "CD3"=asinh(CD3), "CD8"=asinh(CD8)) lg <- lymphGate(dat, channels=c("CD3", "SSC"), preselection="CD4",scale=1.5) dat <- Subset(dat, lg) datr <- gaussNorm(dat, "CD8")$flowset if(require(flowViz)){ d1 <- densityplot(~CD8, dat, main="original", filter=curv1Filter("CD8")) d2 <- densityplot(~CD8, datr, main="normalized", filter=curv1Filter("CD8")) plot(d1, split=c(1,1,2,1)) plot(d2, split=c(2,1,2,1), newpage=FALSE) }
This function performs a multi-dimensional normalization of flow cytometry
data (flowSets
) using a generalized Procrustes analysis (GPA) method.
gpaSet(x, params, register="backgating", bgChannels=NULL, bg=NULL, rotation.only=TRUE, downweight.missingFeatures=FALSE, thres.sigma=2.5, show.workflow=FALSE, ask=names(dev.cur())!="pdf")
gpaSet(x, params, register="backgating", bgChannels=NULL, bg=NULL, rotation.only=TRUE, downweight.missingFeatures=FALSE, thres.sigma=2.5, show.workflow=FALSE, ask=names(dev.cur())!="pdf")
x |
A |
params |
A character vector of length 2 describing the channels of interest. |
register |
A character indicating the method to be used for identifying features. Available method only includes “backgating” at the point. |
bgChannels |
A character vector indicating the channels used for
backgating. If NULL, |
bg |
A data frame as the returning value of the
|
rotation.only |
Logical for coarsing a reflection matrix to a rotation matrix. |
downweight.missingFeatures |
Logical. If TRUE, the missing features, labeled as bogus features, are downweighted to zero. See details. |
thres.sigma |
A numerical value indicating the threshold of where
to cut the tree, e.g., as resulting from |
show.workflow |
Logical. If TRUE, the workflow of
|
ask |
Logical. If TRUE, the display operates in interactive mode. |
Normalization is achieved by first identifying features for each
flowFrame
in the flowSet
for
designated channels using backgating, subsequently labeling features, and
finally aligning the features to a reference feature in the sense of
minimizing the Frobenus norm of
where is a scalar,
a rotational matrix,
the matrix of
features, and
the reference feature. Both
and
are solved by using singular value decomposition (SVD).
Note that if feature is missing, it is given a bogus
value as
.
If downweight.missingFeatures
is TRUE, the cost function
becomes
where the weighting function is zero if the corresponding
feature is bogus.
The normalized flowSet
with "GPA" attribute.
C. J. Wong [email protected]
in progress
library(flowCore) ## Example 1: calling up gpaSet directly data(ITN) data(BackGating) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") dat <- transform(ITN, tl) xy = c("FSC", "SSC") bgChannels = c("CD8", "CD4", "CD3") ## bg <- flowStats:::backGating(dat, xy=xy, channels=bgChannels) ## using pre-generated backgating results: BackGating s <- gpaSet(dat, params=xy, bgChannels=bgChannels, bg=BackGating) if(require(flowViz)) { d1 <- densityplot(~., s, channels=c("FSC", "SSC"), layout=c(2,1), main="After GPA using bg") d2 <- xyplot(FSC ~ SSC, as(s, "flowFrame"), channels=c("FSC", "SSC"), main="All flowFrames") plot(d1) plot(d2) } ## view "GPA" attribute attr(s, "GPA") ## Not run: library(flowCore) ## Example 2: using work flow and normalization objects data(ITN) ITN <- ITN[1:8, ] wf <- workFlow(ITN) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") add(wf, tl) x <- Data(wf[["asinh"]]) ## normalize 'FSC' and 'SSC' channels norm <- normalization(normFun=function(x, parameters, ...) gpaSet(x, parameters, ...), parameters = c("FSC", "SSC"), arguments=list(bgChannels=c("CD8", "CD3"), register="backgating"), normalizationId="Procrustes") add(wf, norm2, parent="asinh") s <- Data(wf[["Procrustes"]]) if(require(flowViz)) { d1 <- densityplot(~., s, channels=c("FSC", "SSC"), layout=c(2,1), main="After GPA using bg") d2 <- xyplot(FSC ~ SSC, as(s, "flowFrame"), channels=c("FSC", "SSC"), main="All flowFrames") plot(d1) plot(d2) } ## End(Not run) ## end of dontrun
library(flowCore) ## Example 1: calling up gpaSet directly data(ITN) data(BackGating) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") dat <- transform(ITN, tl) xy = c("FSC", "SSC") bgChannels = c("CD8", "CD4", "CD3") ## bg <- flowStats:::backGating(dat, xy=xy, channels=bgChannels) ## using pre-generated backgating results: BackGating s <- gpaSet(dat, params=xy, bgChannels=bgChannels, bg=BackGating) if(require(flowViz)) { d1 <- densityplot(~., s, channels=c("FSC", "SSC"), layout=c(2,1), main="After GPA using bg") d2 <- xyplot(FSC ~ SSC, as(s, "flowFrame"), channels=c("FSC", "SSC"), main="All flowFrames") plot(d1) plot(d2) } ## view "GPA" attribute attr(s, "GPA") ## Not run: library(flowCore) ## Example 2: using work flow and normalization objects data(ITN) ITN <- ITN[1:8, ] wf <- workFlow(ITN) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") add(wf, tl) x <- Data(wf[["asinh"]]) ## normalize 'FSC' and 'SSC' channels norm <- normalization(normFun=function(x, parameters, ...) gpaSet(x, parameters, ...), parameters = c("FSC", "SSC"), arguments=list(bgChannels=c("CD8", "CD3"), register="backgating"), normalizationId="Procrustes") add(wf, norm2, parent="asinh") s <- Data(wf[["Procrustes"]]) if(require(flowViz)) { d1 <- densityplot(~., s, channels=c("FSC", "SSC"), layout=c(2,1), main="After GPA using bg") d2 <- xyplot(FSC ~ SSC, as(s, "flowFrame"), channels=c("FSC", "SSC"), main="All flowFrames") plot(d1) plot(d2) } ## End(Not run) ## end of dontrun
Identify and labeling significant features using divisive
clustering method such as diana
.
idFeaturesByBackgating(bg, nDim, thres.sigma=2.5, lambda=0.1, reference.method="median", plot.workflow=FALSE, ask=names(dev.cur())!="pdf")
idFeaturesByBackgating(bg, nDim, thres.sigma=2.5, lambda=0.1, reference.method="median", plot.workflow=FALSE, ask=names(dev.cur())!="pdf")
bg |
A data frame containing subpopulations on channels of
interests. Must be a returning result from |
nDim |
An integer indicating the length of channels of interest. |
thres.sigma |
An numerical value indicating the threshold at which to cut tree, e.g., as resulting from 'diana', into several clusters. |
lambda |
A numerical value indicating the percentage of the potential features that is used as a threshold for deciding outlier clusters. The default value is 0.1. |
reference.method |
A character vector indicating the method for
computing the reference features. If |
plot.workflow |
Logical. If TURE, display the workflow of feature identification. |
ask |
Logical. If TRUE, the display operates in interactive mode. |
Using the resulting data frame from backGating
as potential
features, the algorithm follows four major steps: (i) centering
the potential features, which yields the returning value
TransMatrix
, (ii) using diana
to compute a clustering of
the potential features, (iii) cutting the tree into several clusters,
and (iv) accessing outliers and rendering the final registered
features with labels.
In step three, the threshold for cutting the tree is computed by
where is the standard deviation of the distribution of the
height between entities computed by
diana
.
A cluster is determined as an outlier if the number of its members is less than the median of the numbers of all clusters' members times 'lambda'.
register |
A list containing registered features for each sample. |
Chao-Jen Wong
## Not run: library(flowCore) data(ITN) wf <- workFlow(ITN) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") dat <- trnasformList(ITN, tl) bg <- backGating(dat, xy=c("FSC", "SSC"), channels="CD3") ## End(Not run) data(BackGating) results <- flowStats:::idFeaturesByBackgating(bg=BackGating, nDim=2, plot.workflow=TRUE, ask=TRUE)
## Not run: library(flowCore) data(ITN) wf <- workFlow(ITN) tl <- transformList(colnames(ITN)[3:7], asinh, transformationId="asinh") dat <- trnasformList(ITN, tl) bg <- backGating(dat, xy=c("FSC", "SSC"), channels="CD3") ## End(Not run) data(BackGating) results <- flowStats:::idFeaturesByBackgating(bg=BackGating, nDim=2, plot.workflow=TRUE, ask=TRUE)
Based on generalized Procrustes analysis, this function determines a
linear transformation (rotation/reflection and scalling) of the points in
matrix x
to align them to their reference points
in matrix xbar
. The alignemnt is carried out by
minimizing the distance between the points in x
and xbar
.
iProcrustes(x, xbar, rotation.only=TRUE, scalling=TRUE, translate=FALSE)
iProcrustes(x, xbar, rotation.only=TRUE, scalling=TRUE, translate=FALSE)
x |
A numerical matrix to be align to points in |
xbar |
A numerical, reference matrix to which points in matrix |
rotation.only |
Logical. When |
scalling |
Logical. When |
translate |
Logical. Set |
Suppose the points in matrix and
are centralized
(meaning their centroids are at the origin). The
linear transformation of
for aligning
to its reference
matrix
., i.e., min
, is given by:
and
where V and U are the sigular value vectors of (that is,
), and
is the scalling factor.
A list of the linear tranformation with items
Q |
An orthogonal, rotation/reflection matrix. |
scal |
A scalling factor |
.
T |
(optional) A translation vector used to shift the centroid of the
points in matrix |
T.xbar |
(optional) Centered |
Note that the return values of this function do not include the transformed
matrix or
, where
is the
translation vector and
is an
vector with elements
.
C. J. Wong [email protected]
## Example 1 x <- matrix(runif(20), nrow=10, ncol=2)+ 1.4 s <- matrix(c(cos(60), -sin(60), sin(60), cos(60)), nrow=2, ncol=2, byrow=TRUE) xbar <- 2.2 *(x %*% s) - 0.1 lt <- iProcrustes(x, xbar, translate=TRUE) ## return linear transformation lt ## showing result I <- matrix(1, nrow=nrow(x), ncol=1) tx <- x - I %*% lt$T ## get the transformed matrix xnew xnew <- lt$scal * (tx %*% lt$Q) if (require(lattice)) { xyplot(V1 ~ V2, do.call(make.groups, lapply(list(x=x, xbar=xbar, T.xbar=lt$T.xbar, xnew=xnew),as.data.frame)), group=which, aspect=c(0.7), pch=c(1,3,2,4), col.symbol="black", main=("Align the points in x to xbar"), key=list(points=list(pch=c(1,3,2,4), col="black"), space="right", text=list(c("x", "xbar", "T.xbar", "xnew")))) } ## Example 2. centralized x and xbar prior to using iProcrustes x <- matrix(runif(10), nrow=5, ncol=2) s <- matrix(c(cos(60), -sin(60), sin(60), cos(60)), nrow=2, ncol=2, byrow=TRUE) xbar <- 1.2 *(x %*% s) - 2 I <- matrix(1, nrow=nrow(x), ncol=1) x <- x-(I %*% colMeans(x)) ## shift the centroid of points in x to the origin xbar <- xbar - (I %*% colMeans(xbar)) ## shift centroid to the origin lt <- iProcrustes(x, xbar, translate=FALSE) ## return linear transformation ## only return the rotation/reflection matrix and scalling factor lt xnew=lt$scal *(x %*% lt$Q) ## transformed matrix aligned to centralized xbar if (require(lattice)) { xyplot(V1 ~ V2, do.call(make.groups, lapply(list(x=x,xbar=xbar, xnew=xnew), as.data.frame)), group=which, auto.key=list(space="right")) }
## Example 1 x <- matrix(runif(20), nrow=10, ncol=2)+ 1.4 s <- matrix(c(cos(60), -sin(60), sin(60), cos(60)), nrow=2, ncol=2, byrow=TRUE) xbar <- 2.2 *(x %*% s) - 0.1 lt <- iProcrustes(x, xbar, translate=TRUE) ## return linear transformation lt ## showing result I <- matrix(1, nrow=nrow(x), ncol=1) tx <- x - I %*% lt$T ## get the transformed matrix xnew xnew <- lt$scal * (tx %*% lt$Q) if (require(lattice)) { xyplot(V1 ~ V2, do.call(make.groups, lapply(list(x=x, xbar=xbar, T.xbar=lt$T.xbar, xnew=xnew),as.data.frame)), group=which, aspect=c(0.7), pch=c(1,3,2,4), col.symbol="black", main=("Align the points in x to xbar"), key=list(points=list(pch=c(1,3,2,4), col="black"), space="right", text=list(c("x", "xbar", "T.xbar", "xnew")))) } ## Example 2. centralized x and xbar prior to using iProcrustes x <- matrix(runif(10), nrow=5, ncol=2) s <- matrix(c(cos(60), -sin(60), sin(60), cos(60)), nrow=2, ncol=2, byrow=TRUE) xbar <- 1.2 *(x %*% s) - 2 I <- matrix(1, nrow=nrow(x), ncol=1) x <- x-(I %*% colMeans(x)) ## shift the centroid of points in x to the origin xbar <- xbar - (I %*% colMeans(xbar)) ## shift centroid to the origin lt <- iProcrustes(x, xbar, translate=FALSE) ## return linear transformation ## only return the rotation/reflection matrix and scalling factor lt xnew=lt$scal *(x %*% lt$Q) ## transformed matrix aligned to centralized xbar if (require(lattice)) { xyplot(V1 ~ V2, do.call(make.groups, lapply(list(x=x,xbar=xbar, xnew=xnew), as.data.frame)), group=which, auto.key=list(space="right")) }
A flowSet
cotaining data from 15
patients.
data(ITN)
data(ITN)
A flowSet
containing 15
flowFrames
. There are 3
patient groups with 5 samples each.
Immune Tolerance Network
This functions first identifies high-density regions for each
flowFrame
in a
flowSet
and subsequently tries
to cluster these regions, yielding the landmarks matrix that needs to
be supplied to landmarkreg
. The function is
considered to be internal.
landmarkMatrix(data, fres, parm, border=0.05, peakNr=NULL, densities = NULL, n = 201, indices=FALSE)
landmarkMatrix(data, fres, parm, border=0.05, peakNr=NULL, densities = NULL, n = 201, indices=FALSE)
data |
A |
fres |
A list of
|
parm |
Character scalar of flow paramater to compute landmarks for. |
border |
A numeric in |
peakNr |
Force a fixed number of peaks. |
densities |
An optional matrix of y values of the density
estimates for the |
n |
Number of bins used for the density estimation. |
indices |
Return matrix of population indices instead of landmark locations. These indices can be used to point into the populations identified by the curv1Filter. |
In order to normalize the data using the landmarkreg
function
in the fda
, a set of landmarks has to
be computed for each flowFrame
in a flowSet
. The number
of lansmarks has to be the same for each frame. This function
identifies high-density regions in each frame, computes a simple
clustering and returns a matrix of landmark locations. Missing
landmarks of individual frames are substituted by the mean landmark
location of the respective cluster.
A matrix of landmark locations. Columns are landmarks and rows are
flowFrames
.
Florian Hahne
library(flowCore) data(GvHD) tmp <- list("FSC-H"=filter(GvHD[1:3], curv1Filter("FSC-H"))) res <- flowStats:::landmarkMatrix(GvHD[1:3], tmp, "FSC-H")
library(flowCore) data(GvHD) tmp <- list("FSC-H"=filter(GvHD[1:3], curv1Filter("FSC-H"))) res <- flowStats:::landmarkMatrix(GvHD[1:3], tmp, "FSC-H")
Cell populations of roughly elliptical shape in two-dimensional projections are of huge interest in many flow cytometry applications. This function identifies a single such population, potentially from a mixture of multiple populations.
lymphGate(x, channels, preselection=NULL, scale=2.5, bwFac=1.3, filterId="defaultLymphGate", plot=FALSE, ...)
lymphGate(x, channels, preselection=NULL, scale=2.5, bwFac=1.3, filterId="defaultLymphGate", plot=FALSE, ...)
x |
An object of class |
channels |
A character vector of length 2 of valid flow
parameters in |
preselection |
Either |
scale |
The |
bwFac |
The bandwidth factor that gets passed on to
|
filterId |
A character used as filterId. |
plot |
Logical. Produce plots of filter results |
... |
Additional arguments. |
This algorithm does not apply real mixture modelling, however it is able to identify a single elliptical cell population from a mixture of multiple such populations. The idea is to first define a rough rectangular preselection and, in a second step, fit a bivariate normal distribution to this subset only.
Depending on the value of preselection
, the initial rough
selection is either
No preselection at all
Preselection based on cells that are
positive for a single marker only. This allows for back-gating,
for instances by selecting CD4+
T-cells and using this
information to back-gate lymphocytes in FSC
and
SSC
. Positive cells are identified using a
curv1Filter
.
Preselection by a rectangular gate. The items of the list have to be numerics of length one giving the gate boundaries in the respective dimensions.
An ellipsoidGate
or list
of ellipsoidGate
objects
Class parameterFilter
,
directly.
Class concreteFilter
,
by class "parameterFilter", distance 2.
Class filter
, by class
"parameterFilter", distance 3.
See Arguments
section for details.
preselection
:Object of class character
, the
name of the flow parameter used for preselection.
rectDef
:Object of class list
, the initial
rectangular selection.
scale
:Object of class numeric
.
bwFac
:Object of class numeric
.
parameters
:Object of class parameters
, the flow
parameters to operate on.
filterId
:Object of class "character"
, the
filter identifier.
Objects can be created by calls of the form new("lymphFilter",
parameters, ...)
or using the constructor lymphFilter
. The
constructor is the recommended way of object instantiation.
Florian Hahne
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`)) lg <- lymphGate(dat, channels=c("FSC-H", "SSC-H"), preselection="FL4-H",scale=1.5) if(require(flowViz)) xyplot(`SSC-H`~`FSC-H`, dat, filter=lg)
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`)) lg <- lymphGate(dat, channels=c("FSC-H", "SSC-H"), preselection="FL4-H",scale=1.5) if(require(flowViz)) xyplot(`SSC-H`~`FSC-H`, dat, filter=lg)
Generate a flowClust prior specification from gates and data
mkPrior(gate, data, nu0, Omega0, ...) ## S4 method for signature 'polygonGate,flowFrame,numeric,matrix' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'rectangleGate,flowFrame,numeric,matrix' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'rectangleGate,flowFrame,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0 = NA) ## S4 method for signature 'polygonGate,flowFrame,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0 = NA) ## S4 method for signature 'list,flowSet,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full") ## S4 method for signature 'missing,flowSet,ANY,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full") ## S4 method for signature 'missing,flowFrame,missing,missing' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'list,flowSet,ANY,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full")
mkPrior(gate, data, nu0, Omega0, ...) ## S4 method for signature 'polygonGate,flowFrame,numeric,matrix' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'rectangleGate,flowFrame,numeric,matrix' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'rectangleGate,flowFrame,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0 = NA) ## S4 method for signature 'polygonGate,flowFrame,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0 = NA) ## S4 method for signature 'list,flowSet,missing,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full") ## S4 method for signature 'missing,flowSet,ANY,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full") ## S4 method for signature 'missing,flowFrame,missing,missing' mkPrior(gate, data, nu0, Omega0) ## S4 method for signature 'list,flowSet,ANY,missing' mkPrior(gate, data, nu0 = NA, Omega0, model.cov = "full", model.means = "full")
gate |
A list of flowCore gates. The gates should represent the SAME population gated across multiple samples. |
data |
A flowSet of the same size as the number of gates above. Each flowFrame in the flowSet should contain the events representing the population in its corresponding gate. i.e. it should be the gated data. |
nu0 |
The nu0 hyperparameter. For estimation from data, it should be nu0=NA. |
Omega0 |
The Omega0 hyperparameter. For estimation from data it can be missing. |
... |
Not currently used. |
model.cov , model.means
|
model names used for cov and means. one of c("full","DE","DU"). "full" is the default. |
Construct a prior specification. Generally not called by the user.
Return values depend on the specific method called. Not meant for user consumption.
Greg Finak [email protected]
## The function is currently defined as
## The function is currently defined as
Class and constructors for a filter
that fits a bivariate
normal distribution to a data set of paired values and selects data points
according to their standard deviation from the fitted distribution.
norm2Filter(x, y, method="covMcd", scale.factor=1, n=50000, filterId="defaultNorm2Filter")
norm2Filter(x, y, method="covMcd", scale.factor=1, n=50000, filterId="defaultNorm2Filter")
x , y
|
Characters giving the names of the measurement parameter on which
the filter is supposed to work on. |
filterId |
An optional parameter that sets the |
scale.factor , n
|
Numerics of length 1, used to set the
|
method |
Character in |
The filter fits a bivariate normal distribution to the data and selects all
events within the Mahalanobis distance multiplied by the scale.factor
argument. The constructor norm2Filter
is a convenience function for
object instantiation. Evaluating a curv2Filter
results in an object
of class logicalFilterResult
. Accordingly, norm2Filters
can be used to subset and to split flow cytometry data sets.
Returns a norm2Filter
object for use in filtering
flowFrame
s or other flow cytometry objects.
method
One of covMcd
or cov.rob
defining method used for computation of covariance matrix.
scale.factor
Numeric vector giving factor of standard
deviations used for data selection (all points within
scalefac
standard deviations are selected).
n
Object of class "numeric"
, the number of
events used to compute the covariance matrix of the bivariate
distribution.
filterId
Object of class "character"
referencing the filter.
parameters
Object of class "ANY"
describing
the parameters used to filter the flowFrame
or
flowSet
.
Class "parameterFilter"
, directly.
Class "concreteFilter"
, by class
parameterFilter
, distance 2.
Class "filter"
, by class parameterFilter
,
distance 3.
Objects can be created by calls of the form new("norm2Filter",
...)
or using the constructor norm2Filter
. The constructor
is the recommended way.
signature(x = "flowFrame", table =
"norm2Filter")
: The workhorse used to evaluate the filter on
data. This is usually not called directly by the user, but
internally by calls to the filter
methods.
signature(object = "norm2Filter")
: Print
information about the filter.
See the documentation in the flowViz
package for plotting of norm2Filters
.
F. Hahne
cov.rob
, CovMcd
,
filter
for evaluation of
norm2Filters
and split
and Subset
for
splitting and subsetting of flow cytometry data sets based on that.
library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line norm2Filter("FSC-H", "SSC-H", filterId="myCurv2Filter") ## To facilitate programmatic construction we also have the following n2f <- norm2Filter(filterId="myNorm2Filter", x=list("FSC-H", "SSC-H"), scale.factor=2) n2f <- norm2Filter(filterId="myNorm2Filter", x=c("FSC-H", "SSC-H"), scale.factor=2) ## Filtering using norm2Filter fres <- filter(dat, n2f) fres summary(fres) ## The result of norm2 filtering is a logical subset Subset(dat, fres) ## We can also split, in which case we get those events in and those ## not in the gate as separate populations split(dat, fres)
library(flowCore) ## Loading example data dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore")) ## Create directly. Most likely from a command line norm2Filter("FSC-H", "SSC-H", filterId="myCurv2Filter") ## To facilitate programmatic construction we also have the following n2f <- norm2Filter(filterId="myNorm2Filter", x=list("FSC-H", "SSC-H"), scale.factor=2) n2f <- norm2Filter(filterId="myNorm2Filter", x=c("FSC-H", "SSC-H"), scale.factor=2) ## Filtering using norm2Filter fres <- filter(dat, n2f) fres summary(fres) ## The result of norm2 filtering is a logical subset Subset(dat, fres) ## We can also split, in which case we get those events in and those ## not in the gate as separate populations split(dat, fres)
The method will step through the gating hierarchy in a breadth first search manner and normalize each dimension and gate not explicitly excluded in skipdims,or skipgates. The normalization approach is based on warpSet, but uses sequential normalization to alternately normalize then perform gating of the cell populations. This often helps with feature registration of popualtions lower in the gating hierarchy. FSC and SSC, as well as time are generally excluded by default. The rule of thumb, is to only normalize a channel in a gate if it is absolutely warranted.
normalize(data,x,...)
normalize(data,x,...)
data |
The |
x |
|
... |
Arguments passed to downstream functions. target:
The target sample to normalize the other samples in the gating set to. A populations:
A dims:
A chunksize:
For a memory-efficient implementation of normalization, set the nPeaks:
A bwFac:
The bandwidth for density estimation, a |
This function implements sequential normalization using a GatingSet and a set of manual gates. For each gate in the gating hierarchy, the algorithm checks if the gate should be normalized, and which dimensions in the gate should be normalized. If normalization is warranted, this is performed prior to gating. After gating, the counts for the gate in the GatingSet are updated, and the next gate is processed. This is useful in the application of template gates to data that has staining variability in one or more channels.
Returns a GatingSet
of normalized data.
Greg Finak <[email protected]>
See also GatingSet-class
, GatingHierarchy-class
, ncdfFlowSet
## Not run: #gs is a GatingSet gs_norm <- normalize(gs , target = "M+T panel_903997-25.fcs" , populations = "cd27gate" , dims = "<Violet A 610/20-A>" , minCountThreshold = 100 , nPeaks = list('cd27gate' = 2) , chunksize = 10 , bwFac = 2 ) #show the population statistics for before and after normalization getPopStats(gs) getPopStats(gs_norm) #plot the gate to see the effects of normalization grid.arrange( plotGate(gs, "cd27gate", type = "densityplot", stack = T) ,plotGate(gs_norm, "cd27gate", type = "densityplot", stack = T) ) ## End(Not run)
## Not run: #gs is a GatingSet gs_norm <- normalize(gs , target = "M+T panel_903997-25.fcs" , populations = "cd27gate" , dims = "<Violet A 610/20-A>" , minCountThreshold = 100 , nPeaks = list('cd27gate' = 2) , chunksize = 10 , bwFac = 2 ) #show the population statistics for before and after normalization getPopStats(gs) getPopStats(gs_norm) #plot the gate to see the effects of normalization grid.arrange( plotGate(gs, "cd27gate", type = "densityplot", stack = T) ,plotGate(gs_norm, "cd27gate", type = "densityplot", stack = T) ) ## End(Not run)
Create QA plots for a flow cytometry normalization process.
normQA(data, morph = c("^fsc", "^ssc"), channels, odat = NULL, ask = names(dev.cur()) != "pdf", grouping = NULL, tag.outliers = FALSE, peaksOnly = TRUE)
normQA(data, morph = c("^fsc", "^ssc"), channels, odat = NULL, ask = names(dev.cur()) != "pdf", grouping = NULL, tag.outliers = FALSE, peaksOnly = TRUE)
data |
a normalized |
morph |
A character vector of channel names to use for the backgating into the morphological channels. |
channels |
The channels for which to create plots. Defaults to all normalized channels. |
odat |
The original data set, always needed if there are no warping functions available. |
ask |
Ask before creating a new plot. |
grouping |
A grouping variable in |
tag.outliers |
Logical. Add sample name to outliers in the plots. |
peaksOnly |
Logical. Only use data when a peak was detected in a
particular sample. If set to |
This function assumes that the necessary information has been added as
attributes to data
during the normalization procedure. Depending
on the available information, a set of QA plots is generated.
Available plots are:
Amount of peak adjustment
Warping functions
Landmark classification confidence
Backgating of peak events in morphological channels
This function is called for its side effect of generating plots.
Florian Hahne
This function computes an Overton-like subtraction of two densities. It calculates the proportion of the reference density that is above a reference
overton_like(ref, test, twosided = FALSE)
overton_like(ref, test, twosided = FALSE)
ref |
The reference channel specified as a |
test |
The test (potentially positive) channel specified as a |
twosided |
|
The test can be one-sided or two-sided. If one sided, it tests the region of the test density that is above the mode of the reference density. If two-sided it will look at the regions on either side of the mode of the reference density. Densities are computed on a grid of 1024, and appropriately normalized.
numeric
value representing the proportion of the area of the test density above the reference density.
Greg Finak
A = rnorm(10000,mean=1,sd=0.5) B = rnorm(10000,mean=2,sd=0.5) overton_like(A,B)
A = rnorm(10000,mean=1,sd=0.5) B = rnorm(10000,mean=2,sd=0.5) overton_like(A,B)
This function is useful in visualizing the differences between the binned control and sample datasets. The bins generated from the control dataset are overlaid with the sample dataset. An optional argument residuals can be used to shade each bin based on a calculated statistical measure of difference between the number of events in each bin.
plotBins(binRes,data,channels,title,residuals,shadeFactor)
plotBins(binRes,data,channels,title,residuals,shadeFactor)
binRes |
The result generated by calling the |
data |
An object of class
|
channels |
The flow parameters to be plotted.In cases where more
than two parameters are binned from the control set, the
|
title |
Optional title for the plot generated |
residuals |
A vector of length equal to the number of bins
generated that can be used to shade each bin. The residuals from the
|
shadeFactor |
Optional argument between 0 and 1 that controls the intensity of the shading of bins |
Nishant Gopalakrishnan
proBin
, calcPearsonChi
,
calcPBChiSquare
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]],200,channels=c("FSC-H","SSC-H")) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]]) stat<-calcPearsonChi(resCtrl,resSample) dev.new() plotBins(resCtrl,data=GvHD[[16]],channels=c("FSC-H","SSC-H","Time"),title="Comparision 1 & 16", residuals=stat$residuals[2,],shadeFactor=0.7)
library(flowCore) data(GvHD) # flow frame 1 is treated as control dataset and used to generate bins resCtrl<-proBin(GvHD[[1]],200,channels=c("FSC-H","SSC-H")) plotBins(resCtrl,GvHD[[1]],channels=c("FSC-H","SSC-H"),title="Binned control data") # Same bins are applied to flowFrame 16 resSample<-binByRef(resCtrl,GvHD[[16]]) stat<-calcPearsonChi(resCtrl,resSample) dev.new() plotBins(resCtrl,data=GvHD[[16]],channels=c("FSC-H","SSC-H","Time"),title="Comparision 1 & 16", residuals=stat$residuals[2,],shadeFactor=0.7)
Plots a flowClust prior overlaid on data.
plotPrior(data, prior, dims = NULL, ...)
plotPrior(data, prior, dims = NULL, ...)
data |
On object of class "flowFrame". The data to be plotted. |
prior |
An object of class "flowClustPrior", or "flowClustPriorList",
returned by a call to |
dims |
A character vector of the dimensions to be included in the plot. The dimension names should match column names in the prior and in the flowFrame. |
... |
Additional arguments to plotting functions, such as
|
Generates a plot of a "flowClustPrior" or "flowClustPriorList" object overlaid on some data. Plots the prior means (Mu0), prior covariance of the means (Omega0), and prior sample covariance (Lambda0).
Silently returns zero.
Greg Finak <[email protected]>
This function divides the flowframe events into bins such that each bin contains the same number of events. The number of events falling into each bin can then be compared across the control and test samples using statistical methods such as the Chi-squared test.
proBin(m, minEvents=500,channels=NULL)
proBin(m, minEvents=500,channels=NULL)
m |
An object of class
|
minEvents |
The |
channels |
A character vector for the Flourescence channels on which probability binning is to be performed. Defaults is NULL, in which case, all flourescence channels are used for probability binning.( Time information, if provided in the flowFrame is discarded) |
The flowSet
is first filtered using a rectangleGate
and
the norm2Filter
is subsequently fitted to the remaining subset.
A list with items:
table |
A The dataIndx column provides indexes for retrieving the matrices
during each stage of the binning process from the enviroment
The parent field indicates the row number in the table that holds the parent information for the corresponding node. The left and right columns indicates the row numbers in the table that stores information regarding the children of that particular node. The leaf nodes that hold the binned data can be identified by the nodes with the left of right values of zero( ie. no children nodes) The visited column is used internally by the algorithm to check if a particular node has been visited during the computation process. |
data |
An enviroment in which the matrices generated during each stage of the probability binning process is stored. The matrices stored at the leaf nodes represent the binned events obtained after the stop criterion of minEvents has been achieved. These can be identified by the corresponding dataIndx fields provided by the rows in the table with the left or right column values of zero. |
limits |
A list containing the the boundaries of each hyperplane generated during probability binning |
splitPars |
A splitMed - The median value which was used as the threshold for
splitting the The splitCol and splitMed parameters are utilized by the plotBins and shadeBins functions in visualizing the differences between control and test sample cases. |
Nishant Gopalakrishnan
library(flowCore) data(GvHD) res<-proBin(GvHD[[1]],200,channels=c("FSC-H","SSC-H","FL1-H","FL4-H"))
library(flowCore) data(GvHD) res<-proBin(GvHD[[1]],200,channels=c("FSC-H","SSC-H","FL1-H","FL4-H"))
This function tries to find the most likely separation of two-dimensional flow cytometry in four quadrants.
quadrantGate(x, stains, alpha=c("min", "min"), sd=c(2, 2), plot=FALSE, filterId="defaultQuadGate", refLine.1=NULL, refLine.2=NULL ,rare=c(FALSE,FALSE) ,sig=c(NULL,NULL) ,...)
quadrantGate(x, stains, alpha=c("min", "min"), sd=c(2, 2), plot=FALSE, filterId="defaultQuadGate", refLine.1=NULL, refLine.2=NULL ,rare=c(FALSE,FALSE) ,sig=c(NULL,NULL) ,...)
x |
|
stains |
A character vector of length two giving the two flow parameters for which the quad gate is to be computed. |
alpha , sd
|
Tuning factors to control the computation of the
gate boundaries. See |
plot |
Logical. Produce plots of intermediate results. |
filterId |
Character, the name assigned to the resulting filter. |
refLine.1 |
Either |
refLine.2 |
Either |
rare |
logical flags for two channels, Refer to |
sig |
parameters for two channels. Refer to |
... |
Additional arguments |
The most likely separation between postitive and negative stains for
two-dimensional data is computed based on density
estimates. Essentially, the gate parameters are first fitted
separately for the two parameters and later combined. See the
documentation for rangeGate
for details. There is a
certain amount of heuristics involved in this process. The algorithm
can be slightly tweaked using the alpha
and sd
arguments. Their values will be recycled for the two dimensions unless
explicitely given as vectors of length 2.
An object of class quadGate
.
Florian Hahne
## Not run: library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL2-H"=asinh(`FL2-H`)) qg <- quadrantGate(dat, c("FL2-H", "FL4-H")) qg if(require(flowViz)) xyplot(`FL2-H`~`FL4-H`, dat, filter=qg) qg <- quadrantGate(dat, c("FL2-H", "FL4-H"), alpha=c(0.1, 0.9), plot=TRUE) qg split(dat, qg) ## End(Not run)
## Not run: library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL2-H"=asinh(`FL2-H`)) qg <- quadrantGate(dat, c("FL2-H", "FL4-H")) qg if(require(flowViz)) xyplot(`FL2-H`~`FL4-H`, dat, filter=qg) qg <- quadrantGate(dat, c("FL2-H", "FL4-H"), alpha=c(0.1, 0.9), plot=TRUE) qg split(dat, qg) ## End(Not run)
The function tries to find a reasonable split point between the two hypothetical cell populations "positive" and "negative".
rangeGate(x, stain, alpha="min", sd=2, plot=FALSE, borderQuant=0.1, absolute=TRUE, filterId="defaultRectangleGate", positive=TRUE, refLine=NULL, simple = FALSE,...) rangeFilter(stain, alpha="min", sd=2, borderQuant=0.1, filterId="defaultRangeFilter")
rangeGate(x, stain, alpha="min", sd=2, plot=FALSE, borderQuant=0.1, absolute=TRUE, filterId="defaultRectangleGate", positive=TRUE, refLine=NULL, simple = FALSE,...) rangeFilter(stain, alpha="min", sd=2, borderQuant=0.1, filterId="defaultRangeFilter")
x |
|
stain |
A character scalar giving the flow parameter for which to compute the separation. |
alpha |
A tuning parameter that controls the location of the
split point between the two populations. This has to be a numeric in
the range |
sd |
For the case where there is only a single population, the
algorithm falls back to esitmating the mode of this population and a
robust measure of the variance of it distribution. The |
plot |
Create a plot of the results of the computation. |
borderQuant |
Usualy the instrument is set up in a way that the
positive population is somewhere on the high end of the measurement
range and the negative population is on the low end. This parameter
allows to disregard populations with mean values in the extreme
quantiles of the data range. It's value should be in the range
|
absolute |
Logical controling whether to classify a population
(positive or negative) relative to the theoretical measurment range
of the instrument or the actual range of the data. This can be set
to |
filterId |
Character, the name assigned to the resulting filter. |
positive |
Create a range gate that includes the positive
( |
refLine |
Either |
simple |
|
... |
Further arguments. |
The algorithm first tries to identify high density regions in the
data. If the input is a flowSet
, density regions will be
computed on the collapsed data, hence it should have been normalized
before (see warpSet
for one possible normalization
technique). The high density regions are then clasified as positive
and negative populations, based on their mean value in the theoretical
(or absolute if argument absolute=TRUE
) measurement range. In
case there are only two high-density regions the lower one is usually
clasified as the negative populations, however the heuristics in the
algorithm will force the classification towards a positive population
if the mean value is already very high. The absolute
and
borderQuant
arguments can be used to control this
behaviour. The split point between populations will be drawn at the
value of mimimum local density between the two populations, or, if the
alpha
argument is used, somewhere between the two populations
where the value of alpha forces the point to be closer to the negative
(0 - 0.5
) or closer to the positive population (0.5 -
1
).
If there is only a single high-density region, the algorithm will fall
back to estimating the mode of the distribution
(hubers
) and a robust measure of it's variance
and, in combination with the sd
argument, set the split point
somewhere in the right or left tail, depending on the classification
of the region.
For more than two populations, the algorithm will still classify each population into positive and negative and compute the split point between those clusteres, similar to the two population case.
The rangeFilter
class and constructor provide the means to
treat rangeGate
as regular flowCore
filters.
A range gate, more explicitely an object of class
rectangleGate
.
signature(x =
"flowFrame", table = "rangeFilter")
: the work horse for doing the
actual filtering. Internally, this simply calls the rangeGate
function.
Florian Hahne, Kyongryun Lee
warpSet
, rangeGate
,
rectangleGate
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL3-H"=asinh(`FL3-H`)) rg <- rangeGate(dat, "FL4-H", plot=TRUE) rg split(dat, rg) ## Test rangeGate when settting refLine=0; it does not do anything since ## there is no sub-population below zero. rangeGate(dat, "FL4-H", plot=FALSE, refLine=0) rf <- rangeFilter("FL4-H") filter(dat, rf)
library(flowCore) data(GvHD) dat <- GvHD[pData(GvHD)$Patient==10] dat <- transform(dat, "FL4-H"=asinh(`FL4-H`), "FL3-H"=asinh(`FL3-H`)) rg <- rangeGate(dat, "FL4-H", plot=TRUE) rg split(dat, rg) ## Test rangeGate when settting refLine=0; it does not do anything since ## there is no sub-population below zero. rangeGate(dat, "FL4-H", plot=FALSE, refLine=0) rf <- rangeFilter("FL4-H") filter(dat, rf)
This function can be used to generate a sample from a multivariate
mixture model with Box-Cox transformation.
SimulateMixture(N, w, mu, sigma, nu = 4, lambda)
SimulateMixture(N, w, mu, sigma, nu = 4, lambda)
N |
The number of observations. |
w |
A vector of length |
mu |
A matrix of size |
sigma |
An array of dimension |
nu |
The degrees of freedom used for the |
lambda |
The Box-Cox transformation parameter. If missing, the
conventional |
A matrix of size .
Raphael Gottardo <[email protected]>, Kenneth Lo <[email protected]>
### Number of components K <- 5 ### Dimension p <- 2 ### Number of observations n <- 200 Mu <- matrix(runif(K*p, 0, 20), K, p) Sigma <- array(0, c(K, p, p)) for (k in 1:K) { Sigma[k,,][outer(1:p, 1:p, ">")] <- runif(p*(p-1)/2,-.1,.1) diag(Sigma[k,,]) <- runif(p,0,1) ### Make sigma positive definite Sigma[k,,] <- Sigma[k,,] %*% t(Sigma[k,,]) } ### Generate the weights w <- rgamma(K,10,1) w <- w/sum(w) y <- SimulateMixture(n, w, Mu, Sigma, nu=4)
### Number of components K <- 5 ### Dimension p <- 2 ### Number of observations n <- 200 Mu <- matrix(runif(K*p, 0, 20), K, p) Sigma <- array(0, c(K, p, p)) for (k in 1:K) { Sigma[k,,][outer(1:p, 1:p, ">")] <- runif(p*(p-1)/2,-.1,.1) diag(Sigma[k,,]) <- runif(p,0,1) ### Make sigma positive definite Sigma[k,,] <- Sigma[k,,] %*% t(Sigma[k,,]) } ### Generate the weights w <- rgamma(K,10,1) w <- w/sum(w) y <- SimulateMixture(n, w, Mu, Sigma, nu=4)
flowSet
for use with spillover
by matching channel
names to compensation control filenamesSpillover information for a particular experiment is often obtained by
running several tubes of beads or cells stained with a single color that can
then be used to determine a spillover matrix for use with
compensate
.
This method facilitates construction of a flowSet
of compensation
control flowFrame
s using a simple file linking filenames to channels.
This resulting flowSet
can then be used with spillover
using the option prematched = TRUE
.
Matching stain channels to compensation controls is done via a csv
file (matchfile
) with columns 'filename' and 'channel'. The 'channel' entries should
exactly match the channel names in the FCS files. The 'filename' should be
the FCS file name of each compensation control which should also be the
corresponding sample name in the flowSet
. There should also be one unstained
control with the 'channel' entry of 'unstained'.
The method also allows for x
to be missing if path
is provided,
pointing to a directory containing the control FCS files.
## S4 method for signature 'flowSet' spillover_match(x, fsc = "FSC-A", ssc = "SSC-A", matchfile = NULL, path) ## S4 method for signature 'missing' spillover_match(x, fsc = "FSC-A", ssc = "SSC-A", matchfile, path)
## S4 method for signature 'flowSet' spillover_match(x, fsc = "FSC-A", ssc = "SSC-A", matchfile = NULL, path) ## S4 method for signature 'missing' spillover_match(x, fsc = "FSC-A", ssc = "SSC-A", matchfile, path)
x |
A flowSet of compensation beads or cells |
fsc |
The name or index of the forward scatter parameter |
ssc |
The name or index of the side scatter parameter |
matchfile |
The name or path of the csv file holding the compensation control file to channel matching information. |
path |
The name or path of the directory containing the control FCS files to be matched to channels by matchfile. |
A flowSet
with the sample names of its flowFrames
corresponding to the channels specified by the matchfile.
B. Ellis, J. Wagner
flowSet
, simplified APISpillover information for a particular experiment is often obtained by
running several tubes of beads or cells stained with a single color that can
then be used to determine a spillover matrix for use with
compensate
.
Matching stain channels to compensation controls is done via a matching csv
file (at the path given by matchfile
) with columns 'filename' and 'channel'.
The 'channel' entries should exactly match the channel names in the FCS files.
The 'filename' should be the FCS file name of each compensation control which
should also be the corresponding sample name in the flowSet
.
There should also be one unstained control with the 'channel' entry of 'unstained'.
The method also allows for x
to be missing if path
is provided,
pointing to a directory containing the control FCS files.
By default, pregating is always done on the channels using this API, and the
mode of the channel is used to compute the spillover matrix. FSC and SSC channels
can be provided to allow a pregating on (approximately) a population in the FSC and
SSC dimensions. Also by default, a norm2Filter
is applied before
computing the spillover. These defaults can be overridden using the pregate
,
method
, and useNormFilt
arguments.
## S4 method for signature 'flowSet' spillover_ng(x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ...) ## S4 method for signature 'missing' spillover_ng(x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ...) ## S4 method for signature 'missing' spillover_ng( x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ... )
## S4 method for signature 'flowSet' spillover_ng(x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ...) ## S4 method for signature 'missing' spillover_ng(x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ...) ## S4 method for signature 'missing' spillover_ng( x, fsc = "FSC-A", ssc = "SSC-A", plot = FALSE, matchfile, path, useNormFilt = TRUE, patt = NULL, pregate = TRUE, method = "mode", ... )
x |
A flowSet of compensation beads or cells |
fsc |
The name or index of the forward scatter parameter |
ssc |
The name or index of the side scatter parameter |
plot |
logical. Plots the kernel density for each channel when
pregating. Displays the gate used. If |
matchfile |
Name of the csv file holding the compensation control file to channel matching information. |
path |
A path to a directory containing the control files, to be used
if |
useNormFilt |
logical Indicating whether to apply a
|
patt |
An optional regular expression defining which parameters should be considered |
pregate |
logical Indicating whether to pregate using |
method |
The statistic to use for calculation. Traditionally, this has been the median so it is the default. The mean is sometimes more stable. |
... |
Additional arguments passed to
|
The algorithm used is fairly simple. First, using the scatter parameters, we restrict ourselves to the most closely clustered population to reduce the amount of debris. The selected statistic is then calculated on all appropriate parameters and the unstained values swept out of the matrix. Every sample is then normalized to [0,1] with respect to the maximum value of the sample, giving the spillover in terms of a proportion of the primary channel intensity.
A matrix for each of the parameters
B. Ellis
C. B. Bagwell & E. G. Adams (1993). Fluorescence spectral overlap compensation for any number of flow cytometry parameters. in: Annals of the New York Academy of Sciences, 677:167-184.
Spillover information for a particular experiment is often obtained by
running several tubes of beads or cells stained with a single color that can
then be used to determine a spillover matrix for use with
compensate
.
When matching stain channels in x
with the compensation controls, we
provide a few options. If ordered
, we assume the ordering of the
channels in the flowSet object is the same as the ordering of the
compensation-control samples. If regexpr
, we use a regular expression
to match the channel names with the names of each of the compensation control
flowFrame
s (that is, sampleNames(x)
, which will typically be the
filenames passed to read.FCS
).
By default, we must "guess" based on the largest statistic for the
compensation control (i.e., the row).
Additionally, matching of channels to compensation control files can
be accomplished using the spillover_match
method, which allows
the matches to be specified using a csv file. The flowSet returned
by the spillover_match
method should then be used as the x
argument
to spillover
with prematched = TRUE
.
## S4 method for signature 'flowSet' spillover( x, unstained = NULL, fsc = "FSC-A", ssc = "SSC-A", patt = NULL, method = "median", stain_match = c("intensity", "ordered", "regexpr"), useNormFilt = FALSE, prematched = FALSE, exact_match = FALSE )
## S4 method for signature 'flowSet' spillover( x, unstained = NULL, fsc = "FSC-A", ssc = "SSC-A", patt = NULL, method = "median", stain_match = c("intensity", "ordered", "regexpr"), useNormFilt = FALSE, prematched = FALSE, exact_match = FALSE )
x |
A flowSet of compensation beads or cells |
unstained |
The name or index of the unstained negative control |
fsc |
The name or index of the forward scatter parameter |
ssc |
The name or index of the side scatter parameter |
patt |
An optional regular expression defining which parameters should be considered |
method |
The statistic to use for calculation. Traditionally, this has been the median so it is the default. The mean is sometimes more stable. |
stain_match |
Determines how the stain channels are matched with the compensation controls. See details. |
useNormFilt |
logical Indicating whether to apply a
|
prematched |
a convenience argument specifying if the channels have already been matched by spillover_match. This will override the values of unstained and stain_match with unstained = "unstained" and stain_match = "regexpr". |
exact_match |
a |
The algorithm used is fairly simple. First, using the scatter parameters, we restrict ourselves to the most closely clustered population to reduce the amount of debris. The selected statistic is then calculated on all appropriate parameters and the unstained values swept out of the matrix. Every sample is then normalized to [0,1] with respect to the maximum value of the sample, giving the spillover in terms of a proportion of the primary channel intensity.
A matrix for each of the parameters
B. Ellis, J. Wagner
C. B. Bagwell & E. G. Adams (1993). Fluorescence spectral overlap compensation for any number of flow cytometry parameters. in: Annals of the New York Academy of Sciences, 677:167-184.
This function will perform a normalization of flow cytometry data based on warping functions computed on high-density region landmarks for individual flow channels.
warpSet(x, ...) ## S3 method for class 'cytoset' warpSet( x, stains, grouping = NULL, subsample = NULL, peakNr = NULL, clipRange = 0.01, nbreaks = 11, fres, bwFac = 2, warpFuns = FALSE, target = NULL, chunksize = 10, ... )
warpSet(x, ...) ## S3 method for class 'cytoset' warpSet( x, stains, grouping = NULL, subsample = NULL, peakNr = NULL, clipRange = 0.01, nbreaks = 11, fres, bwFac = 2, warpFuns = FALSE, target = NULL, chunksize = 10, ... )
x |
A |
... |
Further arguments that are passed on to
|
stains |
A character vector of flow parameters in |
grouping |
A character indicating one of the phenotypic
variables in the |
subsample |
Numeric. Reduce the number of events in each |
peakNr |
Numeric scalar. Force a fixed number of peaks to use for the normalization. |
clipRange |
Only use peaks within a clipped data
range. Essentially, the number indicates the percent of clipping on
both sides of the data range, e.g. |
nbreaks |
The number of spline sections used to approximate the data. Higher values produce more accurate results, however this comes with the cost of increaseqd computing times. For most data, the default setting is good enough. |
fres |
A named list of |
bwFac |
Numeric of lenght 1 used to set the bandwidth factor by
|
warpFuns |
Logical indcating whether to return the normalized
|
target |
Character vector specifying the target sample to which other samples in the |
chunksize |
an |
Normalization is achived by first identifying high-density regions
(landmarks) for each flowFrame
in the flowSet
for a single channel and subsequently by
computing warping functions for each flowFrame
that best align
these landmarks. This is based on the algorithm implemented in the
landmarkreg
function in the fda
package. An intermediate step classifies the high-density regions, see
landmarkMatrix
for details.
Please note that this normalization is on a channel-by-channel basis. Multiple channels are normalized in a loop.
The normalized flowSet
if warpFuns
is FALSE
,
otherwise a list of warping functions. Additional inforamtion is
attached as the warping
attribute to the flowSet
in form
of a list.
We currently use a patched fda version.
Florian Hahne
J.O. Ramsay and B.W. Silverman: Applied Functional Data Analysis, Springer 2002
## Not run: library(flowCore) data(ITN) dat <- transform(ITN, "CD4"=asinh(CD4), "CD3"=asinh(CD3), "CD8"=asinh(CD8)) lg <- lymphGate(dat, channels=c("CD3", "SSC"), preselection="CD4",scale=1.5) dat <- Subset(dat, lg) datr <- warpSet(dat, "CD8", grouping="GroupID") if(require(flowViz)){ d1 <- densityplot(~CD8, dat, main="original", filter=curv1Filter("CD8")) d2 <- densityplot(~CD8, datr, main="normalized", filter=curv1Filter("CD8")) plot(d1, split=c(1,1,2,1)) plot(d2, split=c(2,1,2,1), newpage=FALSE) } ## End(Not run)
## Not run: library(flowCore) data(ITN) dat <- transform(ITN, "CD4"=asinh(CD4), "CD3"=asinh(CD3), "CD8"=asinh(CD8)) lg <- lymphGate(dat, channels=c("CD3", "SSC"), preselection="CD4",scale=1.5) dat <- Subset(dat, lg) datr <- warpSet(dat, "CD8", grouping="GroupID") if(require(flowViz)){ d1 <- densityplot(~CD8, dat, main="original", filter=curv1Filter("CD8")) d2 <- densityplot(~CD8, datr, main="normalized", filter=curv1Filter("CD8")) plot(d1, split=c(1,1,2,1)) plot(d2, split=c(2,1,2,1), newpage=FALSE) } ## End(Not run)