Title: | Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account |
---|---|
Description: | A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results. |
Authors: | Greg Gloor, Andrew Fernandes, Jean Macklaim, Arianne Albert, Matt Links, Thomas Quinn, Jia Rong Wu, Ruth Grace Wong, Brandon Lieng, Michelle Nixon |
Maintainer: | Greg Gloor <[email protected]> |
License: | GPL (>=3) |
Version: | 1.39.0 |
Built: | 2024-10-30 03:29:34 UTC |
Source: | https://github.com/bioc/ALDEx2 |
A differential abundance analysis for the comparison of two or more conditions. For example, single-organism and meta-RNA-seq high-throughput sequencing assays, or of selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, that has been optimized for three or more experimental replicates. Infers sampling variation and calculates the expected false discovery rate given the biological and sampling variation using the Wilcox rank test or Welches t-test (aldex.ttest) or the glm and Kruskal Wallis tests (aldex.glm). Reports both P and fdr values calculated by the Benjamini Hochberg correction.
Please use the citation given by citation(package="ALDEx")
.
aldex.clr
,
aldex.ttest
,
aldex.glm
,
aldex.effect
,
selex
# see examples for the aldex.clr, aldex.ttest, aldex.effect, aldex.glm functions
# see examples for the aldex.clr, aldex.ttest, aldex.effect, aldex.glm functions
aldex
ObjectWelcome to the ALDEx2
package!
The aldex
function is a wrapper that performs log-ratio transformation
and statistical testing in a single line of code. Specifically, this function:
(a) generates Monte Carlo samples of the Dirichlet distribution for each sample,
(b) converts each instance using a log-ratio transform, then (c) returns test
results for two sample (Welch's t, Wilcoxon) or multi-sample (glm, Kruskal-Wallace)
tests. This function also estimates effect size for two sample analyses.
aldex( reads, conditions, mc.samples = 128, test = "t", effect = TRUE, CI = FALSE, include.sample.summary = FALSE, verbose = FALSE, paired.test = FALSE, denom = "all", iterate = FALSE, gamma = NULL, ... )
aldex( reads, conditions, mc.samples = 128, test = "t", effect = TRUE, CI = FALSE, include.sample.summary = FALSE, verbose = FALSE, paired.test = FALSE, denom = "all", iterate = FALSE, gamma = NULL, ... )
reads |
A non-negative, integer-only |
conditions |
A character vector. A description of the data structure used
for testing. Typically, a vector of group labels. For |
mc.samples |
An integer. The number of Monte Carlo samples to use when estimating the underlying distributions. Since we are estimating central tendencies, 128 is usually sufficient. |
test |
A character string. Indicates which tests to perform. "t" runs
Welch's t and Wilcoxon tests. "kw" runs Kruskal-Wallace and glm tests.
"glm" runs a generalized linear model using a |
effect |
A boolean. Toggles whether to calculate abundances and effect sizes. |
CI |
A boolean. Toggles whether to calculate effect size confidence intervals
Applies to |
include.sample.summary |
A boolean. Toggles whether to include median clr
values for each sample. Applies to |
verbose |
A boolean. Toggles whether to print diagnostic information while
running. Useful for debugging errors on large datasets. Applies to
|
paired.test |
A boolean. Toggles whether to do paired-sample tests.
Applies to |
denom |
A character string. Indicates which features to retain as the denominator for the Geometric Mean calculation. Using "iqlr" accounts for data with systematic variation and centers the features on the set features that have variance that is between the lower and upper quartile of variance. Using "zero" is a more extreme case where there are many non-zero features in one condition but many zeros in another. In this case the geometric mean of each group is calculated using the set of per-group non-zero features. |
iterate |
A boolean. Toggles whether to iteratively perform a test. For example, this will use the results from an initial "t" routine to seed the reference (i.e., denominator of Geometric Mean calculation) for a second "t" routine. |
gamma |
A numeric. The standard deviation on the within sample variation. |
... |
Arguments to embedded method (e.g., |
See "Examples" below for a description of the sample input.
Returns a number of values that depends on the set of options. See the return values of aldex.ttest, aldex.kw, aldex.glm, and aldex.effect for explanations and examples.
Greg Gloor, Andrew Fernandes, and Matt Links contributed to the original package. Thom Quinn added the "glm" test method, the "corr" test method, and the "iterate" procedure. Michelle Pistner Nixon and Justin Silverman contributed the scale and PPP routines
Please use the citation given by
citation(package="ALDEx2")
.
aldex
,
aldex.clr
,
aldex.ttest
,
aldex.kw
,
aldex.glm
,
aldex.effect
,
aldex.corr
,
selex
# The 'reads' data.frame should have row # and column names that are unique, and # looks like the following: # # T1a T1b T2 T3 N1 N2 Nx # Gene_00001 0 0 2 0 0 1 0 # Gene_00002 20 8 12 5 19 26 14 # Gene_00003 3 0 2 0 0 0 1 # Gene_00004 75 84 241 149 271 257 188 # Gene_00005 10 16 4 0 4 10 10 # Gene_00006 129 126 451 223 243 149 209 # ... many more rows ... data(selex) selex <- selex[1201:1600,] # subset for efficiency conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex(selex, conds, mc.samples=2, denom="all", test="t", effect=TRUE, paired.test=FALSE)
# The 'reads' data.frame should have row # and column names that are unique, and # looks like the following: # # T1a T1b T2 T3 N1 N2 Nx # Gene_00001 0 0 2 0 0 1 0 # Gene_00002 20 8 12 5 19 26 14 # Gene_00003 3 0 2 0 0 0 1 # Gene_00004 75 84 241 149 271 257 188 # Gene_00005 10 16 4 0 4 10 10 # Gene_00006 129 126 451 223 243 149 209 # ... many more rows ... data(selex) selex <- selex[1201:1600,] # subset for efficiency conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex(selex, conds, mc.samples=2, denom="all", test="t", effect=TRUE, paired.test=FALSE)
The aldex.clr S4 class is a class which stores the data generated by the aldex.clr method.
An aldex.clr object contains the centre-log ratio transformed Monte Carlo
Dirichlet instances derived from estimating the technical variance of the
raw read count data. It is created by the aldex.clr.function
, which
is invoked by the aldex.clr method. It consists of eight slots:
the reads, the condition information, the number of instances, the denominator,
whether it was invoked as verbose, and if multi-cores was used, the
Dirichlet Monte-Carlo probabilities, and the centre-log ratio transformed
Monte Carlo probabilities. These can be accessed along with information
about the length of some attributes.
The aldex.clr object contains the raw data, the estimated probabilities drawn from a Dirichlet distribution, and the clr transformed values for each Monte-Carlo instance. These can be accessed through getters outlined below.
In the code below, x
is an aldex.clr
object, and i
,
is a positive integer.
There are N samples, D features, and M Monte-Carlo instances.
getMonteCarloInstances(x)
Returns the clr transformed Monte Carlo Dirichlet instances as a list where each list entry is a single sample containing a D x M matrix.
getSampleIDs(x)
Returns the names of the samples. These can be used to access the
original reads for a given sample, as in x@reads\$sampleID
(if the reads are a data frame).
getFeatureNames(x)
Returns the names of the keys
. These can be used to subset the
data rows.
getFeatures(x)
Returns the clr transformed values for the features in the first sample.
numFeatures(x)
Returns the number of features that were non-0 in at least one sample.
numMCInstances(x)
Returns the number of Monte-Carlo instances.
getReads(x)
Returns the input data as used by the method.
numConditions(x)
Returns the number of samples in the conditions
analysis.
getMonteCarloReplicate(x, i)
Returns the D x M matrix containing the Monte-Carlo instances for one sample.
getMonteCarloSample(x, i)
Returns the N x D matrix containing Monte-Carlo instance i
for for all samples.
Greg Gloor, Ruth Grace Wong, Andrew Fernandes, Jia Rong Wu and Matt Links contributed to this code
Please use the citation given by citation(package="ALDEx")
.
# The 'reads' data.frame or # SummarizedExperiment object should have # row and column names that are unique, # and looks like the following: # # T1a T1b T2 T3 N1 N2 Nx # Gene_00001 0 0 2 0 0 1 0 # Gene_00002 20 8 12 5 19 26 14 # Gene_00003 3 0 2 0 0 0 1 # Gene_00004 75 84 241 149 271 257 188 # Gene_00005 10 16 4 0 4 10 10 # Gene_00006 129 126 451 223 243 149 209 # ... many more rows ... data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) i=1 # x is an object of type aldex.clr x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) # get reads plus uniform prior reads <- getReads(x) # get a list containing all of the clr transformed instances monteCarloInstances <- getMonteCarloInstances(x) # get a list containing all of the Monte-Carlo Dirichlet instances monteCarloDirInstances <- getDirichletInstances(x) # retrieve the clr transformed instances for sample i. monteCarloInstance <- getMonteCarloReplicate(x,i) # retrieve the Monte-Carlo Dirichlet instances for sample i. monteCarloDirInstance <- getDirichletReplicate(x,i) # retrieve the clr transformed instance i for all samples monteCarloSample <- getMonteCarloSample(x,i) # retrieve the Monte-Carlo Dirichlet instance i for all samples monteCarloDirSample <- getDirichletSample(x,i) # get sample names sampleIDs <- getSampleIDs(x) # get features features <- getFeatures(x) # get number of features with at least one count numFeatures <- numFeatures(x) # get number of Monte Carlo instances numInstances <- numMCInstances(x) # get names of features featureNames <- getFeatureNames(x) # get number of conditions conditions <- numConditions(x) # get the offset of the features in the log-ratio denominator denom <- getDenom(x)
# The 'reads' data.frame or # SummarizedExperiment object should have # row and column names that are unique, # and looks like the following: # # T1a T1b T2 T3 N1 N2 Nx # Gene_00001 0 0 2 0 0 1 0 # Gene_00002 20 8 12 5 19 26 14 # Gene_00003 3 0 2 0 0 0 1 # Gene_00004 75 84 241 149 271 257 188 # Gene_00005 10 16 4 0 4 10 10 # Gene_00006 129 126 451 223 243 149 209 # ... many more rows ... data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) i=1 # x is an object of type aldex.clr x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) # get reads plus uniform prior reads <- getReads(x) # get a list containing all of the clr transformed instances monteCarloInstances <- getMonteCarloInstances(x) # get a list containing all of the Monte-Carlo Dirichlet instances monteCarloDirInstances <- getDirichletInstances(x) # retrieve the clr transformed instances for sample i. monteCarloInstance <- getMonteCarloReplicate(x,i) # retrieve the Monte-Carlo Dirichlet instances for sample i. monteCarloDirInstance <- getDirichletReplicate(x,i) # retrieve the clr transformed instance i for all samples monteCarloSample <- getMonteCarloSample(x,i) # retrieve the Monte-Carlo Dirichlet instance i for all samples monteCarloDirSample <- getDirichletSample(x,i) # get sample names sampleIDs <- getSampleIDs(x) # get features features <- getFeatures(x) # get number of features with at least one count numFeatures <- numFeatures(x) # get number of Monte Carlo instances numInstances <- numMCInstances(x) # get names of features featureNames <- getFeatureNames(x) # get number of conditions conditions <- numConditions(x) # get the offset of the features in the log-ratio denominator denom <- getDenom(x)
aldex.clr
Object
Generate Monte Carlo samples of the Dirichlet distribution for each sample.
Convert each instance using a centered log-ratio transform.
This is the input for all further analyses.Compute an aldex.clr
Object
Generate Monte Carlo samples of the Dirichlet distribution for each sample.
Convert each instance using a centered log-ratio transform.
This is the input for all further analyses.
aldex.clr.function( reads, conds, mc.samples = 128, denom = "all", verbose = FALSE, useMC = FALSE, summarizedExperiment = NULL, gamma = NULL )
aldex.clr.function( reads, conds, mc.samples = 128, denom = "all", verbose = FALSE, useMC = FALSE, summarizedExperiment = NULL, gamma = NULL )
reads |
A |
conds |
A |
mc.samples |
The number of Monte Carlo instances to use to estimate the underlying distributions; since we are estimating central tendencies, 128 is usually sufficient, but larger numbers may be needed with small sample sizes. |
denom |
An |
verbose |
Print diagnostic information while running. Useful only for debugging if fails on large datasets. |
useMC |
Use multicore by default (FALSE). Multi core processing will be attempted with the BiocParallel package. Serial processing will be used if this is not possible. In practice serial and multicore are nearly the same speed because of overhead in setting up the parallel processes. |
summarizedExperiment |
must be set to TRUE if input data are in this format. |
gamma |
Use scale simulation if not NULL. If a matrix is supplied, scale simulation will be used assuming that matrix denotes the scale samples. If a numeric is supplied, scale simulation will be applied by relaxing the geometric mean assumption with the numeric representing the standard deviation of the scale distribution. |
The object produced by the clr
function contains the log-ratio transformed
values for each Monte-Carlo Dirichlet instance, which can be accessed through
getMonteCarloInstances(x)
, where x
is the clr
function output.
Each list element is named by the sample ID. getFeatures(x)
returns the
features, getSampleIDs(x)
returns sample IDs, and getFeatureNames(x)
returns the feature names.
# The 'reads' data.frame or # RangedSummarizedExperiment object should # have row and column names that are unique, # and looks like the following: # # T1a T1b T2 T3 N1 N2 Nx # Gene_00001 0 0 2 0 0 1 0 # Gene_00002 20 8 12 5 19 26 14 # Gene_00003 3 0 2 0 0 0 1 # ... many more rows ...
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=4, gamma=NULL, verbose=FALSE)
aldex.corr
calculates the expected values for the correlation
between each feature and a continuous variable, using data returned
returned by aldex.clr
and a vector of the continuous variable.
Returns results of Pearson, Spearman and Kendall tests.
aldex.corr(clr, cont.var)
aldex.corr(clr, cont.var)
clr |
An |
cont.var |
A continuous numeric vector |
Returns a data.frame of the average
Pearson, Spearman and Kendall
coefficients and their p-values for each feature,
with FDR appended as a BH
column.
Arianne Albert, Greg Gloor, Thom Quinn
Please use the citation given by
citation(package="ALDEx2")
.
aldex
,
aldex.clr
,
aldex.ttest
,
aldex.kw
,
aldex.glm
,
aldex.effect
,
aldex.corr
,
selex
data(selex) #subset for efficiency selex <- selex[1:50,] conds <- c(rep("N", 7), rep("S",7)) cont.var <- c(rep(1,7), rep(2,7)) x <- aldex.clr(selex, conds, mc.samples=16) corr.test <- aldex.corr(x, cont.var)
data(selex) #subset for efficiency selex <- selex[1:50,] conds <- c(rep("N", 7), rep("S",7)) cont.var <- c(rep(1,7), rep(2,7)) x <- aldex.clr(selex, conds, mc.samples=16) corr.test <- aldex.corr(x, cont.var)
Determines the median clr abundance of the feature in all samples and in groups. Determines the median difference between the two groups. Determines the median variation within each two group. Determines the effect size, which is the median of the ratio of the between-group difference and the larger of the variance within groups.
aldex.effect( clr, verbose = TRUE, include.sample.summary = FALSE, useMC = FALSE, CI = FALSE, glm.conds = NULL, paired.test = FALSE )
aldex.effect( clr, verbose = TRUE, include.sample.summary = FALSE, useMC = FALSE, CI = FALSE, glm.conds = NULL, paired.test = FALSE )
clr |
|
verbose |
Print diagnostic information while running. Useful only for debugging if fails on large datasets. |
include.sample.summary |
Include median clr values for each sample, defaults to FALSE. |
useMC |
Use multicore by default (FALSE). |
CI |
Give effect 95% confidence intervals, defaults to FALSE. |
glm.conds |
Give effect for glm contrasts, note: saved as list. |
paired.test |
Calculate effect size for paired samples, defaults to FALSE. |
An explicit example for two conditions is shown in the 'Examples' below.
Returns a dataframe with the following information:
rab.all |
a vector containing the median clr value for each feature. |
rab.win.conditionA |
a vector containing the median clr value for each feature in condition A. |
rab.win.conditionB |
a vector containing the median clr value for each feature in condition B. |
diff.btw |
a vector containing the per-feature median difference between condition A and B. |
diff.win |
a vector containing the per-feature maximum median difference between Dirichlet instances within conditions. |
effect |
a vector containing the per-feature effect size. |
overlap |
a vector containing the per-feature proportion of effect size that is 0 or less. |
Greg Gloor, Andrew Fernandes, Matt Links
Please use the citation given by citation(package="ALDEx")
.
aldex.clr
, aldex.ttest
, aldex.glm
,
aldex.glm.effect
, selex
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) data(selex) # subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all") effect.test <- aldex.effect(x)
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) data(selex) # subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all") effect.test <- aldex.effect(x)
aldex
ObjectCalculates the expected value of distances between samples, given an aldex
Object, using the median value of distances derived from n Monte-Carlo replicates.
aldex.expectedDistance(clrData)
aldex.expectedDistance(clrData)
clrData |
an object of class |
Returns a dist
Object.
Please use the citation given by citation(package="ALDEx")
.
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 128, denom = "all", verbose = FALSE) x.dist <- aldex.expectedDistance(x) plot(hclust(x.dist))
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 128, denom = "all", verbose = FALSE) x.dist <- aldex.expectedDistance(x) plot(hclust(x.dist))
model.matrix
aldex.glm
calculates the expected values for each coefficient of a
glm model on the data returned by aldex.clr
. This function
requires the user to define a model with model.matrix
.
aldex.glm(clr, verbose = FALSE, fdr.method = "holm", ...)
aldex.glm(clr, verbose = FALSE, fdr.method = "holm", ...)
clr |
An |
verbose |
A boolean. Toggles whether to print diagnostic information while
running. Useful for debugging errors on large datasets. Applies to
|
fdr.method |
A string ("BH" or "holm") denoting which method to use to adjust p-values. Default is "holm" |
... |
Arguments passed to |
Returns a data.frame of the average
coefficients and their p-values for each feature,
with FDR appended as a holm
column.
Thom Quinn, Michelle Pistner
Please use the citation given by
citation(package="ALDEx2")
.
aldex
,
aldex.clr
,
aldex.ttest
,
aldex.kw
,
aldex.glm
,
aldex.effect
,
aldex.corr
,
selex
data(selex) #subset for efficiency selex <- selex[1201:1600,] covariates <- data.frame("A" = sample(0:1, 14, replace = TRUE), "B" = c(rep(0, 7), rep(1, 7))) mm <- model.matrix(~ A + B, covariates) x <- aldex.clr(selex, mm, mc.samples=4, denom="all") glm.test <- aldex.glm(x) glm.eff <- aldex.glm.effect(x) aldex.glm.plot(glm.test, eff=glm.eff, contrast='B', type='MW', post.hoc='holm')
data(selex) #subset for efficiency selex <- selex[1201:1600,] covariates <- data.frame("A" = sample(0:1, 14, replace = TRUE), "B" = c(rep(0, 7), rep(1, 7))) mm <- model.matrix(~ A + B, covariates) x <- aldex.clr(selex, mm, mc.samples=4, denom="all") glm.test <- aldex.glm(x) glm.eff <- aldex.glm.effect(x) aldex.glm.plot(glm.test, eff=glm.eff, contrast='B', type='MW', post.hoc='holm')
data for this function is saved in a list with entries named by contrast determines the median clr abundance of the feature in all samples and in groups determines the median difference between the two groups determines the median variation within each two group determines the effect size, which is the median of the ratio of the between group difference and the larger of the variance within groups
aldex.glm.effect(clr, verbose = TRUE, include.sample.summary = FALSE, useMC=FALSE, CI=FALSE)
aldex.glm.effect(clr, verbose = TRUE, include.sample.summary = FALSE, useMC=FALSE, CI=FALSE)
clr |
The data output of |
verbose |
Print diagnostic information while running. Useful only for debugging if fails on large datasets |
include.sample.summary |
include median clr values for each sample, defaults to FALSE |
useMC |
use multicore by default (FALSE) |
CI |
give effect 95% confidence intervals, defaults to FALSE |
Calculate effect sizes and differences between all contrasts for the aldex.glm model matrix
An explicit example for two conditions is shown in the ‘Examples’ below.
A dataframe with the following information:
a vector containing the median clr value for each feature
a vector containing the median clr value for each feature in condition A
a vector containing the median clr value for each feature in condition B
a vector containing the per-feature median difference between condition A and B
a vector containing the per-feature maximum median difference between Dirichlet instances within conditions
a vector containing the per-feature effect size
a vector containing the per-feature proportion of effect size that is 0 or less
Greg Gloor, Andrew Fernandes, Matt Links
Please use the citation given by citation(package="ALDEx")
.
aldex.clr
, aldex.effect
, aldex.ttest
, aldex.glm
, selex
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) data(selex) #subset for efficiency selex <- selex[1201:1600,] covariates <- data.frame("A" = sample(0:1, 14, replace = TRUE), "B" = c(rep(0, 7), rep(1, 7)), "Z" = sample(c(1,2,3), 14, replace=TRUE)) mm <- model.matrix(~ A + Z + B, covariates) x <- aldex.clr(selex, mm, mc.samples=8, denom="all") glm.effect <- aldex.glm.effect(x)
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) data(selex) #subset for efficiency selex <- selex[1201:1600,] covariates <- data.frame("A" = sample(0:1, 14, replace = TRUE), "B" = c(rep(0, 7), rep(1, 7)), "Z" = sample(c(1,2,3), 14, replace=TRUE)) mm <- model.matrix(~ A + Z + B, covariates) x <- aldex.clr(selex, mm, mc.samples=8, denom="all") glm.effect <- aldex.glm.effect(x)
aldex
ObjectCreate MW
- or MA
-type plots from the given aldex
object.
aldex.glm.plot( x, ..., eff = NULL, contrast = NULL, test = "fdr", type = c("MW", "MA", "volcano"), xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, all.col = rgb(0, 0, 0, 0.2), all.pch = 19, all.cex = 0.4, called.col = "red", called.pch = 20, called.cex = 0.6, thres.line.col = "darkgrey", thres.lwd = 1.5, cutoff.pval = 0.05, cutoff.effect = 1, rare.col = "black", rare = 0, rare.pch = 20, rare.cex = 0.2 )
aldex.glm.plot( x, ..., eff = NULL, contrast = NULL, test = "fdr", type = c("MW", "MA", "volcano"), xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, all.col = rgb(0, 0, 0, 0.2), all.pch = 19, all.cex = 0.4, called.col = "red", called.pch = 20, called.cex = 0.6, thres.line.col = "darkgrey", thres.lwd = 1.5, cutoff.pval = 0.05, cutoff.effect = 1, rare.col = "black", rare = 0, rare.pch = 20, rare.cex = 0.2 )
x |
an object produced by the |
... |
optional, unused arguments included for compatibility with the S3 method signature |
eff |
an object produced by the |
contrast |
the column name of the model matrix contrast to plot |
test |
the method of calculating significance, one of "pval" or "fdr" |
type |
which type of plot is to be produced. |
xlab |
the x-label for the plot, as per the parent |
ylab |
the y-label for the plot, as per the parent |
xlim |
the x-limits for the plot, as per the parent |
ylim |
the y-limits for the plot, as per the parent |
all.col |
the default colour of the plotted points |
all.pch |
the default plotting symbol |
all.cex |
the default symbol size |
called.col |
the colour of points with false discovery rate, q <= 0.1 |
called.pch |
the symbol of points with false discovery rate, q <= 0.1 |
called.cex |
the character expansion of points with false discovery rate, q <= 0.05 |
thres.line.col |
the colour of the threshold line where within and between group variation is equivalent |
thres.lwd |
the width of the threshold line where within and between group variation is equivalent |
cutoff.pval |
the fdr cutoff, default 0.05 |
cutoff.effect |
the effect size cutoff for plotting, default 1 |
rare.col |
color for rare features, default black |
rare |
relative abundance cutoff for rare features, default 0 or the mean abundance |
rare.pch |
the default symbol of rare features |
rare.cex |
the default symbol size of rare points |
Plot an aldex
Object
This particular specialization of the plot
function is relatively simple and provided for convenience.
For more advanced control of the plot is is best to use the values returned by summary(x)
.
None.
Please use the citation given by citation(package="ALDEx")
.
aldex
, aldex.effect
, aldex.ttest
, aldex.glm
# See the examples for 'aldex.glm'
# See the examples for 'aldex.glm'
aldex.kw
calculates the expected values of the Kruskal-Wallis
test and a glm on the data returned by aldex.clr
.
aldex.kw(clr, useMC = FALSE, verbose = FALSE)
aldex.kw(clr, useMC = FALSE, verbose = FALSE)
clr |
An |
useMC |
Toggles whether to use multi-core. |
verbose |
A boolean. Toggles whether to print diagnostic information while
running. Useful for debugging errors on large datasets. Applies to
|
use the aldex.glm function unless you really need the nonparametric KW test
Returns a data.frame
with the following information:
kw.ep |
a vector containing the expected p-value of the Kruskal-Wallis test for each feature |
kw.eBH |
a vector containing the corresponding expected value of the Benjamini-Hochberg corrected p-value for each feature |
glm.ep |
a vector containing the expected p-value of the glm ANOVA for each feature |
glm.eBH |
a vector containing the corresponding expected value of the Benjamini-Hochberg corrected p-value for each feature. Note, you should use the aldex.glm function for better post-hoc test statistics. |
Arianne Albert
Please use the citation given by
citation(package="ALDEx2")
.
aldex
,
aldex.clr
,
aldex.ttest
,
aldex.kw
,
aldex.glm
,
aldex.effect
,
aldex.corr
,
selex
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("A", 4), rep("B", 3), rep("C", 7)) x <- aldex.clr(selex, conds, mc.samples=1, denom="all") kw.test <- aldex.kw(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("A", 4), rep("B", 3), rep("C", 7)) x <- aldex.clr(selex, conds, mc.samples=1, denom="all") kw.test <- aldex.kw(x)
Takes as input the conditions vector, dispersion paramenter, starting scale values and the number of random instances. The ratio between the scale values is key; setting mu = c(1,1.2) will have the same effect on the analysis as a value of mu =(c(0.5,0.6). The function returns a matrix of scale values of the same dimension as the number of samples and the number of mc.samples used by the aldex() or aldex.clr() function.
aldex.makeScaleMatrix(gamma, mu, conditions, log = TRUE, mc.samples = 128)
aldex.makeScaleMatrix(gamma, mu, conditions, log = TRUE, mc.samples = 128)
gamma |
- the base gamma value for the sdlog parameter of rlnorm |
mu |
- pair of values, or a vector of values one for each sample |
conditions |
- the conditions vector for the dataset |
log |
- scale ratio in log2 (TRUE) or as simple ratio (FALSE) |
mc.samples |
- the number of Monte-Carlo instances used by aldex() |
returns a matrix of gamma values that are used as an estimate of the scale for the aldex.clr() function. This allows different scale and gamma values to be applied to each group and can move the centre of mass of the data if required. The example dataset has very extreme differences in scale. Most often these are likely in the range of 10-15
Greg Gloor, Michelle Pistner Nixon
Please use the citation given by citation(package="ALDEx")
.
# conditions is a vector describing the data data(selex) # subset for efficiency conds <- c(rep("NS", 7), rep("S", 7)) mu.in <- c(1,50) # 50-fold difference in scale between groups mu.vec <- aldex.makeScaleMatrix(1, mu.in, conds, log=TRUE, mc.samples=128)
# conditions is a vector describing the data data(selex) # subset for efficiency conds <- c(rep("NS", 7), rep("S", 7)) mu.in <- c(1,50) # 50-fold difference in scale between groups mu.vec <- aldex.makeScaleMatrix(1, mu.in, conds, log=TRUE, mc.samples=128)
aldex
ObjectCreate MW
- or MA
-type plots from the given aldex
object.
aldex.plot( x, ..., type = c("MW", "MA", "volcano", "volcano.var"), xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, all.col = rgb(0, 0, 0, 0.2), all.pch = 19, all.cex = 0.4, called.col = "red", called.pch = 20, called.cex = 0.6, thres.line.col = "darkgrey", thres.lwd = 1.5, test = "welch", cutoff.pval = 0.05, cutoff.effect = 1, rare.col = "black", rare = 0, rare.pch = 20, rare.cex = 0.2, main = NULL )
aldex.plot( x, ..., type = c("MW", "MA", "volcano", "volcano.var"), xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, all.col = rgb(0, 0, 0, 0.2), all.pch = 19, all.cex = 0.4, called.col = "red", called.pch = 20, called.cex = 0.6, thres.line.col = "darkgrey", thres.lwd = 1.5, test = "welch", cutoff.pval = 0.05, cutoff.effect = 1, rare.col = "black", rare = 0, rare.pch = 20, rare.cex = 0.2, main = NULL )
x |
an object of class |
... |
optional, unused arguments included for compatibility with the S3 method signature |
type |
which type of plot is to be produced. |
xlab |
the x-label for the plot, as per the parent |
ylab |
the y-label for the plot, as per the parent |
xlim |
the x-limits for the plot, as per the parent |
ylim |
the y-limits for the plot, as per the parent |
all.col |
the default colour of the plotted points |
all.pch |
the default plotting symbol |
all.cex |
the default symbol size |
called.col |
the colour of points with false discovery rate, q <= 0.1 |
called.pch |
the symbol of points with false discovery rate, q <= 0.1 |
called.cex |
the character expansion of points with false discovery rate, q <= 0.05 |
thres.line.col |
the colour of the threshold line where within and between group variation is equivalent |
thres.lwd |
the width of the threshold line where within and between group variation is equivalent |
test |
the method of calculating significance, one of:
|
cutoff.pval |
the Benjamini-Hochberg fdr cutoff, default 0.05 |
cutoff.effect |
the effect size cutoff for plotting, default 1 |
rare.col |
color for rare features, default black |
rare |
relative abundance cutoff for rare features, default 0 or the mean abundance |
rare.pch |
the default symbol of rare features |
rare.cex |
the default symbol size of rare points |
main |
the main label for the plot |
Plot an aldex
Object
This particular specialization of the plot
function is relatively simple and provided for convenience.
For more advanced control of the plot is is best to use the values returned by summary(x)
.
None.
Please use the citation given by citation(package="ALDEx")
.
aldex
, aldex.effect
, aldex.ttest
, aldex.glm
# See the examples for 'aldex'
# See the examples for 'aldex'
aldex.effect
aldex.plotFeature
generates density plots showing the dispersion
of the expected values given in the output from aldex.effect
.
The expected values are shown in the plots. This is a diagnostic
visualization to help determine if the expected values are trustworthy
aldex.plotFeature( clrData, featureName, pooledOnly = FALSE, densityOnly = FALSE, reset.par = FALSE )
aldex.plotFeature( clrData, featureName, pooledOnly = FALSE, densityOnly = FALSE, reset.par = FALSE )
clrData |
the output object from |
featureName |
the name of the feature from the input data |
pooledOnly |
show only the pooled plots, default FALSE, shows all plots |
densityOnly |
show only the density plots, default FALSE includes expected values |
reset.par |
reset the plotting parameter to par(c(1,1)), default FALSE |
Brandon Lieng, Greg Gloor
Please use the citation given by
citation(package="ALDEx2")
.
aldex.clr
,
aldex.effect
,
selex
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=4, denom="all") aldex.plotFeature(x, "S:D:A:D")
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=4, denom="all") aldex.plotFeature(x, "S:D:A:D")
Performs scale simulation over a range of values for gamma. Dirichlet samples are reused for computational convenience.
aldex.senAnalysis( aldex_clr, gamma, test = "t", effect = TRUE, include.sample.summary = FALSE, verbose = FALSE, iterate = FALSE, ... )
aldex.senAnalysis( aldex_clr, gamma, test = "t", effect = TRUE, include.sample.summary = FALSE, verbose = FALSE, iterate = FALSE, ... )
aldex_clr |
An 'aldex.clr' object |
gamma |
A vector of positive numeric components. Used as the standard deviation of the scale simulation model. |
test |
A character string. Indicates which tests to perform. "t" runs
Welch's t and Wilcoxon tests. "kw" runs Kruskal-Wallace and glm tests.
"glm" runs a generalized linear model using a |
effect |
A boolean. Toggles whether to calculate abundances and effect sizes. |
include.sample.summary |
A boolean. Toggles whether to include median clr
values for each sample. Applies to |
verbose |
A boolean. Toggles whether to print diagnostic information while
running. Useful for debugging errors on large datasets. Applies to
|
iterate |
A boolean. Toggles whether to iteratively perform a test. For example, this will use the results from an initial "t" routine to seed the reference (i.e., denominator of Geometric Mean calculation) for a second "t" routine. |
... |
Arguments to embedded method (e.g., |
A list of results. Each element corresponds to a single result for a given value of gamma
calculate the features that are to be used as the denominator for the Geometric Mean calculation in clr_function.R
aldex.set.mode(reads, conds, denom="all")
aldex.set.mode(reads, conds, denom="all")
reads |
A data frame containing the samples and features per sample. |
conds |
A vector describing which samples belong to what condition. |
denom |
Character argument specifying which indicies to return. 'all' returns all features in both conditons. 'zero' returns the nonzero count features per condition. 'iqlr' returns the features whose variance falls within the inter-quantile range of the CLR-transformed data. In cases of malformed or null queries, input defaults to 'all'. Additionally, the input can be a numeric vector, which contains a set of row indicies to center the data against. Only for advanced users who can pre-determine the invariant set of features within their data. Check that the same number of features are in the input and output datasets. |
Identify set of denominator features for log-ratio calculation
An explicit example for two conditions is shown in the ‘Examples’ below.
Outputs a vector containing indices per condition, or a single vector in some cases.
Jia Rong Wu
Please use the citation given by citation(package="ALDEx")
.
aldex.clr
, aldex.ttest
, aldex.effect
, selex
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) # input can be "all", "iqlr", "zero" or numeric for advanced users data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all")
# x is the output of the \code{x <- clr(data, mc.samples)} function # conditions is a description of the data # for the selex dataset, conditions <- c(rep("N", 7), rep("S", 7)) # input can be "all", "iqlr", "zero" or numeric for advanced users data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all")
aldex.ttest
calculates the expected values of the Wilcoxon Rank Sum
test and the posterior predictive value of Welch's t-test on the data
returned by aldex.clr
.
aldex.ttest(clr, paired.test = FALSE, hist.plot = FALSE, verbose = FALSE)
aldex.ttest(clr, paired.test = FALSE, hist.plot = FALSE, verbose = FALSE)
clr |
An |
paired.test |
Toggles whether to calculate paired tests. |
hist.plot |
Toggles whether to plot a histogram of p-values for the first Dirichlet Monte Carlo instance. |
verbose |
A boolean. Toggles whether to print diagnostic information
while running. Useful for debugging errors on large datasets. Applies
to |
Returns a data.frame
with the following information:
we.ep |
a vector containing the the poseterior predictive p-value of Welch's t-test for each feature |
we.eBH |
a vector containing the corresponding expected value of the Benjamini-Hochberg corrected p-value for each feature |
wi.ep |
a vector containing the expected p-value of the Wilcoxon Rank Sum test for each feature |
wi.eBH |
a vector containing the corresponding expected value of the Benjamini-Hochberg corrected p-value for each feature |
Greg Gloor, Michelle Pistner
Please use the citation given by
citation(package="ALDEx2")
.
aldex
,
aldex.clr
,
aldex.ttest
,
aldex.kw
,
aldex.glm
,
aldex.effect
,
aldex.corr
,
selex
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all") ttest.test <- aldex.ttest(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples=2, denom="all") ttest.test <- aldex.ttest(x)
Returns the conditions, for
aldex.clr
object.
getConditions(.object)
getConditions(.object)
.object |
A |
Returns the conditions used.
A matrix representing the conditions used.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) scale.samps <- getConditions(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) scale.samps <- getConditions(x)
Returns the offset of the features used as the denominator
as the basis for the log-ratio, for an aldex.clr
object.
getDenom(.object)
getDenom(.object)
.object |
A |
Returns the offset of the features used as the denominator as the basis for the log-ratio. A vector of numbers is the offset of the non-0 features used in the denominator.
A vector of integer row offsets.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "iqlr", verbose = FALSE) Denom <- getDenom(x) # to find the names of housekeeping genes used getFeatureNames(x)[getDenom(x)]
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "iqlr", verbose = FALSE) Denom <- getDenom(x) # to find the names of housekeeping genes used getFeatureNames(x)[getDenom(x)]
Returns a list of the Monte Carlo Dirichlet
instances created by the aldex.clr
function.
getDirichletInstances(.object)
getDirichletInstances(.object)
.object |
A |
Returns a list of the raw Monte Carlo Dirichlet
instances created by the aldex.clr
function. These
are probability estimates.
A list of data frames.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloDirInstances <- getDirichletInstances(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloDirInstances <- getDirichletInstances(x)
Returns the raw per-sample Monte Carlo Dirichlet
replicates generated from analysis, for an aldex.clr
object.
getDirichletReplicate(.object,i)
getDirichletReplicate(.object,i)
.object |
A |
i |
The numeric index of the desired sample replicate. |
Returns the raw per-sample Monte Carlo Dirichlet replicates. These are estimated probabilities.
A numeric matrix.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) DirichletReplicate <- getDirichletReplicate(x,1)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) DirichletReplicate <- getDirichletReplicate(x,1)
Returns a single Monte Carlo Dirichlet instance for all samples
for an aldex.clr
object.
getDirichletSample(.object,i)
getDirichletSample(.object,i)
.object |
A |
i |
The numeric index of the desired Monte-Carlo instance. |
Returns the designated Monte Carlo Dirichlet instance for all samples generated from analysis.
A matrix representing the designated Monte Carlo Dirichlet instance for all samples.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) DirichletSample <- getDirichletSample(x,1)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) DirichletSample <- getDirichletSample(x,1)
Returns the names of the features as a vector, for an aldex.clr
object.
getFeatureNames(.object)
getFeatureNames(.object)
.object |
A |
Returns the names of the keys
that can be used to
subset the data rows. The keys
values are the rsid's.
A vector of feature names.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) featureNames <- getFeatureNames(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) featureNames <- getFeatureNames(x)
Returns the features as a vector, for an aldex.clr
object.
getFeatures(.object)
getFeatures(.object)
.object |
A |
Returns the features from the first sample and first Monte-Carlo
replicate as a vector, for an aldex.clr
object. Used only
for troubleshooting purposes.
A vector of features.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) features <- getFeatures(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom="all", verbose = FALSE) features <- getFeatures(x)
Returns a list of the log-ratio transformed Monte Carlo Dirichlet
instances created by the aldex.clr
function.
getMonteCarloInstances(.object)
getMonteCarloInstances(.object)
.object |
A |
Returns a list of the log-ratio transformed Monte Carlo Dirichlet
instances created by the aldex.clr
function.
A list of data frames.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloInstances <- getMonteCarloInstances(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloInstances <- getMonteCarloInstances(x)
Returns the log-ratio transformed per-sample Monte Carlo Dirichlet
replicates generated from analysis, for an aldex.clr
object.
getMonteCarloReplicate(.object,i)
getMonteCarloReplicate(.object,i)
.object |
A |
i |
The numeric index of the desired sample replicate. |
Returns the log-ratio transformed per-sample Monte Carlo Dirichlet replicates.
A numeric matrix.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloReplicate <- getMonteCarloReplicate(x,1)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloReplicate <- getMonteCarloReplicate(x,1)
Returns a single Monte Carlo Dirichlet instance for all samples
for an aldex.clr
object.
getMonteCarloSample(.object,i)
getMonteCarloSample(.object,i)
.object |
A |
i |
The numeric index of the desired Monte-Carlo instance. |
Returns the designated Monte Carlo Dirichlet instance for all samples generated from analysis.
A matrix representing the designated log-ratio transformed Monte Carlo Dirichlet instance for all samples.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloSample <- getMonteCarloSample(x,1)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) monteCarloSample <- getMonteCarloSample(x,1)
Returns the count table used as input for analysis, for
aldex.clr
object. Note this count table has features
that are 0 in all samples removed, and a uniform prior of
0.5 is applied.
getReads(.object)
getReads(.object)
.object |
A |
Returns the count table.Note this count table has features that are 0 in all samples removed, and a uniform prior of 0.5 is applied.
A data frame representing the count table used as input for analysis.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) reads <- getReads(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) reads <- getReads(x)
Returns the names of the samples for an aldex.clr
object.
These can be used to access the original reads, as in
reads\$sampleID
(if the reads are a data frame).
getSampleIDs(.object)
getSampleIDs(.object)
.object |
A |
Returns the names of the samples. These can be used to access
the original reads, as in reads\$sampleID
(if the reads
are a data frame).
A vector of sample names.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) sampleIDs <- getSampleIDs(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) sampleIDs <- getSampleIDs(x)
Returns the log2 scale samples if scale simulation is used, for
aldex.clr
object.
getScaleSamples(.object)
getScaleSamples(.object)
.object |
A |
Returns NULL if scale simulation was not used or a matrix of log2 scale samples if scale simuation was used.
A matrix representing the log2 scale samples if scale simulation was used.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) scale.samps <- getScaleSamples(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) scale.samps <- getScaleSamples(x)
Interpret the scale model implied by a certain level of gamma or scale model
interpretGamma(clr)
interpretGamma(clr)
clr |
A aldex.clr object |
A table. For each variable, an estimate of theta^perp that is implied by the scale model is returned. The average and 95
Returns the number of conditions compared for analysis,
for an aldex.clr
object.
numConditions(.object)
numConditions(.object)
.object |
A |
Returns the number of samples compared.
A numeric representing the number of samples compared.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) conditions <- numConditions(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) conditions <- numConditions(x)
Returns the number of non-0 features associated with the data,
for an aldex.clr
object.
numFeatures(.object)
numFeatures(.object)
.object |
A |
Returns the number of features associated with the data that are not 0 in all samples.
A numeric representing the number of non-0 features associated with the data.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) numFeatures <- numFeatures(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) numFeatures <- numFeatures(x)
Returns the number of Monte Carle Dirichlet instances
generated for analysis, for an aldex.clr
object.
numMCInstances(.object)
numMCInstances(.object)
.object |
A |
Returns the number of Monte Carle Dirichlet instances generated for analysis.
A numeric representing the number of Monte Carle Dirichlet instances generated for analysis.
aldex.clr
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) numInstances <- numMCInstances(x)
data(selex) #subset for efficiency selex <- selex[1201:1600,] conds <- c(rep("NS", 7), rep("S", 7)) x <- aldex.clr(selex, conds, mc.samples = 2, denom = "all", verbose = FALSE) numInstances <- numMCInstances(x)
Create gamma diagram for scale simulation sensitivity result
plotGamma( sen_results, test = "t", thresh = 0.05, taxa_to_label = 10, glmVar = NULL, blackWhite = FALSE, cex = 1 )
plotGamma( sen_results, test = "t", thresh = 0.05, taxa_to_label = 10, glmVar = NULL, blackWhite = FALSE, cex = 1 )
sen_results |
A list return by aldex.senAnalysis() |
test |
A character string. What test was used to calculate the results |
thresh |
A numeric between 0 and 1. What threshold should be used for significance? |
taxa_to_label |
A positive integer. How many taxa should be labeled in the plot? |
glmVar |
If 'test = "glm"', what variable do you want plotted? |
blackWhite |
boolean. If TRUE, returns the plot in black and white. |
cex |
Default == 1. Controls the size of the axis and text labels in the plots. |
A plot object
This data set gives the differential abundance of 1600 enzyme variants grown under selective (NS) and selective (S) conditions
data(selex)
data(selex)
A dataframe of 1600 features and 14 samples. The first 7 samples are non-selected, the last 7 are selected.
McMurrough et al (2014) PNAS doi:10.1073/pnas.1322352111
McMurrough et al (2014) PNAS doi:10.1073/pnas.1322352111
This synthetic dataset contains 2 percent sparsity as 0 values asymmetrically distributed. It is used as a test dataset.
data(synth2)
data(synth2)
A dataframe of 1000 features and 16 samples. The first 8 samples contain 20 features set to 0, the last 8 samples contain counts.
Gloor et al (2017) notes