Bugfix for quickCluster() to pass along arguments to the internal per-block call.
Added a restricted= option to quickSubCluster() to enable subclustering on specific clusters.
All deprecated functions from the previous release are now defunct.
Added a simplify= option to quickSubCluster() to get the cluster assignments directly.
Deprecated combinePValues() as this is replaced by metapod::combineParallelPValues().
getClusteredPCs() now uses bluster::clusterRows() by default.
decideTestsPerLabel() now automatically detects pval.field= if not supplied.
Added the clusterCells() wrapper around bluster functionality.
Removed the option to pass a matrix in design= from pseudoBulkDGE().
Migrated all normalization-related functions (computeSumFactors(), calculateSumFactors(), cleanSizeFactors() and computeSpikeFactors()) to a better home in scuttle. Soft-deprecated existing functions.
Modified getTopHVGs() to accept a SingleCellExperiment and compute the DataFrame with modelGeneVar().
Added fixedPCA() to compute a PCA with a fixed number of components, a la scater::runPCA() (but without requiring scater).
Modified denoisePCA() so that it now complains if subset.row= is not provided.
Modified all pairwise* functions so that the p-value from direction="any" is derived from the two p-values from the one-sided tests. This is necessary for correctness with all choices of lfc= and block=, at the cost of conservativeness when block=NULL and lfc is large.
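As a rough illustration of the entry above, the sketch below combines two one-sided p-values into a two-sided one using a Bonferroni-style rule (twice the smaller value, capped at 1). This rule is an assumption made for illustration; the entry does not state the exact formula used by the pairwise* functions. The p-values are made-up numbers and the variable names are hypothetical.

```r
# Hypothetical one-sided p-values for three genes (illustrative values only).
# With a non-zero log-fold change threshold, the two one-sided p-values
# for a gene need not sum to 1, as in the second gene below.
p.up   <- c(0.001, 0.70, 0.98)
p.down <- c(0.999, 0.90, 0.02)

# Assumed combination rule: twice the smaller one-sided p-value, capped at 1.
p.any <- pmin(1, 2 * pmin(p.up, p.down))
print(p.any)
```

The second gene shows the conservativeness mentioned in the entry: neither one-sided test is close to significant, so the combined p-value is capped at 1.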
Deprecated coassignProb() as this is replaced by bluster::pairwiseRand().
Deprecated bootstrapCluster() as this is replaced by bluster::bootstrapStability().
Deprecated gene.names= in the various pairwise* functions as being out of scope.
Added the testLinearModel() function to obtain inferences from a linear model.
Modified pseudoBulkDGE() to use formulas/functions in the design= argument. Allow contrast= to be a character vector to be run through makeContrasts().
Added the pseudoBulkSpecific() function to test for semi-label-specific DEGs in pseudo-bulk analyses.
Added the summaryMarkerStats() function to compute some basic summary statistics for marker filtering.
Modified row.data= in findMarkers() to support list inputs. Added an add.summary= option to easily include summary information.
Modified combineVar() and combineCV2() to support list inputs.
Deprecated doubletCells() as this is replaced by scDblFinder::computeDoubletDensity().
Deprecated doubletCluster() as this is replaced by scDblFinder::findDoubletClusters().
Deprecated doubletRecovery() as this is replaced by scDblFinder::recoverDoublets().
Added sparse-optimized variance calculations to modelGeneVar(), modelGeneCV2() and related functions, which may result in slight changes to the results due to numeric precision.
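To give a sense of why a sparse-aware calculation can differ from a dense one only at the level of floating-point rounding, the sketch below computes a variance from the non-zero entries alone. It is a generic illustration and not the algorithm used inside modelGeneVar(); the helper sparse_var() and the example vector are hypothetical.

```r
# Hypothetical helper: variance of a length-n vector, given only its
# non-zero entries. Each of the (n - length(nonzero)) zeros contributes
# (0 - mu)^2 = mu^2 to the sum of squared deviations.
sparse_var <- function(nonzero, n) {
    mu <- sum(nonzero) / n
    ss <- sum((nonzero - mu)^2) + (n - length(nonzero)) * mu^2
    ss / (n - 1)
}

# Illustrative counts for one gene across ten cells.
x <- c(0, 0, 3, 0, 1, 0, 0, 7, 0, 2)

dense  <- var(x)                           # standard dense calculation
sparse <- sparse_var(x[x != 0], length(x)) # uses the non-zero entries only

# The two agree up to numeric tolerance, though not necessarily bit-for-bit.
print(all.equal(dense, sparse))
```

Because the two calculations sum the same terms in a different order, any discrepancy is limited to rounding error.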
Exported combineBlocks() to assist combining of block-wise statistics in other packages.
Added lowess= and density.weights= options to fitTrendVar() to rescue overfitted curves.
Raised an error in denoisePCA() upon mismatches in the matrix and technical statistics.
Added the quickSubCluster() function for convenient subclustering.
Added the bootstrapCluster() function for convenient bootstrapping of cluster stability.
Added the coassignProb() function to compute coassignment probabilities of alternative groupings.
combineMarkers() and findMarkers() report a summary effect size for each cluster.
Added the multiMarkerStats() function to combine statistics from multiple findMarkers() calls.
Added the clusterPurity() function to evaluate cluster purity as a quality measure.
Added the pseudoBulkDGE() function to easily and safely perform pseudo-bulk DE analyses. Also added the decideTestsPerLabel() and summarizeTestsPerLabel() utilities.
Added the clusterSNNGraph() and clusterKNNGraph() wrapper functions for easier graph-based clustering. Provided a k-means pre-clustering option to handle large datasets.
Removed deprecated approximate= and pc.approx= arguments.
Removed deprecated batch correction functions.
Added option to pairwiseTTests() for standardization of log-fold changes.
Changed default BSPARAM= to bsparam() in quickCluster(), denoisePCA(), doubletCells() and build*NNGraph().
Added the pairwiseBinom() function for pairwise binomial tests of gene expression.
Renamed output fields of pairwiseWilcox() to use AUC for less confusion. Added the lfc= argument to test against a log-fold change.
Added the fitTrendVar(), fitTrendCV2(), modelGeneVar(), modelGeneVarWithSpikes(), modelGeneCV2(), modelCV2WithSpikes(), fitTrendPoisson() and modelGeneVarByPoisson() functions to model variability.
Deprecated the trendVar(), technicalCV2(), improvedCV2(), decomposeVar(), testVar(), makeTechTrend(), multiBlockVar() and multiBlockNorm() functions.
Modified combineVar() to not weight by residual d.f. unless specifically instructed.
Added the combineCV2() function to combine separate CV2 modelling results.
Added the test.type= argument in findMarkers() to switch between pairwise DE tests. Added the row.data= argument to easily include row metadata in reordered tables. Deprecated overlapExprs(), which is replaced by type="wilcox" in findMarkers().
Added the getTopMarkers() function to easily retrieve marker lists from pairwise DE results.
Added the getTopHVGs() function to easily retrieve HVG sets from variance modelling results.
In all functions that accept a block= argument, any level of the blocking factor that cannot yield a result (e.g., due to insufficient degrees of freedom) will now be completely ignored and not contribute to any statistic.
Added the getDenoisedPCs() function for general-purpose PCA-based denoising on non-SingleCellExperiment inputs. Converted denoisePCA() to a normal function, removed the method for ANY matrix. Dropped max.rank= default to 50 for greater speed in most cases.
Added the calculateSumFactors() function for general-purpose calculation of deconvolution factors on non-SingleCellExperiment inputs. Converted computeSumFactors() to a normal function, removed the method for ANY input. Auto-guess min.mean= based on the average library size.
Deprecated all special handling of spike-in rows, which are no longer necessary when spike-ins are stored as alternative experiments.
Deprecated general.use= in computeSpikeFactors(), which is no longer necessary when spike-ins are stored as alternative experiments.
Deprecated parallelPCA(), which has been moved to the PCAtools package.
Modified clusterModularity() to return upper-triangular matrices, fixing a bug where the off-diagonal weights were split into two entries across the diagonal. Added the as.ratio= argument to return a matrix of log-ratios. Renamed the get.values= argument to get.weights=.
Simplified density calculation in doubletCells() for greater robustness.
Added a method="holm-middle" option to combinePValues(), to test if most individual nulls are true. Added a min.prop= option to control the definition of "most".
Added a pval.type="some" option to combineMarkers(), as a compromise between the two other modes. Added a min.prop= option to tune stringency for pval.type="some" and "any".
Added the getClusteredPCs() function to provide a cluster-based heuristic for choosing the number of PCs.
Added the neighborsTo*NNGraph() functions to generate (shared) nearest neighbor graphs from pre-computed NN results.
Switched to using only the top 10% of HVGs for the internal PCA in quickCluster().
Added option in quickCluster() to cluster on log-expression profiles. Modified defaults to use graph-based clustering on log-expression-derived PCs.
Modified default choice of ref.clust= in computeSumFactors(). Degrade quietly to library size factors when cluster is too small for all pool sizes.
Minor change to cyclone() random number generation for consistency upon parallelization.
Added BPPARAM= to correlateNull() for parallelization. Minor change in random number generation for consistency upon parallelization.
Minor change to parallelPCA() random number generation for consistency upon parallelization.
Created correlateGenes() function to compute per-gene correlation statistics.
Modified correlatePairs() to compute expected rho after all possible tie-breaking permutations. Deprecated cache.size= as all ranks are now returned as in-memory representations. Deprecated per.gene= in favour of an external call to correlateGenes(). Deprecated tol= as ties are now directly determined by rowRanks().
Switched to BiocSingular for PCA calculations across all functions. Deprecated approximate= and pc.approx= arguments in favour of BSPARAM=.
Deprecated all batch correction functions to prepare for the migration to batchelor.
Removed selectorPlot(), exploreData() functions in favour of iSEE.
Fixed underflow problem in mnnCorrect() when dealing with the Gaussian kernel. Dropped the default sigma= in mnnCorrect() for better default performance.
Supported parallelized block-wise processing in quickCluster(). Deprecated max.size= in favour of max.cluster.size= in computeSumFactors(). Deprecated get.ranks= in favour of scaledColRanks().
Added max.cluster.size= argument to computeSumFactors(). Supported parallelized cluster-wise processing. Increased all pool sizes to avoid rare failures if number of cells is a multiple of 5. Minor improvement to how mean filtering is done for rescaling across clusters in computeSumFactors(). Throw errors upon min.mean=NULL, which used to be valid. Switched positive=TRUE behaviour to use cleanSizeFactors().
Added simpleSumFactors() as a simplified alternative to quickCluster() and computeSumFactors().
Added the scaledColRanks() function for computing scaled and centred column ranks.
Supported parallelized gene-wise processing in trendVar() and decomposeVar(). Support direct use of a factor in design= for efficiency.
Added doubletCluster() to detect clusters that consist of doublets of other clusters.
Added doubletCells() to detect cells that are doublets of other cells via simulations.
Deprecated rand.seed= in buildSNNGraph() in favour of explicit set.seed() call. Added type= argument for weighting edges based on the number of shared neighbors.
Deprecated rand.seed= in buildKNNGraph().
Added multiBlockNorm() function for spike-abundance-preserving normalization prior to multi-block variance modelling.
Added multiBatchNorm() function for consistent downscaling across batches prior to batch correction.
Added cleanSizeFactors() to coerce non-positive size factors to positive values based on number of detected genes.
Added the fastMNN() function to provide a faster, more stable alternative for MNN correction.
Added BPPARAM= option for parallelized execution in makeTechTrend(). Added approx.npts= option for interpolation-based approximation for many cells.
Added pairwiseTTests() for direct calculation of pairwise t-statistics between groups.
Added pairwiseWilcox() for direct calculation of pairwise Wilcoxon rank sum tests between groups.
Added combineMarkers() to consolidate arbitrary pairwise comparisons into a marker list.
Bugfixes to uses of block=, lfc= and design= arguments in findMarkers(). Refactored to use pairwiseTTests() and combineMarkers() internally. Added BPPARAM= option for parallelized execution.
Refactored overlapExprs() to sort by p-value based on pairwiseWilcox() and combineMarkers(). Removed design= argument as it is not compatible with p-value calculations.
Bugfixes to the use of Stouffer's Z method in combineVar().
Added combinePValues() as a centralized internal function to combine p-values.
Modified decomposeVar() to return statistics (but not p-values) for spike-ins when get.spikes=NA. Added block= argument for mean/variance calculations within each level of a blocking factor, followed by reporting of weighted averages (using Fisher's method for p-values). Automatically record global statistics in the metadata of the output for use in combineVar(). Switched output to a DataFrame object for consistency with other functions.
Fixed testVar() to report a p-value of 1 when both the observed and null variances are zero.
Allowed passing of arguments to irlba() in denoisePCA() to assist convergence. Reported low-rank approximations for all genes, regardless of whether they were used in the SVD. Deprecated design= argument in favour of manual external correction of confounding effects. Supported use of a vector or DataFrame in technical= instead of a function.
Allowed passing of arguments to prcomp_irlba() in buildSNNGraph() to assist convergence. Allowed passing of arguments to get.knn(), switched default algorithm back to a kd-tree.
Added the buildKNNGraph() function to construct a simple k-nearest-neighbours graph.
Fixed a number of bugs in mnnCorrect(), migrated code to C++ and parallelized functions. Added variance shift adjustment, calculation of angles with the biological subspace.
Modified trend specification arguments in trendVar() for greater flexibility. Switched from ns() to robustSmoothSpline() to avoid bugs with unloaded predict.ns(). Added block= argument for mean/variance calculations within each level of a blocking factor.
Added option to avoid normalization in the SingleCellExperiment method for improvedCV2(). Switched from ns() to smooth.spline() or robustSmoothSpline() to avoid bugs.
Replaced zoo functions with runmed() for calculating the median trend in DM().
Added block= argument to correlatePairs() to calculate correlations within each level of a blocking factor. Deprecated the use of residuals=FALSE for one-way layouts in design=. Preserve input order of paired genes in the gene1/gene2 output when pairings!=NULL.
Added block= argument to overlapExprs() to calculate overlaps within each level of a blocking factor. Deprecated the use of residuals=FALSE for one-way layouts in design=. Switched to automatic ranking of genes based on ability to discriminate between groups. Added rank.type= and direction= arguments to control ranking of genes.
Modified combineVar() so that it is aware of the global stats recorded in decomposeVar(). Absence of global statistics in the input DataFrames now results in an error. Added option to method= to use Stouffer's method with residual d.f.-weighted Z-scores. Added weighted= argument to allow weighting to be turned off for equal batch representation.
Modified the behaviour of min.mean= in computeSumFactors() when clusters!=NULL. Abundance filtering is now performed within each cluster and for pairs of clusters, rather than globally.
Switched to pairwise t-tests in findMarkers(), rather than fitting a global linear model. Added block= argument for within-block t-tests, the results of which are combined across blocks via Stouffer's method. Added lfc= argument for testing against a log-fold change threshold. Added log.p= argument to return log-transformed p-values/FDRs. Removed empirical Bayes shrinkage as well as the min.mean= argument.
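For readers unfamiliar with Stouffer's method as mentioned in the entry above, the following is a minimal base-R sketch of combining one-sided p-values across blocks. The function name stouffer(), the example p-values and the block weights are all hypothetical, and the weighting actually used by findMarkers() is not specified in this entry.

```r
# Hypothetical helper implementing weighted Stouffer's Z-score method
# for one-sided p-values.
stouffer <- function(p, weights = rep(1, length(p))) {
    z <- qnorm(p, lower.tail = FALSE)                 # p-values to Z-scores
    z.comb <- sum(weights * z) / sqrt(sum(weights^2)) # weighted combination
    pnorm(z.comb, lower.tail = FALSE)                 # back to a p-value
}

# Illustrative one-sided p-values for one gene in three blocks.
p.block <- c(0.01, 0.04, 0.20)

print(stouffer(p.block))                           # equal weights
print(stouffer(p.block, weights = c(50, 30, 10)))  # assumed per-block weights
```

Giving larger weights to blocks with more cells lets the better-powered comparisons dominate the combined result.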
Added the makeTechTrend() function for generating a mean-variance trend under Poisson technical noise.
Added the multiBlockVar() function for convenient fitting of multiple mean-variance trends per level of a blocking factor.
Added the clusterModularity() function for assessing the cluster-wise modularity after graph-based clustering.
Added the parallelPCA() function for performing parallel analysis to choose the number of PCs.
Modified convertT() to return raw counts and size factors for CellDataSet output.
Deprecated exploreData(), selectorPlot() in favour of iSEE().
Supported parallelization in buildSNNGraph(), overlapExprs() with BPPARAM options.
Forced zero-derived residuals to a constant value in correlatePairs(), overlapExprs().
Allowed findMarkers() to return IUT p-values, to identify uniquely expressed genes in each cluster. Added options to specify the direction of the log-fold changes, to focus on upregulated genes in each cluster.
Fixed bug in correlatePairs() when per.gene=TRUE and no spike-ins are available. Added block.size argument to control caching.
Switched all C++ code to use the beachmat API. Modified several functions to accept ANY matrix-like object, rather than only base matrix objects.
quickCluster() with method="igraph" will now merge based on modularity to satisfy min.size requirements. Added max.size option to restrict the size of the output clusters.
Updated the trendVar() interface with parametric, method arguments. Deprecated the trend="semiloess" option in favour of parametric=TRUE and method="loess". Modified the NLS equation to guarantee non-negative coefficients of the parametric trend. Slightly modified the estimation of NLS starting parameters. Second d.f. of the fitted F-distribution is now reported as df2 in the output.
Modified decomposeVar() to automatically use the second d.f. when test="f".
Added option in denoisePCA() to return the number of components or the low-rank approximation. The proportion of variance explained is also stored as an attribute in all return values.
Fixed a variety of bugs in mnnCorrect().
Switched default BPPARAM to SerialParam() in all functions.
Added run argument to selectorPlot(). Bug fix to avoid adding an empty list.
Added exploreData() function for visualization of scRNA-seq data.
Minor bug fix to DM() when extrapolation is required.
Added check for centred size factors in trendVar(), decomposeVar() methods. Refactored trendVar() to include automatic start point estimation, location rescaling and df2 estimation.
Moved spike-in specification to the scater package.
Deprecated isSpike<- to avoid confusion over input/output types.
Generalized sandbag(), cyclone() to work for other classification problems.
Added test="f" option in testVar() to account for additional scatter.
Added per.gene=FALSE option in correlatePairs(), expanded accepted value types for subset.row. Fixed an integer overflow in correlatePairs(). Also added information on whether the permutation p-value reaches its lower bound.
Added the combineVar() function to combine results from separate decomposeVar() calls.
Added protection against all-zero rows in technicalCV2().
Added the improvedCV2() function as a more stable alternative to technicalCV2().
Added the denoisePCA() function to remove technical noise via selection of early principal components.
Removed warning requiring at least twice the max size in computeSumFactors(). Elaborated on the circumstances surrounding negative size factors. Increased the default number of window sizes to be examined. Refactored C++ code for increased speed.
Allowed quickCluster() to return a matrix of ranks for use in other clustering methods. Added method="igraph" option to perform graph-based clustering for large numbers of cells.
Added the findMarkers() function to automatically identify potential markers for cell clusters.
Added the overlapExprs() function to compute the overlap in expression distributions between groups.
Added the buildSNNGraph() function to build a SNN graph for cells from their expression profiles.
Added the mnnCorrect() function to perform batch correction based on mutual nearest neighbors.
Streamlined examples when mocking up data sets.
Transformed correlations to a metric distance in quickCluster().
Removed normalize() in favour of scater's normalize().
Switched isSpike<- to accept a character vector rather than a logical vector, to enforce naming of spike-in sets. Also added warning code when the specified spike-in sets overlap.
Allowed compute*Factors() functions to directly return the size factors.
Added selectorPlot() function for interactive plotting.
Switched to a group-based weighted correlation for one-way layouts in correlatePairs() and correlateNull(), and to a correlation of residuals for more complex design matrices.
Added phase assignments to the cyclone() output.
Implemented Brennecke et al.'s method in the technicalCV2() function.
Updated convertTo() to store spike-in-specific size factors as offsets.
Moved code and subsetting into C++ to improve memory efficiency.
Switched to loess-based trend fitting as the default in trendVar(), replaced polynomial with semi-loess fitting.
Added significance statistics to output of decomposeVar(), with only the p-values replaced by NAs for spike-ins.
Updated documentation and tests.
New package scran, for low-level analyses of single-cell RNA sequencing data.