Title: | Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data |
---|---|
Description: | The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk. |
Authors: | Yichen Wang [aut], Irzam Sarfraz [aut], Rui Hong [aut], Yusuke Koga [aut], Salam Alabdullatif [aut], Nida Pervaiz [aut], David Jenkins [aut], Vidya Akavoor [aut], Xinyun Cao [aut], Shruthi Bandyadka [aut], Anastasia Leshchyk [aut], Tyler Faits [aut], Mohammed Muzamil Khan [aut], Zhe Wang [aut], W. Evan Johnson [aut], Ming Liu [aut], Joshua David Campbell [aut, cre] |
Maintainer: | Joshua David Campbell <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.17.0 |
Built: | 2024-11-30 05:25:43 UTC |
Source: | https://github.com/bioc/singleCellTK |
Finds the effect sizes for all genes in the original dataset, regardless of significance.
calcEffectSizes(countMatrix, condition)
countMatrix |
Matrix. A simulated counts matrix, sans labels. |
condition |
Factor. The condition labels for the simulated cells. If more than 2 conditions are given, the first will be compared to all others by default. |
A vector of Cohen's d effect sizes, one per gene.
data("mouseBrainSubsetSCE")
res <- calcEffectSizes(assay(mouseBrainSubsetSCE, "counts"), condition = colData(mouseBrainSubsetSCE)$level1class)
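For readers unfamiliar with Cohen's d, the following base R sketch shows one common way to compute it per gene, using the pooled standard deviation and comparing the first condition level against all others. The helper name `cohens_d_by_gene` is hypothetical, and the pooled-variance formula is an assumption; the package's own calculation may differ in detail.

```r
# Hypothetical helper, not the package's implementation: Cohen's d per gene,
# comparing the first condition level against all remaining cells.
cohens_d_by_gene <- function(count_matrix, condition) {
  condition <- factor(condition)
  stopifnot(nlevels(condition) >= 2, ncol(count_matrix) == length(condition))
  in_first <- condition == levels(condition)[1]
  x <- count_matrix[, in_first, drop = FALSE]
  y <- count_matrix[, !in_first, drop = FALSE]
  n_x <- ncol(x)
  n_y <- ncol(y)
  # Pooled variance of the two groups, computed row-wise (one value per gene)
  pooled_var <- ((n_x - 1) * apply(x, 1, var) + (n_y - 1) * apply(y, 1, var)) /
    (n_x + n_y - 2)
  (rowMeans(x) - rowMeans(y)) / sqrt(pooled_var)
}

# Toy data: two genes, three cells per condition
counts <- rbind(geneA = c(1, 2, 3, 5, 6, 7),
                geneB = c(4, 4, 5, 4, 5, 4))
groups <- factor(c("a", "a", "a", "b", "b", "b"))
d <- cohens_d_by_gene(counts, groups)
print(d)
```

A gene whose group means are equal gets an effect size of zero, while a gene with zero variance in both groups yields a non-finite value and would need separate handling.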
Combine a list of SingleCellExperiment objects as one SingleCellExperiment object
combineSCE(sceList, by.r = NULL, by.c = NULL, combined = TRUE)
sceList |
A list of SingleCellExperiment objects. Currently, combineSCE only supports combining SCE objects whose assays are in dgCMatrix format. It does not support combining SCE objects whose assays are in DelayedArray format. |
by.r |
Specifications of the columns used for merging rowData. If set to NULL, the rownames of the rowData tables will be used to merge rowData. Default is NULL. |
by.c |
Specifications of the columns used for merging colData. If set to NULL, the rownames of the colData tables will be used to merge colData. Default is NULL. |
combined |
Logical. If TRUE, combines the list of SingleCellExperiment objects and returns a single SingleCellExperiment. If FALSE, returns a list of SingleCellExperiment objects whose rowData, colData, assay and reducedDim slots are compatible across the SCE objects in the list. Default is TRUE. |
A SingleCellExperiment object which combines all objects in sceList. The colData is merged.
data(scExample, package = "singleCellTK")
combinedsce <- combineSCE(list(sce,sce), by.r = NULL, by.c = NULL, combined = TRUE)
The computeHeatmap method computes the heatmap visualization for a set of features against a set of dimensionality reduction components. This method uses the heatmap computation algorithm code from Seurat but plots the heatmap using the ComplexHeatmap and cowplot libraries.
computeHeatmap( inSCE, useAssay, dims = 10, nfeatures = 30, cells = NULL, reduction = "pca", disp.min = -2.5, disp.max = 2.5, balanced = TRUE, nCol = NULL, externalReduction = NULL )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Specify the name of the assay that will be scaled by this function for the features that are used in the heatmap. |
dims |
Specify the number of dimensions to use for the heatmap. Default is 10. |
nfeatures |
Specify the number of features to use for the heatmap. Default is 30. |
cells |
Specify the samples/cells to use for heatmap computation. Default is NULL. |
reduction |
Specify the reduction slot in the input object. Default is "pca". |
disp.min |
Specify the minimum dispersion value to use for floor clipping of assay values. Default is -2.5. |
disp.max |
Specify the maximum dispersion value to use for ceiling clipping of assay values. Default is 2.5. |
balanced |
Specify if the number of up-regulated and down-regulated features should be balanced. Default is TRUE. |
nCol |
Specify the number of columns in the output plot. Default is NULL. |
externalReduction |
Specify an external reduction if not present in
the input object. This external reduction should be created
using |
Heatmap plot object.
Computes Z-Score from an input count matrix using the formula ((x-mean(x))/sd(x)) for each gene across all cells. The input count matrix can either be a base matrix, dgCMatrix or a DelayedMatrix. Computations are performed using DelayedMatrixStats package to efficiently compute the Z-Score matrix.
computeZScore(counts)
counts |
matrix (base matrix, dgCMatrix or DelayedMatrix) |
z-score computed counts matrix (DelayedMatrix)
data(sce_chcl, package = "scds")
assay(sce_chcl, "countsZScore") <- computeZScore(assay(sce_chcl, "counts"))
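The formula ((x-mean(x))/sd(x)) can be illustrated on a small dense matrix in base R. This is only a sketch of the arithmetic; the helper name `z_score_rows` is hypothetical, and it does not use DelayedMatrixStats or return a DelayedMatrix as the package function does.

```r
# Hypothetical helper: row-wise (per gene) z-score of a dense matrix.
z_score_rows <- function(counts) {
  counts <- as.matrix(counts)
  gene_mean <- rowMeans(counts)
  gene_sd <- apply(counts, 1, sd)
  # A vector of length nrow(counts) recycles down the columns, so every
  # row is centered by its own mean and scaled by its own sd.
  (counts - gene_mean) / gene_sd
}

counts <- rbind(geneA = c(0, 2, 4),
                geneB = c(10, 10, 40))
z <- z_score_rows(counts)
print(z)
```

After the transformation each gene has mean 0 and standard deviation 1 across cells. A gene with identical counts in every cell has a standard deviation of zero and would produce NaN values.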
Create SingleCellExperiment object from csv or txt input
constructSCE(data, samplename)
data |
A data.table object containing the count matrix. |
samplename |
The sample name of the data. |
A SingleCellExperiment object containing the count matrix.
convertSCEToSeurat Converts sce object to seurat while retaining all assays and metadata
convertSCEToSeurat( inSCE, countsAssay = NULL, normAssay = NULL, scaledAssay = NULL, copyColData = FALSE, copyReducedDim = FALSE, copyDecontX = FALSE, pcaReducedDim = NULL, icaReducedDim = NULL, tsneReducedDim = NULL, umapReducedDim = NULL )
inSCE |
A SingleCellExperiment object. |
countsAssay |
Which assay to use from sce object for raw counts. Default NULL. |
normAssay |
Which assay to use from sce object for normalized data. Default NULL. |
scaledAssay |
Which assay to use from sce object for scaled data. Default NULL. |
copyColData |
Boolean. Whether to copy 'colData' of the SCE object to the 'meta.data' of the Seurat object. Default FALSE. |
copyReducedDim |
Boolean. Whether to copy 'reducedDims' of the SCE object to the 'reductions' of the Seurat object. Default FALSE. |
copyDecontX |
Boolean. Whether to copy the 'decontXcounts' assay of the SCE object to the 'assays' of the Seurat object. Default FALSE. |
pcaReducedDim |
Specify a character value indicating the name of the reducedDim to store as default pca computation in the output seurat object. Default is NULL. |
icaReducedDim |
Specify a character value indicating the name of the reducedDim to store as default ica computation in the output seurat object. Default is NULL. |
tsneReducedDim |
Specify a character value indicating the name of the reducedDim to store as default tsne computation in the output seurat object. Default is NULL. |
umapReducedDim |
Specify a character value indicating the name of the reducedDim to store as default umap computation in the output seurat object. Default is NULL. |
Updated seurat object that contains all data from the input sce object
data(scExample, package = "singleCellTK")
seurat <- convertSCEToSeurat(sce)
convertSeuratToSCE Converts the input seurat object to a sce object
convertSeuratToSCE( seuratObject, normAssayName = "seuratNormData", scaledAssayName = "seuratScaledData" )
seuratObject |
Input Seurat object |
normAssayName |
Name of assay to store the normalized data. Default "seuratNormData". |
scaledAssayName |
Name of assay to store the scaled data. Default "seuratScaledData". |
SingleCellExperiment output object
data(scExample, package = "singleCellTK")
seurat <- convertSCEToSeurat(sce)
sce <- convertSeuratToSCE(seurat)
Adds '-1', '-2', ... '-i' to duplicated rownames, and either replaces the rownames in place with the unique ones, stores the unique rownames in rowData, or returns the unique rownames as a character vector.
dedupRowNames(x, as.rowData = FALSE, return.list = FALSE)
x |
A matrix like or SingleCellExperiment object, on which
we can apply |
as.rowData |
Only applicable when |
return.list |
When set to |
By default, a matrix or SingleCellExperiment object with rownames deduplicated. When x is a SingleCellExperiment and as.rowData is set to TRUE, will return x with rowData updated. When return.list is set to TRUE, will return a character vector with the deduplicated rownames.
data("scExample", package = "singleCellTK")
sce <- dedupRowNames(sce)
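The suffixing scheme can be sketched on a plain character vector in base R. The helper `dedup_names` is hypothetical, and the choice to leave the first occurrence unchanged while numbering later ones from 1 is an assumption; the package's exact numbering may differ.

```r
# Hypothetical helper: make a character vector unique by suffixing repeats.
dedup_names <- function(x) {
  # Running occurrence index within each group of identical names: 1, 2, 3, ...
  occurrence <- ave(seq_along(x), x, FUN = seq_along)
  is_dup <- duplicated(x)
  # Assumption: the first occurrence stays as-is, later ones get "-1", "-2", ...
  x[is_dup] <- paste0(x[is_dup], "-", occurrence[is_dup] - 1)
  x
}

gene_ids <- c("GeneA", "GeneB", "GeneA", "GeneC", "GeneA", "GeneB")
deduped <- dedup_names(gene_ids)
print(deduped)
```

This simple version does not guard against a collision with a name that already ends in a numeric suffix, such as an existing "GeneA-1".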
A wrapper function for isOutlier. Identify outliers from numeric vectors stored in the SingleCellExperiment object.
detectCellOutlier( inSCE, slotName, itemName, sample = NULL, nmads = 3, type = "both", overwrite = TRUE )
inSCE |
A SingleCellExperiment object. |
slotName |
Desired slot of SingleCellExperiment used for plotting. Possible options: "assays", "colData", "metadata", "reducedDims". Required. |
itemName |
Desired vector within the slot used for plotting. Required. |
sample |
A single character specifying a name that can be found in
|
nmads |
Integer. Number of median absolute deviations (MADs). Adjust this parameter for a more lenient or more stringent outlier cutoff. Default 3. |
type |
Character. Type/direction of outlier detection; whether the lower/higher outliers should be detected, or both. Options are "both", "lower", "higher". Default "both". |
overwrite |
Boolean. If TRUE, and this function has previously generated an outlier decision on the same itemName, the outlier decision will be overwritten. Default TRUE. |
A SingleCellExperiment object with the outlier decision added to the colData slot. Additionally, the decontaminated counts will be added as an assay called 'decontXCounts'.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runDecontX(sce[,sample(ncol(sce),20)])
sce <- detectCellOutlier(sce, slotName = "colData", sample = sce$sample, nmads = 4, itemName = "decontX_contamination", type = "both")
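The MAD-based rule behind nmads and type can be sketched in base R as follows. The helper `mad_outliers` is hypothetical and only mirrors the idea of flagging values more than nmads median absolute deviations from the median; it is not the isOutlier implementation and ignores the per-sample grouping.

```r
# Hypothetical helper: flag values far from the median, measured in MADs.
mad_outliers <- function(values, nmads = 3,
                         type = c("both", "lower", "higher")) {
  type <- match.arg(type)
  center <- median(values, na.rm = TRUE)
  # stats::mad() scales by 1.4826 by default, which makes it comparable to
  # a standard deviation for normally distributed data.
  spread <- mad(values, center = center, na.rm = TRUE)
  lower <- if (type == "higher") -Inf else center - nmads * spread
  upper <- if (type == "lower") Inf else center + nmads * spread
  values < lower | values > upper
}

contamination <- c(0.02, 0.03, 0.025, 0.04, 0.035, 0.03, 0.9)
flags <- mad_outliers(contamination, nmads = 4)
print(which(flags))
```

Raising nmads widens the accepted interval and flags fewer cells; restricting type to "lower" or "higher" tests only one tail.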
Calculate Differential Abundance with FET
diffAbundanceFET(inSCE, cluster, variable, control, case, analysisName)
inSCE |
A |
cluster |
A single |
variable |
A single |
control |
|
case |
|
analysisName |
A single |
This function counts the cells in each group specified by the arguments, computes the corresponding fractions, and summarizes the comparison statistically with Fisher's Exact Test (FET).
The original SingleCellExperiment object with metadata(inSCE) updated with a list diffAbundanceFET, containing a new data.frame for the analysis result, named by analysisName. The data.frame contains columns for number and fraction of cells that belong to different cases, as well as "Odds_Ratio", "PValue" and "FDR".
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- diffAbundanceFET(inSCE = mouseBrainSubsetSCE, cluster = "tissue", variable = "level1class", case = "oligodendrocytes", control = "microglia", analysisName = "diffAbundFET")
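As a rough illustration of the statistics involved, the sketch below runs Fisher's Exact Test on a single 2x2 table of made-up cell counts with base R. The counts, the table layout and the column names other than "Odds_Ratio", "PValue" and "FDR" are assumptions for illustration; how the package builds its tables from cluster and variable is not shown here.

```r
# Made-up counts: cells of each condition inside and outside one cluster
cell_counts <- matrix(c(30, 70,
                        10, 90),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(condition = c("case", "control"),
                                      cluster = c("in_cluster", "other")))

# Fraction of each condition's cells that fall in the cluster
fractions <- cell_counts[, "in_cluster"] / rowSums(cell_counts)

fet <- fisher.test(cell_counts)
result <- data.frame(case_fraction = fractions[["case"]],
                     control_fraction = fractions[["control"]],
                     Odds_Ratio = unname(fet$estimate),
                     PValue = fet$p.value)
# With several clusters, the p-values would be adjusted together
result$FDR <- p.adjust(result$PValue, method = "fdr")
print(result)
```

Note that fisher.test reports the conditional maximum likelihood estimate of the odds ratio, which is close to but not identical to the simple cross-product ratio of the table.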
Three different generation methods are wrapped: distinctColors, randomcoloR and the ggplot default color generation.
discreteColorPalette( n, palette = c("random", "ggplot", "celda"), seed = 12345, ... )
n |
An integer, the number of color codes to generate. |
palette |
A single character string. Select the method; available options are "random", "ggplot" and "celda". |
seed |
An integer. Set the seed for the random process that happens only in "random" generation. Default 12345. |
... |
Other arguments that are passed to the internal function, according to the method selected. |
A character vector of n hex color codes.
discreteColorPalette(n = 3)
Generate a distinct palette for coloring different clusters
distinctColors( n, hues = c("red", "cyan", "orange", "blue", "yellow", "purple", "green", "magenta"), saturation.range = c(0.7, 1), value.range = c(0.7, 1) )
n |
Integer; Number of colors to generate |
hues |
Character vector of R colors available from the colors() function. These will be used as the base colors for the clustering scheme. Different saturations and values (i.e. darkness) will be generated for each hue. |
saturation.range |
Numeric vector of length 2 with values between 0 and 1. Default: c(0.7, 1) |
value.range |
Numeric vector of length 2 with values between 0 and 1. Default: c(0.7, 1) |
A vector of distinct colors that have been converted to HEX from HSV.
distinctColors(10)
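The idea of deriving several shades from a few base hues and converting them from HSV to hex can be sketched with base R's grDevices functions. The helper `distinct_hsv_colors` is hypothetical and the way shades are spread over the saturation and value ranges is an assumption; it is not the algorithm used by distinctColors.

```r
# Hypothetical helper: n hex colors built from base hues in HSV space.
distinct_hsv_colors <- function(n,
                                hues = c("red", "cyan", "orange", "blue"),
                                saturation.range = c(0.7, 1),
                                value.range = c(0.7, 1)) {
  # Hue component (between 0 and 1) of each named base color
  base_hue <- rgb2hsv(col2rgb(hues))["h", ]
  # Number of shades per hue needed to supply at least n colors
  shades <- ceiling(n / length(hues))
  s <- seq(saturation.range[2], saturation.range[1], length.out = shades)
  v <- seq(value.range[2], value.range[1], length.out = shades)
  # Cycle through all hues at the brightest level first, then darker levels
  grid <- data.frame(h = rep(base_hue, times = shades),
                     s = rep(s, each = length(hues)),
                     v = rep(v, each = length(hues)))
  hsv(grid$h, grid$s, grid$v)[seq_len(n)]
}

palette10 <- distinct_hsv_colors(10)
print(palette10)
```

Ordering the grid hue-first means the first few colors returned differ in hue, which tends to be easier to tell apart than shades of the same hue.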
Estimate numbers of detected genes, significantly differentially expressed genes, and median significant effect size
downSampleCells( originalData, useAssay = "counts", minCountDetec = 10, minCellsDetec = 3, minCellnum = 10, maxCellnum = 1000, realLabels, depthResolution = 10, iterations = 10, totalReads = 1e+06 )
originalData |
The SingleCellExperiment object storing all assay data from the shiny app. |
useAssay |
Character. The name of the assay to be used for subsampling. |
minCountDetec |
Numeric. The minimum number of reads found for a gene to be considered detected. |
minCellsDetec |
Numeric. The minimum number of cells a gene must have at least 1 read in for it to be considered detected. |
minCellnum |
Numeric. The minimum number of virtual cells to include in the smallest simulated dataset. |
maxCellnum |
Numeric. The maximum number of virtual cells to include in the largest simulated dataset |
realLabels |
Character. The name of the condition of interest. Must match a name from sample data. If only two factors present in the corresponding colData, will default to t-test. If multiple factors, will default to ANOVA. |
depthResolution |
Numeric. How many different read depths should the script simulate? Will simulate a number of experimental designs ranging from 10 reads to maxReadDepth, with logarithmic spacing. |
iterations |
Numeric. How many times should each experimental design be simulated? |
totalReads |
Numeric. How many aligned reads to put in each simulated dataset. |
A 3-dimensional array, with dimensions = c(iterations, depthResolution, 3). [,,1] contains the number of detected genes in each simulated dataset, [,,2] contains the number of significantly differentially expressed genes in each simulation, and [,,3] contains the median significant effect size in each simulation. If no genes are significantly differentially expressed, the median effect size defaults to infinity.
data("mouseBrainSubsetSCE")
subset <- mouseBrainSubsetSCE[seq(100),]
res <- downSampleCells(subset, realLabels = "level1class", iterations=2)
Estimate numbers of detected genes, significantly differentially expressed genes, and median significant effect size
downSampleDepth( originalData, useAssay = "counts", minCount = 10, minCells = 3, maxDepth = 1e+07, realLabels, depthResolution = 10, iterations = 10 )
originalData |
SingleCellExperiment object storing all assay data from the shiny app. |
useAssay |
Character. The name of the assay to be used for subsampling. |
minCount |
Numeric. The minimum number of reads found for a gene to be considered detected. |
minCells |
Numeric. The minimum number of cells a gene must have at least 1 read in for it to be considered detected. |
maxDepth |
Numeric. The highest number of total reads to be simulated. |
realLabels |
Character. The name of the condition of interest. Must match a name from sample data. |
depthResolution |
Numeric. How many different read depths should the script simulate? Will simulate a number of experimental designs ranging from 10 reads to maxReadDepth, with logarithmic spacing. |
iterations |
Numeric. How many times should each experimental design be simulated? |
A 3-dimensional array, with dimensions = c(iterations, depthResolution, 3). [,,1] contains the number of detected genes in each simulated dataset, [,,2] contains the number of significantly differentially expressed genes in each simulation, and [,,3] contains the median significant effect size in each simulation. If no genes are significantly differentially expressed, the median effect size defaults to infinity.
data("mouseBrainSubsetSCE")
subset <- mouseBrainSubsetSCE[seq(1000),]
res <- downSampleDepth(subset, realLabels = "level1class", iterations=2)
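The logarithmic spacing of simulated read depths described for depthResolution can be sketched in base R. The sketch assumes the upper bound is the maxDepth argument and that depths are rounded to whole reads; the package may construct the grid differently.

```r
# Assumed inputs, taken from the defaults in the usage line
max_depth <- 1e7
depth_resolution <- 10

# Evenly spaced exponents between log10(10) and log10(max_depth) give
# depths that are evenly spaced on a logarithmic scale.
exponents <- seq(log10(10), log10(max_depth), length.out = depth_resolution)
depths <- round(10^exponents)
print(depths)
```

With these defaults each successive depth is roughly 4.6 times the previous one, so the low-depth regime is sampled as densely as the high-depth regime.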
expData Get data item from an input SingleCellExperiment object. The data item can be an assay, altExp (subset) or a reducedDim, which is retrieved based on the name of the data item.
expData(inSCE, assayName)
inSCE |
Input SingleCellExperiment object. |
assayName |
Specify the name of the data item to retrieve. |
Specified data item.
data(scExample, package = "singleCellTK")
mat <- expData(sce, "counts")
expData Get data item from an input SingleCellExperiment object. The data item can be an assay, altExp (subset) or a reducedDim, which is retrieved based on the name of the data item.
## S4 method for signature 'ANY,character'
expData(inSCE, assayName)
inSCE |
Input SingleCellExperiment object. |
assayName |
Specify the name of the data item to retrieve. |
Specified data item.
data(scExample, package = "singleCellTK")
mat <- expData(sce, "counts")
expData Store data items using tags to identify the type of data item stored. To be used as a replacement for assay<- setter function but with additional parameter to set a tag to a data item.
expData(inSCE, assayName, tag = NULL, altExp = FALSE) <- value
inSCE |
Input SingleCellExperiment object. |
assayName |
Specify the name of the input assay. |
tag |
Specify the tag to store against the input assay. Default is NULL. |
altExp |
A |
value |
An input matrix-like value to store in the SCE object. |
A SingleCellExperiment object containing the newly stored data.
data(scExample, package = "singleCellTK")
mat <- expData(sce, "counts")
expData(sce, "counts", tag = "raw") <- mat
expData Store data items using tags to identify the type of data item stored. To be used as a replacement for assay<- setter function but with additional parameter to set a tag to a data item.
## S4 replacement method for signature 'ANY,character,CharacterOrNullOrMissing,logical'
expData(inSCE, assayName, tag = NULL, altExp = FALSE) <- value
inSCE |
Input SingleCellExperiment object. |
assayName |
Specify the name of the input assay. |
tag |
Specify the tag to store against the input assay. Default is NULL. |
altExp |
A |
value |
An input matrix-like value to store in the SCE object. |
A SingleCellExperiment object containing the newly stored data.
data(scExample, package = "singleCellTK")
mat <- expData(sce, "counts")
expData(sce, "counts", tag = "raw") <- mat
expDataNames Get names of all the data items in the input SingleCellExperiment object including assays, altExps and reducedDims.
expDataNames(inSCE)
inSCE |
Input SingleCellExperiment object. |
A combined vector of assayNames, altExpNames and reducedDimNames.
data(scExample, package = "singleCellTK")
expDataNames(sce)
expDataNames Get names of all the data items in the input SingleCellExperiment object including assays, altExps and reducedDims.
## S4 method for signature 'ANY'
expDataNames(inSCE)
inSCE |
Input SingleCellExperiment object. |
A combined vector of assayNames, altExpNames and reducedDimNames.
data(scExample, package = "singleCellTK")
expDataNames(sce)
expDeleteDataTag Remove tag against an input data from the stored tag information in the metadata of the input object.
expDeleteDataTag(inSCE, assay)
inSCE |
Input SingleCellExperiment object. |
assay |
Name of the assay or the data item against which a tag should be removed. |
The input SingleCellExperiment object with tag information removed from the metadata slot.
data(scExample, package = "singleCellTK")
sce <- expSetDataTag(sce, "raw", "counts")
sce <- expDeleteDataTag(sce, "counts")
Export data in SingleCellExperiment object
exportSCE( inSCE, samplename = "sample", directory = "./", type = "Cells", format = c("SCE", "AnnData", "FlatFile", "HTAN", "Seurat") )
inSCE |
A SingleCellExperiment object that contains the data. QC metrics are stored in colData of the SingleCellExperiment object. |
samplename |
Sample name. This will be used as the name of subdirectories and the prefix of flat file output. Default is 'sample'. |
directory |
Output directory. Default is './'. |
type |
Type of data stored in the SingleCellExperiment object. It can be 'Droplets' (raw droplets matrix) or 'Cells' (cells matrix). Default is 'Cells'. |
format |
The format of output. It currently supports flat files, rds files and python h5 files. It can output multiple formats. Default: c("SCE", "AnnData", "FlatFile", "HTAN", "Seurat"). |
Generates a file containing data from inSCE, in the specified format.
data(scExample)
## Not run:
exportSCE(sce, format = "SCE")
## End(Not run)
Writes all assays, colData, rowData, reducedDims, and altExps objects in a SingleCellExperiment to a Python AnnData object in the .h5ad format. All parameters of the Anndata.write_h5ad function (https://icb-anndata.readthedocs-hosted.com/en/stable/anndata.AnnData.write_h5ad.html) are available as parameters to this export function and set to defaults. Defaults can be overridden at function call.
exportSCEtoAnnData( sce, useAssay = "counts", outputDir = "./", prefix = "sample", overwrite = TRUE, compression = c("gzip", "lzf", "None"), compressionOpts = NULL, forceDense = FALSE )
sce |
SingleCellExperiment R object to be exported. |
useAssay |
Character. The name of the assay of interest that will be set as the primary matrix of the output AnnData. Default "counts". |
outputDir |
Path to the directory where .h5ad outputs will be written. Default is the current working directory. |
prefix |
Prefix to use for the name of the output file. Default "sample". |
overwrite |
Boolean. Default TRUE. |
compression |
If output file compression is required, this variable accepts 'gzip', 'lzf' or "None" as inputs. Default "gzip". |
compressionOpts |
Integer. Sets the compression level. |
forceDense |
Default FALSE. |
Generates a Python anndata object containing data from inSCE.
data(sce_chcl, package = "scds")
## Not run:
exportSCEtoAnnData(sce=sce_chcl, compression="gzip")
## End(Not run)
Writes all assays, colData, rowData, reducedDims, and altExps objects in a SingleCellExperiment to text files. The items in the 'metadata' slot remain stored in list and are saved in an RDS file.
exportSCEtoFlatFile( sce, outputDir = "./", overwrite = TRUE, gzipped = TRUE, prefix = "SCE" )
sce |
SingleCellExperiment object to be exported. |
outputDir |
Name of the directory to store the exported file(s). |
overwrite |
Boolean. Whether to overwrite the output files. Default TRUE. |
gzipped |
Boolean. |
prefix |
Prefix of file names. |
Generates text files containing data from inSCE.
data(sce_chcl, package = "scds")
## Not run:
exportSCEtoFlatFile(sce_chcl, "sce_chcl")
## End(Not run)
Export data in Seurat object
exportSCEToSeurat( inSCE, prefix = "sample", outputDir = "./", overwrite = TRUE, copyColData = TRUE, copyReducedDim = TRUE, copyDecontX = TRUE )
inSCE |
A SingleCellExperiment object that contains the data. QC metrics are stored in colData of the SingleCellExperiment object. |
prefix |
Prefix to use for the name of the output file. Default "sample". |
outputDir |
Path to the directory where outputs will be written. Default is the current working directory. |
overwrite |
Boolean. Whether to overwrite the output if it already exists in the outputDir. Default TRUE. |
copyColData |
Boolean. Whether to copy 'colData' of the SCE object to the 'meta.data' of the Seurat object. Default TRUE. |
copyReducedDim |
Boolean. Whether to copy 'reducedDims' of the SCE object to the 'reductions' of the Seurat object. Default TRUE. |
copyDecontX |
Boolean. Whether to copy the 'decontXcounts' assay of the SCE object to the 'assays' of the Seurat object. Default TRUE. |
Generates a Seurat object containing data from inSCE.
expSetDataTag Set tag to an assay or a data item in the input SCE object.
expSetDataTag(inSCE, assayType, assays)
inSCE |
Input SingleCellExperiment object. |
assayType |
Specify a |
assays |
Specify name(s) |
The input SingleCellExperiment object with tag information stored in the metadata slot.
data(scExample, package = "singleCellTK")
sce <- expSetDataTag(sce, "raw", "counts")
expTaggedData Returns a list of names of data items from the input SingleCellExperiment object based upon the input parameters.
expTaggedData( inSCE, tags = NULL, redDims = FALSE, recommended = NULL, showTags = TRUE )
inSCE |
Input SingleCellExperiment object. |
tags |
A |
redDims |
A |
recommended |
A |
showTags |
A |
A list of names of data items specified by the other parameters.
data(scExample, package = "singleCellTK")
sce <- expSetDataTag(sce, "raw", "counts")
tags <- expTaggedData(sce)
This will return indices of features among the rownames or rowData of a data.frame, matrix, or a SummarizedExperiment object including a SingleCellExperiment. Partial matching (i.e. grepping) can be used by setting exactMatch = FALSE.
featureIndex( features, inSCE, by = "rownames", exactMatch = TRUE, removeNA = FALSE, errorOnNoMatch = TRUE, warningOnPartialMatch = TRUE )
features |
Character vector of feature names to find in the rows of
|
inSCE |
A data.frame, matrix, or SingleCellExperiment object to search. |
by |
Character. Where to search for features in |
exactMatch |
Boolean. Whether to only identify exact matches
or to identify partial matches using |
removeNA |
Boolean. If set to |
errorOnNoMatch |
Boolean. If |
warningOnPartialMatch |
Boolean. If |
A vector of row indices for the matching features in inSCE.
Yusuke Koga, Joshua D. Campbell
'retrieveFeatureInfo' from package 'scater' and regex for how to use regular expressions when exactMatch = FALSE.
data(scExample)
ix <- featureIndex(features = c("MT-CYB", "MT-ND2"), inSCE = sce, by = "feature_name")
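The difference between exact and partial matching can be sketched on a plain character vector with base R's match and grep. The helper `find_feature_rows` and its argument names are hypothetical; it leaves out the error, warning and NA-removal options of featureIndex.

```r
# Hypothetical helper: row positions of features in a vector of names.
find_feature_rows <- function(features, search_in, exact_match = TRUE) {
  if (exact_match) {
    # One index per requested feature; NA where nothing matches
    match(features, search_in)
  } else {
    # Each feature is treated as a regular expression and may hit many rows
    unique(unlist(lapply(features, function(f) grep(f, search_in))))
  }
}

gene_names <- c("MT-CYB", "MT-ND2", "ACTB", "GAPDH")
exact_ix <- find_feature_rows(c("MT-ND2", "ACTB"), gene_names)
partial_ix <- find_feature_rows("MT-", gene_names, exact_match = FALSE)
print(exact_ix)
print(partial_ix)
```

Exact matching preserves the order and length of the query, whereas partial matching can return more or fewer indices than the number of features requested.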
Generate HTAN manifest file for droplet and cell count data
generateHTANMeta( dropletSCE = NULL, cellSCE = NULL, samplename, htan_biospecimen_id, dir, dataType = c("Droplet", "Cell", "Both") )
dropletSCE |
A SingleCellExperiment object containing droplet count matrix data |
cellSCE |
A SingleCellExperiment object containing cell count matrix data |
samplename |
The sample name of the SingleCellExperiment objects |
htan_biospecimen_id |
The HTAN biospecimen id of the sample in SingleCellExperiment object |
dir |
The output directory of the SCTK QC pipeline. |
dataType |
Type of the input data. It can be one of "Droplet", "Cell" or "Both". |
A SingleCellExperiment object which combines all objects in sceList. The colData is merged.
Generate HTAN manifest file for droplet and cell count data
generateMeta( dropletSCE = NULL, cellSCE = NULL, samplename, dir, HTAN = TRUE, dataType = c("Droplet", "Cell", "Both") )
dropletSCE |
A SingleCellExperiment object containing droplet count matrix data |
cellSCE |
A SingleCellExperiment object containing cell count matrix data |
samplename |
The sample name of the SingleCellExperiment objects |
dir |
The output directory of the SCTK QC pipeline. |
HTAN |
Whether to generate a manifest file including HTAN-specific IDs (HTAN Biospecimen ID, HTAN parent file ID and HTAN patient ID). Default is TRUE. |
dataType |
Type of the input data. It can be one of "Droplet", "Cell" or "Both". |
A SingleCellExperiment object which combines all objects in sceList. The colData is merged.
Generates a single simulated dataset, bootstrapping from the input counts matrix.
generateSimulatedData(totalReads, cells, originalData, realLabels)
totalReads |
Numeric. The total number of reads in the simulated dataset, to be split between all simulated cells. |
cells |
Numeric. The number of virtual cells to simulate. |
originalData |
Matrix. The original raw read count matrix. When used within the Shiny app, this will be assay(SCEsetObject, "counts"). |
realLabels |
Factor. The condition labels for differential expression. If only two levels are present, a t-test is used by default. If more than two levels are present, ANOVA is used by default. |
A simulated counts matrix, the first row of which contains the 'true' labels for each virtual cell.
data("mouseBrainSubsetSCE")
res <- generateSimulatedData(
  totalReads = 1000, cells = 10,
  originalData = assay(mouseBrainSubsetSCE, "counts"),
  realLabels = colData(mouseBrainSubsetSCE)[, "level1class"])
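One way to picture bootstrapping from a counts matrix is sketched below in base R. This is an assumption about the general approach, not the package's code: cells are resampled with replacement, and each virtual cell's reads are drawn from a multinomial distribution using the resampled cell's gene proportions. The helper simulateCounts and the toy matrix are hypothetical.

```r
set.seed(1)

# Toy counts matrix: 4 genes x 3 cells (hypothetical data)
counts <- matrix(c(5, 0, 3, 2,
                   1, 4, 0, 5,
                   2, 2, 6, 0),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))
labels <- factor(c("a", "b", "a"))

# Hypothetical helper illustrating the idea
simulateCounts <- function(totalReads, cells, originalData, realLabels) {
  # Resample cells with replacement (the bootstrap step)
  picked <- sample(ncol(originalData), size = cells, replace = TRUE)
  # Split the read budget evenly between the virtual cells
  readsPerCell <- floor(totalReads / cells)
  sim <- vapply(picked, function(j) {
    prob <- originalData[, j] / sum(originalData[, j])
    as.numeric(stats::rmultinom(1, readsPerCell, prob))
  }, numeric(nrow(originalData)))
  list(labels = realLabels[picked], counts = sim)
}

res <- simulateCounts(totalReads = 1000, cells = 10,
                      originalData = counts, realLabels = labels)
dim(res$counts)
```

Unlike the function documented above, which places the labels in the first row of the returned matrix, this sketch returns them as a separate list element to keep the counts numeric.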
Given a list of genes and a SingleCellExperiment object, return the binary or continuous expression of the genes.
getBiomarker( inSCE, gene, binary = "Binary", useAssay = "counts", featureLocation = NULL, featureDisplay = NULL )
inSCE |
Input SingleCellExperiment object. |
gene |
gene list |
binary |
"Binary" for binary expression or "Continuous" for a gradient. Default: "Binary" |
useAssay |
Indicates which assay to use. The default is "counts". |
featureLocation |
Indicates which column of rowData to search for the genes. |
featureDisplay |
Indicates which column name of rowData to use to display feature for visualization. |
getBiomarker(): A data.frame of expression values
data("mouseBrainSubsetSCE")
getBiomarker(mouseBrainSubsetSCE, gene = "C1qa")
Users must first run runDEAnalysis(), or any of the functions wrapped by this generic function. Further filters can then be set on the result. A data.frame object, with the variables Gene, Log2_FC, Pvalue, and FDR, will be returned.
getDEGTopTable( inSCE, useResult, labelBy = S4Vectors::metadata(inSCE)$featureDisplay, onlyPos = FALSE, log2fcThreshold = 0.25, fdrThreshold = 0.05, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL )
inSCE |
SingleCellExperiment inherited object, with one of the singleCellTK DEG methods performed in advance. |
useResult |
character. A string specifying the |
labelBy |
A single character for a column of |
onlyPos |
logical. Whether to only fetch DEGs with a positive log2_FC
value. Default FALSE. |
log2fcThreshold |
numeric. Only fetch DEGs with the absolute values of
log2FC larger than this value. Default 0.25. |
fdrThreshold |
numeric. Only fetch DEGs with FDR value smaller than this
value. Default 0.05. |
minGroup1MeanExp |
numeric. Only fetch DEGs with mean expression in
group1 greater than this value. Default NULL. |
maxGroup2MeanExp |
numeric. Only fetch DEGs with mean expression in
group2 less than this value. Default NULL. |
minGroup1ExprPerc |
numeric. Only fetch DEGs expressed in greater than
this fraction of cells in group1. Default NULL. |
maxGroup2ExprPerc |
numeric. Only fetch DEGs expressed in less than this
fraction of cells in group2. Default NULL. |
A data.frame object of the top DEGs, with variables of Gene, Log2_FC, Pvalue, and FDR.
data("sceBatches")
sceBatches <- scaterlogNormCounts(sceBatches, "logcounts")
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runWilcox(sce.w, class = "cell_type",
                   classGroup1 = "alpha", classGroup2 = "beta",
                   groupName1 = "w.alpha", groupName2 = "w.beta",
                   analysisName = "w.aVSb")
getDEGTopTable(sce.w, "w.aVSb")
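The effect of the threshold arguments can be illustrated on a plain data.frame. The following base R sketch is not the package's code. It assumes a toy table with the Gene, Log2_FC, Pvalue, and FDR columns named above (all values are made up) and applies the log2 fold change and FDR cutoffs by hand.

```r
# Toy DEG table with the columns described above (values are made up)
deg <- data.frame(
  Gene    = c("g1", "g2", "g3", "g4"),
  Log2_FC = c(1.2, -0.8, 0.1, 0.6),
  Pvalue  = c(0.001, 0.002, 0.2, 0.04),
  FDR     = c(0.004, 0.008, 0.4, 0.08),
  stringsAsFactors = FALSE
)

log2fcThreshold <- 0.25
fdrThreshold <- 0.05
onlyPos <- FALSE

# Keep genes with a large enough absolute fold change and a small enough FDR
keep <- abs(deg$Log2_FC) > log2fcThreshold & deg$FDR < fdrThreshold
if (onlyPos) {
  # Optionally restrict to up-regulated genes
  keep <- keep & deg$Log2_FC > 0
}
top <- deg[keep, ]
print(top)
```

Setting onlyPos to TRUE in this sketch would additionally drop g2, whose fold change is negative.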
Get/Set diffAbundanceFET result table
getDiffAbundanceResults(x, analysisName)
## S4 method for signature 'SingleCellExperiment'
getDiffAbundanceResults(x, analysisName)
getDiffAbundanceResults(x, analysisName) <- value
## S4 replacement method for signature 'SingleCellExperiment'
getDiffAbundanceResults(x, analysisName) <- value
x |
A |
analysisName |
A single character string specifying an analysis
performed with |
value |
The output table of |
For the getter method, the differential abundance table. For the setter method, the SCE object updated with the new result.
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- diffAbundanceFET(inSCE = mouseBrainSubsetSCE,
                                        cluster = "tissue",
                                        variable = "level1class",
                                        case = "oligodendrocytes",
                                        control = "microglia",
                                        analysisName = "diffAbund")
result <- getDiffAbundanceResults(mouseBrainSubsetSCE, "diffAbund")
Get or Set EnrichR Result
getEnrichRResult(inSCE, analysisName) <- value
getEnrichRResult(inSCE, analysisName)
## S4 method for signature 'SingleCellExperiment'
getEnrichRResult(inSCE, analysisName)
## S4 replacement method for signature 'SingleCellExperiment'
getEnrichRResult(inSCE, analysisName) <- value
inSCE |
A SingleCellExperiment object. |
analysisName |
A string that identifies each specific analysis |
value |
The EnrichR result table |
For getter method, a data.frame of the EnrichR result. For setter method, inSCE with EnrichR results updated.
data("mouseBrainSubsetSCE")
if (Biobase::testBioCConnection()) {
  mouseBrainSubsetSCE <- runEnrichR(mouseBrainSubsetSCE, features = "Cmtm5",
                                    db = "GO_Cellular_Component_2017",
                                    analysisName = "analysis1")
  result <- getEnrichRResult(mouseBrainSubsetSCE, "analysis1")
}
Fetch the table of top markers that pass the filtering
getFindMarkerTopTable( inSCE, log2fcThreshold = 0, fdrThreshold = 0.05, minClustExprPerc = 0.5, maxCtrlExprPerc = 0.5, minMeanExpr = 0, topN = 1 )
findMarkerTopTable( inSCE, log2fcThreshold = 1, fdrThreshold = 0.05, minClustExprPerc = 0.7, maxCtrlExprPerc = 0.4, minMeanExpr = 1, topN = 10 )
inSCE |
SingleCellExperiment inherited object. |
log2fcThreshold |
Only use DEGs with the absolute values of log2FC
larger than this value. Default |
fdrThreshold |
Only use DEGs with FDR value smaller than this value.
Default |
minClustExprPerc |
A numeric scalar. The minimum cutoff of the
percentage of cells in the cluster of interest that expressed the marker
gene. Default |
maxCtrlExprPerc |
A numeric scalar. The maximum cutoff of the
percentage of cells out of the cluster (control group) that expressed the
marker gene. Default |
minMeanExpr |
A numeric scalar. The minimum cutoff of the mean
expression value of the marker in the cluster of interest. Default |
topN |
An integer. Fetch at most this number of top markers for each
cluster, in terms of log2FC value. Use |
Users have to run runFindMarker prior to using this function to extract a top marker table.
An organized data.frame object, with the top marker gene information.
runFindMarker, plotFindMarkerHeatmap
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runFindMarker(mouseBrainSubsetSCE,
                                     useAssay = "logcounts",
                                     cluster = "level1class")
getFindMarkerTopTable(mouseBrainSubsetSCE)
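The "top N per cluster" idea can be sketched in base R as follows. This is an illustration under assumptions rather than the package's implementation: the marker table, its cluster column, and its values are hypothetical, and only an FDR cutoff is applied before ranking by log2 fold change.

```r
# Hypothetical marker table (column names and values are made up)
markers <- data.frame(
  Gene    = c("a", "b", "c", "d", "e"),
  Log2_FC = c(2.0, 1.5, 0.5, 3.0, 1.0),
  FDR     = c(0.01, 0.01, 0.01, 0.02, 0.2),
  cluster = c("c1", "c1", "c1", "c2", "c2"),
  stringsAsFactors = FALSE
)

topN <- 2
fdrThreshold <- 0.05

# Drop markers that fail the FDR cutoff
passed <- markers[markers$FDR < fdrThreshold, ]

# Within each cluster, rank by log2 fold change and keep at most topN rows
byCluster <- split(passed, passed$cluster)
topList <- lapply(byCluster, function(df) {
  df <- df[order(df$Log2_FC, decreasing = TRUE), ]
  head(df, topN)
})
topTable <- do.call(rbind, topList)
print(topTable)
```

Because topN is an upper bound, a cluster with fewer passing markers (c2 here) simply contributes fewer rows.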
List geneset names from geneSetCollection
getGenesetNamesFromCollection(inSCE, geneSetCollectionName)
inSCE |
Input SingleCellExperiment object. |
geneSetCollectionName |
The name of an imported geneSetCollection. |
A character vector of available genesets from the collection.
Returns a data.frame that shows MSigDB categories and subcategories as well as descriptions for each. The entries in the ID column in this table can be used as input for importGeneSetsFromMSigDB.
getMSigDBTable()
data.frame, containing MSigDB categories
Joshua D. Campbell
importGeneSetsFromMSigDB for importing MSigDB gene sets.
getMSigDBTable()
List pathway analysis result names
getPathwayResultNames(inSCE, stopIfNone = FALSE, verbose = FALSE)
inSCE |
Input SingleCellExperiment object. |
stopIfNone |
Whether to stop and raise an error if no results found. If
|
verbose |
Show warning if no result found. Default FALSE. |
Pathway analysis results will be stored as matrices in the reducedDims slot of inSCE. This function lists the result names stored in the metadata slot when the analysis is performed.
A character vector of valid pathway analysis result names.
data(scExample)
getPathwayResultNames(sce)
Stores and returns table of QC metrics generated from QC algorithms within the metadata slot of the SingleCellExperiment object.
getSampleSummaryStatsTable(inSCE, statsName, ...)
setSampleSummaryStatsTable(inSCE, statsName, ...) <- value
## S4 method for signature 'SingleCellExperiment'
getSampleSummaryStatsTable(inSCE, statsName, ...)
## S4 replacement method for signature 'SingleCellExperiment'
setSampleSummaryStatsTable(inSCE, statsName, ...) <- value
inSCE |
Input SingleCellExperiment object with saved assay data and/or colData data. Required. |
statsName |
A |
... |
Other arguments passed to the function. |
value |
The summary table for QC statistics generated from SingleCellTK to be added to the SCE object. |
For getSampleSummaryStatsTable, a matrix/array object containing a summary table for QC statistics generated from SingleCellTK. For setSampleSummaryStatsTable<-, a SingleCellExperiment object where the summary table is updated in the metadata slot.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- sampleSummaryStats(sce, simple = TRUE, statsName = "qc_table")
getSampleSummaryStatsTable(sce, statsName = "qc_table")
Extract QC parameters from the SingleCellExperiment object
getSceParams( inSCE, skip = c("runScrublet", "runDecontX", "runBarcodeRanksMetaOutput", "genesets", "runSoupX"), ignore = c("algorithms", "estimates", "contamination", "z", "sample", "rank", "BPPARAM", "batch", "geneSetCollection", "barcodeArgs"), directory = "./", samplename = "", writeYAML = TRUE )
inSCE |
A SingleCellExperiment object. |
skip |
Skip extracting the parameters of the provided QC functions. |
ignore |
Skip extracting the content within QC functions. |
directory |
The output directory of the SCTK_runQC.R pipeline. |
samplename |
The sample name of the SingleCellExperiment objects. |
writeYAML |
Whether to output a yaml file to store the parameters. Default is TRUE. If FALSE, a character object is returned. |
If writeYAML is TRUE, a yaml object will be generated. If FALSE, a character object is returned.
Get variable feature names after running runSeuratFindHVG function
getSeuratVariableFeatures(inSCE)
inSCE |
Input |
A list of variable feature names.
S4 method for getting and setting SoupX results that cannot be appended to either rowData(inSCE) or colData(inSCE).
getSoupX(inSCE, sampleID, background = FALSE) <- value
getSoupX(inSCE, sampleID = NULL, background = FALSE)
## S4 method for signature 'SingleCellExperiment'
getSoupX(inSCE, sampleID = NULL, background = FALSE)
## S4 replacement method for signature 'SingleCellExperiment'
getSoupX(inSCE, sampleID, background = FALSE) <- value
inSCE |
A SingleCellExperiment object. For getter method,
|
sampleID |
Character vector. For getter method, the samples that should
be included in the returned list. Leave this |
background |
Logical. Whether |
value |
Dedicated list object of SoupX results. |
For getter method, a list with SoupX results for specified samples. For setter method, inSCE with SoupX results updated.
runSoupX, plotSoupXResults
## Not run:
sce <- importExampleData("pbmc3k")
sce <- runSoupX(sce, sample = "sample")
soupXResults <- getSoupX(sce)
## End(Not run)
Extracts or selects the top variable genes from an input SingleCellExperiment object. Note that the variability metrics must be computed using the runFeatureSelection method before extracting the feature names of the top variable features. getTopHVG only returns a character vector of the HVG selection, while with setTopHVG, a logical vector of the selection will be saved in the rowData, and optionally, a subset object for the HVGs can be stored in the altExps slot at the same time.
getTopHVG( inSCE, method = c("vst", "dispersion", "mean.var.plot", "modelGeneVar", "seurat", "seurat_v3", "cell_ranger"), hvgNumber = 2000, useFeatureSubset = "hvf", featureDisplay = metadata(inSCE)$featureDisplay )
setTopHVG( inSCE, method = c("vst", "dispersion", "mean.var.plot", "modelGeneVar", "seurat", "seurat_v3", "cell_ranger"), hvgNumber = 2000, featureSubsetName = "hvg2000", genes = NULL, genesBy = NULL, altExp = FALSE )
inSCE |
Input SingleCellExperiment object |
method |
Specify which method to use for variable gene extraction
from Seurat |
hvgNumber |
Specify the number of top variable genes to extract. |
useFeatureSubset |
Get the feature names in the HVG list set by
|
featureDisplay |
A character string for the |
featureSubsetName |
A character string for the |
genes |
A customized character vector of gene list to be set as a
|
genesBy |
If setting customized |
altExp |
|
getTopHVG |
A character vector of the top |
setTopHVG |
The input |
Irzam Sarfraz, Yichen Wang
runFeatureSelection, runSeuratFindHVG, runModelGeneVar, plotTopHVG
data("scExample", package = "singleCellTK")
# Create a "highly variable feature" subset using Seurat's vst method:
sce <- runSeuratFindHVG(sce, method = "vst", hvgNumber = 2000,
                        createFeatureSubset = "hvf")
# Get the list of genes for a feature subset:
hvgs <- getTopHVG(sce, useFeatureSubset = "hvf")
# Create a new feature subset on the fly without rerunning the algorithm:
sce <- setTopHVG(sce, method = "vst", hvgNumber = 100,
                 featureSubsetName = "hvf100")
hvgs <- getTopHVG(sce, useFeatureSubset = "hvf100")
# Get a list of variable features without creating a new feature subset:
hvgs <- getTopHVG(sce, useFeatureSubset = NULL, method = "vst",
                  hvgNumber = 10)
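The distinction between returning a character vector of names and storing a logical selection can be sketched in base R. The example below is only an illustration: it uses plain per-feature variance as a stand-in for the variability metrics computed by the real methods, and the expression matrix is made up.

```r
# Toy expression matrix: 5 features x 4 cells (hypothetical values)
expr <- rbind(
  gene1 = c(1, 1, 1, 1),
  gene2 = c(0, 2, 0, 2),
  gene3 = c(0, 10, 0, 10),
  gene4 = c(3, 3, 4, 3),
  gene5 = c(0, 5, 0, 5)
)

# Stand-in variability metric: plain per-feature variance
metric <- apply(expr, 1, var)

hvgNumber <- 2

# Character vector of the top features, as a getter might return
topNames <- names(sort(metric, decreasing = TRUE))[seq_len(hvgNumber)]

# Logical vector flagging the selection, as could be kept alongside
# the per-feature annotations
isHVG <- rownames(expr) %in% topNames

print(topNames)
print(isHVG)
```

Keeping the selection as a logical vector aligned with the rows means several feature subsets of different sizes can coexist without recomputing the metric.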
SCTK allows users to access all TSCAN-related results with "getTSCANResults". See details.
getTSCANResults(x, analysisName = NULL, pathName = NULL)
## S4 method for signature 'SingleCellExperiment'
getTSCANResults(x, analysisName = NULL, pathName = NULL)
getTSCANResults(x, analysisName, pathName = NULL) <- value
## S4 replacement method for signature 'SingleCellExperiment'
getTSCANResults(x, analysisName, pathName = NULL) <- value
listTSCANResults(x)
## S4 method for signature 'SingleCellExperiment'
listTSCANResults(x)
listTSCANTerminalNodes(x)
## S4 method for signature 'SingleCellExperiment'
listTSCANTerminalNodes(x)
x |
Input SingleCellExperiment object. |
analysisName |
Algorithm name implemented, should be one of
|
pathName |
Sub folder name within the |
value |
Value to be stored within the |
When analysisName = "Pseudotime", returns the list result from runTSCAN, including the MST structure.
When analysisName = "DEG", returns the list result from runTSCANDEG, including DataFrames containing genes that increase/decrease along each of the pseudotime paths. pathName indicates the path index, the available options of which can be listed by listTSCANTerminalNodes.
When analysisName = "ClusterDEAnalysis", returns the list result from runTSCANClusterDEAnalysis. Here pathName needs to match the useCluster argument used when running the algorithm.
Get or set TSCAN results
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE,
                                useReducedDim = "PCA_logcounts")
results <- getTSCANResults(mouseBrainSubsetSCE, "Pseudotime")
Construct SCE object from Salmon-Alevin output
importAlevin( alevinDir = NULL, sampleName = "sample", delayedArray = FALSE, class = c("Matrix", "matrix"), rowNamesDedup = TRUE )
alevinDir |
Character. The output directory of the Salmon-Alevin pipeline.
It should contain a subfolder named 'alevin', which contains the count data
stored in 'quants_mat.gz'. Default NULL. |
sampleName |
Character. A user-defined sample name for the sample to be imported. The 'sampleName' will be prepended to the cell barcodes. Default is 'sample'. |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
A SingleCellExperiment object containing the count matrix, the feature annotations, and the cell annotation (which includes QC metrics stored in 'featureDump.txt').
This function reads in one or more Python AnnData files in the .h5ad format and returns a single SingleCellExperiment object containing all the AnnData samples by concatenating their counts matrices and related information slots.
importAnnData( sampleDirs = NULL, sampleNames = NULL, delayedArray = FALSE, class = c("Matrix", "matrix"), rowNamesDedup = TRUE )
sampleDirs |
Folder containing the .h5ad file. Can be one of -
|
sampleNames |
The prefix/name of the .h5ad file without the .h5ad extension
e.g. if 'sample.h5ad' is the filename, pass
|
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object. Default FALSE. |
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of "Matrix" (as returned by
readMM function), or "matrix" (as returned by
matrix function). Default "Matrix". |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
importAnnData converts scRNA-seq data in the AnnData format to the SingleCellExperiment object. The .X slot in AnnData is transposed to the features x cells format and becomes the 'counts' matrix in the assay slot. The .var AnnData slot becomes the SCE rowData and the .obs AnnData slot becomes the SCE colData. Multidimensional data in the .obsm AnnData slot is ported over to the SCE reducedDims slot. Additionally, unstructured data in the .uns AnnData slot is available through the SCE metadata slot.
There are 2 currently known minor issues:
The AnnData Python module depends on another Python module, h5py, to read HDF5 format files. If there are errors reading the .h5ad files, such as "ValueError: invalid shape in fixed-type tuple.", the user will need to downgrade h5py by running pip3 install --user h5py==2.9.0
Additionally, there might be errors in converting some Python objects in the unstructured data slots. There are no known R solutions at present. Refer to https://github.com/rstudio/reticulate/issues/209
A SingleCellExperiment object.
file.path <- system.file("extdata/annData_pbmc_3k", package = "singleCellTK")
## Not run:
sce <- importAnnData(sampleDirs = file.path,
                     sampleNames = 'pbmc3k_20by20')
## End(Not run)
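The orientation change described in the details can be pictured with a small base R sketch. It does not read an .h5ad file and is not how importAnnData is implemented. It only mimics the layout with made-up objects: X stands for the cells x features matrix, while obsDf and varDf are hypothetical stand-ins for the cell and feature annotation tables.

```r
# AnnData stores X as cells x features (toy values)
X <- matrix(1:6, nrow = 2,
            dimnames = list(c("cellA", "cellB"), c("g1", "g2", "g3")))

# Hypothetical annotation tables: one row per cell and one row per feature
obsDf <- data.frame(row.names = rownames(X), cluster = c("c1", "c2"))
varDf <- data.frame(row.names = colnames(X), symbol = c("G1", "G2", "G3"))

# Transpose to the features x cells layout used for the 'counts' assay
counts <- t(X)

# After transposing, the feature table annotates the rows and the
# cell table annotates the columns
stopifnot(identical(rownames(counts), rownames(varDf)),
          identical(colnames(counts), rownames(obsDf)))
dim(counts)
```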
Read the barcodes, features (genes), and matrix from BUStools output. Import them as one SingleCellExperiment object. Note the cells in the output files for BUStools 0.39.4 are not filtered.
importBUStools( BUStoolsDirs, samples, matrixFileNames = "genes.mtx", featuresFileNames = "genes.genes.txt", barcodesFileNames = "genes.barcodes.txt", gzipped = "auto", class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
BUStoolsDirs |
A vector of paths to BUStools output files. Each sample
should have its own path. For example: |
samples |
A vector of user-defined sample names for the samples to be
imported. Must have the same length as |
matrixFileNames |
Filenames for the Market Exchange Format (MEX) sparse
matrix files (.mtx files). Must have length 1 or the same
length as |
featuresFileNames |
Filenames for the feature annotation files.
Must have length 1 or the same length as |
barcodesFileNames |
Filenames for the cell barcode list file.
Must have length 1 or the same length as |
gzipped |
Boolean. |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray-class object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
A SingleCellExperiment object containing the count matrix, the gene annotation, and the cell annotation.
# Example #1
# FASTQ files were downloaded from
# https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0
# /pbmc_1k_v3
# They were concatenated as follows:
# cat pbmc_1k_v3_S1_L001_R1_001.fastq.gz pbmc_1k_v3_S1_L002_R1_001.fastq.gz >
# pbmc_1k_v3_R1.fastq.gz
# cat pbmc_1k_v3_S1_L001_R2_001.fastq.gz pbmc_1k_v3_S1_L002_R2_001.fastq.gz >
# pbmc_1k_v3_R2.fastq.gz
# The following BUStools command generates the gene, cell, and
# matrix files
# bustools correct -w ./3M-february-2018.txt -p output.bus | \
# bustools sort -T tmp/ -t 4 -p - | \
# bustools count -o genecount/genes \
# -g ./transcripts_to_genes.txt \
# -e matrix.ec \
# -t transcripts.txt \
# --genecounts -
# The top 20 genes and the first 20 cells are included in this example.
sce <- importBUStools(
  BUStoolsDirs = system.file("extdata/BUStools_PBMC_1k_v3_20x20/genecount/",
                             package = "singleCellTK"),
  samples = "PBMC_1k_v3_20x20")
Read the filtered barcodes, features, and matrices for all samples from (preferably a single run of) Cell Ranger output. Import and combine them as one big SingleCellExperiment object.
importCellRanger( cellRangerDirs = NULL, sampleDirs = NULL, sampleNames = NULL, cellRangerOuts = NULL, dataType = c("filtered", "raw"), matrixFileNames = "matrix.mtx.gz", featuresFileNames = "features.tsv.gz", barcodesFileNames = "barcodes.tsv.gz", gzipped = "auto", class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
importCellRangerV2( cellRangerDirs = NULL, sampleDirs = NULL, sampleNames = NULL, dataTypeV2 = c("filtered", "raw"), class = c("Matrix", "matrix"), delayedArray = FALSE, reference = NULL, cellRangerOutsV2 = NULL, rowNamesDedup = TRUE )
importCellRangerV3( cellRangerDirs = NULL, sampleDirs = NULL, sampleNames = NULL, dataType = c("filtered", "raw"), class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
cellRangerDirs |
The root directories where Cell Ranger was run. These
folders should contain sample specific folders. Default |
sampleDirs |
Default
The cells in the final SCE object will be ordered in the same order of
|
sampleNames |
A vector of user-defined sample names for the samples
to be
imported. Must have the same length as |
cellRangerOuts |
Character vector. The intermediate
paths to filtered or raw cell barcode, feature, and matrix files
for each sample. Supersedes |
dataType |
Character. The type of data to import. Can be one of
"filtered" (which is equivalent to
|
matrixFileNames |
Character vector. Filenames for the Market Exchange
Format (MEX) sparse matrix files (matrix.mtx or matrix.mtx.gz files).
Must have length 1 or the same
length as |
featuresFileNames |
Character vector. Filenames for the feature
annotation files. They are usually named features.tsv.gz or
genes.tsv. Must have length 1 or the same
length as |
barcodesFileNames |
Character vector. Filenames for the cell barcode
list files. They are usually named barcodes.tsv.gz or
barcodes.tsv. Must have length 1 or the same
length as |
gzipped |
|
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of "Matrix" (as returned by
readMM function), or "matrix" (as returned by
matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
dataTypeV2 |
Character. The type of output to import for
Cell Ranger versions below 3.0.0. Whether to import the filtered or the
raw data. Can be one of 'filtered' or 'raw'. Default 'filtered'. When
|
reference |
Character vector. The reference genome names.
Default NULL. |
cellRangerOutsV2 |
Character vector. The intermediate paths
to filtered or raw cell barcode, feature, and matrix files for each
sample for Cell Ranger versions below 3.0.0. If |
importCellRangerV2 imports output from Cell Ranger V2.
importCellRangerV2Sample imports output from one sample from Cell Ranger V2.
importCellRangerV3 imports output from Cell Ranger V3.
importCellRangerV3Sample imports output from one sample from Cell Ranger V3.
These four functions make implicit assumptions that match the output structure of Cell Ranger V2 and V3, including the values of cellRangerOuts, matrixFileNames, featuresFileNames, barcodesFileNames, and gzipped.
Alternatively, users can call importCellRanger to specify these arguments explicitly.
A SingleCellExperiment object containing the combined count matrix, the feature annotations, and the cell annotation.
# Example #1
# The following filtered feature, cell, and matrix files were downloaded from
# https://support.10xgenomics.com/single-cell-gene-expression/datasets/
# 3.0.0/hgmm_1k_v3
# The top 10 hg19 & mm10 genes are included in this example.
# Only the first 20 cells are included.
sce <- importCellRanger(
    cellRangerDirs = system.file("extdata/", package = "singleCellTK"),
    sampleDirs = "hgmm_1k_v3_20x20",
    sampleNames = "hgmm1kv3",
    dataType = "filtered")
# The following filtered feature, cell, and matrix files were downloaded from
# https://support.10xgenomics.com/single-cell-gene-expression/datasets/
# 2.1.0/pbmc4k
# Top 20 genes are kept. 20 cell barcodes are extracted.
sce <- importCellRangerV2(
    cellRangerDirs = system.file("extdata/", package = "singleCellTK"),
    sampleDirs = "pbmc_4k_v2_20x20",
    sampleNames = "pbmc4k_20",
    reference = 'GRCh38',
    dataTypeV2 = "filtered")
sce <- importCellRangerV3(
    cellRangerDirs = system.file("extdata/", package = "singleCellTK"),
    sampleDirs = "hgmm_1k_v3_20x20",
    sampleNames = "hgmm1kv3",
    dataType = "filtered")
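The argument descriptions above imply that each sample's files are located by joining a root directory, a sample folder, an intermediate output path, and a file name, with length-1 arguments reused for every sample. A minimal base R sketch of that idea follows. The helper `buildMexPaths` and the root directory `/data/run1` are hypothetical and not part of singleCellTK; the intermediate path `outs/filtered_feature_bc_matrix` is taken from the importCellRangerV3Sample example.

```r
# Hypothetical helper (not a singleCellTK function): builds one matrix file
# path per sample from the root, sample, and intermediate directories.
buildMexPaths <- function(cellRangerDirs, sampleDirs, cellRangerOuts,
                          matrixFileNames = "matrix.mtx.gz") {
  n <- length(sampleDirs)
  # Arguments documented as "length 1 or the same length as the samples"
  # are recycled to one value per sample; other lengths are rejected.
  recycle <- function(x, name) {
    if (length(x) == 1L) return(rep(x, n))
    if (length(x) != n) stop(name, " must have length 1 or ", n)
    x
  }
  file.path(recycle(cellRangerDirs, "cellRangerDirs"),
            sampleDirs,
            recycle(cellRangerOuts, "cellRangerOuts"),
            recycle(matrixFileNames, "matrixFileNames"))
}

paths <- buildMexPaths(
  cellRangerDirs = "/data/run1",                       # hypothetical root
  sampleDirs = c("sampleA", "sampleB"),                # hypothetical samples
  cellRangerOuts = "outs/filtered_feature_bc_matrix")
print(paths)
```

Whether the package resolves paths in exactly this way is an assumption; the sketch only illustrates the length-1-or-per-sample convention shared by the file name arguments.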
Read the filtered barcodes, features, and matrices for all samples from Cell Ranger V2 output. Files are assumed to be named "matrix.mtx", "genes.tsv", and "barcodes.tsv".
importCellRangerV2Sample( dataDir = NULL, sampleName = NULL, class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
dataDir |
A path to the directory containing the data files. Default "./". |
sampleName |
A user-defined sample name. This will be prepended to all cell barcode IDs. Default "sample". |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
A SingleCellExperiment object containing the count matrix, the feature annotations, and the cell annotation for the sample.
sce <- importCellRangerV2Sample( dataDir = system.file("extdata/pbmc_4k_v2_20x20/outs/", "filtered_gene_bc_matrices/GRCh38", package = "singleCellTK"), sampleName = "pbmc4k_20")
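To make the V2 file layout concrete, the sketch below writes a tiny matrix.mtx, genes.tsv, and barcodes.tsv to a temporary folder and reads them back with Matrix::readMM, which the class argument above refers to. It is an illustration only, not the package's implementation: the gene IDs and barcodes are invented, and joining the sample name and barcode with "_" is an assumption about how the prefix is applied.

```r
# Assumes the Matrix package (shipped with standard R installations).
dataDir <- file.path(tempdir(), "mex_sketch")
dir.create(dataDir, showWarnings = FALSE)

# Write a toy 3-gene x 2-cell data set using the V2 file names.
counts <- Matrix::sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2),
                               x = c(5, 1, 2), dims = c(3, 2))
Matrix::writeMM(counts, file.path(dataDir, "matrix.mtx"))
writeLines(c("ENSG_A\tGeneA", "ENSG_B\tGeneB", "ENSG_C\tGeneC"),
           file.path(dataDir, "genes.tsv"))
writeLines(c("AAAC-1", "AAAG-1"), file.path(dataDir, "barcodes.tsv"))

# Read the three files back and label the matrix.
mat <- Matrix::readMM(file.path(dataDir, "matrix.mtx"))
genes <- read.delim(file.path(dataDir, "genes.tsv"), header = FALSE,
                    stringsAsFactors = FALSE)
barcodes <- readLines(file.path(dataDir, "barcodes.tsv"))

sampleName <- "pbmc4k_20"
rownames(mat) <- genes[[1]]
# Assumption: the sample name is prepended with an underscore separator.
colnames(mat) <- paste(sampleName, barcodes, sep = "_")

# class = "matrix" corresponds to a dense base R matrix.
dense <- as.matrix(mat)
print(dense)
```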
Read the filtered barcodes, features, and matrices for all samples from Cell Ranger V3 output. Files are assumed to be named "matrix.mtx.gz", "features.tsv.gz", and "barcodes.tsv.gz".
importCellRangerV3Sample( dataDir = "./", sampleName = "sample", class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
dataDir |
A path to the directory containing the data files. Default "./". |
sampleName |
A user-defined sample name. This will be prepended to all cell barcode IDs. Default "sample". |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
A SingleCellExperiment object containing the count matrix, the feature annotations, and the cell annotation for the sample.
sce <- importCellRangerV3Sample( dataDir = system.file("extdata/hgmm_1k_v3_20x20/outs/", "filtered_feature_bc_matrix", package = "singleCellTK"), sampleName = "hgmm1kv3")
Imports the RDS file created by DropEst (https://github.com/hms-dbmi/dropEst) and creates a SingleCellExperiment object from either the raw or filtered counts matrix. Additionally parses the RDS to obtain appropriate feature annotations as SCE colData, in addition to any metadata.
importDropEst( sampleDirs = NULL, dataType = c("filtered", "raw"), rdsFileName = "cell.counts", sampleNames = NULL, delayedArray = FALSE, class = c("Matrix", "matrix"), rowNamesDedup = TRUE )
sampleDirs |
A path to the directory containing the data files. Default "./". |
dataType |
Can be "filtered" or "raw". Default "filtered". |
rdsFileName |
File name prefix of the DropEst RDS output. Default "cell.counts". |
sampleNames |
A user-defined sample name. This will be prepended to all cell barcode IDs. Default "sample". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of "Matrix" (as returned by
readMM function), or "matrix" (as returned by
matrix function). Default "Matrix". |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
importDropEst expects either the raw counts matrix stored as "cm_raw" or the filtered counts matrix stored as "cm" in the DropEst RDS output.
ColData is obtained from the DropEst fields "mean_reads_per_umi", "aligned_reads_per_cell", "aligned_umis_per_cell", "requested_umis_per_cb", and "requested_reads_per_cb".
If the filtered counts matrix is used, the colData data frame is subset to contain only the features from the filtered counts matrix.
If any of the annotations "saturation_info", "merge_targets", or "reads_per_umi_per_cell" are found in the DropEst RDS, they are added to the SCE metadata field.
A SingleCellExperiment object containing the count matrix, the feature annotations from DropEst as ColData, and any metadata from DropEst.
# Example results were generated as per instructions from the developers of dropEst described in
# https://github.com/hms-dbmi/dropEst/blob/master/examples/EXAMPLES.md
sce <- importDropEst(sampleDirs = system.file("extdata/dropEst_scg71", package = "singleCellTK"),
                     sampleNames = 'scg71')
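The selection logic described for importDropEst ("cm" for filtered counts, "cm_raw" for raw counts, per-cell fields subset to the chosen matrix) can be sketched in base R on a mock RDS file. Only the element names come from the documentation above. The values, the helper `readDropEstSketch`, and the file name "cell.counts.rds" (prefix plus an assumed ".rds" extension) are illustrative assumptions.

```r
# Mock of the list stored in a dropEst RDS file; all values are invented.
cmRaw <- matrix(c(3, 0, 1, 0, 2, 0), nrow = 2,
                dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3")))
dropEstOut <- list(
  cm_raw = cmRaw,
  cm = cmRaw[, c("c1", "c3"), drop = FALSE],   # filtered: two cells kept
  mean_reads_per_umi = c(c1 = 1.5, c2 = 1.0, c3 = 2.0),
  aligned_umis_per_cell = c(c1 = 4, c2 = 2, c3 = 1)
)
rdsPath <- file.path(tempdir(), "cell.counts.rds")  # assumed file name
saveRDS(dropEstOut, rdsPath)

# Hypothetical helper, not the package's implementation.
readDropEstSketch <- function(path, dataType = c("filtered", "raw")) {
  dataType <- match.arg(dataType)
  obj <- readRDS(path)
  counts <- if (dataType == "filtered") obj[["cm"]] else obj[["cm_raw"]]
  perCell <- intersect(c("mean_reads_per_umi", "aligned_umis_per_cell"),
                       names(obj))
  # Keep only the cells present in the selected counts matrix.
  cellData <- data.frame(
    lapply(obj[perCell], function(v) unname(v[colnames(counts)])),
    row.names = colnames(counts))
  list(counts = counts, colData = cellData)
}

filtered <- readDropEstSketch(rdsPath, "filtered")
raw <- readDropEstSketch(rdsPath, "raw")
print(filtered$colData)
```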
Retrieves published example datasets stored in SingleCellExperiment using the scRNAseq and TENxPBMCData packages. See 'Details' for a list of available datasets.
importExampleData( dataset, class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
dataset |
Character. Name of the dataset to retrieve. |
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
See the list below for the available datasets and their descriptions.
Retrieved with ReprocessedFluidigmData. Returns a dataset of 65 human neural cells from Pollen et al. (2014), each sequenced at high and low coverage (SRA accession SRP041736).
Retrieved with ReprocessedAllenData. Returns a dataset of 379 mouse brain cells from Tasic et al. (2016).
Retrieved with NestorowaHSCData. Returns a dataset of 1920 mouse haematopoietic stem cells from Nestorowa et al. (2015).
Retrieved with TENxPBMCData. 2,700 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
Retrieved with TENxPBMCData. 4,340 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
Retrieved with TENxPBMCData. 5,419 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
Retrieved with TENxPBMCData. 8,381 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
Retrieved with TENxPBMCData. 33,148 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
Retrieved with TENxPBMCData. 68,579 peripheral blood mononuclear cells (PBMCs) from 10X Genomics.
The specified SingleCellExperiment object.
Joshua D. Campbell, David Jenkins
sce <- importExampleData("pbmc3k")
Create a SingleCellExperiment object from files
importFromFiles( assayFile, annotFile = NULL, featureFile = NULL, assayName = "counts", inputDataFrames = FALSE, class = c("Matrix", "matrix"), delayedArray = FALSE, annotFileHeader = FALSE, annotFileRowName = 1, annotFileSep = "\t", featureHeader = FALSE, featureRowName = 1, featureSep = "\t", gzipped = "auto", rowNamesDedup = TRUE )
assayFile |
The path to a file in .mtx, .txt, .csv, .tab, or .tsv format. |
annotFile |
The path to a text file that contains columns of annotation
information for each cell in the |
featureFile |
The path to a text file that contains columns of
annotation information for each gene in the count matrix. This file should
have the same genes in the same order as |
assayName |
The name of the assay that you are uploading. The default
is "counts". |
inputDataFrames |
If |
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
annotFileHeader |
Whether there's a header (colnames) in the cell
annotation file. Default is FALSE. |
annotFileRowName |
Which column is used as the rownames for the cell
annotation file. This should match to the colnames of the |
annotFileSep |
Separator used for the cell annotation file. Default is
"\t". |
featureHeader |
Whether there's a header (colnames) in the feature
annotation file. Default is FALSE. |
featureRowName |
Which column is used as the rownames for the feature
annotation file. This should match to the rownames of the |
featureSep |
Separator used for the feature annotation file. Default is
"\t". |
gzipped |
Whether the input file is gzipped. Default is "auto". |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
Creates a SingleCellExperiment object from a counts file in various formats, and files of cell and feature annotation.
A SingleCellExperiment object.
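The following base R sketch illustrates the kind of input files the arguments above describe: a counts file plus a tab-separated, header-less cell annotation file whose first column matches the column names of the counts. It does not call importFromFiles; the file names and values are invented, and whether the function expects exactly this CSV layout is an assumption to verify against the package.

```r
tmp <- tempdir()
assayFile <- file.path(tmp, "counts.csv")      # hypothetical file names
annotFile <- file.path(tmp, "cell_annot.tsv")

# Toy counts: 3 genes (rows) x 2 cells (columns).
counts <- matrix(c(1, 0, 4, 2, 0, 3), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2")))
write.csv(counts, assayFile, quote = FALSE)

# Cell annotation without a header and with tab separators, mirroring the
# defaults annotFileHeader = FALSE, annotFileRowName = 1, annotFileSep = "\t".
annot <- data.frame(barcode = c("cell1", "cell2"), cluster = c("T", "B"))
write.table(annot, annotFile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read both files back and confirm the annotation rows line up with the
# columns of the counts matrix.
countsIn <- as.matrix(read.csv(assayFile, row.names = 1, check.names = FALSE))
annotIn <- read.table(annotFile, sep = "\t", header = FALSE, row.names = 1,
                      stringsAsFactors = FALSE)
stopifnot(identical(rownames(annotIn), colnames(countsIn)))
print(countsIn)
```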
Converts a list of gene sets stored in a GeneSetCollection object and stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromCollection( inSCE, geneSetCollection, collectionName = "GeneSetCollection", by = "rownames", noMatchError = TRUE )
inSCE |
Input SingleCellExperiment object. |
geneSetCollection |
A GeneSetCollection object. See GeneSetCollection for more details. |
collectionName |
Character. Name of collection to add gene sets to.
If this collection already exists in |
by |
Character, character vector, or NULL. Describes the
location within |
noMatchError |
Boolean. Show an error if a collection does not have
any matching features. Default TRUE. |
The gene identifiers in gene sets in the GeneSetCollection will be mapped to the rownames of inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.
Joshua D. Campbell
importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, and importGeneSetsFromMSigDB for importing MSigDB gene sets.
data(scExample)
gs1 <- GSEABase::GeneSet(setName = "geneset1", geneIds = rownames(sce)[seq(10)])
gs2 <- GSEABase::GeneSet(setName = "geneset2", geneIds = rownames(sce)[seq(11,20)])
gsc <- GSEABase::GeneSetCollection(list(gs1, gs2))
sce <- importGeneSetsFromCollection(inSCE = sce, geneSetCollection = gsc, by = "rownames")
Converts a list of gene sets stored in a GMT file into a GeneSetCollection and stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromGMT( inSCE, file, collectionName = "GeneSetCollection", by = "rownames", sep = "\t", noMatchError = TRUE )
inSCE |
Input SingleCellExperiment object. |
file |
Character. Path to GMT file. See getGmt for more information on reading GMT files. |
collectionName |
Character. Name of collection to add gene sets to.
If this collection already exists in |
by |
Character, character vector, or NULL. Describes the
location within |
sep |
Character. Delimiter of the GMT file. Default "\t". |
noMatchError |
Boolean. Show an error if a collection does not have
any matching features. Default TRUE. |
The gene identifiers in gene sets in the GMT file will be mapped to the rownames of inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.
Joshua D. Campbell
importGeneSetsFromList for importing from lists, importGeneSetsFromCollection for importing from GeneSetCollection objects, and importGeneSetsFromMSigDB for importing MSigDB gene sets.
data(scExample)
# GMT file containing gene symbols for a subset of human mitochondrial genes
gmt <- system.file("extdata/mito_subset.gmt", package = "singleCellTK")
# "feature_name" is the second column in the GMT file, so the ids will
# be mapped using this column in the 'rowData' of 'sce'. This
# could also be accomplished by setting by = "feature_name" in the
# function call.
sce <- importGeneSetsFromGMT(inSCE = sce, file = gmt, by = NULL)
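For readers unfamiliar with the GMT layout referred to in the example comments, each line holds a gene set name, a second field (used above to carry the rowData column name), and then the gene identifiers, all separated by the delimiter. A minimal base R sketch of parsing such a file follows. It is not how the package reads GMT files (the documentation points to getGmt for that); the helper `readGmtSketch` and the "ribo_subset" set are made up for illustration.

```r
# Write a small two-line GMT file to a temporary location.
gmtPath <- file.path(tempdir(), "sketch.gmt")
writeLines(c(
  paste("mito_subset", "feature_name", "MT-CO1", "MT-ND1", sep = "\t"),
  paste("ribo_subset", "feature_name", "RPL3", "RPS6", "RPL11", sep = "\t")
), gmtPath)

# Hypothetical parser: field 1 is the set name, field 2 the description
# (here the column to map by), and the remaining fields are gene IDs.
readGmtSketch <- function(path, sep = "\t") {
  fields <- strsplit(readLines(path), split = sep, fixed = TRUE)
  sets <- lapply(fields, function(f) f[-(1:2)])
  names(sets) <- vapply(fields, `[`, character(1), 1)
  by <- vapply(fields, `[`, character(1), 2)
  list(sets = sets, by = by)
}

gmtSets <- readGmtSketch(gmtPath)
print(gmtSets$sets)
```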
Converts a list of gene sets into a GeneSetCollection and stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromList( inSCE, geneSetList, collectionName = "GeneSetCollection", by = "rownames", noMatchError = TRUE )
inSCE |
Input SingleCellExperiment object. |
geneSetList |
Named List. A list containing one or more gene sets. Each element of the list should be a character vector of gene identifiers. The names of the list will become the gene set names in the GeneSetCollection object. |
collectionName |
Character. Name of collection to add gene sets to.
If this collection already exists in |
by |
Character or character vector. Describes the
location within |
noMatchError |
Boolean. Show an error if a collection does not have
any matching features. Default TRUE. |
The gene identifiers in gene sets in geneSetList will be mapped to the rownames of inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.
Joshua D. Campbell
importGeneSetsFromCollection for importing from GeneSetCollection objects, importGeneSetsFromGMT for importing from GMT files, and importGeneSetsFromMSigDB for importing MSigDB gene sets.
data(scExample)
# Generate gene sets from 'rownames'
gs1 <- rownames(sce)[seq(10)]
gs2 <- rownames(sce)[seq(11,20)]
gs <- list("geneset1" = gs1, "geneset2" = gs2)
sce <- importGeneSetsFromList(inSCE = sce, geneSetList = gs, by = "rownames")
# Generate a gene set for mitochondrial genes using
# Gene Symbols stored in 'rowData'
mito.ix <- grep("^MT-", rowData(sce)$feature_name)
mito <- list(mito = rowData(sce)$feature_name[mito.ix])
sce <- importGeneSetsFromList(inSCE = sce, geneSetList = mito, by = "feature_name")
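The role of the by and noMatchError arguments can be illustrated without the package. The sketch below uses a plain data frame as a stand-in for rowData(sce) and maps gene symbols to row names. The helper `mapGeneSetSketch` and the feature IDs are hypothetical, and the real function may differ in how it reports unmatched genes.

```r
# Toy stand-in for rowData(sce): row names are feature IDs and
# 'feature_name' holds gene symbols. All IDs are invented.
rowAnnot <- data.frame(
  feature_name = c("MT-CO1", "ACTB", "MT-ND1", "GAPDH"),
  row.names = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  stringsAsFactors = FALSE)

# Hypothetical helper: returns the row names of the features whose
# identifiers (taken from the rownames or from column 'by') are in the set.
mapGeneSetSketch <- function(geneSet, rowAnnot, by = "rownames",
                             noMatchError = TRUE) {
  ids <- if (identical(by, "rownames")) rownames(rowAnnot) else rowAnnot[[by]]
  hit <- ids %in% geneSet
  if (!any(hit) && noMatchError) {
    stop("No features matched the gene set")
  }
  rownames(rowAnnot)[hit]
}

mito <- list(mito = c("MT-CO1", "MT-ND1", "MT-ATP6"))
mapped <- lapply(mito, mapGeneSetSketch, rowAnnot = rowAnnot,
                 by = "feature_name")
print(mapped)
```

In this sketch, identifiers absent from the data (here "MT-ATP6") are silently dropped, and an error is raised only when nothing in a set matches.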
Gets a list of MSigDB gene sets and stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromMSigDB( inSCE, categoryIDs = "H", species = "Homo sapiens", mapping = c("gene_symbol", "human_gene_symbol", "entrez_gene"), by = "rownames", verbose = TRUE, noMatchError = TRUE )
inSCE |
Input SingleCellExperiment object. |
categoryIDs |
Character vector containing the MSigDB gene set ids.
The column |
species |
Character. Species available can be found using the function
|
mapping |
Character. One of "gene_symbol", "human_gene_symbol", or
"entrez_gene". Gene identifiers to be used for MSigDB gene sets. IDs
denoted by the |
by |
Character. Describes the
location within |
verbose |
Boolean. Whether to display progress. Default TRUE. |
noMatchError |
Boolean. Show an error if a collection does not have
any matching features. Default TRUE. |
The gene identifiers in gene sets from MSigDB will be retrieved using the msigdbr package. They will be mapped to the IDs in inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.
Joshua D. Campbell
importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, and importGeneSetsFromCollection for importing from GeneSetCollection objects.
data(scExample)
sce <- importGeneSetsFromMSigDB(inSCE = sce,
                                categoryIDs = "H",
                                species = "Homo sapiens",
                                mapping = "gene_symbol",
                                by = "feature_name")
Imports mitochondrial gene sets and stores them in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importMitoGeneSet( inSCE, reference = "human", id = "ensembl", by = "rownames", collectionName = "mito", noMatchError = TRUE )
inSCE |
Input SingleCellExperiment object. |
reference |
Character. Species available are "human" and "mouse". |
id |
Type of gene ID. Currently supports "symbol", "entrez", "ensembl" and "ensemblTranscriptID". |
by |
Character. Describes the location within |
collectionName |
Character. Name of collection to add gene sets to.
If this collection already exists in |
noMatchError |
Boolean. Show an error if a collection does not have
any matching features. Default TRUE. |
The gene identifiers of mitochondrial genes will be loaded with "data(AllMito)". Currently, human and mouse references are supported, as are Entrez IDs, gene symbols, Ensembl IDs, and Ensembl transcript IDs. They will be mapped to the IDs in inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.
Rui Hong
importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, and importGeneSetsFromCollection for importing from GeneSetCollection objects.
data(scExample)
sce <- importMitoGeneSet(inSCE = sce,
                         reference = "human",
                         id = "ensembl",
                         collectionName = "human_mito",
                         by = "rownames")
Imports samples from different sources and compiles them into a list of SCE objects
importMultipleSources(allImportEntries, delayedArray = FALSE)
allImportEntries |
Object containing the sources and parameters of all the samples being imported (from the UI). |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
A list of SingleCellExperiment objects containing the droplet or cell data or both, depending on the dataType that users provided.
Read the barcodes, features (genes), and matrices from Optimus outputs. Import them as one SingleCellExperiment object.
importOptimus( OptimusDirs, samples, matrixLocation = "call-MergeCountFiles/sparse_counts.npz", colIndexLocation = "call-MergeCountFiles/sparse_counts_col_index.npy", rowIndexLocation = "call-MergeCountFiles/sparse_counts_row_index.npy", cellMetricsLocation = "call-MergeCellMetrics/merged-cell-metrics.csv.gz", geneMetricsLocation = "call-MergeGeneMetrics/merged-gene-metrics.csv.gz", emptyDropsLocation = "call-RunEmptyDrops/empty_drops_result.csv", class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
OptimusDirs |
A vector of root directories of Optimus output files.
The paths should be something like this:
|
samples |
A vector of user-defined sample names for the samples to be
imported. Must have the same length as |
matrixLocation |
Character. It is the intermediate
path to the filtered count matrix file saved in sparse matrix format
( |
colIndexLocation |
Character. The intermediate path to the barcode
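index file. Default "call-MergeCountFiles/sparse_counts_col_index.npy". |
rowIndexLocation |
Character. The intermediate path to the feature
(gene) index file. Default
"call-MergeCountFiles/sparse_counts_row_index.npy". |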
cellMetricsLocation |
Character. It is the intermediate
path to the cell metrics file ( |
geneMetricsLocation |
Character. It is the intermediate
path to the feature (gene) metrics file ( |
emptyDropsLocation |
Character. It is the intermediate
path to emptyDrops metrics file
( |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
TRUE. |
A SingleCellExperiment object containing the count matrix, the gene annotation, and the cell annotation.
file.path <- system.file("extdata/Optimus_20x1000", package = "singleCellTK")
## Not run:
sce <- importOptimus(OptimusDirs = file.path, samples = "Optimus_20x1000")
## End(Not run)
Read the filtered barcodes, features, and matrices for all samples from (preferably a single run of) seqc output. Import and combine them as one big SingleCellExperiment object.
importSEQC( seqcDirs = NULL, samples = NULL, prefix = NULL, gzipped = FALSE, class = c("Matrix", "matrix"), delayedArray = FALSE, cbNotFirstCol = TRUE, feNotFirstCol = TRUE, combinedSample = TRUE, rowNamesDedup = TRUE )
seqcDirs |
A vector of paths to seqc output files. Each sample
should have its own path. For example: |
samples |
A vector of user-defined sample names for the samples to be
imported. Must have the same length as |
prefix |
A vector containing the prefix of file names within each sample directory. It cannot be null and the vector should have the same length as samples. |
gzipped |
Boolean. |
class |
Character. The class of the expression matrix stored in the SCE
object. Can be one of |
delayedArray |
Boolean. Whether to read the expression matrix as a
DelayedArray object or not. Default FALSE. |
cbNotFirstCol |
Boolean. |
feNotFirstCol |
Boolean. |
combinedSample |
Boolean. If |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Only applied
if |
importSEQC imports output from seqc. The default sparse_counts_barcode.csv or sparse_counts_genes.csv from seqc output contains two columns. The first column is the row index and the second column is the cell barcode or gene symbol. importSEQC will remove the first column. Alternatively, users can set cbNotFirstCol or feNotFirstCol to FALSE to keep the first column of these files. When combinedSample is TRUE, importSEQC will combine the count matrices, keeping genes detected in at least one sample.
A SingleCellExperiment object containing the combined count matrix, the feature annotations, and the cell annotation.
# Example #1
# The following filtered feature, cell, and matrix files were downloaded from
# https://support.10xgenomics.com/single-cell-gene-expression/datasets/
# 3.0.0/pbmc_1k_v3
# The top 50 hg38 genes are included in this example.
# Only the top 50 cells are included.
sce <- importSEQC(
    seqcDirs = system.file("extdata/pbmc_1k_50x50", package = "singleCellTK"),
    samples = "pbmc_1k_50x50",
    prefix = "pbmc_1k",
    combinedSample = FALSE)
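Two behaviours described in the details above can be sketched in base R: dropping the leading index column of the two-column barcode or gene files, and combining samples over the union of their genes. The helpers `readSeqcIndexFile` and `combineCountsSketch`, the file name, and the barcodes are hypothetical. Filling genes absent from a sample with zeros is an assumption about what "combine" means here.

```r
# A two-column file in the style described above: row index, then barcode.
barcodeFile <- file.path(tempdir(), "sketch_barcodes.csv")
writeLines(c("0,AAACCTG", "1,AAACGGG", "2,AAAGATG"), barcodeFile)

# Hypothetical reader: with notFirstCol = TRUE the index column is removed,
# analogous to cbNotFirstCol / feNotFirstCol.
readSeqcIndexFile <- function(path, notFirstCol = TRUE) {
  tab <- read.csv(path, header = FALSE, stringsAsFactors = FALSE)
  if (notFirstCol) tab[, -1, drop = FALSE] else tab
}
barcodes <- readSeqcIndexFile(barcodeFile)
withIndex <- readSeqcIndexFile(barcodeFile, notFirstCol = FALSE)

# Hypothetical combiner: keeps every gene detected in at least one sample
# and fills genes missing from a sample with zeros (assumption).
combineCountsSketch <- function(a, b) {
  genes <- union(rownames(a), rownames(b))
  out <- matrix(0, nrow = length(genes), ncol = ncol(a) + ncol(b),
                dimnames = list(genes, c(colnames(a), colnames(b))))
  out[rownames(a), colnames(a)] <- a
  out[rownames(b), colnames(b)] <- b
  out
}
s1 <- matrix(c(1, 2), nrow = 2, dimnames = list(c("g1", "g2"), "s1_c1"))
s2 <- matrix(c(3, 4), nrow = 2, dimnames = list(c("g2", "g3"), "s2_c1"))
combined <- combineCountsSketch(s1, s2)
print(combined)
```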
Read the barcodes, features (genes), and matrices from STARsolo outputs. Import them as one SingleCellExperiment object.
importSTARsolo( STARsoloDirs, samples, STARsoloOuts = c("Gene", "GeneFull"), matrixFileNames = "matrix.mtx", featuresFileNames = "features.tsv", barcodesFileNames = "barcodes.tsv", gzipped = "auto", class = c("Matrix", "matrix"), delayedArray = FALSE, rowNamesDedup = TRUE )
STARsoloDirs |
A vector of root directories of STARsolo output files.
The paths should be something like this:
/PATH/TO/prefixSolo.out. For example: |
samples |
A vector of user-defined sample names for the samples to be
imported. Must have the same length as |
STARsoloOuts |
Character. The intermediate
folder to filtered or raw cell barcode, feature, and matrix files
for each of |
matrixFileNames |
Filenames for the Market Exchange Format (MEX) sparse
matrix file (.mtx file). Must have length 1 or the same
length as |
featuresFileNames |
Filenames for the feature annotation file.
Must have length 1 or the same
length as |
barcodesFileNames |
Filenames for the cell barcode list file.
Must have length 1 or the same
length as |
gzipped |
Boolean. |
class |
Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as
DelayedArray object or not. Default |
rowNamesDedup |
Boolean. Whether to deduplicate rownames. Default
|
A SingleCellExperiment
object containing the count
matrix, the gene annotation, and the cell annotation.
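To make the three input files more concrete, here is a minimal sketch that writes a tiny, made-up MEX triplet (matrix.mtx, features.tsv, barcodes.tsv) to a temporary directory and reads it back with the Matrix package. The gene IDs, barcodes, and counts are invented, and the sketch only illustrates the file layout; it is not how importSTARsolo is implemented.

```r
library(Matrix)

# Write a tiny, made-up MEX triplet to a temporary directory.
outDir <- tempfile("mex_")
dir.create(outDir)
counts <- sparseMatrix(i = c(1, 2, 3, 3), j = c(1, 1, 2, 3),
                       x = c(5, 1, 2, 7), dims = c(3, 3))
writeMM(counts, file.path(outDir, "matrix.mtx"))
writeLines(c("ENSG01\tGeneA", "ENSG02\tGeneB", "ENSG03\tGeneC"),
           file.path(outDir, "features.tsv"))
writeLines(c("AAAC", "AAAG", "AAAT"), file.path(outDir, "barcodes.tsv"))

# Read the three files back and attach feature and barcode names.
mat <- readMM(file.path(outDir, "matrix.mtx"))
features <- read.delim(file.path(outDir, "features.tsv"), header = FALSE,
                       stringsAsFactors = FALSE)
barcodes <- readLines(file.path(outDir, "barcodes.tsv"))
mat <- as(mat, "CsparseMatrix")
dimnames(mat) <- list(features$V2, barcodes)
print(mat)
```

Calling `as.matrix(mat)` on the result gives a dense base R matrix, which corresponds to the "matrix" option of the class argument described above.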
# Example #1
# FASTQ files were downloaded from
# https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0
# /pbmc_1k_v3
# They were concatenated as follows:
# cat pbmc_1k_v3_S1_L001_R1_001.fastq.gz pbmc_1k_v3_S1_L002_R1_001.fastq.gz >
# pbmc_1k_v3_R1.fastq.gz
# cat pbmc_1k_v3_S1_L001_R2_001.fastq.gz pbmc_1k_v3_S1_L002_R2_001.fastq.gz >
# pbmc_1k_v3_R2.fastq.gz
# The following STARsolo command generates the filtered feature, cell, and
# matrix files
# STAR \
#   --genomeDir ./index \
#   --readFilesIn ./pbmc_1k_v3_R2.fastq.gz \
#                 ./pbmc_1k_v3_R1.fastq.gz \
#   --readFilesCommand zcat \
#   --outSAMtype BAM Unsorted \
#   --outBAMcompression -1 \
#   --soloType CB_UMI_Simple \
#   --soloCBwhitelist ./737K-august-2016.txt \
#   --soloUMIlen 12
# The top 20 genes and the first 20 cells are included in this example.
sce <- importSTARsolo(
  STARsoloDirs = system.file("extdata/STARsolo_PBMC_1k_v3_20x20",
    package = "singleCellTK"),
  samples = "PBMC_1k_v3_20x20")
Returns significance data from a snapshot.
iterateSimulations( originalData, useAssay = "counts", realLabels, totalReads, cells, iterations )
originalData |
The SingleCellExperiment object storing all assay data from the shiny app. |
useAssay |
Character. The name of the assay to be used for subsampling. |
realLabels |
Character. The name of the condition of interest. Must match a name from sample data. |
totalReads |
Numeric. The total number of reads in the simulated dataset, to be split between all simulated cells. |
cells |
Numeric. The number of virtual cells to simulate. |
iterations |
Numeric. How many times should each experimental design be simulated. |
A matrix of significance information from a snapshot
data("mouseBrainSubsetSCE")
res <- iterateSimulations(mouseBrainSubsetSCE, realLabels = "level1class",
  totalReads = 1000, cells = 10, iterations = 2)
Returns a character vector of the tables within the metadata slot of the SingleCellExperiment object.
listSampleSummaryStatsTables(inSCE, ...)

## S4 method for signature 'SingleCellExperiment'
listSampleSummaryStatsTables(inSCE, ...)
inSCE |
Input SingleCellExperiment object with a saved table within the metadata slot. Required. |
... |
Other arguments passed to the function. |
A character vector. Contains a list of summary tables within the SingleCellExperiment object.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- sampleSummaryStats(sce, simple = TRUE, statsName = "qc_table")
listSampleSummaryStatsTables(sce)
Merges the colData of SingleCellExperiment objects that were obtained from the same dataset but contain differing colData (e.g., raw data and filtered data).
mergeSCEColData(inSCE1, inSCE2, id1 = "column_name", id2 = "column_name")
inSCE1 |
Input SingleCellExperiment object. The function will output this singleCellExperiment object with a combined colData from inSCE1 and inSCE2. |
inSCE2 |
Input SingleCellExperiment object. colData from this object will be merged with colData from inSCE1 and loaded into inSCE1. |
id1 |
Character vector. Column in colData of inSCE1 that will be used to combine inSCE1 and inSCE2. Default "column_name" |
id2 |
Character vector. Column in colData of inSCE2 that will be used to combine inSCE1 and inSCE2. Default "column_name" |
SingleCellExperiment object containing the combined colData from both SingleCellExperiment objects for samples in inSCE1.
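The merge described above behaves like a left join keyed on the id1 and id2 columns. The sketch below shows that idea with two made-up data frames standing in for the colData of inSCE1 and inSCE2; the column names other than "column_name" are hypothetical, and the package's own merging logic may differ in detail.

```r
# Made-up per-cell annotations standing in for colData of two objects.
colData1 <- data.frame(column_name = c("cellA", "cellB", "cellC"),
                       total = c(120, 85, 300),
                       stringsAsFactors = FALSE)
colData2 <- data.frame(column_name = c("cellB", "cellC", "cellD"),
                       doublet_score = c(0.1, 0.7, 0.2),
                       stringsAsFactors = FALSE)

# Left join: keep every cell of the first table, add columns from the second.
merged <- merge(colData1, colData2,
                by.x = "column_name", by.y = "column_name",
                all.x = TRUE)

# merge() may reorder rows, so restore the order of the first table.
merged <- merged[match(colData1$column_name, merged$column_name), ]
print(merged)
```

Cells present only in the second table (here "cellD") are dropped, and cells missing from it (here "cellA") receive NA in the added columns.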
sce1 <- importCellRanger(
  cellRangerDirs = system.file("extdata/", package = "singleCellTK"),
  sampleDirs = "hgmm_1k_v3_20x20",
  sampleNames = "hgmm1kv3",
  dataType = "filtered")
data(scExample)
sce2 <- sce
sce <- mergeSCEColData(inSCE1 = sce1, inSCE2 = sce2,
  id1 = "column_name", id2 = "column_name")
A list of gene sets that contain the mitochondrial genes of multiple references (hg38, hg19, mm10 and mm9). It contains multiple types of gene identifiers: gene symbol, Entrez ID, Ensembl ID and Ensembl transcript ID. It is used by the function 'importMitoGeneSet'.
data("MitoGenes")
A list
List of mitochondrial genes of multiple references
data("MitoGenes")
A subset of 30 cells from a single cell RNA-Seq experiment from Zeisel, et al. Science 2015. The data was produced from cells from the mouse somatosensory cortex (S1) and hippocampus (CA1). 15 of the cells were identified as oligodendrocytes and 15 of the cells were identified as microglia.
data("mouseBrainSubsetSCE")
SingleCellExperiment
A subset of 30 cells from a single cell RNA-Seq experiment
DOI: 10.1126/science.aaa1934
data("mouseBrainSubsetSCE")
A table of gene set categories that can be downloaded from MSigDB. The categories and descriptions can be found here: https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp. The IDs in the first column can be used to retrieve the gene sets for these categories using the importGeneSetsFromMSigDB function.
data("msigdb_table")
A data.frame.
A table of gene set categories
data("msigdb_table")
A wrapper function which visualizes outputs from the runBarcodeRankDrops
function stored in the metadata slot of the SingleCellExperiment object.
plotBarcodeRankDropsResults( inSCE, sample = NULL, defaultTheme = TRUE, dotSize = 0.5, titleSize = 18, axisSize = 15, axisLabelSize = 18, legendSize = 15 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
titleSize |
Size of title of plot. Default |
axisSize |
Size of x/y-axis ticks. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
legendSize |
size of legend. Default |
list of .ggplot objects
data(scExample, package = "singleCellTK")
sce <- runBarcodeRankDrops(inSCE = sce)
plotBarcodeRankDropsResults(inSCE = sce)
A plotting function which visualizes outputs from the runBarcodeRankDrops function stored in the colData slot of the SingleCellExperiment object via scatterplot.
plotBarcodeRankScatter( inSCE, sample = NULL, defaultTheme = TRUE, dotSize = 0.1, title = NULL, titleSize = 18, xlab = NULL, ylab = NULL, axisSize = 12, axisLabelSize = 15, legendSize = 10, combinePlot = "none", sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
title |
Title of plot. Default |
titleSize |
Size of title of plot. Default |
xlab |
Character vector. Label for x-axis. Default |
ylab |
Character vector. Label for y-axis. Default |
axisSize |
Size of x/y-axis ticks. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
legendSize |
size of legend. Default |
combinePlot |
Must be either |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
a ggplot object of the scatter plot.
plotBarcodeRankDropsResults, runBarcodeRankDrops
data(scExample, package = "singleCellTK")
sce <- runBarcodeRankDrops(inSCE = sce)
plotBarcodeRankScatter(inSCE = sce)
Plot comparison of batch corrected result against original assay
plotBatchCorrCompare( inSCE, corrMat, batch = NULL, condition = NULL, origAssay = NULL, origLogged = NULL, method = NULL, matType = NULL )
inSCE |
SingleCellExperiment inherited object. |
corrMat |
A single character indicating the name of the corrected matrix. |
batch |
A single character. The name of batch annotation column in
|
condition |
A single character. The name of an additional covariate
annotation column in |
origAssay |
A single character indicating what the original assay used for batch correction is. |
origLogged |
Logical scalar indicating whether |
method |
A single character indicating the name of the batch correction method. Only used for the titles of plots. |
matType |
A single character indicating the type of the batch correction
result matrix, choose from |
Four plots will be combined. Two of them are violin/box-plots of the percent variance explained by the batch variation, and optionally the covariate, for the original and corrected data. The other two are UMAPs of the original assay and the corrected result matrix. If SCTK batch correction methods were run in advance, this function will automatically detect the necessary input. Otherwise, users can customize the input. Future improvements might include a way to reduce redundant UMAP calculation.
An object of class "gtable", combining four ggplots.
Yichen Wang
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceBatches <- runLimmaBC(sceBatches)
plotBatchCorrCompare(sceBatches, "LIMMA", condition = "cell_type")
Visualize the percent variation in the data that is explained by batch and condition, individually, and that explained by combining both annotations. Plotting only the variation explained by batch is supported but not recommended, because this can be confounded by potential condition.
plotBatchVariance( inSCE, useAssay = NULL, useReddim = NULL, useAltExp = NULL, batch = "batch", condition = NULL, title = NULL )
inSCE |
SingleCellExperiment inherited object. |
useAssay |
A single character. The name of the assay that stores the
value to plot. For |
useReddim |
A single character. The name of the dimension reduced
matrix that stores the value to plot. Default |
useAltExp |
A single character. The name of the alternative experiment
that stores an assay of the value to plot. Default |
batch |
A single character. The name of batch annotation column in
|
condition |
A single character. The name of an additional condition
annotation column in |
title |
A single character. The title text on the top. Default
|
When condition and batch both are causing some variation, if the difference between full variation and condition variation is close to batch variation, this might imply that batches are causing some effect; if the difference is much less than batch variation, then the batches are likely to be confounded by the conditions.
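The comparison described above can be sketched with ordinary linear models: for each gene, fit batch only, condition only, and batch plus condition, and compare the percent of variance explained (100 times R squared). The data below are simulated and the helper `pctExplained` is a hypothetical name; this is one plausible way to compute such percentages and not necessarily what plotBatchVariance does internally.

```r
set.seed(42)
nGenes <- 20
batch <- factor(rep(c("b1", "b2"), each = 12))
condition <- factor(rep(c("ctrl", "trt"), times = 12))

# Simulated expression: noise plus a batch shift shared by all genes.
expr <- matrix(rnorm(nGenes * 24), nrow = nGenes)
expr <- sweep(expr, 2, ifelse(batch == "b2", 1.5, 0), "+")

# Percent of variance explained (100 * R^2) for one gene and one design.
pctExplained <- function(y, design) {
  fit <- lm(y ~ ., data = data.frame(y = y, design))
  100 * summary(fit)$r.squared
}

explained <- data.frame(
  batch = apply(expr, 1, pctExplained, design = data.frame(batch)),
  condition = apply(expr, 1, pctExplained, design = data.frame(condition)),
  full = apply(expr, 1, pctExplained, design = data.frame(batch, condition))
)

# Difference between the full and condition-only models, to be compared
# against the batch-only column.
explained$fullMinusCondition <- explained$full - explained$condition
print(round(colMeans(explained), 1))
```

In this simulated design batch and condition are balanced, so the difference tracks the batch-only column closely; with confounded annotations the difference would fall well below it.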
A ggplot object of a boxplot of variation explained by batch, condition, and batch+condition.
data('sceBatches', package = 'singleCellTK')
plotBatchVariance(sceBatches, useAssay="counts", batch="batch",
  condition = "cell_type")
A wrapper function which visualizes outputs from the runBcds function
stored in the colData slot of the SingleCellExperiment object via various plots.
plotBcdsResults( inSCE, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
shape |
If provided, add shapes based on the value. Default |
groupBy |
Groupings for each numeric value. A user may input a vector
equal length to the number of the samples in |
combinePlot |
Must be either |
violin |
Boolean. If |
boxplot |
Boolean. If |
dots |
Boolean. If |
reducedDimName |
Saved dimension reduction name in |
xlab |
Character vector. Label for x-axis. Default |
ylab |
Character vector. Label for y-axis. Default |
dim1 |
1st dimension to be used for plotting. Can either be a string
which specifies the name of the dimension to be plotted from reducedDims, or
a numeric value which specifies the index of the dimension to be plotted.
Default is |
dim2 |
2nd dimension to be used for plotting. Similar to |
bin |
Numeric vector. If single value, will divide the numeric values
into |
binLabel |
Character vector. Labels for the bins created by |
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
summary |
Adds a summary statistic, as well as a crossbar to the
violin plot. Options are |
summaryTextSize |
The text size of the summary statistic displayed
above the violin plot. Default |
transparency |
Transparency of the dots, values will be 0-1. Default
|
baseSize |
The base font size for all text. Default |
titleSize |
Size of title of plot. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
axisSize |
Size of x/y-axis ticks. Default |
legendSize |
size of legend. Default |
legendTitleSize |
size of legend title. Default |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
plotNCols |
Number of columns when plots are combined in a grid. Default
|
plotNRows |
Number of rows when plots are combined in a grid. Default
|
labelSamples |
Will label sample name in title of plot if TRUE. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
list of .ggplot objects
data(scExample, package="singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runBcds(sce)
plotBcdsResults(inSCE=sce, reducedDimName="UMAP")
Plot a bubble plot with the color of the plot being the mean expression and the size of the dot being the percent of cells in the cluster expressing the gene.
plotBubble( inSCE, useAssay = "logcounts", featureNames, displayName = NULL, groupNames = "cluster", title = "", xlab = NULL, ylab = NULL, colorLow = "white", colorHigh = "blue", scale = FALSE )
inSCE |
The single cell experiment to use. |
useAssay |
The assay to use. |
featureNames |
A string or vector of strings with each gene to aggregate. |
displayName |
A string that is the name of the column used for genes. |
groupNames |
The name of a colData entry that can be used as groupNames. |
title |
The title of the bubble plot |
xlab |
The x-axis label |
ylab |
The y-axis label |
colorLow |
The color to be used for lowest value of mean expression |
colorHigh |
The color to be used for highest value of mean expression |
scale |
Option to scale the data. Default: |
A ggplot of the bubble plot.
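The two quantities shown in a bubble plot, mean expression per group and percent of cells expressing each gene, can be computed by hand as in the sketch below. The matrix and cluster labels are made up, and "expressing" is assumed here to mean a value greater than zero; the package may define or scale these quantities differently.

```r
# Made-up expression matrix (genes x cells) and cluster labels.
expr <- matrix(c(0, 2, 3, 4,
                 1, 0, 0, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4)))
cluster <- c("c1", "c1", "c2", "c2")

# Mean expression per gene and cluster (would drive the dot color).
meanExpr <- t(apply(expr, 1, function(x) tapply(x, cluster, mean)))

# Percent of cells with non-zero expression (would drive the dot size).
pctExpr <- t(apply(expr, 1, function(x) tapply(x > 0, cluster, mean) * 100))

print(meanExpr)
print(pctExpr)
```

Both results are gene-by-cluster matrices, which is the shape a bubble plot needs: one dot per gene and group.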
data("scExample")
plotBubble(inSCE=sce, useAssay="counts",
  featureNames=c("B2M", "MALAT1"), displayName="feature_name",
  groupNames="type", title="cell type test", xlab="gene", ylab="cluster",
  colorLow="white", colorHigh="blue")
Plot the differential abundance
plotClusterAbundance(inSCE, cluster, variable, combinePlot = c("all", "none"))
inSCE |
A |
cluster |
A single |
variable |
A single |
combinePlot |
Must be either "all" or "none". "all" will combine all
plots into a single |
This function will visualize the differential abundance between two given variables by making bar plots that present the cell counts and fractions in each case.
When combinePlot = "none", a list with 4 ggplot objects; when
combinePlot = "all", a single ggplot object with four subplots.
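The counts and fractions behind such bar plots amount to a cross-tabulation of the two annotations. A minimal base R sketch with made-up labels follows; it illustrates the numbers being plotted, not the plotting code of plotClusterAbundance itself.

```r
# Made-up per-cell annotations.
cluster <- c("1", "1", "2", "2", "2")
variable <- c("ctrl", "trt", "ctrl", "ctrl", "trt")

# Cell counts for every cluster/variable combination.
cellCounts <- table(cluster, variable)

# Fractions within each cluster (rows sum to 1) and within each
# level of the variable (columns sum to 1).
fracWithinCluster <- prop.table(cellCounts, margin = 1)
fracWithinVariable <- prop.table(cellCounts, margin = 2)

print(cellCounts)
print(round(fracWithinCluster, 2))
print(round(fracWithinVariable, 2))
```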
data("mouseBrainSubsetSCE", package = "singleCellTK")
plotClusterAbundance(inSCE = mouseBrainSubsetSCE,
  cluster = "tissue", variable = "level1class")
A wrapper function which visualizes outputs from the runCxds function
stored in the colData slot of the SingleCellExperiment object via various plots.
plotCxdsResults( inSCE, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
shape |
If provided, add shapes based on the value. Default |
groupBy |
Groupings for each numeric value. A user may input a vector
equal length to the number of the samples in |
combinePlot |
Must be either |
violin |
Boolean. If |
boxplot |
Boolean. If |
dots |
Boolean. If |
reducedDimName |
Saved dimension reduction name in |
xlab |
Character vector. Label for x-axis. Default |
ylab |
Character vector. Label for y-axis. Default |
dim1 |
1st dimension to be used for plotting. Can either be a string
which specifies the name of the dimension to be plotted from reducedDims, or
a numeric value which specifies the index of the dimension to be plotted.
Default is |
dim2 |
2nd dimension to be used for plotting. Similar to |
bin |
Numeric vector. If single value, will divide the numeric values
into |
binLabel |
Character vector. Labels for the bins created by |
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
summary |
Adds a summary statistic, as well as a crossbar to the
violin plot. Options are |
summaryTextSize |
The text size of the summary statistic displayed
above the violin plot. Default |
transparency |
Transparency of the dots, values will be 0-1. Default
|
baseSize |
The base font size for all text. Default |
titleSize |
Size of title of plot. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
axisSize |
Size of x/y-axis ticks. Default |
legendSize |
size of legend. Default |
legendTitleSize |
size of legend title. Default |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
plotNCols |
Number of columns when plots are combined in a grid. Default
|
plotNRows |
Number of rows when plots are combined in a grid. Default
|
labelSamples |
Will label sample name in title of plot if TRUE. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
list of .ggplot objects
data(scExample, package="singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runCxds(sce)
plotCxdsResults(inSCE=sce, reducedDimName="UMAP")
A wrapper function which visualizes outputs from the runDecontX function stored in the colData slot of the SingleCellExperiment object via various plots.
plotDecontXResults( inSCE, sample = NULL, bgResult = FALSE, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, labelClusters = TRUE, clusterLabelSize = 3.5, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results from runDecontX. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. Default NULL. |
bgResult |
Boolean. If TRUE, will plot decontX results generated with the raw/droplet matrix. Default FALSE. |
shape |
If provided, add shapes based on the value. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in the SingleCellExperiment object, or can be retrieved from the colData slot. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "all". |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default FALSE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
reducedDimName |
Saved dimension reduction name in the SingleCellExperiment object. Required. Default = "UMAP" |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
bin |
Numeric vector. If a single value, will divide the numeric values into 'bin' groups. If more than one value, will bin numeric values using the values as cut points. |
binLabel |
Character vector. Labels for the bins created by the 'bin' parameter. Default NULL. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
dotSize |
Size of dots. Default 0.5. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default "median". |
summaryTextSize |
The text size of the summary statistic displayed above the violin plot. Default 3. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
baseSize |
The base font size for all text. Default 15. Can be overwritten by titleSize, axisSize, axisLabelSize, legendSize, and legendTitleSize. |
titleSize |
Size of title of plot. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
legendSize |
size of legend. Default NULL. |
legendTitleSize |
size of legend title. Default NULL. |
relHeights |
Relative heights of plots when combine is set. |
relWidths |
Relative widths of plots when combine is set. |
plotNCols |
Number of columns when plots are combined in a grid. |
plotNRows |
Number of rows when plots are combined in a grid. |
labelSamples |
Will label sample name in title of plot if TRUE. Default TRUE. |
labelClusters |
Logical. Whether the cluster labels are plotted. Default TRUE. |
clusterLabelSize |
Numeric. Determines the size of cluster label when 'labelClusters' is set to TRUE. Default 3.5. |
samplePerColumn |
If TRUE, when there are multiple samples and combining by "all", the output .ggplot will have plots from each sample on a single column. Default TRUE. |
sampleRelHeights |
If there are multiple samples and combining by "all", the relative heights for each plot. |
sampleRelWidths |
If there are multiple samples and combining by "all", the relative widths for each plot. |
list of .ggplot objects
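The two modes of the bin argument described above (a single number of groups versus explicit cut points) resemble the two ways of calling base R's cut(). The sketch below uses made-up scores and hypothetical labels to show the difference; the package's own binning may choose boundaries differently.

```r
# Made-up numeric values, e.g. per-cell contamination estimates.
scores <- c(0.02, 0.10, 0.35, 0.60, 0.95)

# Single value: split the observed range into that many equal-width bins.
byCount <- cut(scores, breaks = 3)

# Several values: use them as explicit cut points, with custom labels
# playing the role of binLabel.
byCutPoints <- cut(scores, breaks = c(0, 0.25, 0.5, 1),
                   labels = c("low", "medium", "high"))

print(table(byCount))
print(table(byCutPoints))
```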
data(scExample, package="singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runDecontX(sce)
plotDecontXResults(inSCE=sce, reducedDimName="decontX_UMAP")
Heatmap visualization of DEG result
plotDEGHeatmap( inSCE, useResult, onlyPos = FALSE, log2fcThreshold = 0.25, fdrThreshold = 0.05, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, useAssay = NULL, doLog = FALSE, featureAnnotations = NULL, cellAnnotations = NULL, featureAnnotationColor = NULL, cellAnnotationColor = NULL, rowDataName = NULL, colDataName = NULL, colSplitBy = "condition", rowSplitBy = "regulation", rowLabel = S4Vectors::metadata(inSCE)$featureDisplay, title = paste0("DE Analysis: ", useResult), ... )
inSCE |
SingleCellExperiment inherited object. |
useResult |
character. A string specifying the |
onlyPos |
logical. Whether to only plot DEG with positive log2_FC
value. Default |
log2fcThreshold |
numeric. Only plot DEGs with the absolute values of
log2FC larger than this value. Default |
fdrThreshold |
numeric. Only plot DEGs with FDR value smaller than this
value. Default |
minGroup1MeanExp |
numeric. Only plot DEGs with mean expression in
group1 greater than this value. Default |
maxGroup2MeanExp |
numeric. Only plot DEGs with mean expression in
group2 less than this value. Default |
minGroup1ExprPerc |
numeric. Only plot DEGs expressed in greater than
this fraction of cells in group1. Default |
maxGroup2ExprPerc |
numeric. Only plot DEGs expressed in less than this
fraction of cells in group2. Default |
useAssay |
character. A string specifying an assay of expression value
to plot. By default the assay used for |
doLog |
Logical scalar. Whether to do |
featureAnnotations |
|
cellAnnotations |
|
featureAnnotationColor |
A named list. Customized color settings for
feature labeling. Should match the entries in the |
cellAnnotationColor |
A named list. Customized color settings for
cell labeling. Should match the entries in the |
rowDataName |
character. The column name(s) in |
colDataName |
character. The column name(s) in |
colSplitBy |
character. Do semi-heatmap based on the grouping of
this(these) annotation(s). Should exist in either |
rowSplitBy |
character. Do semi-heatmap based on the grouping of
this(these) annotation(s). Should exist in either |
rowLabel |
|
title |
character. Main title of the heatmap. Default
|
... |
Other arguments passed to |
A differential expression analysis function has to be run in advance
so that information is stored in the metadata of the input SCE object. This
function wraps plotSCEHeatmap.
A feature annotation called "regulation", based on the log2FC level,
will be added automatically. A cell annotation called "condition", based on the
condition selection made when running the analysis, and the
annotations from colData(inSCE) used when setting the condition and
covariates will also be added.
A ggplot object
Yichen Wang
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runWilcox(sce.w, class = "cell_type", classGroup1 = "alpha", classGroup2 = "beta", groupName1 = "w.alpha", groupName2 = "w.beta", analysisName = "w.aVSb")
plotDEGHeatmap(sce.w, "w.aVSb")
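The filtering described for plotDEGHeatmap can be sketched in base R. This is only an illustration of the threshold logic (log2fcThreshold, fdrThreshold, onlyPos) and of a "regulation" annotation derived from the sign of log2FC. The table, its column names, the helper filterDEG and the "up"/"down" labels are hypothetical and are not the format SCTK stores internally.

```r
# Hypothetical DEG table; column names are illustrative only.
deg <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5"),
  log2FC = c(1.2, -0.8, 0.1, -0.3, 0.6),
  FDR    = c(0.001, 0.01, 0.2, 0.04, 0.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented thresholds.
filterDEG <- function(deg, log2fcThreshold = 0.25, fdrThreshold = 0.05,
                      onlyPos = FALSE) {
  # Keep genes whose absolute log2FC is larger than the threshold
  # and whose FDR is smaller than the threshold.
  keep <- abs(deg$log2FC) > log2fcThreshold & deg$FDR < fdrThreshold
  if (onlyPos) {
    # Optionally keep only genes with positive log2FC.
    keep <- keep & deg$log2FC > 0
  }
  out <- deg[keep, , drop = FALSE]
  # Feature annotation derived from the sign of log2FC (assumed labels).
  out$regulation <- ifelse(out$log2FC > 0, "up", "down")
  out
}

both <- filterDEG(deg)
posOnly <- filterDEG(deg, onlyPos = TRUE)
print(both)
print(posOnly)
```

With the toy data, genes that fail either threshold are dropped before plotting, and onlyPos = TRUE further restricts the result to positive log2FC values.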
Create linear regression plot to show the expression of the top DEGs
plotDEGRegression( inSCE, useResult, threshP = FALSE, labelBy = NULL, nrow = 6, ncol = 6, defaultTheme = TRUE, isLogged = TRUE, check_sanity = TRUE )
inSCE |
SingleCellExperiment inherited object. |
useResult |
character. A string specifying the |
threshP |
logical. Whether to plot threshold values from adaptive
thresholding, instead of using the assay used when performing DE analysis.
Default |
labelBy |
A single character for a column of |
nrow |
Integer. Number of rows in the plot grid. Default |
ncol |
Integer. Number of columns in the plot grid. Default |
defaultTheme |
Logical scalar. Whether to use default SCTK theme in
ggplot. Default |
isLogged |
Logical scalar. Whether the assay used for the analysis is
logged. If not, will do a |
check_sanity |
Logical scalar. Whether to perform MAST's sanity check
to see if the counts are logged. Default |
One of the differential expression analysis methods from SCTK should be run before using this function.
A ggplot object of linear regression
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runWilcox(sce.w, class = "cell_type", classGroup1 = "alpha", classGroup2 = "beta", groupName1 = "w.alpha", groupName2 = "w.beta", analysisName = "w.aVSb")
plotDEGRegression(sce.w, "w.aVSb")
Generate violin plot to show the expression of top DEGs
plotDEGViolin( inSCE, useResult, threshP = FALSE, labelBy = NULL, nrow = 6, ncol = 6, defaultTheme = TRUE, isLogged = TRUE, check_sanity = TRUE )
inSCE |
SingleCellExperiment inherited object. |
useResult |
character. A string specifying the |
threshP |
logical. Whether to plot threshold values from adaptive
thresholding, instead of using the assay used by |
labelBy |
A single character for a column of |
nrow |
Integer. Number of rows in the plot grid. Default |
ncol |
Integer. Number of columns in the plot grid. Default |
defaultTheme |
Logical scalar. Whether to use default SCTK theme in
ggplot. Default |
isLogged |
Logical scalar. Whether the assay used for the analysis is
logged. If not, will do a |
check_sanity |
Logical scalar. Whether to perform MAST's sanity check
to see if the counts are logged. Default |
One of the differential expression analysis methods from SCTK should be run before using this function.
A ggplot object of violin plot
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runWilcox(sce.w, class = "cell_type", classGroup1 = "alpha", classGroup2 = "beta", groupName1 = "w.alpha", groupName2 = "w.beta", analysisName = "w.aVSb")
plotDEGViolin(sce.w, "w.aVSb")
Generate volcano plot for DEGs
plotDEGVolcano( inSCE, useResult, labelTopN = 10, log2fcThreshold = 0.25, fdrThreshold = 0.05, featureDisplay = S4Vectors::metadata(inSCE)$featureDisplay )
inSCE |
SingleCellExperiment inherited object. |
useResult |
character. A string specifying the |
labelTopN |
Integer, label this number of top DEGs that pass the
filters. |
log2fcThreshold |
numeric. Label genes with the absolute values of
log2FC greater than this value as regulated. Default |
fdrThreshold |
numeric. Label genes with FDR value less than this
value as regulated. Default |
featureDisplay |
A character string to indicate a variable in
|
One of the differential expression analysis methods from SCTK should be run before using this function to generate volcano plots.
A ggplot object of volcano plot
data("sceBatches")
sceBatches <- scaterlogNormCounts(sceBatches, "logcounts")
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runWilcox(sce.w, class = "cell_type", classGroup1 = "alpha", classGroup2 = "beta", groupName1 = "w.alpha", groupName2 = "w.beta", analysisName = "w.aVSb")
plotDEGVolcano(sce.w, "w.aVSb")
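The way log2fcThreshold, fdrThreshold and labelTopN interact in a volcano plot can be illustrated with a small base R sketch. The data, the column names, the status labels and the choice to rank candidate labels by FDR are all assumptions made for illustration. The actual ranking used by plotDEGVolcano may differ.

```r
# Hypothetical DEG statistics; values and column names are made up.
deg <- data.frame(
  gene   = paste0("g", 1:6),
  log2FC = c(2.0, -1.5, 0.1, 0.9, -0.2, 3.0),
  FDR    = c(1e-6, 1e-4, 0.5, 0.03, 0.001, 0.2),
  stringsAsFactors = FALSE
)
log2fcThreshold <- 0.25
fdrThreshold <- 0.05
labelTopN <- 2

# A gene counts as regulated only if it passes both thresholds.
passed <- abs(deg$log2FC) > log2fcThreshold & deg$FDR < fdrThreshold
deg$status <- "not significant"
deg$status[passed & deg$log2FC > 0] <- "up"
deg$status[passed & deg$log2FC < 0] <- "down"

# Usual volcano coordinates: log2FC on x, -log10(FDR) on y.
deg$negLog10FDR <- -log10(deg$FDR)

# Assumption: among passing genes, label those with the smallest FDR.
cand <- deg[passed, , drop = FALSE]
cand <- cand[order(cand$FDR), , drop = FALSE]
toLabel <- head(cand$gene, labelTopN)

print(deg[, c("gene", "status", "negLog10FDR")])
print(toLabel)
```

Genes with a large fold change but a high FDR (g6 here), or a low FDR but a small fold change (g5), stay in the "not significant" class.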
Plot dimensionality reduction from computed metrics including PCA, ICA, tSNE and UMAP
plotDimRed( inSCE, useReduction = "PCA", showLegend = FALSE, xDim = 1, yDim = 2, xAxisLabel = NULL, yAxisLabel = NULL )
inSCE |
Input SCE object |
useReduction |
Reduction to plot. Default is |
showLegend |
If legends should be plotted or not |
xDim |
Numeric value indicating the dimension to use for X-axis. Default is 1 (refers to PC1). |
yDim |
Numeric value indicating the dimension to use for Y-axis. Default is 2 (refers to PC2). |
xAxisLabel |
Specify the label for x-axis. Default is |
yAxisLabel |
Specify the label for y-axis. Default is |
plot object
data("mouseBrainSubsetSCE", package = "singleCellTK")
plotDimRed(mouseBrainSubsetSCE, "PCA_logcounts")
A wrapper function which visualizes outputs from the runDoubletFinder function stored in the colData slot of the SingleCellExperiment object via various plots.
plotDoubletFinderResults( inSCE, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
shape |
If provided, add shapes based on the value. Default |
groupBy |
Groupings for each numeric value. A user may input a vector
of length equal to the number of samples in |
combinePlot |
Must be either |
violin |
Boolean. If |
boxplot |
Boolean. If |
dots |
Boolean. If |
reducedDimName |
Saved dimension reduction name in |
xlab |
Character vector. Label for x-axis. Default |
ylab |
Character vector. Label for y-axis. Default |
dim1 |
1st dimension to be used for plotting. Can either be a string
which specifies the name of the dimension to be plotted from reducedDims, or
a numeric value which specifies the index of the dimension to be plotted.
Default is |
dim2 |
2nd dimension to be used for plotting. Similar to |
bin |
Numeric vector. If single value, will divide the numeric values
into |
binLabel |
Character vector. Labels for the bins created by |
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
summary |
Adds a summary statistic, as well as a crossbar to the
violin plot. Options are |
summaryTextSize |
The text size of the summary statistic displayed
above the violin plot. Default |
transparency |
Transparency of the dots, values will be 0-1. Default
|
baseSize |
The base font size for all text. Default |
titleSize |
Size of title of plot. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
axisSize |
Size of x/y-axis ticks. Default |
legendSize |
size of legend. Default |
legendTitleSize |
size of legend title. Default |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
plotNCols |
Number of columns when plots are combined in a grid. Default
|
plotNRows |
Number of rows when plots are combined in a grid. Default
|
labelSamples |
Will label sample name in title of plot if TRUE. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
list of .ggplot objects
data(scExample, package="singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runDoubletFinder(sce)
plotDoubletFinderResults(inSCE = sce, reducedDimName = "UMAP")
A wrapper function which visualizes outputs from the runEmptyDrops function stored in the colData slot of the SingleCellExperiment object.
plotEmptyDropsResults( inSCE, sample = NULL, combinePlot = "all", fdrCutoff = 0.01, defaultTheme = TRUE, dotSize = 0.5, titleSize = 18, axisLabelSize = 18, axisSize = 15, legendSize = 15, legendTitleSize = 16, relHeights = 1, relWidths = 1, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
combinePlot |
Must be either |
fdrCutoff |
Numeric. Thresholds barcodes based on the FDR values from
|
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
titleSize |
Size of title of plot. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
axisSize |
Size of x/y-axis ticks. Default |
legendSize |
size of legend. Default |
legendTitleSize |
size of legend title. Default |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
list of .ggplot objects
runEmptyDrops, plotEmptyDropsScatter
data(scExample, package = "singleCellTK")
sce <- runEmptyDrops(inSCE = sce)
plotEmptyDropsResults(inSCE = sce)
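As a rough illustration of how an FDR cutoff such as fdrCutoff separates barcodes, the base R sketch below applies a threshold to a made-up table. The column names, the use of `<=`, and the treatment of untested barcodes (NA FDR) as not passing are assumptions for illustration, not a description of what plotEmptyDropsResults does internally.

```r
# Hypothetical per-barcode statistics; NA marks barcodes that were not tested.
ed <- data.frame(
  barcode = paste0("bc", 1:5),
  total   = c(50, 800, 1500, 90, 3000),
  FDR     = c(NA, 0.20, 0.005, NA, 0.0001),
  stringsAsFactors = FALSE
)
fdrCutoff <- 0.01

# Assumption: a barcode is called a cell when its FDR is at or below the cutoff.
# Barcodes with NA FDR are treated as not passing.
isCell <- !is.na(ed$FDR) & ed$FDR <= fdrCutoff

ed$call <- ifelse(isCell, "cell", "empty or untested")
print(ed)
print(table(isCell))
```

Lowering fdrCutoff makes the call more conservative, so fewer barcodes are kept as putative cells.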
A plotting function which visualizes outputs from the runEmptyDrops function stored in the colData slot of the SingleCellExperiment object via scatter plots.
plotEmptyDropsScatter( inSCE, sample = NULL, fdrCutoff = 0.01, defaultTheme = TRUE, dotSize = 0.1, title = NULL, titleSize = 18, xlab = NULL, ylab = NULL, axisSize = 12, axisLabelSize = 15, legendTitle = NULL, legendTitleSize = 12, legendSize = 10, combinePlot = "none", relHeights = 1, relWidths = 1, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
fdrCutoff |
Numeric. Thresholds barcodes based on the FDR values from
|
defaultTheme |
Removes grid in plot and sets axis title size to
|
dotSize |
Size of dots. Default |
title |
Title of plot. Default |
titleSize |
Size of title of plot. Default |
xlab |
Character vector. Label for x-axis. Default |
ylab |
Character vector. Label for y-axis. Default |
axisSize |
Size of x/y-axis ticks. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
legendTitle |
Title of legend. Default |
legendTitleSize |
size of legend title. Default |
legendSize |
size of legend. Default |
combinePlot |
Must be either |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
a ggplot object of the scatter plot.
runEmptyDrops, plotEmptyDropsResults
data(scExample, package = "singleCellTK")
sce <- runEmptyDrops(inSCE = sce)
plotEmptyDropsScatter(inSCE = sce)
runFindMarker
This function first reads the result saved in the metadata slot, named "findMarker"
and generated by runFindMarker. It then filters the statistics
based on the input parameters and collects the unique genes to plot. Only
genes identified as up-regulated are used. A gene identified
as up-regulated in multiple clusters is assigned only to the
cluster in which it has the highest log2FC value.
In the heatmap, there is always a cell annotation for the cluster
labels used when finding the markers, and a feature annotation for the
cluster each gene belongs to. By default the heatmap is split by these
two annotations. Additional legends can be added and the splitting can be
disabled.
plotFindMarkerHeatmap( inSCE, orderBy = "size", log2fcThreshold = 1, fdrThreshold = 0.05, minClustExprPerc = 0.7, maxCtrlExprPerc = 0.4, minMeanExpr = 1, topN = 10, decreasing = TRUE, rowLabel = TRUE, rowDataName = NULL, colDataName = NULL, featureAnnotations = NULL, cellAnnotations = NULL, featureAnnotationColor = NULL, cellAnnotationColor = NULL, colSplitBy = NULL, rowSplitBy = "marker", rowDend = FALSE, colDend = FALSE, title = "Top Marker Heatmap", ... ) plotMarkerDiffExp( inSCE, orderBy = "size", log2fcThreshold = 1, fdrThreshold = 0.05, minClustExprPerc = 0.7, maxCtrlExprPerc = 0.4, minMeanExpr = 1, topN = 10, decreasing = TRUE, rowDataName = NULL, colDataName = NULL, featureAnnotations = NULL, cellAnnotations = NULL, featureAnnotationColor = NULL, cellAnnotationColor = NULL, colSplitBy = NULL, rowSplitBy = "marker", rowDend = FALSE, colDend = FALSE, title = "Top Marker Heatmap", ... )
inSCE |
SingleCellExperiment inherited object. |
orderBy |
The ordering method of the clusters on the split heatmap.
Can be chosen from |
log2fcThreshold |
Only use DEGs with the absolute values of log2FC
larger than this value. Default |
fdrThreshold |
Only use DEGs with FDR value smaller than this value.
Default |
minClustExprPerc |
A numeric scalar. The minimum cutoff of the
percentage of cells in the cluster of interest that express the marker
gene. Default |
maxCtrlExprPerc |
A numeric scalar. The maximum cutoff of the
percentage of cells outside the cluster (control group) that express the
marker gene. Default |
minMeanExpr |
A numeric scalar. The minimum cutoff of the mean
expression value of the marker in the cluster of interest. Default |
topN |
An integer. Plot at most this number of top markers for each
cluster, ranked by log2FC value. Use |
decreasing |
Order the cluster decreasingly. Default |
rowLabel |
|
rowDataName |
character. The column name(s) in |
colDataName |
character. The column name(s) in |
featureAnnotations |
|
cellAnnotations |
|
featureAnnotationColor |
A named list. Customized color settings for
feature labeling. Should match the entries in the |
cellAnnotationColor |
A named list. Customized color settings for
cell labeling. Should match the entries in the |
colSplitBy |
character vector. Do semi-heatmap based on the grouping of
this(these) annotation(s). Should exist in either |
rowSplitBy |
character vector. Do semi-heatmap based on the grouping of
this(these) annotation(s). Should exist in either |
rowDend |
Whether to display row dendrogram. Default |
colDend |
Whether to display column dendrogram. Default |
title |
Text of the title, at the top of the heatmap. Default
|
... |
Other arguments passed to |
A Heatmap object
Yichen Wang
runFindMarker, getFindMarkerTopTable
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
sce.w <- subsetSCECols(sceBatches, colData = "batch == 'w'")
sce.w <- runFindMarker(sce.w, method = "wilcox", cluster = "cell_type")
plotFindMarkerHeatmap(sce.w)
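The marker selection described for plotFindMarkerHeatmap (keep up-regulated genes, assign a gene found in several clusters to the cluster where its log2FC is highest, then keep at most topN genes per cluster) can be sketched in base R as follows. The table, its column names and the helper selectMarkers are hypothetical. Only the log2FC and FDR filters are shown, and the sketch is not the package's implementation.

```r
# Hypothetical marker table; genes A and B appear in more than one cluster.
markers <- data.frame(
  gene    = c("A", "B", "A", "C", "D", "B", "E"),
  cluster = c("c1", "c1", "c2", "c2", "c2", "c3", "c3"),
  log2FC  = c(2.0, 1.5, 3.0, 1.2, -2.0, 1.1, 2.5),
  FDR     = c(0.01, 0.01, 0.001, 0.02, 0.001, 0.2, 0.01),
  stringsAsFactors = FALSE
)

# Hypothetical helper illustrating the selection steps.
selectMarkers <- function(markers, log2fcThreshold = 1, fdrThreshold = 0.05,
                          topN = 10) {
  # 1. Keep up-regulated genes that pass both thresholds.
  keep <- markers$log2FC > log2fcThreshold & markers$FDR < fdrThreshold
  m <- markers[keep, , drop = FALSE]
  # 2. Sort by log2FC (highest first); the first occurrence of a gene is
  #    then the cluster in which it has the highest log2FC.
  m <- m[order(m$log2FC, decreasing = TRUE), , drop = FALSE]
  m <- m[!duplicated(m$gene), , drop = FALSE]
  # 3. Keep at most topN genes per cluster (rows are already ranked).
  perCluster <- split(m, m$cluster)
  out <- do.call(rbind, lapply(perCluster, head, n = topN))
  rownames(out) <- NULL
  out
}

sel <- selectMarkers(markers, topN = 1)
allSel <- selectMarkers(markers)
print(sel)
print(allSel)
```

In the toy data, gene A is a candidate for both c1 and c2 and ends up in c2 only, because its log2FC is higher there.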
Calculate and produce a list of thresholded counts (on natural scale),
thresholds, bins, densities estimated on each bin, and the original data from thresholdSCRNACountMatrix
plotMASTThresholdGenes( inSCE, useAssay = "logcounts", doPlot = TRUE, isLogged = TRUE, check_sanity = TRUE )
inSCE |
SingleCellExperiment object |
useAssay |
character, default |
doPlot |
Logical scalar. Whether to directly plot in the plotting area.
If |
isLogged |
Logical scalar. Whether the assay used for the analysis is
logged. If not, will do a |
check_sanity |
Logical scalar. Whether to perform MAST's sanity check
to see if the counts are logged. Default |
Plots the thresholding onto the plotting region if doPlot == TRUE,
or returns a graphical object if doPlot == FALSE.
data("mouseBrainSubsetSCE")
plotMASTThresholdGenes(mouseBrainSubsetSCE)
Generate violin plots for pathway analysis results
plotPathway( inSCE, resultName, geneset, groupBy = NULL, boxplot = FALSE, violin = TRUE, dots = TRUE, summary = "median", axisSize = 10, axisLabelSize = 10, dotSize = 0.5, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, title = geneset, titleSize = NULL )
inSCE |
Input SingleCellExperiment object. With
|
resultName |
A single character of the name of a score matrix, which
should be found in |
geneset |
A single character specifying the geneset of interest. Should be found in the geneSetCollection used for performing the analysis. |
groupBy |
Either a single character specifying a column of
|
boxplot |
Boolean, Whether to add a boxplot. Default |
violin |
Boolean, Whether to add a violin plot. Default |
dots |
Boolean, If |
summary |
Adds a summary statistic, as well as a crossbar to the violin
plot. Options are |
axisSize |
Size of x/y-axis ticks. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
dotSize |
Size of dots. Default |
transparency |
Transparency of the dots, values will be 0-1. Default
|
defaultTheme |
Removes grid in plot and sets axis title size to
|
gridLine |
Adds a horizontal grid line if |
title |
Title of plot. Default using |
titleSize |
Size of the title of the plot. Default |
runGSVA() or runVAM() should be run before
using this function. Users can group the data by specifying groupBy.
A ggplot object for the violin plot
data("scExample", package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, assayName = "logcounts")
gs1 <- rownames(sce)[seq(10)]
gs2 <- rownames(sce)[seq(11,20)]
gs <- list("geneset1" = gs1, "geneset2" = gs2)
sce <- importGeneSetsFromList(inSCE = sce, geneSetList = gs, by = "rownames")
sce <- runVAM(inSCE = sce, geneSetCollectionName = "GeneSetCollection", useAssay = "logcounts")
plotPathway(sce, "VAM_GeneSetCollection_CDF", "geneset1")
Plot PCA run data from its components.
plotPCA( inSCE, colorBy = NULL, shape = NULL, pcX = "PC1", pcY = "PC2", reducedDimName = "PCA", runPCA = FALSE, useAssay = "logcounts" )
inSCE |
Input SingleCellExperiment object. |
colorBy |
The variable to color clusters by |
shape |
Shape of the points |
pcX |
User choice for the first principal component |
pcY |
User choice for the second principal component |
reducedDimName |
a name to store the results of the dimension reduction coordinates obtained from this method. This is stored in the SingleCellExperiment object in the reducedDims slot. Required. |
runPCA |
Run PCA if the reducedDimName does not exist. The default is FALSE. |
useAssay |
Indicate which assay to use. The default is "logcounts". |
A PCA plot
data("mouseBrainSubsetSCE")
plotPCA(mouseBrainSubsetSCE, colorBy = "level1class", reducedDimName = "PCA_counts")
A wrapper function which visualizes outputs from the runPerCellQC function stored in the colData slot of the SingleCellExperiment object via various plots.
plotRunPerCellQCResults( inSCE, sample = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, baseSize = 15, axisSize = NULL, axisLabelSize = NULL, transparency = 1, defaultTheme = TRUE, titleSize = NULL, relHeights = 1, relWidths = 1, labelSamples = TRUE, plotNCols = NULL, plotNRows = NULL, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved
dimension reduction components or a variable with saved results from
|
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
groupBy |
Groupings for each numeric value. Users may input a vector
of length equal to the number of samples in |
combinePlot |
Must be either |
violin |
Boolean. If |
boxplot |
Boolean. If |
dots |
Boolean. If |
dotSize |
Size of dots. Default |
summary |
Adds a summary statistic, as well as a crossbar to the
violin plot. Options are |
summaryTextSize |
The text size of the summary statistic displayed
above the violin plot. Default |
baseSize |
The base font size for all text. Default |
axisSize |
Size of x/y-axis ticks. Default |
axisLabelSize |
Size of x/y-axis labels. Default |
transparency |
Transparency of the dots, values will be 0-1. Default |
defaultTheme |
Removes grid in plot and sets axis title size to
|
titleSize |
Size of title of plot. Default |
relHeights |
Relative heights of plots when combine is set. Default
|
relWidths |
Relative widths of plots when combine is set. Default
|
labelSamples |
Will label sample name in title of plot if |
plotNCols |
Number of columns when plots are combined in a grid. Default
|
plotNRows |
Number of rows when plots are combined in a grid. Default
|
samplePerColumn |
If |
sampleRelHeights |
If there are multiple samples and combining by
|
sampleRelWidths |
If there are multiple samples and combining by
|
list of .ggplot objects
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runPerCellQC(sce)
plotRunPerCellQCResults(inSCE = sce)
plotScanpyDotPlot
plotScanpyDotPlot( inSCE, useAssay = NULL, features, groupBy, standardScale = NULL, title = "", vmin = NULL, vmax = NULL, colorBarTitle = "Mean expression in group" )
inSCE |
Input |
useAssay |
Assay to use for plotting. By default it will use counts assay. |
features |
Genes to plot. It is sometimes useful to pass a specific list of var names (e.g. genes). The var_names could be a dictionary or a list. |
groupBy |
The key of the observation grouping to consider. |
standardScale |
Whether or not to standardize the given dimension
between 0 and 1, meaning for each variable or group, subtract the minimum and
divide each by its maximum. Default |
title |
Provide title for the figure. |
vmin |
The value representing the lower limit of the color scale.
Values smaller than vmin are plotted with the same color as vmin.
Default |
vmax |
The value representing the upper limit of the color scale.
Values larger than vmax are plotted with the same color as vmax.
Default |
colorBarTitle |
Title for the color bar. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
markers <- c("MALAT1", "RPS27", "CST3")
plotScanpyDotPlot(sce, features = markers, groupBy = 'Scanpy_louvain_1')
## End(Not run)
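The standardScale, vmin and vmax arguments describe simple numeric transformations. As a rough illustration only, the base R sketch below mimics them on a made-up matrix of per-group mean expression. The helper names minMaxScale and clampValues are hypothetical and are not part of singleCellTK or Scanpy; the sketch assumes that scaling means subtracting the minimum and then dividing by the maximum of the shifted values.

```r
# Hypothetical group-by-gene matrix of mean expression (made-up numbers).
groupMeans <- matrix(
  c(1, 4, 2,
    3, 8, 2,
    5, 6, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cluster0", "cluster1", "cluster2"),
                  c("MALAT1", "RPS27", "CST3"))
)

# Scale one vector to the 0-1 range: subtract the minimum, then divide
# by the maximum of the shifted values.
minMaxScale <- function(x) {
  shifted <- x - min(x)
  if (max(shifted) == 0) return(shifted)  # constant input: avoid 0/0
  shifted / max(shifted)
}

# Apply per column, i.e. per variable (gene).
scaledByVar <- apply(groupMeans, 2, minMaxScale)

# Clamp values to [vmin, vmax] so that out-of-range values share the
# colour of the nearest limit. NULL leaves that side unclamped.
clampValues <- function(x, vmin = NULL, vmax = NULL) {
  if (!is.null(vmin)) x <- pmax(x, vmin)
  if (!is.null(vmax)) x <- pmin(x, vmax)
  x
}
clamped <- clampValues(groupMeans, vmin = 2, vmax = 6)
```

Scaling per group instead of per variable would apply the same helper across rows (apply(groupMeans, 1, minMaxScale), followed by t()).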
plotScanpyEmbedding
plotScanpyEmbedding( inSCE, reducedDimName, useAssay = NULL, color = NULL, legend = "right margin", title = "" )
inSCE |
Input SingleCellExperiment object. |
reducedDimName |
Name of reducedDims object containing embeddings. Eg. scanpyUMAP. |
useAssay |
Specify name of assay to use. Default is NULL. |
color |
Keys for annotations of observations/cells or variables/genes. |
legend |
Location of legend, either 'on data', 'right margin' or a valid keyword for the loc parameter of Legend. |
title |
Provide title for panels either as string or list of strings |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
plotScanpyEmbedding(sce, reducedDimName = "scanpyUMAP", color = 'Scanpy_louvain_1')
## End(Not run)
plotScanpyHeatmap
plotScanpyHeatmap( inSCE, useAssay = NULL, features, groupBy, standardScale = "var", vmin = NULL, vmax = NULL )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Assay to use for plotting. By default it will use counts assay. |
features |
Genes to plot. It is sometimes useful to pass a specific list of var names (e.g. genes). The var_names could be a dictionary or a list. |
groupBy |
The key of the observation grouping to consider. |
standardScale |
Whether or not to standardize the given dimension
between 0 and 1, meaning for each variable or group, subtract the minimum and
divide each by its maximum. Default "var". |
vmin |
The value representing the lower limit of the color scale.
Values smaller than vmin are plotted with the same color as vmin.
Default NULL. |
vmax |
The value representing the upper limit of the color scale.
Values larger than vmax are plotted with the same color as vmax.
Default NULL. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
markers <- c("MALAT1", "RPS27", "CST3")
plotScanpyHeatmap(sce, features = markers, groupBy = 'Scanpy_louvain_1')
## End(Not run)
plotScanpyHVG
plotScanpyHVG(inSCE, log = FALSE)
inSCE |
Input SingleCellExperiment object. |
log |
Plot on logarithmic axes. Default FALSE. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
plotScanpyHVG(sce)
## End(Not run)
plotScanpyMarkerGenes
plotScanpyMarkerGenes( inSCE, groups = NULL, nGenes = 10, nCols = 4, sharey = FALSE )
inSCE |
Input SingleCellExperiment object. |
groups |
The groups for which to show the gene ranking. Default NULL. |
nGenes |
Number of genes to show. Default 10. |
nCols |
Number of panels shown per row. Default 4. |
sharey |
Controls whether the y-axis of each panel should be shared.
Default FALSE. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
plotScanpyMarkerGenes(sce, groups = '0')
## End(Not run)
plotScanpyMarkerGenesDotPlot
plotScanpyMarkerGenesDotPlot( inSCE, groups = NULL, nGenes = 10, groupBy, log2fcThreshold = NULL, parameters = "logfoldchanges", standardScale = NULL, features = NULL, title = "", vmin = NULL, vmax = NULL, colorBarTitle = "log fold change" )
inSCE |
Input SingleCellExperiment object. |
groups |
The groups for which to show the gene ranking. Default NULL. |
nGenes |
Number of genes to show. Default 10. |
groupBy |
The key of the observation grouping to consider. By default, the groupby is chosen from the rank genes groups parameter. |
log2fcThreshold |
Only output DEGs with the absolute values of log2FC
larger than this value. Default NULL. |
parameters |
The options for marker genes results to plot are: ‘scores’, ‘logfoldchanges’, ‘pvals’, ‘pvals_adj’, ‘log10_pvals’, ‘log10_pvals_adj’. If NULL provided then it uses mean gene value to plot. |
standardScale |
Whether or not to standardize the given dimension
between 0 and 1, meaning for each variable or group, subtract the minimum and
divide each by its maximum. Default NULL. |
features |
Genes to plot. It is sometimes useful to pass a specific list of
var names (e.g. genes) to check their fold changes or p-values, instead of
the top/bottom genes. The gene names could be a dictionary or a list.
Default NULL. |
title |
Provide title for the figure. |
vmin |
The value representing the lower limit of the color scale.
Values smaller than vmin are plotted with the same color as vmin.
Default NULL. |
vmax |
The value representing the upper limit of the color scale.
Values larger than vmax are plotted with the same color as vmax.
Default NULL. |
colorBarTitle |
Title for the color bar. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
plotScanpyMarkerGenesDotPlot(sce, groupBy = 'Scanpy_louvain_1')
## End(Not run)
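The log2fcThreshold argument keeps only genes whose absolute log2 fold change exceeds the given value. The sketch below shows that filter on a made-up marker table in base R. The data frame and its column names (gene, cluster, log2FC) are hypothetical and do not reflect how singleCellTK or Scanpy store marker results.

```r
# Hypothetical marker table; the column names are illustrative only.
markerTable <- data.frame(
  gene = c("MALAT1", "RPS27", "CST3", "LYZ"),
  cluster = c("0", "0", "1", "1"),
  log2FC = c(2.1, -0.3, -1.8, 0.4),
  stringsAsFactors = FALSE
)

log2fcThreshold <- 1

# Keep rows whose absolute log2 fold change is larger than the threshold,
# so strongly down-regulated genes are retained as well.
keep <- abs(markerTable$log2FC) > log2fcThreshold
filtered <- markerTable[keep, ]
```

Because the comparison uses the absolute value, CST3 (log2FC of -1.8) passes the filter alongside MALAT1.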
plotScanpyMarkerGenesHeatmap
plotScanpyMarkerGenesHeatmap( inSCE, groups = NULL, groupBy, nGenes = 10, features = NULL, log2fcThreshold = NULL )
inSCE |
Input SingleCellExperiment object. |
groups |
The groups for which to show the gene ranking. Default NULL. |
groupBy |
The key of the observation grouping to consider. By default, the groupby is chosen from the rank genes groups parameter. |
nGenes |
Number of genes to show. Default 10. |
features |
Genes to plot. It is sometimes useful to pass a specific list of var names (e.g. genes). The var_names could be a dictionary or a list. Default NULL. |
log2fcThreshold |
Only output DEGs with the absolute values of log2FC
larger than this value. Default NULL. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
plotScanpyMarkerGenesHeatmap(sce, groupBy = 'Scanpy_louvain_1')
## End(Not run)
plotScanpyMarkerGenesMatrixPlot
plotScanpyMarkerGenesMatrixPlot( inSCE, groups = NULL, nGenes = 10, groupBy, log2fcThreshold = NULL, parameters = "logfoldchanges", standardScale = "var", features = NULL, title = "", vmin = NULL, vmax = NULL, colorBarTitle = "log fold change" )
inSCE |
Input SingleCellExperiment object. |
groups |
The groups for which to show the gene ranking. Default NULL. |
nGenes |
Number of genes to show. Default 10. |
groupBy |
The key of the observation grouping to consider. By default, the groupby is chosen from the rank genes groups parameter. |
log2fcThreshold |
Only output DEGs with the absolute values of log2FC
larger than this value. Default NULL. |
parameters |
The options for marker genes results to plot are: ‘scores’, ‘logfoldchanges’, ‘pvals’, ‘pvals_adj’, ‘log10_pvals’, ‘log10_pvals_adj’. If NULL provided then it uses mean gene value to plot. |
standardScale |
Whether or not to standardize the given dimension
between 0 and 1, meaning for each variable or group, subtract the minimum and
divide each by its maximum. Default "var". |
features |
Genes to plot. It is sometimes useful to pass a specific list of
var names (e.g. genes) to check their fold changes or p-values, instead of
the top/bottom genes. The var_names could be a dictionary or a list.
Default NULL. |
title |
Provide title for the figure. |
vmin |
The value representing the lower limit of the color scale.
Values smaller than vmin are plotted with the same color as vmin.
Default NULL. |
vmax |
The value representing the upper limit of the color scale.
Values larger than vmax are plotted with the same color as vmax.
Default NULL. |
colorBarTitle |
Title for the color bar. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
plotScanpyMarkerGenesMatrixPlot(sce, groupBy = 'Scanpy_louvain_1')
## End(Not run)
plotScanpyMarkerGenesViolin
plotScanpyMarkerGenesViolin(inSCE, groups = NULL, features = NULL, nGenes = 10)
inSCE |
Input SingleCellExperiment object. |
groups |
The groups for which to show the gene ranking. Default NULL. |
features |
List of genes to plot. Only useful when a custom gene list is of interest. Default NULL. |
nGenes |
Number of genes to show. Default 10. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
plotScanpyMarkerGenesViolin(sce, groups = '0')
## End(Not run)
plotScanpyMatrixPlot
plotScanpyMatrixPlot( inSCE, useAssay = NULL, features, groupBy, standardScale = NULL, title = "", vmin = NULL, vmax = NULL, colorBarTitle = "Mean expression in group" )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Assay to use for plotting. By default it will use counts assay. |
features |
Genes to plot. It is sometimes useful to pass a specific list of var names (e.g. genes). The var_names could be a dictionary or a list. |
groupBy |
The key of the observation grouping to consider. |
standardScale |
Whether or not to standardize the given dimension
between 0 and 1, meaning for each variable or group, subtract the minimum and
divide each by its maximum. Default NULL. |
title |
Provide title for the figure. |
vmin |
The value representing the lower limit of the color scale.
Values smaller than vmin are plotted with the same color as vmin.
Default NULL. |
vmax |
The value representing the upper limit of the color scale.
Values larger than vmax are plotted with the same color as vmax.
Default NULL. |
colorBarTitle |
Title for the color bar. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
markers <- c("MALAT1", "RPS27", "CST3")
plotScanpyMatrixPlot(sce, features = markers, groupBy = 'Scanpy_louvain_1')
## End(Not run)
plotScanpyPCA
plotScanpyPCA( inSCE, reducedDimName = "scanpyPCA", color = NULL, title = "", legend = "right margin" )
inSCE |
Input SingleCellExperiment object. |
reducedDimName |
Name of new reducedDims object containing Scanpy PCA. |
color |
Keys for annotations of observations/cells or variables/genes. |
title |
Provide title for panels either as string or list of strings |
legend |
Location of legend, either 'on data', 'right margin' or a valid keyword for the loc parameter of Legend. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
plotScanpyPCA(sce)
## End(Not run)
plotScanpyPCAGeneRanking
plotScanpyPCAGeneRanking(inSCE, PC_comp = "1,2,3", includeLowest = TRUE)
inSCE |
Input SingleCellExperiment object. |
PC_comp |
Principal components to plot, given as a comma-separated string. For example, '1,2,3' means [1, 2, 3]: the first, second and third principal components. |
includeLowest |
Whether to show the variables with both highest and
lowest loadings. Default TRUE. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
plotScanpyPCAGeneRanking(sce)
## End(Not run)
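PC_comp is passed as a comma-separated string, and includeLowest controls whether the genes with the lowest loadings are shown alongside the highest. The base R sketch below illustrates both ideas on random data, using stats::prcomp rather than Scanpy. The rankGenes helper and the simulated matrix are hypothetical, and the package may parse and rank differently internally.

```r
# Parse a component string such as "1,2,3" into integer indices.
PC_comp <- "1,2,3"
components <- as.integer(strsplit(PC_comp, ",", fixed = TRUE)[[1]])

# Made-up expression matrix (20 cells x 10 genes) and its PCA loadings.
set.seed(1)
expr <- matrix(rnorm(200), nrow = 20,
               dimnames = list(NULL, paste0("gene", 1:10)))
loadings <- prcomp(expr)$rotation  # genes x components

# Rank genes on one component by loading. With includeLowest = TRUE the
# n lowest-loading genes are returned after the n highest.
rankGenes <- function(loadings, pc, n = 3, includeLowest = TRUE) {
  ordered <- sort(loadings[, pc], decreasing = TRUE)
  top <- names(head(ordered, n))
  if (!includeLowest) return(top)
  c(top, names(tail(ordered, n)))
}

ranked <- lapply(components, function(pc) rankGenes(loadings, pc))
```

Each element of ranked holds six gene names for one component: three from the top of the loading order and three from the bottom.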
plotScanpyPCAVariance
plotScanpyPCAVariance(inSCE, nPCs = 50, log = FALSE)
inSCE |
Input SingleCellExperiment object. |
nPCs |
Number of PCs to show. Default 50. |
log |
Plot on logarithmic scale. Default FALSE. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
plotScanpyPCAVariance(sce)
## End(Not run)
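For readers unfamiliar with this kind of plot, the sketch below computes the quantity such a plot typically displays, the fraction of variance explained by each principal component, and draws the first nPCs of them on a logarithmic axis. It uses base R's prcomp on random data purely for illustration; it is an assumption that the Scanpy-backed function shows the same ratio, and none of the variable names come from singleCellTK.

```r
# Made-up data: 50 cells x 10 features.
set.seed(1)
expr <- matrix(rnorm(500), nrow = 50)

pca <- prcomp(expr)

# Fraction of total variance explained by each component.
varianceRatio <- pca$sdev^2 / sum(pca$sdev^2)

# Show only the first nPCs components (fewer if fewer were computed).
nPCs <- 5
shown <- head(varianceRatio, nPCs)

# log = "y" draws the y-axis on a logarithmic scale.
plot(seq_along(shown), shown, log = "y", type = "b",
     xlab = "Principal component", ylab = "Variance ratio")
```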
plotScanpyViolin
plotScanpyViolin( inSCE, useAssay = NULL, features, groupBy, xlabel = "", ylabel = NULL )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Assay to use for plotting. By default it will use counts assay. |
features |
Genes to plot. It is sometimes useful to pass a specific list of var names (e.g. genes). The var_names could be a dictionary or a list. |
groupBy |
The key of the observation grouping to consider. |
xlabel |
Label of the x axis. Defaults to groupBy. |
ylabel |
Label of the y axis. If NULL and groupBy is NULL, defaults to 'value'. If NULL and groupBy is not NULL, defaults to features. |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
markers <- c("MALAT1", "RPS27", "CST3")
plotScanpyViolin(sce, features = markers, groupBy = "Scanpy_louvain_1")
## End(Not run)
A wrapper function which visualizes outputs from the runScDblFinder function stored in the colData slot of the SingleCellExperiment object via various plots.
plotScDblFinderResults( inSCE, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
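Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results from runScDblFinder. Required. |
sample |
Character vector or colData variable name. Indicates which sample each cell belongs to. Default NULL. |
shape |
If provided, add shapes based on the value. Default NULL. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or it can be retrieved from the colData slot. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "all". |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default FALSE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
reducedDimName |
Saved dimension reduction name in the SingleCellExperiment object. Default "UMAP". |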
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Similar to dim1. Default is NULL. |
bin |
Numeric vector. If single value, will divide the numeric values into bin groups. If more than one value, will bin numeric values using values as a cut point. Default NULL. |
binLabel |
Character vector. Labels for the bins created by bin. Default NULL. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
dotSize |
Size of dots. Default 0.5. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default "median". |
summaryTextSize |
The text size of the summary statistic displayed above the violin plot. Default 3. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
baseSize |
The base font size for all text. Default 15. |
titleSize |
Size of title of plot. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
legendSize |
Size of legend. Default NULL. |
legendTitleSize |
Size of legend title. Default NULL. |
relHeights |
Relative heights of plots when combine is set. Default 1. |
relWidths |
Relative widths of plots when combine is set. Default c(1, 1, 1). |
plotNCols |
Number of columns when plots are combined in a grid. Default NULL. |
plotNRows |
Number of rows when plots are combined in a grid. Default NULL. |
labelSamples |
Will label sample name in title of plot if TRUE. Default TRUE. |
samplePerColumn |
If TRUE, when there are multiple samples and combining by "all", the output .ggplot will have plots from each sample on a single column. Default TRUE. |
sampleRelHeights |
If there are multiple samples and combining by "all", the relative heights for each plot. Default 1. |
sampleRelWidths |
If there are multiple samples and combining by "all", the relative widths for each plot. Default 1. |
list of .ggplot objects
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runScDblFinder(sce)
plotScDblFinderResults(inSCE = sce, reducedDimName = "UMAP")
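The summary argument (set to "median" in the usage above) adds one statistic per group to the violin plot. As a minimal sketch of what that statistic is, the base R code below computes a per-group median or mean from made-up doublet scores. The scores, group labels and variable names are hypothetical, and the package's own plotting code is not reproduced here.

```r
# Made-up doublet scores and a grouping variable for six cells.
scores <- c(0.1, 0.2, 0.9, 0.4, 0.6, 0.8)
groupBy <- c("A", "A", "A", "B", "B", "B")

# Pick the summary function by name, mirroring the "mean"/"median" choice.
summaryName <- "median"
summaryFun <- switch(summaryName, mean = mean, median = median)

# One summary value per group; this is the number a violin plot would
# annotate and mark with a crossbar.
perGroup <- tapply(scores, groupBy, summaryFun)
```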
A wrapper function which visualizes outputs from the runCxdsBcdsHybrid function stored in the colData slot of the SingleCellExperiment object via various plots.
plotScdsHybridResults( inSCE, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, reducedDimName = "UMAP", xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results from runCxdsBcdsHybrid. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. Default NULL. |
shape |
If provided, add shapes based on the value. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in the SingleCellExperiment object, or can be retrieved from the colData slot. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "all". |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default FALSE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
reducedDimName |
Saved dimension reduction name in the SingleCellExperiment object. Default "UMAP". |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
bin |
Numeric vector. If single value, will divide the numeric values into the 'bin' groups. If more than one value, will bin numeric values using values as a cut point. |
binLabel |
Character vector. Labels for the bins created by the 'bin' parameter. Default NULL. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
dotSize |
Size of dots. Default 0.5. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default "median". |
summaryTextSize |
The text size of the summary statistic displayed above the violin plot. Default 3. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
baseSize |
The base font size for all text. Default 15. Can be overridden by titleSize, axisSize, axisLabelSize, legendSize, and legendTitleSize. |
titleSize |
Size of title of plot. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
legendSize |
size of legend. Default NULL. |
legendTitleSize |
size of legend title. Default NULL. |
relHeights |
Relative heights of plots when combine is set. |
relWidths |
Relative widths of plots when combine is set. |
plotNCols |
Number of columns when plots are combined in a grid. |
plotNRows |
Number of rows when plots are combined in a grid. |
labelSamples |
Will label sample name in title of plot if TRUE. Default TRUE. |
samplePerColumn |
If TRUE, when there are multiple samples and combining by "all", the output .ggplot will have plots from each sample on a single column. Default TRUE. |
sampleRelHeights |
If there are multiple samples and combining by "all", the relative heights for each plot. |
sampleRelWidths |
If there are multiple samples and combining by "all", the relative widths for each plot. |
list of .ggplot objects
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runCxdsBcdsHybrid(sce)
plotScdsHybridResults(inSCE = sce, reducedDimName = "UMAP")
Visualizes values stored in the assay slot of a SingleCellExperiment object via a bar plot.
plotSCEBarAssayData( inSCE, feature, sample = NULL, useAssay = "counts", featureLocation = NULL, featureDisplay = NULL, groupBy = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dotSize = 0.1, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, summary = NULL, title = NULL, titleSize = NULL, combinePlot = TRUE )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
feature |
Name of feature stored in assay of SingleCellExperiment object. |
sample |
Character vector. Indicates which sample each cell belongs to. |
useAssay |
Indicate which assay to use. Default "counts". |
featureLocation |
Indicates which column name of rowData to query gene. |
featureDisplay |
Indicates which column name of rowData to use to display feature for visualization. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or the groupings can be retrieved from the colData slot. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
gridLine |
Adds a horizontal grid line if TRUE. Will still be drawn even if defaultTheme is TRUE. Default FALSE. |
summary |
Adds a summary statistic, as well as a crossbar, to the bar plot. Options are "mean" or "median". Default NULL. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default NULL. |
combinePlot |
Boolean. If multiple plots are generated (multiple samples, etc.), will combine plots using 'cowplot::plot_grid'. Default TRUE. |
a ggplot of the barplot of assay data.
data("mouseBrainSubsetSCE")
plotSCEBarAssayData(
  inSCE = mouseBrainSubsetSCE,
  feature = "Apoe",
  groupBy = "sex"
)
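As a rough illustration of what the groupBy and summary arguments describe, the following base R sketch computes a per-group mean or median from toy values. The helper summariseByGroup and the toy vectors are hypothetical and are not part of singleCellTK. How the package computes its summary internally is not shown in this manual.

```r
# Toy data: one expression value per cell and one grouping label per cell.
expression <- c(0, 3, 5, 2, 8, 1)
sex <- c("F", "F", "F", "M", "M", "M")

# Hypothetical helper (not a singleCellTK function): summarise values per group.
summariseByGroup <- function(values, groups, stat = c("mean", "median")) {
  stat <- match.arg(stat)
  statFun <- switch(stat, mean = mean, median = median)
  tapply(values, groups, statFun)
}

groupMeans <- summariseByGroup(expression, sex, "mean")
groupMedians <- summariseByGroup(expression, sex, "median")
print(groupMeans)
print(groupMedians)
```

The two options can differ noticeably on sparse count data, where a few high values pull the mean up while the median stays low.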
Visualizes values stored in the colData slot of a SingleCellExperiment object via a bar plot.
plotSCEBarColData( inSCE, coldata, sample = NULL, groupBy = NULL, dots = TRUE, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dotSize = 0.1, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, summary = NULL, title = NULL, titleSize = NULL, combinePlot = TRUE )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
coldata |
colData value that will be plotted. |
sample |
Character vector. Indicates which sample each cell belongs to. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or the groupings can be retrieved from the colData slot. Default NULL. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
gridLine |
Adds a horizontal grid line if TRUE. Will still be drawn even if defaultTheme is TRUE. Default FALSE. |
summary |
Adds a summary statistic, as well as a crossbar, to the bar plot. Options are "mean" or "median". Default NULL. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default NULL. |
combinePlot |
Boolean. If multiple plots are generated (multiple samples, etc.), will combine plots using 'cowplot::plot_grid'. Default TRUE. |
a ggplot of the barplot of coldata.
data("mouseBrainSubsetSCE")
plotSCEBarColData(
  inSCE = mouseBrainSubsetSCE,
  coldata = "age",
  groupBy = "sex"
)
Plot mean feature value in each batch of a SingleCellExperiment object
plotSCEBatchFeatureMean( inSCE, useAssay = NULL, useReddim = NULL, useAltExp = NULL, batch = "batch", xlab = "batch", ylab = "Feature Mean", ... )
inSCE |
SingleCellExperiment inherited object. |
useAssay |
A single character. The name of the assay that stores the value to plot. For |
useReddim |
A single character. The name of the dimension reduced matrix that stores the value to plot. Default NULL. |
useAltExp |
A single character. The name of the alternative experiment that stores an assay of the value to plot. Default NULL. |
batch |
A single character. The name of batch annotation column in |
xlab |
Label for x-axis. Default "batch". |
ylab |
Label for y-axis. Default "Feature Mean". |
... |
Additional arguments passed to |
ggplot
data('sceBatches', package = 'singleCellTK')
plotSCEBatchFeatureMean(sceBatches, useAssay = "counts")
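The quantity being plotted, the mean of each feature within each batch, can be sketched in base R as follows. The toy matrix, the batch labels, and the variable names are made up for illustration. This is a minimal sketch of the idea and not the package's own implementation.

```r
# Toy count matrix: 4 features (rows) by 6 cells (columns).
counts <- matrix(
  c(1, 2, 3, 4,  5,  6,
    0, 0, 1, 1,  2,  2,
    5, 5, 5, 1,  1,  1,
    2, 4, 6, 8, 10, 12),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# Hypothetical batch label for each cell (same order as the columns).
batch <- c("b1", "b1", "b1", "b2", "b2", "b2")

# Mean of every feature within each batch: one column per batch.
cellsByBatch <- split(seq_len(ncol(counts)), batch)
batchMeans <- sapply(cellsByBatch, function(idx) {
  rowMeans(counts[, idx, drop = FALSE])
})
print(batchMeans)
```

Each column of batchMeans holds one value per feature, so comparing the distributions of the columns gives a quick view of how far apart the batches sit.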
Visualizes values stored in any slot of a SingleCellExperiment object via a density plot.
plotSCEDensity( inSCE, slotName, itemName, sample = NULL, feature = NULL, dimension = NULL, groupBy = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, defaultTheme = TRUE, title = NULL, titleSize = 18, cutoff = NULL, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
slotName |
Desired slot of SingleCellExperiment used for plotting. Possible options: "assays", "colData", "metadata", "reducedDims". Required. |
itemName |
Desired vector within the slot used for plotting. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. |
feature |
Desired name of feature stored in assay of SingleCellExperiment object. Only used when "assays" slotName is selected. Default NULL. |
dimension |
Desired dimension stored in the specified reducedDims. Either an integer which indicates the column or a character vector specifies column name. By default, the 1st dimension/column will be used. Only used when "reducedDims" slotName is selected. Default NULL. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or the groupings can be retrieved from the colData slot. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 18. |
cutoff |
Numeric value. The plot will be annotated with a vertical line if set. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
labels to each plot. If set to "default", will use the name of the samples as the labels. If set to "none", no label will be plotted. |
a ggplot object of the density plot.
data("mouseBrainSubsetSCE")
plotSCEDensity(
  inSCE = mouseBrainSubsetSCE,
  slotName = "assays",
  itemName = "counts",
  feature = "Apoe",
  groupBy = "sex"
)
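To make the groupBy and cutoff arguments more concrete, the sketch below draws one density curve per group and marks a cutoff with a vertical line. It uses base R graphics and simulated values as a stand-in, so the data, the group labels, and the cutoff value are all assumptions, and the output will not look like the ggplot2 figure the package produces.

```r
# Simulated values for two hypothetical groups of cells.
set.seed(42)
values <- c(rnorm(50, mean = 2), rnorm(50, mean = 5))
group <- rep(c("F", "M"), each = 50)
cutoff <- 3.5

# One kernel density estimate per group.
dens <- lapply(split(values, group), density)

# Draw to a null device so the sketch also runs without a display.
pdf(NULL)
plot(
  dens$F,
  xlim = range(values),
  ylim = c(0, max(dens$F$y, dens$M$y)),
  main = "Density by group",
  xlab = "value"
)
lines(dens$M, lty = 2)
abline(v = cutoff, col = "red")  # vertical line marking the cutoff
invisible(dev.off())
```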
Visualizes values stored in the assay slot of a SingleCellExperiment object via a density plot.
plotSCEDensityAssayData( inSCE, feature, sample = NULL, useAssay = "counts", featureLocation = NULL, featureDisplay = NULL, groupBy = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, defaultTheme = TRUE, cutoff = NULL, title = NULL, titleSize = 18, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
feature |
Name of feature stored in assay of SingleCellExperiment object. |
sample |
Character vector. Indicates which sample each cell belongs to. |
useAssay |
Indicate which assay to use. Default "counts". |
featureLocation |
Indicates which column name of rowData to query gene. |
featureDisplay |
Indicates which column name of rowData to use to display feature for visualization. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or the groupings can be retrieved from the colData slot. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
cutoff |
Numeric value. The plot will be annotated with a vertical line if set. Default NULL. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 18. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
labels to each plot. If set to "default", will use the name of the samples as the labels. If set to "none", no label will be plotted. |
a ggplot of the density plot of assay data.
data("mouseBrainSubsetSCE")
plotSCEDensityAssayData(
  inSCE = mouseBrainSubsetSCE,
  feature = "Apoe"
)
Visualizes values stored in the colData slot of a SingleCellExperiment object via a density plot.
plotSCEDensityColData( inSCE, coldata, sample = NULL, groupBy = NULL, xlab = NULL, ylab = NULL, baseSize = 12, axisSize = NULL, axisLabelSize = NULL, defaultTheme = TRUE, title = NULL, titleSize = 18, cutoff = NULL, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
coldata |
colData value that will be plotted. |
sample |
Character vector. Indicates which sample each cell belongs to. |
groupBy |
Groupings for each numeric value. A user may input a vector equal in length to the number of samples in the SingleCellExperiment object, or the groupings can be retrieved from the colData slot. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
baseSize |
The base font size for all text. Default 12. Can be overridden by titleSize, axisSize, axisLabelSize, legendSize, and legendTitleSize. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 18. |
cutoff |
Numeric value. The plot will be annotated with a vertical line if set. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
labels to each plot. If set to "default", will use the name of the samples as the labels. If set to "none", no label will be plotted. |
a ggplot of the density plot of colData.
data("mouseBrainSubsetSCE")
plotSCEDensityColData(
  inSCE = mouseBrainSubsetSCE,
  coldata = "age",
  groupBy = "sex"
)
Plot results of reduced dimensions data and colors by annotation data stored in the colData slot.
plotSCEDimReduceColData( inSCE, colorBy, reducedDimName, sample = NULL, groupBy = NULL, conditionClass = NULL, shape = NULL, xlab = NULL, ylab = NULL, baseSize = 12, axisSize = NULL, axisLabelSize = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, dotSize = 0.1, transparency = 1, colorScale = NULL, colorLow = "white", colorMid = "gray", colorHigh = "blue", defaultTheme = TRUE, title = NULL, titleSize = 15, labelClusters = TRUE, clusterLabelSize = 3.5, legendTitle = NULL, legendTitleSize = NULL, legendSize = NULL, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
colorBy |
Color by a condition (any column of the annotation data). Required. |
reducedDimName |
Saved dimension reduction matrix name in the SingleCellExperiment object. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. |
groupBy |
Group by a condition (any column of the annotation data). Default NULL. |
conditionClass |
Class of the annotation data used in colorBy. Options are NULL, "factor" or "numeric". If NULL, class will default to the original class. Default NULL. |
shape |
Add shapes to each condition. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
baseSize |
The base font size for all text. Default 12. Can be overridden by titleSize, axisSize, axisLabelSize, legendSize, and legendTitleSize. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
bin |
Numeric vector. If single value, will divide the numeric values into the 'bin' groups. If more than one value, will bin numeric values using values as a cut point. |
binLabel |
Character vector. Labels for the bins created by the 'bin' parameter. Default NULL. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
colorScale |
Vector. Needs to be same length as the number of unique levels of colorBy. Will be used only if conditionClass = "factor" or "character". Default NULL. |
colorLow |
Character. A color available from 'colors()'. The color will be used to signify the lowest values on the scale. Default 'white'. |
colorMid |
Character. A color available from 'colors()'. The color will be used to signify the midpoint on the scale. Default 'gray'. |
colorHigh |
Character. A color available from 'colors()'. The color will be used to signify the highest values on the scale. Default 'blue'. |
defaultTheme |
adds grid to plot when TRUE. Default TRUE. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 15. |
labelClusters |
Logical. Whether the cluster labels are plotted. |
clusterLabelSize |
Numeric. Determines the size of cluster label when 'labelClusters' is set to TRUE. Default 3.5. |
legendTitle |
Title of legend. Default NULL. |
legendTitleSize |
Size of legend title. Default NULL. |
legendSize |
Size of legend. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
labels to each plot. If set to "default", will use the name of the samples as the labels. If set to "none", no label will be plotted. |
a ggplot of the reduced dimension plot of coldata.
data("mouseBrainSubsetSCE")
plotSCEDimReduceColData(
  inSCE = mouseBrainSubsetSCE,
  colorBy = "tissue",
  shape = NULL,
  conditionClass = "factor",
  reducedDimName = "TSNE_counts",
  xlab = "tSNE1",
  ylab = "tSNE2",
  labelClusters = TRUE
)
plotSCEDimReduceColData(
  inSCE = mouseBrainSubsetSCE,
  colorBy = "age",
  shape = NULL,
  conditionClass = "numeric",
  reducedDimName = "TSNE_counts",
  bin = c(-Inf, 20, 25, +Inf),
  xlab = "tSNE1",
  ylab = "tSNE2",
  labelClusters = FALSE
)
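The two modes of the bin argument (a single number of groups versus a vector of cut points) can be illustrated with base R's cut(), which has the same two modes. Whether the package calls cut() internally is an assumption. The toy age vector and the labels, which play the role of binLabel, are made up for this sketch.

```r
# Toy numeric annotation, e.g. an age recorded for each cell.
age <- c(18, 21, 22, 24, 26, 30)

# A single value: split the observed range into that many equal-width groups.
byCount <- cut(age, breaks = 3)
print(table(byCount))

# Several values: treat them as cut points, as in the second example above,
# and attach one label per resulting bin (the role played by binLabel).
byCutPoints <- cut(
  age,
  breaks = c(-Inf, 20, 25, Inf),
  labels = c("<=20", "20-25", ">25")
)
print(table(byCutPoints))
```

With n cut points there are n - 1 bins, so a label vector needs one entry fewer than the number of cut points.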
Plot results of reduced dimensions data and colors by feature data stored in the assays slot.
plotSCEDimReduceFeatures( inSCE, feature, reducedDimName, sample = NULL, featureLocation = NULL, featureDisplay = NULL, shape = NULL, useAssay = "logcounts", xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, dotSize = 0.1, transparency = 1, colorLow = "white", colorMid = "gray", colorHigh = "blue", defaultTheme = TRUE, title = NULL, titleSize = 15, legendTitle = NULL, legendSize = 10, legendTitleSize = 12, groupBy = NULL, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
feature |
Name of feature stored in assay of SingleCellExperiment object. |
reducedDimName |
saved dimension reduction name in the SingleCellExperiment object. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. |
featureLocation |
Indicates which column name of rowData to query gene. |
featureDisplay |
Indicates which column name of rowData to use to display feature for visualization. |
shape |
add shapes to each condition. Default NULL. |
useAssay |
Indicate which assay to use. Default "logcounts". |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
bin |
Numeric vector. If single value, will divide the numeric values into the 'bin' groups. If more than one value, will bin numeric values using values as a cut point. |
binLabel |
Character vector. Labels for the bins created by the 'bin' parameter. Default NULL. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
colorLow |
Character. A color available from 'colors()'. The color will be used to signify the lowest values on the scale. Default 'white'. |
colorMid |
Character. A color available from 'colors()'. The color will be used to signify the midpoint on the scale. Default 'gray'. |
colorHigh |
Character. A color available from 'colors()'. The color will be used to signify the highest values on the scale. Default 'blue'. |
defaultTheme |
adds grid to plot when TRUE. Default TRUE. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 15. |
legendTitle |
title of legend. Default NULL. |
legendSize |
size of legend. Default 10. |
legendTitleSize |
size of legend title. Default 12. |
groupBy |
Facet wrap the scatterplot based on value. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
labels to each plot. If set to "default", will use the name of the samples as the labels. If set to "none", no label will be plotted. |
a ggplot of the reduced dimension plot of feature data.
data("mouseBrainSubsetSCE")
plotSCEDimReduceFeatures(
  inSCE = mouseBrainSubsetSCE,
  feature = "Apoe",
  shape = NULL,
  reducedDimName = "TSNE_counts",
  useAssay = "counts",
  xlab = "tSNE1",
  ylab = "tSNE2"
)
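The colorLow, colorMid, and colorHigh arguments describe a three-point color scale. The base R sketch below checks that the default names are valid entries of colors() and maps a few toy expression values onto a white, gray, blue ramp. It uses grDevices::colorRamp with linear RGB interpolation as a stand-in, so the intermediate shades may differ from what the package's ggplot2 scale draws. The expression values are made up.

```r
# The three default colors, all of which must be names known to colors().
scaleColors <- c(low = "white", mid = "gray", high = "blue")
stopifnot(all(scaleColors %in% grDevices::colors()))

# Build a ramp that interpolates low -> mid -> high over the interval [0, 1].
ramp <- grDevices::colorRamp(unname(scaleColors))

# Toy expression values, rescaled to [0, 1] before looking up a color.
expr <- c(0, 2.5, 5)
scaled <- (expr - min(expr)) / (max(expr) - min(expr))
cols <- grDevices::rgb(ramp(scaled), maxColorValue = 255)
print(cols)
```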
Plot a heatmap using data stored in a SingleCellExperiment object
plotSCEHeatmap( inSCE, useAssay = "logcounts", useReducedDim = NULL, doLog = FALSE, featureIndex = NULL, cellIndex = NULL, scale = TRUE, trim = c(-2, 2), featureIndexBy = "rownames", cellIndexBy = "rownames", cluster_columns = FALSE, cluster_rows = FALSE, rowDataName = NULL, colDataName = NULL, aggregateRow = NULL, aggregateCol = NULL, featureAnnotations = NULL, cellAnnotations = NULL, featureAnnotationColor = NULL, cellAnnotationColor = NULL, palette = c("ggplot", "celda", "random"), heatmapPalette = c("sequential", "diverging"), addCellSummary = NULL, rowSplitBy = NULL, colSplitBy = NULL, rowLabel = FALSE, colLabel = FALSE, rowLabelSize = 6, colLabelSize = 6, rowDend = TRUE, colDend = TRUE, title = NULL, rowTitle = "Features", colTitle = "Cells", rowGap = grid::unit(0, "mm"), colGap = grid::unit(0, "mm"), border = FALSE, colorScheme = NULL, ... )
inSCE |
SingleCellExperiment inherited object. |
useAssay |
character. A string indicating the assay name that provides the expression level to plot. Only for |
useReducedDim |
character. A string indicating the reducedDim name that provides the expression level to plot. Only for |
doLog |
Logical scalar. Whether to do |
featureIndex |
A vector that can subset the input SCE object by rows (features). Alternatively, it can be a vector identifying features in another feature list indicated by |
cellIndex |
A vector that can subset the input SCE object by columns (cells). Alternatively, it can be a vector identifying cells in another cell list indicated by |
scale |
Whether to perform z-score or min-max scaling on each row. Choose from |
trim |
A 2-element numeric vector. Values outside of this range will be trimmed to their nearest bound. Default c(-2, 2). |
featureIndexBy |
A single character specifying a column name of |
cellIndexBy |
A single character specifying a column name of |
cluster_columns |
A logical scalar that turns on/off clustering of columns. Default FALSE. |
cluster_rows |
A logical scalar that turns on/off clustering of rows. Default FALSE. |
rowDataName |
character. The column name(s) in |
colDataName |
character. The column name(s) in |
aggregateRow |
Feature variable for aggregating the heatmap by row. Can be a vector or a |
aggregateCol |
Cell variable for aggregating the heatmap by column. Can be a vector or a |
featureAnnotations |
|
cellAnnotations |
|
featureAnnotationColor |
A named list. Customized color settings for feature labeling. Should match the entries in the |
cellAnnotationColor |
A named list. Customized color settings for cell labeling. Should match the entries in the |
palette |
Choose from "ggplot", "celda", or "random". |
heatmapPalette |
Choose from "sequential" or "diverging". |
addCellSummary |
Add summary barplots to column annotation. Supply the name of the column in colData as a character. This option will add summary for categorical variables as stacked barplots. |
rowSplitBy |
character. Do semi-heatmap based on the grouping of this (these) annotation(s). Should exist in either |
colSplitBy |
character. Do semi-heatmap based on the grouping of this (these) annotation(s). Should exist in either |
rowLabel |
Use a logical for whether to display all the feature names, a single character to display a column of |
colLabel |
Use a logical for whether to display all the cell names, a single character to display a column of |
rowLabelSize |
A number for the font size of feature names. Default 6. |
colLabelSize |
A number for the font size of cell names. Default 6. |
rowDend |
Whether to display row dendrogram. Default TRUE. |
colDend |
Whether to display column dendrogram. Default TRUE. |
title |
The main title of the whole plot. Default NULL. |
rowTitle |
The subtitle for the rows. Default "Features". |
colTitle |
The subtitle for the columns. Default "Cells". |
rowGap |
A numeric value or a |
colGap |
A numeric value or a |
border |
A logical scalar. Whether to show the border of the heatmap or split heatmaps. Default FALSE. |
colorScheme |
function. A function that generates color code by giving a value. Can be generated by |
... |
Other arguments passed to |
A ggplot object.
Yichen Wang
data(scExample, package = "singleCellTK")
plotSCEHeatmap(sce[1:3, 1:3], useAssay = "counts")
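The interplay of the scale and trim arguments can be sketched in base R: each row is converted to z-scores, then values outside the trim range are clamped to the nearest bound. This is only an illustration under the assumption that row scaling is a plain z-score. The toy matrix and variable names are made up, and the package's internal code is not shown in this manual.

```r
# Toy expression matrix: 2 features (rows) by 10 cells (columns).
# geneA has a single outlying cell, geneB is evenly spread.
expr <- matrix(
  c(0, 0, 0, 0, 0, 0, 0, 0, 0, 10,
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("cell", 1:10))
)

# Row-wise z-score: scale() works on columns, so transpose before and after.
z <- t(scale(t(expr)))

# Clamp values to the trim range, here the documented default c(-2, 2).
trim <- c(-2, 2)
zTrimmed <- pmin(pmax(z, trim[1]), trim[2])

print(round(z, 2))
print(round(zTrimmed, 2))
```

Trimming keeps one extreme cell from stretching the color scale, so the remaining variation in the row stays visible in the heatmap.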
Plot results of reduced dimensions data of counts stored in any slot in the SingleCellExperiment object.
plotSCEScatter( inSCE, annotation, reducedDimName = NULL, slot = NULL, sample = NULL, feature = NULL, groupBy = NULL, shape = NULL, conditionClass = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, dotSize = 0.1, transparency = 1, colorLow = "white", colorMid = "gray", colorHigh = "blue", defaultTheme = TRUE, title = NULL, titleSize = 15, labelClusters = TRUE, legendTitle = NULL, legendTitleSize = 12, legendSize = 10, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
annotation |
Desired vector within the slot used for plotting. Required. |
reducedDimName |
saved dimension reduction name in the SingleCellExperiment object. |
slot |
Desired slot of SingleCellExperiment used for plotting. Possible options: "assays", "colData", "metadata", "reducedDims". Default NULL. |
sample |
Character vector. Indicates which sample each cell belongs to. |
feature |
name of feature stored in assay of SingleCellExperiment object. Will be used only if "assays" slot is chosen. Default NULL. |
groupBy |
Group by a condition (any column of the annotation data). Default NULL. |
shape |
add shapes to each condition. |
conditionClass |
class of the annotation data used in colorBy. Options are NULL, "factor" or "numeric". If NULL, class will default to the original class. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
bin |
Numeric vector. If a single value, divides the numeric values into that number of groups. If more than one value, bins the numeric values using the supplied values as cut points. Default NULL. |
binLabel |
Character vector. Labels for the bins created by the 'bin' parameter. Default NULL. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
colorLow |
Character. A color available from 'colors()'. The color will be used to signify the lowest values on the scale. Default 'white'. |
colorMid |
Character. A color available from 'colors()'. The color will be used to signify the midpoint on the scale. Default 'gray'. |
colorHigh |
Character. A color available from 'colors()'. The color will be used to signify the highest values on the scale. Default 'blue'. |
defaultTheme |
Adds grid to plot when TRUE. Default TRUE. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default 15. |
labelClusters |
Logical. Whether the cluster labels are plotted. Default TRUE. |
legendTitle |
Title of legend. Default NULL. |
legendTitleSize |
Size of legend title. Default 12. |
legendSize |
Size of legend. Default 10. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
Labels for each plot. If set to "default", the sample names are used as labels. If set to "none", no label is plotted. Default NULL. |
a ggplot of the reduced dimensions.
data("mouseBrainSubsetSCE")
plotSCEScatter(
  inSCE = mouseBrainSubsetSCE, legendTitle = NULL,
  slot = "assays", annotation = "counts", feature = "Apoe",
  reducedDimName = "TSNE_counts", labelClusters = FALSE
)
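The two modes of the bin argument described above can be approximated with base R's cut(). This is only a minimal sketch of the idea, not the package's implementation; the vector values and the labels "low", "mid" and "high" are made up for illustration.

```r
# Hypothetical numeric annotation for six cells (illustrative values only)
values <- c(0.5, 1.2, 3.8, 4.1, 7.9, 9.6)

# Single value: split the observed range into that many equal-width groups
binnedEqual <- cut(values, breaks = 3)

# Several values: use them as explicit cut points, with optional bin labels
binnedCuts <- cut(values, breaks = c(0, 2, 5, 10),
                  labels = c("low", "mid", "high"))

# Number of cells falling in each labelled bin
table(binnedCuts)
```

The labels passed to cut() play the role that binLabel plays in the plotting function.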
Visualizes values stored in any slot of a SingleCellExperiment object via a violin plot.
plotSCEViolin( inSCE, slotName, itemName, feature = NULL, sample = NULL, dimension = NULL, groupBy = NULL, violin = TRUE, boxplot = TRUE, dots = TRUE, plotOrder = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dotSize = 0.1, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, summary = NULL, title = NULL, titleSize = NULL, hcutoff = NULL, hcolor = "red", hsize = 1, hlinetype = 1, vcutoff = NULL, vcolor = "red", vsize = 1, vlinetype = 1, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
slotName |
Desired slot of SingleCellExperiment used for plotting. Possible options: "assays", "colData", "metadata", "reducedDims". Required. |
itemName |
Desired vector within the slot used for plotting. Required. |
feature |
Desired name of feature stored in assay of SingleCellExperiment object. Only used when "assays" slotName is selected. Default NULL. |
sample |
Character vector. Indicates which sample each cell belongs to. |
dimension |
Desired dimension(s) stored in the specified reducedDims. Either an integer which indicates the column(s) or a character vector specifies column name(s). By default, the 1st dimension/column will be used. Only used when "reducedDims" slotName is selected. Default NULL. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in the SingleCellExperiment object, or can be retrieved from the colData slot. Default NULL. |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default TRUE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
plotOrder |
Character vector. If set, reorders the violin plots in the order of the character vector when 'groupBy' is set. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
gridLine |
Adds a horizontal grid line if TRUE. Will still be drawn even if defaultTheme is TRUE. Default FALSE. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default NULL. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default NULL. |
hcutoff |
Adds a horizontal line with the y-intercept at given value. Default NULL. |
hcolor |
Character. A color available from 'colors()'. Controls the color of the horizontal cutoff line, if drawn. Default 'red'. |
hsize |
Size of horizontal line, if drawn. Default 1. |
hlinetype |
Type of horizontal line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
vcutoff |
Adds a vertical line with the x-intercept at given value. Default NULL. |
vcolor |
Character. A color available from 'colors()'. Controls the color of the vertical cutoff line, if drawn. Default 'red'. |
vsize |
Size of vertical line, if drawn. Default 1. |
vlinetype |
Type of vertical line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
Labels for each plot. If set to "default", the sample names are used as labels. If set to "none", no label is plotted. Default NULL. |
a ggplot of the violin plot.
data("mouseBrainSubsetSCE")
plotSCEViolin(
  inSCE = mouseBrainSubsetSCE, slotName = "assays",
  itemName = "counts", feature = "Apoe", groupBy = "sex"
)
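The summary and groupBy arguments together amount to computing one statistic per group. The base R sketch below shows that computation under the assumption that the statistic is taken over the plotted values within each group; the vectors expr and group are invented and the code is not taken from the package.

```r
# Hypothetical expression values and group labels for six cells
expr  <- c(2, 4, 9, 1, 3, 5)
group <- c("F", "F", "F", "M", "M", "M")

# Choose the summary function from the option name ("mean" or "median")
summaryStat <- "median"
summaryFun  <- switch(summaryStat, mean = mean, median = median)

# One summary value per group, as would be shown on each violin
groupSummary <- tapply(expr, group, summaryFun)
groupSummary
```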
Visualizes values stored in the assay slot of a SingleCellExperiment object via a violin plot.
plotSCEViolinAssayData( inSCE, feature, sample = NULL, useAssay = "counts", featureLocation = NULL, featureDisplay = NULL, groupBy = NULL, violin = TRUE, boxplot = TRUE, dots = TRUE, plotOrder = NULL, xlab = NULL, ylab = NULL, axisSize = 10, axisLabelSize = 10, dotSize = 0.1, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, summary = NULL, title = NULL, titleSize = NULL, hcutoff = NULL, hcolor = "red", hsize = 1, hlinetype = 1, vcutoff = NULL, vcolor = "red", vsize = 1, vlinetype = 1, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
feature |
Name of feature stored in assay of SingleCellExperiment object. |
sample |
Character vector. Indicates which sample each cell belongs to. |
useAssay |
Indicate which assay to use. Default "counts". |
featureLocation |
Indicates which column name of rowData to query gene. |
featureDisplay |
Indicates which column name of rowData to use to display feature for visualization. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in the SingleCellExperiment object, or can be retrieved from the colData slot. Default NULL. |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default TRUE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
plotOrder |
Character vector. If set, reorders the violin plots in the order of the character vector when 'groupBy' is set. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default 10. |
axisLabelSize |
Size of x/y-axis labels. Default 10. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
gridLine |
Adds a horizontal grid line if TRUE. Will still be drawn even if defaultTheme is TRUE. Default FALSE. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default NULL. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default NULL. |
hcutoff |
Adds a horizontal line with the y-intercept at given value. Default NULL. |
hcolor |
Character. A color available from 'colors()'. Controls the color of the horizontal cutoff line, if drawn. Default 'red'. |
hsize |
Size of horizontal line, if drawn. Default 1. |
hlinetype |
Type of horizontal line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
vcutoff |
Adds a vertical line with the x-intercept at given value. Default NULL. |
vcolor |
Character. A color available from 'colors()'. Controls the color of the vertical cutoff line, if drawn. Default 'red'. |
vsize |
Size of vertical line, if drawn. Default 1. |
vlinetype |
Type of vertical line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
Labels for each plot. If set to "default", the sample names are used as labels. If set to "none", no label is plotted. Default NULL. |
a ggplot of the violin plot of assay data.
data("mouseBrainSubsetSCE")
plotSCEViolinAssayData(
  inSCE = mouseBrainSubsetSCE,
  feature = "Apoe", groupBy = "sex"
)
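The featureLocation argument implies looking a feature up in a rowData column rather than in the row names. A rough base R sketch of that lookup follows; the row IDs "id_001" and "id_002", the column name symbol, and the count values are all hypothetical, and the package may resolve features differently.

```r
# Hypothetical count matrix whose row names are opaque IDs
counts <- matrix(c(5, 0, 3,
                   1, 2, 0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("id_001", "id_002"),
                                 c("cell1", "cell2", "cell3")))

# Hypothetical feature annotation, standing in for rowData
rowInfo <- data.frame(symbol = c("Apoe", "Actb"),
                      row.names = rownames(counts))

# Query the feature through the chosen annotation column
feature         <- "Apoe"
featureLocation <- "symbol"
idx    <- match(feature, rowInfo[[featureLocation]])
values <- counts[idx, ]
values
```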
Visualizes values stored in the colData slot of a SingleCellExperiment object via a violin plot.
plotSCEViolinColData( inSCE, coldata, sample = NULL, groupBy = NULL, violin = TRUE, boxplot = TRUE, dots = TRUE, plotOrder = NULL, xlab = NULL, ylab = NULL, baseSize = 12, axisSize = NULL, axisLabelSize = NULL, dotSize = 0.1, transparency = 1, defaultTheme = TRUE, gridLine = FALSE, summary = NULL, summaryTextSize = 3, title = NULL, titleSize = NULL, hcutoff = NULL, hcolor = "red", hsize = 1, hlinetype = 1, vcutoff = NULL, vcolor = "red", vsize = 1, vlinetype = 1, combinePlot = "none", plotLabels = NULL )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
coldata |
colData value that will be plotted. |
sample |
Character vector. Indicates which sample each cell belongs to. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in the SingleCellExperiment object, or can be retrieved from the colData slot. Default NULL. |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default TRUE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
plotOrder |
Character vector. If set, reorders the violin plots in the order of the character vector when 'groupBy' is set. Default NULL. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
baseSize |
The base font size for all text. Default 12. Can be overwritten by titleSize, axisSize, and axisLabelSize. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
dotSize |
Size of dots. Default 0.1. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
gridLine |
Adds a horizontal grid line if TRUE. Will still be drawn even if defaultTheme is TRUE. Default FALSE. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default NULL. |
summaryTextSize |
The text size of the summary statistic displayed above the violin plot. Default 3. |
title |
Title of plot. Default NULL. |
titleSize |
Size of title of plot. Default NULL. |
hcutoff |
Adds a horizontal line with the y-intercept at given value. Default NULL. |
hcolor |
Character. A color available from 'colors()'. Controls the color of the horizontal cutoff line, if drawn. Default 'red'. |
hsize |
Size of horizontal line, if drawn. Default 1. |
hlinetype |
Type of horizontal line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
vcutoff |
Adds a vertical line with the x-intercept at given value. Default NULL. |
vcolor |
Character. A color available from 'colors()'. Controls the color of the vertical cutoff line, if drawn. Default 'red'. |
vsize |
Size of vertical line, if drawn. Default 1. |
vlinetype |
Type of vertical line, if drawn. Can be specified with either an integer or a name (0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash). Default 1. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "none". |
plotLabels |
Labels for each plot. If set to "default", the sample names are used as labels. If set to "none", no label is plotted. Default NULL. |
a ggplot of the violin plot of coldata.
data("mouseBrainSubsetSCE")
plotSCEViolinColData(
  inSCE = mouseBrainSubsetSCE,
  coldata = "age", groupBy = "sex"
)
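The hlinetype and vlinetype arguments accept either an integer code or a name. The small helper below illustrates the correspondence listed in the argument table; toLinetypeName is a hypothetical function written for this sketch and is not part of the package.

```r
# Names in the order of their integer codes 0..6, as listed in the argument table
linetypeNames <- c("blank", "solid", "dashed", "dotted",
                   "dotdash", "longdash", "twodash")

# Hypothetical helper: normalise an integer code or a name to a linetype name
toLinetypeName <- function(x) {
  if (is.numeric(x)) {
    linetypeNames[x + 1]          # codes start at 0, R indexing starts at 1
  } else {
    match.arg(x, linetypeNames)   # errors on an unknown name
  }
}

toLinetypeName(2)
toLinetypeName("dotted")
```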
A wrapper function which visualizes outputs from the runScrublet function stored in the colData slot of the SingleCellExperiment object via various plots.
plotScrubletResults( inSCE, reducedDimName, sample = NULL, shape = NULL, groupBy = NULL, combinePlot = "all", violin = TRUE, boxplot = FALSE, dots = TRUE, xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, bin = NULL, binLabel = NULL, defaultTheme = TRUE, dotSize = 0.5, summary = "median", summaryTextSize = 3, transparency = 1, baseSize = 15, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL, relHeights = 1, relWidths = c(1, 1, 1), plotNCols = NULL, plotNRows = NULL, labelSamples = TRUE, samplePerColumn = TRUE, sampleRelHeights = 1, sampleRelWidths = 1 )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results from runScrublet. Required. |
reducedDimName |
Saved dimension reduction name in inSCE. Required. |
sample |
Character vector or colData variable name. Indicates which sample each cell belongs to. Default NULL. |
shape |
If provided, add shapes based on the value. Default NULL. |
groupBy |
Groupings for each numeric value. A user may input a vector equal length to the number of the samples in inSCE, or can be retrieved from the colData slot. Default NULL. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "all". |
violin |
Boolean. If TRUE, will plot the violin plot. Default TRUE. |
boxplot |
Boolean. If TRUE, will plot boxplots for each violin plot. Default FALSE. |
dots |
Boolean. If TRUE, will plot dots for each violin plot. Default TRUE. |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
dim1 |
1st dimension to be used for plotting. Can either be a string which specifies the name of the dimension to be plotted from reducedDims, or a numeric value which specifies the index of the dimension to be plotted. Default is NULL. |
dim2 |
2nd dimension to be used for plotting. Similar to dim1. Default is NULL. |
bin |
Numeric vector. If single value, will divide the numeric values into bin groups. If more than one value, will bin numeric values using values as a cut point. Default NULL. |
binLabel |
Character vector. Labels for the bins created by bin. Default NULL. |
defaultTheme |
Removes grid in plot and sets axis title size to 10 when TRUE. Default TRUE. |
dotSize |
Size of dots. Default 0.5. |
summary |
Adds a summary statistic, as well as a crossbar to the violin plot. Options are "mean" or "median". Default "median". |
summaryTextSize |
The text size of the summary statistic displayed above the violin plot. Default 3. |
transparency |
Transparency of the dots, values will be 0-1. Default 1. |
baseSize |
The base font size for all text. Default 15. |
titleSize |
Size of title of plot. Default NULL. |
axisLabelSize |
Size of x/y-axis labels. Default NULL. |
axisSize |
Size of x/y-axis ticks. Default NULL. |
legendSize |
Size of legend. Default NULL. |
legendTitleSize |
Size of legend title. Default NULL. |
relHeights |
Relative heights of plots when combine is set. Default 1. |
relWidths |
Relative widths of plots when combine is set. Default c(1, 1, 1). |
plotNCols |
Number of columns when plots are combined in a grid. Default NULL. |
plotNRows |
Number of rows when plots are combined in a grid. Default NULL. |
labelSamples |
Will label sample name in title of plot if TRUE. Default TRUE. |
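samplePerColumn |
If TRUE, when there are multiple samples and combining by "all", the output .ggplot will have plots from each sample on a single column. Default TRUE. |
sampleRelHeights |
If there are multiple samples and combining by "all", the relative heights for each plot. Default 1. |
sampleRelWidths |
If there are multiple samples and combining by "all", the relative widths for each plot. Default 1. |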
list of .ggplot objects
data(scExample, package="singleCellTK")
## Not run: 
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
sce <- runScrublet(sce)
plotScrubletResults(inSCE=sce, reducedDimName="UMAP")
## End(Not run)
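The dim1 and dim2 arguments accept either a dimension name or a numeric index. The sketch below shows how both forms can select the same column from an embedding matrix; the matrix, its column names, and the helper pickDim are hypothetical and only illustrate the selection rule, not the package's internal code.

```r
# Hypothetical embedding with three named dimensions (4 cells x 3 dims)
embedding <- matrix(1:12, ncol = 3,
                    dimnames = list(NULL, c("UMAP_1", "UMAP_2", "UMAP_3")))

# Hypothetical helper: select a dimension by name (string) or by index (numeric)
pickDim <- function(mat, dim) {
  if (is.character(dim)) {
    mat[, dim]
  } else {
    mat[, as.integer(dim)]
  }
}

byName  <- pickDim(embedding, "UMAP_2")
byIndex <- pickDim(embedding, 2)
identical(byName, byIndex)
```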
plotSeuratElbow Computes the plot object for elbow plot from the pca slot in the input sce object
plotSeuratElbow( inSCE, significantPC = NULL, reduction = "pca", ndims = 20, externalReduction = NULL, interactive = TRUE )
inSCE |
(sce) object from which to compute the elbow plot (pca should be computed) |
significantPC |
Number of significant principal components to plot. This is used to alter the color of the points for the corresponding PCs. Default NULL. |
reduction |
Reduction to use for elbow plot generation. Either "pca" or "ica". Default "pca". |
ndims |
Number of components to use. Default 20. |
externalReduction |
Pass DimReduc object if PCA/ICA computed through other libraries. Default NULL. |
interactive |
Logical value indicating if the returned object should be an interactive plotly object if TRUE or a ggplot object if FALSE. Default TRUE. |
plot object
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
plotSeuratElbow(sce)
## End(Not run)
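For readers unfamiliar with elbow plots, the quantity on the y-axis is the standard deviation of each principal component. The base R sketch below computes that quantity with prcomp() on simulated data; it does not use Seurat or singleCellTK, and the highlight colors and the value of significantPC are arbitrary choices made for illustration.

```r
set.seed(1)

# Simulated data: 200 observations, 10 variables, one with inflated variance
mat <- matrix(rnorm(200 * 10), nrow = 200)
mat[, 1] <- mat[, 1] * 5

# Standard deviation of each principal component, in decreasing order
pca   <- prcomp(mat, center = TRUE, scale. = FALSE)
ndims <- 5
sdev  <- pca$sdev[seq_len(ndims)]

# Highlight the components treated as significant (illustrative colors)
significantPC <- 1
pointCol <- ifelse(seq_len(ndims) <= significantPC, "red", "grey40")

# Draw the elbow plot only in an interactive session
if (interactive()) {
  plot(seq_len(ndims), sdev, type = "b", col = pointCol, pch = 19,
       xlab = "PC", ylab = "Standard deviation")
}
```

The "elbow" is the point after which the standard deviations level off; here it falls right after the first component because only one variable was given extra variance.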
Compute and plot visualizations for marker genes
plotSeuratGenes( inSCE, useAssay = "seuratNormData", plotType, features, groupVariable, reducedDimName = "seuratUMAP", splitBy = NULL, cols = c("lightgrey", "blue"), ncol = 1, combine = FALSE )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Specify the name of the assay that will be used by this function. Default "seuratNormData". |
plotType |
Specify the type of the plot to compute. Options are limited to "ridge", "violin", "feature", "dot" and "heatmap". |
features |
Specify the features to compute the plot against. |
groupVariable |
Specify the column name from the colData slot that should be used as grouping variable. |
reducedDimName |
Saved dimension reduction name in the SingleCellExperiment object. Default "seuratUMAP". |
splitBy |
Specify the column name from the colData slot that should be used to split samples. Default is NULL. |
cols |
Specify two colors to form a gradient between. Default is c("lightgrey", "blue"). |
ncol |
Visualizations will be adjusted in "ncol" number of columns. Default is 1. |
combine |
A logical value that indicates if the plots should be combined together into a single plot if TRUE. Default FALSE. |
Plot object
plotSeuratHeatmap Modifies the heatmap plot object so it contains specified number of heatmaps in a single plot
plotSeuratHeatmap(plotObject, dims, ncol, labels)
plotObject |
plot object computed from runSeuratHeatmap() function |
dims |
numerical value of how many heatmaps to draw (default is 0) |
ncol |
numerical value indicating the number of columns across which the heatmaps should be distributed (default is 2) |
labels |
list() of labels to draw on heatmaps |
modified plot object
plotSeuratHVG Plot highly variable genes from input sce object (must have highly variable genes computations stored)
plotSeuratHVG(inSCE, labelPoints = 0)
inSCE |
(sce) object that contains the highly variable genes computations |
labelPoints |
Numeric value indicating the number of top genes that should be labeled. Default is 0. |
plot object
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
plotSeuratHVG(sce)
## End(Not run)
plotSeuratJackStraw Computes the plot object for jackstraw plot from the pca slot in the input sce object
plotSeuratJackStraw( inSCE, dims = NULL, xmax = 0.1, ymax = 0.3, externalReduction = NULL )
inSCE |
(sce) object from which to compute the jackstraw plot (pca should be computed) |
dims |
Number of components to plot in Jackstraw. If NULL, all components are plotted. Default NULL. |
xmax |
X-axis maximum on each QQ plot. Default 0.1. |
ymax |
Y-axis maximum on each QQ plot. Default 0.3. |
externalReduction |
Pass DimReduc object if PCA/ICA computed through other libraries. Default NULL. |
plot object
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
sce <- runSeuratJackStraw(sce, useAssay = "counts")
plotSeuratJackStraw(sce)
## End(Not run)
plotSeuratReduction Plots the selected dimensionality reduction method
plotSeuratReduction( inSCE, useReduction = c("pca", "ica", "tsne", "umap"), showLegend = FALSE, groupBy = NULL, splitBy = NULL )
inSCE |
(sce) object which has the selected dimensionality reduction algorithm already computed and stored |
useReduction |
Dimensionality reduction to plot. One of "pca", "ica", "tsne", or "umap". |
showLegend |
Select if legends and labels should be shown on the output plot or not. Either "TRUE" or "FALSE". Default FALSE. |
groupBy |
Specify a colData column name to be used for grouping. Default is NULL. |
splitBy |
Specify a colData column name to be used for splitting the output plot. Default is NULL. |
plot object
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
plotSeuratReduction(sce, useReduction = "pca")
## End(Not run)
This function generates a combination of plots based on the correction done by SoupX. For each sample, there will be a UMAP with cluster labeling, followed by a number of UMAPs showing the change in selected top markers. The cluster labeling is what should be used for SoupX to estimate the contamination. The Soup Fraction is calculated by subtracting the gene expression value of the output corrected matrix from that of the original input matrix, and then dividing by the input.
plotSoupXResults( inSCE, sample = NULL, background = FALSE, reducedDimName = NULL, plotNCols = 3, plotNRows = 2, baseSize = 8, combinePlot = c("all", "sample", "none"), xlab = NULL, ylab = NULL, dim1 = NULL, dim2 = NULL, labelClusters = FALSE, clusterLabelSize = 3.5, defaultTheme = TRUE, dotSize = 0.5, transparency = 1, titleSize = NULL, axisLabelSize = NULL, axisSize = NULL, legendSize = NULL, legendTitleSize = NULL )
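inSCE |
A SingleCellExperiment object. With runSoupX already applied. |
sample |
Character vector. Indicates which sample each cell belongs to. Default NULL. |
background |
Logical. Whether background was applied when running runSoupX. Default FALSE. |
reducedDimName |
Character. The embedding to use for plotting. Leave it NULL to use the sample-specific UMAP generated by runSoupX. Default NULL. |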
plotNCols |
Integer. Number of columns for the plot grid per sample. Will determine the number of top markers to show together with plotNRows. Default 3. |
plotNRows |
Integer. Number of rows for the plot grid per sample. Will determine the number of top markers to show together with plotNCols. Default 2. |
baseSize |
Numeric. The base font size for all text. Can be overwritten by titleSize, axisSize, axisLabelSize, legendSize, and legendTitleSize. Default 8. |
combinePlot |
Must be either "all", "sample", or "none". "all" will combine all plots into a single .ggplot object, while "sample" will output a list of plots separated by sample. Default "all". |
xlab |
Character vector. Label for x-axis. Default NULL. |
ylab |
Character vector. Label for y-axis. Default NULL. |
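dim1 |
1st dimension to be used for plotting; see the dim1 argument of plotSCEScatter. Default NULL. |
dim2 |
2nd dimension to be used for plotting; see the dim2 argument of plotSCEScatter. Default NULL. |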
labelClusters |
Logical. Whether the cluster labels are plotted. Default FALSE. |
clusterLabelSize |
Numeric. Determines the size of cluster label when labelClusters = TRUE. Default 3.5. |
defaultTheme |
Logical. Adds grid to plot when TRUE. Default TRUE. |
dotSize |
Numeric. Size of dots. Default 0.5. |
transparency |
Numeric. Transparency of the dots, values will be from 0 to 1. Default 1. |
titleSize |
Numeric. Size of title of plot. Default NULL. |
axisLabelSize |
Numeric. Size of x/y-axis labels. Default NULL. |
axisSize |
Numeric. Size of x/y-axis ticks. Default NULL. |
legendSize |
Numeric. Size of legend. Default NULL. |
legendTitleSize |
Numeric. Size of legend title. Default NULL. |
ggplot object of the combination of UMAPs. See description.
runSoupX
## Not run: 
sce <- importExampleData("pbmc3k")
sce <- runSoupX(sce, sample = "sample")
plotSoupXResults(sce, sample = "sample")
## End(Not run)
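The Soup Fraction formula given in the description of plotSoupXResults, (input - corrected) / input, can be written out in base R as follows. This is a sketch on two tiny invented matrices; whether the package evaluates the ratio per cell or on aggregated values is not stated here, and setting the fraction to 0 where the input is 0 is an assumption made for this example.

```r
# Hypothetical original and corrected expression matrices (genes x cells)
original <- matrix(c(10, 0, 4, 8, 6, 2), nrow = 2,
                   dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3")))
corrected <- matrix(c(8, 0, 4, 6, 3, 2), nrow = 2,
                    dimnames = dimnames(original))

# Soup fraction: share of the original signal removed by the correction
soupFraction <- (original - corrected) / original

# Assumption: entries with zero input are given a fraction of 0 instead of NaN
soupFraction[original == 0] <- 0

soupFraction
```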
Plot highly variable genes
plotTopHVG( inSCE, method = "modelGeneVar", hvgNumber = 2000, useFeatureSubset = NULL, labelsCount = 10, featureDisplay = metadata(inSCE)$featureDisplay, labelSize = 2, dotSize = 2, textSize = 12 )
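inSCE |
Input SingleCellExperiment object. |
method |
Select the variability method to plot. Default "modelGeneVar". |
hvgNumber |
Specify the number of top genes to highlight in red. Default 2000. |
useFeatureSubset |
A character string naming the feature subset to label on the plot. Default NULL. |
labelsCount |
Specify the number of data points/genes to label. Should be less than hvgNumber. Default 10. |
featureDisplay |
A character string for the rowData variable name used to display feature IDs. Default metadata(inSCE)$featureDisplay. |
labelSize |
Numeric, size of the text label on top HVGs. Default 2. |
dotSize |
Numeric, size of the dots of the features. Default 2. |
textSize |
Numeric, size of the text of axis title, axis label, etc. Default 12. |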
When hvgNumber = NULL and useFeatureSubset = NULL, only plot the mean vs. variance/dispersion scatter plot. When only hvgNumber is set, label the top hvgNumber HVGs ranked by the metrics calculated by method. When useFeatureSubset is set, label the features in the subset on the scatter plot created with method and ignore hvgNumber.
ggplot of HVG metrics and top HVG labels
runFeatureSelection, runSeuratFindHVG, runModelGeneVar, getTopHVG
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runModelGeneVar(mouseBrainSubsetSCE)
plotTopHVG(mouseBrainSubsetSCE, method = "modelGeneVar")
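The roles of hvgNumber and labelsCount in plotTopHVG can be illustrated with a base R ranking on simulated counts. Note the assumption: raw per-gene variance is used here as the ranking metric purely for simplicity, whereas the package ranks by the metric of the selected method. The gene names and counts are simulated.

```r
set.seed(42)

# Simulated counts: 50 genes x 20 cells, gene i drawn with Poisson mean i
counts <- matrix(rpois(50 * 20, lambda = rep(1:50, times = 20)), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), NULL))

# Per-gene mean and variance: the two axes of a mean vs. variance scatter plot
geneMean <- rowMeans(counts)
geneVar  <- apply(counts, 1, var)

# Rank genes by the metric, highlight the top hvgNumber, label the top labelsCount
hvgNumber   <- 10
labelsCount <- 3
ranked  <- names(sort(geneVar, decreasing = TRUE))
topHVG  <- head(ranked, hvgNumber)
toLabel <- head(topHVG, labelsCount)

toLabel
```

Because the labelled genes are taken from the head of the highlighted set, labelsCount only makes sense when it does not exceed hvgNumber.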
Plot features identified by runTSCANClusterDEAnalysis on cell 2D embedding with MST overlaid
A wrapper function which plots the expression of the top features identified by runTSCANClusterDEAnalysis on the 2D embedding of the cell clusters used in the analysis. The related MST edges are overlaid.
plotTSCANClusterDEG( inSCE, useCluster, pathIndex = NULL, useReducedDim = "UMAP", topN = 9, useAssay = NULL, featureDisplay = metadata(inSCE)$featureDisplay, combinePlot = c("all", "none") )
inSCE |
Input SingleCellExperiment object. |
useCluster |
Choose a cluster used for identifying DEG with
|
pathIndex |
Specifies one of the branching paths from |
useReducedDim |
A single character for the matrix of 2D embedding.
Should exist in |
topN |
Integer. Use top N genes identified. Default |
useAssay |
A single character for the feature expression matrix. Should
exist in |
featureDisplay |
Specify the feature ID type to display. Users can set
default value with |
combinePlot |
Must be either |
A .ggplot object of cell scatter plot, colored by the expression of a gene identified by runTSCANClusterDEAnalysis, with the layer of trajectory.
Yichen Wang
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
mouseBrainSubsetSCE <- runTSCANClusterDEAnalysis(inSCE = mouseBrainSubsetSCE, useCluster = 1)
plotTSCANClusterDEG(mouseBrainSubsetSCE, useCluster = 1, useReducedDim = "TSNE_logcounts")
This function finds all paths that root from a given cluster useCluster. For each path, this function plots the recomputed pseudotime starting from the root on a scatter plot which contains cells only in this cluster. MST has to be pre-calculated with runTSCAN.
plotTSCANClusterPseudo( inSCE, useCluster, useReducedDim = "UMAP", combinePlot = c("all", "none") )
inSCE |
Input SingleCellExperiment object. |
useCluster |
The cluster to be regarded as the root. Has to exist in
|
useReducedDim |
Saved dimension reduction name in the SingleCellExperiment object. Required. |
combinePlot |
Must be either |
combinePlot = "all" |
A |
combinePlot = "none" |
A list of |
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
plotTSCANClusterPseudo(mouseBrainSubsetSCE, useCluster = 1, useReducedDim = "TSNE_logcounts")
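A default such as combinePlot = c("all", "none") is a common R idiom that is usually resolved with match.arg, which picks the first choice when the argument is not supplied and rejects values outside the listed choices. The sketch below shows that idiom with a hypothetical stand-in function; whether the package resolves the argument exactly this way is an assumption, and the string pasted together here merely stands in for a combined figure.

```r
# Hypothetical stand-in for a plotting function with a combinePlot argument
combineDemo <- function(plots, combinePlot = c("all", "none")) {
  # With no value supplied, match.arg() selects the first choice ("all");
  # any value outside the listed choices raises an error
  combinePlot <- match.arg(combinePlot)
  if (combinePlot == "all") {
    # Stand-in for arranging every plot into a single combined figure
    paste(names(plots), collapse = " + ")
  } else {
    # "none": return the individual plots as a list
    plots
  }
}

# Placeholder "plots"; real ones would be ggplot objects
plots <- list(path1 = "plot for path 1", path2 = "plot for path 2")

combineDemo(plots)
combineDemo(plots, combinePlot = "none")
```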
A wrapper function which plots all cells or the cells in a chosen cluster. Each point is a cell colored by the expression of a feature of interest, with the relevant edges of the MST overlaid on top.
plotTSCANDimReduceFeatures( inSCE, features, useReducedDim = "UMAP", useAssay = "logcounts", by = "rownames", useCluster = NULL, featureDisplay = metadata(inSCE)$featureDisplay, combinePlot = c("all", "none") )
inSCE |
Input SingleCellExperiment object. |
features |
Choose the feature of interest to explore the expression level on the trajectory. Required. |
useReducedDim |
A single character for the matrix of 2D embedding.
Should exist in |
useAssay |
A single character for the feature expression matrix. Should
exist in |
by |
Where should |
useCluster |
Choose specific clusters where gene expression needs to be
visualized. By default |
featureDisplay |
Specify the feature ID type to display. Users can set
default value with |
combinePlot |
Must be either |
A .ggplot object of cell scatter plot, colored by the expression of a gene of interest, with the layer of trajectory.
Yichen Wang
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
plotTSCANDimReduceFeatures(inSCE = mouseBrainSubsetSCE, features = "Tshz1", useReducedDim = "TSNE_logcounts")
A wrapper function which visualizes outputs from the runTSCANDEG function. Plots the genes that increase or decrease in expression with increasing pseudotime along the path in the MST. runTSCANDEG has to be run in advance using the same pathIndex of interest.
plotTSCANPseudotimeGenes( inSCE, pathIndex, direction = c("increasing", "decreasing"), topN = 10, useAssay = NULL, featureDisplay = metadata(inSCE)$featureDisplay )
inSCE |
Input SingleCellExperiment object. |
pathIndex |
Path index for which the pseudotime values should be used.
Should have been used in |
direction |
Should we show features with expression increasing or
decreasing along the increase in TSCAN pseudotime? Choices are
|
topN |
An integer. Plot only this number of top genes that are increasing/decreasing in expression with increasing pseudotime along the path in the MST. Default 10. |
useAssay |
A single character to specify a feature expression matrix in
|
featureDisplay |
Specify the feature ID type to display. Users can set
default value with |
A .ggplot object with the facets of the top genes. Expression on y-axis, pseudotime on x-axis.
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
terminalNodes <- listTSCANTerminalNodes(mouseBrainSubsetSCE)
mouseBrainSubsetSCE <- runTSCANDEG(inSCE = mouseBrainSubsetSCE, pathIndex = terminalNodes[1])
plotTSCANPseudotimeGenes(mouseBrainSubsetSCE, pathIndex = terminalNodes[1], useAssay = "logcounts")
A wrapper function which visualizes outputs from the runTSCANDEG function. Plots the top genes that change in expression with increasing pseudotime along the path in the MST. runTSCANDEG has to be run in advance using the same pathIndex of interest.
plotTSCANPseudotimeHeatmap( inSCE, pathIndex, direction = c("both", "increasing", "decreasing"), topN = 50, log2fcThreshold = NULL, useAssay = NULL, featureDisplay = metadata(inSCE)$featureDisplay )
inSCE |
Input SingleCellExperiment object. |
pathIndex |
Path index for which the pseudotime values should be used.
Should have been used in |
direction |
Should we show features with expression increasing or
decreasing along the increase in TSCAN pseudotime? Choices are
|
topN |
An integer. Plot only this number of top genes along the path
in the MST, in terms of FDR value. Use |
log2fcThreshold |
Only output DEGs with the absolute values of log2FC
larger than this value. Default |
useAssay |
A single character to specify a feature expression matrix in
|
featureDisplay |
Whether to display feature ID and what ID type to
display. Users can set default ID type by |
A ComplexHeatmap in .ggplot class
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
terminalNodes <- listTSCANTerminalNodes(mouseBrainSubsetSCE)
mouseBrainSubsetSCE <- runTSCANDEG(inSCE = mouseBrainSubsetSCE, pathIndex = terminalNodes[1])
plotTSCANPseudotimeHeatmap(mouseBrainSubsetSCE, pathIndex = terminalNodes[1])
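How direction, topN and log2fcThreshold might interact can be sketched on a toy table in base R. Everything here is hypothetical: selectTopGenes is not a package function, the column names gene, logFC and FDR are assumed for illustration, and treating a positive logFC as "increasing" is also an assumption.

```r
# Hypothetical DEG table; column names and values are made up
deg <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  logFC = c(1.2, -0.8, 0.1, -2.0, 0.6),
  FDR = c(0.001, 0.02, 0.0005, 0.01, 0.04)
)

# Hypothetical helper mirroring the documented arguments
selectTopGenes <- function(deg,
                           direction = c("both", "increasing", "decreasing"),
                           topN = 50, log2fcThreshold = NULL) {
  direction <- match.arg(direction)
  # Assumption: positive logFC means expression increases with pseudotime
  if (direction == "increasing") deg <- deg[deg$logFC > 0, ]
  if (direction == "decreasing") deg <- deg[deg$logFC < 0, ]
  # Keep only genes whose absolute log2FC exceeds the threshold
  if (!is.null(log2fcThreshold)) {
    deg <- deg[abs(deg$logFC) > log2fcThreshold, ]
  }
  # Rank by FDR (smallest first) and keep the top N genes
  deg <- deg[order(deg$FDR), ]
  head(deg$gene, topN)
}

selectTopGenes(deg, topN = 3)
selectTopGenes(deg, direction = "decreasing", log2fcThreshold = 0.5)
```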
A wrapper function which visualizes outputs from the runTSCAN function. Plots the pseudotime ordering of the cells and projects them onto the MST.
plotTSCANResults(inSCE, useReducedDim = "UMAP")
inSCE |
Input SingleCellExperiment object. |
useReducedDim |
Saved dimension reduction name in |
A .ggplot object with the pseudotime ordering of the cells colored on a cell 2D embedding, and the MST path drawn on it.
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
plotTSCANResults(inSCE = mouseBrainSubsetSCE, useReducedDim = "TSNE_logcounts")
Plot a t-SNE plot from dimensionality reduction data computed with the t-SNE method.
plotTSNE( inSCE, colorBy = NULL, shape = NULL, reducedDimName = "TSNE", runTSNE = FALSE, useAssay = "counts" )
inSCE |
Input SingleCellExperiment object. |
colorBy |
color by condition. |
shape |
add shape to each distinct label. |
reducedDimName |
a name to store the results of the dimension reduction coordinates obtained from this method. This is stored in the SingleCellExperiment object in the reducedDims slot. Required. |
runTSNE |
Run t-SNE if the reducedDimName does not exist. The default is FALSE. |
useAssay |
Indicate which assay to use. The default is "counts". |
A t-SNE plot
data("mouseBrainSubsetSCE")
plotTSNE(mouseBrainSubsetSCE, colorBy = "level1class", reducedDimName = "TSNE_counts")
Plot UMAP results either on already run results or run first and then plot.
plotUMAP( inSCE, colorBy = NULL, shape = NULL, reducedDimName = "UMAP", runUMAP = FALSE, useAssay = "counts" )
inSCE |
Input SingleCellExperiment object with saved dimension reduction components. Required |
colorBy |
Color by a condition (any column of the annotation data). |
shape |
add shapes to each condition. |
reducedDimName |
saved dimension reduction name in the SingleCellExperiment object. Required. |
runUMAP |
If the dimension reduction components are already available, set this to FALSE; otherwise set it to TRUE. Default is FALSE. |
useAssay |
Indicate which assay to use. The default is "counts". |
a UMAP plot of the reduced dimensions.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runQuickUMAP(sce)
plotUMAP(sce)
Create SingleCellExperiment object from command line input arguments
qcInputProcess( preproc, samplename, path, raw, fil, ref, rawFile, filFile, flatFiles, dataType )
preproc |
Method used to preprocess the data. Provided via the --preproc argument. |
samplename |
The sample name of the data. Provided via the --sample argument. |
path |
Base path of the dataset. Provided via the --bash_path argument. |
raw |
The directory containing the droplet matrix, gene, and cell barcode information. Provided via the --raw_data_path argument. |
fil |
The directory containing the cell matrix, gene, and cell barcode information. Provided via the --cell_data_path argument. |
ref |
The name of the reference used by cellranger. Only needed for CellrangerV2 data. |
rawFile |
The full path of the RDS file or Matrix file of the raw gene count matrix. Provided via the --raw_data argument. |
filFile |
The full path of the RDS file or Matrix file of the cell count matrix. Provided via the --cell_data argument. |
flatFiles |
The full paths of the matrix, barcode, and features (in that order) files used to construct an SCE object. |
dataType |
Type of the input. It can be "Both", "Droplet" or "Cell". Provided via the --genome argument. |
A list of SingleCellExperiment objects containing the droplet data, the cell data, or both, depending on the dataType that the user provided.
Automatically detect the format of the input file and read the file.
readSingleCellMatrix( file, class = c("Matrix", "matrix"), delayedArray = TRUE, colIndexLocation = NULL, rowIndexLocation = NULL )
file |
Path to input file. Supported file endings include .mtx, .txt,
.csv, .tab, .tsv, .npz, and their corresponding |
class |
Character. Class of matrix. One of "Matrix" or "matrix". Specifying "Matrix" will convert to a sparse format which should be used for datasets with large numbers of cells. Default "Matrix". |
delayedArray |
Boolean. Whether to read the expression matrix as
DelayedArray object or not. Default |
colIndexLocation |
Character. For Optimus output, the path to the
barcode index .npy file. Used only if |
rowIndexLocation |
Character. For Optimus output, the path to the
feature (gene) index .npy file. Used only if |
A DelayedArray object or matrix.
mat <- readSingleCellMatrix(system.file("extdata/hgmm_1k_v3_20x20/outs/",
    "filtered_feature_bc_matrix/matrix.mtx.gz", package = "singleCellTK"))
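To make the difference between the "Matrix" (sparse) and "matrix" (dense) classes concrete, the sketch below writes a tiny toy count matrix to a .mtx file and reads it back with the Matrix package directly. It does not call readSingleCellMatrix and does not claim to reproduce its internals; the counts are made up and the file is a temporary one.

```r
library(Matrix)

# Toy count matrix (3 features x 4 cells); values are made up
counts <- Matrix(c(0, 2, 0,
                   1, 0, 0,
                   0, 0, 3,
                   0, 5, 0), nrow = 3, sparse = TRUE)

# Write the matrix in Matrix Market (.mtx) format to a temporary file
mtxFile <- tempfile(fileext = ".mtx")
writeMM(counts, mtxFile)

# Read it back: readMM() returns a sparse matrix
sparseMat <- readMM(mtxFile)

# Convert to a base R dense matrix, which stores every zero explicitly
denseMat <- as.matrix(sparseMat)

dim(sparseMat)
class(sparseMat)
class(denseMat)
```

For datasets with many cells, most entries are zero, which is why the sparse representation is generally the more memory-efficient choice.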
A function to generate .html Rmarkdown report containing the visualizations of the runCellQC function output
reportCellQC( inSCE, output_file = NULL, output_dir = NULL, subTitle = NULL, studyDesign = NULL, useReducedDim = NULL )
inSCE |
A SingleCellExperiment object containing the filtered count matrix with the output from runCellQC function |
output_file |
Character. The name of the generated file. If NULL/default then the output file name will be based on the name of the Rmarkdown template. |
output_dir |
Character. The name of the output directory to save the rendered file. If NULL/default the file is stored to the current working directory |
subTitle |
subtitle of the QC HTML report. Default is NULL. |
studyDesign |
Character. The description of the data set and experiment design. It would be shown at the top of QC HTML report. Default is NULL. |
useReducedDim |
Character. The name of the saved dimension reduction slot including cells from all samples in the SingleCellExperiment object. Default is NULL. |
.html file
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
## Not run: 
sce <- runCellQC(sce)
reportCellQC(inSCE = sce)
## End(Not run)
A function to generate .html Rmarkdown report containing the visualizations of the plotClusterAbundance function output
reportClusterAbundance( inSCE, cluster, variable, output_dir = ".", output_file = "plotClusterAbundance_Report", pdf = FALSE, showSession = TRUE )
inSCE |
A |
cluster |
A single |
variable |
A single |
output_dir |
name of the output directory to save the rendered file. If
|
output_file |
name of the generated file. If |
pdf |
A |
showSession |
A |
An HTML file of the report will be generated at the path specified in the arguments.
A function to generate .html Rmarkdown report containing the visualizations of the diffAbundanceFET function output
reportDiffAbundanceFET( inSCE, cluster, variable, control, case, analysisName, output_dir = ".", output_file = "DifferentialAbundanceFET_Report", pdf = FALSE, showSession = TRUE )
inSCE |
A |
cluster |
A single |
variable |
A single |
control |
|
case |
|
analysisName |
A single |
output_dir |
name of the output directory to save the rendered file. If
|
output_file |
name of the generated file. If |
pdf |
A |
showSession |
A |
An HTML file of the report will be generated at the path specified in the arguments.
A function to generate .html Rmarkdown report containing the visualizations of the runDEAnalysis function output
reportDiffExp( inSCE, study, useReducedDim, featureDisplay = NULL, output_file = NULL, output_dir = NULL )
inSCE |
A |
study |
The specific analysis to visualize, used as |
useReducedDim |
Specify an embedding for visualizing the relationship between the conditions. |
featureDisplay |
The feature ID type to use for displaying. Should
exist as a variable name of |
output_file |
name of the generated file. If |
output_dir |
name of the output directory to save the rendered file. If
|
Saves the HTML report in the specified output directory.
A function to generate .html Rmarkdown report containing the visualizations of the runDropletQC function output
reportDropletQC( inSCE, output_file = NULL, output_dir = NULL, subTitle = NULL, studyDesign = NULL )
inSCE |
A SingleCellExperiment object containing the full droplet count matrix with the output from runDropletQC function |
output_file |
name of the generated file. If NULL/default then the output file name will be based on the name of the Rmarkdown template |
output_dir |
name of the output directory to save the rendered file. If NULL/default the file is stored to the current working directory |
subTitle |
subtitle of the QC HTML report. Default is NULL. |
studyDesign |
description of the data set and experiment design. It would be shown at the top of QC HTML report. Default is NULL. |
.html file
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runDropletQC(sce)
reportDropletQC(inSCE = sce)
## End(Not run)
A function to generate .html Rmarkdown report containing the visualizations of the runFindMarker function output
reportFindMarker(inSCE, output_file = NULL, output_dir = NULL)
inSCE |
A |
output_file |
name of the generated file. If |
output_dir |
name of the output directory to save the rendered file. If
|
An HTML file of the report will be generated at the path specified in the arguments.
A function to generate .html Rmarkdown report for the specified QC algorithm output
reportQCTool( inSCE, algorithm = c("BarcodeRankDrops", "EmptyDrops", "QCMetrics", "Scrublet", "ScDblFinder", "Cxds", "Bcds", "CxdsBcdsHybrid", "DoubletFinder", "DecontX", "SoupX"), output_file = NULL, output_dir = NULL )
inSCE |
A SingleCellExperiment object containing the count matrix (full droplets or filtered matrix, depends on the selected QC algorithm) with the output from at least one of these functions: runQCMetrics, runScrublet, runScDblFinder, runCxds, runBcds, runCxdsBcdsHybrid, runDecontX, runBarcodeRankDrops, runEmptyDrops |
algorithm |
Character. Specifies which QC algorithm report to generate. Available options are "BarcodeRankDrops", "EmptyDrops", "QCMetrics", "Scrublet", "ScDblFinder", "Cxds", "Bcds", "CxdsBcdsHybrid", "DoubletFinder", "DecontX" and "SoupX". |
output_file |
name of the generated file. If NULL/default then the output file name will be based on the name of the selected QC algorithm. |
output_dir |
name of the output directory to save the rendered file. If NULL/default the file is stored to the current working directory |
.html file
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
## Not run: 
sce <- runDecontX(sce)
sce <- runQuickUMAP(sce)
reportQCTool(inSCE = sce, algorithm = "DecontX")
## End(Not run)
Generates an HTML report for the complete Seurat workflow and returns the SCE object with the results computed and stored inside the object.
reportSeurat( inSCE, biological.group = NULL, phenotype.groups = NULL, selected.markers = NULL, clustering.resolution = 0.8, variable.features = 2000, pc.count = 50, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, runHVG = TRUE, plotHVG = TRUE, runDimRed = TRUE, plotJackStraw = FALSE, plotElbowPlot = TRUE, plotHeatmaps = TRUE, runClustering = TRUE, plotTSNE = TRUE, plotUMAP = TRUE, minResolution = 0.3, maxResolution = 1.5, runMSClusters = TRUE, runMSBioGroup = TRUE, numTopFeatures = 10, forceRun = TRUE )
inSCE |
Input |
biological.group |
A character value that specifies the name of the
|
phenotype.groups |
A character vector that specifies the names of the
|
selected.markers |
A character vector containing the user-specified gene symbols or feature names of marker genes to be used to generate gene plots in addition to the gene markers computed from differential expression. |
clustering.resolution |
A numeric value indicating the user-specified
final resolution to use with clustering. Default is |
variable.features |
A numeric value indicating the number of top
variable features to identify. Default |
pc.count |
A numeric value indicating the number of principal components
to use in the analysis workflow. Default is |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default |
authors |
A character value specifying the names of the authors to use
in the report. Default |
showSession |
A logical value indicating if session information
should be displayed or not. Default is |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is |
runHVG |
A logical value indicating if the feature selection
computation should be run or not. Default is |
plotHVG |
A logical value indicating if the plot for the top most
variable genes should be visualized in a mean-to-variance plot.
Default is |
runDimRed |
A logical value indicating if PCA should be computed.
Default is |
plotJackStraw |
A logical value indicating if the JackStraw plot should be
visualized for the principal components. Default is |
plotElbowPlot |
A logical value indicating if the ElbowPlot should be
visualized for the principal components. Default is |
plotHeatmaps |
A logical value indicating if heatmaps should be plotted
for the principal components. Default is |
runClustering |
A logical value indicating if clustering section should
be run in the report. Default is |
plotTSNE |
A logical value indicating if TSNE plots should be visualized
for clustering results. Default is |
plotUMAP |
A logical value indicating if the UMAP plots should be
visualized for the clustering results. Default is |
minResolution |
A numeric value indicating the minimum resolution to
use for clustering. Default is |
maxResolution |
A numeric value indicating the maximum resolution to use
for clustering. Default is |
runMSClusters |
A logical value indicating if marker selection should
be run between clusters. Default is |
runMSBioGroup |
A logical value indicating if marker selection should
be run between the |
numTopFeatures |
A numeric value indicating the number of top features
to visualize in each group. Default |
forceRun |
A logical value indicating if all algorithms should be
re-run regardless if they have been computed previously in the input object.
Default is |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Clustering and returns the SCE object with the results computed and stored inside the object.
reportSeuratClustering( inSCE, biological.group = NULL, phenotype.groups = NULL, runClustering = TRUE, plotTSNE = TRUE, plotUMAP = TRUE, minResolution = 0.3, maxResolution = 1.5, numClusters = 10, significant_PC = 10, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input |
biological.group |
A character value that specifies the name of the
|
phenotype.groups |
A character vector that specifies the names of the
|
runClustering |
A logical value indicating if Clustering should be run
or not in the report. Default is |
plotTSNE |
A logical value indicating if TSNE plots should be visualized
in the clustering section of the report. Default is |
plotUMAP |
A logical value indicating if UMAP plots should be visualized
in the clustering section of the report. Default is |
minResolution |
A numeric value indicating the minimum resolution to use
for clustering. Default |
maxResolution |
A numeric value indicating the maximum resolution to use
for clustering. Default |
numClusters |
temp (to remove) |
significant_PC |
temp (change to pc.use) |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default |
authors |
A character value specifying the names of the authors to use
in the report. Default |
showSession |
A logical value indicating if session information
should be displayed or not. Default is |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless if these computations are
available in the input object. Default is |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Dimensionality Reduction and returns the SCE object with the results computed and stored inside the object.
reportSeuratDimRed( inSCE, pc.count = 50, runDimRed = TRUE, plotJackStraw = FALSE, plotElbowPlot = TRUE, plotHeatmaps = TRUE, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input |
pc.count |
A numeric value indicating the number of principal components
to compute. Default is |
runDimRed |
A logical value indicating if dimensionality reduction should
be computed. Default |
plotJackStraw |
A logical value indicating if JackStraw plot should be
visualized. Default |
plotElbowPlot |
A logical value indicating if ElbowPlot should be
visualized. Default |
plotHeatmaps |
A logical value indicating if heatmaps should be
visualized. Default |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default |
authors |
A character value specifying the names of the authors to use
in the report. Default |
showSession |
A logical value indicating if session information
should be displayed or not. Default is |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless if these computations are
available in the input object. Default is |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Feature Selection and returns the SCE object with the results computed and stored inside the object.
reportSeuratFeatureSelection( inSCE, variable.features = 2000, runHVG = TRUE, plotHVG = TRUE, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input |
variable.features |
A numeric value indicating the number of top variable
features to identify. Default |
runHVG |
A logical value indicating if the feature selection algorithm
should be run or not. Default |
plotHVG |
A logical value indicating if the mean-to-variance plot
of the top variable feature should be visualized or not. Default |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless of whether these computations are
available in the input object. Default is TRUE. |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Results (including Clustering & Marker Selection) and returns the SCE object with the results computed and stored inside the object.
reportSeuratMarkerSelection( inSCE, biological.group = NULL, phenotype.groups = NULL, selected.markers = NULL, runMarkerSelection = TRUE, plotMarkerSelection = TRUE, numTopFeatures = 10, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE )
inSCE |
Input SingleCellExperiment object. |
biological.group |
A character value that specifies the name of the
|
phenotype.groups |
A character vector that specifies the names of the
|
selected.markers |
A character vector containing the user-specified gene symbols or feature names of marker genes that will be used to generate gene plots in addition to the gene markers computed from differential expression. |
runMarkerSelection |
A logical value indicating if the marker selection
computation should be run or not. Default TRUE. |
plotMarkerSelection |
A logical value indicating if the gene marker
plots should be visualized or not. Default TRUE. |
numTopFeatures |
A numeric value indicating the number of top features
to visualize in each group. Default 10. |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Normalization and returns the SCE object with the results computed and stored inside the object.
reportSeuratNormalization( inSCE, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input SingleCellExperiment object. |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless of whether these computations are
available in the input object. Default is TRUE. |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Results (including Clustering & Marker Selection) and returns the SCE object with the results computed and stored inside the object.
reportSeuratResults( inSCE, biological.group = NULL, phenotype.groups = NULL, selected.markers = NULL, clustering.resolution = 0.8, pc.count = 50, plotTSNE = TRUE, plotUMAP = TRUE, runClustering = TRUE, runMSClusters = TRUE, runMSBioGroup = TRUE, numTopFeatures = 10, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input SingleCellExperiment object. |
biological.group |
A character value that specifies the name of the
|
phenotype.groups |
A character vector that specifies the names of the
|
selected.markers |
A character vector containing the user-specified gene symbols or feature names of marker genes that will be used to generate gene plots in addition to the gene markers computed from differential expression. |
clustering.resolution |
A numeric value indicating the user-specified
final resolution to use with clustering. Default is 0.8. |
pc.count |
A numeric value indicating the number of principal components
to use in the analysis workflow. Default is 50. |
plotTSNE |
A logical value indicating if TSNE plots should be visualized
in the clustering section of the report. Default is TRUE. |
plotUMAP |
A logical value indicating if UMAP plots should be visualized
in the clustering section of the report. Default is TRUE. |
runClustering |
A logical value indicating if Clustering should be run
or not in the report. Default is TRUE. |
runMSClusters |
A logical value indicating if the marker selection
section for identifying marker genes between clusters should be run and
visualized in the report. Default TRUE. |
runMSBioGroup |
A logical value indicating if the marker selection
section for identifying marker genes between the |
numTopFeatures |
A numeric value indicating the number of top features
to visualize in each group. Default 10. |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless of whether these computations are
available in the input object. Default is TRUE. |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Run (including Normalization, Feature Selection, Dimensionality Reduction & Clustering) and returns the SCE object with the results computed and stored inside the object.
reportSeuratRun( inSCE, biological.group = NULL, phenotype.groups = NULL, variable.features = 2000, pc.count = 50, runHVG = TRUE, plotHVG = TRUE, runDimRed = TRUE, plotJackStraw = FALSE, plotElbowPlot = TRUE, plotHeatmaps = TRUE, runClustering = TRUE, plotTSNE = TRUE, plotUMAP = TRUE, minResolution = 0.3, maxResolution = 1.5, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input SingleCellExperiment object. |
biological.group |
A character value that specifies the name of the
|
phenotype.groups |
A character value that specifies the name of the
|
variable.features |
A numeric value indicating the number of top
variable genes to identify in the report. Default is 2000. |
pc.count |
A numeric value indicating the number of principal
components to use in the analysis workflow. Default is 50. |
runHVG |
A logical value indicating if feature selection should be run
in the report. Default TRUE. |
plotHVG |
A logical value indicating if the top variable genes should
be visualized through a mean-to-variance plot. Default is TRUE. |
runDimRed |
A logical value indicating if PCA should be computed in the
report. Default is TRUE. |
plotJackStraw |
A logical value indicating if the JackStraw plot should
be visualized for the principal components. Default is FALSE. |
plotElbowPlot |
A logical value indicating if the ElbowPlot should be
visualized for the principal components. Default is TRUE. |
plotHeatmaps |
A logical value indicating if the Heatmaps should be
visualized for the principal components. Default is TRUE. |
runClustering |
A logical value indicating if Clustering should be
run over multiple resolutions as defined by the |
plotTSNE |
A logical value indicating if the TSNE plot should be visualized
for clusters. Default is TRUE. |
plotUMAP |
A logical value indicating if the UMAP plot should be visualized
for clusters. Default is TRUE. |
minResolution |
A numeric value indicating the minimum resolution to use
for clustering. Default 0.3. |
maxResolution |
A numeric value indicating the maximum resolution to use
for clustering. Default 1.5. |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless of whether these computations are
available in the input object. Default is TRUE. |
A SingleCellExperiment object with computations stored.
Generates an HTML report for Seurat Scaling and returns the SCE object with the results computed and stored inside the object.
reportSeuratScaling( inSCE, outputFile = NULL, outputPath = NULL, subtitle = NULL, authors = NULL, showSession = FALSE, pdf = FALSE, forceRun = TRUE )
inSCE |
Input SingleCellExperiment object. |
outputFile |
Specify the name of the generated output HTML file.
If |
outputPath |
Specify the name of the output directory to save the
rendered HTML file. If |
subtitle |
A character value specifying the subtitle to use in the
report. Default NULL. |
authors |
A character value specifying the names of the authors to use
in the report. Default NULL. |
showSession |
A logical value indicating if session information
should be displayed or not. Default is FALSE. |
pdf |
A logical value indicating if a pdf should also be generated for
each figure in the report. Default is FALSE. |
forceRun |
A logical value indicating if all computations previously
computed should be re-calculated regardless of whether these computations are
available in the input object. Default is TRUE. |
A SingleCellExperiment object with computations stored.
Originally written in retrieveFeatureIndex. Modified to also retrieve cell indices and to work only with SingleCellExperiment objects. Returns the indices of features or cells found in the rowData/colData. Partial matching (i.e. grepping) can be used.
retrieveSCEIndex( inSCE, IDs, axis, by = NULL, exactMatch = TRUE, firstMatch = TRUE )
inSCE |
Input SingleCellExperiment object. Required |
IDs |
Character vector of identifiers for features or cells to find in
|
axis |
A character scalar to specify whether to search for features or
cells. Use |
by |
Character. In which column to search for features/cells in
|
exactMatch |
A logical scalar. Whether to only identify exact matches
or to identify partial matches using |
firstMatch |
A logical scalar. Whether to only identify the first
matches or to return all plausible matches. Default TRUE. |
A unique, non-NA numeric vector of indices for the matching features/cells in inSCE.
Yusuke Koga, Joshua Campbell, Yichen Wang
data(scExample, package = "singleCellTK")
retrieveSCEIndex(inSCE = sce, IDs = "ENSG00000205542", axis = "row")
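The difference between exact and partial matching, and between first and all matches, can be illustrated in base R. This is a minimal sketch under stated assumptions, not the singleCellTK implementation: `findIndex` is a hypothetical helper, the `pool` vector stands in for a column of the rowData or colData, and partial matching is done here as a fixed-substring `grep`.

```r
# Hypothetical helper (not part of singleCellTK): look up IDs in a character
# vector, mimicking the exactMatch/firstMatch switches described above.
findIndex <- function(IDs, pool, exactMatch = TRUE, firstMatch = TRUE) {
  hits <- lapply(IDs, function(id) {
    idx <- if (exactMatch) {
      which(pool == id)               # exact string equality
    } else {
      grep(id, pool, fixed = TRUE)    # partial (substring) match
    }
    if (firstMatch) head(idx, 1) else idx
  })
  # Report each matching index only once
  unique(unlist(hits))
}

# Invented identifiers standing in for feature names
pool <- c("ENSG00000205542", "ENSG00000205542.1", "ENSG00000251562")

exactIdx   <- findIndex("ENSG00000205542", pool)
partialIdx <- findIndex("ENSG00000205542", pool,
                        exactMatch = FALSE, firstMatch = FALSE)
```

With exact matching only the first element of `pool` is selected, while the partial search with `firstMatch = FALSE` also picks up the versioned identifier.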
Run barcodeRanks on a count matrix provided in a SingleCellExperiment object. Distinguish between droplets containing cells and ambient RNA in a droplet-based single-cell RNA sequencing experiment.
runBarcodeRankDrops( inSCE, sample = NULL, useAssay = "counts", lower = 100, fitBounds = NULL, df = 20 )
inSCE |
A SingleCellExperiment object. Must contain a raw counts matrix before empty droplets have been removed. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default NULL. |
useAssay |
A string specifying which assay in the SCE to use. Default
"counts". |
lower |
See barcodeRanks for more information.
Default 100. |
fitBounds |
See barcodeRanks for more information.
Default NULL. |
df |
See barcodeRanks for more information. Default
20. |
A SingleCellExperiment object with the barcodeRanks output table appended to the colData slot. The columns include dropletUtils_BarcodeRank_Knee and dropletUtils_barcodeRank_inflection. Please refer to the documentation of barcodeRanks for details.
barcodeRanks, runDropletQC, plotBarcodeRankDropsResults
data(scExample, package = "singleCellTK")
sce <- runBarcodeRankDrops(inSCE = sce)
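The idea behind a barcode rank table can be sketched in base R: sum the counts per barcode and rank barcodes from the largest total to the smallest. The toy matrix below is invented for illustration, and using `lower` as a simple total-count cutoff is an assumption about its role; the knee and inflection points that barcodeRanks reports are not computed here.

```r
# Toy counts: 5 genes x 6 barcodes (values invented for illustration).
counts <- matrix(
  c(90, 80, 70, 60, 50,   # barcode1: cell-like, high total
    40, 30, 20, 10,  5,   # barcode2
     1,  0,  2,  0,  1,   # barcode3: ambient-like, low total
     0,  1,  0,  0,  0,   # barcode4
    60, 50, 40, 30, 20,   # barcode5
     0,  0,  1,  0,  0),  # barcode6
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("barcode", 1:6))
)

totals <- colSums(counts)                          # total count per barcode
ranks  <- rank(-totals, ties.method = "average")   # rank 1 = largest total

# Assumption: treat `lower` as a plain cutoff on the total count
lower <- 100
rankTable <- data.frame(total = totals, rank = ranks,
                        aboveLower = totals > lower)
rankTable[order(rankTable$rank), ]
```

Plotting `total` against `rank` on log-log axes gives the familiar barcode rank curve; the real function additionally fits that curve to locate the knee and inflection.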
BBKNN is an extremely fast graph-based data integration algorithm. It modifies the neighbourhood construction step to produce a graph that is balanced across all batches of the data.
runBBKNN( inSCE, useAssay = "logcounts", batch = "batch", reducedDimName = "BBKNN", nComponents = 50L )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default "logcounts". |
batch |
A single character indicating a field in |
reducedDimName |
A single character. The name for the corrected
low-dimensional representation. Will be saved to |
nComponents |
An integer. Number of principal components or the
dimensionality, adopted in the pre-PCA-computation step, the BBKNN step (for
how many PCs the algorithm takes into account), and the final UMAP
combination step where the value represents the dimensionality of the updated
reducedDim. Default 50L. |
The input SingleCellExperiment object with reducedDim(inSCE, reducedDimName) updated.
Krzysztof Polanski et al., 2020
## Not run: 
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceBatches <- runBBKNN(sceBatches, useAssay = "logcounts", nComponents = 10)
## End(Not run)
A wrapper function for bcds. Annotate doublets/multiplets using a binary classification approach to discriminate artificial doublets from original data. Generate a doublet score for each cell. Infer doublets if estNdbl is TRUE.
runBcds( inSCE, sample = NULL, seed = 12345, ntop = 500, srat = 1, verb = FALSE, retRes = FALSE, nmax = "tune", varImp = FALSE, estNdbl = FALSE, useAssay = "counts" )
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default NULL. |
seed |
Seed for the random number generator, can be |
ntop |
See bcds for more information. Default 500. |
srat |
See bcds for more information. Default 1. |
verb |
See bcds for more information. Default FALSE. |
retRes |
See bcds for more information. Default
FALSE. |
nmax |
See bcds for more information. Default
"tune". |
varImp |
See bcds for more information. Default
FALSE. |
estNdbl |
See bcds for more information. Default
FALSE. |
useAssay |
A string specifying which assay in |
When the argument sample is specified, bcds will be run on cells from each sample separately. If sample = NULL, then all cells will be processed together.
A SingleCellExperiment object with bcds output appended to the colData slot. The columns include bcds_score and optionally bcds_call. Please refer to the documentation of bcds for details.
bcds, plotBcdsResults, runCellQC
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runBcds(sce)
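The effect of the `sample` argument described above (per-sample processing versus all cells together) can be sketched with a toy scoring function in base R. Everything here is hypothetical: `scoreCells` is a stand-in that is unrelated to the bcds algorithm, and `runPerSample` only illustrates the split-and-recombine pattern, not how singleCellTK implements it.

```r
# Hypothetical stand-in for a per-cell scorer: scales each cell's total count
# by the largest total among the cells it is given.
scoreCells <- function(mat) colSums(mat) / max(colSums(mat))

# Toy counts: 2 genes x 4 cells (values invented for illustration)
counts <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), nrow = 2,
                 dimnames = list(c("g1", "g2"), paste0("cell", 1:4)))
sample <- c("s1", "s1", "s2", "s2")

runPerSample <- function(counts, sample = NULL, fun) {
  if (is.null(sample)) {
    return(fun(counts))                       # all cells processed together
  }
  scores <- numeric(ncol(counts))
  names(scores) <- colnames(counts)
  for (s in unique(sample)) {
    keep <- sample == s
    scores[keep] <- fun(counts[, keep, drop = FALSE])  # one sample at a time
  }
  scores
}

pooled    <- runPerSample(counts, NULL, scoreCells)
perSample <- runPerSample(counts, sample, scoreCells)
```

Because the toy score depends on the other cells in the call, `pooled` and `perSample` differ, which is the reason a sample label matters for any method whose score is relative to the rest of the input.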
A wrapper function to run several QC algorithms on a SingleCellExperiment object containing cells after empty droplets have been removed.
runCellQC( inSCE, algorithms = c("QCMetrics", "scDblFinder", "cxds", "bcds", "cxds_bcds_hybrid", "decontX", "decontX_bg", "soupX", "soupX_bg"), sample = NULL, collectionName = NULL, geneSetList = NULL, geneSetListLocation = "rownames", geneSetCollection = NULL, mitoRef = "human", mitoIDType = "ensembl", mitoPrefix = "MT-", mitoID = NULL, mitoGeneLocation = "rownames", useAssay = "counts", background = NULL, bgAssayName = NULL, bgBatch = NULL, seed = 12345, paramsList = NULL )
inSCE |
A SingleCellExperiment object. |
algorithms |
Character vector. Specify which QC algorithms to run. Available options are "QCMetrics", "scrublet", "doubletFinder", "scDblFinder", "cxds", "bcds", "cxds_bcds_hybrid", "decontX" and "soupX". |
sample |
Character vector. Indicates which sample each cell belongs to. Algorithms will be run on cells from each sample separately. |
collectionName |
Character. Name of a |
geneSetList |
See |
geneSetListLocation |
See |
geneSetCollection |
See |
mitoRef, mitoIDType, mitoPrefix, mitoID, mitoGeneLocation |
Arguments used to import mitochondrial genes and quantify their expression. Please see runPerCellQC for detailed information. |
useAssay |
A string specifying which assay contains the count matrix for cells. |
background |
A SingleCellExperiment
with the matrix located in the assay slot under |
bgAssayName |
Character. Name of the assay to use if background is a
SingleCellExperiment. If NULL, the function
will use the same value as |
bgBatch |
Batch labels for |
seed |
Seed for the random number generator. Default 12345. |
paramsList |
A list containing parameters for QC functions. Default NULL. |
SingleCellExperiment object containing the outputs of the specified algorithms in the colData of inSCE.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
## Not run: 
sce <- runCellQC(sce)
## End(Not run)
Calculates the mean expression and the percent of cells that express the given genes for each cluster.
runClusterSummaryMetrics( inSCE, useAssay = "logcounts", featureNames, displayName = NULL, groupNames = "cluster", scale = FALSE )
inSCE |
The single cell experiment to use. |
useAssay |
The assay to use. |
featureNames |
A string or vector of strings with each gene to aggregate. |
displayName |
A string that is the name of the column used for genes. |
groupNames |
The name of a colData entry that can be used as groupNames. |
scale |
Option to scale the data. Default: FALSE. |
A data frame with the mean expression and the percent of cells expressing each gene, for each cluster.
data("scExample")
runClusterSummaryMetrics(inSCE=sce, useAssay="counts", featureNames=c("B2M", "MALAT1"), displayName="feature_name", groupNames="type")
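The two summary metrics can be reproduced on a toy matrix in base R. This is a sketch under stated assumptions rather than the package's code: the expression values and group labels are invented, and a cell is counted as expressing a gene when its value is greater than zero.

```r
# Toy expression matrix: 2 genes x 6 cells (values invented for illustration).
expr <- matrix(
  c(2, 0,
    4, 1,
    0, 0,
    6, 3,
    0, 5,
    0, 1),
  nrow = 2,
  dimnames = list(c("B2M", "MALAT1"), paste0("cell", 1:6))
)
group <- c("A", "A", "A", "B", "B", "B")   # hypothetical cluster labels

# Column indices of the cells belonging to each group
cellsByGroup <- split(seq_len(ncol(expr)), group)

# Mean expression of each gene within each group (rows = groups)
avgExpr <- t(sapply(cellsByGroup,
                    function(i) rowMeans(expr[, i, drop = FALSE])))

# Percent of cells per group with non-zero expression (assumed definition)
pctExpr <- t(sapply(cellsByGroup,
                    function(i) 100 * rowMeans(expr[, i, drop = FALSE] > 0)))
```

Both results are group-by-gene matrices, so `avgExpr["B", "MALAT1"]` and `pctExpr["B", "MALAT1"]` give the mean expression and percent of expressing cells for MALAT1 in group B.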
The ComBat-Seq batch adjustment approach assumes that batch effects represent non-biological but systematic shifts in the mean or variability of genomic features for all samples within a processing batch. It uses either parametric or non-parametric empirical Bayes frameworks for adjusting data for batch effects.
runComBatSeq( inSCE, useAssay = "counts", batch = "batch", covariates = NULL, bioCond = NULL, useSVA = FALSE, assayName = "ComBatSeq", shrink = FALSE, shrinkDisp = FALSE, nGene = NULL )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default "counts". |
batch |
A single character indicating a field in
|
covariates |
A character vector indicating the fields in
|
bioCond |
A single character indicating a field in
|
useSVA |
A logical scalar. Whether to estimate surrogate variables and
use them as an empirical control. Default FALSE. |
assayName |
A single character. The name for the corrected assay. Will
be saved to |
shrink |
A logical scalar. Whether to apply shrinkage on parameter
estimation. Default FALSE. |
shrinkDisp |
A logical scalar. Whether to apply shrinkage on dispersion.
Default FALSE. |
nGene |
An integer. Number of random genes to use in empirical Bayes
estimation, only useful when |
For the parameters covariates and useSVA, when the cell type information is known, it is recommended to specify the cell type annotation to the argument covariates; if the cell types are unknown but expected to be balanced, it is recommended to run with default settings, yet informative covariates could still be useful. If the cell types are unknown and are expected to be unbalanced, it is recommended to set useSVA to TRUE.
The input SingleCellExperiment object with assay(inSCE, assayName) updated.
data('sceBatches', package = 'singleCellTK')
sceBatches <- sample(sceBatches, 40)
# Cell type known
sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
                           covariates = "cell_type",
                           assayName = "ComBat_cell_seq")
# Cell type unknown but balanced
#sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
#                           assayName = "ComBat_seq")
# Cell type unknown and unbalanced
#sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
#                           useSVA = TRUE,
#                           assayName = "ComBat_sva_seq")
A wrapper function for cxds. Annotate doublets/multiplets using a co-expression based approach. Generate a doublet score for each cell. Infer doublets if estNdbl is TRUE.
runCxds( inSCE, sample = NULL, seed = 12345, ntop = 500, binThresh = 0, verb = FALSE, retRes = FALSE, estNdbl = FALSE, useAssay = "counts" )
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default NULL. |
seed |
Seed for the random number generator, can be |
ntop |
See cxds for more information. Default 500. |
binThresh |
See cxds for more information. Default
0. |
verb |
See cxds for more information. Default FALSE. |
retRes |
See cxds for more information. Default
FALSE. |
estNdbl |
See cxds for more information. Default
FALSE. |
useAssay |
A string specifying which assay in the SCE to use. Default
"counts". |
When the argument sample is specified, cxds will be run on cells from each sample separately. If sample = NULL, then all cells will be processed together.
A SingleCellExperiment object with cxds output appended to the colData slot. The columns include cxds_score and optionally cxds_call.
cxds, plotCxdsResults, runCellQC
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runCxds(sce)
A wrapper function for cxds_bcds_hybrid. Annotate doublets/multiplets using a binary classification approach to discriminate artificial doublets from original data. Generate a doublet score for each cell. Infer doublets if estNdbl is TRUE.
runCxdsBcdsHybrid( inSCE, sample = NULL, seed = 12345, nTop = 500, cxdsArgs = list(), bcdsArgs = list(), verb = FALSE, estNdbl = FALSE, force = FALSE, useAssay = "counts" )
inSCE |
A SingleCellExperiment object.
Needs |
sample |
Character vector. Indicates which sample each cell belongs to. cxds_bcds_hybrid will be run on cells from each sample separately. If NULL, then all cells will be processed together. Default NULL. |
seed |
Seed for the random number generator. Default 12345. |
nTop |
The number of top variable genes to consider. Used in both |
cxdsArgs |
See cxds_bcds_hybrid for more information. Default list(). |
bcdsArgs |
See cxds_bcds_hybrid for more information. Default list(). |
verb |
See cxds_bcds_hybrid for more information. Default FALSE. |
estNdbl |
See cxds_bcds_hybrid for more information. Default FALSE. |
force |
See cxds_bcds_hybrid for more information. Default FALSE. |
useAssay |
A string specifying which assay in the SCE to use. |
A SingleCellExperiment object with cxds_bcds_hybrid output appended to the colData slot. The columns include hybrid_score and optionally hybrid_call. Please refer to the documentation of cxds_bcds_hybrid for details.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runCxdsBcdsHybrid(sce)
Perform differential expression analysis on SCE object
runDEAnalysis(inSCE, method = "wilcox", ...) runDESeq2( inSCE, useAssay = "counts", useReducedDim = NULL, index1 = NULL, index2 = NULL, class = NULL, classGroup1 = NULL, classGroup2 = NULL, analysisName, groupName1, groupName2, covariates = NULL, fullReduced = TRUE, onlyPos = FALSE, log2fcThreshold = NULL, fdrThreshold = NULL, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, overwrite = FALSE, verbose = TRUE ) runLimmaDE( inSCE, useAssay = "logcounts", useReducedDim = NULL, index1 = NULL, index2 = NULL, class = NULL, classGroup1 = NULL, classGroup2 = NULL, analysisName, groupName1, groupName2, covariates = NULL, onlyPos = FALSE, log2fcThreshold = NULL, fdrThreshold = NULL, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, overwrite = FALSE, verbose = TRUE ) runANOVA( inSCE, useAssay = "logcounts", useReducedDim = NULL, index1 = NULL, index2 = NULL, class = NULL, classGroup1 = NULL, classGroup2 = NULL, analysisName, groupName1, groupName2, covariates = NULL, onlyPos = FALSE, log2fcThreshold = NULL, fdrThreshold = NULL, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, overwrite = FALSE, verbose = TRUE ) runMAST( inSCE, useAssay = "logcounts", useReducedDim = NULL, index1 = NULL, index2 = NULL, class = NULL, classGroup1 = NULL, classGroup2 = NULL, analysisName, groupName1, groupName2, covariates = NULL, onlyPos = FALSE, log2fcThreshold = NULL, fdrThreshold = NULL, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, overwrite = FALSE, check_sanity = TRUE, verbose = TRUE ) runWilcox( inSCE, useAssay = "logcounts", useReducedDim = NULL, index1 = NULL, index2 = NULL, class = "cluster", classGroup1 = c(1), classGroup2 = c(2), analysisName = "cluster1_VS_2", groupName1 = "cluster1", groupName2 = "cluster2", covariates = NULL, onlyPos = FALSE, log2fcThreshold = NULL, 
fdrThreshold = NULL, minGroup1MeanExp = NULL, maxGroup2MeanExp = NULL, minGroup1ExprPerc = NULL, maxGroup2ExprPerc = NULL, overwrite = FALSE, verbose = TRUE )
inSCE |
SingleCellExperiment inherited object. |
method |
Character. Specify which method to use when using
|
... |
Arguments to pass to specific methods when using the generic
|
useAssay |
character. A string specifying which assay to use for the
DE regression. Ignored when |
useReducedDim |
character. A string specifying which reducedDim to use
for DE analysis. Will treat the dimensions as features. Default |
index1 |
Any type of indices that can subset a
SingleCellExperiment inherited object by cells. Specifies
which cells are of interest. Default |
index2 |
Any type of indices that can subset a
SingleCellExperiment inherited object by cells. Specifies
the control group against those specified by |
class |
A vector/factor with |
classGroup1 |
a vector specifying which "levels" given in |
classGroup2 |
a vector specifying which "levels" given in |
analysisName |
A character scalar naming the DEG analysis.
Default |
groupName1 |
A character scalar naming the group of interest.
Default |
groupName2 |
A character scalar naming the control group.
Default |
covariates |
A character vector of additional covariates to use when
building the model. All covariates must exist in
|
fullReduced |
Logical, DESeq2 only argument. Whether to apply LRT
(Likelihood ratio test) with a 'full' model. Default |
onlyPos |
Whether to only output DEGs with a positive log2_FC value.
Default |
log2fcThreshold |
Only output DEGs with an absolute log2FC value
greater than this value. Default |
fdrThreshold |
Only output DEGs with an FDR value less than this
value. Default |
minGroup1MeanExp |
Only output DEGs with mean expression in group1
greater than this value. Default |
maxGroup2MeanExp |
Only output DEGs with mean expression in group2
less than this value. Default |
minGroup1ExprPerc |
Only output DEGs expressed in more than this
fraction of cells in group1. Default |
maxGroup2ExprPerc |
Only output DEGs expressed in less than this
fraction of cells in group2. Default |
overwrite |
A logical scalar. Whether to overwrite the result if it already exists.
Default |
verbose |
A logical scalar. Whether to show messages. Default
|
check_sanity |
Logical, MAST only argument. Whether to perform MAST's
sanity check to see if the counts are logged. Default |
SCTK provides Limma, MAST, DESeq2, ANOVA and Wilcoxon tests for differential expression analysis. DESeq2 expects a non-negative integer assay as input, while the other methods expect logcounts.
Condition specification allows two methods:
1. Index level selection. Only use arguments index1 and index2.
2. Annotation level selection. Only use arguments class, classGroup1 and classGroup2.
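Both selection modes can be pictured as two routes to the same pair of logical cell masks. The base R sketch below uses a small made-up stand-in for `colData(inSCE)` (the `cellInfo` table, its `cluster` column and the helper `maskFromIndex` are assumptions for illustration); it is not the package's internal code.

```r
# Toy stand-in for colData(inSCE): six cells with a cluster annotation.
cellInfo <- data.frame(
  cluster = c(1, 1, 2, 2, 3, 3),
  row.names = paste0("cell", 1:6)
)

# 1. Index level selection: any valid index (numeric, character, logical)
#    is turned into a logical mask over all cells.
maskFromIndex <- function(idx, cellNames) {
  mask <- rep(FALSE, length(cellNames))
  names(mask) <- cellNames
  mask[idx] <- TRUE
  mask
}
ix1 <- maskFromIndex(c(1, 2), rownames(cellInfo))
ix2 <- maskFromIndex(c("cell3", "cell4"), rownames(cellInfo))

# 2. Annotation level selection: levels of a class variable define the groups.
classGroup1 <- c(1)
classGroup2 <- c(2)
cls1 <- cellInfo$cluster %in% classGroup1
cls2 <- cellInfo$cluster %in% classGroup2

# In this toy case both routes select the same cells.
identical(unname(ix1), cls1)
identical(unname(ix2), cls2)
```

Mixing the two modes in one call would make the group definition ambiguous, which is presumably why only one set of arguments should be supplied.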
The input SingleCellExperiment object, where metadata(inSCE)$diffExp is updated with a list named by analysisName, with elements of:
$groupNames |
the naming of the two conditions |
$useAssay, $useReducedDim |
the matrix name that was used for calculation |
$select |
the cell selection indices (logical) for each condition |
$result |
a |
$method |
the method used |
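The threshold arguments (log2fcThreshold, fdrThreshold, onlyPos) behave like row filters on the result table. A minimal base R sketch of that filtering logic follows; the table `deg`, its column names (`Gene`, `Log2_FC`, `FDR`) and the helper `filterDEG` are assumptions for illustration and may not match what the package stores internally.

```r
# Hypothetical DE result table; the column names are assumptions.
deg <- data.frame(
  Gene    = c("g1", "g2", "g3", "g4", "g5"),
  Log2_FC = c(2.1, -1.8, 0.3, 1.2, -0.1),
  FDR     = c(0.001, 0.01, 0.2, 0.04, 0.9),
  stringsAsFactors = FALSE
)

# Apply each filter only when its threshold is not NULL.
filterDEG <- function(tab, log2fcThreshold = NULL, fdrThreshold = NULL,
                      onlyPos = FALSE) {
  keep <- rep(TRUE, nrow(tab))
  if (!is.null(log2fcThreshold)) {
    keep <- keep & abs(tab$Log2_FC) > log2fcThreshold
  }
  if (!is.null(fdrThreshold)) {
    keep <- keep & tab$FDR < fdrThreshold
  }
  if (isTRUE(onlyPos)) {
    keep <- keep & tab$Log2_FC > 0
  }
  tab[keep, , drop = FALSE]
}

filtered <- filterDEG(deg, log2fcThreshold = 1, fdrThreshold = 0.05,
                      onlyPos = TRUE)
filtered$Gene
```

With all thresholds left at NULL, every gene is kept.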
See plotDEGHeatmap, plotDEGRegression, plotDEGViolin and plotDEGVolcano for visualization methods after running DE analysis.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, assayName = "logcounts")
sce <- runDEAnalysis(method = "Limma", inSCE = sce, groupName1 = "group1",
                     groupName2 = "group2", index1 = seq(20), index2 = seq(21,40),
                     analysisName = "Limma")
A wrapper function for decontX. Identifies potential contamination from experimental factors such as ambient RNA.
runDecontX( inSCE, sample = NULL, useAssay = "counts", background = NULL, bgAssayName = NULL, bgBatch = NULL, z = NULL, maxIter = 500, delta = c(10, 10), estimateDelta = TRUE, convergence = 0.001, iterLogLik = 10, varGenes = 5000, dbscanEps = 1, seed = 12345, logfile = NULL, verbose = TRUE )
inSCE |
A SingleCellExperiment object. |
sample |
A single character specifying a name that can be found in
|
useAssay |
A string specifying which assay in the SCE to use. Default 'counts'. |
background |
A SingleCellExperiment
with the matrix located in the assay slot under |
bgAssayName |
Character. Name of the assay to use if background is a
SingleCellExperiment. If NULL, the function
will use the same value as |
bgBatch |
Batch labels for |
z |
Numeric or character vector. Cell cluster labels. If NULL, PCA will be used to reduce the dimensionality of the dataset initially, 'umap' from the 'uwot' package will be used to further reduce the dataset to 2 dimensions and the 'dbscan' function from the 'dbscan' package will be used to identify clusters of broad cell types. Default NULL. |
maxIter |
Integer. Maximum iterations of the EM algorithm. Default 500. |
delta |
Numeric Vector of length 2. Concentration parameters for
the Dirichlet prior for the contamination in each cell. The first element
is the prior for the native counts while the second element is the prior for
the contamination counts. These essentially act as pseudocounts for the
native and contamination in each cell. If |
estimateDelta |
Boolean. Whether to update |
convergence |
Numeric. The EM algorithm will be stopped if the maximum difference in the contamination estimates between the previous and current iterations is less than this. Default 0.001. |
iterLogLik |
Integer. Calculate log likelihood every |
varGenes |
Integer. The number of variable genes to use in
dimensionality reduction before clustering. Variability is calculated using
|
dbscanEps |
Numeric. The clustering resolution parameter used in 'dbscan' to estimate broad cell clusters. Used only when z is not provided. Default 1. |
seed |
Integer. Passed to with_seed. For reproducibility, a default value of 12345 is used. If NULL, no calls to with_seed are made. |
logfile |
Character. Messages will be redirected to a file named 'logfile'. If NULL, messages will be printed to stdout. Default NULL. |
verbose |
Logical. Whether to print log messages. Default TRUE. |
A SingleCellExperiment object with 'decontX_Contamination' and 'decontX_Clusters' added to the colData slot. Additionally, the decontaminated counts will be added as an assay called 'decontXCounts'.
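The statement that delta acts as pseudocounts can be illustrated with the standard Dirichlet posterior mean. The sketch below is a simplified illustration only: the helper `contamShare` and the toy counts are assumptions, and decontX's actual inference updates are more involved than this single formula.

```r
# Posterior mean of the contamination share for one cell under a
# Dirichlet(delta[1], delta[2]) prior on (native, contamination).
# Illustration only; not the decontX implementation.
contamShare <- function(nativeCounts, contamCounts, delta = c(10, 10)) {
  (contamCounts + delta[2]) / (nativeCounts + contamCounts + sum(delta))
}

# Symmetric default prior: the estimate stays close to the observed share.
weakPrior <- contamShare(nativeCounts = 900, contamCounts = 100)

# A large second element pulls the estimate towards more contamination.
strongPrior <- contamShare(nativeCounts = 900, contamCounts = 100,
                           delta = c(10, 1000))
c(weakPrior = weakPrior, strongPrior = strongPrior)
```

Larger values in delta therefore make the prior harder for the data in any single cell to override.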
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runDecontX(sce[,sample(ncol(sce),20)])
Generic wrapper function for running dimensionality reduction.
runDimReduce( inSCE, method = c("scaterPCA", "seuratPCA", "seuratICA", "scanpyPCA", "rTSNE", "seuratTSNE", "scaterUMAP", "seuratUMAP", "scanpyUMAP", "scanpyTSNE"), useAssay = NULL, useReducedDim = NULL, useAltExp = NULL, reducedDimName = method, nComponents = 20, useFeatureSubset = NULL, scale = FALSE, seed = 12345, ... )
inSCE |
Input SingleCellExperiment object. |
method |
One from |
useAssay |
Assay to use for computation. If |
useReducedDim |
The low dimension representation to use for embedding
computation. Default |
useAltExp |
The subset to use for computation, usually for the
selected variable features. Default |
reducedDimName |
The name of the result matrix. Required. |
nComponents |
Specify the number of dimensions to compute with the selected method in case of PCA/ICA and the number of components to use in the case of TSNE/UMAP methods. |
useFeatureSubset |
Subset of features to use for dimension reduction. A
character string indicating a |
scale |
Logical scalar, whether to standardize the expression values.
Default |
seed |
Random seed for reproducibility of results.
Default |
... |
Other arguments for running the selected algorithm. Please refer to the documentation of the function you use. |
Wrapper function to run one of the available dimensionality reduction algorithms integrated within SCTK from scaterPCA, runSeuratPCA, runSeuratICA, runTSNE, runSeuratTSNE, runUMAP and runSeuratUMAP. Users can use an assay by specifying useAssay, use the assay in an altExp by specifying both useAltExp and useAssay, or use a low-dimensionality representation by specifying useReducedDim.
The input SingleCellExperiment object with reducedDim updated with the result.
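To make the roles of nComponents and scale concrete, here is a base R sketch of a PCA on a toy log-expression matrix using `stats::prcomp`. It only mirrors the general idea of the PCA methods (features standardized, cells embedded in nComponents dimensions); the toy data and variable names are assumptions, and the wrapped functions use their own implementations.

```r
set.seed(12345)
# Toy log-expression matrix: 50 features (rows) x 30 cells (columns).
logMat <- matrix(rnorm(50 * 30), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:30)))

nComponents <- 20

# prcomp() expects observations in rows, so cells go in rows after t().
# scale. = TRUE standardizes each feature, comparable to scale = TRUE above.
pca <- prcomp(t(logMat), center = TRUE, scale. = TRUE, rank. = nComponents)

# Cells x components matrix, analogous to an entry in reducedDim.
embedding <- pca$x
dim(embedding)
```

The resulting matrix has one row per cell, which is the orientation expected of a reducedDim entry.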
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runNormalization(sce, useAssay = "counts", outAssayName = "logcounts",
                        normalizationMethod = "logNormCounts")
sce <- runDimReduce(inSCE = sce, method = "scaterPCA", useAssay = "logcounts",
                    scale = TRUE, reducedDimName = "PCA")
Uses doubletFinder to determine cells within the dataset suspected to be doublets.
runDoubletFinder( inSCE, sample = NULL, useAssay = "counts", seed = 12345, seuratNfeatures = 2000, seuratPcs = seq(15), seuratRes = 1.5, formationRate = 0.075, nCores = NULL, verbose = FALSE )
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
useAssay |
A string specifying which assay in the SCE to use. Default
|
seed |
Seed for the random number generator, can be set to |
seuratNfeatures |
Integer. Number of highly variable genes to use.
Default |
seuratPcs |
Numeric vector. The PCs used in seurat function to
determine number of clusters. Default |
seuratRes |
Numeric vector. The resolution parameter used in Seurat,
which adjusts the number of clusters determined via the algorithm. Default
|
formationRate |
Doublet formation rate used within algorithm. Default
|
nCores |
Number of cores used for running the function. Default
|
verbose |
Boolean. Whether to print messages from Seurat and
DoubletFinder. Default |
SingleCellExperiment object containing the doublet_finder_doublet_score variable in the colData slot.
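In typical DoubletFinder usage, the formation rate is converted to an expected number of doublets by multiplying it with the number of cells. The short sketch below shows that arithmetic per sample; the sample names and cell counts are made up, and whether this wrapper applies exactly this rounding is an assumption.

```r
formationRate <- 0.075

# Hypothetical per-sample cell counts.
cellsPerSample <- c(sampleA = 4000, sampleB = 1250)

# Expected number of doublets per sample under the assumed formation rate.
expectedDoublets <- round(formationRate * cellsPerSample)
expectedDoublets
```

Because the expectation scales with the cell count, the sample argument matters when several samples are stored in one object.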
runCellQC, plotDoubletFinderResults
data(scExample, package = "singleCellTK")
options(future.globals.maxSize = 786432000)
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runDoubletFinder(sce)
A wrapper function to run several QC algorithms for determining empty droplets in single cell RNA-seq data
runDropletQC( inSCE, algorithms = c("QCMetrics", "emptyDrops", "barcodeRanks"), sample = NULL, useAssay = "counts", paramsList = NULL )
inSCE |
A SingleCellExperiment object containing the full droplet count matrix |
algorithms |
Character vector. Specify which QC algorithms to run. Available options are "QCMetrics", "emptyDrops" and "barcodeRanks". |
sample |
Character vector. Indicates which sample each cell belongs to. Algorithms will be run on cells from each sample separately. |
useAssay |
A string specifying which assay contains the count matrix for droplets. |
paramsList |
A list containing parameters for QC functions. Default NULL. |
SingleCellExperiment object containing the outputs of the specified algorithms in the colData of inSCE.
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runDropletQC(sce)
## End(Not run)
Run emptyDrops on the count matrix in the provided SingleCellExperiment object. Distinguish between droplets containing cells and ambient RNA in a droplet-based single-cell RNA sequencing experiment.
runEmptyDrops( inSCE, sample = NULL, useAssay = "counts", lower = 100, niters = 10000, testAmbient = FALSE, ignore = NULL, alpha = NULL, retain = NULL, barcodeArgs = list(), BPPARAM = BiocParallel::SerialParam() )
inSCE |
A SingleCellExperiment object. Must contain a raw counts matrix before empty droplets have been removed. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
useAssay |
A string specifying which assay in the SCE to use. Default
|
lower |
See emptyDrops for more information.
Default |
niters |
See emptyDrops for more information.
Default |
testAmbient |
See emptyDrops for more information.
Default |
ignore |
See emptyDrops for more information.
Default |
alpha |
See emptyDrops for more information.
Default |
retain |
See emptyDrops for more information.
Default |
barcodeArgs |
See emptyDrops for more information.
Default |
BPPARAM |
See emptyDrops for more information.
Default |
A SingleCellExperiment object with the emptyDrops output table appended to the colData slot. The columns include emptyDrops_total, emptyDrops_logprob, emptyDrops_pvalue, emptyDrops_limited, emptyDrops_fdr. Please refer to the documentation of emptyDrops for details.
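A common next step is to call cell-containing droplets from the emptyDrops_fdr column. The base R sketch below uses a small made-up table in place of `colData(inSCE)`; the FDR cutoff of 0.01 is an assumption to be adjusted per experiment. Barcodes with totals at or below lower are not tested by emptyDrops and carry NA values, so the comparison is wrapped in `which()`.

```r
# Toy stand-in for the emptyDrops columns in colData(inSCE).
droplets <- data.frame(
  emptyDrops_total = c(50, 80, 1500, 3200, 400),
  emptyDrops_fdr   = c(NA, NA, 0.0004, 0.0001, 0.35),
  row.names = paste0("barcode", 1:5)
)

fdrCutoff <- 0.01  # assumed cutoff

# which() drops NA entries, so untested barcodes are treated as empty.
isCell <- rep(FALSE, nrow(droplets))
isCell[which(droplets$emptyDrops_fdr <= fdrCutoff)] <- TRUE

rownames(droplets)[isCell]
```

Subsetting directly with `droplets$emptyDrops_fdr <= fdrCutoff` would propagate the NA values into the index, which is why the explicit mask is built first.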
runDropletQC, plotEmptyDropsResults, plotEmptyDropsScatter
data(scExample, package = "singleCellTK")
sce <- runEmptyDrops(inSCE = sce)
Run EnrichR on SCE object
runEnrichR( inSCE, features, analysisName, db = NULL, by = "rownames", featureName = NULL )
inSCE |
A SingleCellExperiment object. |
features |
Character vector, selected genes for enrichment analysis. |
analysisName |
A string that identifies each specific analysis. |
db |
Character vector. Selected database name(s) from the enrichR
database list. If |
by |
Character. From where should we find the |
featureName |
Character. Indicates the actual feature identifiers to be
passed to EnrichR. Can be |
EnrichR works by submitting the specified features to its online databases, so it requires an Internet connection.
Available db options can be listed by running enrichR::listEnrichrDbs()$libraryName
This function checks for the existence of features in the SCE object. When features do not have a match in rownames(inSCE), users may try to specify by to pass the check.
EnrichR expects gene symbols/names as the input (e.g. Ensembl IDs might not work). When the specified features do not meet this requirement, users may try to specify featureName to change the identifier type passed to EnrichR.
Updates inSCE metadata with a data.frame of enrichment terms overlapping in the respective databases along with p-values, z-scores etc.
data("mouseBrainSubsetSCE")
if (Biobase::testBioCConnection()) {
  mouseBrainSubsetSCE <- runEnrichR(mouseBrainSubsetSCE, features = "Cmtm5",
                                    db = "GO_Cellular_Component_2017",
                                    analysisName = "analysis1")
}
fastMNN is a variant of the classic MNN method, modified for speed and more robust performance. For an introduction to MNN, see runMNNCorrect.
runFastMNN( inSCE, useAssay = "logcounts", useReducedDim = NULL, batch = "batch", reducedDimName = "fastMNN", k = 20, propK = NULL, ndist = 3, minBatchSkip = 0, cosNorm = TRUE, nComponents = 50, weights = NULL, BPPARAM = BiocParallel::SerialParam() )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default |
useReducedDim |
A single character indicating the dimension reduction
used for batch correction. Will ignore |
batch |
A single character indicating a field in |
reducedDimName |
A single character. The name for the corrected
low-dimensional representation. Default |
k |
An integer scalar specifying the number of nearest neighbors to
consider when identifying MNNs. See "See Also". Default |
propK |
A numeric scalar in (0, 1) specifying the proportion of cells in
each dataset to use for mutual nearest neighbor searching. See "See Also".
Default |
ndist |
A numeric scalar specifying the threshold beyond which
neighbours are to be ignored when computing correction vectors. See "See
Also". Default |
minBatchSkip |
Numeric scalar specifying the minimum relative magnitude
of the batch effect, below which no correction will be performed at a given
merge step. See "See Also". Default |
cosNorm |
A logical scalar indicating whether cosine normalization
should be performed on |
nComponents |
An integer scalar specifying the number of dimensions to
produce. See "See Also". Default |
weights |
The weighting scheme to use. Passed to
|
BPPARAM |
A BiocParallelParam object specifying whether the SVD should be parallelized. |
The input SingleCellExperiment object with reducedDim(inSCE, reducedDimName) updated.
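The cosNorm option refers to cosine normalization, which rescales each cell's expression vector to unit Euclidean length so that distances between cells reflect direction rather than magnitude. A minimal base R sketch of that operation is shown below; the helper `cosineNormalize` and the toy matrix are assumptions for illustration, not the routine used by the wrapped package.

```r
# Scale each cell (column) to unit Euclidean length.
cosineNormalize <- function(mat) {
  l2 <- sqrt(colSums(mat^2))
  l2[l2 == 0] <- 1  # leave all-zero cells unchanged
  sweep(mat, 2, l2, "/")
}

set.seed(1)
# Toy expression matrix: 5 genes x 4 cells.
expr <- matrix(rexp(5 * 4), nrow = 5)
exprNorm <- cosineNormalize(expr)

# Every cell now has squared length 1 (up to floating point error).
colSums(exprNorm^2)
```

This puts batches with different overall scaling on a comparable footing before neighbors are searched.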
Lun ATL, et al., 2016
fastMNN for using useAssay, and reducedMNN for using useReducedDim
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceCorr <- runFastMNN(sceBatches, useAssay = 'logcounts')
Wrapper function to run all of the feature selection methods integrated within the singleCellTK package including three methods from Seurat ("vst", "mean.var.plot" or dispersion) and the Scran modelGeneVar method.
This function does not return the names of the variable features but only computes the metrics, which will be stored in the rowData slot. To set a HVG list for downstream use, users should call setTopHVG after computing the metrics. To get the names of the variable features, users should call the getTopHVG function after computing the metrics.
runFeatureSelection(inSCE, useAssay, method = "vst")
inSCE |
Input SingleCellExperiment object. |
useAssay |
Specify the name of the assay that should be used. Should use
raw counts for |
method |
Specify the method to use for variable gene selection.
Options include |
The input SingleCellExperiment object that contains the computed statistics in the rowData slot
runModelGeneVar, runSeuratFindHVG, getTopHVG, plotTopHVG
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runFeatureSelection(mouseBrainSubsetSCE, "logcounts", "modelGeneVar")
With an input SingleCellExperiment object and specifying the clustering labels, this function iteratively calls the differential expression analysis on each cluster against all the others. runFindMarker will be deprecated in the future.
runFindMarker( inSCE, useAssay = "logcounts", useReducedDim = NULL, method = "wilcox", cluster = "cluster", covariates = NULL, log2fcThreshold = NULL, fdrThreshold = 0.05, minClustExprPerc = NULL, maxCtrlExprPerc = NULL, minMeanExpr = NULL, detectThresh = 0 )
findMarkerDiffExp( inSCE, useAssay = "logcounts", useReducedDim = NULL, method = c("wilcox", "MAST", "DESeq2", "Limma", "ANOVA"), cluster = "cluster", covariates = NULL, log2fcThreshold = NULL, fdrThreshold = 0.05, minClustExprPerc = NULL, maxCtrlExprPerc = NULL, minMeanExpr = NULL, detectThresh = 0 )
inSCE |
SingleCellExperiment inherited object. |
useAssay |
character. A string specifying which assay to use for the
DE calculations. Default |
useReducedDim |
character. A string specifying which reducedDim to use
for DE calculations. Set |
method |
A single character for specific differential expression
analysis method. Choose from |
cluster |
One single character to specify a column in
|
covariates |
A character vector of additional covariates to use when
building the model. All covariates must exist in
|
log2fcThreshold |
Only output DEGs with the absolute values of log2FC
larger than this value. Default |
fdrThreshold |
Only output DEGs with FDR value smaller than this
value. Default |
minClustExprPerc |
A numeric scalar. The minimum cutoff of the
percentage of cells in the cluster of interest that expressed the marker
gene. From 0 to 1. Default |
maxCtrlExprPerc |
A numeric scalar. The maximum cutoff of the
percentage of cells out of the cluster (control group) that expressed the
marker gene. From 0 to 1. Default |
minMeanExpr |
A numeric scalar. The minimum cutoff of the mean
expression value of the marker in the cluster of interest. Default
|
detectThresh |
A numeric scalar, above which a matrix value will be
treated as expressed when calculating cluster/control expression percentage.
Default |
The returned marker table, in the metadata slot, consists of 8 columns: "Gene", "Log2_FC", "Pvalue", "FDR", cluster, "clusterExprPerc", "ControlExprPerc" and "clusterAveExpr".
"clusterExprPerc" is the fraction of cells in the cluster whose marker value (e.g. gene expression counts) is larger than detectThresh. For each cluster, all cells outside that cluster are used as the control. Similarly, "ControlExprPerc" is the fraction of cells with marker value larger than detectThresh in the control cell group.
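The two expression fractions can be reproduced by hand on a toy matrix. The base R sketch below follows the definitions given above (fraction of cells with values larger than detectThresh, inside and outside the cluster); the matrix, the cluster labels and the helper `exprPerc` are made up for illustration.

```r
# Toy expression matrix: 3 genes x 6 cells, with one cluster label per cell.
expr <- matrix(
  c(5, 3, 0, 0, 1, 0,
    0, 0, 2, 4, 0, 0,
    1, 1, 1, 1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:6))
)
cluster <- c("c1", "c1", "c2", "c2", "c3", "c3")
detectThresh <- 0

# Fraction of the selected cells in which each gene exceeds the threshold.
exprPerc <- function(mat, cells, thresh) {
  rowMeans(mat[, cells, drop = FALSE] > thresh)
}

# Cluster "c1" against all other cells as control.
inCluster <- cluster == "c1"
clusterExprPerc <- exprPerc(expr, inCluster, detectThresh)
controlExprPerc <- exprPerc(expr, !inCluster, detectThresh)
rbind(clusterExprPerc, controlExprPerc)
```

A gene such as geneA, detected in every cluster cell but in few control cells, is the kind of marker that minClustExprPerc and maxCtrlExprPerc are meant to retain.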
The input SingleCellExperiment object with metadata(inSCE)$findMarker updated with a data.table of the up-regulated DEGs for each cluster.
runDEAnalysis, getFindMarkerTopTable, plotFindMarkerHeatmap
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runFindMarker(mouseBrainSubsetSCE, useAssay = "logcounts", cluster = "level1class")
Run GSVA analysis on a SingleCellExperiment object
runGSVA( inSCE, useAssay = "logcounts", resultNamePrefix = NULL, geneSetCollectionName, ... )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Indicate which assay to use. The default is "logcounts" |
resultNamePrefix |
Character. Prefix for the name of the GSVA results
which will be stored in the reducedDim slot of |
geneSetCollectionName |
Character. The name of the gene set collection to use. |
... |
Parameters to pass to gsva() |
A SingleCellExperiment object with pathway activity scores from GSVA stored in reducedDim as GSVA_geneSetCollectionName_Scores.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, assayName = "logcounts")
gs1 <- rownames(sce)[seq(10)]
gs2 <- rownames(sce)[seq(11,20)]
gs <- list("geneset1" = gs1, "geneset2" = gs2)
sce <- importGeneSetsFromList(inSCE = sce,geneSetList = gs, by = "rownames")
sce <- runGSVA(inSCE = sce, geneSetCollectionName = "GeneSetCollection",
               useAssay = "logcounts")
Harmony is an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions.
runHarmony( inSCE, useAssay = NULL, useReducedDim = NULL, batch = "batch", reducedDimName = "HARMONY", nComponents = 50, lambda = 0.1, theta = 5, sigma = 0.1, nIter = 10, seed = 12345, verbose = TRUE, ... )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default |
useReducedDim |
A single character indicating the name of the reducedDim
to be used. It is recommended to use a reducedDim instead of a full assay as
using an assay might cause the algorithm to not converge and throw error.
Specifying this will ignore |
batch |
A single character indicating a field in |
reducedDimName |
A single character. The name for the corrected
low-dimensional representation. Will be saved to |
nComponents |
An integer. The number of PCs to use and generate.
Default |
lambda |
A Numeric scalar. Ridge regression penalty parameter. Must be
strictly positive. Smaller values result in more aggressive correction.
Default |
theta |
A Numeric scalar. Diversity clustering penalty parameter. Larger
values of theta result in more diverse clusters. theta=0 does not encourage
any diversity. Default |
sigma |
A Numeric scalar. Width of soft kmeans clusters. Larger values
of sigma result in cells assigned to more clusters. Smaller values of sigma
make soft kmeans cluster approach hard clustering. Default |
nIter |
An integer. The max number of iterations to perform. Default
|
seed |
Set seed for reproducibility. Default is |
verbose |
Whether to print progress messages. Default |
... |
Other arguments passed to |
Since some of the arguments of HarmonyMatrix are controlled by this wrapper function, the additional arguments users can work with only include: nclust, tau, block.size, max.iter.cluster, epsilon.cluster, epsilon.harmony, plot_convergence, reference_values and cluster_prior.
The input SingleCellExperiment object with reducedDim(inSCE, reducedDimName) updated.
Ilya Korsunsky, et al., 2019
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
## Not run: 
if (require("harmony"))
    sceCorr <- runHarmony(sceBatches)
## End(Not run)
Perform KMeans clustering on a SingleCellExperiment object, with kmeans.
runKMeans( inSCE, nCenters, useReducedDim = "PCA", clusterName = "KMeans_cluster", nComp = 10, nIter = 10, nStart = 1, seed = 12345, algorithm = c("Hartigan-Wong", "Lloyd", "MacQueen") )
inSCE |
A SingleCellExperiment object. |
nCenters |
An |
useReducedDim |
A single |
clusterName |
A single |
nComp |
An |
nIter |
An |
nStart |
An |
seed |
An |
algorithm |
A single |
The input SingleCellExperiment object with factor cluster labeling updated in colData(inSCE)[[clusterName]].
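The arguments map naturally onto those of `stats::kmeans`. The base R sketch below clusters a toy PCA-like matrix using the first nComp components; the toy data are made up, and the exact way the wrapper forwards its arguments is an assumption.

```r
set.seed(12345)
# Toy PCA-like matrix: 60 cells x 15 components, two shifted groups of cells.
pcs <- rbind(
  matrix(rnorm(30 * 15, mean = 0), ncol = 15),
  matrix(rnorm(30 * 15, mean = 4), ncol = 15)
)
rownames(pcs) <- paste0("cell", seq_len(nrow(pcs)))

nCenters <- 2
nComp <- 10
nIter <- 10
nStart <- 1

# Cluster on the first nComp components only.
km <- kmeans(pcs[, seq_len(nComp)], centers = nCenters,
             iter.max = nIter, nstart = nStart, algorithm = "Hartigan-Wong")

# Store labels as a factor, as described for colData(inSCE)[[clusterName]].
clusterLabels <- factor(km$cluster)
table(clusterLabels)
```

Raising nStart runs several random initializations and keeps the best one, which can help when clusters are less clearly separated than in this toy case.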
data("mouseBrainSubsetSCE")
mouseBrainSubsetSCE <- runKMeans(mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts", nCenters = 2)
Limma's batch effect removal function fits a linear model to the data, then removes the component due to the batch effects.
runLimmaBC(inSCE, useAssay = "logcounts", assayName = "LIMMA", batch = "batch")
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default |
assayName |
A single character. The name for the corrected assay. Will
be saved to |
batch |
A single character indicating a field in |
The input SingleCellExperiment object with assay(inSCE, assayName) updated.
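The idea of fitting a linear model and removing the batch component can be sketched in base R. The example below is a simplified illustration on toy data, assuming batch is the only term in the model; it uses `lm.fit` with sum-to-zero batch coding and is not the limma implementation itself.

```r
set.seed(12345)
# Toy log-expression matrix: 4 genes x 12 cells, two batches of 6 cells.
batch <- factor(rep(c("b1", "b2"), each = 6))
logMat <- matrix(rnorm(4 * 12), nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:12)))
logMat[, batch == "b2"] <- logMat[, batch == "b2"] + 2  # additive batch shift

# Sum-to-zero coding: batch effects are deviations from the overall level.
design <- model.matrix(~ batch, contrasts.arg = list(batch = "contr.sum"))
batchCols <- design[, -1, drop = FALSE]

# Least-squares fit for all genes at once (cells in rows after t()).
fit <- lm.fit(x = design, y = t(logMat))
batchCoef <- fit$coefficients[-1, , drop = FALSE]

# Subtract the fitted batch component, keep the intercept and residuals.
corrected <- logMat - t(batchCols %*% batchCoef)

# Per-gene batch means now coincide.
rowMeans(corrected[, batch == "b1"]) - rowMeans(corrected[, batch == "b2"])
```

Only the fitted batch term is subtracted, so within-batch variation between cells is left untouched.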
Gordon K Smyth, et al., 2003
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceCorr <- runLimmaBC(sceBatches)
MNN is designed for batch correction of single-cell RNA-seq data where the batches are partially confounded with biological conditions of interest. It does so by identifying pairs of mutual nearest neighbors (MNN) in the high-dimensional log-expression space. For each MNN pair, a pairwise correction vector is computed by applying a Gaussian smoothing kernel with bandwidth 'sigma'.
runMNNCorrect( inSCE, useAssay = "logcounts", batch = "batch", assayName = "MNN", k = 20L, propK = NULL, sigma = 0.1, cosNormIn = TRUE, cosNormOut = TRUE, varAdj = TRUE, BPPARAM = BiocParallel::SerialParam() )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default |
batch |
A single character indicating a field in |
assayName |
A single character. The name for the corrected assay. Will
be saved to |
k |
An integer scalar specifying the number of nearest neighbors to
consider when identifying MNNs. See "See Also". Default |
propK |
A numeric scalar in (0, 1) specifying the proportion of cells in
each dataset to use for mutual nearest neighbor searching. See "See Also".
Default |
sigma |
A numeric scalar specifying the bandwidth of the Gaussian
smoothing kernel used to compute the correction vector for each cell. See
"See Also". Default |
cosNormIn |
A logical scalar indicating whether cosine normalization
should be performed on the input data prior to calculating distances between
cells. See "See Also". Default |
cosNormOut |
A logical scalar indicating whether cosine normalization
should be performed prior to computing corrected expression values. See "See
Also". Default |
varAdj |
A logical scalar indicating whether variance adjustment should
be performed on the correction vectors. See "See Also". Default |
BPPARAM |
A BiocParallelParam object specifying whether the PCA and nearest-neighbor searches should be parallelized. |
The input SingleCellExperiment object with assay(inSCE, assayName) updated.
Haghverdi L, Lun ATL, et al., 2018
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceCorr <- runMNNCorrect(sceBatches)
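To make the idea of mutual nearest neighbors concrete, the following base R sketch finds MNN pairs between two toy batches with k = 1. The coordinates and the variable names (batch1, batch2, mnnPairs) are invented for illustration. The sketch covers only the pairing step; it does not reproduce the cosine normalization, kernel smoothing, or variance adjustment that runMNNCorrect can apply.

```r
# Toy data: rows are cells, columns are two expression dimensions (made up).
batch1 <- rbind(c(0, 0), c(5, 5), c(10, 0))
batch2 <- rbind(c(0.5, 0.2), c(5.2, 4.9))

# Cross-batch Euclidean distances: rows = batch1 cells, columns = batch2 cells.
n1 <- nrow(batch1)
n2 <- nrow(batch2)
d <- as.matrix(dist(rbind(batch1, batch2)))[seq_len(n1), n1 + seq_len(n2)]

# Nearest neighbor (k = 1) in the other batch, searched in both directions.
nn1 <- unname(apply(d, 1, which.min))  # for each batch1 cell: closest batch2 cell
nn2 <- unname(apply(d, 2, which.min))  # for each batch2 cell: closest batch1 cell

# A pair is "mutual" when each cell is the other's nearest neighbor.
isMutual <- nn2[nn1] == seq_len(n1)
mnnPairs <- data.frame(batch1 = which(isMutual), batch2 = nn1[isMutual])
print(mnnPairs)
```

In this toy case the third batch1 cell stays unpaired, because its nearest batch2 cell is itself closer to a different batch1 cell.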
Generates and stores variability data in the input SingleCellExperiment object, using the modelGeneVar method. Also selects a specified number of top HVGs and stores the logical selection in rowData.
runModelGeneVar(inSCE, useAssay = "logcounts")
inSCE |
A SingleCellExperiment object |
useAssay |
A character string to specify an assay to compute variable
features from. Default |
inSCE updated with variable feature metrics in rowData
Irzam Sarfraz
runFeatureSelection, runSeuratFindHVG, getTopHVG, plotTopHVG
data("scExample", package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, "logcounts")
sce <- runModelGeneVar(sce)
hvf <- getTopHVG(sce, method = "modelGeneVar", hvgNumber = 10, useFeatureSubset = NULL)
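The idea of storing a per-gene variability metric together with a logical selection can be pictured with a small base R sketch. It ranks genes by raw variance, which is a deliberate simplification: modelGeneVar fits a mean-variance trend rather than ranking raw variances. The toy matrix and the names geneInfo, isTopHVG and nTop are assumptions made for illustration, not part of the singleCellTK API.

```r
# Toy log-expression matrix: 5 genes (rows) x 4 cells (columns), values made up.
logExpr <- rbind(
  geneA = c(1, 1, 1, 1),
  geneB = c(0, 4, 0, 4),
  geneC = c(2, 3, 2, 3),
  geneD = c(0, 0, 0, 6),
  geneE = c(5, 5, 4, 5)
)

# Per-gene variability metric (plain variance here, as a stand-in).
geneVar <- apply(logExpr, 1, var)

# Flag the nTop most variable genes with a logical vector, one entry per gene.
nTop <- 2
isTopHVG <- rank(-geneVar, ties.method = "first") <= nTop

# A rowData-like table holding both the metric and the selection.
geneInfo <- data.frame(variance = geneVar, isTopHVG = isTopHVG)
print(geneInfo)
```

Keeping the selection as a logical column, rather than subsetting the matrix, leaves the full gene set available for later steps.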
Wrapper function to run any of the integrated normalization/transformation methods in the singleCellTK. The available methods include 'LogNormalize', 'CLR', 'RC' and 'SCTransform' from Seurat, and 'logNormCounts' and 'CPM' from scater. Additionally, users can 'scale' using Z-score, 'transform' using log2, log1p and sqrt, add 'pseudocounts' and trim the final matrices between a range of values.
runNormalization( inSCE, useAssay = "counts", outAssayName = "logcounts", normalizationMethod = "logNormCounts", scale = FALSE, seuratScaleFactor = 10000, transformation = NULL, pseudocountsBeforeNorm = NULL, pseudocountsBeforeTransform = NULL, trim = NULL, verbose = TRUE )
inSCE |
Input |
useAssay |
Specify the name of the assay that should be used. |
outAssayName |
Specify the name of the new output assay. |
normalizationMethod |
Specify a normalization method from 'LogNormalize',
'CLR', 'RC' and 'SCTransform' from Seurat or 'logNormCounts' and 'CPM' from
scater packages. Default |
scale |
Logical value indicating if the data should be scaled using
Z.Score. Default |
seuratScaleFactor |
Specify the 'scaleFactor' argument if a Seurat
normalization method is selected. Default is |
transformation |
Specify the transformation options to run on the
selected assay. Options include 'log2' (base 2 log transformation),
'log1p' (natural log + 1 transformation) and 'sqrt' (square root). Default
value is |
pseudocountsBeforeNorm |
Specify a numeric pseudo value that should be added
to the assay before normalization is performed. Default is |
pseudocountsBeforeTransform |
Specify a numeric pseudo value that should be
added to the assay before transformation is run. Default is |
trim |
Specify a vector of two numeric values that should be used
as the upper and lower trim values to trim the assay between these two
values. For example, |
verbose |
Logical value indicating if progress messages should be
displayed to the user. Default is |
Output SCE object with new normalized/transformed assay stored.
data(sce_chcl, package = "scds")
sce_chcl <- runNormalization(
  inSCE = sce_chcl,
  normalizationMethod = "LogNormalize",
  useAssay = "counts",
  outAssayName = "logcounts")
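The optional steps described above (normalization, pseudocount, transformation, scaling, trimming) can be sketched in base R on a toy matrix. This is only an illustration of what each step does to the values; the order shown, the trim bounds, and the variable names (pseudocount, lowerTrim, upperTrim) are assumptions, not a description of how runNormalization is implemented internally.

```r
# Toy count matrix: 3 genes (rows) x 3 cells (columns), values made up.
counts <- matrix(
  c(0, 10, 90,
    5, 5, 40,
    20, 30, 50),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# 1. Library-size normalization to counts per million (CPM-like).
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# 2. Add a pseudocount, then apply a base 2 log transformation.
pseudocount <- 1
logCpm <- log2(cpm + pseudocount)

# 3. Z-score scaling: center and scale each gene across cells.
zScores <- t(scale(t(logCpm)))

# 4. Trim the scaled values to a fixed range (hypothetical bounds).
lowerTrim <- -1
upperTrim <- 1
trimmed <- pmin(pmax(zScores, lowerTrim), upperTrim)
print(round(trimmed, 3))
```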
A wrapper function for addPerCellQC. Calculate general quality control metrics for each cell in the count matrix.
runPerCellQC( inSCE, useAssay = "counts", mitoGeneLocation = "rownames", mitoRef = c(NULL, "human", "mouse"), mitoIDType = c("ensembl", "symbol", "entrez", "ensemblTranscriptID"), mitoPrefix = "MT-", mitoID = NULL, collectionName = NULL, geneSetList = NULL, geneSetListLocation = "rownames", geneSetCollection = NULL, percent_top = c(50, 100, 200, 500), use_altexps = FALSE, flatten = TRUE, detectionLimit = 0, BPPARAM = BiocParallel::SerialParam() )
inSCE |
A SingleCellExperiment object. |
useAssay |
A string specifying which assay in the SCE to use. Default
|
mitoGeneLocation |
Character. Describes the location within |
mitoRef |
Character. The species used to extract mitochondrial gene IDs
from the built-in mitochondrial gene set in SCTK. Available species options are
|
mitoIDType |
Character. Type of mitochondrial gene ID. SCTK supports
|
mitoPrefix |
Character. The prefix used to get mitochondrial genes from
either |
mitoID |
Character. A vector of mitochondrial genes to be quantified. |
collectionName |
Character. Name of a |
geneSetList |
List of gene sets to be quantified. The genes in the
assays will be matched to the genes in the list based on
|
geneSetListLocation |
Character or numeric vector. If set to
|
geneSetCollection |
Class of |
percent_top |
An integer vector. Each element is treated as a number of
top genes to compute the percentage of library size occupied by the most
highly expressed genes in each cell. Default |
use_altexps |
Logical scalar indicating whether QC statistics should
be computed for alternative Experiments in |
flatten |
Logical scalar indicating whether the nested
DataFrame-class in the output should be flattened. Default
|
detectionLimit |
A numeric scalar specifying the lower detection limit
for expression. Default |
BPPARAM |
A BiocParallelParam object specifying whether the QC
calculations should be parallelized. Default
|
This function allows multiple ways to import mitochondrial genes and quantify their expression in cells. mitoGeneLocation is required for all methods to point to the location within the inSCE object that stores the mitochondrial gene IDs or symbols. The various ways mito genes can be specified are:
A combination of the mitoRef and mitoIDType parameters can be used to load pre-built mitochondrial gene sets stored in the SCTK package. These parameters are used in the importMitoGeneSet function.
The mitoPrefix parameter can be used to search for features matching a particular pattern. The default pattern is an "MT-" at the beginning of the ID.
The mitoID parameter can be used to directly supply a vector of mitochondrial gene IDs or names. Only features that exactly match items in this vector will be included in the mitochondrial gene set.
A SingleCellExperiment object with cell QC metrics added to the colData slot.
addPerCellQC, plotRunPerCellQCResults, runCellQC
data(scExample, package = "singleCellTK")
mito.ix <- grep("^MT-", rowData(sce)$feature_name)
geneSet <- list("Mito" = rownames(sce)[mito.ix])
sce <- runPerCellQC(sce, geneSetList = geneSet)
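The prefix-based and ID-based ways of defining a mitochondrial gene set, and the kind of per-cell metrics that result, can be sketched in base R. The toy counts, the helper percentTop, and the variable names are assumptions for illustration; the metrics that runPerCellQC actually adds to colData come from addPerCellQC and carry their own names.

```r
# Toy count matrix: 4 genes (rows) x 2 cells (columns), values made up.
counts <- matrix(
  c(5, 0, 20, 75,
    1, 9, 40, 50),
  nrow = 4,
  dimnames = list(c("MT-CO1", "MT-ND1", "ACTB", "GAPDH"), c("cell1", "cell2"))
)

# Prefix approach: features whose ID starts with "MT-".
mitoByPrefix <- grep("^MT-", rownames(counts), value = TRUE)

# ID approach: only exact matches to a supplied vector are kept.
mitoByID <- intersect(c("MT-CO1", "MT-ATP6"), rownames(counts))

# Percentage of each cell's counts that come from the mitochondrial set.
mitoPercent <- colSums(counts[mitoByPrefix, , drop = FALSE]) / colSums(counts) * 100

# Percentage of library size taken up by the n most highly expressed genes.
percentTop <- function(x, n) {
  top <- sort(x, decreasing = TRUE)[seq_len(min(n, length(x)))]
  sum(top) / sum(x) * 100
}
top2Percent <- apply(counts, 2, percentTop, n = 2)

print(mitoPercent)
print(top2Percent)
```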
SCANORAMA is analogous to computer vision algorithms for panorama stitching that identify images with overlapping content and merge these into a larger panorama.
runSCANORAMA( inSCE, useAssay = "logcounts", batch = "batch", assayName = "SCANORAMA", SIGMA = 15, ALPHA = 0.1, KNN = 20, approx = TRUE )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Scanorama requires a transformed normalized expression
assay. Default |
batch |
A single character indicating a field in |
assayName |
A single character. The name for the corrected assay. Will
be saved to |
SIGMA |
A numeric scalar. Algorithmic parameter, correction smoothing
parameter on Gaussian kernel. Default |
ALPHA |
A numeric scalar. Algorithmic parameter, alignment score
minimum cutoff. Default |
KNN |
An integer. Algorithmic parameter, number of nearest neighbors to
use for matching. Default |
approx |
Boolean. Use approximate nearest neighbors, greatly speeds up
matching runtime. Default |
The input SingleCellExperiment object with assay(inSCE, assayName) updated.
Brian Hie et al, 2019
## Not run: 
data('sceBatches', package = 'singleCellTK')
logcounts(sceBatches) <- log1p(counts(sceBatches))
# Use the assay created on the previous line
sceCorr <- runSCANORAMA(sceBatches, useAssay = "logcounts")
## End(Not run)
runScanpyFindClusters: Computes the clusters from the input sce object and stores them back in the sce object.
runScanpyFindClusters( inSCE, useAssay = "scanpyScaledData", useReducedDim = "scanpyPCA", nNeighbors = 10, dims = 40, method = c("leiden", "louvain"), colDataName = NULL, resolution = 1, niterations = -1, flavor = "vtraag", use_weights = FALSE, cor_method = "pearson", inplace = TRUE, externalReduction = NULL, seed = 12345 )
inSCE |
(sce) object from which clusters should be computed and stored in |
useAssay |
Assay containing scaled counts to use for clustering. |
useReducedDim |
Reduction method to use for computing clusters.
Default |
nNeighbors |
The size of local neighborhood (in terms of number of
neighboring data points) used for manifold approximation. Larger values
result in more global views of the manifold, while smaller values result in
more local data being preserved. Default |
dims |
numeric value of how many components to use for computing
clusters. Default |
method |
selected method to compute clusters. One of "louvain",
and "leiden". Default |
colDataName |
Specify the name to give to this clustering result.
Default is |
resolution |
A parameter value controlling the coarseness of the
clustering. Higher values lead to more clusters. Default |
niterations |
How many iterations of the Leiden clustering method to
perform. Positive values above 2 define the total number of iterations to
perform, -1 has the method run until it reaches its optimal clustering.
Default |
flavor |
Choose between two packages for computing the clustering.
Default |
use_weights |
Boolean. Use weights from knn graph. Default |
cor_method |
correlation method to use. Options are ‘pearson’,
‘kendall’, and ‘spearman’. Default |
inplace |
If True, adds dendrogram information to annData object,
else this function returns the information. Default |
externalReduction |
Pass DimReduc object if PCA computed through
other libraries. Default |
seed |
Specify numeric value to set as a seed. Default |
Updated sce object which now contains the computed clusters
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
## End(Not run)
runScanpyFindHVG: Finds highly variable genes and stores them in the input sce object.
runScanpyFindHVG( inSCE, useAssay = "scanpyNormData", method = c("seurat", "cell_ranger", "seurat_v3"), altExpName = "featureSubset", altExp = FALSE, hvgNumber = 2000, minMean = 0.0125, maxMean = 3, minDisp = 0.5, maxDisp = Inf )
inSCE |
(sce) object to compute highly variable genes from and to store back to it |
useAssay |
Specify the name of the assay to use for computation of variable genes. It is recommended to use log normalized data, except when flavor='seurat_v3', in which case count data is expected. |
method |
selected method to use for computation of highly variable
genes. One of |
altExpName |
Character. Name of the alternative experiment object to
add if |
altExp |
Logical value indicating if the input object is an
altExperiment. Default |
hvgNumber |
numeric value of how many genes to select as highly
variable. Default |
minMean |
If n_top_genes is not None, this and all other cutoffs for
the means and the normalized dispersions are ignored. Ignored if
flavor='seurat_v3'. Default |
maxMean |
If n_top_genes is not None, this and all other cutoffs for
the means and the normalized dispersions are ignored. Ignored if
flavor='seurat_v3'. Default |
minDisp |
If n_top_genes is not None, this and all other cutoffs for
the means and the normalized dispersions are ignored. Ignored if
flavor='seurat_v3'. Default |
maxDisp |
If n_top_genes is not None, this and all other cutoffs for
the means and the normalized dispersions are ignored. Ignored if
flavor='seurat_v3'. Default |
Updated SingleCellExperiment object with highly variable genes computation stored
getTopHVG, plotTopHVG
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
g <- getTopHVG(sce, method = "seurat", hvgNumber = 500)
## End(Not run)
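The interplay between a fixed number of genes and the mean/dispersion cutoffs can be sketched in base R. The helper selectHVG and its nTop argument are hypothetical names, and the sketch uses raw dispersion values, whereas Scanpy works with normalized dispersions; only the rule "when a number of top genes is given, the cutoffs are ignored" is taken from the argument descriptions above.

```r
# Hypothetical helper: rank-based selection when nTop is given,
# otherwise cutoff-based selection on mean and dispersion.
selectHVG <- function(geneMean, geneDisp, nTop = NULL,
                      minMean = 0.0125, maxMean = 3,
                      minDisp = 0.5, maxDisp = Inf) {
  if (!is.null(nTop)) {
    # Cutoffs are ignored: keep the nTop genes with the largest dispersion.
    return(rank(-geneDisp, ties.method = "first") <= nTop)
  }
  geneMean >= minMean & geneMean <= maxMean &
    geneDisp >= minDisp & geneDisp <= maxDisp
}

# Toy per-gene statistics (made up).
geneMean <- c(g1 = 0.01, g2 = 0.5, g3 = 2, g4 = 4)
geneDisp <- c(g1 = 2.0, g2 = 0.8, g3 = 0.3, g4 = 1.5)

byCutoff <- selectHVG(geneMean, geneDisp)            # cutoff-based selection
byNumber <- selectHVG(geneMean, geneDisp, nTop = 2)  # rank-based selection
print(byCutoff)
print(byNumber)
```

The two modes can disagree completely: a gene with very high dispersion but a mean outside the allowed range is dropped by the cutoffs yet kept by the rank-based selection.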
runScanpyFindMarkers
runScanpyFindMarkers( inSCE, nGenes = NULL, useAssay = "scanpyNormData", colDataName, group1 = "all", group2 = "rest", test = c("wilcoxon", "t-test", "t-test_overestim_var", "logreg"), corr_method = c("benjamini-hochberg", "bonferroni") )
inSCE |
Input |
nGenes |
The number of genes that appear in the returned tables. Defaults to all genes. |
useAssay |
Specify the name of the assay to use for computation of marker genes. It is recommended to use log normalized assay. |
colDataName |
colData to use as the key of the observations grouping to consider. |
group1 |
Name of group1. Subset of groups, to which comparison shall be restricted, or 'all' (default), for all groups. |
group2 |
Name of group2. If 'rest', compare each group to the union of the rest of the group. If a group identifier, compare with respect to this group. Default is 'rest' |
test |
Test to use for DE. Default |
corr_method |
p-value correction method. Used only for 't-test', 't-test_overestim_var', and 'wilcoxon'. |
A SingleCellExperiment object that contains marker genes populated in a data.frame stored inside metadata slot.
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyFindMarkers(sce, colDataName = "Scanpy_louvain_1")
## End(Not run)
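The "one group versus the rest" comparison with a Wilcoxon test and Benjamini-Hochberg correction can be illustrated with base R's wilcox.test and p.adjust. This is a rough analogue on simulated values, not Scanpy's implementation; the matrix, the cluster labels, and the artificial shift applied to gene1 are all invented for the sketch.

```r
set.seed(12345)

# Toy expression matrix: 4 genes (rows) x 12 cells (columns), simulated.
expr <- matrix(rnorm(4 * 12), nrow = 4,
               dimnames = list(paste0("gene", 1:4), NULL))
cluster <- rep(c("A", "B"), each = 6)

# Make gene1 a clear marker of cluster A by shifting its values upward.
expr["gene1", cluster == "A"] <- expr["gene1", cluster == "A"] + 100

# Group "A" versus the rest: one Wilcoxon rank-sum test per gene.
pvals <- apply(expr, 1, function(x) {
  wilcox.test(x[cluster == "A"], x[cluster != "A"])$p.value
})

# Benjamini-Hochberg correction across genes.
padj <- p.adjust(pvals, method = "BH")

markers <- data.frame(gene = names(pvals), pvalue = pvals, padj = padj)
print(markers[order(markers$pvalue), ])
```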
runScanpyNormalizeData: Wrapper for the NormalizeData() function from the scanpy library. Normalizes the sce object according to the input parameters.
runScanpyNormalizeData( inSCE, useAssay, targetSum = 10000, maxFraction = 0.05, normAssayName = "scanpyNormData" )
inSCE |
(sce) object to normalize |
useAssay |
Assay containing raw counts to use for normalization. |
targetSum |
If NULL, after normalization, each observation (cell) has a
total count equal to the median of total counts for observations (cells)
before normalization. Default |
maxFraction |
Include cells that have more counts than max_fraction of
the original total counts in at least one cell. Default |
normAssayName |
Name of new assay containing normalized data. Default
|
Normalized SingleCellExperiment object
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
rownames(sce) <- rowData(sce)$feature_name
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
## End(Not run)
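The effect of targetSum, including the median fallback when it is NULL, can be shown with a short base R sketch of total-count normalization. The helper name normalizeTotal and the toy counts are assumptions for illustration; the sketch ignores maxFraction and any additional processing the wrapper may perform.

```r
# Hypothetical helper: scale each cell (column) so its counts sum to targetSum.
# If targetSum is NULL, use the median of the per-cell totals instead.
normalizeTotal <- function(counts, targetSum = NULL) {
  totals <- colSums(counts)
  if (is.null(targetSum)) {
    targetSum <- median(totals)
  }
  sweep(counts, 2, totals / targetSum, "/")
}

# Toy count matrix: 3 genes (rows) x 3 cells (columns), values made up.
counts <- matrix(
  c(1, 3, 6,
    10, 20, 70,
    5, 5, 40),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

fixedTarget <- normalizeTotal(counts, targetSum = 10000)
medianTarget <- normalizeTotal(counts)

print(colSums(fixedTarget))
print(colSums(medianTarget))
```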
runScanpyPCA: Computes PCA on the input sce object and stores the calculated principal components within the sce object.
runScanpyPCA( inSCE, useAssay = "scanpyScaledData", reducedDimName = "scanpyPCA", nPCs = 50, method = c("arpack", "randomized", "auto", "lobpcg"), use_highly_variable = TRUE, seed = 12345 )
inSCE |
(sce) object on which to compute PCA |
useAssay |
Assay containing scaled counts to use in PCA. Default
|
reducedDimName |
Name of new reducedDims object containing Scanpy PCA.
Default |
nPCs |
numeric value of how many components to compute. Default
|
method |
selected method to use for computation of pca.
One of |
use_highly_variable |
boolean value of whether to use highly variable genes only. By default uses them if they have been determined beforehand. |
seed |
Specify numeric value to set as a seed. Default |
Updated SingleCellExperiment object which now contains the computed principal components
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
## End(Not run)
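What "PCA on scaled data, restricted to highly variable genes" amounts to can be sketched with base R's prcomp. This is an analogue for intuition only: the simulated matrix, the isHVG flag, and the use of prcomp rather than Scanpy's solvers are all assumptions of the sketch.

```r
set.seed(12345)

# Toy scaled assay: 20 genes (rows) x 6 cells (columns), simulated.
scaled <- matrix(rnorm(20 * 6), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cell", 1:6)))

# Hypothetical logical flag marking the highly variable genes.
isHVG <- rep(c(TRUE, FALSE), each = 10)

# prcomp expects observations in rows, so cells go in rows after transposing.
nPCs <- 3
pca <- prcomp(t(scaled[isHVG, ]), center = TRUE, scale. = FALSE, rank. = nPCs)

# Cell embeddings: one row per cell, one column per principal component.
pcs <- pca$x
print(dim(pcs))
```

The resulting cells-by-components matrix is the shape of object that a reducedDims slot holds.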
runScanpyScaleData: Scales the input sce object according to the input parameters.
runScanpyScaleData( inSCE, useAssay = "scanpyNormData", scaledAssayName = "scanpyScaledData" )
inSCE |
(sce) object to scale |
useAssay |
Assay containing normalized counts to scale. |
scaledAssayName |
Name of new assay containing scaled data. Default
|
Scaled SingleCellExperiment object
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
## End(Not run)
runScanpyTSNE: Computes tSNE from the given sce object and stores the tSNE computations back into the sce object.
runScanpyTSNE( inSCE, useAssay = NULL, useReducedDim = "scanpyPCA", reducedDimName = "scanpyTSNE", dims = 40, perplexity = 30, externalReduction = NULL, seed = 12345 )
inSCE |
(sce) object on which to compute the tSNE |
useAssay |
Specify name of assay to use. Default is |
useReducedDim |
selected reduction method to use for computing tSNE.
Default |
reducedDimName |
Name of new reducedDims object containing Scanpy tSNE.
Default |
dims |
Number of reduction components to use for tSNE computation.
Default |
perplexity |
Adjust the perplexity tuneable parameter for the underlying
tSNE call. Default |
externalReduction |
Pass DimReduc object if PCA computed through
other libraries. Default |
seed |
Specify numeric value to set as a seed. Default |
Updated sce object with tSNE computations stored
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyTSNE(sce, useReducedDim = "scanpyPCA")
## End(Not run)
runScanpyUMAP: Computes UMAP from the given sce object and stores the UMAP computations back into the sce object.
runScanpyUMAP( inSCE, useAssay = NULL, useReducedDim = "scanpyPCA", reducedDimName = "scanpyUMAP", dims = 40, minDist = 0.5, nNeighbors = 10, spread = 1, alpha = 1, gamma = 1, externalReduction = NULL, seed = 12345 )
inSCE |
(sce) object on which to compute the UMAP |
useAssay |
Specify name of assay to use. Default is |
useReducedDim |
Reduction to use for computing UMAP.
Default is |
reducedDimName |
Name of new reducedDims object containing Scanpy UMAP.
Default |
dims |
Numerical value of how many reduction components to use for UMAP
computation. Default |
minDist |
Sets the |
nNeighbors |
Sets the |
spread |
Sets the |
alpha |
Sets the |
gamma |
Sets the |
externalReduction |
Pass DimReduc object if PCA computed through
other libraries. Default |
seed |
Specify numeric value to set as a seed. Default |
Updated sce object with UMAP computations stored
data(scExample, package = "singleCellTK")
## Not run: 
sce <- runScanpyNormalizeData(sce, useAssay = "counts")
sce <- runScanpyFindHVG(sce, useAssay = "scanpyNormData", method = "seurat")
sce <- runScanpyScaleData(sce, useAssay = "scanpyNormData")
sce <- runScanpyPCA(sce, useAssay = "scanpyScaledData")
sce <- runScanpyFindClusters(sce, useReducedDim = "scanpyPCA")
sce <- runScanpyUMAP(sce, useReducedDim = "scanpyPCA")
## End(Not run)
A wrapper function for scDblFinder. Identify potential doublet cells based on simulations of putative doublet expression profiles. Generate a doublet score for each cell.
runScDblFinder( inSCE, sample = NULL, useAssay = "counts", nNeighbors = 50, simDoublets = max(10000, ncol(inSCE)), seed = 12345, BPPARAM = BiocParallel::SerialParam() )
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
useAssay |
A string specifying which assay in the SCE to use. Default
|
nNeighbors |
Number of nearest neighbors used to calculate density for
doublet detection. Default |
simDoublets |
Number of simulated doublets created for doublet
detection. Default |
seed |
Seed for the random number generator, can be set to |
BPPARAM |
A |
This function is a wrapper function for scDblFinder. runScDblFinder runs scDblFinder for each sample within inSCE iteratively. The resulting doublet scores for all cells will be appended to the colData of inSCE.
A SingleCellExperiment object with the scDblFinder QC outputs added to the colData slot.
Lun ATL (2018). Detecting doublet cells with scran. https://ltla.github.io/SingleCellThoughts/software/doublet_detection/bycell.html
scDblFinder, plotScDblFinderResults, runCellQC
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runScDblFinder(sce)
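The notion of "simulating putative doublet expression profiles" can be illustrated in base R by summing the counts of randomly chosen pairs of cells. This is a minimal sketch of the general idea only; scDblFinder's own simulation and scoring are more elaborate, and the toy counts and the names nSim, parents and simDoublets are assumptions.

```r
set.seed(12345)

# Toy count matrix: 5 genes (rows) x 8 cells (columns), simulated.
counts <- matrix(rpois(5 * 8, lambda = 3), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:8)))

# Draw nSim pairs of distinct parent cells (2 x nSim matrix of column indices).
nSim <- 4
parents <- replicate(nSim, sample(ncol(counts), 2))

# An artificial doublet is the sum of the two parents' count profiles.
simDoublets <- counts[, parents[1, ]] + counts[, parents[2, ]]
colnames(simDoublets) <- paste0("doublet", seq_len(nSim))

print(simDoublets)
```

Real cells whose profiles sit close to many such simulated profiles are the ones a doublet detector would score highly.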
The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variation and merge multiple scRNA-seq datasets.
runSCMerge( inSCE, useAssay = "logcounts", batch = "batch", assayName = "scMerge", hvgExprs = "counts", seg = NULL, kmeansK = NULL, cellType = NULL, BPPARAM = BiocParallel::SerialParam() )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Default |
batch |
A single character indicating a field in
|
assayName |
A single character. The name for the corrected assay. Will
be saved to |
hvgExprs |
A single character. The assay to be used for highly
variable genes identification. Default |
seg |
A vector of gene names or indices that specifies SEG (Stably
Expressed Genes) set as negative control. Pre-defined dataset with human and
mouse SEG lists is available with |
kmeansK |
An integer vector. Indicating the kmeans' K-value for each
batch (i.e. how many subclusters in each batch should exist), in order to
construct pseudo-replicates. The length of |
cellType |
A single character. A string indicating a field in
|
BPPARAM |
A BiocParallelParam object specifying whether the computation
should be parallelized. Default |
The input SingleCellExperiment object with assay(inSCE, assayName) updated.
Hoa, et al., 2020
data('sceBatches', package = 'singleCellTK')
## Not run: 
logcounts(sceBatches) <- log1p(counts(sceBatches))
sceCorr <- runSCMerge(sceBatches)
## End(Not run)
Perform SNN graph clustering on a SingleCellExperiment object, with graph construction by buildSNNGraph and graph clustering by the "igraph" package.
runScranSNN( inSCE, useReducedDim = "PCA", useAssay = NULL, useAltExp = NULL, altExpAssay = "counts", altExpRedDim = NULL, clusterName = "cluster", k = 14, nComp = 10, weightType = "jaccard", algorithm = c("louvain", "leiden", "walktrap", "infomap", "fastGreedy", "labelProp", "leadingEigen"), BPPARAM = BiocParallel::SerialParam(), seed = 12345, ... )
inSCE |
A SingleCellExperiment object. |
useReducedDim |
A single |
useAssay |
A single |
useAltExp |
A single |
altExpAssay |
A single |
altExpRedDim |
A single |
clusterName |
A single |
k |
An |
nComp |
An |
weightType |
A single |
algorithm |
A single |
BPPARAM |
A |
seed |
Random seed for reproducibility of results. Default |
... |
Other optional parameters passed to the |
Different graph-based clustering algorithms have diverse sets of parameters that users can tweak. The help information can be found here:
for "louvain", see function help cluster_louvain
for "leiden", see function help cluster_leiden
for "walktrap", see function help cluster_walktrap
for "infomap", see function help cluster_infomap
for "fastGreedy", see function help cluster_fast_greedy
for "labelProp", see function help cluster_label_prop
for "leadingEigen", see function help cluster_leading_eigen
The Scran SNN building method can work on specified nComp components. When users specify input matrix by useAssay or useAltExp + altExpAssay, the method will generate nComp components and use them all. When specifying useReducedDim or useAltExp + altExpRedDim, this function will subset the top nComp components and pass them to the method.
The input SingleCellExperiment object with factor cluster labeling updated in colData(inSCE)[[clusterName]].
Aaron Lun, et al., 2016
data("mouseBrainSubsetSCE")
mouseBrainSubsetSCE <- runScranSNN(mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
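A shared nearest neighbor graph with Jaccard weights can be pictured with a small base R sketch: each cell gets a set of k nearest neighbors, and the edge weight between two cells is the overlap of their sets divided by the size of their union. The toy coordinates are invented, and counting each cell as a member of its own neighbor set is an assumption of this sketch; buildSNNGraph may define the sets and weights differently.

```r
# Toy reduced-dimension coordinates: 6 cells x 2 components, two obvious groups.
coords <- rbind(c(0, 0), c(0, 1), c(1, 0),
                c(10, 10), c(10, 11), c(11, 10))
n <- nrow(coords)
k <- 2

# k nearest neighbors of each cell (position 1 of the ordering is the cell itself).
d <- as.matrix(dist(coords))
nbrSets <- lapply(seq_len(n), function(i) {
  c(i, order(d[i, ])[2:(k + 1)])  # assumption: the set includes the cell itself
})

# Jaccard index between two neighbor sets.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Weighted adjacency matrix of the SNN graph.
w <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) w[i, j] <- jaccard(nbrSets[[i]], nbrSets[[j]])
  }
}
print(w)
```

Cells within the same group share all of their neighbors and get weight 1, while cells from different groups share none and get weight 0, which is the structure a community detection algorithm then partitions.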
scrublet
.A wrapper function that calls scrub_doublets
from python
module scrublet
. Simulates doublets from the observed data and uses
a k-nearest-neighbor classifier to calculate a continuous
scrublet_score
(between 0 and 1) for each transcriptome. The score
is automatically thresholded to generate scrublet_call
, a boolean
array that is TRUE
for predicted doublets and FALSE
otherwise.
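The relationship between the continuous score and the boolean call can be sketched as follows. The scores and the cut-off below are made up for illustration; Scrublet chooses its own threshold automatically, and the comparison direction shown here is an assumption about how the call is derived.

```r
# Hypothetical scores for six cells; real values come from runScrublet().
scrubletScore <- c(0.02, 0.10, 0.45, 0.07, 0.81, 0.30)

# Assumed cut-off for illustration only; Scrublet picks its own threshold.
threshold <- 0.25

# TRUE marks a predicted doublet, FALSE a predicted singlet.
scrubletCall <- scrubletScore > threshold
table(scrubletCall)
```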
runScrublet( inSCE, sample = NULL, useAssay = "counts", simDoubletRatio = 2, nNeighbors = NULL, minDist = NULL, expectedDoubletRate = 0.1, stdevDoubletRate = 0.02, syntheticDoubletUmiSubsampling = 1, useApproxNeighbors = TRUE, distanceMetric = "euclidean", getDoubletNeighborParents = FALSE, minCounts = 3, minCells = 3L, minGeneVariabilityPctl = 85, logTransform = FALSE, meanCenter = TRUE, normalizeVariance = TRUE, nPrinComps = 30L, tsneAngle = NULL, tsnePerplexity = NULL, verbose = TRUE, seed = 12345 )
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default |
useAssay |
A string specifying which assay in the SCE to use. Default
|
simDoubletRatio |
Numeric. Number of doublets to simulate relative to
the number of observed transcriptomes. Default |
nNeighbors |
Integer. Number of neighbors used to construct the KNN
graph of observed transcriptomes and simulated doublets. If |
minDist |
Float. Determines how tightly UMAP packs points together. If
|
expectedDoubletRate |
The estimated doublet rate for the experiment.
Default |
stdevDoubletRate |
Uncertainty in the expected doublet rate. Default
|
syntheticDoubletUmiSubsampling |
Numeric. Rate for sampling UMIs when
creating synthetic doublets. If |
useApproxNeighbors |
Boolean. Use approximate nearest neighbor method
(annoy) for the KNN classifier. Default |
distanceMetric |
Character. Distance metric used when finding nearest
neighbors. See detail. Default |
getDoubletNeighborParents |
Boolean. If |
minCounts |
Numeric. Used for gene filtering prior to PCA. Genes
expressed at fewer than |
minCells |
Integer. Used for gene filtering prior to PCA. Genes
expressed at fewer than |
minGeneVariabilityPctl |
Numeric. Used for gene filtering prior to
PCA. Keep the most highly variable genes (in the top
|
logTransform |
Boolean. If |
meanCenter |
If |
normalizeVariance |
Boolean. If |
nPrinComps |
Integer. Number of principal components used to embed
the transcriptomes prior to k-nearest-neighbor graph construction.
Default |
tsneAngle |
Float. Determines angular size of a distant node as measured
from a point in the t-SNE plot. If |
tsnePerplexity |
Integer. The number of nearest neighbors that is used
in other manifold learning algorithms. If |
verbose |
Boolean. If |
seed |
Seed for the random number generator, can be set to |
For the list of valid values for distanceMetric
, see the
documentation for
annoy (if
useApproxNeighbors
is TRUE
) or
sklearn.neighbors.NearestNeighbors
(if useApproxNeighbors
is FALSE
).
A SingleCellExperiment object with
scrub_doublets
output appended to the colData slot. The columns
include scrublet_score
and scrublet_call
.
plotScrubletResults
, runCellQC
data(scExample, package = "singleCellTK")
## Not run:
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- runScrublet(sce)
## End(Not run)
runSeuratFindClusters Computes the clusters from the input sce object and stores them back in sce object
runSeuratFindClusters( inSCE, useAssay = "seuratNormData", useReduction = c("pca", "ica"), dims = 10, algorithm = c("louvain", "multilevel", "SLM"), groupSingletons = TRUE, resolution = 0.8, seed = 12345, externalReduction = NULL, verbose = TRUE )
inSCE |
(sce) object from which clusters should be computed and stored in |
useAssay |
Assay containing scaled counts to use for clustering. |
useReduction |
Reduction method to use for computing clusters. One of
"pca" or "ica". Default |
dims |
numeric value of how many components to use for computing
clusters. Default |
algorithm |
selected algorithm to compute clusters. One of "louvain",
"multilevel", or "SLM". Use |
groupSingletons |
boolean if singletons should be grouped together or
not. Default |
resolution |
Set the resolution parameter to find larger (value above 1)
or smaller (value below 1) number of communities. Default |
seed |
Specify the seed value. Default |
externalReduction |
Pass DimReduc object if PCA/ICA computed through
other libraries. Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Updated sce object which now contains the computed clusters
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
sce <- runSeuratFindClusters(sce, useAssay = "counts")
## End(Not run)
runSeuratFindHVG Find highly variable genes and store in the input sce object
runSeuratFindHVG( inSCE, useAssay = "counts", method = c("vst", "dispersion", "mean.var.plot"), hvgNumber = 2000, createFeatureSubset = "hvf", altExp = FALSE, verbose = TRUE )
inSCE |
(sce) object to compute highly variable genes from and to store back to it |
useAssay |
Specify the name of the assay to use for computation
of variable genes. It is recommended to use a raw counts assay with the
|
method |
selected method to use for computation of highly variable
genes. One of |
hvgNumber |
numeric value of how many genes to select as highly
variable. Default |
createFeatureSubset |
Specify a name of the subset to create
for the identified variable features. Default is |
altExp |
Logical value indicating if the input object is an
altExperiment. Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Updated SingleCellExperiment
object with highly variable genes
computation stored
runFeatureSelection
, runModelGeneVar
,
getTopHVG
, plotTopHVG
data(scExample, package = "singleCellTK")
sce <- runSeuratFindHVG(sce)
runSeuratFindMarkers
runSeuratFindMarkers( inSCE, cells1 = NULL, cells2 = NULL, group1 = NULL, group2 = NULL, allGroup = NULL, conserved = FALSE, test = "wilcox", onlyPos = FALSE, minPCT = 0.1, threshUse = 0.25, verbose = TRUE )
inSCE |
Input |
cells1 |
A |
cells2 |
A |
group1 |
Name of group1. |
group2 |
Name of group2. |
allGroup |
Name of all groups. |
conserved |
Logical value indicating if markers conserved between two
groups should be identified. Default is |
test |
Test to use for DE. Default |
onlyPos |
Logical value indicating if only positive markers should be returned. |
minPCT |
Numeric value indicating the minimum fraction of cells in which
genes must be detected (min.pct). Default is |
threshUse |
Numeric value indicating the logFC threshold: genes must show, on
average, at least an X-fold difference (log-scale) between the
two groups of cells. Default is |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
A SingleCellExperiment
object that contains marker genes
populated in a data.frame stored inside metadata slot.
runSeuratHeatmap Computes the heatmap plot object from the pca slot in the input sce object
runSeuratHeatmap( inSCE, useAssay, useReduction = c("pca", "ica"), dims = NULL, nfeatures = 30, cells = NULL, ncol = NULL, balanced = TRUE, fast = TRUE, combine = TRUE, raster = TRUE, externalReduction = NULL )
inSCE |
(sce) object from which to compute heatmap (pca should be computed) |
useAssay |
Specify name of the assay that will be scaled by this function. The output scaled assay will be used for computation of the heatmap. |
useReduction |
Reduction method to use for computing clusters. One of
"pca" or "ica". Default |
dims |
Number of components to generate heatmap plot objects. If
|
nfeatures |
Number of features to include in the heatmap. Default
|
cells |
Numeric value indicating the number of top cells to plot.
Default is |
ncol |
Numeric value indicating the number of columns to use for plot.
Default is |
balanced |
Plot equal number of genes with positive and negative scores.
Default is |
fast |
See DimHeatmap for more information. Default
|
combine |
See DimHeatmap for more information. Default
|
raster |
See DimHeatmap for more information. Default
|
externalReduction |
Pass DimReduc object if PCA/ICA computed through
other libraries. Default |
plot object
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
heatmap <- runSeuratHeatmap(sce, useAssay = "counts")
plotSeuratHeatmap(heatmap)
## End(Not run)
runSeuratICA Computes ICA on the input sce object and stores the calculated independent components within the sce object
runSeuratICA( inSCE, useAssay = "seuratScaledData", useFeatureSubset = NULL, scale = TRUE, reducedDimName = "seuratICA", nics = 20, seed = 12345, verbose = FALSE )
inSCE |
(sce) object on which to compute ICA |
useAssay |
Assay containing scaled counts to use in ICA. |
useFeatureSubset |
Subset of feature to use for dimension reduction. A
character string indicating a |
scale |
Logical scalar, whether to standardize the expression values
using |
reducedDimName |
Name of new reducedDims object containing Seurat ICA
Default |
nics |
Number of independent components to compute. Default |
seed |
Random seed for reproducibility of results.
Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
The features used for computation can be controlled by features or useFeatureSubset. When features is specified, scaling and dimensionality reduction are performed only on these features. When features is NULL but useFeatureSubset is specified, the function uses the features that the HVG list points to. If both parameters are NULL, the function checks whether Seurat's variable feature detection has been performed and uses those features if found. Otherwise, all features are used.
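The order of precedence described above can be written as a small decision function. This is a sketch of the documented logic only: chooseFeatures and its arguments are hypothetical names invented for this example, not part of the package.

```r
# Hypothetical helper mirroring the precedence described above.
#   features:    character vector of features, or NULL
#   hvgSubset:   features the named HVG list points to, or NULL
#   seuratHVG:   previously detected Seurat variable features, or NULL
#   allFeatures: every feature in the object (the final fallback)
chooseFeatures <- function(features = NULL, hvgSubset = NULL,
                           seuratHVG = NULL, allFeatures) {
  if (!is.null(features)) return(features)
  if (!is.null(hvgSubset)) return(hvgSubset)
  if (!is.null(seuratHVG)) return(seuratHVG)
  allFeatures
}

allGenes <- paste0("gene", 1:5)
chooseFeatures(allFeatures = allGenes)
chooseFeatures(hvgSubset = c("gene2", "gene4"), allFeatures = allGenes)
```

The first call falls through to all features; the second returns only the two genes in the supplied subset.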
Updated SingleCellExperiment
object which now contains the
computed independent components
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratICA(sce, useAssay = "counts")
## End(Not run)
runSeuratIntegration A wrapper function to Seurat Batch-Correction/Integration workflow.
runSeuratIntegration( inSCE, useAssay = "counts", batch, newAssayName = "SeuratIntegratedAssay", kAnchor, kFilter, kWeight, ndims = 10 )
inSCE |
Input |
useAssay |
Assay to batch-correct. |
batch |
Batch variable from |
newAssayName |
Assay name for the batch-corrected output assay. |
kAnchor |
Number of neighbours to use for finding the anchors in the FindIntegrationAnchors function. |
kFilter |
Number of neighbours to use for filtering the anchors in the FindIntegrationAnchors function. |
kWeight |
Number of neighbours to use when weighting the anchors in the IntegrateData function. |
ndims |
Number of dimensions to use. Default |
A SingleCellExperiment
object that contains the
batch-corrected assay inside the altExp
slot of the object
runSeuratJackStraw Compute jackstraw plot and store the computations in the input sce object
runSeuratJackStraw( inSCE, useAssay, dims = NULL, numReplicate = 100, propFreq = 0.025, externalReduction = NULL )
inSCE |
(sce) object on which to compute and store jackstraw plot |
useAssay |
Specify name of the assay to use for scaling. Assay name
provided against this parameter is scaled by the function and used
for the computation of JackStraw scores along with the reduced dimensions
specified by the |
dims |
Number of components to test in Jackstraw. If |
numReplicate |
Numeric value indicating the number of replicate
samplings to perform.
Default value is |
propFreq |
Numeric value indicating the proportion of data to randomly
permute for each replicate.
Default value is |
externalReduction |
Pass DimReduc object if PCA/ICA computed through
other libraries. Default |
Updated SingleCellExperiment
object with jackstraw
computations stored in it
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
sce <- runSeuratJackStraw(sce, useAssay = "counts")
## End(Not run)
runSeuratNormalizeData Wrapper for the NormalizeData() function from the Seurat library. Normalizes the sce object according to the input parameters.
runSeuratNormalizeData( inSCE, useAssay, normAssayName = "seuratNormData", normalizationMethod = "LogNormalize", scaleFactor = 10000, verbose = TRUE )
inSCE |
(sce) object to normalize |
useAssay |
Assay containing raw counts to use for normalization. |
normAssayName |
Name of new assay containing normalized data. Default
|
normalizationMethod |
selected normalization method. Default
|
scaleFactor |
numeric value that represents the scaling factor. Default
|
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Normalized SingleCellExperiment
object
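For intuition, the default "LogNormalize" method with a scale factor of 10000 can be reproduced on a toy matrix in base R. This is a sketch of the transformation as Seurat documents it (per-cell counts divided by the cell total, multiplied by the scale factor, then log1p); the matrix values and variable names are made up, and the wrapper itself delegates to Seurat rather than running this code.

```r
# Toy genes x cells count matrix (values are made up).
counts <- matrix(c(1, 0, 3,
                   2, 2, 0,
                   0, 5, 1), nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3)))
scaleFactor <- 10000

# Divide each column by its total, multiply by the scale factor,
# then apply the natural-log transform log(1 + x).
libSize <- colSums(counts)
normData <- log1p(sweep(counts, 2, libSize, "/") * scaleFactor)
round(normData, 3)
```

Zero counts stay at zero after the transform, and undoing the log shows every cell summing to the scale factor.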
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
## End(Not run)
runSeuratPCA Computes PCA on the input sce object and stores the calculated principal components within the sce object
runSeuratPCA( inSCE, useAssay = "seuratNormData", useFeatureSubset = "hvf", scale = TRUE, reducedDimName = "seuratPCA", nPCs = 20, seed = 12345, verbose = TRUE )
inSCE |
(sce) object on which to compute PCA |
useAssay |
Assay containing scaled counts to use in PCA. Default
|
useFeatureSubset |
Subset of feature to use for dimension reduction. A
character string indicating a |
scale |
Logical scalar, whether to standardize the expression values
using |
reducedDimName |
Name of new reducedDims object containing Seurat PCA.
Default |
nPCs |
numeric value of how many components to compute. Default
|
seed |
Random seed for reproducibility of results.
Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
The features used for computation can be controlled by features or useFeatureSubset. When features is specified, scaling and dimensionality reduction are performed only on these features. When features is NULL but useFeatureSubset is specified, the function uses the features that the HVG list points to. If both parameters are NULL, the function checks whether Seurat's variable feature detection has been performed and uses those features if found. Otherwise, all features are used.
Updated SingleCellExperiment
object which now contains the
computed principal components
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- setTopHVG(sce, method = "vst", featureSubsetName = "hvf")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
## End(Not run)
runSeuratScaleData Scales the input sce object according to the input parameters
runSeuratScaleData( inSCE, useAssay = "seuratNormData", scaledAssayName = "seuratScaledData", model = "linear", scale = TRUE, center = TRUE, scaleMax = 10, verbose = TRUE )
inSCE |
(sce) object to scale |
useAssay |
Assay containing normalized counts to scale. |
scaledAssayName |
Name of new assay containing scaled data. Default
|
model |
selected model to use for scaling data. Default |
scale |
boolean if data should be scaled or not. Default |
center |
boolean if data should be centered or not. Default |
scaleMax |
maximum numeric value to return for scaled data. Default
|
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Scaled SingleCellExperiment
object
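The combined effect of center, scale, and scaleMax can be illustrated in base R. This is a simplified sketch under the assumption that scaling means per-gene standardization followed by capping at scaleMax; the toy data and variable names are invented, and the actual work is done by Seurat's scaling routine.

```r
set.seed(1)
# Toy normalized matrix: 4 genes x 200 cells with bounded values.
normData <- matrix(runif(4 * 200), nrow = 4,
                   dimnames = list(paste0("gene", 1:4), NULL))
normData[1, 1] <- 100   # one extreme value to show the capping
scaleMax <- 10

# scale() works on columns, so transpose to standardize each gene (row).
scaled <- t(scale(t(normData), center = TRUE, scale = TRUE))

# Cap values that exceed scaleMax.
scaled[scaled > scaleMax] <- scaleMax
range(scaled)
```

Capping limits the influence of rare extreme values on downstream PCA.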
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
## End(Not run)
runSeuratSCTransform Runs the SCTransform function to transform/normalize the input data
runSeuratSCTransform( inSCE, normAssayName = "SCTCounts", useAssay = "counts", verbose = TRUE )
inSCE |
Input SingleCellExperiment object |
normAssayName |
Name for the output data assay. Default
|
useAssay |
Name for the input data assay. Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Updated SingleCellExperiment object containing the transformed data
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runSeuratSCTransform(mouseBrainSubsetSCE)
runSeuratTSNE Computes tSNE from the given sce object and stores the tSNE computations back into the sce object
runSeuratTSNE( inSCE, useReduction = c("pca", "ica"), reducedDimName = "seuratTSNE", dims = 10, perplexity = 30, externalReduction = NULL, seed = 1 )
inSCE |
(sce) object on which to compute the tSNE |
useReduction |
selected reduction algorithm to use for computing tSNE.
One of "pca" or "ica". Default |
reducedDimName |
Name of new reducedDims object containing Seurat tSNE
Default |
dims |
Number of reduction components to use for tSNE computation.
Default |
perplexity |
Adjust the perplexity tuneable parameter for the underlying
tSNE call. Default |
externalReduction |
Pass DimReduc object if PCA/ICA computed through
other libraries. Default |
seed |
Random seed for reproducibility of results.
Default |
Updated sce object with tSNE computations stored
runSeuratUMAP Computes UMAP from the given sce object and stores the UMAP computations back into the sce object
runSeuratUMAP( inSCE, useReduction = c("pca", "ica"), reducedDimName = "seuratUMAP", dims = 10, minDist = 0.3, nNeighbors = 30L, spread = 1, externalReduction = NULL, seed = 42, verbose = TRUE )
inSCE |
(sce) object on which to compute the UMAP |
useReduction |
Reduction to use for computing UMAP. One of "pca" or
"ica". Default is |
reducedDimName |
Name of new reducedDims object containing Seurat UMAP
Default |
dims |
Numerical value of how many reduction components to use for UMAP
computation. Default |
minDist |
Sets the |
nNeighbors |
Sets the |
spread |
Sets the |
externalReduction |
Pass DimReduc object if PCA/ICA computed through
other libraries. Default |
seed |
Random seed for reproducibility of results.
Default |
verbose |
Logical value indicating if informative messages should
be displayed. Default is |
Updated sce object with UMAP computations stored
data(scExample, package = "singleCellTK")
## Not run:
sce <- runSeuratNormalizeData(sce, useAssay = "counts")
sce <- runSeuratFindHVG(sce, useAssay = "counts")
sce <- runSeuratScaleData(sce, useAssay = "counts")
sce <- runSeuratPCA(sce, useAssay = "counts")
sce <- runSeuratFindClusters(sce, useAssay = "counts")
sce <- runSeuratUMAP(sce, useReduction = "pca")
## End(Not run)
SingleR works with a reference dataset where the cell type labeling is given. Given a reference dataset of samples (single-cell or bulk) with known labels, it assigns those labels to new cells from a test dataset based on similarities in their expression profiles.
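The idea of transferring labels by expression similarity can be sketched in a few lines of base R. This is a deliberately simplified illustration: it assigns each test cell the reference label with the highest Spearman correlation, whereas SingleR's actual scoring and fine-tuning are more involved. The reference profiles, test cells, and names are all hypothetical.

```r
# Hypothetical reference: expression of 5 genes for 2 labeled cell types.
ref <- cbind(Tcell = c(5, 4, 0, 1, 0),
             Bcell = c(0, 1, 5, 4, 1))
rownames(ref) <- paste0("gene", 1:5)

# Hypothetical test cells (genes x cells) with unknown labels.
test <- cbind(cellA = c(4, 5, 1, 0, 0),
              cellB = c(1, 0, 4, 5, 2))
rownames(test) <- rownames(ref)

# Spearman correlation of each test cell with each reference profile,
# then pick the best-scoring label per cell.
scores <- cor(test, ref, method = "spearman")
labels <- colnames(ref)[max.col(scores, ties.method = "first")]
names(labels) <- rownames(scores)
labels
```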
runSingleR( inSCE, useAssay = "logcounts", useSCERef = NULL, labelColName = NULL, useBltinRef = c("hpca", "bpe", "mp", "dice", "immgen", "mouse", "zeisel"), level = "fine", featureType = c("symbol", "ensembl"), labelByCluster = NULL )
inSCE |
SingleCellExperiment inherited object. Required. |
useAssay |
character. A string specifying which assay to use for expression profile identification. Required. |
useSCERef |
SingleCellExperiment inherited object. An
optional customized reference dataset. Default |
labelColName |
A single character. A string specifying the column in
|
useBltinRef |
A single character. A string that specifies a reference
provided by SingleR. Choose from |
level |
A string for cell type labeling level. Used only when using
some of the SingleR built-in references. Choose from |
featureType |
A string for whether to use gene symbols or Ensembl IDs
when using a SingleR built-in reference. Should be set based on the type of
|
labelByCluster |
A single character. A string specifying the column name
in |
Input SCE object with cell type labeling updated in
colData(inSCE)
, together with scoring metrics.
data("sceBatches")
logcounts(sceBatches) <- log1p(counts(sceBatches))
#sceBatches <- runSingleR(sceBatches, useBltinRef = "mp")
A wrapper function for autoEstCont and adjustCounts. Identify potential contamination from experimental factors such as ambient RNA. Visit their vignette for better understanding.
runSoupX( inSCE, sample = NULL, useAssay = "counts", background = NULL, bgAssayName = NULL, bgBatch = NULL, assayName = ifelse(is.null(background), "SoupX", "SoupX_bg"), cluster = NULL, reducedDimName = ifelse(is.null(background), "SoupX_UMAP_", "SoupX_bg_UMAP_"), tfidfMin = 1, soupQuantile = 0.9, maxMarkers = 100, contaminationRange = c(0.01, 0.8), rhoMaxFDR = 0.2, priorRho = 0.05, priorRhoStdDev = 0.1, forceAccept = FALSE, adjustMethod = c("subtraction", "soupOnly", "multinomial"), roundToInt = FALSE, tol = 0.001, pCut = 0.01 )
inSCE |
A SingleCellExperiment object. |
sample |
A single character specifying a name that can be found in
|
useAssay |
A single character string specifying which assay in
|
background |
A numeric matrix of counts or a
SingleCellExperiment object with the matrix in |
bgAssayName |
A single character string specifying which assay in
|
bgBatch |
The same thing as |
assayName |
A single character string of the output corrected matrix.
Default |
cluster |
Prior knowledge of clustering labels on cells. A single
character string for specifying clustering label stored in
|
reducedDimName |
A single character string of the prefix of output
corrected embedding matrix for each sample. Default |
tfidfMin |
Numeric. Minimum value of tfidf to accept for a marker gene.
Default |
soupQuantile |
Numeric. Only use genes that are at or above this
expression quantile in the soup. This prevents inaccurate estimates due to
using genes with poorly constrained contribution to the background. Default
|
maxMarkers |
Integer. If we have heaps of good markers, keep only the
best maxMarkers of them. Default |
contaminationRange |
Numeric vector of two elements. This constrains
the contamination fraction to lie within this range. Must be between 0 and 1.
The high end of this range is passed to
|
rhoMaxFDR |
Numeric. False discovery rate passed to
|
priorRho |
Numeric. Mode of gamma distribution prior on contamination
fraction. Default |
priorRhoStdDev |
Numeric. Standard deviation of gamma distribution prior
on contamination fraction. Default |
forceAccept |
Logical. Should we allow very high contamination fractions
to be used. Passed to |
adjustMethod |
Character. Method to use for correction. One of
|
roundToInt |
Logical. Should the resulting matrix be rounded to
integers? Default |
tol |
Numeric. Allowed deviation from expected number of soup counts.
Don't change this. Default |
pCut |
Numeric. The p-value cut-off used when
|
The input inSCE
object with soupX_nUMIs
,
soupX_clustrers
, soupX_contamination
appended to colData
slot; soupX_{sample}_est
and soupX_{sample}_counts
for each
sample appended to rowData
slot; and other computational metrics at
getSoupX(inSCE)
. Replace "soupX" with "soupX_bg" when background
is used.
Yichen Wang
plotSoupXResults
## Not run:
# SoupX does not work for toy example,
sce <- importExampleData("pbmc3k")
sce <- runSoupX(sce, sample = "sample")
plotSoupXResults(sce, sample = "sample")
## End(Not run)
Wrapper for obtaining a pseudotime ordering of the cells by projecting them onto the minimum spanning tree (MST)
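To make the role of the minimum spanning tree concrete, the sketch below builds an MST over cluster centroids in base R using Prim's algorithm. It is an illustration of the concept only, not how TSCAN or this wrapper is implemented: the toy embedding, the cluster labels, and the hand-written MST loop are all assumptions made for the example.

```r
set.seed(12345)
# Toy reduced dimension: 3 clusters of 20 cells along a line in 2-D.
centers <- rbind(c(0, 0), c(5, 0), c(10, 0))
cluster <- rep(1:3, each = 20)
redDim <- centers[cluster, ] + matrix(rnorm(60 * 2, sd = 0.5), ncol = 2)

# Cluster centroids and their pairwise Euclidean distances.
centroids <- rowsum(redDim, cluster) / as.vector(table(cluster))
d <- as.matrix(dist(centroids))

# Prim's algorithm: grow the tree from cluster 1, always adding the
# shortest edge that connects a cluster not yet in the tree.
inTree <- 1
edges <- NULL
while (length(inTree) < nrow(d)) {
  out <- setdiff(seq_len(nrow(d)), inTree)
  sub <- d[inTree, out, drop = FALSE]
  idx <- which(sub == min(sub), arr.ind = TRUE)[1, ]
  edges <- rbind(edges, c(from = inTree[idx[1]], to = out[idx[2]]))
  inTree <- c(inTree, out[idx[2]])
}
edges
```

The resulting edges connect neighboring clusters in order, giving the backbone onto which cells would be projected to obtain a pseudotime ordering.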
runTSCAN( inSCE, useReducedDim = "PCA", cluster = NULL, starter = NULL, seed = 12345 )
inSCE |
Input SingleCellExperiment object. |
useReducedDim |
Character. A low-dimension representation in
|
cluster |
Grouping for each cell in |
starter |
Character. Specifies the starting node from which to compute
the pseudotime. Default |
seed |
An integer. Random seed for clustering if |
The input inSCE
object with pseudotime ordering of the cells
along the paths and the cluster label stored in colData
, and other
unstructured information in metadata
.
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
This function finds all paths that root from a given cluster useCluster, and performs tests to identify features that are significant for each path but are not significant and/or change in the opposite direction in the other paths. Using a branching cluster (i.e. a node with degree > 2) may highlight features which are responsible for the branching event. The MST has to be pre-calculated with runTSCAN.
runTSCANClusterDEAnalysis( inSCE, useCluster, useAssay = "logcounts", fdrThreshold = 0.05 )
inSCE |
Input SingleCellExperiment object. |
useCluster |
The cluster to be regarded as the root. It has to exist in
|
useAssay |
Character. The name of the assay to use. This assay should
contain log normalized counts. Default |
fdrThreshold |
Only output DEGs with an FDR value smaller than this value.
Default |
The input inSCE with results updated in metadata.
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
mouseBrainSubsetSCE <- runTSCANClusterDEAnalysis(inSCE = mouseBrainSubsetSCE, useCluster = 1)
Wrapper for identifying genes with significant changes with respect to one of the TSCAN pseudotime paths
runTSCANDEG(inSCE, pathIndex, useAssay = "logcounts", discardCluster = NULL)
inSCE |
Input SingleCellExperiment object. |
pathIndex |
Path index for which the pseudotime values should be used.
This corresponds to the terminal node of specific path from the root
node to the terminal node. Run |
useAssay |
Character. The name of the assay to use for testing the
expression change. Should be log-normalized. Default |
discardCluster |
Cluster(s) that are not of use, or that mask other
interesting effects, can be discarded. Default |
The input inSCE with results updated in metadata.
Nida Pervaiz
data("mouseBrainSubsetSCE", package = "singleCellTK")
mouseBrainSubsetSCE <- runTSCAN(inSCE = mouseBrainSubsetSCE, useReducedDim = "PCA_logcounts")
terminalNodes <- listTSCANTerminalNodes(mouseBrainSubsetSCE)
mouseBrainSubsetSCE <- runTSCANDEG(inSCE = mouseBrainSubsetSCE, pathIndex = terminalNodes[1])
The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is commonly used for 2D visualization of single-cell data. This function wraps the Rtsne function from the Rtsne package.
With this function, users can create a tSNE embedding directly from a raw count matrix, with the necessary preprocessing (normalization, scaling, and dimension reduction) all automated. We still recommend using a PCA as input, so that the result matches clustering based on the same PCA and the computation is much faster.
runTSNE( inSCE, useReducedDim = "PCA", useAssay = NULL, useAltExp = NULL, reducedDimName = "TSNE", logNorm = TRUE, useFeatureSubset = NULL, nTop = 2000, center = TRUE, scale = TRUE, pca = TRUE, partialPCA = FALSE, initialDims = 25, theta = 0.5, perplexity = 30, nIterations = 1000, numThreads = 1, seed = 12345 )
runQuickTSNE(inSCE, useAssay = "counts", ...)
getTSNE( inSCE, useReducedDim = "PCA", useAssay = NULL, useAltExp = NULL, reducedDimName = "TSNE", logNorm = TRUE, useFeatureSubset = NULL, nTop = 2000, center = TRUE, scale = TRUE, pca = TRUE, partialPCA = FALSE, initialDims = 25, theta = 0.5, perplexity = 30, nIterations = 1000, numThreads = 1, seed = 12345 )
inSCE |
Input SingleCellExperiment object. |
useReducedDim |
The low dimension representation to use for tSNE
computation. Default |
useAssay |
Assay to use for tSNE computation. If |
useAltExp |
The subset to use for tSNE computation, usually for the
selected variable features. Default |
reducedDimName |
A name to store the results of the dimension
reduction. Default |
logNorm |
Whether the counts will need to be log-normalized prior to
generating the tSNE via |
useFeatureSubset |
Subset of features to use for dimension reduction. A
character string indicating a |
nTop |
Automatically detect this number of variable features to use for
dimension reduction. Ignored when using |
center |
Whether data should be centered before PCA is applied. Ignored
when using |
scale |
Whether data should be scaled before PCA is applied. Ignored
when using |
pca |
Whether an initial PCA step should be performed. Ignored when
using |
partialPCA |
Whether truncated PCA should be used to calculate principal
components (requires the irlba package). This is faster for large input
matrices. Ignored when using |
initialDims |
Number of dimensions from PCA to use as input in tSNE.
Default |
theta |
Numeric value for speed/accuracy trade-off (increase for less
accuracy), set to |
perplexity |
Perplexity parameter. Should not be bigger than
|
nIterations |
Maximum number of iterations. Default |
numThreads |
Integer. Number of threads to use with OpenMP. Default
|
seed |
Random seed for reproducibility of tSNE results.
Default |
... |
Other parameters to be passed to |
A SingleCellExperiment object with tSNE computation updated in reducedDim(inSCE, reducedDimName).
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
# Run from raw counts
sce <- runQuickTSNE(sce)
## Not run:
# Run from PCA
sce <- scaterlogNormCounts(sce, "logcounts")
sce <- runModelGeneVar(sce)
sce <- setTopHVG(sce, method = "modelGeneVar", hvgNumber = 2000, featureSubsetName = "HVG_modelGeneVar2000")
sce <- scaterPCA(sce, useAssay = "logcounts", useFeatureSubset = "HVG_modelGeneVar2000", scale = TRUE)
sce <- runTSNE(sce, useReducedDim = "PCA")
## End(Not run)
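The perplexity argument is limited by the number of cells. The underlying Rtsne function stops with an error when 3 * perplexity exceeds the number of observations minus one, so a quick pre-check in base R can avoid a failed run on small datasets. The sketch below is only an illustration: the helper name maxPerplexity and the example cell count are hypothetical and are not part of singleCellTK or Rtsne.

```r
# Hypothetical helper (not part of singleCellTK or Rtsne):
# largest perplexity satisfying 3 * perplexity <= nCells - 1.
maxPerplexity <- function(nCells) {
  floor((nCells - 1) / 3)
}

# Illustrative values: a small dataset of 100 cells and the default perplexity.
nCells <- 100
perplexity <- 30

# Clamp the requested perplexity to the largest admissible value.
perplexityToUse <- min(perplexity, maxPerplexity(nCells))
perplexityToUse
```

With real data, nCells would be ncol(sce), and the clamped value could then be passed as the perplexity argument of runTSNE.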
The Uniform Manifold Approximation and Projection (UMAP) algorithm is commonly used for 2D visualization of single-cell data. These functions wrap the scater calculateUMAP function.
Users can use runQuickUMAP to create a UMAP embedding directly from a raw count matrix, with the necessary preprocessing (normalization, variable feature selection, scaling, and dimension reduction) all automated. Therefore, useReducedDim is disabled for runQuickUMAP.
In a complete analysis, we still recommend creating a dimension reduction such as PCA beforehand and selecting a proper number of dimensions for runUMAP, so that the result matches clustering based on the same input PCA.
runUMAP( inSCE, useReducedDim = "PCA", useAssay = NULL, useAltExp = NULL, sample = NULL, reducedDimName = "UMAP", logNorm = TRUE, useFeatureSubset = NULL, nTop = 2000, scale = TRUE, pca = TRUE, initialDims = 10, nNeighbors = 30, nIterations = 200, alpha = 1, minDist = 0.01, spread = 1, seed = 12345, verbose = TRUE, BPPARAM = SerialParam() )
runQuickUMAP(inSCE, useAssay = "counts", sample = "sample", ...)
getUMAP( inSCE, useReducedDim = "PCA", useAssay = NULL, useAltExp = NULL, sample = NULL, reducedDimName = "UMAP", logNorm = TRUE, useFeatureSubset = NULL, nTop = 2000, scale = TRUE, pca = TRUE, initialDims = 25, nNeighbors = 30, nIterations = 200, alpha = 1, minDist = 0.01, spread = 1, seed = 12345, BPPARAM = SerialParam() )
inSCE |
Input SingleCellExperiment object. |
useReducedDim |
The low dimension representation to use for UMAP
computation. If |
useAssay |
Assay to use for UMAP computation. If |
useAltExp |
The subset to use for UMAP computation, usually for the
selected variable features. Default |
sample |
Character vector. Indicates which sample each cell belongs to.
If given a single character, will take the annotation from |
reducedDimName |
A name to store the results of the UMAP embedding
coordinates obtained from this method. Default |
logNorm |
Whether the counts will need to be log-normalized prior to
generating the UMAP via |
useFeatureSubset |
Subset of features to use for dimension reduction. A
character string indicating a |
nTop |
Automatically detect this number of variable features to use for
dimension reduction. Ignored when using |
scale |
Whether |
pca |
Logical. Whether to perform dimension reduction with PCA before
UMAP. Ignored when using |
initialDims |
Number of dimensions from PCA to use as input in UMAP.
Default |
nNeighbors |
The size of local neighborhood used for manifold
approximation. Larger values result in more global views of the manifold,
while smaller values result in more local data being preserved. Default
|
nIterations |
The number of iterations performed during layout
optimization. Default is |
alpha |
The initial value of "learning rate" of layout optimization.
Default is |
minDist |
The effective minimum distance between embedded points.
Smaller values will result in a more clustered/clumped embedding where nearby
points on the manifold are drawn closer together, while larger values will
result in a more even dispersal of points. Default |
spread |
The effective scale of embedded points. In combination with
|
seed |
Random seed for reproducibility of UMAP results.
Default |
verbose |
Logical. Whether to print log messages. Default |
BPPARAM |
A BiocParallelParam object specifying whether the PCA should be parallelized. |
... |
Parameters passed to |
A SingleCellExperiment object with UMAP computation updated in reducedDim(inSCE, reducedDimName).
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
# Run from raw counts
sce <- runQuickUMAP(sce)
plotDimRed(sce, "UMAP")
Wrapper for the Variance-adjusted Mahalanobis (VAM) method, which is a fast and accurate method for cell-specific gene set scoring of single cell data. This algorithm computes distance statistics and one-sided p-values for all cells in the specified single cell gene expression matrix. Gene sets should already be imported and stored in the metadata using functions such as importGeneSetsFromList or importGeneSetsFromMSigDB.
runVAM( inSCE, geneSetCollectionName = "H", useAssay = "logcounts", resultNamePrefix = NULL, center = FALSE, gamma = TRUE )
inSCE |
Input SingleCellExperiment object. |
geneSetCollectionName |
Character. The name of the gene set collection
to use. Default |
useAssay |
Character. The name of the assay to use. This assay should
contain log normalized counts. Default |
resultNamePrefix |
Character. Prefix for the name of the VAM results, which
will be stored in the reducedDim slot of |
center |
Boolean. If |
gamma |
Boolean. If |
A SingleCellExperiment object with VAM metrics stored in reducedDim as VAM_NameOfTheGeneset_Distance and VAM_NameOfTheGeneset_CDF.
Nida Pervaiz
importGeneSetsFromList, importGeneSetsFromMSigDB, importGeneSetsFromGMT, importGeneSetsFromCollection for importing gene sets. sctkListGeneSetCollections, getPathwayResultNames and getGenesetNamesFromCollection for available related information in inSCE.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, assayName = "logcounts")
gs1 <- rownames(sce)[seq(10)]
gs2 <- rownames(sce)[seq(11,20)]
gs <- list("geneset1" = gs1, "geneset2" = gs2)
sce <- importGeneSetsFromList(inSCE = sce, geneSetList = gs, by = "rownames")
sce <- runVAM(inSCE = sce, geneSetCollectionName = "GeneSetCollection", useAssay = "logcounts")
A general and flexible zero-inflated negative binomial model that can be used to provide a low-dimensional representation of scRNAseq data. The model accounts for zero inflation (dropouts), over-dispersion, and the count nature of the data. The model also accounts for the difference in library sizes and optionally for batch effects and/or other covariates.
runZINBWaVE( inSCE, useAssay = "counts", batch = "batch", nHVG = 1000L, nComponents = 50L, epsilon = 1000, nIter = 10L, reducedDimName = "zinbwave", BPPARAM = BiocParallel::SerialParam() )
inSCE |
Input SingleCellExperiment object |
useAssay |
A single character indicating the name of the assay requiring
batch correction. Note that ZINBWaVE works for counts (integer) input rather
than logcounts that other methods prefer. Default |
batch |
A single character indicating a field in
|
nHVG |
An integer. Number of highly variable genes to use when fitting
the model. Default |
nComponents |
An integer. The number of principal components or
dimensionality to generate in the resulting matrix. Default |
epsilon |
An integer. Algorithmic parameter. Empirically, a high epsilon
is often required to obtain a good low-level representation. Default
|
nIter |
An integer. The maximum number of iterations to perform. Default
|
reducedDimName |
A single character. The name for the corrected
low-dimensional representation. Will be saved to |
BPPARAM |
A BiocParallelParam object specifying whether the computation
should be parallelized. Default |
The input SingleCellExperiment object with reducedDim(inSCE, reducedDimName) updated.
Pollen, Alex A et al., 2014
data('sceBatches', package = 'singleCellTK')
## Not run:
sceCorr <- runZINBWaVE(sceBatches, nIter = 5)
## End(Not run)
Creates a table of QC metrics generated from QC algorithms, which is stored within the metadata slot of the input SingleCellExperiment object.
sampleSummaryStats( inSCE, sample = NULL, useAssay = "counts", simple = TRUE, statsName = "qc_table" )
inSCE |
Input SingleCellExperiment object with saved assay data and/or colData data. Required. |
sample |
Character vector. Indicates which sample each cell belongs to. |
useAssay |
A string specifying which assay in the SCE to use. Default 'counts'. |
simple |
Boolean. Indicates whether to generate a table of only basic QC stats (ex. library size), or to generate a summary table of all QC stats stored in the inSCE. |
statsName |
Character. The name of the slot that will store the QC stat table. Default "qc_table". |
A SingleCellExperiment object with a summary table for QC statistics in the 'sample_summary' slot of metadata.
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- sampleSummaryStats(sce, simple = TRUE)
getSampleSummaryStatsTable(sce, statsName = "qc_table")
scaterCPM Uses CPM from scater library to compute counts-per-million.
scaterCPM(inSCE, assayName = "ScaterCPMCounts", useAssay = "counts")
inSCE |
Input SingleCellExperiment object |
assayName |
New assay name for cpm data. |
useAssay |
Input assay |
inSCE Updated SingleCellExperiment object
Irzam Sarfraz
data(sce_chcl, package = "scds")
sce_chcl <- scaterCPM(sce_chcl, "countsCPM", "counts")
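To make the counts-per-million transformation concrete, the following base R sketch scales each cell (column) by its library size. The toy matrix and variable names are made up for illustration, and this is not the scater implementation itself, only the arithmetic it is based on.

```r
# Toy count matrix: 3 genes (rows) x 2 cells (columns); values are illustrative.
counts <- matrix(c(10, 30, 60,
                   5,  5, 90),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2")))

# Library size = total counts per cell.
libSize <- colSums(counts)

# Divide each column by its library size and scale to one million.
cpm <- sweep(counts, 2, libSize, "/") * 1e6

# Every cell now sums to one million.
colSums(cpm)
```

Because each column is rescaled independently, CPM values are comparable across cells with different sequencing depths.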
scaterlogNormCounts Uses logNormCounts to log normalize input data
scaterlogNormCounts( inSCE, assayName = "ScaterLogNormCounts", useAssay = "counts" )
inSCE |
Input SingleCellExperiment object |
assayName |
New assay name for log normalized data |
useAssay |
Input assay |
inSCE Updated SingleCellExperiment object that contains the new log normalized data
Irzam Sarfraz
data(sce_chcl, package = "scds")
sce_chcl <- scaterlogNormCounts(sce_chcl, "logcounts", "counts")
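As a rough sketch of what log normalization does, the base R code below divides each cell by a library-size-based size factor and applies a log2 transform with a pseudocount of 1. The size factor definition (library size scaled to a mean of 1) and the pseudocount are assumptions chosen to mirror common logNormCounts defaults; the toy matrix is illustrative only.

```r
# Toy count matrix: cell2 has exactly twice the sequencing depth of cell1.
counts <- matrix(c(10, 30, 60,
                   20, 60, 120),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2")))

# Assumption: size factors are library sizes scaled so that their mean is 1.
libSize <- colSums(counts)
sizeFactors <- libSize / mean(libSize)

# Divide each cell by its size factor, then log2-transform with a pseudocount of 1.
normCounts <- sweep(counts, 2, sizeFactors, "/")
logcounts <- log2(normCounts + 1)
logcounts
```

In this toy case the two cells differ only in depth, so their normalized profiles become identical after the transformation.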
A wrapper to runPCA function to compute principal component analysis (PCA) from a given SingleCellExperiment object.
scaterPCA( inSCE, useAssay = "logcounts", useFeatureSubset = "hvg2000", scale = TRUE, reducedDimName = "PCA", nComponents = 50, ntop = 2000, useAltExp = NULL, seed = 12345, BPPARAM = BiocParallel::SerialParam() )
inSCE |
Input SingleCellExperiment object. |
useAssay |
Assay to use for PCA computation. If |
useFeatureSubset |
Subset of features to use for dimension reduction. A
character string indicating a |
scale |
Logical scalar, whether to standardize the expression values.
Default |
reducedDimName |
Name to use for the reduced output assay. Default
|
nComponents |
Number of principal components to obtain from the PCA
computation. Default |
ntop |
Automatically detect this number of variable features to use for
dimension reduction. Ignored when using |
useAltExp |
The subset to use for PCA computation, usually for the
selected variable features. Default |
seed |
Integer, random seed for reproducibility of PCA results.
Default |
BPPARAM |
A BiocParallelParam object specifying whether the PCA should be parallelized. |
A SingleCellExperiment object with PCA computation updated in reducedDim(inSCE, reducedDimName).
data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- scaterlogNormCounts(sce, "logcounts")
# Example of ranking variable genes, selecting the top variable features,
# and running PCA. Make sure to increase the number of highly variable
# features (hvgNumber) and the number of principal components (nComponents)
# for real datasets
sce <- runModelGeneVar(sce, useAssay = "logcounts")
sce <- setTopHVG(sce, method = "modelGeneVar", hvgNumber = 100, featureSubsetName = "hvf")
sce <- scaterPCA(sce, useAssay = "logcounts", scale = TRUE, useFeatureSubset = "hvf", nComponents = 5)
# Alternatively, let the scater PCA function select the top variable genes
sce <- scaterPCA(sce, useAssay = "logcounts", scale = TRUE, useFeatureSubset = NULL, ntop = 100, nComponents = 5)
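The idea behind ntop and nComponents can be sketched in base R: rank features by variance, keep the top ones, and run a scaled PCA with cells as observations. This is a simplified stand-in using prcomp on random data, not the scater implementation; all object names and sizes here are illustrative assumptions.

```r
set.seed(12345)

# Toy log-expression matrix: 50 features x 20 cells (random, illustrative only).
logcounts <- matrix(rnorm(50 * 20), nrow = 50,
                    dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20)))

# Rank features by variance across cells and keep the top 10.
ntop <- 10
featureVar <- apply(logcounts, 1, var)
topFeatures <- names(sort(featureVar, decreasing = TRUE))[seq_len(ntop)]

# prcomp expects observations (cells) in rows, so transpose the subset.
pca <- prcomp(t(logcounts[topFeatures, ]), center = TRUE, scale. = TRUE)

# Keep the first 5 principal components as the cell embedding.
nComponents <- 5
embedding <- pca$x[, seq_len(nComponents)]
dim(embedding)
```

The resulting matrix has one row per cell and one column per retained component, which is the same shape as a reducedDim entry.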
https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc4k A subset of 390 barcodes and the top 200 genes were included in this example. Of the 390 barcodes, 195 are empty droplets, 150 are cell barcodes, and 45 are doublets predicted by the scrublet and doubletFinder packages. This example only serves as a proof of concept and a tutorial on how to run the functions in this package. The results should not be used for drawing scientific conclusions.
data("scExample")
A SingleCellExperiment object.
Example Single Cell RNA-Seq data in SingleCellExperiment Object, subset of 10x public dataset
data("scExample")
Two batches of pancreas scRNAseq datasets are combined with their original counts. Cell types and batches are annotated in 'colData(sceBatches)'. The two batches came from Wang, et al., 2016, annotated as 'w', and Xin, et al., 2016, annotated as 'x'. Two common cell types, 'alpha' and 'beta', which were found in both original studies with relatively large populations, were kept for a cleaner demonstration.
data('sceBatches')
An object of class SingleCellExperiment with 100 rows and 250 columns.
Example Single Cell RNA-Seq data in SingleCellExperiment object, with different batches annotated
Returns a vector of GeneSetCollections that have been imported and stored in metadata(inSCE)$sctk$genesets.
sctkListGeneSetCollections(inSCE)
inSCE |
A SingleCellExperiment object. |
Character vector.
Joshua D. Campbell
importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, importGeneSetsFromCollection for importing from GeneSetCollection objects, and importGeneSetsFromMSigDB for importing MSigDB gene sets.
data(scExample)
gs1 <- GSEABase::GeneSet(setName = "geneset1", geneIds = rownames(sce)[seq(10)])
gs2 <- GSEABase::GeneSet(setName = "geneset2", geneIds = rownames(sce)[seq(11,20)])
gsc1 <- GSEABase::GeneSetCollection(gs1)
gsc2 <- GSEABase::GeneSetCollection(gs2)
sce <- importGeneSetsFromCollection(inSCE = sce, geneSetCollection = gsc1, by = "rownames", collectionName = "Collection1")
sce <- importGeneSetsFromCollection(inSCE = sce, geneSetCollection = gsc2, by = "rownames", collectionName = "Collection2")
collections <- sctkListGeneSetCollections(sce)
Install all Python packages used in the singleCellTK package using conda_install from package reticulate. This will create a new Conda environment with the name envname if not already present. Note that Anaconda or Miniconda already need to be installed on the local system.
sctkPythonInstallConda( envname = "sctk-reticulate", conda = "auto", packages = c("scipy", "numpy", "astroid", "six"), pipPackages = c("scrublet", "scanpy", "louvain", "leidenalg", "bbknn", "scanorama", "anndata"), selectConda = TRUE, forge = FALSE, pipIgnoreInstalled = TRUE, pythonVersion = NULL, ... )
envname |
Character. Name of the conda environment to create. |
conda |
Character. Path to conda executable. Use "auto" to find conda using the PATH and other conventional install locations. Default 'auto'. |
packages |
Character Vector. List of packages to install from Conda. |
pipPackages |
Character Vector. List of packages to install into the Conda environment using 'pip'. |
selectConda |
Boolean. Run |
forge |
Boolean. Include the Conda Forge repository. |
pipIgnoreInstalled |
Boolean. Ignore installed versions when using pip. This is TRUE by default so that specific package versions can be installed even if they are downgrades. The FALSE option is useful for situations where you don't want a pip install to attempt an overwrite of a conda binary package (e.g. SciPy on Windows which is very difficult to install via pip due to compilation requirements). |
pythonVersion |
Passed to |
... |
Other parameters to pass to |
None. Installation of Conda environment.
See conda_create for more information on creating a Conda environment.
See conda_install for more description of the installation parameters.
See https://rstudio.github.io/reticulate/ for more information on package reticulate.
See selectSCTKConda for reloading the Conda environment if R is restarted without going through the whole installation process again.
See https://docs.conda.io/en/latest/ for more information on Conda environments.
## Not run:
sctkPythonInstallConda(envname = "sctk-reticulate")
## End(Not run)
Install all Python packages used in the singleCellTK package using virtualenv_install from package reticulate. This will create a new virtual environment with the name envname if not already present.
sctkPythonInstallVirtualEnv( envname = "sctk-reticulate", packages = c("scipy", "numpy", "astroid", "six", "scrublet", "scanpy", "louvain", "leidenalg", "scanorama", "bbknn", "anndata"), selectEnvironment = TRUE, python = NULL )
envname |
Character. Name of the virtual environment to create. |
packages |
Character Vector. List of packages to install. |
selectEnvironment |
Boolean. Run |
python |
The path to a Python interpreter, to be used with the created virtual environment. When NULL, the Python interpreter associated with the current session will be used. Default NULL. |
None. Installation of virtual environment.
See virtualenv_create for more information on creating a virtual environment.
See virtualenv_install for more description of the installation parameters.
See https://rstudio.github.io/reticulate/ for more information on package reticulate.
See selectSCTKVirtualEnvironment for reloading the virtual environment if R is restarted without going through the whole installation process again.
## Not run:
sctkPythonInstallVirtualEnv(envname = "sctk-reticulate")
## End(Not run)
The two gene sets came from dataset called 'segList' of package 'scMerge'.
data('SEG')
A list with two entries, "human" and "mouse", each of which is a character vector.
Stably Expressed Gene (SEG) list object, with SEG sets for human and mouse.
data('segList', package='scMerge')
data('SEG')
humanSEG <- SEG$human
Selects a Conda environment with Python packages used in singleCellTK.
selectSCTKConda(envname = "sctk-reticulate")
envname |
Character. Name of the conda environment to activate. |
None. Selects Conda environment.
See sctkPythonInstallConda for installation of Python modules into a Conda environment.
See conda-tools for more information on using Conda environments with package reticulate.
See https://rstudio.github.io/reticulate/ for more information on package reticulate.
See https://docs.conda.io/en/latest/ for more information on Conda environments.
## Not run:
sctkPythonInstallConda(envname = "sctk-reticulate", selectConda = FALSE)
selectSCTKConda(envname = "sctk-reticulate")
## End(Not run)
Selects a virtual environment with Python packages used in singleCellTK.
selectSCTKVirtualEnvironment(envname = "sctk-reticulate")
envname |
Character. Name of the virtual environment to activate. |
None. Selects virtual environment.
See sctkPythonInstallVirtualEnv for installation of Python modules into a virtual environment.
See virtualenv-tools for more information on using virtual environments with package reticulate.
See https://rstudio.github.io/reticulate/ for more information on package reticulate.
## Not run:
sctkPythonInstallVirtualEnv(envname = "sctk-reticulate", selectEnvironment = FALSE)
selectSCTKVirtualEnvironment(envname = "sctk-reticulate")
## End(Not run)
Users can set the rownames of an SCE object with either a character vector whose length equals nrow(x), or a single character specifying a column in rowData(x). This also applies to matrix-like objects for which the rownames<- method works, but in that case only a full-size name vector is allowed. Users can set dedup = TRUE to resolve duplicated entries in the specification by adding a -1, -2, ..., -i suffix to the duplicates of the same identifier.
setRowNames(x, rowNames, dedup = TRUE)
x |
Input object where the rownames will be modified. |
rowNames |
Character vector of the rownames. If |
dedup |
Logical. Whether to deduplicate the specified rowNames. Default
|
The input SCE object with rownames updated.
data("scExample", package = "singleCellTK")
head(rownames(sce))
sce <- setRowNames(sce, "feature_name")
head(rownames(sce))
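The suffix-based deduplication described above can be sketched in base R. This is a minimal illustration, not the setRowNames implementation: the helper dedupNames is hypothetical, and it assumes that the first occurrence of an identifier is kept unchanged while later duplicates receive -1, -2, and so on. The actual function may treat the first occurrence differently.

```r
# Hypothetical helper (not part of singleCellTK) illustrating suffix-based
# deduplication of feature identifiers.
dedupNames <- function(x) {
  # Running index of each value among its duplicates: 1 for the first
  # occurrence, 2 for the second, and so on.
  occurrence <- ave(seq_along(x), x, FUN = seq_along)
  # Assumption: keep the first occurrence as-is and suffix later ones
  # with -1, -2, ...
  ifelse(occurrence == 1, x, paste0(x, "-", occurrence - 1))
}

ids <- c("GeneA", "GeneB", "GeneA", "GeneC", "GeneA")
dedupNames(ids)
```

Unique rownames matter because many downstream accessors index features by name; gene symbols in particular are often duplicated where Ensembl IDs are not.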
This function is to be used to specify which column of rowData should be used to display features in plots.
setSCTKDisplayRow(inSCE, featureDisplayRow)
inSCE |
Input SingleCellExperiment object with saved dimension reduction components or a variable with saved results. Required. |
featureDisplayRow |
Indicates which column name of rowData to be used for plots. |
A SingleCellExperiment object with the specific column name of rowData to be used for plotting stored in metadata.
data(scExample, package="singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
sce <- setSCTKDisplayRow(inSCE = sce, featureDisplayRow = "feature_name")
plotSCEViolinAssayData(inSCE = sce, feature = "ENSG00000019582")
Use this function to run the single cell analysis app.
singleCellTK(inSCE = NULL, includeVersion = TRUE, theme = "yeti")
inSCE |
Input SingleCellExperiment object. |
includeVersion |
Include the version number in the SCTK header. The default is TRUE. |
theme |
The bootswatch theme to use for the singleCellTK UI. The default is 'yeti'. |
The shiny app will open.
## Not run:
# Upload data through the app
singleCellTK()

# Load the app with a SingleCellExperiment object
data("mouseBrainSubsetSCE")
singleCellTK(mouseBrainSubsetSCE)
## End(Not run)
Passes the output of generateSimulatedData() to differential expression tests, picking either t-tests or ANOVA for data with only two conditions or multiple conditions, respectively.
subDiffEx(tempData)
subDiffExttest(countMatrix, class.labels, test.type = "t.equalvar")
subDiffExANOVA(countMatrix, condition)
tempData |
Matrix. The output of generateSimulatedData(), where the first row contains condition labels. |
countMatrix |
Matrix. A simulated counts matrix, sans labels. |
class.labels |
Factor. The condition labels for the simulated cells. Will be coerced into 1's and 0's. |
test.type |
Type of test to perform. The default is t.equalvar. |
condition |
Factor. The condition labels for the simulated cells. |
subDiffEx(): A vector of fdr-adjusted p-values for all genes. Nonviable results (such as for genes with 0 counts in a simulated dataset) are coerced to 1.
subDiffExttest(): A vector of fdr-adjusted p-values for all genes. Nonviable results (such as for genes with 0 counts in a simulated dataset) are coerced to 1.
subDiffExANOVA(): A vector of fdr-adjusted p-values for all genes. Nonviable results (such as for genes with 0 counts in a simulated dataset) are coerced to 1.
subDiffEx():
subDiffExttest(): Runs t-tests on all genes in a simulated dataset with 2 conditions, and adjusts for FDR.
subDiffExANOVA(): Runs ANOVA on all genes in a simulated dataset with more than 2 conditions, and adjusts for FDR.
data("mouseBrainSubsetSCE")
res <- generateSimulatedData(
  totalReads = 1000, cells = 10,
  originalData = assay(mouseBrainSubsetSCE, "counts"),
  realLabels = colData(mouseBrainSubsetSCE)[, "level1class"])
tempSigDiff <- subDiffEx(res)

data("mouseBrainSubsetSCE")
# sort first 100 expressed genes
ord <- rownames(mouseBrainSubsetSCE)[
  order(rowSums(assay(mouseBrainSubsetSCE, "counts")),
        decreasing = TRUE)][seq(100)]
# subset to those first 100 genes
subset <- mouseBrainSubsetSCE[ord, ]
res <- generateSimulatedData(totalReads = 1000, cells = 10,
                             originalData = assay(subset, "counts"),
                             realLabels = colData(subset)[, "level1class"])
realLabels <- res[1, ]
output <- res[-1, ]
fdr <- subDiffExttest(output, realLabels)

data("mouseBrainSubsetSCE")
# sort first 100 expressed genes
ord <- rownames(mouseBrainSubsetSCE)[
  order(rowSums(assay(mouseBrainSubsetSCE, "counts")),
        decreasing = TRUE)][seq(100)]
# subset to those first 100 genes
subset <- mouseBrainSubsetSCE[ord, ]
res <- generateSimulatedData(totalReads = 1000, cells = 10,
                             originalData = assay(subset, "counts"),
                             realLabels = colData(subset)[, "level2class"])
realLabels <- res[1, ]
output <- res[-1, ]
fdr <- subDiffExANOVA(output, realLabels)
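The general pattern described for the two-condition case (a test per gene, FDR adjustment, and nonviable results coerced to 1) can be sketched with base R alone. This is not the package's code: stats::t.test() with var.equal = TRUE is swapped in as an assumed equivalent of the "t.equalvar" test type, and the toy matrix and labels are invented for illustration.

```r
set.seed(1)

# Toy data: 5 genes (rows) x 8 cells (columns); gene 5 has all-zero counts.
counts <- matrix(rpois(40, lambda = 5), nrow = 5)
counts[5, ] <- 0
labels <- factor(rep(c("a", "b"), each = 4))

# Per-gene two-sample t-test with equal variances.
# Tests that fail outright are recorded as NA instead of stopping the loop.
pvals <- apply(counts, 1, function(g) {
  tryCatch(t.test(g ~ labels, var.equal = TRUE)$p.value,
           error = function(e) NA_real_)
})

# Adjust for FDR (Benjamini-Hochberg), then coerce nonviable results to 1.
fdr <- p.adjust(pvals, method = "fdr")
fdr[is.na(fdr)] <- 1
print(fdr)
```

Coercing undefined p-values to 1 keeps the output the same length as the gene list, so downstream power calculations can treat such genes as simply not significant.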
Used to perform subsetting of a SingleCellExperiment object using a variety of methods that indicate the correct columns to keep. The various methods, index, bool, and colData, can be used in conjunction with one another.
subsetSCECols(inSCE, index = NULL, bool = NULL, colData = NULL)
inSCE |
Input SingleCellExperiment object. |
index |
Integer vector. Vector of indices indicating which columns
to keep. If |
bool |
Boolean vector. Vector of |
colData |
Character. An expression that will identify a subset of
columns using variables found in the |
A SingleCellExperiment object that has been subsetted by colData.
Joshua D. Campbell
data(scExample)
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
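How the three selectors might combine can be shown on a plain data.frame standing in for colData. This sketch assumes the selectors are intersected, so a column is kept only if every supplied selector keeps it; the text above says only that they can be used together, so treat the combination rule as an assumption. The cellInfo table and its values are hypothetical.

```r
# Toy stand-in for colData(): one row per cell.
cellInfo <- data.frame(
  type  = c("Singlet", "EmptyDroplet", "Singlet", "Doublet", "EmptyDroplet"),
  total = c(500, 3, 820, 1500, 7)
)
n <- nrow(cellInfo)

index <- c(1, 2, 3, 4)                     # keep by position
bool  <- c(TRUE, TRUE, TRUE, FALSE, TRUE)  # keep by logical flag
expr  <- "type != 'EmptyDroplet'"          # keep by expression on columns

# Convert each selector to a logical vector over all cells.
byIndex <- seq_len(n) %in% index
byExpr  <- eval(parse(text = expr), envir = cellInfo)

# Assumption: selectors are combined by intersection.
keep <- byIndex & bool & byExpr
print(which(keep))
```

Evaluating the expression string inside the table is what lets a filter such as "type != 'EmptyDroplet'" refer to column names directly, as in the example above.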
Used to perform subsetting of a SingleCellExperiment object using a variety of methods that indicate the correct rows to keep. The various methods, index, bool, and rowData, can be used in conjunction with one another. If returnAsAltExp is set to TRUE, then the returned object will have the same number of rows as the input inSCE, because the subsetted object will be stored in the altExp slot.
subsetSCERows(inSCE, index = NULL, bool = NULL, rowData = NULL, returnAsAltExp = TRUE, altExpName = "subset", prependAltExpName = TRUE)
inSCE |
Input SingleCellExperiment object. |
index |
Integer vector. Vector of indices indicating which rows
to keep. If |
bool |
Boolean vector. Vector of |
rowData |
Character. An expression that will identify a subset of rows
using variables found in the |
returnAsAltExp |
Boolean. If |
altExpName |
Character. Name of the alternative experiment object to
add if |
prependAltExpName |
Boolean. If |
A SingleCellExperiment object that has been subsetted by rowData.
Joshua D. Campbell
data(scExample)
# Set a variable up in the rowData indicating mitochondrial genes
rowData(sce)$isMito <- ifelse(grepl("^MT-", rowData(sce)$feature_name), "yes", "no")
sce <- subsetSCERows(sce, rowData = "isMito == 'yes'")
Creates a table of summary metrics from an input SingleCellExperiment object.
summarizeSCE(inSCE, useAssay = NULL, sampleVariableName = NULL)
inSCE |
Input SingleCellExperiment object. |
useAssay |
Indicate which assay to summarize. If |
sampleVariableName |
Variable name in |
A data.frame object of summary metrics.
data("mouseBrainSubsetSCE")
summarizeSCE(mouseBrainSubsetSCE, sampleVariableName = NULL)
Trims an input count matrix such that each value greater than the upper threshold is trimmed to the upper threshold, and each value less than the lower threshold is trimmed to the lower threshold.
trimCounts(counts, trimValue = c(10, -10))
counts |
matrix |
trimValue |
where trimValue[1] is the upper threshold and trimValue[2] is the
lower threshold. Default is |
trimmed counts matrix
data(sce_chcl, package = "scds")
assay(sce_chcl, "countsTrimmed") <- trimCounts(assay(sce_chcl, "counts"), c(10, -10))
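The trimming rule can be sketched on a small matrix in base R. The helper trimSketch() below is hypothetical and written only for illustration. It assumes, following the argument description, that values above trimValue[1] are set to trimValue[1] and values below trimValue[2] are set to trimValue[2].

```r
# Hypothetical helper illustrating the clamping rule; not the package function.
trimSketch <- function(counts, trimValue = c(10, -10)) {
  counts[counts > trimValue[1]] <- trimValue[1]  # cap at the upper threshold
  counts[counts < trimValue[2]] <- trimValue[2]  # floor at the lower threshold
  counts
}

# 2 x 3 toy matrix with values outside and inside the thresholds.
m <- matrix(c(-25, -3, 0, 4, 12, 100), nrow = 2)
trimmed <- trimSketch(m)
print(trimmed)
```

Values already inside the thresholds are left untouched, so the operation only affects extreme entries, such as outliers in scaled or log-ratio data.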