Title: | Single-Cell Consensus Clustering |
---|---|
Description: | A tool for unsupervised clustering and analysis of single cell RNA-Seq data. |
Authors: | Vladimir Kiselev |
Maintainer: | Vladimir Kiselev <[email protected]> |
License: | GPL-3 |
Version: | 1.35.0 |
Built: | 2024-10-31 04:44:16 UTC |
Source: | https://github.com/bioc/SC3 |
Cell type annotations for data extracted from a publication by Yan et al.
ann
ann
An object of class data.frame
with 90 rows and 1 columns.
http://dx.doi.org/10.1038/nsmb.2660
Each row corresponds to a single cell from 'yan' dataset
Distance between the cells, i.e. columns, in the input expression matrix are calculated using the Euclidean, Pearson and Spearman metrics to construct distance matrices.
calculate_distance(data, method)
calculate_distance(data, method)
data |
expression matrix |
method |
one of the distance metrics: 'spearman', 'pearson', 'euclidean' |
distance matrix
k
Stability index shows how stable each cluster is accross the selected range of k
.
The stability index varies between 0 and 1, where
1 means that the same cluster appears in every solution for different k
.
calculate_stability(consensus, k)
calculate_stability(consensus, k)
consensus |
consensus item of the sc3 slot of an object of 'SingleCellExperiment' class |
k |
number of clusters k |
Imagine a given cluster is split into N
clusters when k
is changed (all possible
values of k
are provided via ks
argument in the main sc3
function).
In each of the new clusters there are given_cells
of the given cluster and also some
extra_cells
from other clusters. Then we define stability as follows:
Where one N
corrects for the number of clusters and the other N
is a penalty
for splitting the cluster. ks
corrects for the range of k
.
a numeric vector containing a stability index of each cluster
Consensus matrix is calculated using the Cluster-based Similarity Partitioning Algorithm (CSPA). For each clustering solution a binary similarity matrix is constructed from the corresponding cell labels: if two cells belong to the same cluster, their similarity is 1, otherwise the similarity is 0. A consensus matrix is calculated by averaging all similarity matrices.
consensus_matrix(clusts)
consensus_matrix(clusts)
clusts |
a matrix containing clustering solutions in columns |
consensus matrix
Computes consensus matrix given cluster labels
consmx(dat)
consmx(dat)
dat |
a matrix containing clustering solutions in columns |
Used in consmx function
ED1(x)
ED1(x)
x |
A numeric matrix. |
Used in sc3-funcs.R distance matrix calculation and within the consensus clustering.
ED2(x)
ED2(x)
x |
A numeric matrix. |
The function finds the eigenvalues of the sample covariance matrix. It will then return the number of significant eigenvalues according to the Tracy-Widom test.
estkTW(dataset)
estkTW(dataset)
dataset |
processed input expression matrix. |
an estimated number of clusters k
For a given gene a binary classifier is constructed
based on the mean cluster expression values (these are calculated using the
cell labels). The classifier prediction
is then calculated using the gene expression ranks. The area under the
receiver operating characteristic (ROC) curve is used to quantify the accuracy
of the prediction. A p-value
is assigned to each gene by using the Wilcoxon
signed rank test.
get_auroc(gene, labels)
get_auroc(gene, labels)
gene |
expression data of a given gene |
labels |
cell labels correspodning to the expression values of the gene |
Wrapper for calculating biological properties
get_biolgy(dataset, labels, regime)
get_biolgy(dataset, labels, regime)
dataset |
expression matrix |
labels |
cell labels corresponding clusters |
regime |
defines what biological analysis to perform. "marker" for marker genes, "de" for differentiall expressed genes and "outl" for outlier cells |
results of either
Differential expression is calculated using the non-parametric Kruskal-Wallis test.
A significant p-value
indicates that gene expression in at least one cluster
stochastically dominates one other cluster. Note that the calculation of
differential expression after clustering can introduce a bias in the
distribution of p-value
s, and thus we advise to use the p-value
s
for ranking the genes only.
get_de_genes(dataset, labels)
get_de_genes(dataset, labels)
dataset |
expression matrix |
labels |
cell labels corresponding to the columns of the expression matrix |
a numeric vector containing the differentially expressed genes and correspoding p-values
d <- get_de_genes(yan[1:10,], as.numeric(ann[,1])) head(d)
d <- get_de_genes(yan[1:10,], as.numeric(ann[,1])) head(d)
Find marker genes in the dataset. The get_auroc
is used to calculate
marker values for each gene.
get_marker_genes(dataset, labels)
get_marker_genes(dataset, labels)
dataset |
expression matrix |
labels |
cell labels corresponding clusters |
data.frame containing the marker genes, corresponding cluster indexes
and adjusted p-value
s
d <- get_marker_genes(yan[1:10,], as.numeric(ann[,1])) d
d <- get_marker_genes(yan[1:10,], as.numeric(ann[,1])) d
Outlier cells in each cluster are detected using robust distances, calculated
using the minimum covariance determinant (MCD), namely using
covMcd
. The outlier score shows how
different a cell is from all other cells in the cluster and it is defined as
the differences between the square root of the robust distance and the square
root of the 99.99
get_outl_cells(dataset, labels)
get_outl_cells(dataset, labels)
dataset |
expression matrix |
labels |
cell labels corresponding to the columns of the expression matrix |
a numeric vector containing the cell labels and correspoding outlier scores ordered by the labels
d <- get_outl_cells(yan[1:10,], as.numeric(ann[,1])) head(d)
d <- get_outl_cells(yan[1:10,], as.numeric(ann[,1])) head(d)
SC3
clusteringTakes data from the logcounts
slot, removes spike-ins and applies the gene filter.
get_processed_dataset(object)
get_processed_dataset(object)
object |
an object of |
Reorders the rows of the input data.frame based on the sc3_k_markers_clusts
column and also keeps only the top 10 genes for each value of sc3_k_markers_clusts
.
markers_for_heatmap(markers)
markers_for_heatmap(markers)
markers |
a |
Calculate graph Laplacian of a symmetrix matrix
norm_laplacian(A)
norm_laplacian(A)
A |
symmetric matrix |
SingleCellExperiment
classThis functions returns all marker gene columns from the phenoData
slot
of the input object corresponding to the number of clusters k
. Additionally,
it rearranges genes by the cluster index and order them by the area under the
ROC curve value inside of each cluster.
organise_de_genes(object, k, p_val)
organise_de_genes(object, k, p_val)
object |
an object of |
k |
number of cluster |
p_val |
p-value threshold |
SingleCellExperiment
classThis functions returns all marker gene columns from the phenoData
slot
of the input object corresponding to the number of clusters k
. Additionally,
it rearranges genes by the cluster index and order them by the area under the
ROC curve value inside of each cluster.
organise_marker_genes(object, k, p_val, auroc)
organise_marker_genes(object, k, p_val, auroc)
object |
an object of |
k |
number of cluster |
p_val |
p-value threshold |
auroc |
area under the ROC curve threshold |
Defines train and study cell indeces based on the svm_num_cells and svm_train_inds input parameters
prepare_for_svm(N, svm_num_cells = NULL, svm_train_inds = NULL, svm_max)
prepare_for_svm(N, svm_num_cells = NULL, svm_train_inds = NULL, svm_max)
N |
number of cells in the input dataset |
svm_num_cells |
number of random cells to be used for training |
svm_train_inds |
indeces of cells to be used for training |
svm_max |
define the maximum number of cells below which SVM is not run |
A list of indeces of the train and the study cells
Given an hclust
object and the number of clusters k
this function reindex the clusters inferred by cutree(hc, k)[hc$order]
, so that
they appear in ascending order. This is particularly useful when plotting
heatmaps in which the clusters should be numbered from left to right.
reindex_clusters(hc, k)
reindex_clusters(hc, k)
hc |
an object of class hclust |
k |
number of cluster to be inferred from hc |
hc <- hclust(dist(USArrests), 'ave') cutree(hc, 10)[hc$order] reindex_clusters(hc, 10)[hc$order]
hc <- hclust(dist(USArrests), 'ave') cutree(hc, 10)[hc$order] reindex_clusters(hc, 10)[hc$order]
SC3
in one goThis function is a wrapper that executes all steps of SC3
analysis in one go.
sc3.SingleCellExperiment(object, ks, gene_filter, pct_dropout_min, pct_dropout_max, d_region_min, d_region_max, svm_num_cells, svm_train_inds, svm_max, n_cores, kmeans_nstart, kmeans_iter_max, k_estimator, biology, rand_seed) ## S4 method for signature 'SingleCellExperiment' sc3(object, ks = NULL, gene_filter = TRUE, pct_dropout_min = 10, pct_dropout_max = 90, d_region_min = 0.04, d_region_max = 0.07, svm_num_cells = NULL, svm_train_inds = NULL, svm_max = 5000, n_cores = NULL, kmeans_nstart = NULL, kmeans_iter_max = 1e+09, k_estimator = FALSE, biology = FALSE, rand_seed = 1)
sc3.SingleCellExperiment(object, ks, gene_filter, pct_dropout_min, pct_dropout_max, d_region_min, d_region_max, svm_num_cells, svm_train_inds, svm_max, n_cores, kmeans_nstart, kmeans_iter_max, k_estimator, biology, rand_seed) ## S4 method for signature 'SingleCellExperiment' sc3(object, ks = NULL, gene_filter = TRUE, pct_dropout_min = 10, pct_dropout_max = 90, d_region_min = 0.04, d_region_max = 0.07, svm_num_cells = NULL, svm_train_inds = NULL, svm_max = 5000, n_cores = NULL, kmeans_nstart = NULL, kmeans_iter_max = 1e+09, k_estimator = FALSE, biology = FALSE, rand_seed = 1)
object |
an object of |
ks |
a range of the number of clusters |
gene_filter |
a boolen variable which defines whether to perform gene filtering before SC3 clustering. |
pct_dropout_min |
if |
pct_dropout_max |
if |
d_region_min |
defines the minimum number of eigenvectors used for
kmeans clustering as a fraction of the total number of cells. Default is |
d_region_max |
defines the maximum number of eigenvectors used for
kmeans clustering as a fraction of the total number of cells. Default is |
svm_num_cells |
number of randomly selected training cells to be used
for SVM prediction. The default is |
svm_train_inds |
a numeric vector defining indeces of training cells
that should be used for SVM training. The default is |
svm_max |
define the maximum number of cells below which SVM is not run. |
n_cores |
defines the number of cores to be used on the user's machine. If not set, 'SC3' will use all but one cores of your machine. |
kmeans_nstart |
nstart parameter passed to |
kmeans_iter_max |
iter.max parameter passed to |
k_estimator |
boolean parameter, defines whether to estimate an optimal number of clusters |
biology |
boolean parameter, defines whether to compute differentially expressed genes, marker genes and cell outliers. |
rand_seed |
sets the seed of the random number generator. |
an object of SingleCellExperiment
class
This function calculates differentially expressed (DE) genes, marker genes
and cell outliers based on the consensus SC3
clusterings.
sc3_calc_biology.SingleCellExperiment(object, ks, regime) ## S4 method for signature 'SingleCellExperiment' sc3_calc_biology(object, ks = NULL, regime = NULL)
sc3_calc_biology.SingleCellExperiment(object, ks, regime) ## S4 method for signature 'SingleCellExperiment' sc3_calc_biology(object, ks = NULL, regime = NULL)
object |
an object of |
ks |
a continuous range of integers - the number of clusters |
regime |
defines what biological analysis to perform. "marker" for marker genes, "de" for differentiall expressed genes and "outl" for outlier cells |
DE genes are calculated using get_de_genes
. Results of the DE
analysis are saved as new columns in the
featureData
slot of the input object
. The column names correspond
to the adjusted p-value
s of the genes and have the following format:
sc3_k_de_padj
, where k
is the number of clusters.
Marker genes are calculated using get_marker_genes
.
Results of the marker gene analysis are saved as three new
columns (for each k
) to the
featureData
slot of the input object
. The column names correspond
to the SC3
cluster labels, to the adjusted p-value
s of the genes
and to the area under the ROC curve
and have the following format: sc3_k_markers_clusts
,
sc3_k_markers_padj
and sc3_k_markers_auroc
, where k
is
the number of clusters.
Outlier cells are calculated using get_outl_cells
. Results of the
cell outlier analysis are saved as new columns in the
phenoData
slot of the input object
. The column names correspond
to the log2(outlier_score)
and have the following format:
sc3_k_log2_outlier_score
, where k
is the number of clusters.
Additionally, biology
item is added to the sc3
slot and is set to
TRUE
indicating that the biological analysis of the dataset has been
performed.
an object of SingleCellExperiment
class
This function calculates consensus matrices based on the clustering solutions
contained in the kmeans
item of the sc3
slot of the metadata(object)
. It then
creates and populates the consensus
item of the sc3
slot with
consensus matrices, their hierarchical clusterings in hclust
objects,
and Silhouette indeces of the clusters. It also removes the previously
calculated kmeans
clusterings from
the sc3
slot, as they are not needed for further analysis.
sc3_calc_consens.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_consens(object)
sc3_calc_consens.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_consens(object)
object |
an object of |
Additionally, it also adds new columns to the colData
slot of the
input object
. The column names correspond to the consensus cell labels
and have the following format: sc3_k_clusters
, where k
is the
number of clusters.
an object of SingleCellExperiment
class
This function calculates distances between the cells. It
creates and populates the following items of the sc3
slot of the metadata(object)
:
distances
- contains a list of distance matrices corresponding to
Euclidean, Pearson and Spearman distances.
sc3_calc_dists.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_dists(object)
sc3_calc_dists.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_dists(object)
object |
an object of |
an object of SingleCellExperiment
class
This function transforms all distances
items of the sc3
slot of
the metadata(object)
using either principal component analysis (PCA)
or by calculating the eigenvectors of the associated graph Laplacian.
The columns of the resulting matrices are then sorted in descending order
by their corresponding eigenvalues. The first d
columns
(where d = max(metadata(object)$sc3$n_dim)
) of each transformation are then
written to the transformations
item of the sc3
slot.
Additionally, this function also removes the previously calculated distances
from
the sc3
slot, as they are not needed for further analysis.
sc3_calc_transfs.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_transfs(object)
sc3_calc_transfs.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_calc_transfs(object)
object |
an object of |
an object of SingleCellExperiment
class
k
for a scRNA-Seq expression matrixUses Tracy-Widom theory on random matrices to estimate the optimal number of
clusters k
. It creates and populates the k_estimation
item of the
sc3
slot of the metadata(object)
.
sc3_estimate_k.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_estimate_k(object)
sc3_estimate_k.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_estimate_k(object)
object |
an object of |
an estimated value of k
SC3
results to Excel fileThis function writes all SC3
results to an excel file.
sc3_export_results_xls.SingleCellExperiment(object, filename) ## S4 method for signature 'SingleCellExperiment' sc3_export_results_xls(object, filename = "sc3_results.xls")
sc3_export_results_xls.SingleCellExperiment(object, filename) ## S4 method for signature 'SingleCellExperiment' sc3_export_results_xls(object, filename = "sc3_results.xls")
object |
an object of |
filename |
name of the excel file, to which the results will be written |
SC3
results in an interactive session in a web browser.Runs interactive shiny
session of SC3
based on precomputed clusterings.
sc3_interactive.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_interactive(object)
sc3_interactive.SingleCellExperiment(object) ## S4 method for signature 'SingleCellExperiment' sc3_interactive(object)
object |
an object of |
Opens a browser window with an interactive shiny
app and visualize
all precomputed clusterings.
kmeans
clustering of cells.This function performs kmeans
clustering of the matrices
contained in the transformations
item of the sc3
slot of the metadata(object)
. It then
creates and populates the following items of the sc3
slot:
kmeans
- contains a list of kmeans clusterings.
sc3_kmeans.SingleCellExperiment(object, ks) ## S4 method for signature 'SingleCellExperiment' sc3_kmeans(object, ks = NULL)
sc3_kmeans.SingleCellExperiment(object, ks) ## S4 method for signature 'SingleCellExperiment' sc3_kmeans(object, ks = NULL)
object |
an object of |
ks |
a continuous range of integers - the number of clusters |
an object of SingleCellExperiment
class
Stability index shows how stable each cluster is accross the selected range of ks. The stability index varies between 0 and 1, where 1 means that the same cluster appears in every solution for different k.
sc3_plot_cluster_stability.SingleCellExperiment(object, k) ## S4 method for signature 'SingleCellExperiment' sc3_plot_cluster_stability(object, k)
sc3_plot_cluster_stability.SingleCellExperiment(object, k) ## S4 method for signature 'SingleCellExperiment' sc3_plot_cluster_stability(object, k)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
The consensus matrix is a NxN matrix, where N is the number of cells. It represents similarity between the cells based on the averaging of clustering results from all combinations of clustering parameters. Similarity 0 (blue) means that the two cells are always assigned to different clusters. In contrast, similarity 1 (red) means that the two cells are always assigned to the same cluster. The consensus matrix is clustered by hierarchical clustering and has a diagonal-block structure. Intuitively, the perfect clustering is achieved when all diagonal blocks are completely red and all off-diagonal elements are completely blue.
sc3_plot_consensus.SingleCellExperiment(object, k, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_consensus(object, k, show_pdata = NULL)
sc3_plot_consensus.SingleCellExperiment(object, k, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_consensus(object, k, show_pdata = NULL)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
show_pdata |
a vector of colnames of the pData(object) table. Default is NULL. If not NULL will add pData annotations to the columns of the output matrix |
SC3
as a heatmapSC3
plots gene expression profiles of the 50 genes with the lowest p-values.
sc3_plot_de_genes.SingleCellExperiment(object, k, p.val, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_de_genes(object, k, p.val = 0.01, show_pdata = NULL)
sc3_plot_de_genes.SingleCellExperiment(object, k, p.val, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_de_genes(object, k, p.val = 0.01, show_pdata = NULL)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
p.val |
significance threshold used for the DE genes |
show_pdata |
a vector of colnames of the pData(object) table. Default is NULL. If not NULL will add pData annotations to the columns of the output matrix |
The expression panel represents the original input expression matrix (cells in columns and genes in rows) after the gene filter. Genes are clustered by kmeans with k = 100 (dendrogram on the left) and the heatmap represents the expression levels of the gene cluster centers after log2-scaling.
sc3_plot_expression.SingleCellExperiment(object, k, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_expression(object, k, show_pdata = NULL)
sc3_plot_expression.SingleCellExperiment(object, k, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_expression(object, k, show_pdata = NULL)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
show_pdata |
a vector of colnames of the pData(object) table. Default is NULL. If not NULL will add pData annotations to the columns of the output matrix |
SC3
as a heatmap.By default the genes with the area under the ROC curve (AUROC) > 0.85 and with the p-value < 0.01 are selected and the top 10 marker genes of each cluster are visualized in this heatmap.
sc3_plot_markers.SingleCellExperiment(object, k, auroc, p.val, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_markers(object, k, auroc = 0.85, p.val = 0.01, show_pdata = NULL)
sc3_plot_markers.SingleCellExperiment(object, k, auroc, p.val, show_pdata) ## S4 method for signature 'SingleCellExperiment' sc3_plot_markers(object, k, auroc = 0.85, p.val = 0.01, show_pdata = NULL)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
auroc |
area under the ROC curve |
p.val |
significance threshold used for the DE genes |
show_pdata |
a vector of colnames of the pData(object) table. Default is NULL. If not NULL will add pData annotations to the columns of the output matrix |
A silhouette is a quantitative measure of the diagonality of the consensus matrix. An average silhouette width (shown at the bottom left of the silhouette plot) varies from 0 to 1, where 1 represents a perfectly block-diagonal consensus matrix and 0 represents a situation where there is no block-diagonal structure. The best clustering is achieved when the average silhouette width is close to 1.
sc3_plot_silhouette.SingleCellExperiment(object, k) ## S4 method for signature 'SingleCellExperiment' sc3_plot_silhouette(object, k)
sc3_plot_silhouette.SingleCellExperiment(object, k) ## S4 method for signature 'SingleCellExperiment' sc3_plot_silhouette(object, k)
object |
an object of 'SingleCellExperiment' class |
k |
number of clusters |
SingleCellExperiment
object for SC3
clustering.This function prepares an object of SingleCellExperiment
class for SC3
clustering. It
creates and populates the following items of the sc3
slot of the metadata(object)
:
kmeans_iter_max
- the same as the kmeans_iter_max
argument.
kmeans_nstart
- the same as the kmeans_nstart
argument.
n_dim
- contains numbers of the number of eigenvectors to be used
in kmeans
clustering.
rand_seed
- the same as the rand_seed
argument.
svm_train_inds
- if SVM is used this item contains indexes of the
training cells to be used for SC3 clustering and further SVM prediction.
svm_study_inds
- if SVM is used this item contains indexes of the
cells to be predicted by SVM.
n_cores
- the same as the n_cores
argument.
sc3_prepare.SingleCellExperiment(object, gene_filter, pct_dropout_min, pct_dropout_max, d_region_min, d_region_max, svm_num_cells, svm_train_inds, svm_max, n_cores, kmeans_nstart, kmeans_iter_max, rand_seed) ## S4 method for signature 'SingleCellExperiment' sc3_prepare(object, gene_filter = TRUE, pct_dropout_min = 10, pct_dropout_max = 90, d_region_min = 0.04, d_region_max = 0.07, svm_num_cells = NULL, svm_train_inds = NULL, svm_max = 5000, n_cores = NULL, kmeans_nstart = NULL, kmeans_iter_max = 1e+09, rand_seed = 1)
sc3_prepare.SingleCellExperiment(object, gene_filter, pct_dropout_min, pct_dropout_max, d_region_min, d_region_max, svm_num_cells, svm_train_inds, svm_max, n_cores, kmeans_nstart, kmeans_iter_max, rand_seed) ## S4 method for signature 'SingleCellExperiment' sc3_prepare(object, gene_filter = TRUE, pct_dropout_min = 10, pct_dropout_max = 90, d_region_min = 0.04, d_region_max = 0.07, svm_num_cells = NULL, svm_train_inds = NULL, svm_max = 5000, n_cores = NULL, kmeans_nstart = NULL, kmeans_iter_max = 1e+09, rand_seed = 1)
object |
an object of |
gene_filter |
a boolen variable which defines whether to perform gene filtering before SC3 clustering. |
pct_dropout_min |
if |
pct_dropout_max |
if |
d_region_min |
defines the minimum number of eigenvectors used for
kmeans clustering as a fraction of the total number of cells. Default is |
d_region_max |
defines the maximum number of eigenvectors used for
kmeans clustering as a fraction of the total number of cells. Default is |
svm_num_cells |
number of randomly selected training cells to be used
for SVM prediction. The default is |
svm_train_inds |
a numeric vector defining indeces of training cells
that should be used for SVM training. The default is |
svm_max |
define the maximum number of cells below which SVM is not run. |
n_cores |
defines the number of cores to be used on the user's machine. If not set, 'SC3' will use all but one cores of your machine. |
kmeans_nstart |
nstart parameter passed to |
kmeans_iter_max |
iter.max parameter passed to |
rand_seed |
sets the seed of the random number generator. |
an object of SingleCellExperiment
class
SVM
approach.This method parallelize SVM
prediction for each k
(the number
of clusters). Namely, for each k
, support_vector_machines
function is utilized to predict the labels of study cells. Training cells are
selected using svm_train_inds
item of the sc3
slot of the
metadata(object)
.
sc3_run_svm.SingleCellExperiment(object, ks) ## S4 method for signature 'SingleCellExperiment' sc3_run_svm(object, ks = NULL)
sc3_run_svm.SingleCellExperiment(object, ks) ## S4 method for signature 'SingleCellExperiment' sc3_run_svm(object, ks = NULL)
object |
an object of |
ks |
a continuous range of integers - the number of clusters |
Results are written to the sc3_k_clusters
columns to the
colData
slot of the input object
, where k
is the
number of clusters.
an object of SingleCellExperiment
class
SVM
) predictionTrain an SVM
classifier on a training dataset (train
) and then
classify a study dataset (study
) using the classifier.
support_vector_machines(train, study, kern)
support_vector_machines(train, study, kern)
train |
training dataset with colnames, corresponding to training labels |
study |
study dataset |
kern |
kernel to be used with SVM |
classification of the study dataset
Given matrix A, the procedure returns A'A.
tmult(x)
tmult(x)
x |
Numeric matrix. |
All distance matrices are transformed using either principal component analysis (PCA) or by calculating the eigenvectors of the graph Laplacian (Spectral). The columns of the resulting matrices are then sorted in descending order by their corresponding eigenvalues.
transformation(dists, method)
transformation(dists, method)
dists |
distance matrix |
method |
transformation method: either 'pca' or 'laplacian' |
transformed distance matrix
Single cell RNA-Seq data extracted from a publication by Yan et al.
yan
yan
An object of class data.frame
with 20214 rows and 90 columns.
http://dx.doi.org/10.1038/nsmb.2660
Columns represent cells, rows represent genes expression values.