Title: | scClassify: single-cell Hierarchical Classification |
---|---|
Description: | scClassify is a multiscale classification framework for single-cell RNA-seq data based on ensemble learning and cell type hierarchies, enabling sample size estimation required for accurate cell type classification and joint classification of cells using multiple references. |
Authors: | Yingxin Lin |
Maintainer: | Yingxin Lin <[email protected]> |
License: | GPL-3 |
Version: | 1.19.0 |
Built: | 2024-12-18 04:28:04 UTC |
Source: | https://github.com/bioc/scClassify |
The scClassifyTrainModel class is designed to store the training model for scClassify.
.scClassifyTrainModel( name, cellTypeTree, cellTypeTrain, features, model, modelweights, metaData )
name |
Name of the training dataset |
cellTypeTree |
A list describing the cell type tree |
cellTypeTrain |
A vector of the cell types in the training dataset |
features |
A character vector of the features trained for this dataset |
model |
A list storing the training model, including the selected features and the cell expression matrix used for training |
modelweights |
A numeric vector of the weights of each model |
metaData |
A DataFrame storing the metadata of the training model |
A scClassifyTrainModel object
Yingxin Lin
Methods to access various components of the 'scClassifyTrainModel' object.
cellTypeTrain(x)
x |
A 'scClassifyTrainModel' object. |
The cellTypeTrain slot of the scClassifyTrainModel object
data(trainClassExample_xin)
cellTypeTrain(trainClassExample_xin)
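As a rough illustration of how slot accessors of this kind work in R's S4 system, the sketch below defines a toy class with one generic accessor. The class `ToyTrainModel` and the function `toyCellTypeTrain` are hypothetical names made up for this sketch. They are not part of scClassify, and the package's own definitions may differ.

```r
library(methods)

# Hypothetical toy class with two slots (not the scClassify class).
setClass("ToyTrainModel",
         representation(name = "character",
                        cellTypeTrain = "character"))

# A generic plus a method that simply returns the slot contents.
setGeneric("toyCellTypeTrain",
           function(x) standardGeneric("toyCellTypeTrain"))
setMethod("toyCellTypeTrain", "ToyTrainModel",
          function(x) x@cellTypeTrain)

toy <- new("ToyTrainModel",
           name = "toy",
           cellTypeTrain = c("alpha", "beta", "alpha"))
toyCellTypeTrain(toy)
```

Going through an accessor rather than `@` directly keeps calling code independent of how the object stores its data.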
Methods to access various components of the 'scClassifyTrainModel' object.
cellTypeTree(x)
x |
A 'scClassifyTrainModel' object. |
The cellTypeTree slot of the scClassifyTrainModel object
data(trainClassExample_xin)
cellTypeTree(trainClassExample_xin)
Methods to access various components of the 'scClassifyTrainModel' object.
features(x)
x |
A 'scClassifyTrainModel' object. |
The features slot of the scClassifyTrainModel object
data(trainClassExample_xin)
features(trainClassExample_xin)
Function to get the required sample size N for a given accuracy from the fitted learning curve model
getN(res, acc = 0.9)
res |
Model results returned by learningCurve |
acc |
The accuracy that is required |
The sample size that is required
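To make the idea of reading a sample size off a learning curve concrete, here is a minimal sketch that inverts an inverse power-law curve by hand. It assumes the curve has the form acc(n) = asymptote + scale * n^(-rate), which is the form used to simulate the example data in this manual (0.95 - 2/n^0.8). The function `required_n` and its argument names are hypothetical and are not part of scClassify; `getN` may work differently internally.

```r
# Hypothetical helper: solve asymptote + scale * n^(-rate) = acc for n.
required_n <- function(scale, asymptote, rate, acc) {
  # The target accuracy must lie below the asymptote of a rising curve.
  stopifnot(scale < 0, rate > 0, acc < asymptote)
  (scale / (acc - asymptote))^(1 / rate)
}

# Parameters of the simulated curve used in the examples of this manual.
N <- required_n(scale = -2, asymptote = 0.95, rate = 0.8, acc = 0.9)
ceiling(N)  # round up to a whole number of cells
```

A target accuracy at or above the asymptote has no finite solution, which is why the sketch stops early in that case.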
set.seed(2019)
n <- seq(20, 10000, 100)
accMat <- do.call(cbind, lapply(seq_along(n), function(i){
  tmp_n <- rep(n[i], 50)
  y <- -2/(tmp_n^0.8) + 0.95 + rnorm(length(tmp_n), 0, 0.02)
}))
res <- learningCurve(accMat = accMat, n)
N <- getN(res, acc = 0.9)
Fit learning curve for accuracy matrix
learningCurve( accMat, n, auto_initial = TRUE, a = NULL, b = NULL, c = NULL, d_list = NULL, fitmodel = c("nls", "nls_mix", "gam"), plot = TRUE, verbose = TRUE )
accMat |
Matrix of accuracy rates, where each column corresponds to a different sample size |
n |
Vector of sample sizes |
auto_initial |
Whether to initialise the starting values automatically |
a |
Starting value for parameter a |
b |
Starting value for parameter b |
c |
Starting value for parameter c |
d_list |
Range of d |
fitmodel |
Model used for fitting: "nls", "nls_mix" or "gam" |
plot |
Whether to produce a plot |
verbose |
Whether to print verbose output |
list of results
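The following sketch shows one simple way to fit an inverse power-law learning curve in base R. It is an illustration only, not the fitting procedure of `learningCurve`. It assumes the model acc(n) = asymptote + scale * n^(-rate) and swaps in a grid search over the exponent combined with `lm`, because the model is linear in the other two parameters once the exponent is fixed. All variable names are hypothetical.

```r
set.seed(2019)
# Simulate accuracies as in the examples: 50 repeats per sample size.
n <- rep(seq(20, 10000, 100), each = 50)
acc <- -2 / n^0.8 + 0.95 + rnorm(length(n), 0, 0.02)

# For a fixed exponent the model is linear, so profile over a grid of
# exponents and keep the one with the smallest residual sum of squares.
rate_grid <- seq(0.1, 2, by = 0.05)
rss <- sapply(rate_grid, function(r) deviance(lm(acc ~ I(n^(-r)))))
rate_hat <- rate_grid[which.min(rss)]

fit <- lm(acc ~ I(n^(-rate_hat)))
asymptote_hat <- unname(coef(fit)[1])  # intercept: accuracy as n grows
scale_hat <- unname(coef(fit)[2])      # negative for a rising curve

c(scale = scale_hat, asymptote = asymptote_hat, rate = rate_hat)
```

The grid search avoids the starting-value sensitivity of iterative nonlinear least squares, at the cost of a coarser estimate of the exponent.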
Yingxin Lin
set.seed(2019)
n <- seq(20, 10000, 100)
accMat <- do.call(cbind, lapply(seq_along(n), function(i){
  tmp_n <- rep(n[i], 50)
  y <- -2/(tmp_n^0.8) + 0.95 + rnorm(length(tmp_n), 0, 0.02)
}))
res <- learningCurve(accMat = accMat, n)
N <- getN(res, acc = 0.9)
Methods to access various components of the 'scClassifyTrainModel' object.
model(x)
x |
A 'scClassifyTrainModel' object. |
The model slot of the scClassifyTrainModel object
data(trainClassExample_xin)
model(trainClassExample_xin)
Methods to access various components of the 'scClassifyTrainModel' object.
modelweights(x)
x |
A 'scClassifyTrainModel' object. |
The modelweights slot of the scClassifyTrainModel object
data(trainClassExample_xin)
modelweights(trainClassExample_xin)
Methods to access various components of the 'scClassifyTrainModel' object.
name(x)
x |
A 'scClassifyTrainModel' object. |
The name slot of the scClassifyTrainModel object
data(trainClassExample_xin)
name(trainClassExample_xin)
Plot the cell type tree
plotCellTypeTree(cutree_list, group_level = NULL)
cutree_list |
A list describing the hierarchical cell type tree |
group_level |
Indicate whether plot or not |
A ggplot object visualising the HOPACH tree
data("trainClassExample_xin")
plotCellTypeTree(cellTypeTree(trainClassExample_xin))
Testing scClassify model
predict_scClassify( exprsMat_test, trainRes, cellTypes_test = NULL, k = 10, prob_threshold = 0.7, cor_threshold_static = 0.5, cor_threshold_high = 0.7, features = "limma", algorithm = "WKNN", similarity = "pearson", cutoff_method = c("dynamic", "static"), weighted_ensemble = FALSE, weights = NULL, parallel = FALSE, BPPARAM = BiocParallel::SerialParam(), verbose = FALSE )
exprsMat_test |
A list or a matrix of the log-transformed expression matrices of the query datasets |
trainRes |
A 'scClassifyTrainModel' object or a 'list' holding the trained scClassify model |
cellTypes_test |
A list or a vector of the cell types of the query datasets (optional). |
k |
An integer giving the number of neighbours |
prob_threshold |
A numeric giving the probability threshold for KNN/WKNN/DWKNN. |
cor_threshold_static |
A numeric giving the static correlation threshold. |
cor_threshold_high |
A numeric giving the highest correlation threshold |
features |
A vector giving the gene selection method, "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI". |
algorithm |
A vector giving the KNN method used, "WKNN" by default. This should be one or more of "WKNN", "KNN", "DWKNN". |
similarity |
A vector giving the similarity measure used, "pearson" by default. This should be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan". |
cutoff_method |
A vector giving the method used to cut off the correlation distribution, "dynamic" by default. |
weighted_ensemble |
A logical indicating whether, in ensemble learning, the results are combined using a weighted score for each base classifier. |
weights |
A vector of the weights for the ensemble |
parallel |
A logical indicating whether to run in parallel |
BPPARAM |
A |
verbose |
A logical indicating whether the intermediate steps are printed |
list of results
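The arguments `k`, `prob_threshold` and `similarity = "pearson"` can be pictured with a much simplified weighted nearest-neighbour vote, sketched below. This is not the scClassify algorithm: it ignores the cell type hierarchy, feature selection and the correlation cutoffs, and the weighting scheme (votes weighted by non-negative Pearson correlation) is an assumption made for this sketch. The function `wknn_sketch` and the toy data are hypothetical.

```r
# Hypothetical, simplified weighted k-nearest-neighbour vote.
# query: numeric vector of gene expression for one cell
# ref:   genes x cells reference matrix; ref_labels: one label per column
wknn_sketch <- function(query, ref, ref_labels, k = 10, prob_threshold = 0.7) {
  sims <- as.vector(cor(query, ref, method = "pearson"))
  top <- order(sims, decreasing = TRUE)[seq_len(min(k, length(sims)))]
  w <- pmax(sims[top], 0)              # negative correlations get no vote
  if (sum(w) == 0) return("unassigned")
  votes <- tapply(w, as.character(ref_labels[top]), sum)
  probs <- votes / sum(votes)
  best <- which.max(probs)
  # Only assign a label when the winning share of votes is high enough.
  if (probs[best] >= prob_threshold) names(probs)[best] else "unassigned"
}

# Toy reference: two "alpha" cells and two "beta" cells over four genes.
ref <- cbind(a1 = c(1, 2, 3, 4), a2 = c(1, 2, 3, 5),
             b1 = c(4, 3, 2, 1), b2 = c(5, 3, 2, 1))
ref_labels <- c("alpha", "alpha", "beta", "beta")
query <- c(1.1, 2.0, 3.2, 4.1)

wknn_sketch(query, ref, ref_labels, k = 2)
```

Returning "unassigned" when no label reaches the threshold is the design point worth noting: a thresholded vote can decline to classify a cell rather than force it into the closest reference type.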
Yingxin Lin
data("scClassify_example")
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset
data("trainClassExample_xin")
pred_res <- predict_scClassify(exprsMat_test = exprsMat_wang_subset,
                               trainRes = trainClassExample_xin,
                               cellTypes_test = wang_cellTypes,
                               algorithm = "WKNN",
                               features = c("limma"),
                               similarity = c("pearson"),
                               prob_threshold = 0.7,
                               verbose = TRUE)
Testing scClassify model (joint training)
predict_scClassifyJoint( exprsMat_test, trainRes, cellTypes_test = NULL, k = 10, prob_threshold = 0.7, cor_threshold_static = 0.5, cor_threshold_high = 0.7, features = "limma", algorithm = "WKNN", similarity = "pearson", cutoff_method = c("dynamic", "static"), parallel = FALSE, BPPARAM = BiocParallel::SerialParam(), verbose = FALSE )
exprsMat_test |
A list or a matrix of the expression matrices of the testing datasets |
trainRes |
A 'scClassifyTrainModel' object or a 'list' holding the scClassify training model |
cellTypes_test |
A list or a vector of the cell types of the testing datasets (optional). |
k |
An integer giving the number of neighbours |
prob_threshold |
A numeric giving the probability threshold for KNN/WKNN/DWKNN. |
cor_threshold_static |
A numeric giving the static correlation threshold. |
cor_threshold_high |
A numeric giving the highest correlation threshold |
features |
A vector giving the method used to select features, "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI". |
algorithm |
A vector giving the KNN method used, "WKNN" by default. This should be one or more of "WKNN", "KNN", "DWKNN". |
similarity |
A vector giving the similarity measure used, "pearson" by default. This should be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan". |
cutoff_method |
A vector giving the method used to cut off the correlation distribution, "dynamic" by default. |
parallel |
A logical indicating whether to run in parallel |
BPPARAM |
A |
verbose |
A logical indicating whether the intermediate steps are printed |
list of results
Yingxin Lin
data("scClassify_example")
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset
data("trainClassExample_xin")
data("trainClassExample_wang")
trainClassExampleJoint <- scClassifyTrainModelList(trainClassExample_wang,
                                                   trainClassExample_xin)
pred_res_joint <- predict_scClassifyJoint(exprsMat_test = exprsMat_wang_subset,
                                          trainRes = trainClassExampleJoint,
                                          cellTypes_test = wang_cellTypes,
                                          algorithm = "WKNN",
                                          features = c("limma"),
                                          similarity = c("pearson"),
                                          prob_threshold = 0.7,
                                          verbose = FALSE)
table(pred_res_joint$jointRes$cellTypes, wang_cellTypes)
A function generating HOPACH tree using the average expression matrix for each cell type.
runHOPACH(data, plot = TRUE, kmax = 5)
data |
A matrix of average expression (each row indicates the gene, each column indicates the cell type) |
plot |
A logical indicating whether to plot the tree |
kmax |
Integer between 1 and 9 specifying the maximum number of children at each node in the tree. |
Returns a list containing:
cutree_list: a list describing the hierarchical cell type tree
plot: a ggplot visualising the cell type tree
Yingxin Lin
van der Laan, M. J. and Pollard, K. S. (2003) ‘A new algorithm for hybrid hierarchical clustering with visualization and the bootstrap’, Journal of Statistical Planning and Inference. doi: 10.1016/S0378-3758(02)00388-9.
data("scClassify_example")
wang_cellTypes <- factor(scClassify_example$wang_cellTypes)
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset
avgMat_wang <- apply(exprsMat_wang_subset, 1, function(x)
  aggregate(x, list(wang_cellTypes), mean)$x)
rownames(avgMat_wang) <- levels(wang_cellTypes)
res_hopach <- runHOPACH(avgMat_wang)
res_hopach$plot
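The averaging step in the example above can be hard to read, so here is a small base R sketch of the same idea on toy data: computing the mean expression of every gene within each cell type. Like the example, it produces a matrix with one row per cell type, since the row names there are set to the cell type levels. The toy matrix and all variable names are hypothetical.

```r
# Toy expression matrix: 2 genes (rows) x 4 cells (columns).
expr <- matrix(c(1, 3, 5, 7,
                 2, 4, 6, 8),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4)))
cell_types <- factor(c("alpha", "alpha", "beta", "beta"))

# Average each gene over the cells of one type, then stack the results
# so that rows are cell types and columns are genes.
avg_mat <- t(sapply(levels(cell_types), function(ct) {
  rowMeans(expr[, cell_types == ct, drop = FALSE])
}))
avg_mat
```

Using `drop = FALSE` keeps the subset a matrix even when a cell type has a single cell, so `rowMeans` still works.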
Run sample size calculation for pilot data for reference dataset
runSampleCal( exprsMat, cellTypes, n_list = c(20, 40, 60, 80, 100, seq(200, 500, 100)), num_repeat = 20, level = NULL, cellType_tree = NULL, BPPARAM = BiocParallel::SerialParam(), subset_test = FALSE, num_test = NULL, ... )
exprsMat |
An expression matrix of the pilot dataset (log-transformed or normalised) |
cellTypes |
A vector of the cell types of the pilot dataset |
n_list |
An integer vector of the sample sizes to run. |
num_repeat |
An integer giving the number of times the run is repeated for each sample size. |
level |
An integer indicating that the accuracy rate is calculated at the n-th level from the top of the cell type tree. If NULL (the default), the bottom level of the cell type tree is used. It cannot be larger than the total number of levels of the tree. |
cellType_tree |
A list giving the cell type tree (optional). If NULL, the accuracy rate is calculated based on the provided cellTypes. |
BPPARAM |
A |
subset_test |
A logical indicating whether a subset of the data (a fixed number for each sample size) is used for testing instead of all remaining data. FALSE by default. |
num_test |
An integer giving the size of the test data. |
... |
Other parameters of scClassify |
An accuracy matrix, where columns correspond to different sample sizes and rows correspond to repetitions.
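To show the shape of the returned accuracy matrix, the sketch below repeats a subsample, train and test loop on toy data. It is only an illustration of the resampling idea behind `n_list` and `num_repeat`: a one-feature nearest-centroid classifier stands in for scClassify, and the helper `acc_one` and the simulated data are hypothetical.

```r
set.seed(1)
n_cells <- 200
labels <- rep(c("alpha", "beta"), each = n_cells / 2)
# One toy feature whose mean differs between the two cell types.
x <- rnorm(n_cells, mean = ifelse(labels == "alpha", 0, 3))

n_list <- c(20, 40, 60)   # training sample sizes to try
num_repeat <- 5           # repeats per sample size

# Train on n randomly chosen cells, test on all remaining cells.
acc_one <- function(n) {
  train <- sample(n_cells, n)
  test <- setdiff(seq_len(n_cells), train)
  centroids <- tapply(x[train], labels[train], mean)
  nearest <- apply(abs(outer(x[test], centroids, "-")), 1, which.min)
  mean(names(centroids)[nearest] == labels[test])
}

# Rows are repetitions, columns are sample sizes.
accMat <- sapply(n_list, function(n) replicate(num_repeat, acc_one(n)))
colnames(accMat) <- n_list
accMat
```

A matrix laid out this way, with one column per sample size, is the kind of input that a learning curve can then be fitted to.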
data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
exprsMat_xin_subset <- as(exprsMat_xin_subset, "dgCMatrix")
set.seed(2019)
accMat <- runSampleCal(exprsMat_xin_subset,
                       xin_cellTypes,
                       n_list = seq(20, 100, 20),
                       num_repeat = 5,
                       BPPARAM = BiocParallel::SerialParam())
Train and test scClassify model
scClassify( exprsMat_train = NULL, cellTypes_train = NULL, exprsMat_test = NULL, cellTypes_test = NULL, tree = "HOPACH", algorithm = "WKNN", selectFeatures = "limma", similarity = "pearson", cutoff_method = c("dynamic", "static"), weighted_ensemble = FALSE, weights = NULL, weighted_jointClassification = TRUE, cellType_tree = NULL, k = 10, topN = 50, hopach_kmax = 5, pSig = 0.01, prob_threshold = 0.7, cor_threshold_static = 0.5, cor_threshold_high = 0.7, returnList = TRUE, parallel = FALSE, BPPARAM = BiocParallel::SerialParam(), verbose = FALSE )
exprsMat_train |
A log-transformed expression matrix of the reference dataset |
cellTypes_train |
A vector of the cell types of the reference dataset |
exprsMat_test |
A list or a matrix of the expression matrices of the query datasets |
cellTypes_test |
A list or a vector of the cell types of the query datasets (optional). |
tree |
A vector giving the method used to build the hierarchical tree, "HOPACH" by default. This should be one of "HOPACH" and "HC" (using hclust). |
algorithm |
A vector giving the KNN method used, "WKNN" by default. This should be one or more of "WKNN", "KNN", "DWKNN". |
selectFeatures |
A vector giving the gene selection method, "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI" and "Cepo". |
similarity |
A vector giving the similarity measure used, "pearson" by default. This should be one or more of "pearson", "spearman", "cosine", "jaccard", "kendall", "binomial", "weighted_rank", "manhattan". |
cutoff_method |
A vector giving the method used to cut off the correlation distribution, "dynamic" by default. |
weighted_ensemble |
A logical indicating whether, in ensemble learning, the results are combined using a weighted score for each base classifier. |
weights |
A vector of the weights for the ensemble |
weighted_jointClassification |
A logical indicating whether, in joint classification using multiple training datasets, the results are combined using a weighted score for each training model. |
cellType_tree |
A list giving the cell type tree provided by the user (NULL by default; only for a single training dataset). |
k |
An integer giving the number of neighbours |
topN |
An integer giving the number of top features selected |
hopach_kmax |
An integer between 1 and 9 specifying the maximum number of children at each node in the HOPACH tree. |
pSig |
A numeric giving the p-value cutoff for features |
prob_threshold |
A numeric giving the probability threshold for KNN/WKNN/DWKNN. |
cor_threshold_static |
A numeric giving the static correlation threshold. |
cor_threshold_high |
A numeric giving the highest correlation threshold |
returnList |
A logical indicating whether the output is returned as a list |
parallel |
A logical indicating whether to run in parallel |
BPPARAM |
A |
verbose |
A logical indicating whether the intermediate steps are printed |
A list of the results, including testRes, which stores the test results, and trainRes, which stores the training model information.
Yingxin Lin
data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
wang_cellTypes <- scClassify_example$wang_cellTypes
exprsMat_wang_subset <- scClassify_example$exprsMat_wang_subset
scClassify_res <- scClassify(exprsMat_train = exprsMat_xin_subset,
                             cellTypes_train = xin_cellTypes,
                             exprsMat_test = list(wang = exprsMat_wang_subset),
                             cellTypes_test = list(wang = wang_cellTypes),
                             tree = "HOPACH",
                             algorithm = "WKNN",
                             selectFeatures = c("limma"),
                             similarity = c("pearson"),
                             returnList = FALSE,
                             verbose = FALSE)
A list containing the expression matrices and cell types of subsets of the Wang et al. and Xin et al. datasets.
data(scClassify_example, package = 'scClassify')
An object of class list of length 4.
Wang YJ, Schug J, Won K-J, Liu C, Naji A, Avrahami D, Golson ML & Kaestner KH (2016) Single cell transcriptomics of the human endocrine pancreas. Diabetes: db160405
Xin Y, Kim J, Okamoto H, Ni M, Wei Y, Adler C, Murphy AJ, Yancopoulos GD, Lin C & Gromada J (2016) RNA sequencing of single human islet cells reveals type 2 diabetes genes. Cell Metab. 24: 608–615
An S4 class to store the training model for scClassify
name
Name of the training dataset
cellTypeTrain
A vector of the cell types in the training dataset
cellTypeTree
A list describing the cell type tree
features
A character vector of the features trained for this dataset
model
A list storing the training model, including the selected features and the cell expression matrix used for training
modelweights
A numeric vector of the weights of each model
metaData
A DataFrame storing the metadata of the training model
The scClassifyTrainModelList class
scClassifyTrainModelList(...)
... |
scClassifyTrainModel objects |
A scClassifyTrainModelList object
data("trainClassExample_xin")
data("trainClassExample_wang")
trainClassExampleList <- scClassifyTrainModelList(trainClassExample_xin,
                                                  trainClassExample_wang)
An S4 class to store a list of training models from scClassify
Training scClassify model
train_scClassify( exprsMat_train, cellTypes_train, tree = "HOPACH", selectFeatures = "limma", topN = 50, hopach_kmax = 5, pSig = 0.05, cellType_tree = NULL, weightsCal = FALSE, parallel = FALSE, BPPARAM = BiocParallel::SerialParam(), verbose = TRUE, returnList = TRUE, ... )
exprsMat_train |
A log-transformed expression matrix of the reference dataset |
cellTypes_train |
A vector of the cell types of the reference dataset |
tree |
A vector giving the method used to build the hierarchical tree, "HOPACH" by default. This should be one of "HOPACH" and "HC" (using stats::hclust). |
selectFeatures |
A vector giving the gene selection method, "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI", "Cepo". |
topN |
An integer giving the number of top features selected |
hopach_kmax |
An integer between 1 and 9 specifying the maximum number of children at each node in the HOPACH tree. |
pSig |
A numeric giving the p-value cutoff for features |
cellType_tree |
A list giving the cell type tree provided by the user (NULL by default). |
weightsCal |
A logical indicating whether the weights for the model need to be calculated. |
parallel |
A logical indicating whether the algorithms run in parallel |
BPPARAM |
A |
verbose |
A logical indicating whether the intermediate steps are printed |
returnList |
A logical indicating whether the output is returned as a list |
... |
Other inputs for predict_scClassify, used when weights are calculated for the pretrained model |
A list of results or a scClassifyTrainModel object
Yingxin Lin
data("scClassify_example")
xin_cellTypes <- scClassify_example$xin_cellTypes
exprsMat_xin_subset <- scClassify_example$exprsMat_xin_subset
trainClass <- train_scClassify(exprsMat_train = exprsMat_xin_subset,
                               cellTypes_train = xin_cellTypes,
                               selectFeatures = c("limma", "BI"),
                               returnList = FALSE)
An object of scClassifyTrainModel for Wang et al.
data(trainClassExample_wang, package = 'scClassify')
An object of class scClassifyTrainModel of length 1.
Wang YJ, Schug J, Won K-J, Liu C, Naji A, Avrahami D, Golson ML & Kaestner KH (2016) Single cell transcriptomics of the human endocrine pancreas. Diabetes: db160405
An object of scClassifyTrainModel for Xin et al.
data(trainClassExample_xin, package = 'scClassify')
An object of class scClassifyTrainModel of length 1.
Xin Y, Kim J, Okamoto H, Ni M, Wei Y, Adler C, Murphy AJ, Yancopoulos GD, Lin C & Gromada J (2016) RNA sequencing of single human islet cells reveals type 2 diabetes genes. Cell Metab. 24: 608–615