Title: | Markers Automated Screening Tool in R |
---|---|
Description: | mastR is an R package designed for automated screening of signatures of interest for specific research questions. The package is developed for generating refined lists of signature genes from multiple group comparisons based on the results from edgeR and limma differential expression (DE) analysis workflow. It also takes into account the background noise of tissue-specificity, which is often ignored by other marker generation tools. This package is particularly useful for the identification of group markers in various biological and medical applications, including cancer research and developmental biology. |
Authors: | Jinjin Chen [aut, cre] , Ahmed Mohamed [aut, ctb] , Chin Wee Tan [ctb] |
Maintainer: | Jinjin Chen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.7.0 |
Built: | 2024-10-30 08:46:06 UTC |
Source: | https://github.com/bioc/mastR |
Convert CCLE data downloaded by depmap::depmap_TPM() from long format into a wide matrix, with gene names as row names and depmap IDs as column names.
ccle_2_wide(ccle)
ccle |
CCLE data downloaded by depmap::depmap_TPM() |
a matrix
data("ccle_crc_5")
ccle <- data.frame(
  gene_name = rownames(ccle_crc_5),
  ccle_crc_5$counts
) |>
  tidyr::pivot_longer(
    -gene_name,
    names_to = "depmap_id",
    values_to = "rna_expression"
  )
ccle_wide <- ccle_2_wide(ccle)
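The long-to-wide reshaping that ccle_2_wide() is described as performing can be illustrated without the package. The following is a minimal base R sketch, not the package's implementation: the depmap IDs and expression values are made up, and only the column names gene_name, depmap_id and rna_expression are taken from the example for this function.

```r
# Toy long-format table; IDs and values are hypothetical
ccle_long <- data.frame(
  gene_name = rep(c("TP53", "KRAS"), each = 3),
  depmap_id = rep(c("ACH-000001", "ACH-000002", "ACH-000003"), times = 2),
  rna_expression = c(1.5, 2.0, 0.3, 4.1, 0.0, 2.2)
)

# One row per gene, one column per depmap ID; each cell holds the single
# expression value recorded for that gene/sample pair
wide <- tapply(
  ccle_long$rna_expression,
  list(ccle_long$gene_name, ccle_long$depmap_id),
  FUN = function(x) x[1]
)

wide["KRAS", "ACH-000002"]
```

The result is a numeric matrix whose row names are gene names and whose column names are depmap IDs, which is the shape the function is documented to return.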
A test DGEList object with RNA-seq RSEM quantified TPM data of 5 CRC cell line samples from CCLE depmap::depmap_TPM().
data(ccle_crc_5)
A DGEList of 19177 genes * 5 samples.
DGEList
Standard DE analysis by using edgeR and limma::voom pipeline
de_analysis( dge, group_col, target_group, normalize = TRUE, group = FALSE, filter = c(10, 10), plot = FALSE, lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", slot = "counts", batch = NULL, summary = TRUE, ... )
dge |
DGEList object for DE analysis, including expr and samples info |
group_col |
character, column name of coldata to specify the DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
normalize |
logical, if the expr in data is raw counts needs to be normalized |
group |
logical, TRUE to separate samples into only 2 groups: 'target_group' and 'Others'; FALSE to set each level as a group |
filter |
a vector of 2 numbers, filter condition to remove low expression genes, the 1st for min.counts (if normalize = TRUE) or CPM/TPM (if normalize = FALSE), the 2nd for samples size 'large.n' |
plot |
logical, if to make plots to show QC before and after filtration |
lfc |
num, cutoff of logFC for DE analysis |
p |
num, cutoff of p value for DE analysis and permutation test if feature_selection = "rankproduct" |
markers |
vector, a vector of gene names, listed the gene symbols to be kept anyway after filtration. Default 'NULL' means no special genes need to be kept. |
gene_id |
character, specify the gene ID type of rownames of expression data when markers is not NULL, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
slot |
character, specify which slot to use for DGEList, default 'counts' |
batch |
vector of character, column name(s) of coldata to be treated as batch effect factor, default NULL |
summary |
logical, if to show the summary of DE analysis |
... |
omitted |
MArrayLM object generated by limma::treat()
data("im_data_6")
dge <- edgeR::DGEList(
  counts = Biobase::exprs(im_data_6),
  samples = Biobase::pData(im_data_6)
)
de_analysis(dge, group_col = "celltype.ch1", target_group = "NK")
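For orientation, the filter, normalize, voom and treat sequence that this entry names can be sketched directly with edgeR and limma. This is an illustration on simulated counts under assumed settings, not the body of de_analysis(): the group labels, the simulated data and the mapping of filter = c(10, 10) onto the min.count and large.n arguments of edgeR::filterByExpr() are assumptions made for the sketch.

```r
library(edgeR)
library(limma)

set.seed(42)
# Simulated counts: 200 genes x 6 samples in two hypothetical groups
counts <- matrix(
  rnbinom(200 * 6, mu = 50, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("s", 1:6))
)
group <- factor(rep(c("NK", "Others"), each = 3))

dge <- DGEList(counts = counts, group = group)

# Remove lowly expressed genes, then apply TMM normalization
keep <- filterByExpr(dge, group = group, min.count = 10, large.n = 10)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge, method = "TMM")

# One coefficient per group, then the target-vs-rest contrast
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
contrast <- makeContrasts(NK - Others, levels = design)

# voom weights, linear fit, contrast, and moderated test against lfc
vfit <- voom(dge, design)
fit <- lmFit(vfit, design)
fit <- contrasts.fit(fit, contrast)
tfit <- treat(fit, lfc = 0)

summary(decideTests(tfit, p.value = 0.05))
```

The object tfit is an MArrayLM, the same class that this entry lists as its return value.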
Return lists of UP and DOWN DEGs based on the intersection or union of comparisons.
DEGs_Group( tfit, lfc = NULL, p = 0.05, assemble = "intersect", Rank = "adj.P.Val", keep.top = NULL, keep.group = NULL, ... )
tfit |
MArrayLM object generated by limma::treat() |
lfc |
num, cutoff of logFC for DE analysis |
p |
num, cutoff of p value for DE analysis |
assemble |
'intersect' or 'union', whether to select intersected or union genes of different comparisons, default 'intersect' |
Rank |
character, the variable for ranking DEGs, can be 'logFC', 'adj.P.Val'..., default 'adj.P.Val' |
keep.top |
NULL or num, whether to keep top n DEGs of specific comparison |
keep.group |
NULL or pattern, specify the top DEGs of which comparison or group to be kept |
... |
omitted |
A list of "UP" and "DOWN" genes
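The difference between the two assemble options can be seen on toy data. This is a base R sketch with hypothetical comparison names and gene lists; it shows only the set logic, not how DEGs_Group() extracts or ranks genes.

```r
# Hypothetical up-regulated genes from three pairwise comparisons
degs_up <- list(
  "NK-B"    = c("GNLY", "NKG7", "KLRD1", "PRF1"),
  "NK-CD4"  = c("GNLY", "NKG7", "KLRD1"),
  "NK-Mono" = c("NKG7", "GNLY", "CD247")
)

# assemble = "intersect": genes significant in every comparison
up_intersect <- Reduce(intersect, degs_up)

# assemble = "union": genes significant in at least one comparison
up_union <- Reduce(union, degs_up)

up_intersect
up_union
```

The intersection is the stricter choice and shrinks as comparisons are added, while the union grows.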
Return lists of UP and DOWN DEGs based on Rank Product.
DEGs_RP( tfit, lfc = NULL, p = 0.05, assemble = "intersect", Rank = "adj.P.Val", nperm = 1e+05, thres = 0.05, keep.top = NULL, keep.group = NULL, ... )
tfit |
MArrayLM object generated by limma::treat() |
lfc |
num, cutoff of logFC for DE analysis |
p |
num, cutoff of p value for DE analysis |
assemble |
'intersect' or 'union', whether to select intersected or union genes of different comparisons, default 'intersect' |
Rank |
character, the variable for ranking DEGs, can be 'logFC', 'adj.P.Val'..., default 'adj.P.Val' |
nperm |
num, permutation runs of simulating the distribution |
thres |
num, cutoff for rank product permutation test if feature_selection = "rankproduct", default 0.05 |
keep.top |
NULL or num, whether to keep top n DEGs of specific comparison |
keep.group |
NULL or pattern, specify the top DEGs of which comparison or group to be kept |
... |
omitted |
A list of "UP" and "DOWN" genes
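The general idea of a rank product with a permutation test can be sketched in base R. This is a generic illustration on made-up p values and should not be read as the computation inside DEGs_RP(): the geometric-mean form of the rank product, the pooled null distribution and the small nperm are all assumptions of the sketch.

```r
set.seed(1)

# Hypothetical p values for 6 genes in 3 comparisons
pvals <- matrix(
  c(0.001, 0.002, 0.003,
    0.010, 0.020, 0.005,
    0.200, 0.300, 0.100,
    0.500, 0.400, 0.600,
    0.700, 0.800, 0.900,
    0.050, 0.900, 0.040),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("cmp", 1:3))
)

# Rank genes within each comparison (rank 1 = smallest p value)
ranks <- apply(pvals, 2, rank)

# Rank product per gene, taken here as the geometric mean of its ranks
geo_mean <- function(r) prod(r)^(1 / length(r))
rp <- apply(ranks, 1, geo_mean)

# Null distribution: shuffle ranks independently within each comparison
nperm <- 1000
null_rp <- replicate(nperm, apply(apply(ranks, 2, sample), 1, geo_mean))

# Permutation p value: share of null rank products at least as small
p_rp <- vapply(rp, function(x) mean(null_rp <= x), numeric(1))

sort(rp)
p_rp
```

A gene ranked near the top in every comparison receives a small rank product and therefore a small permutation p value, which is then compared against a threshold such as thres.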
Identify the signature of the subset matching 'target_group' against the other subsets. When the input is a list of datasets, "union", "intersect" or "RRA" can be specified to integrate the signatures into one.
filter_subset_sig( data, group_col, target_group, markers = NULL, normalize = TRUE, dir = "UP", gene_id = "SYMBOL", feature_selection = c("auto", "rankproduct", "none"), comb = union, filter = c(10, 10), s_thres = 0.05, ... ) ## S4 method for signature 'list' filter_subset_sig( data, group_col, target_group, markers = NULL, normalize = TRUE, dir = "UP", gene_id = "SYMBOL", feature_selection = c("auto", "rankproduct", "none"), comb = union, filter = c(10, 10), s_thres = 0.05, slot = "counts", batch = NULL, ... ) ## S4 method for signature 'DGEList' filter_subset_sig( data, group_col, target_group, markers = NULL, normalize = TRUE, dir = "UP", gene_id = "SYMBOL", feature_selection = c("auto", "rankproduct", "none"), comb = union, filter = c(10, 10), s_thres = 0.05, ... ) ## S4 method for signature 'ANY' filter_subset_sig( data, group_col, target_group, markers = NULL, normalize = TRUE, dir = "UP", gene_id = "SYMBOL", feature_selection = c("auto", "rankproduct", "none"), comb = union, filter = c(10, 10), s_thres = 0.05, ... )
data |
An expression data or a list of expression data objects |
group_col |
vector or character, specify the group factor or column name of coldata for DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
markers |
vector, a vector of gene names, listed the gene symbols to be kept anyway after filtration. Default 'NULL' means no special genes need to be kept. |
normalize |
logical, if the expr in data is raw counts needs to be normalized |
dir |
character, could be 'UP' or 'DOWN' to use only up- or down-expressed genes |
gene_id |
character, specify the gene ID type of rownames of expression data when markers is not NULL, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
feature_selection |
one of "auto" (default), "rankproduct" or "none", whether to use rank product to select DEGs from multiple comparisons of DE analysis; 'auto' uses 'rankproduct' but changes to 'none' if final genes < 5 for both UP and DOWN |
comb |
'RRA' or a function for combining signatures from multiple datasets, keeping all passing genes (union) or only intersected genes (intersect) |
filter |
(list of) vector of 2 numbers, filter condition to remove low expression genes, the 1st for min.counts (if normalize = TRUE) or CPM/TPM (if normalize = FALSE), the 2nd for samples size 'large.n' |
s_thres |
num, threshold of score if comb = 'RRA' |
... |
other params for |
slot |
character, specify which slot to use only for DGEList, sce or seurat object, optional, default 'counts' |
batch |
vector of character, column name(s) of coldata to be treated as batch effect factor, default NULL |
a vector of gene symbols
data("im_data_6", "nk_markers")
sigs <- filter_subset_sig(im_data_6, "celltype:ch1", "NK",
  markers = nk_markers$HGNC_Symbol,
  gene_id = "ENSEMBL"
)
This function uses edgeR and limma to get lists of DE analysis results for multiple comparisons. It filters out lowly expressed genes, obtains DE statistics by using limma::voom and limma::treat, and creates an object proc_data to store the processed data.
get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'DGEList,character,character' get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'matrix,vector,character' get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'Matrix,vector,character' get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'ExpressionSet,character,character' get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'SummarizedExperiment,character,character' get_de_table(data, group_col, target_group, slot = "counts", ...) ## S4 method for signature 'Seurat,character,character' get_de_table(data, group_col, target_group, slot = "counts", ...)
data |
expression object |
group_col |
vector or character, specify the group factor or column name of coldata for DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
slot |
character, specify which slot to use only for DGEList, sce or seurat object, optional, default 'counts' |
... |
params for function |
A list of DE result table of all comparisons.
data("im_data_6")
DE_tables <- get_de_table(im_data_6, group_col = "celltype:ch1", target_group = "NK")
This function uses edgeR and limma to get 'UP' and 'DOWN' DEG lists. For multiple comparisons, DEGs can be obtained from the intersection of all DEGs or by using the product of p value ranks across comparisons. It filters out lowly expressed genes, extracts DE genes by using limma::voom and limma::treat, and creates an object proc_data to store the processed data.
get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'DGEList,character,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'matrix,vector,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'Matrix,vector,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'ExpressionSet,character,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'SummarizedExperiment,character,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... ) ## S4 method for signature 'Seurat,character,character' get_degs( data, group_col, target_group, normalize = TRUE, feature_selection = c("auto", "rankproduct", "none"), slot = "counts", batch = NULL, ... )
data |
expression object |
group_col |
vector or character, specify the group factor or column name of coldata for DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
normalize |
logical, if the expr in data is raw counts needs to be normalized |
feature_selection |
one of "auto" (default), "rankproduct" or "none", whether to use rank product to select DEGs from multiple comparisons of DE analysis; 'auto' uses 'rankproduct' but changes to 'none' if final genes < 5 for both UP and DOWN |
slot |
character, specify which slot to use only for DGEList, sce or seurat object, optional, default 'counts' |
batch |
vector of column name(s) or dataframe, specify the batch effect factor(s), default NULL |
... |
params for |
A list of 'UP', 'DOWN' gene set of all differentially expressed genes, and a DGEList 'proc_data' containing data after process (filtration, normalization and voom fit). Both 'UP' and 'DOWN' are ordered by rank product or 'Rank' variable if keep.top is NULL
data("im_data_6")
DEGs <- get_degs(im_data_6,
  group_col = "celltype:ch1",
  target_group = "NK", gene_id = "ENSEMBL"
)
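The fallback that feature_selection = "auto" is described as applying (rank product first, then no rank product if fewer than 5 genes remain for both UP and DOWN) can be written as a small decision rule. The helper select_degs() below is hypothetical and exists only for this sketch; it is one reading of the argument description, not code from the package.

```r
# Hypothetical helper: keep the rank product result unless both the
# UP and DOWN lists contain fewer than min_genes genes
select_degs <- function(rp_degs, plain_degs, min_genes = 5) {
  too_few <- length(rp_degs$UP) < min_genes &&
    length(rp_degs$DOWN) < min_genes
  if (too_few) plain_degs else rp_degs
}

# Toy inputs: the rank product result is too small in both directions
rp_degs <- list(UP = c("g1", "g2"), DOWN = character(0))
plain_degs <- list(UP = paste0("g", 1:8), DOWN = paste0("g", 9:12))

res <- select_degs(rp_degs, plain_degs)
lengths(res)
```

Under this reading, a rank product result with at least 5 genes in either direction is kept as it is.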
Collect gene sets from MSigDB or a given GeneSetCollection whose gene-set names match the given regex pattern, using the grep() function. By setting cat and subcat, matching can be constrained to the union of the given categories and subcategories if gsc = 'msigdb'.
get_gsc_sig( gsc = "msigdb", pattern, cat = NULL, subcat = NULL, species = c("hs", "mm"), id = c("SYM", "EZID"), version = msigdb::getMsigdbVersions(), ... ) ## S4 method for signature 'GeneSetCollection,character' get_gsc_sig( gsc = "msigdb", pattern, cat = NULL, subcat = NULL, species = c("hs", "mm"), id = c("SYM", "EZID"), version = msigdb::getMsigdbVersions(), ... ) ## S4 method for signature 'character,character' get_gsc_sig( gsc = "msigdb", pattern, cat = NULL, subcat = NULL, species = c("hs", "mm"), id = c("SYM", "EZID"), version = msigdb::getMsigdbVersions(), ... )
gsc |
'msigdb' or GeneSetCollection to be searched |
pattern |
pattern passed to grep() |
cat |
character, stating the category(s) to be retrieved.
The category(s) must be one from |
subcat |
character, stating the sub-category(s) to be retrieved.
The sub-category(s) must be one from
|
species |
character, species of interest, can be 'hs' or 'mm' |
id |
a character, representing the ID type to use ("SYM" for gene SYMBOLs and "EZID" for ENTREZ IDs) |
version |
a character, stating the version of MSigDB to be retrieved
(should be >= 7.2). See msigdb::getMsigdbVersions() |
... |
params for |
A GeneSet object containing all matched gene-sets in MSigDB
data("msigdb_gobp_nk")
get_gsc_sig(
  gsc = msigdb_gobp_nk,
  pattern = "natural_killer_cell_mediated",
  subcat = "GO:BP",
  ignore.case = TRUE
)
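Because the matching is described as a grep() on gene-set names, its behaviour can be previewed on a plain named list. This sketch uses made-up set names and placeholder genes and does not touch MSigDB or GSEABase objects.

```r
# Toy gene sets keyed by set name; names and genes are placeholders
gene_sets <- list(
  GOBP_NATURAL_KILLER_CELL_MEDIATED_IMMUNITY = c("geneA", "geneB"),
  GOBP_T_CELL_ACTIVATION = c("geneC", "geneD"),
  GOBP_NATURAL_KILLER_CELL_ACTIVATION = c("geneA", "geneE")
)

# Case-insensitive regex match on the set names, as in the example above
hits <- grep("natural_killer_cell_mediated", names(gene_sets),
  ignore.case = TRUE, value = TRUE
)

matched <- gene_sets[hits]
names(matched)
```

Only the first set is kept: the pattern requires the full phrase "natural_killer_cell_mediated", so a set that merely contains "NATURAL_KILLER_CELL" is not matched.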
Extract markers for subsets matching the given pattern from LM7/LM22 and save the matched genes in a 'GeneSet' class object. If both patterns are provided, the output is a 'GeneSetCollection' class object with setName: LM7, LM22.
get_lm_sig(lm7.pattern, lm22.pattern, ...)
lm7.pattern |
character string containing a regular expression, to be matched in the given subsets in LM7 |
lm22.pattern |
character string containing a regular expression, to be matched in the given subsets in LM22 |
... |
params for function |
A GeneSet or GeneSetCollection for matched subsets in LM7 and/or LM22
data("lm7", "lm22")
get_lm_sig(lm7.pattern = "NK", lm22.pattern = "NK cells")
Extract specific immune subset markers for 'Hs' or 'Mm', the markers are retrieved from up-to-date PanglaoDB website.
get_panglao_sig(type, species = c("Hs", "Mm", "Mm Hs"))
type |
character vector, cell type name(s) of interest,
available subsets could be listed by list_panglao_types() |
species |
character, default 'Hs', could be 'Hs', 'Mm' or 'Mm Hs', specify the species of interest |
a 'GeneSet' class object containing genes of given type(s)
get_panglao_sig(type = "NK cells")
get_panglao_sig(type = c("NK cells", "T cells"))
Convert gene-set list into GeneSetCollection
gls2gsc(...) ## S4 method for signature 'list' gls2gsc(...) ## S4 method for signature 'vector' gls2gsc(...)
... |
vector of genes or list of genes |
GeneSetCollection
data("msigdb_gobp_nk")
gls2gsc(GSEABase::geneIds(msigdb_gobp_nk[1:3]))
Plot upset diagram for overlapping genes among given gene-sets.
gsc_plot(...)
... |
GeneSet or GeneSetCollection |
upset plot object
data("msigdb_gobp_nk")
gsc_plot(msigdb_gobp_nk[1:3])
An ExpressionSet object containing 6 immune subsets (B-cells, CD4, CD8, Monocytes, Neutrophils, NK) from healthy individuals.
data(im_data_6)
An ExpressionSet object of 6*4 samples.
ExpressionSet
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60424
Show the name of organs available in PanglaoDB. Help users know which organs could be retrieved by PanglaoDB.
list_panglao_organs()
a vector of available organ types or cell types in PanglaoDB
list_panglao_organs()
Show the name and number of each cell type in PanglaoDB. Help users know which subset(s) marker list(s) could be retrieved by PanglaoDB.
list_panglao_types(organ)
organ |
character, specify the tissue or organ label to list cell types |
a vector of available cell types of the organ in PanglaoDB
list_panglao_types(organ = "Immune system")
A dataset containing the expression of 547 marker genes in 22 immune subsets, which was generated for CIBERSORT.
data(lm22)
A data frame with 547 rows and 23 variables:
gene symbols
0 or 1, represents if the gene is significantly up-regulated in the subset (one such variable for each of the 22 immune subsets)
data frame
https://cibersort.stanford.edu/
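Given a table laid out as described (one gene column plus one 0/1 column per subset), markers for subsets whose names match a pattern can be pulled out with a few lines of base R. The column names, genes and values below are invented for the sketch and are not copied from the lm22 data; the approach illustrates pattern-based extraction in general rather than the get_lm_sig() source.

```r
# Toy signature table in the same shape: gene symbols plus 0/1 flags
sig <- data.frame(
  Gene = c("GNLY", "CD19", "KLRF1", "CD3E"),
  `NK cells resting` = c(1, 0, 1, 0),
  `B cells naive` = c(0, 1, 0, 0),
  `T cells CD8` = c(1, 0, 0, 1),
  check.names = FALSE
)

# Subset columns whose names match the regular expression
cols <- grep("NK cells", colnames(sig), value = TRUE)

# Genes flagged as up-regulated (1) in at least one matched subset
nk_genes <- sig$Gene[rowSums(sig[, cols, drop = FALSE] == 1) > 0]
nk_genes
```

Setting check.names = FALSE keeps the spaces in the subset names so that a pattern such as "NK cells" can match them.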
A dataset containing the expression of 375 marker genes in 7 immune subsets, which was generated for CIBERSORT.
data(lm7)
A data frame with 375 rows and 9 variables:
gene symbols
immune subset of the marker gene
gene median expression in B cells
gene median expression in T CD4 cells
gene median expression in T CD8 cells
gene median expression in T gamma delta cells
gene median expression in NK cells
gene median expression in MoMaDC cells
gene median expression in granulocytes
data frame
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5384348/
Merge markers collected from different DBs into one 'GeneSet' object. A data.frame is saved in JSON format under longDescription, with 'TRUE' and '-' indicating which DB each gene is from; this can be shown via jsonlite::fromJSON().
merge_markers(...)
... |
GeneSet or GeneSetCollection object to be merged |
A GeneSet class of union genes in the given list
data("msigdb_gobp_nk")
Markers <- merge_markers(msigdb_gobp_nk[1:3])
jsonlite::fromJSON(GSEABase::longDescription(Markers))
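The provenance table that this entry describes ('TRUE' or '-' per source, stored as JSON) can be mocked up with jsonlite alone. The source names and genes below are toy values, and the exact layout that merge_markers() writes to longDescription is not reproduced here; the sketch shows only the union, the membership flags and the JSON round trip.

```r
library(jsonlite)

# Toy marker lists from three hypothetical sources
sets <- list(
  LM7 = c("GNLY", "KLRF1"),
  LM22 = c("GNLY", "NCR1"),
  PanglaoDB = c("NCR1", "KLRD1")
)

# Union of all genes, then one "TRUE"/"-" column per source
genes <- Reduce(union, sets)
membership <- data.frame(
  Gene = genes,
  lapply(sets, function(s) ifelse(genes %in% s, "TRUE", "-")),
  stringsAsFactors = FALSE
)

# Serialize to JSON and read it back as a data frame
json <- toJSON(membership)
back <- fromJSON(json)
back
```

Storing the table as a JSON string lets it travel inside a single character field and be restored to a data frame on demand.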
A small GeneSetCollection object, contains gene sets with gene set name matched to 'NATURAL_KILLER' from GO:BP MSigDB v7.4 database.
data(msigdb_gobp_nk)
A GeneSetCollection of 55 gene sets.
GeneSetCollection
A dataset containing 114 NK cell markers from LM22, LM7 and human orthologs in mice.
data(nk_markers)
A data frame with 114 rows and at least 4 variables:
gene symbols
if included in LM22
if included in LM7
if included in orthologs
data frame
https://cancerimmunolres.aacrjournals.org/content/7/7/1162.long
Make a matrix plot of PCA with top PCs
pca_matrix_plot( data, features = "all", slot = "counts", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'matrix' pca_matrix_plot( data, features = "all", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'Matrix' pca_matrix_plot( data, features = "all", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'data.frame' pca_matrix_plot( data, features = "all", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'ExpressionSet' pca_matrix_plot( data, features = "all", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'DGEList' pca_matrix_plot( data, features = "all", slot = "counts", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'SummarizedExperiment' pca_matrix_plot( data, features = "all", slot = "counts", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" ) ## S4 method for signature 'Seurat' pca_matrix_plot( data, features = "all", slot = "counts", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" )
data |
expression data, can be matrix, eSet, seurat... |
features |
vector of gene symbols or 'all', specify the genes used for PCA, default 'all' |
slot |
character, specify the slot name of expression to be used, optional |
group_by |
character, specify the column to be grouped and colored, default NULL |
scale |
logical, if to scale data for PCA, default TRUE |
n |
num, specify top n PCs to plot |
loading |
logical, if to plot and label loadings of PCA, default 'FALSE' |
n_loadings |
num, top n loadings to plot; or a vector of gene IDs;
only works when loading = TRUE |
gene_id |
character, specify which column of IDs used to calculate TPM, also indicate the ID type of expression data's rowname, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
matrix plot of PCA
data("im_data_6")
pca_matrix_plot(data = im_data_6, scale = FALSE)
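A matrix of pairwise plots for the top n principal components can be approximated with base R, which helps when checking what the scale and n arguments control. This sketch uses simulated expression values and base graphics (stats::prcomp() and pairs()); it is not the plotting code of pca_matrix_plot(), and the group labels are invented.

```r
set.seed(123)

# Simulated expression: 100 genes x 8 samples in two hypothetical groups
expr <- matrix(
  rnorm(100 * 8),
  nrow = 100,
  dimnames = list(paste0("gene", 1:100), paste0("s", 1:8))
)
group <- factor(rep(c("NK", "Others"), each = 4))

# PCA on samples (rows = samples after transposing), with scaling
pca <- prcomp(t(expr), scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Pairwise scatter plots of the top n PCs, colored by group
n <- 4
pairs(
  pca$x[, seq_len(n)],
  col = as.integer(group), pch = 19,
  labels = sprintf("PC%d (%.1f%%)", seq_len(n), 100 * var_explained[seq_len(n)])
)
```

Each off-diagonal panel shows one pair of components, so n = 4 gives a 4 by 4 grid.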
Make a matrix plot of PCA with top PCs
pca_matrix_plot_init( data, features = "all", group_by = NULL, scale = TRUE, n = 4, loading = FALSE, n_loadings = 10, gene_id = "SYMBOL" )
data |
expression matrix |
features |
vector of gene symbols or 'all', specify the genes used for PCA, default 'all' |
group_by |
character, specify the column to be grouped and colored, default NULL |
scale |
logical, if to scale data for PCA, default TRUE |
n |
num, specify top n PCs to plot |
loading |
logical, if to plot and label loadings of PCA, default 'FALSE' |
n_loadings |
num, top n loadings to plot; or a vector of gene IDs;
only works when loading = TRUE |
gene_id |
character, specify which column of IDs used to calculate TPM, also indicate the ID type of expression data's rowname, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
matrix plot of PCA
Plot diagnostics before and after process_data()
plot_diagnostics(expr1, expr2, group_col, abl = 2)
expr1 |
expression matrix 1 for original data |
expr2 |
expression matrix 2 for processed data |
group_col |
vector of group of samples |
abl |
num, cutoff line |
multiple plots
data("im_data_6")
dge <- edgeR::DGEList(
  counts = Biobase::exprs(im_data_6),
  samples = Biobase::pData(im_data_6)
)
dge$logCPM <- edgeR::cpm(dge, log = TRUE)
proc_data <- process_data(dge,
  group_col = "celltype.ch1",
  target_group = "NK"
)
plot_diagnostics(proc_data$logCPM, proc_data$vfit$E,
  group_col = proc_data$samples$group
)
Plot the mean-variance trend after voom and after the final linear fit
plot_mean_var(proc_data, span = 0.5)
proc_data |
processed data returned by process_data() |
span |
num, span for |
comparison plot of mean-variance of voom and final model
data("im_data_6")
proc_data <- process_data(
  im_data_6,
  group_col = "celltype:ch1",
  target_group = "NK"
)
plot_mean_var(proc_data)
Single PCA plot function
plotPCAbiplot( prcomp, loading = FALSE, n_loadings = 10, dims = c(1, 2), group_by = NULL )
prcomp |
prcomp object generated by stats::prcomp() |
loading |
logical, if to plot and label loadings of PCA, default 'FALSE' |
n_loadings |
num, top n loadings to plot; or a vector of gene IDs;
only works when loading = TRUE |
dims |
a vector of 2 elements, specifying PCs to plot |
group_by |
character, specify the column to be grouped and colored, default NULL |
ggplot of PCA
Filter low expression genes, normalize data by 'TMM' and apply limma::voom(), limma::lmFit() and limma::treat() on normalized data
process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", slot = "counts", ... ) ## S4 method for signature 'DGEList,character,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", slot = "counts", ... ) ## S4 method for signature 'matrix,vector,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", batch = NULL, ... ) ## S4 method for signature 'Matrix,vector,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", batch = NULL, ... ) ## S4 method for signature 'ExpressionSet,character,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", batch = NULL, ... ) ## S4 method for signature 'SummarizedExperiment,character,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", slot = "counts", batch = NULL, ... ) ## S4 method for signature 'Seurat,character,character' process_data( data, group_col, target_group, normalize = TRUE, filter = c(10, 10), lfc = 0, p = 0.05, markers = NULL, gene_id = "SYMBOL", slot = "counts", batch = NULL, ... )
data |
expression object |
group_col |
character, column name of coldata to specify the DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
normalize |
logical, whether the expression in data is raw counts that need to be normalized |
filter |
a vector of 2 numbers, filter condition to remove low expression genes, the 1st for min.counts (if normalize = TRUE) or CPM/TPM (if normalize = FALSE), the 2nd for samples size 'large.n' |
lfc |
num, cutoff of logFC for DE analysis |
p |
num, cutoff of p value for DE analysis and permutation test if feature_selection = "rankproduct" |
markers |
vector of gene names listing the gene symbols to be kept regardless of filtration. Default 'NULL' means no genes are specifically kept. |
gene_id |
character, specify the gene ID type of rownames of expression data when markers is not NULL, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
slot |
character, specify which slot to use only for DGEList, sce or seurat object, optional, default 'counts' |
... |
params for |
batch |
vector of character, column name(s) of coldata to be treated as batch effect factor, default NULL |
A DGEList containing vfit by limma::voom() (if normalize = TRUE) and tfit by limma::treat()
data("im_data_6")
proc_data <- process_data(
  im_data_6,
  group_col = "celltype:ch1",
  target_group = "NK"
)
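The `filter` argument pairs an expression threshold with a sample count. As a rough illustration only, the base R sketch below keeps a gene when it reaches the threshold in at least the given number of samples, and always keeps genes listed in `markers`. `keep_expressed()` is a hypothetical helper, not part of mastR; the package's own filtering may derive the required sample count differently.

```r
# Hypothetical helper (not part of mastR): a simplified low-expression filter.
# Assumption: filter[1] is an expression threshold and filter[2] the minimum
# number of samples that must reach it.
keep_expressed <- function(expr, filter = c(10, 10), markers = NULL) {
  stopifnot(is.matrix(expr), length(filter) == 2)
  # never require more samples than the matrix actually has
  n_needed <- min(filter[2], ncol(expr))
  pass <- rowSums(expr >= filter[1]) >= n_needed
  # genes listed in `markers` are kept even when they fail the threshold
  pass | rownames(expr) %in% markers
}

expr <- rbind(
  g1 = c(50, 60, 55, 70), # high in every sample
  g2 = c(0, 0, 0, 0),     # not expressed, but protected as a marker
  g3 = c(12, 11, 0, 0)    # reaches the threshold in only two samples
)
kept <- keep_expressed(expr, filter = c(10, 3), markers = "g2")
print(kept)
```

Here `g1` passes on expression, `g2` is retained only because it is listed in `markers`, and `g3` is removed.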
Gather cells into pools according to the specified factors, then randomly assign the cells from each pool to pseudo-samples of randomized size. (min.cells <= size <= max.cells)
pseudo_sample_list(data, by, min.cells = 0, max.cells = Inf)
data |
matrix or data.frame or other single cell expression object |
by |
a vector or data.frame containing the factor(s) for aggregation |
min.cells |
num, default 0, the minimum number of cells aggregated into each pseudo-sample |
max.cells |
num, default Inf, the maximum number of cells aggregated into each pseudo-sample |
A list of cell names for each pseudo-sample
counts <- matrix(abs(rnorm(10000, 10, 10)), 100)
rownames(counts) <- 1:100
colnames(counts) <- 1:100
meta <- data.frame(
  subset = rep(c("A", "B"), 50),
  level = rep(1:4, each = 25)
)
rownames(meta) <- 1:100
scRNA <- SeuratObject::CreateSeuratObject(counts = counts, meta.data = meta)
pseudo_sample_list(scRNA,
  by = c("subset", "level"),
  min.cells = 10, max.cells = 20
)
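The pooling and random assignment described for pseudo_sample_list can be sketched in base R as follows. `make_pseudo_list()` is a hypothetical helper and not the package's implementation; in particular, dropping leftover cells that cannot fill a pseudo-sample of `min.cells` is an assumption of this sketch.

```r
# Hypothetical helper (not mastR's implementation): split cells into
# pseudo-samples of random size within [min.cells, max.cells] per group.
# Assumption: leftover cells that cannot fill a pseudo-sample are dropped.
make_pseudo_list <- function(cells, by, min.cells = 0, max.cells = Inf) {
  lower <- max(min.cells, 1)
  stopifnot(max.cells >= lower)
  # one pool of cells per combination of the grouping factors
  pools <- split(cells, interaction(by, drop = TRUE))
  out <- list()
  for (grp in names(pools)) {
    pool <- pools[[grp]]
    pool <- pool[sample.int(length(pool))] # shuffle the pool
    i <- 0
    while (length(pool) >= lower) {
      upper <- min(max.cells, length(pool))
      # draw a size uniformly from lower..upper
      size <- lower + sample.int(upper - lower + 1, 1) - 1
      i <- i + 1
      out[[paste(grp, i, sep = "_")]] <- pool[seq_len(size)]
      pool <- pool[-seq_len(size)]
    }
  }
  out
}

set.seed(42)
meta <- data.frame(
  subset = rep(c("A", "B"), 50),
  level = rep(1:4, each = 25)
)
cells <- paste0("cell", 1:100)
ps <- make_pseudo_list(cells, meta, min.cells = 10, max.cells = 20)
print(lengths(ps))
```

With this metadata each of the 8 factor combinations holds 12 or 13 cells, so each pool yields a single pseudo-sample of 10 to 13 cells.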
Gather cells for each group according to the specified factors, then randomly assign and aggregate the cells into pseudo-samples of randomized size. (min.cells <= size <= max.cells)
pseudo_samples( data, by, fun = c("sum", "mean"), scale = NULL, min.cells = 0, max.cells = Inf, slot = "counts" ) ## S4 method for signature 'matrix,data.frame' pseudo_samples( data, by, fun = c("sum", "mean"), scale = NULL, min.cells = 0, max.cells = Inf, slot = "counts" ) ## S4 method for signature 'matrix,vector' pseudo_samples( data, by, fun = c("sum", "mean"), scale = NULL, min.cells = 0, max.cells = Inf, slot = "counts" ) ## S4 method for signature 'Seurat,character' pseudo_samples( data, by, fun = c("sum", "mean"), scale = NULL, min.cells = 0, max.cells = Inf, slot = "counts" ) ## S4 method for signature 'SummarizedExperiment,character' pseudo_samples( data, by, fun = c("sum", "mean"), scale = NULL, min.cells = 0, max.cells = Inf, slot = "counts" )
data |
a matrix or Seurat/SCE object containing expression and metadata |
by |
a vector of group names or dataframe for aggregation |
fun |
chr, methods used to aggregate cells, could be 'sum' or 'mean', default 'sum' |
scale |
a num or NULL, optional scale factor by which the average expression is multiplied |
min.cells |
num, default 0, the minimum number of cells aggregated into each pseudo-sample |
max.cells |
num, default Inf, the maximum number of cells aggregated into each pseudo-sample |
slot |
chr, specify which slot of seurat object to aggregate, can be 'counts', 'data', 'scale.data'..., default is 'counts' |
An expression matrix after aggregating cells on specified factors
counts <- matrix(abs(rnorm(10000, 10, 10)), 100)
rownames(counts) <- 1:100
colnames(counts) <- 1:100
meta <- data.frame(
  subset = rep(c("A", "B"), 50),
  level = rep(1:4, each = 25)
)
rownames(meta) <- 1:100
scRNA <- SeuratObject::CreateSeuratObject(counts = counts, meta.data = meta)
pseudo_samples(scRNA,
  by = c("subset", "level"),
  min.cells = 10, max.cells = 20
)
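The aggregation step can be illustrated with a small base R sketch: given an expression matrix and a list of cell names per pseudo-sample, sum or average the corresponding columns. `aggregate_cells()` is a hypothetical helper, and applying `scale` only to the mean is this sketch's reading of the argument description, not confirmed package behavior.

```r
# Hypothetical helper (not mastR's implementation): aggregate the columns of
# an expression matrix for each pseudo-sample.
# Assumption: `scale` is only applied when averaging.
aggregate_cells <- function(expr, cell_list, fun = c("sum", "mean"), scale = NULL) {
  fun <- match.arg(fun)
  agg <- if (fun == "sum") rowSums else rowMeans
  # one column per pseudo-sample, one row per gene (needs at least 2 genes)
  out <- vapply(
    cell_list,
    function(cells) agg(expr[, cells, drop = FALSE]),
    numeric(nrow(expr))
  )
  if (fun == "mean" && !is.null(scale)) out <- out * scale
  out
}

expr <- matrix(1:18,
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), paste0("c", 1:6))
)
cell_list <- list(ps1 = c("c1", "c2", "c3"), ps2 = c("c4", "c5", "c6"))
summed <- aggregate_cells(expr, cell_list, fun = "sum")
averaged <- aggregate_cells(expr, cell_list, fun = "mean")
print(summed)
```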
Specify signatures against specific tissues or cell lines by removing genes with high expression in the background.
remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'matrix,matrix,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'DGEList,matrix,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,DGEList,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,ExpressionSet,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,SummarizedExperiment,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,Seurat,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, 
b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,character,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,missing,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL ) ## S4 method for signature 'ANY,ANY,vector' remove_bg_exp( sig_data, bg_data = "CCLE", markers, s_group_col = NULL, s_target_group = NULL, b_group_col = NULL, b_target_group = NULL, snr = 1, ..., filter = NULL, gene_id = "SYMBOL", s_slot = "counts", b_slot = "counts", ccle_tpm = NULL, ccle_meta = NULL )
sig_data |
log-transformed expression object, can be matrix or DGEList, as signal data |
bg_data |
'CCLE' or log-transformed expression object as background data |
markers |
vector of gene names listing the gene symbols to be filtered. Must be gene SYMBOLs |
s_group_col |
vector or character, the group labels of the signal samples, or the column name of the groups, default NULL |
s_target_group |
pattern, specify the target group of interest in sig_data, default NULL |
b_group_col |
vector or character, to specify the group of background
target_groups, or column name of |
b_target_group |
pattern, specify the target group of interest in bg_data, e.g. 'colorectal', default NULL |
snr |
num, the SNR cutoff used to screen for markers that are not expressed or only lowly expressed in bg_data |
... |
params for |
filter |
NULL or a vector of 2 num, filter condition to remove low expression genes in bg_data, the 1st for logcounts, the 2nd for samples size |
gene_id |
character, specify the gene ID type of rownames of expression data, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
s_slot |
character, specify which slot to use of DGEList, sce or seurat object for sig_data, optional, default 'counts' |
b_slot |
character, specify which slot to use of DGEList, sce or seurat object for bg_data, optional, default 'counts' |
ccle_tpm |
ccle_tpm data from |
ccle_meta |
ccle_meta data from |
a vector of genes after filtration
data("im_data_6", "nk_markers", "ccle_crc_5")
remove_bg_exp(
  sig_data = Biobase::exprs(im_data_6),
  bg_data = ccle_crc_5,
  im_data_6$`celltype:ch1`, "NK", ## for sig_data
  "cancer", "CRC", ## for bg_data
  markers = nk_markers$HGNC_Symbol[40:50],
  filter = c(1, 2),
  gene_id = c("ENSEMBL", "SYMBOL")
)
Remove genes that show high signal in the background expression data from the markers.
remove_bg_exp_mat(sig_mat, bg_mat, markers, snr = 1, gene_id = "SYMBOL")
sig_mat |
log-transformed expression matrix of the signal data of interest |
bg_mat |
log-transformed expression matrix of the background data of interest |
markers |
vector, a vector of gene names, listed the gene symbols to be filtered. Must be gene SYMBOLs. |
snr |
num, the SNR cutoff used to screen for markers that are not expressed or only lowly expressed in bg_mat |
gene_id |
character, specify the gene ID types of row names of sig_mat and bg_mat data, could be one of 'ENSEMBL', 'SYMBOL', 'ENTREZ'..., default 'SYMBOL' |
a vector of genes after filtration
data("im_data_6", "nk_markers", "ccle_crc_5")
remove_bg_exp_mat(
  sig_mat = Biobase::exprs(im_data_6),
  bg_mat = ccle_crc_5$counts,
  markers = nk_markers$HGNC_Symbol[30:40],
  gene_id = c("ENSEMBL", "SYMBOL")
)
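The idea of screening markers against a background can be sketched in base R. This reference does not define how SNR is computed, so the sketch assumes a simple stand-in: mean log-expression in the signal data minus mean log-expression in the background data. `snr_filter()` is a hypothetical helper; the package may compute SNR and handle missing genes differently.

```r
# Hypothetical helper (not mastR's implementation).
# Assumption: SNR is taken as mean log-expression in the signal data minus
# mean log-expression in the background data.
# Assumption: markers missing from either matrix are dropped.
snr_filter <- function(sig_mat, bg_mat, markers, snr = 1) {
  genes <- Reduce(intersect, list(markers, rownames(sig_mat), rownames(bg_mat)))
  delta <- rowMeans(sig_mat[genes, , drop = FALSE]) -
    rowMeans(bg_mat[genes, , drop = FALSE])
  # keep markers whose signal exceeds the background by more than the cutoff
  genes[delta > snr]
}

sig_mat <- rbind(g1 = c(8, 9, 7), g2 = c(5, 5, 5), g3 = c(6, 6, 6))
bg_mat <- rbind(g1 = c(1, 1), g2 = c(4, 5))
kept <- snr_filter(sig_mat, bg_mat, markers = c("g1", "g2", "g3"))
print(kept)
```

In this toy input `g2` is removed because it is almost as high in the background as in the signal, and `g3` is dropped because it is absent from the background matrix.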
Select differentially expressed genes (DEGs) from multiple comparisons
select_sig(tfit, feature_selection = c("auto", "rankproduct", "none"), ...)
tfit |
processed tfit by |
feature_selection |
one of "auto" (default), "rankproduct" or "none", whether to use rank product to select DEGs from the multiple comparisons of the DE analysis; 'auto' uses 'rankproduct' but switches to 'none' if fewer than 5 genes remain for both UP and DOWN |
... |
params for |
A GeneSetCollection containing UP and DOWN gene sets
data("im_data_6")
proc_data <- process_data(
  im_data_6,
  group_col = "celltype:ch1",
  target_group = "NK"
)
select_sig(proc_data$tfit)
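For readers unfamiliar with the rank product, the base R sketch below shows the textbook statistic: rank genes within each comparison, then take the geometric mean of each gene's ranks. `rank_product()` and the toy statistics are hypothetical; the package's "rankproduct" option additionally involves a permutation test, which is not reproduced here.

```r
# Hypothetical helper: textbook rank product across comparisons.
# `stat_mat` holds one DE statistic per gene (rows) and comparison (columns);
# larger values mean stronger up-regulation.
rank_product <- function(stat_mat) {
  # rank 1 = most up-regulated gene within each comparison
  ranks <- apply(-stat_mat, 2, rank, ties.method = "average")
  # geometric mean of the ranks of each gene
  exp(rowMeans(log(ranks)))
}

# made-up statistics for four genes in two comparisons
stat_mat <- matrix(
  c(5, 3, 1, -2,
    4, 1, 2, -3),
  ncol = 2,
  dimnames = list(c("gA", "gB", "gC", "gD"), c("NK_vs_B", "NK_vs_T"))
)
rp <- rank_product(stat_mat)
print(rp)
```

Genes that rank consistently near the top of every comparison get the smallest rank product, here `gA`.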
Make boxplot and show expression or score level of signature across subsets.
sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'matrix,vector,vector,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", gene_id = "SYMBOL" ) ## S4 method for signature 'Matrix,vector,vector,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", gene_id = "SYMBOL" ) ## S4 method for signature 'data.frame,vector,vector,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", gene_id = "SYMBOL" ) ## S4 method for signature 'DGEList,vector,character,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'ExpressionSet,vector,character,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", gene_id = "SYMBOL" ) ## S4 method for signature 'Seurat,vector,character,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'SummarizedExperiment,vector,character,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'list,vector,character,character' sig_boxplot( data, sigs, group_col, target_group, type = c("score", "expression"), method = "t.test", slot = "counts", gene_id = "SYMBOL" )
data |
expression data, can be matrix, DGEList, eSet, seurat, sce... |
sigs |
a vector of signature (Symbols) |
group_col |
character or vector, specify the column name to compare in coldata |
target_group |
pattern, specify the group of interest as reference |
type |
one of "score" and "expression", to plot score or expression of the signature |
method |
a character string indicating which method to be used for
|
slot |
character, indicate which slot used as expression, optional |
gene_id |
character, indicate the ID type of the row names of the expression data, could be one of 'ENSEMBL', 'SYMBOL', ... default 'SYMBOL' |
patchwork or ggplot of boxplot
data("im_data_6", "nk_markers")
p <- sig_boxplot(
  im_data_6,
  sigs = nk_markers$HGNC_Symbol[1:30],
  group_col = "celltype:ch1",
  target_group = "NK",
  gene_id = "ENSEMBL"
)
Visualize GSEA results for multiple lists of genes by using clusterProfiler.
sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... ) ## S4 method for signature 'MArrayLM,vector' sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... ) ## S4 method for signature 'MArrayLM,list' sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... ) ## S4 method for signature 'DGEList,ANY' sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... ) ## S4 method for signature 'ANY,ANY' sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... ) ## S4 method for signature 'list,ANY' sig_gseaplot( data, sigs, group_col, target_group, gene_id = "SYMBOL", slot = "counts", method = c("dotplot", "gseaplot"), col = "-log10(p.adjust)", size = "enrichmentScore", pvalue_table = FALSE, digits = 2, rank_stat = "logFC", ... )
data |
expression data, can be matrix, DGEList, eSet, seurat, sce... |
sigs |
a vector of signature (Symbols) or a list of signatures |
group_col |
character or vector, specify the column name to compare in coldata |
target_group |
pattern, specify the group of interest as reference |
gene_id |
character, indicate the ID type of the row names of the expression data, could be one of 'ENSEMBL', 'SYMBOL', ... default 'SYMBOL' |
slot |
character, indicate which slot used as expression, optional |
method |
one of "gseaplot" and "dotplot", how to plot GSEA result |
col |
column name of |
size |
column name of |
pvalue_table |
logical, whether to add a p value table when method = "gseaplot" |
digits |
num, specify the number of significant digits of pvalue table |
rank_stat |
character, specify which metric used to rank for GSEA, default "logFC" |
... |
params for function |
patchwork object for all comparisons
data("im_data_6", "nk_markers")
sig_gseaplot(
  sigs = list(
    A = nk_markers$HGNC_Symbol[1:15],
    B = nk_markers$HGNC_Symbol[20:40],
    C = nk_markers$HGNC_Symbol[60:75]
  ),
  data = im_data_6,
  group_col = "celltype:ch1",
  target_group = "NK",
  gene_id = "ENSEMBL"
)
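The `rank_stat` argument names the statistic used to rank genes for GSEA. As a minimal sketch of what such a ranking looks like, GSEA functions in clusterProfiler take a named numeric vector sorted in decreasing order; the table and values below are made up for illustration, and how the package builds this vector internally is not shown in this reference.

```r
# Hypothetical DE table; gene names and values are made up for illustration.
de_tab <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  logFC = c(-1.2, 2.5, 0.3, -0.1)
)

# named numeric vector sorted in decreasing order of the ranking statistic
gene_list <- sort(setNames(de_tab$logFC, de_tab$gene), decreasing = TRUE)
print(gene_list)
```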
Compare the heatmap before and after screening.
sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'matrix,character,vector,missing' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'matrix,character,vector,vector' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'matrix,list,vector,missing' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'Matrix,ANY,vector,ANY' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'data.frame,ANY,vector,ANY' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'DGEList,ANY,character,ANY' sig_heatmap( data, sigs, group_col, markers, scale = "none", gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'ExpressionSet,ANY,character,ANY' sig_heatmap( data, sigs, group_col, markers, scale = c("none", "row", "column"), gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'Seurat,ANY,character,ANY' sig_heatmap( data, sigs, group_col, markers, scale = "none", gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... ) ## S4 method for signature 'SummarizedExperiment,ANY,character,ANY' sig_heatmap( data, sigs, group_col, markers, scale = "none", gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... 
) ## S4 method for signature 'list,ANY,character,ANY' sig_heatmap( data, sigs, group_col, markers, scale = "none", gene_id = "SYMBOL", ranks_plot = FALSE, slot = "counts", ... )
data |
expression data, can be matrix, DGEList, eSet, seurat, sce... |
sigs |
a vector of signature (Symbols) or a list of signatures |
group_col |
character or vector, specify the column name to compare in coldata |
markers |
a vector of gene names, listed the gene symbols of original markers pool |
scale |
could be one of 'none' (default), 'row' or 'column' |
gene_id |
character, indicate the ID type of the row names of the expression data, could be one of 'ENSEMBL', 'SYMBOL', ... default 'SYMBOL' |
ranks_plot |
logical, whether to use ranks instead of expression of genes to draw the heatmap |
slot |
character, indicate which slot used as expression, optional |
... |
params for |
patchwork object of heatmap
data("im_data_6", "nk_markers")
sig_heatmap(
  data = im_data_6,
  sigs = nk_markers$HGNC_Symbol[1:10],
  group_col = "celltype:ch1",
  gene_id = "ENSEMBL"
)
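The `scale` and `ranks_plot` arguments both change which values end up in the heatmap. The base R sketch below illustrates the three transformations on a toy matrix; the matrix values and variable names are hypothetical, and how mastR computes these internally is an assumption rather than something stated in this manual.

```r
# Toy expression matrix: genes in rows, samples in columns (hypothetical values).
expr <- matrix(
  c(5, 50, 500,
    8, 40, 300,
    2, 90, 100,
    7, 10, 900),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)

# scale = "row": z-score each gene across samples
# (assumed to resemble what a row-scaled heatmap displays).
row_scaled <- t(scale(t(expr)))

# scale = "column": z-score each sample across genes.
col_scaled <- scale(expr)

# ranks_plot = TRUE: replace expression by the within-sample rank of each gene
# (assumption: ranks are taken per sample, higher expression = higher rank).
gene_ranks <- apply(expr, 2, rank)

print(round(row_scaled, 2))
print(gene_ranks)
```

Ranks are insensitive to the overall scale of each sample, which can help when samples come from differently normalized sources.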
Show the rank density of given signature in the given comparison.
sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'matrix,vector,vector' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, gene_id = "SYMBOL" ) ## S4 method for signature 'Matrix,vector,vector' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, gene_id = "SYMBOL" ) ## S4 method for signature 'data.frame,vector,vector' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, gene_id = "SYMBOL" ) ## S4 method for signature 'DGEList,vector,character' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'ExpressionSet,vector,character' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, gene_id = "SYMBOL" ) ## S4 method for signature 'Seurat,vector,character' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'SummarizedExperiment,vector,character' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, slot = "counts", gene_id = "SYMBOL" ) ## S4 method for signature 'list,vector,character' sig_rankdensity_plot( data, sigs, group_col, aggregate = FALSE, slot = "counts", gene_id = "SYMBOL" )
data |
expression data; can be a matrix, DGEList, eSet, Seurat, sce... |
sigs |
a vector of signature (Symbols) |
group_col |
character or vector, specifying the column name in coldata to compare |
aggregate |
logical, whether to aggregate expression according to |
slot |
character, indicating which slot to use as expression; optional |
gene_id |
character, indicating the ID type of the row names of the expression data; could be one of 'ENSEMBL', 'SYMBOL', ...; default 'SYMBOL' |
ggplot or patchwork
data("im_data_6", "nk_markers") sig_rankdensity_plot( data = im_data_6, sigs = nk_markers$HGNC_Symbol[1:10], group_col = "celltype:ch1", gene_id = "ENSEMBL" )
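To make the idea of a rank density concrete, the following base R sketch computes relative within-sample ranks for a simulated signature and estimates their density per group. The simulated data, the variable names, and the choice of `rank / number of genes` as the rank scale are all assumptions for illustration; the manual does not state how mastR computes the ranks.

```r
set.seed(1)

# Simulated expression: 200 genes x 6 samples (hypothetical data).
n_genes <- 200
expr <- matrix(
  rexp(n_genes * 6),
  nrow = n_genes,
  dimnames = list(paste0("g", seq_len(n_genes)), paste0("s", 1:6))
)
group <- rep(c("NK", "Other"), each = 3)

# Hypothetical signature: the first 10 genes, boosted in the NK samples.
sig <- paste0("g", 1:10)
expr[sig, group == "NK"] <- expr[sig, group == "NK"] + 10

# Relative rank of every gene within each sample, scaled to (0, 1].
rel_rank <- apply(expr, 2, rank) / n_genes

# Collect the signature genes' ranks per group; as.vector() runs column by
# column, so each sample's group label is repeated once per signature gene.
sig_ranks <- split(
  as.vector(rel_rank[sig, ]),
  rep(group, each = length(sig))
)

# Kernel density of the signature ranks in each group.
dens <- lapply(sig_ranks, density, from = 0, to = 1)
sapply(sig_ranks, mean)
```

A signature that marks the group of interest should show its density concentrated near 1 in that group and spread out in the others.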
Scatter plot depicting the mean expression of each signature gene in the specific subset against other cell types.
sig_scatter_plot( data, sigs, group_col, target_group, slot = "counts", xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'matrix,vector,vector,character' sig_scatter_plot( data, sigs, group_col, target_group, xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'Matrix,vector,vector,character' sig_scatter_plot( data, sigs, group_col, target_group, xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'DGEList,vector,character,character' sig_scatter_plot( data, sigs, group_col, target_group, slot = "counts", xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'ExpressionSet,vector,character,character' sig_scatter_plot( data, sigs, group_col, target_group, xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'Seurat,vector,character,character' sig_scatter_plot( data, sigs, group_col, target_group, slot = "counts", xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'SummarizedExperiment,vector,character,character' sig_scatter_plot( data, sigs, group_col, target_group, slot = "counts", xint = 1, yint = 1, gene_id = "SYMBOL" ) ## S4 method for signature 'list,vector,character,character' sig_scatter_plot( data, sigs, group_col, target_group, slot = "counts", xint = 1, yint = 1, gene_id = "SYMBOL" )
data |
expression data; can be a matrix, DGEList, eSet, Seurat, sce... |
sigs |
a vector of signature (Symbols) |
group_col |
character or vector, specifying the column name in coldata to compare |
target_group |
pattern, specify the group of interest as reference |
slot |
character, indicating which slot to use as expression; optional |
xint |
intercept of vertical dashed line, default 1 |
yint |
intercept of horizontal dashed line, default 1 |
gene_id |
character, indicating the ID type of the row names of the expression data; could be one of 'ENSEMBL', 'SYMBOL', ...; default 'SYMBOL' |
patchwork or ggplot of scatter plot of median expression
data("im_data_6", "nk_markers") sig_scatter_plot( sigs = nk_markers$HGNC_Symbol, data = im_data_6, group_col = "celltype:ch1", target_group = "NK", gene_id = "ENSEMBL" )
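The quantities behind such a scatter plot can be sketched in base R: one summary value per signature gene in the target group, paired with the same summary in every other group. The toy matrix and all names below are hypothetical, the median is used because the return value mentions median expression (the description mentions the mean), and matching `target_group` with `grepl()` is an assumption about how the pattern is applied.

```r
# Toy log-expression matrix: 3 signature genes x 6 samples (hypothetical values).
expr <- rbind(
  g1 = c(9, 11, 1, 3, 2, 2),
  g2 = c(8, 8, 7, 9, 1, 1),
  g3 = c(1, 1, 1, 1, 6, 8)
)
colnames(expr) <- paste0("s", 1:6)
group <- c("NK", "NK", "T", "T", "B", "B")
sig <- rownames(expr)
target_group <- "NK"

# Samples whose group label matches the target pattern.
is_target <- grepl(target_group, group)

# Per-gene median expression in the target group.
target_med <- apply(expr[sig, is_target, drop = FALSE], 1, median)

# Per-gene median expression in each of the other groups (genes x groups).
other_idx <- split(which(!is_target), group[!is_target])
other_med <- sapply(other_idx, function(idx) {
  apply(expr[sig, idx, drop = FALSE], 1, median)
})

# Long table with one row per gene and comparison group, ready for plotting.
scatter_df <- data.frame(
  gene = rep(rownames(other_med), times = ncol(other_med)),
  other_group = rep(colnames(other_med), each = nrow(other_med)),
  target = rep(target_med, times = ncol(other_med)),
  other = as.vector(other_med)
)
print(scatter_df)
```

Genes that are high in the target group and low in every other group are the ones that behave like specific markers; `xint` and `yint` only position the dashed guide lines used to judge this visually.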
Returns a DGEList containing vfit from limma::voom (if normalize = TRUE) and tfit from limma::treat.
voom_fit_treat( dge, group_col, target_group, normalize = TRUE, group = FALSE, lfc = 0, p = 0.05, batch = NULL, summary = TRUE, ... )
dge |
DGEList object for DE analysis, including expr and samples info |
group_col |
character, column name of coldata to specify the DE comparisons |
target_group |
pattern, specify the group of interest, e.g. NK |
normalize |
logical, whether the expression in data is raw counts that need to be normalized |
group |
logical, TRUE to separate samples into only 2 groups: 'target_group' and 'Others'; FALSE to set each level as a group |
lfc |
num, cutoff of logFC for DE analysis |
p |
num, cutoff of p value for DE analysis and permutation test if feature_selection = "rankproduct" |
batch |
vector of character, column name(s) of coldata to be treated as batch effect factor, default NULL |
summary |
logical, whether to show the summary of the DE analysis |
... |
omitted |
A DGEList containing vfit and tfit
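The effect of the `group` argument on the sample grouping can be sketched in base R. The cell type labels below are hypothetical, and using `grepl()` to match `target_group` is an assumption about how the pattern is applied; the sketch only shows the two grouping schemes and the corresponding no-intercept design matrix, not the voom or treat fit itself.

```r
# Hypothetical sample annotation, as would be found in the group_col column.
celltype <- c("NK", "NK", "CD4", "CD8", "Neutrophils", "NK")
target_group <- "NK"

# group = TRUE: collapse samples into the target group versus 'Others'.
two_groups <- ifelse(grepl(target_group, celltype), target_group, "Others")

# group = FALSE: every level of the annotation stays its own group.
each_level <- factor(celltype)

# No-intercept design matrix for the two-group setting (one column per group).
grp <- factor(two_groups, levels = c(target_group, "Others"))
design <- model.matrix(~ 0 + grp)
colnames(design) <- levels(grp)

table(two_groups)
nlevels(each_level)
print(design)
```

With `group = FALSE` the target group is compared against each remaining level separately, which gives more contrasts but also requires enough samples in every level.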