Package 'smartid'

Title: Scoring and Marker Selection Method Based on Modified TF-IDF
Description: This package enables automated selection of group-specific signatures, especially for rare populations. It was developed to generate specific lists of signature genes using methods modified from Term Frequency-Inverse Document Frequency (TF-IDF). It can also be used as a new gene-set scoring method or as a data transformation method. Multiple visualization functions are implemented in this package.
Authors: Jinjin Chen [aut, cre]
Maintainer: Jinjin Chen <[email protected]>
License: MIT + file LICENSE
Version: 1.1.2
Built: 2024-07-26 03:14:36 UTC
Source: https://github.com/bioc/smartid

calculate combined score

Description

Compute TF (term/feature frequency), IDF (inverse document/cell frequency) and IAE (inverse average expression of features), then combine them into the final score

Usage

cal_score(
  data,
  tf = c("logtf", "tf"),
  idf = "prob",
  iae = "prob",
  slot = "counts",
  new.slot = "score",
  par.idf = NULL,
  par.iae = NULL
)

## S4 method for signature 'AnyMatrix'
cal_score(
  data,
  tf = c("logtf", "tf"),
  idf = "prob",
  iae = "prob",
  par.idf = NULL,
  par.iae = NULL
)

## S4 method for signature 'SummarizedExperiment'
cal_score(
  data,
  tf = c("logtf", "tf"),
  idf = "prob",
  iae = "prob",
  slot = "counts",
  new.slot = "score",
  par.idf = NULL,
  par.iae = NULL
)

Arguments

data

an expression object, can be matrix or SummarizedExperiment

tf

a character, specify the TF method to use, can be "tf" or "logtf"

idf

a character, specify the IDF method to use. Available methods can be accessed using idf_iae_methods()

iae

a character, specify the IAE method to use. Available methods can be accessed using idf_iae_methods()

slot

a character, specify which slot to use when data is a SummarizedExperiment (se) object, optional, default 'counts'

new.slot

a character, specify the name of the slot in which to save the score in the se object, optional, default 'score'

par.idf

other parameters for specified IDF methods

par.iae

other parameters for specified IAE methods

Value

A list of matrices or se object containing combined score

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
cal_score(
  data,
  par.idf = list(label = sample(c("A", "B"), 10, replace = TRUE)),
  par.iae = list(label = sample(c("A", "B"), 10, replace = TRUE))
)

Calculate score for each feature in each cell

Description

Calculate score for each feature in each cell

Usage

cal_score_init(
  expr,
  tf = c("logtf", "tf"),
  idf = "prob",
  iae = "prob",
  par.idf = NULL,
  par.iae = NULL
)

Arguments

expr

a count matrix, features in row and cells in column

tf

a character, specify the TF method to use, can be "tf" or "logtf"

idf

a character, specify the IDF method to use. Available methods can be accessed using idf_iae_methods()

iae

a character, specify the IAE method to use. Available methods can be accessed using idf_iae_methods()

par.idf

other parameters for specified IDF methods

par.iae

other parameters for specified IAE methods

Value

a list of combined score, tf, idf and iae

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
label <- sample(c("A", "B"), 10, replace = TRUE)
smartid:::cal_score_init(data,
  par.idf = list(label = label),
  par.iae = list(label = label)
)

compute overall score based on the given marker list

Description

compute overall score based on the given marker list

Usage

gs_score(data, features = NULL, slot = "score", suffix = "score")

## S4 method for signature 'AnyMatrix,ANY'
gs_score(data, features = NULL)

## S4 method for signature 'AnyMatrix,list'
gs_score(data, features = NULL, suffix = "score")

## S4 method for signature 'SummarizedExperiment,ANY'
gs_score(data, features = NULL, slot = "score", suffix = "score")

Arguments

data

an expression object, can be matrix or SummarizedExperiment

features

vector or named list, feature names to compute score

slot

a character, specify which slot to use when data is se object, optional, default 'score'

suffix

a character, specify the name suffix to save score when features is a named list

Value

A vector of overall score for each sample

Examples

data <- matrix(rnorm(100), 10, dimnames = list(seq_len(10)))
gs_score(data, features = seq_len(3))

Calculate scores of each cell on given features

Description

Calculate scores of each cell on given features

Usage

gs_score_init(score, features = NULL)

Arguments

score

matrix, features in row and samples in column

features

vector, feature names to compute score

Value

a vector of score

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
gs_score_init(data, 1:5)

standard inverse average expression

Description

standard inverse average expression

Usage

iae(expr, features = NULL, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IAE_i} = \log\left(1+\frac{n}{\hat N_{i,j}+1}\right)$$

where $n$ is the total number of cells and $N_{i,j}$ is the count of feature $i$ in cell $j$.

Value

a vector of inverse average expression score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae(data)

inverse average expression using hdbscan cluster as label

Description

inverse average expression using hdbscan cluster as label

Usage

iae_hdb(expr, features = NULL, multi = TRUE, thres = 0, minPts = 2, ...)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

minPts

integer, minimum size of clusters, default 2. Details in dbscan::hdbscan().

...

parameters for dbscan::hdbscan()

Details

Details as iae_prob().

Value

a matrix of inverse average expression score

Examples

set.seed(123)
data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_hdb(data)

labeled inverse average expression: IGM

Description

labeled inverse average expression: IGM

Usage

iae_igm(expr, features = NULL, label, lambda = 7, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

lambda

numeric, hyperparameter for IGM

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IGM_i} = \log\left(1+\lambda\frac{\max\left(\mathrm{mean}(N_{i,j\in D})_{k}\right)}{\sum_{k}^{K}\left(\mathrm{mean}(N_{i,j\in D})_{k} \cdot r_{k}\right)+e^{-8}}\right)$$

where $\lambda$ is the hyperparameter, $N_{i,j\in D}$ is the count of feature $i$ in cell $j$ within class $D$, and $r_k$ is the rank of $\mathrm{mean}(N_{i,j\in D})$.

Value

a vector of inverse gravity moment score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_igm(data, label = sample(c("A", "B"), 10, replace = TRUE))

inverse average expression: max

Description

inverse average expression: max

Usage

iae_m(expr, features = NULL, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IAE_{i,j}} = \log\left(1+\frac{\max_{\{i^{'}\in j\}}(n_{i^{'}})}{\sum_{j = 1}^{n} \max(0, N_{i,j} - \mathrm{threshold})+1}\right)$$

where $i^{'}$ denotes any feature other than $i$, $N_{i,j}$ is the count of feature $i$ in cell $j$, and $n_{i^{'}}$ is $\sum_{j = 1}^{n} \mathrm{sign}(N_{i^{'},j} > \mathrm{threshold})$.

Value

a matrix of inverse average expression score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_m(data)

labeled inverse average expression: probability based

Description

labeled inverse average expression: probability based

Usage

iae_prob(expr, features = NULL, label, multi = TRUE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IAE_{i,j}} = \log\left(1+\frac{\mathrm{mean}(N_{i,j\in D})}{\max\left(\mathrm{mean}(N_{i,j\in \hat D})\right)+ e^{-8}} \cdot \mathrm{mean}(N_{i,j\in D})\right)$$

where $N_{i,j\in D}$ is the count of feature $i$ in cell $j$ within class $D$, and $\hat D$ denotes the classes other than $D$.

Value

a matrix of inverse average expression score

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_prob(data, label = sample(c("A", "B"), 10, replace = TRUE))

labeled inverse average expression: relative frequency

Description

labeled inverse average expression: relative frequency

Usage

iae_rf(expr, features = NULL, label, multi = TRUE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IAE} = \log\left(1+\frac{\mathrm{mean}(N_{i,j\in D})}{\max\left(\mathrm{mean}(N_{i,j\in \hat D})\right)+ e^{-8}}\right)$$

where $N_{i,j\in D}$ is the count of feature $i$ in cell $j$ within class $D$, and $\hat D$ denotes the classes other than $D$.

Value

a matrix of inverse average expression score

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_rf(data, label = sample(c("A", "B"), 10, replace = TRUE))
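
The relative-frequency formula in Details can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: `iae_rf_sketch` is a hypothetical helper, it takes $e^{-8}$ literally as `exp(-8)`, it ignores the `thres` and `multi` arguments, it assumes at least two classes, and it returns one column per class rather than one per cell.

```r
# Hypothetical helper (not part of smartid): class mean relative to the
# largest mean among the other classes, following the Details formula.
iae_rf_sketch <- function(expr, label) {
  classes <- unique(label)
  # mean count of every feature within each class (features x classes)
  class_mean <- do.call(cbind, lapply(classes, function(k) {
    rowMeans(expr[, label == k, drop = FALSE])
  }))
  # for each class D, the largest class mean among all other classes
  other_max <- do.call(cbind, lapply(seq_along(classes), function(k) {
    apply(class_mean[, -k, drop = FALSE], 1, max)
  }))
  score <- log(1 + class_mean / (other_max + exp(-8)))
  colnames(score) <- classes
  score
}

expr <- rbind(g1 = c(4, 4, 1, 1), g2 = c(0, 0, 2, 2))
label <- c("A", "A", "B", "B")
iae_rf_res <- iae_rf_sketch(expr, label)
iae_rf_res
```

A feature scores high in a class when its mean there is large relative to every other class, and scores 0 in a class where it is not expressed at all.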

inverse average expression using standard deviation (SD)

Description

inverse average expression using standard deviation (SD)

Usage

iae_sd(expr, features = NULL, log = FALSE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

log

logical, whether to apply log-transformation

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IAE} = \log\left(1+sd(tf_{i}) \cdot \frac{n}{\sum_{j=1}^{n}\max(0,N_{i,j})+1}\right)$$

where $tf_i$ is the term frequency of feature $i$ (see details in tf()), $n$ is the total number of cells and $N_{i,j}$ is the count of feature $i$ in cell $j$.

Value

a vector of inverse average expression score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::iae_sd(data)

standard inverse cell frequency

Description

standard inverse cell frequency

Usage

idf(expr, features = NULL, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IDF_i} = \log\left(1+\frac{n}{n_i+1}\right)$$

where $n$ is the total number of cells and $n_i$ is the number of cells containing feature $i$.

Value

a vector of inverse cell frequency score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf(data)
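
As a rough illustration of the formula in Details, the standard inverse cell frequency can be written in a few lines of base R. `idf_sketch` is a hypothetical helper for this page, not the package's own code, and it assumes that a cell "contains" a feature when its count exceeds `thres`.

```r
# Hypothetical helper (not part of smartid): log(1 + n / (n_i + 1)).
idf_sketch <- function(expr, thres = 0) {
  n <- ncol(expr)                # total number of cells
  n_i <- rowSums(expr > thres)   # cells in which each feature exceeds thres
  log(1 + n / (n_i + 1))
}

expr <- rbind(common = c(5, 2, 1, 3), rare = c(0, 0, 4, 0))
idf_vals <- idf_sketch(expr)
idf_vals
```

The feature detected in a single cell receives the larger score, which is the intended effect of the inverse weighting.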

inverse document frequency using hdbscan cluster as label

Description

inverse document frequency using hdbscan cluster as label

Usage

idf_hdb(expr, features = NULL, multi = TRUE, thres = 0, minPts = 2, ...)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

minPts

integer, minimum size of clusters, default 2. Details in dbscan::hdbscan().

...

parameters for dbscan::hdbscan()

Details

Details as idf_prob().

Value

a matrix of inverse cell frequency score

Examples

set.seed(123)
data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_hdb(data)

Get names of available IDF and IAE methods

Description

Returns a named vector of IDF/IAE methods

Usage

idf_iae_methods()

Value

names of methods implemented

Examples

idf_iae_methods()

labeled inverse cell frequency: IGM

Description

labeled inverse cell frequency: IGM

Usage

idf_igm(expr, features = NULL, label, lambda = 7, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

lambda

numeric, hyperparameter for IGM

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IGM_i} = \log\left(1+\lambda\frac{\max(n_{i,j\in D})_{k}}{\sum_{k}^{K}\left((n_{i,j\in D})_{k} \cdot r_{k}\right)+e^{-8}}\right)$$

where $\lambda$ is the hyperparameter, $n_{i,j\in D}$ is the number of cells containing feature $i$ in class $D$, and $r_k$ is the rank of $n_{i,j\in D}$.

Value

a vector of inverse gravity moment score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_igm(data, label = sample(c("A", "B"), 10, replace = TRUE))
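
The inverse gravity moment in Details can be sketched in base R as below. Several points are assumptions rather than statements about the package: `idf_igm_sketch` is a hypothetical helper, ranks are taken in descending order (rank 1 for the class with the most expressing cells), ties are broken by class order, and $e^{-8}$ is taken literally as `exp(-8)`.

```r
# Hypothetical helper (not part of smartid): IGM over per-class cell counts.
idf_igm_sketch <- function(expr, label, lambda = 7, thres = 0) {
  classes <- unique(label)
  # number of cells expressing each feature in each class (features x classes)
  n_k <- do.call(cbind, lapply(classes, function(k) {
    rowSums(expr[, label == k, drop = FALSE] > thres)
  }))
  apply(n_k, 1, function(x) {
    r <- rank(-x, ties.method = "first")  # assumed: descending rank, 1 = largest
    log(1 + lambda * max(x) / (sum(x * r) + exp(-8)))
  })
}

expr <- rbind(specific = c(1, 2, 0, 0), uniform = c(1, 1, 1, 1))
label <- c("A", "A", "B", "B")
igm_vals <- idf_igm_sketch(expr, label)
igm_vals
```

A feature concentrated in one class keeps a small weighted sum in the denominator and therefore scores higher than a feature spread evenly across classes.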

inverse cell frequency: max

Description

inverse cell frequency: max

Usage

idf_m(expr, features = NULL, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IDF_{i,j}} = \log\left(\frac{\max_{\{i^{'}\in j\}}(n_{i^{'}})}{n_i+1}\right)$$

where $i^{'}$ denotes any feature other than $i$, $n_i$ is the number of cells containing feature $i$, and $n_{i^{'}}$ is the number of cells containing feature $i^{'}$.

Value

a matrix of inverse cell frequency score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_m(data)

labeled inverse cell frequency: probability based

Description

labeled inverse cell frequency: probability based

Usage

idf_prob(expr, features = NULL, label, multi = TRUE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IDF_{i,j}} = \log\left(1+\frac{\frac{n_{i,j\in D}}{n_{j\in D}}}{\max\left(\frac{n_{i,j\in \hat D}}{n_{j\in \hat D}}\right)+ e^{-8}} \cdot \frac{n_{i,j\in D}}{n_{j\in D}}\right)$$

where $n_{i,j\in D}$ is the number of cells containing feature $i$ in class $D$, $n_{j\in D}$ is the total number of cells in class $D$, and $\hat D$ denotes the classes other than $D$.

Value

a matrix of inverse cell frequency score

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_prob(data, label = sample(c("A", "B"), 10, replace = TRUE))
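
A minimal base R sketch of the probability-based formula in Details is given below for the multi-class case. It is an illustration under stated assumptions, not the package's code: `idf_prob_sketch` is a hypothetical helper, $e^{-8}$ is taken literally as `exp(-8)`, at least two classes are required, and each cell is simply assigned the score of its own class.

```r
# Hypothetical helper (not part of smartid): labeled IDF, probability based.
idf_prob_sketch <- function(expr, label, thres = 0) {
  classes <- unique(label)
  # fraction of cells in each class expressing each feature (features x classes)
  frac <- do.call(cbind, lapply(classes, function(k) {
    rowMeans(expr[, label == k, drop = FALSE] > thres)
  }))
  # for each class D, the largest fraction among all other classes
  other_max <- do.call(cbind, lapply(seq_along(classes), function(k) {
    apply(frac[, -k, drop = FALSE], 1, max)
  }))
  score <- log(1 + frac / (other_max + exp(-8)) * frac)
  # expand to features x cells: every cell gets the score of its class
  out <- score[, match(label, classes), drop = FALSE]
  colnames(out) <- colnames(expr)
  out
}

expr <- rbind(marker = c(3, 1, 0, 0), shared = c(2, 2, 2, 2))
label <- c("A", "A", "B", "B")
idf_prob_res <- idf_prob_sketch(expr, label)
idf_prob_res
```

Compared with the relative-frequency variant, the extra factor $\frac{n_{i,j\in D}}{n_{j\in D}}$ down-weights features that are specific to a class but detected in only a small fraction of its cells.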

labeled inverse cell frequency: relative frequency

Description

labeled inverse cell frequency: relative frequency

Usage

idf_rf(expr, features = NULL, label, multi = TRUE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

label

vector, group label of each cell

multi

logical, whether to compute based on binary (FALSE) or multi-class (TRUE) labels

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IDF_{i,j}} = \log\left(1+\frac{\frac{n_{i,j\in D}}{n_{j\in D}}}{\max\left(\frac{n_{i,j\in \hat D}}{n_{j\in \hat D}}\right)+ e^{-8}}\right)$$

where $n_{i,j\in D}$ is the number of cells containing feature $i$ in class $D$, $n_{j\in D}$ is the total number of cells in class $D$, and $\hat D$ denotes the classes other than $D$.

Value

a matrix of inverse cell frequency score

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_rf(data, label = sample(c("A", "B"), 10, replace = TRUE))

inverse cell frequency using standard deviation (SD)

Description

inverse cell frequency using standard deviation (SD)

Usage

idf_sd(expr, features = NULL, log = FALSE, thres = 0)

Arguments

expr

a matrix, features in row and cells in column

features

vector, feature names or indexes to compute

log

logical, whether to apply log-transformation

thres

numeric, cell only counts when expr > threshold, default 0

Details

$$\mathbf{IDF_i} = \log\left(1+sd(tf_{i}) \cdot \frac{n}{n_i+1}\right)$$

where $tf_i$ is the term frequency of feature $i$ (see details in tf()), $n$ is the total number of cells and $n_i$ is the number of cells containing feature $i$.

Value

a vector of inverse cell frequency score for each feature

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::idf_sd(data)
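
The SD-weighted formula in Details can be sketched in base R as follows. `idf_sd_sketch` is a hypothetical helper, not the package's implementation. It assumes the term frequency is each count divided by its cell (column) total, and it leaves out the `log` option.

```r
# Hypothetical helper (not part of smartid): IDF weighted by the SD of tf.
idf_sd_sketch <- function(expr, thres = 0) {
  # assumed term frequency: counts divided by the total count of their cell
  tf_mat <- sweep(expr, 2, colSums(expr), "/")
  sd_tf <- apply(tf_mat, 1, sd)  # spread of each feature's tf across cells
  n <- ncol(expr)                # total number of cells
  n_i <- rowSums(expr > thres)   # cells containing each feature
  log(1 + sd_tf * n / (n_i + 1))
}

expr <- rbind(a = c(1, 3, 2), b = c(1, 0, 2))
idf_sd_vals <- idf_sd_sketch(expr)
idf_sd_vals
```

Features whose frequency varies little across cells are pushed towards 0, regardless of how many cells contain them.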

select markers using HDBSCAN method

Description

select markers using HDBSCAN method

Usage

markers_hdbscan(
  top_markers,
  column = ".dot",
  s_thres = NULL,
  method = c("max.one", "remove.min"),
  minPts = 5,
  plot = FALSE,
  ...
)

Arguments

top_markers

output of top_markers()

column

character, specify which column used as group label

s_thres

NULL or numeric; only features with score > threshold are returned. If NULL, 2 * average probability is used as the threshold

method

either "max.one", to keep only features in the 1st component, or "remove.min", to return features not in the last component

minPts

integer, minimum size of clusters for dbscan::hdbscan()

plot

logical, whether to plot the mixture density and histogram

...

other params for dbscan::hdbscan()

Value

a list of markers for each group

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
top_n <- top_markers(data, label = rep(c("A", "B"), 5))
markers_hdbscan(top_n, minPts = 2)

select markers using mclust EM method

Description

select markers using mclust EM method

Usage

markers_mclust(
  top_markers,
  column = ".dot",
  prob = 0.99,
  s_thres = NULL,
  method = c("max.one", "remove.min"),
  plot = FALSE,
  ...
)

Arguments

top_markers

output of top_markers()

column

character, specify which column used as group label

prob

numeric, probability cutoff for 1st component classification

s_thres

NULL or numeric; only features with score > threshold are returned. If NULL, 2 * average probability is used as the threshold

method

either "max.one", to keep only features in the 1st component, or "remove.min", to return features not in the last component

plot

logical, whether to plot the mixture density and histogram

...

other params for mclust::densityMclust()

Value

a list of markers for each group

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
top_n <- top_markers(data, label = rep(c("A", "B"), 5))
markers_mclust(top_n)

select markers using mixtools EM method

Description

select markers using mixtools EM method

Usage

markers_mixmdl(
  top_markers,
  column = ".dot",
  prob = 0.99,
  k = 3,
  ratio = 2,
  dist = c("norm", "gamma"),
  maxit = 1e+05,
  plot = FALSE,
  ...
)

Arguments

top_markers

output of top_markers()

column

character, specify which column used as group label

prob

numeric, probability cutoff for 1st component classification

k

integer, number of components of mixtures

ratio

numeric, cutoff for the ratio of the 1st component mu to the 2nd component mu; markers are returned for a group only when its ratio > cutoff

dist

either "norm" or "gamma", specifying whether to use mixtools::normalmixEM() or mixtools::gammamixEM()

maxit

integer, maximum number of iterations for EM

plot

logical, whether to plot the mixture density and histogram

...

other params for mixtools::normalmixEM() or mixtools::gammamixEM()

Value

a list of markers for each group

Examples

set.seed(1000)
data <- matrix(rnorm(100), 10, dimnames = list(1:10))
top_n <- top_markers(data, label = rep(c("A", "B"), 5))
markers_mixmdl(top_n, k = 3)

boxplot of features overall score

Description

boxplot of features overall score

Usage

ova_score_boxplot(data, features, ref.group, label, method = "t.test")

Arguments

data

matrix, features in row and samples in column

features

vector, feature names to plot

ref.group

character, reference group name

label

vector, group labels

method

character, statistical test to use, details in ggpubr::stat_compare_means()

Value

ggplot object

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
ova_score_boxplot(data, 1:5, ref.group = "A", label = rep(c("A", "B"), 5))

scale by mean of group mean for imbalanced data

Description

scale by mean of group mean for imbalanced data

Usage

scale_mgm(expr, label, pooled.sd = FALSE)

Arguments

expr

matrix

label

a vector of group label

pooled.sd

logical, whether to use pooled SD for scaling

Details

$$z=\frac{x-\frac{\sum_k^{n_D}\mu_k}{n_D}}{s}$$

where $\mu_k$ is the mean of $x$ in the $k^{th}$ class, $n_D$ is the number of classes, and $s$ is the standard deviation of $x$. When pooled.sd is set to TRUE, $s$ is replaced with $s_{pooled}$:

$$s_{pooled}=\sqrt{\frac{\sum_k^{n_D}(n_k-1){s_k}^2}{\sum_k^{n_D}n_k-n_D}}$$

where $n_k$ and $s_k$ are the number of samples and the standard deviation of $x$ in the $k^{th}$ class.

Value

scaled matrix

Examples

scale_mgm(matrix(rnorm(100), 10), label = rep(letters[1:2], 5))
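
The scaling in Details can be sketched in base R as below, to show why the mean of group means matters for imbalanced data. `scale_mgm_sketch` is a hypothetical helper, not the package's code. It assumes scaling is done per row (feature), that every class has at least two samples when `pooled.sd = TRUE`, and that the pooled denominator is the total sample count minus the number of classes $n_D$.

```r
# Hypothetical helper (not part of smartid): centre each row on the mean of
# its group means, then divide by the overall or pooled SD.
scale_mgm_sketch <- function(expr, label, pooled.sd = FALSE) {
  classes <- unique(label)
  # per-class mean of every row (features x classes)
  group_means <- do.call(cbind, lapply(classes, function(k) {
    rowMeans(expr[, label == k, drop = FALSE])
  }))
  center <- rowMeans(group_means)  # mean of group means
  if (pooled.sd) {
    n_k <- vapply(classes, function(k) sum(label == k), numeric(1))
    # per-class variance of every row (features x classes)
    var_k <- do.call(cbind, lapply(classes, function(k) {
      apply(expr[, label == k, drop = FALSE], 1, var)
    }))
    s <- sqrt(drop(var_k %*% (n_k - 1)) / (sum(n_k) - length(classes)))
  } else {
    s <- apply(expr, 1, sd)
  }
  # center and s have one entry per row, so they recycle down each column
  (expr - center) / s
}

expr <- rbind(g1 = c(1, 2, 3, 10, 12), g2 = c(5, 5, 6, 1, 2))
label <- c("A", "A", "A", "B", "B")
z <- scale_mgm_sketch(expr, label)
z_pooled <- scale_mgm_sketch(expr, label, pooled.sd = TRUE)
```

For `g1` the centre is 6.5, the midpoint of the two group means (2 and 11), rather than the plain mean 5.6, which is pulled towards the larger group.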

barplot of processed score

Description

barplot of processed score

Usage

score_barplot(top_markers, column = ".dot", f_list, n = 30)

Arguments

top_markers

output of top_markers()

column

character, specify which column used as group label

f_list

a named list of markers

n

numeric, number of returned top genes for each group

Value

ggplot object

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
top_n <- top_markers(data, label = rep(c("A", "B"), 5))
score_barplot(top_n)

scRNA-seq test data of 4 groups simulated by splatter.

Description

A SingleCellExperiment object containing 4 groups, with the up-regulated DEGs of each group saved in metadata.

Usage

data(sim_sce_test)

Format

A SingleCellExperiment object of 100 genes x 400 cells.

Value

SingleCellExperiment

Source

splatter::splatSimulate()


boxplot of split single feature score

Description

boxplot of split single feature score

Usage

sin_score_boxplot(data, features = NULL, ref.group, label, method = "t.test")

Arguments

data

matrix, features in row and samples in column

features

vector, feature names to plot

ref.group

character, reference group name

label

vector, group labels

method

character, statistical test to use, details in ggpubr::stat_compare_means()

Value

faceted ggplot object

Examples

data <- matrix(rnorm(100), 10, dimnames = list(1:10))
sin_score_boxplot(data, 1:2, ref.group = "A", label = rep(c("A", "B"), 5))

compute term/feature frequency within each cell

Description

compute term/feature frequency within each cell

Usage

tf(expr, log = FALSE)

Arguments

expr

a count matrix, features in row and cells in column

log

logical, whether to apply log-transformation

Details

$$\mathbf{TF_{i,j}}=\frac{N_{i,j}}{\sum_{i^{'}}{N_{i^{'},j}}}$$

where $N_{i,j}$ is the count of feature $i$ in cell $j$, and the sum in the denominator runs over all features $i^{'}$ in cell $j$.

Value

a matrix of term/gene frequency

Examples

data <- matrix(rpois(100, 2), 10, dimnames = list(1:10))
smartid:::tf(data)
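
As a small illustration of term frequency "within each cell", the sketch below divides every count by the total count of its cell (column). `tf_sketch` is a hypothetical helper and only one reading of the formula in Details, not the package's code. It omits the `log` option, and a cell with a total count of 0 would give NaN.

```r
# Hypothetical helper (not part of smartid): per-cell term frequency.
tf_sketch <- function(expr) {
  # divide every column (cell) by that cell's total count
  sweep(expr, 2, colSums(expr), "/")
}

expr <- matrix(c(1, 3, 2, 2), nrow = 2,
               dimnames = list(c("g1", "g2"), c("c1", "c2")))
tf_mat <- tf_sketch(expr)
colSums(tf_mat)  # the frequencies of each cell sum to 1
```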

scale score and return top markers

Description

scale and transform score and output top markers for groups

Usage

top_markers(
  data,
  label,
  n = 10,
  use.glm = TRUE,
  scale = TRUE,
  use.mgm = TRUE,
  softmax = TRUE,
  slot = "score",
  ...
)

## S4 method for signature 'AnyMatrix'
top_markers(
  data,
  label,
  n = 10,
  use.glm = TRUE,
  scale = TRUE,
  use.mgm = TRUE,
  softmax = TRUE,
  slot = "score",
  ...
)

## S4 method for signature 'SummarizedExperiment'
top_markers(
  data,
  label,
  n = 10,
  use.glm = TRUE,
  scale = TRUE,
  use.mgm = TRUE,
  softmax = TRUE,
  slot = "score",
  ...
)

Arguments

data

an expression object, can be matrix or SummarizedExperiment

label

a vector of group label

n

integer, number of returned top genes for each group

use.glm

logical, whether to use stats::glm() to compute the group mean score; if TRUE, the mean score difference is also computed as output

scale

logical, whether to scale data by row

use.mgm

logical, whether to scale data using scale_mgm()

softmax

logical, whether to apply softmax transformation on output

slot

a character, specify which slot to use when data is se object, optional, default 'score'

...

params for top_markers_abs() or top_markers_glm()

Value

A tibble with top n feature names, group labels and ordered scores

Examples

data <- matrix(rgamma(100, 2), 10, dimnames = list(1:10))
top_markers(data, label = rep(c("A", "B"), 5))

calculate group median, MAD or mean score and order genes based on scores

Description

calculate group median, MAD or mean score and order genes based on scores

Usage

top_markers_abs(
  data,
  label,
  n = 10,
  pooled.sd = FALSE,
  method = c("median", "mad", "mean"),
  scale = TRUE,
  use.mgm = TRUE,
  softmax = TRUE,
  tau = 1
)

Arguments

data

matrix, features in row and samples in column

label

a vector of group label

n

integer, number of returned top genes for each group

pooled.sd

logical, whether to use pooled SD for scaling

method

character, specify metric to compute, can be one of "median", "mad", "mean"

scale

logical, whether to scale data by row

use.mgm

logical, whether to scale data using scale_mgm()

softmax

logical, whether to apply softmax transformation on output

tau

numeric, hyper parameter for softmax

Value

a tibble with feature names, group labels and ordered processed scores

Examples

data <- matrix(rgamma(100, 2), 10, dimnames = list(1:10))
top_markers_abs(data, label = rep(c("A", "B"), 5))
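
The role of `tau` can be illustrated with a generic temperature softmax in base R. This is only a sketch under an assumption: that `tau` divides the scores before exponentiation, which is the common convention. The source states neither this nor the axis along which the package applies the transformation, and `softmax_tau` is a hypothetical helper.

```r
# Hypothetical helper (not part of smartid): softmax with temperature tau.
softmax_tau <- function(x, tau = 1) {
  z <- x / tau
  z <- z - max(z)        # subtract the maximum for numerical stability
  exp(z) / sum(exp(z))
}

scores <- c(g1 = 2, g2 = 1, g3 = 0)
p_default <- softmax_tau(scores, tau = 1)
p_sharp <- softmax_tau(scores, tau = 0.5)
rbind(p_default, p_sharp)
```

Under this convention a smaller `tau` concentrates more of the total weight on the highest score, while a larger `tau` flattens the output towards a uniform distribution.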

calculate group mean score using glm and order genes based on scores difference

Description

calculate group mean score using glm and order genes based on scores difference

Usage

top_markers_glm(
  data,
  label,
  n = 10,
  family = gaussian(),
  scale = TRUE,
  use.mgm = TRUE,
  pooled.sd = FALSE,
  softmax = TRUE,
  tau = 1
)

Arguments

data

matrix, features in row and samples in column

label

a vector of group label

n

integer, number of returned top genes for each group

family

family for glm, details in stats::glm()

scale

logical, whether to scale data by row

use.mgm

logical, whether to scale data using scale_mgm()

pooled.sd

logical, whether to use pooled SD for scaling

softmax

logical, whether to apply softmax transformation on output

tau

numeric, hyper parameter for softmax

Value

a tibble with feature names, group labels and ordered processed scores

Examples

data <- matrix(rgamma(100, 2), 10, dimnames = list(1:10))
top_markers_glm(data, label = rep(c("A", "B"), 5))

compute group summarized score and order genes based on processed scores

Description

compute group summarized score and order genes based on processed scores

Usage

top_markers_init(
  data,
  label,
  n = 10,
  use.glm = TRUE,
  scale = TRUE,
  use.mgm = TRUE,
  softmax = TRUE,
  ...
)

Arguments

data

matrix, features in row and samples in column

label

a vector of group label

n

integer, number of returned top genes for each group

use.glm

logical, whether to use stats::glm() to compute the group mean score; if TRUE, the mean score difference is also computed as output

scale

logical, whether to scale data by row

use.mgm

logical, whether to scale data using scale_mgm()

softmax

logical, whether to apply softmax transformation on output

...

params for top_markers_abs() or top_markers_glm()

Value

a tibble with feature names, group labels and ordered processed scores

Examples

data <- matrix(rgamma(100, 2), 10, dimnames = list(1:10))
top_markers_init(data, label = rep(c("A", "B"), 5))