Title: | `orthos` is an R package for variance decomposition using conditional variational auto-encoders |
---|---|
Description: | `orthos` decomposes RNA-seq contrasts, for example obtained from a gene knock-out or compound treatment experiment, into unspecific and experiment-specific components. Original and decomposed contrasts can be efficiently queried against a large database of contrasts (derived from ARCHS4, https://maayanlab.cloud/archs4/) to identify similar experiments. `orthos` furthermore provides plotting functions to visualize the results of such a search for similar contrasts. |
Authors: | Panagiotis Papasaikas [aut, cre] , Charlotte Soneson [aut] , Michael Stadler [aut] , Friedrich Miescher Institute for Biomedical Research [cph] |
Maintainer: | Panagiotis Papasaikas <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.5.0 |
Built: | 2024-11-29 08:02:08 UTC |
Source: | https://github.com/bioc/orthos |
orthos decomposes RNA-seq contrasts, for example obtained from a genetic or compound treatment experiments, into non-specific and experiment-specific components. Original and decomposed contrasts can be efficiently queried against a large database of public contrasts (derived from ARCHS4, https://maayanlab.cloud/archs4/) to identify similar experiments. orthos furthermore provides plotting functions to visualize the results of such queries for similar contrasts.
Maintainer: Panagiotis Papasaikas [email protected] (ORCID)
Authors:
Charlotte Soneson [email protected] (ORCID)
Michael Stadler [email protected] (ORCID)
Other contributors:
Friedrich Miescher Institute for Biomedical Research [copyright holder]
Decompose input contrasts (gene expression deltas) to decoded (generic) and residual (unique) components according to a contrast encoder-decoder pre-trained on a large corpus of public RNAseq experiments.
decomposeVar( M, MD = NULL, treatm = NULL, cntr = NULL, processInput = TRUE, organism = c("Human", "Mouse"), featureType = c("AUTO", "ENSEMBL_GENE_ID", "GENE_SYMBOL", "ENTREZ_GENE_ID", "ARCHS4_ID"), pseudocount = 4, verbose = TRUE )
decomposeVar( M, MD = NULL, treatm = NULL, cntr = NULL, processInput = TRUE, organism = c("Human", "Mouse"), featureType = c("AUTO", "ENSEMBL_GENE_ID", "GENE_SYMBOL", "ENTREZ_GENE_ID", "ARCHS4_ID"), pseudocount = 4, verbose = TRUE )
M |
Matrix of raw gene counts. |
MD |
Matrix of gene deltas (optional). If |
treatm , cntr
|
Vectors indicating column indices in |
processInput |
If set to |
organism |
Selects the autoencoder model trained on data from this
species. One of |
featureType |
Set to |
pseudocount |
Numerical scalar, added to raw counts in |
verbose |
Logical scalar indicating whether to print messages along the way. |
When calling decomposeVar()
, you may see an ImportError
on the console. This most likely does not have any negative consequences,
rather it means that R and python may not be library compatible and that
an automated fallback approach is being used (for more details see
testload
argument of basiliskStart
).
A SummarizedExperiment
object
with the decomposed contrasts in the assays and the decomposed variance
as the colData
.
Panagiotis Papasaikas
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Specifying M, treatm and cntr: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Alternatively by specifying M and MD: pseudocount <- 4 M <- sweep(MKL1_human, 2, colSums(MKL1_human), FUN = "/") * 1e+06 M <- log2(M + pseudocount) DeltaM <- M[,c("MKL1","caMKL1")]-M[,"Ctrl"] # Matrix of contrasts ContextM <- M[,c("Ctrl","Ctrl")] # Matrix with context for the specified contrasts colnames(ContextM) <- colnames(DeltaM) # M and MD need identical dimnames RES <- decomposeVar(M = ContextM, MD = DeltaM, processInput = FALSE)
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Specifying M, treatm and cntr: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Alternatively by specifying M and MD: pseudocount <- 4 M <- sweep(MKL1_human, 2, colSums(MKL1_human), FUN = "/") * 1e+06 M <- log2(M + pseudocount) DeltaM <- M[,c("MKL1","caMKL1")]-M[,"Ctrl"] # Matrix of contrasts ContextM <- M[,c("Ctrl","Ctrl")] # Matrix with context for the specified contrasts colnames(ContextM) <- colnames(DeltaM) # M and MD need identical dimnames RES <- decomposeVar(M = ContextM, MD = DeltaM, processInput = FALSE)
Load a pre-calculated, organism-specific contrast database and return it
as a SummarizedExperiment
.
loadContrastDatabase( organism = c("Human", "Mouse"), mode = c("ANALYSIS", "DEMO"), mustWork = TRUE )
loadContrastDatabase( organism = c("Human", "Mouse"), mode = c("ANALYSIS", "DEMO"), mustWork = TRUE )
organism |
Character scalar selecting the organism for which to load the
contrast database. One of |
mode |
When in "ANALYSIS" mode (default) the complete contrast DB is queried. "DEMO" mode employs a small "toy" database for the queries. "DEMO" should only be used for testing/demonstration purposes and never for actual analysis purposes. |
mustWork |
Logical scalar. If |
Organism-specific databases are compiled in HDF5SummarizedExperiment objects. The first time 'loadContrastDatabase()' is called for a database either directly or via 'queryWithContrasts()' the required objects will be automatically downloaded from 'ExperimentHub' and cached in the user ExperimentHub directory (see 'ExperimentHub::getExperimentHubOption("CACHE")') using the 'orthosData' companion data-package.
A SummarizedExperiment
with pre-calculated contrasts as
assays.
Panagiotis Papasaikas, Michael Stadler
# !!!Note!!! mode="DEMO" for demonstration purposes only. Default is mode="ANALYSIS" SE_mouse_demoDB <- loadContrastDatabase (organism="Mouse", mode="DEMO") SE_mouse_demoDB SE_human_demoDB <- loadContrastDatabase (organism="Human", mode="DEMO") SE_human_demoDB
# !!!Note!!! mode="DEMO" for demonstration purposes only. Default is mode="ANALYSIS" SE_mouse_demoDB <- loadContrastDatabase (organism="Mouse", mode="DEMO") SE_mouse_demoDB SE_human_demoDB <- loadContrastDatabase (organism="Human", mode="DEMO") SE_human_demoDB
Visualize query results as a composite manhattan/density plot.
plotQueryResultsManh(queryResults, doPlot = TRUE)
plotQueryResultsManh(queryResults, doPlot = TRUE)
queryResults |
A list containing the results of a query performed with
|
doPlot |
Logical scalar specifying if a plot should be generated. |
A composite manhattan/density plot for the scores of queries using different contrast components against the respective contrast DBs.
Panagiotis Papasaikas, Michael Stadler
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO") # plot results for individual contrasts using composite Manhattan/Density plots: ManhDensPlots <- plotQueryResultsManh(query.res.human, doPlot = FALSE) ManhDensPlots[["caMKL1"]]
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO") # plot results for individual contrasts using composite Manhattan/Density plots: ManhDensPlots <- plotQueryResultsManh(query.res.human, doPlot = FALSE) ManhDensPlots[["caMKL1"]]
Visualize query results as violin plots
plotQueryResultsViolin(queryResults, doPlot = TRUE)
plotQueryResultsViolin(queryResults, doPlot = TRUE)
queryResults |
A list containing the results of a query performed with
|
doPlot |
Logical scalar specifying if a plot should be generated. |
A list of ggplot violin plots (one for each dataset) for the scores of queries using different contrast components against the respective contrast DBs.
Panagiotis Papasaikas, Michael Stadler
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO") # plot results for individual contrasts using violin plots:: ViolinPlots <- plotQueryResultsViolin(query.res.human, doPlot = FALSE) ViolinPlots[["caMKL1"]]
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO") # plot results for individual contrasts using violin plots:: ViolinPlots <- plotQueryResultsViolin(query.res.human, doPlot = FALSE) ViolinPlots[["caMKL1"]]
Query the contrast database with a set of contrasts
queryWithContrasts( contrasts, use = c("expressed.in.both", "all.genes"), exprThr = 0.25, organism = c("Human", "Mouse"), plotType = c("violin", "manh", "none"), detailTopn = 10, verbose = TRUE, BPPARAM = BiocParallel::bpparam(), chunk_size = 500, mode = c("ANALYSIS", "DEMO") )
queryWithContrasts( contrasts, use = c("expressed.in.both", "all.genes"), exprThr = 0.25, organism = c("Human", "Mouse"), plotType = c("violin", "manh", "none"), detailTopn = 10, verbose = TRUE, BPPARAM = BiocParallel::bpparam(), chunk_size = 500, mode = c("ANALYSIS", "DEMO") )
contrasts |
A |
use |
Determines if all.genes or genes expressed in both query and target context will be used. Note that "expressed.in.both", though more accurate, is slower. |
exprThr |
is the quantile in the provided context that determines the expression value above which a gene is considered to be expressed. This same value is then used for thresholding the contrast database. Only applies when use="expressed.in.both". |
organism |
Uses the 'orthosData' contrast database from this
species. One of |
plotType |
Select the type of visualization for the query results
|
detailTopn |
specifies the number of top hits for which metadata will be returned in the TopHits slot of the results. |
verbose |
Logical scalar indicating whether to print messages along the way. |
BPPARAM |
BiocParallelParam object specifying how parallelization is to
be performed using e.g. |
chunk_size |
Column dimension for the grid used to read blocks from the HDF5 Matrix. Sizes between 250 and 1000 are recommended. Smaller sizes reduce memory usage. |
mode |
When in "ANALYSIS" mode (default) the complete contrast DB is queried. "DEMO" mode employs a small "toy" database for the queries. "DEMO" should only be used for testing/demonstration purposes and never for actual analysis purposes. |
A list with three elements called "pearson.rhos", "zscores" and
"TopHits", containing raw and z-scored Pearson's rho correlation
coefficients between the query contrast(s) and the contrasts in the
database, as well as detailed metadata for the detailTopn
best
hits.
Panagiotis Papasaikas
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO")
MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds", package = "orthos")) # Decompose contrasts: dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), organism = "Human", verbose = FALSE) # Perform query against contrast DB with the decomposed fractions. # !!!Note!!! mode="DEMO" for demonstration purposes only. params <- BiocParallel::MulticoreParam(workers = 2) query.res.human <- queryWithContrasts(dec_MKL1_human, organism = "Human", BPPARAM = params, verbose = FALSE, mode = "DEMO")
Test conda environment
testOrthosEnv()
testOrthosEnv()
A list indicating whether keras is available, and the version of TensorFlow.
Charlotte Soneson
testOrthosEnv()
testOrthosEnv()