Package 'MOSim' reference manual

Title:	Multi-Omics Simulation (MOSim)
Description:	MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Authors:	Carolina Monzó [aut], Carlos Martínez [aut], Sonia Tarazona [cre, aut]
Maintainer:	Sonia Tarazona <sotacam@gmail.com>
License:	GPL-3
Version:	2.3.0
Built:	2025-04-01 08:31:48 UTC
Source:	https://github.com/bioc/MOSim

Data to showcase scRNA and scATAC-seq association

Description

Data to showcase scRNA and scATAC-seq association

Usage

data("associationList")
data("associationList")

Format

A dataframe with two columns and rows according to gene/feature relationships

Peak_ID: ATAC chromosomic positions associated to genes
Gene_ID: RNA genes associated to peaks

@source Created in-house to serve as an example

calculate_mean_per_list_df

Description

Helper function to calculate mean expression per celltype

Usage

calculate_mean_per_list_df(df, named_lists)
calculate_mean_per_list_df(df, named_lists)

Arguments

`df`	dataframe of expression where columns are cells
`named_lists`	list of which cells belong to each celltype

Examples

rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
          c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4))
calculate_mean_per_list_df(rna, cell_types)
rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
          c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4))
calculate_mean_per_list_df(rna, cell_types)

check_patterns

Description

Function to check if the TRUE FALSE patterns have at least two rows that are opposite, we need this to be able to generate repressor regulators

Usage

check_patterns(patterns)
check_patterns(patterns)

Arguments

patterns

tibble of TRUE FALSE values

Value

list of indices where the rows are opposite

Examples

patterns <- tibble::tibble(one = c(TRUE, FALSE, TRUE, FALSE), 
              two = c(TRUE, TRUE, TRUE, TRUE), 
              three = c(FALSE, TRUE, FALSE, TRUE), 
              four = c(FALSE, TRUE, TRUE, TRUE))
opposite_indices <- check_patterns(patterns)

patterns <- tibble::tibble(one = c(TRUE, FALSE, TRUE, FALSE), 
              two = c(TRUE, TRUE, TRUE, TRUE), 
              three = c(FALSE, TRUE, FALSE, TRUE), 
              four = c(FALSE, TRUE, TRUE, TRUE))
opposite_indices <- check_patterns(patterns)

Discretize ChIP-Seq counts to simulate a binary dataset

Description

Discretize ChIP-Seq counts to simulate a binary dataset

Usage

discretize(df, omic)
discretize(df, omic)

Arguments

`df`	A MOSimulated object
`omic`	Character string of the omic to transform into binary data

Value

A regulator dataframe of 0 and 1

Examples

omic_list <- c("RNA-seq", "ChIP-seq")
rnaseq_simulation <- mosim(omics = omic_list, 
    omicsOptions = c(omicSim("ChIP-seq", totalFeatures = 2500)))
rnaseq_simulated <- omicResults(rnaseq_simulation, omic_list)
discrete_ChIP <- discretize(rnaseq_simulated, "ChIP-seq")

omic_list <- c("RNA-seq", "ChIP-seq")
rnaseq_simulation <- mosim(omics = omic_list, 
    omicsOptions = c(omicSim("ChIP-seq", totalFeatures = 2500)))
rnaseq_simulated <- omicResults(rnaseq_simulation, omic_list)
discrete_ChIP <- discretize(rnaseq_simulated, "ChIP-seq")

Retrieves the experimental design

Description

Retrieves the experimental design

Usage

experimentalDesign(simulation)
experimentalDesign(simulation)

Arguments

simulation

A MOSimulation object

Value

A data frame containing the experimental design used to simulate the data.

Examples


omic_list <- c("RNA-seq")
rnaseq_simulation <- mosim(omics = omic_list)
# This will be a data frame with RNA-seq counts

design_matrix <- experimentalDesign(rnaseq_simulation)

omic_list <- c("RNA-seq")
rnaseq_simulation <- mosim(omics = omic_list)
# This will be a data frame with RNA-seq counts

design_matrix <- experimentalDesign(rnaseq_simulation)

Check if a variable is declared.

Description

Check if a variable is declared.

Usage

is.declared(object, key = NULL)
is.declared(object, key = NULL)

Arguments

`object`	Variable name to check
`key`	Optional key to check inside object.

Value

TRUE or FALSE indicating if the variable is initialized & non-empty.

This function generates a dataframe containing the information of the relationship between ATAC and RNA, based on the cluster groups, and then tells the order the genes and peaks should be in the simulated dataframe of the group

Usage

make_association_dataframe(
  group,
  genereggroup,
  numtotalgenes,
  numtotalpeaks,
  minFC,
  maxFC
)
make_association_dataframe(
  group,
  genereggroup,
  numtotalgenes,
  numtotalpeaks,
  minFC,
  maxFC
)

Arguments

`group`	Group from which we are generating the association dataframe
`genereggroup`	list of elements to generate the association dataframe such as clusters of each omic, indices of opposite clusters, which genes are activated, repressed, behavior of the features etc.
`numtotalgenes`	total number of genes
`numtotalpeaks`	total number of peaks
`minFC`	FC below which is downregulated
`maxFC`	FC above which is upregulated

Value

a dataframe with all the information the user needs about each gene and the order of gene and peak names to rename them in the simulated datasets of the group

make_cluster_patterns

Description

Function to make the tibble with cluster combinations for the gene expression patterns along the cells This function is a slightly modified copy of the 'make_cluster_patterns' function from the 'Acorde' package (v1.0.0), originally developed by Arzalluz-Luque A, Salguero P, Tarazona S, Conesa A. (2022). acorde unravels functionally interpretable networks of isoform co-usage from single cell data. Nature communications 1828. DOI: 10.1038/s41467-022-29497-w. The original package is licensed under the GPL-3 license.

Usage

make_cluster_patterns(numcells = 4, clusters = 8)
make_cluster_patterns(numcells = 4, clusters = 8)

Arguments

`numcells`	Number of different celltypes we are simulating
`clusters`	OPTIONAL. Number of co-expression patterns the user wants to simulate

Value

A tibble with number of columns equal to number of celltypes, rows according to the number of TRUE/FALSE combinations corresponding to the gene expression patterns along the cells

Examples

patterns <- make_cluster_patterns(numcells = 4, clusters = 8)
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
patterns <- make_cluster_patterns(numcells = length(cell_types), 
    clusters = 8)

patterns <- make_cluster_patterns(numcells = 4, clusters = 8)
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
patterns <- make_cluster_patterns(numcells = length(cell_types), 
    clusters = 8)

match_gene_regulator

Description

Helper function to make the most similar profiles possible between gene and regulator

Usage

match_gene_regulator(rna, atac, cell_types, associationList)
match_gene_regulator(rna, atac, cell_types, associationList)

Arguments

`rna`	dataframe of RNA expression
`atac`	dataframe of ATAC expression
`cell_types`	list of which cells belong to each celltype
`associationList`	dataframe of two columns, Gene_ID and Peak_ID

Examples

rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
        c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14), c5 = c(7, 0, 0, 0), 
        c6 = c(8, 1, 1, 1), c7 = c(8, 1, 1, 1))
rownames(rna) <- c('GenB', 'GenA', 'GenC', 'GenD')
associationList <- data.frame(Gene_ID = c('GenA', 'GenB', 'GenC', 'GenA'),
        Peak_ID = c('PeakA', 'PeakB', 'PeakC', 'PeakD'))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4), "ct3" = c(5, 6), "ct4" = c(7))
atac <- data.frame(c1 = c(3,20, 1,15, 1, 7, 1), c2 = c(4,20, 2,15, 0, 5, 1.5), 
        c3 = c(10, 13, 1, 12, 1, 14, 9), c4 = c(11, 14, 1, 13, 1, 4, 12), 
        c5 = c(0, 0, 0, 7, 1, 6, 6), c6 = c(1, 1, 1, 8, 0, 5, 8), 
        c7 = c(1, 1, 1, 8, 1, 5, 7))
rownames(atac) <- c('PeakB', "PeakC", "PeakF", "PeakD", "PeakE", "PeakA", "PeakG")
match_gene_regulator(rna, atac, cell_types, associationList)
rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
        c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14), c5 = c(7, 0, 0, 0), 
        c6 = c(8, 1, 1, 1), c7 = c(8, 1, 1, 1))
rownames(rna) <- c('GenB', 'GenA', 'GenC', 'GenD')
associationList <- data.frame(Gene_ID = c('GenA', 'GenB', 'GenC', 'GenA'),
        Peak_ID = c('PeakA', 'PeakB', 'PeakC', 'PeakD'))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4), "ct3" = c(5, 6), "ct4" = c(7))
atac <- data.frame(c1 = c(3,20, 1,15, 1, 7, 1), c2 = c(4,20, 2,15, 0, 5, 1.5), 
        c3 = c(10, 13, 1, 12, 1, 14, 9), c4 = c(11, 14, 1, 13, 1, 4, 12), 
        c5 = c(0, 0, 0, 7, 1, 6, 6), c6 = c(1, 1, 1, 8, 0, 5, 8), 
        c7 = c(1, 1, 1, 8, 1, 5, 7))
rownames(atac) <- c('PeakB', "PeakC", "PeakF", "PeakD", "PeakE", "PeakA", "PeakG")
match_gene_regulator(rna, atac, cell_types, associationList)

match_gene_regulator_cluster

Description

match_gene_regulator_cluster

Usage

match_gene_regulator_cluster(rna, atac, cell_types, associationMatrix)
match_gene_regulator_cluster(rna, atac, cell_types, associationMatrix)

Arguments

`rna`	rna expression dataframe
`atac`	atac expression dataframe
`cell_types`	list of which cells belong to each celltype
`associationMatrix`	matrix of related genes and peaks

Examples

rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
        c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14), c5 = c(7, 0, 0, 0), 
        c6 = c(8, 1, 1, 1), c7 = c(8, 1, 1, 1))
rownames(rna) <- c('GenB', 'GenA', 'GenC', 'GenD')
associationList <- data.frame(Gene_ID = c('GenA', 'GenB', 'GenC', 'GenA'),
        Peak_ID = c('PeakA', 'PeakB', 'PeakC', 'PeakD'),
        Gene_cluster = c(1, 2, 1, 2), Peak_cluster = c(1, 2, 1, 2))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4), "ct3" = c(5, 6), "ct4" = c(7))
atac <- data.frame(c1 = c(3,20, 1,15, 1, 7, 1), c2 = c(4,20, 2,15, 0, 5, 1.5), 
        c3 = c(10, 13, 1, 12, 1, 14, 9), c4 = c(11, 14, 1, 13, 1, 4, 12), 
        c5 = c(0, 0, 0, 7, 1, 6, 6), c6 = c(1, 1, 1, 8, 0, 5, 8), 
        c7 = c(1, 1, 1, 8, 1, 5, 7))
rownames(atac) <- c('PeakB', "PeakC", "PeakF", "PeakD", "PeakE", "PeakA", "PeakG")
match_gene_regulator_cluster(rna, atac, cell_types, associationList)
rna <- data.frame(c1 = c(1.5, 15.5, 3.5, 20.5), c2 = c(2, 15, 4, 20), 
        c3 = c(10, 1, 12, 13), c4 = c(11, 1, 13, 14), c5 = c(7, 0, 0, 0), 
        c6 = c(8, 1, 1, 1), c7 = c(8, 1, 1, 1))
rownames(rna) <- c('GenB', 'GenA', 'GenC', 'GenD')
associationList <- data.frame(Gene_ID = c('GenA', 'GenB', 'GenC', 'GenA'),
        Peak_ID = c('PeakA', 'PeakB', 'PeakC', 'PeakD'),
        Gene_cluster = c(1, 2, 1, 2), Peak_cluster = c(1, 2, 1, 2))
cell_types <- list("ct1" = c(1,2), "ct2" = c(3, 4), "ct3" = c(5, 6), "ct4" = c(7))
atac <- data.frame(c1 = c(3,20, 1,15, 1, 7, 1), c2 = c(4,20, 2,15, 0, 5, 1.5), 
        c3 = c(10, 13, 1, 12, 1, 14, 9), c4 = c(11, 14, 1, 13, 1, 4, 12), 
        c5 = c(0, 0, 0, 7, 1, 6, 6), c6 = c(1, 1, 1, 8, 0, 5, 8), 
        c7 = c(1, 1, 1, 8, 1, 5, 7))
rownames(atac) <- c('PeakB', "PeakC", "PeakF", "PeakD", "PeakE", "PeakA", "PeakG")
match_gene_regulator_cluster(rna, atac, cell_types, associationList)

mosim

Description

Performs a multiomic simulation by chaining two actions: 1) Creating the "MOSimulation" class with the provided params. 2) Calling "simulate" method on the initialized object.

Usage

mosim(
  omics,
  omicsOptions,
  diffGenes,
  numberReps,
  numberGroups,
  times,
  depth,
  profileProbs,
  minMaxFC,
  TFtoGene
)
mosim(
  omics,
  omicsOptions,
  diffGenes,
  numberReps,
  numberGroups,
  times,
  depth,
  profileProbs,
  minMaxFC,
  TFtoGene
)

Arguments

`omics`	Character vector containing the names of the omics to simulate, which can be "RNA-seq", "miRNA-seq", "DNase-seq", "ChIP-seq" or "Methyl-seq" (e.g. c("RNA-seq", "miRNA-seq")). It can also be a list with the omic names as names and their options as values, but we recommend to use the argument omicSim to provide the options to simulated each omic.
`omicsOptions`	List containing the options to simulate each omic. We recommend to apply the helper method omicSim to create this list in a friendly way, and the function omicData to provide custom data (see the related sections for more information). Each omic may have different configuration parameters, but the common ones are: simuData/idToGene Seed sample and association tables for regulatory omics. The helper function omicData should be used to provide this information (see the following section). regulatorEffect For regulatory omics. List containing the percentage of effect types (repressor, activator or no effect) over the total number of regulators. See vignette for more information. totalFeatures Number of features to simulate. By default, the total number of features in the seed dataset. depth Sequencing depth in millions of reads. If not provided, it takes the global parameter passed to mosim function. replicateParams List with parameters a and b for adjusting the variability in the generation of replicates using the negative binomial. See vignette for more information.
`diffGenes`	Number of differentially expressed genes to simulate, given in percentage (0 - 1) or in absolute number (> 1). By default 0.15
`numberReps`	Number of replicates per experimetal condition (and time point, if time series are to be generated). By default 3.
`numberGroups`	Number of experimental groups or conditions to simulate.
`times`	Vector of time points to consider in the experimental design.
`depth`	Sequencing depth in millions of reads.
`profileProbs`	Numeric vector with the probabilities to assign each of the patterns. Defaults to 0.2 for each.
`minMaxFC`	Numeric vector of length 2 with minimum and maximum fold-change for differentially expressed features, respectively.
`TFtoGene`	A logical value indicating if default transcription factors data should be used (TRUE) or not (FALSE), or a 3 column data frame containing custom associations. By default FALSE.

Value

Instance of class "MOSimulation" containing the multiomic simulation data.

Examples


 moSimulation <- mosim(
     omics = c("RNA-seq"),
     numberReps = 3,
     times = c(0, 2, 6, 12, 24)
 )

 # Retrieve simulated count matrix for RNA-seq
 dataRNAseq <- omicResults(moSimulation, "RNA-seq")


moSimulation <- mosim(
     omics = c("RNA-seq"),
     numberReps = 3,
     times = c(0, 2, 6, 12, 24)
 )

 # Retrieve simulated count matrix for RNA-seq
 dataRNAseq <- omicResults(moSimulation, "RNA-seq")

Set customized data for an omic.

Description

Set customized data for an omic.

Usage

omicData(omic, data = NULL, associationList = NULL)
omicData(omic, data = NULL, associationList = NULL)

Arguments

`omic`	The name of the omic to provide data.
`data`	Data frame with the omic identifiers as row names and just one column named Counts containing numeric values used as initial sample for the simulation.
`associationList`	Only for regulatory omics, a data frame with 2 columns, the first called containing the regulator ID and the second called Gene with the gene identifier.

Value

Initialized simulation object with the given data.

Examples


# Take a subset of the included dataset for illustration
# purposes. We could also load it from a csv file or RData,
# as long as we transform it to have 1 column named "Counts"
# and the identifiers as row names.

data(sampleData)

custom_rnaseq <- head(sampleData$SimRNAseq$data, 100)

# In this case, 'custom_rnaseq' is a data frame with
# the structure:
head(custom_rnaseq)
##                    Counts
## ENSMUSG00000000001   6572
## ENSMUSG00000000003      0
## ENSMUSG00000000028   4644
## ENSMUSG00000000031      8
## ENSMUSG00000000037      0
## ENSMUSG00000000049      0


# The helper 'omicData' returns an object with our custom data.
rnaseq_customdata <- omicData("RNA-seq", data = custom_rnaseq)

# Take a subset of the included dataset for illustration
# purposes. We could also load it from a csv file or RData,
# as long as we transform it to have 1 column named "Counts"
# and the identifiers as row names.

data(sampleData)

custom_rnaseq <- head(sampleData$SimRNAseq$data, 100)

# In this case, 'custom_rnaseq' is a data frame with
# the structure:
head(custom_rnaseq)
##                    Counts
## ENSMUSG00000000001   6572
## ENSMUSG00000000003      0
## ENSMUSG00000000028   4644
## ENSMUSG00000000031      8
## ENSMUSG00000000037      0
## ENSMUSG00000000049      0


# The helper 'omicData' returns an object with our custom data.
rnaseq_customdata <- omicData("RNA-seq", data = custom_rnaseq)

Retrieves the simulated data.

Description

Retrieves the simulated data.

Usage

omicResults(simulation, omics = NULL, format = "data.frame")
omicResults(simulation, omics = NULL, format = "data.frame")

Arguments

`simulation`	A MOSimulation object.
`omics`	List of the omics to retrieve the simulated data.
`format`	Type of object to use for returning the results

Value

A list containing an element for every omic specifiec, with the simulation data in the format indicated, or a numeric matrix with simulated data if the omic name is directly provided.

Examples


omic_list <- c("RNA-seq")
rnaseq_simulation <- mosim(omics = omic_list)
#' # This will be a data frame with RNA-seq counts
rnaseq_simulated <- omicResults(rnaseq_simulation, "RNA-seq")

#                    Group1.Time0.Rep1 Group1.Time0.Rep2 Group1.Time0.Rep3 ...
# ENSMUSG00000073155              4539              5374              5808 ...
# ENSMUSG00000026251                 0                 0                 0 ...
# ENSMUSG00000040472              2742              2714              2912 ...
# ENSMUSG00000021598              5256              4640              5130 ...
# ENSMUSG00000032348               421               348               492 ...
# ENSMUSG00000097226                16                14                 9 ...
# ENSMUSG00000027857                 0                 0                 0 ...
# ENSMUSG00000032081                 1                 0                 0 ...
# ENSMUSG00000097164               794               822               965 ...
# ENSMUSG00000097871                 0                 0                 0 ...

omic_list <- c("RNA-seq")
rnaseq_simulation <- mosim(omics = omic_list)
#' # This will be a data frame with RNA-seq counts
rnaseq_simulated <- omicResults(rnaseq_simulation, "RNA-seq")

#                    Group1.Time0.Rep1 Group1.Time0.Rep2 Group1.Time0.Rep3 ...
# ENSMUSG00000073155              4539              5374              5808 ...
# ENSMUSG00000026251                 0                 0                 0 ...
# ENSMUSG00000040472              2742              2714              2912 ...
# ENSMUSG00000021598              5256              4640              5130 ...
# ENSMUSG00000032348               421               348               492 ...
# ENSMUSG00000097226                16                14                 9 ...
# ENSMUSG00000027857                 0                 0                 0 ...
# ENSMUSG00000032081                 1                 0                 0 ...
# ENSMUSG00000097164               794               822               965 ...
# ENSMUSG00000097871                 0                 0                 0 ...

Retrieves the settings used in a simulation

Description

Retrieves the settings used in a simulation

Usage

omicSettings(
  simulation,
  omics = NULL,
  association = FALSE,
  reverse = FALSE,
  only.linked = FALSE,
  prefix = FALSE,
  include.lagged = TRUE
)
omicSettings(
  simulation,
  omics = NULL,
  association = FALSE,
  reverse = FALSE,
  only.linked = FALSE,
  prefix = FALSE,
  include.lagged = TRUE
)

Arguments

`simulation`	A MOSimulation object.
`omics`	List of omics to retrieve the settings.
`association`	A boolean indicating if the association must also be returned for the regulators.
`reverse`	A boolean, swap the column order in the association list in case we want to use the output directly and the program requires a different ordering.
`only.linked`	Return only the interactions that have an effect.
`prefix`	Logical indicating if the name of the omic should prefix the name of the regulator.
`include.lagged`	Logical indicating if interactions with transitory profile and different minimum/maximum time point between gene and regulator should be included or not.

Value

A list containing a data frame with the settings used to simulate each of the indicated omics. If association is TRUE, it will be a list with 3 keys: 'associations', 'settings' and 'regulators', with the first two keys being a list containing the information for the selected omics and the last one a global data frame giving the merged information.

Examples


omic_list <- c("RNA-seq", "miRNA-seq")
multi_simulation <- mosim(omics = omic_list)

# This will be a data frame with RNA-seq settings (DE flag, profiles)
rnaseq_settings <- omicSettings(multi_simulation, "RNA-seq")

# This will be a list containing all the simulated omics (RNA-seq
# and DNase-seq in this case)
all_settings <- omicSettings(multi_simulation)

omic_list <- c("RNA-seq", "miRNA-seq")
multi_simulation <- mosim(omics = omic_list)

# This will be a data frame with RNA-seq settings (DE flag, profiles)
rnaseq_settings <- omicSettings(multi_simulation, "RNA-seq")

# This will be a list containing all the simulated omics (RNA-seq
# and DNase-seq in this case)
all_settings <- omicSettings(multi_simulation)

Set the simulation settings for an omic.

Description

Set the simulation settings for an omic.

Usage

omicSim(omic, depth = NULL, totalFeatures = NULL, regulatorEffect = NULL)
omicSim(omic, depth = NULL, totalFeatures = NULL, regulatorEffect = NULL)

Arguments

`omic`	Name of the omic to set the settings.
`depth`	Sequencing depth in millions of counts. If not provided will take the global parameter passed to mosim function.
`totalFeatures`	Limit the number of features to simulate. By default include all present in the dataset.
`regulatorEffect`	only for regulatory omics. Associative list containing the percentage of effects over the total number of regulator, including repressor, association and no effect (NE).

Value

A list with the appropiate structure to be given as options in mosim function.

Examples


omic_list <- c("RNA-seq", "miRNA-seq")

rnaseq_options <- c(omicSim("miRNA-seq", totalFeatures = 2500))

# The return value is an associative list compatible with
# 'omicsOptions'
rnaseq_simulation <- mosim(omics = omic_list,
                           omicsOptions = rnaseq_options)

omic_list <- c("RNA-seq", "miRNA-seq")

rnaseq_options <- c(omicSim("miRNA-seq", totalFeatures = 2500))

# The return value is an associative list compatible with
# 'omicsOptions'
rnaseq_simulation <- mosim(omics = omic_list,
                           omicsOptions = rnaseq_options)

Generate a plot of a feature's profile for one or two omics.

Description

Generate a plot of a feature's profile for one or two omics.

Usage

plotProfile(simulation, omics, featureIDS, drawReps = FALSE, groups = NULL)
plotProfile(simulation, omics, featureIDS, drawReps = FALSE, groups = NULL)

Arguments

`simulation`	A MOSimulation object
`omics`	Character vector of the omics to simulate.
`featureIDS`	List containing the feature to show per omic. Must have the omics as the list names and the features as values.
`drawReps`	Logical to enable/disable the representation of the replicates inside the plot.
`groups`	Character vector indicating the groups to plot in the form "GroupX" (i.e. Group1)

Value

A ggplot2 object.

Examples

omic_list <- c("RNA-seq", "miRNA-seq")

rnaseq_options <- c(omicSim("miRNA-seq", totalFeatures = 2500))
rnaseq_simulation <- mosim(omics = omic_list,
                           omicsOptions = rnaseq_options)

#plotProfile(rnaseq_simulation,
#     omics = c("RNA-seq", "miRNA-seq"),
#   featureIDS = list("RNA-seq"="ENSMUSG00000007682", "miRNA-seq"="mmu-miR-320-3p")
#)

omic_list <- c("RNA-seq", "miRNA-seq")

rnaseq_options <- c(omicSim("miRNA-seq", totalFeatures = 2500))
rnaseq_simulation <- mosim(omics = omic_list,
                           omicsOptions = rnaseq_options)

#plotProfile(rnaseq_simulation,
#     omics = c("RNA-seq", "miRNA-seq"),
#   featureIDS = list("RNA-seq"="ENSMUSG00000007682", "miRNA-seq"="mmu-miR-320-3p")
#)

random_unif_interval Function to call the C code This function is a copy of the 'random_unif_interval' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license.

Description

random_unif_interval Function to call the C code This function is a copy of the 'random_unif_interval' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license.

Usage

random_unif_interval(size, max_val)
random_unif_interval(size, max_val)

Arguments

`size`	from sparsim
`max_val`	from sparsim

Default data

Description

Dataset with base counts and id-gene tables.

Usage

data("sampleData")
data("sampleData")

Format

An object of class list of length 6.

Details

List with 6 elements:

SimRNAseq

data: Dataframe with base counts with gene id as rownames.
geneLength: Length of every gene.

SimChIPseq

data: Dataframe with base counts with regions as rownames.
idToGene: Dataframe with region as "ID" column and gene name on "Gene" column.

SimDNaseseq

data: Dataframe with base counts with regions as rownames.
idToGene: Dataframe with region as "ID" column and gene name on "Gene" column.

SimMiRNAseq

data: Dataframe with base counts with miRNA id as rownames.
idToGene: Dataframe with miRNA as "ID" column and gene name on "Gene" column.

SimMethylseq

idToGene: Dataframe with region as "ID" column and gene name on "Gene" column.

CpGisland

Dataframe of CpG to be used as initialization data, located on "Region" column

sc_mosim

Description

Performs multiomic simulation of single cell datasets

Usage

sc_mosim(
  omics,
  cellTypes,
  numberReps = 1,
  numberGroups = 1,
  diffGenes = NULL,
  minFC = 0.25,
  maxFC = 4,
  numberCells = NULL,
  mean = NULL,
  sd = NULL,
  noiseRep = 0.1,
  noiseGroup = 0.5,
  regulatorEffect = NULL,
  associationList = NULL,
  feature_no = 8000,
  clusters = 3,
  cluster_size = NULL,
  TF = FALSE,
  TFdf = NULL
)
sc_mosim(
  omics,
  cellTypes,
  numberReps = 1,
  numberGroups = 1,
  diffGenes = NULL,
  minFC = 0.25,
  maxFC = 4,
  numberCells = NULL,
  mean = NULL,
  sd = NULL,
  noiseRep = 0.1,
  noiseGroup = 0.5,
  regulatorEffect = NULL,
  associationList = NULL,
  feature_no = 8000,
  clusters = 3,
  cluster_size = NULL,
  TF = FALSE,
  TFdf = NULL
)

Arguments

`omics`	named list containing the omic to simulate as names, which can be "scRNA-seq" or "scATAC-seq".
`cellTypes`	list where the i-th element of the list contains the column indices for i-th experimental conditions. List must be a named list.
`numberReps`	OPTIONAL. Number of replicates per group
`numberGroups`	OPTIONAL. number of different groups
`diffGenes`	OPTIONAL. If number groups > 1, Percentage DE genes to simulate. List of vectors (one per group to compare to group 1) where the vector contains absolute number of genes for Up and Down ex: c(250, 500) or a percentage for up, down ex: c(0.2, 0.2). The rest will be NE
`minFC`	OPTIONAL. Threshold of FC below which are downregulated, by default 0.25
`maxFC`	OPTIONAL. Threshold of FC above which are upregulated, by default 4
`numberCells`	OPTIONAL. Vector of numbers. The numbers correspond to the number of cells the user wants to simulate per each cell type. The length of the vector must be the same as length of `cellTypes`.
`mean`	OPTIONAL. Vector of numbers of mean depth per each cell type. Must be specified just if `numberCells` is specified.The length of the vector must be the same as length of `cellTypes`.
`sd`	OPTIONAL. Vector of numbers of standard deviation per each cell type. Must be specified just if `numberCells` is specified.The length of the vector must be the same as length of `cellTypes`.
`noiseRep`	OPTIONAL. Number indicating the desired standard deviation between biological replicates.
`noiseGroup`	OPTIONAL. Number indicating the desired standard deviation between treatment groups
`regulatorEffect`	OPTIONAL. To simulate relationship scRNA-scATAC, list of vectors (one per group) where the vector contains absolute number of regulators for Activator and repressor ex: c(150, 200) or a percentage for Activator and repressor ex: c(0.2, 0.1). The rest will be NE. If not provided, no table of association between scRNA and scATAC is outputted.
`associationList`	REQUIRED A 2 columns dataframe reporting peak ids related to gene names. If user doesnt have one, load from package data("associationList")
`feature_no`	OPTIONAL. If only scRNA-seq to simulate or scRNA and scATAC but no regulatory constraints, total number of features to be distributed between the coexpression clusters.
`clusters`	OPTIONAL. Number of co-expression patterns the user wants to simulate
`cluster_size`	OPTIONAL. It may be inputted by the user. Recommended: by default, its the number of features divided by the number of patterns to generate.
`TF`	OPTIONAL default is FALSE, if true, extract TF dataframe
`TFdf`	OPTIONAL, default is NULL. If an association matrix of TF and Target_gene is given the TF expression values are extracted. If no data.frame is given, using the association of human TF from https://tflink.net/

Value

a list of Seurat object, one per each omic.

Examples

omic_list <- sc_omicData(list("scRNA-seq"))
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
sim <- sc_mosim(omic_list, cell_types)

omic_list <- sc_omicData(list("scRNA-seq"))
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
sim <- sc_mosim(omic_list, cell_types)

sc_omicData

Description

Checks if the user defined data is in the correct format, or loads the default multiomics pbmc dataset, a subset from SeuratData package

Usage

sc_omicData(omics_types, data = NULL)
sc_omicData(omics_types, data = NULL)

Arguments

`omics_types`	A list of strings which can be either "scRNA-seq" or "scATAC-seq"
`data`	A user input matrix with genes (peaks in case of scATAC-seq) as rows and cells as columns. By default, it loads the example data. If a user input matrix is included, cell columns must be sorted by cell t ype.

Value

a named list with omics type as name and the count matrix as value

Examples

# Simulate from PBMC
omicsList <- sc_omicData(list("scRNA-seq", "scATAC-seq"))

# Simulate from PBMC
omicsList <- sc_omicData(list("scRNA-seq", "scATAC-seq"))

sc_omicResults

Description

sc_omicResults

Usage

sc_omicResults(sim)
sc_omicResults(sim)

Arguments

sim

a simulated object from sc_mosim function

Value

list of seurat objects with simulated data

Examples


cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
omicsList <- sc_omicData(list("scRNA-seq"))
sim <- sc_mosim(omicsList, cell_types)
res <- sc_omicResults(sim)
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
omicsList <- sc_omicData(list("scRNA-seq"))
sim <- sc_mosim(omicsList, cell_types)
res <- sc_omicResults(sim)

sc_omicSettings

Description

sc_omicSettings

Usage

sc_omicSettings(sim, TF = FALSE)
sc_omicSettings(sim, TF = FALSE)

Arguments

`sim`	a simulated object from sc_mosim function
`TF`	OPTIONAL default is FALSE, if true, extract TF association matrix

Value

list of Association matrices explaining the effects of each regulator to each gene

Examples


cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
omicsList <- sc_omicData(list("scRNA-seq"))
sim <- sc_mosim(omicsList, cell_types)
res <- sc_omicSettings(sim)
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
omicsList <- sc_omicData(list("scRNA-seq"))
sim <- sc_mosim(omicsList, cell_types)
res <- sc_omicSettings(sim)

sc_param_estimation

Description

Evaluate the users parameters for single cell simulation and use SPARSim to simulate the main dataset. Internal function

Usage

sc_param_estimation(
  omics,
  cellTypes,
  diffGenes = list(c(0.2, 0.2)),
  minFC = 0.25,
  maxFC = 4,
  numberCells = NULL,
  mean = NULL,
  sd = NULL,
  noiseGroup = 0.5,
  group = 1,
  genereggroup
)
sc_param_estimation(
  omics,
  cellTypes,
  diffGenes = list(c(0.2, 0.2)),
  minFC = 0.25,
  maxFC = 4,
  numberCells = NULL,
  mean = NULL,
  sd = NULL,
  noiseGroup = 0.5,
  group = 1,
  genereggroup
)

Arguments

`omics`	named list containing the omics to simulate as names, which can be "scRNA-seq" or "scATAC-seq".
`cellTypes`	list where the i-th element of the list contains the column indices for i-th cell type. List must be a named list.
`diffGenes`	If number groups > 1, Percentage DE genes to simulate. List of vectors (one per group to compare to group 1) where the vector contains absolute number of genes for Up and Down ex: c(250, 500) or a percentage for up, down ex: c(0.2, 0.2). The rest will be NE
`minFC`	Threshold of FC below which are downregulated, by default 0.25
`maxFC`	Threshold of FC above which are upregulated, by default 4
`numberCells`	vector of numbers. The numbers correspond to the number of cells the user wants to simulate per each cell type. The length of the vector must be the same as length of `cellTypes`.
`mean`	vector of numbers of mean depth per each cell type. Must be specified just if `numberCells` is specified.
`sd`	vector of numbers of standard deviation per each cell type. Must be specified just if `numberCells` is specified.
`noiseGroup`	OPTIONAL. Number indicating the desired standard deviation between treatment groups
`group`	Group for which to estimate parameters
`genereggroup`	List with information of genes, clusters and regulators that must be related to each other

Value

a list of Seurat object, one per each omic.

a named list with simulation parameters for each omics as values.

Examples

omicsList <- sc_omicData(list("scRNA-seq"))
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
#estimated_params <- sc_param_estimation(omicsList, cell_types)

omicsList <- sc_omicData(list("scRNA-seq"))
cell_types <- list('Treg' = c(1:10),'cDC' = c(11:20),'CD4_TEM' = c(21:30),
'Memory_B' = c(31:40))
#estimated_params <- sc_param_estimation(omicsList, cell_types)

Data to test scMOSim

Description

Data to test scMOSim

Usage

data("scatac")
data("scatac")

Format

A seurat Object, subset from seuratData with ATAC

assays: ATAC expression values
meta.data: annotations of celltypes

@source https://github.com/satijalab/seurat-data, we took 11 cells from each of 4 celltypes

Data to test scMOSim

Description

Data to test scMOSim

Usage

data("scrna")
data("scrna")

Format

A seurat Object, subset from seuratData with RNA

assays: RNA expression values
meta.data: annotations of celltypes

@source https://github.com/satijalab/seurat-data, we took 11 cells from each of 4 celltypes This is how: dat <- pbmcMultiome.SeuratData::pbmc.rna dat <- subset(x = dat, subset = seurat_annotations "cDC", "Memory B", "Treg")) unique_cell_types <- unique(datATmeta.data$seurat_annotations) extracted_cells <- list() cellnames <- c() for (cell_type in unique_cell_types) type_cells <- subset(dat, subset = seurat_annotations counts <- as.matrix(type_cellsATassays[["RNA"]]ATcounts) extracted_cells[[cell_type]] <- counts[, 1:10] cellnames <- append(cellnames, replicate(11, cell_type)) scrna <- Reduce(cbind, extracted_cells)

shuffle_group_matrix, Reorder cell type-specific expression matrix during co-expression simulation. Copied from ACORDE (https://github.com/ConesaLab/acorde) to facilitate stability and running within our scripts This function is a slightly modified copy of the 'shuffle_group_matrix' function from the 'Acorde' package (v1.0.0), originally developed by Arzalluz-Luque A, Salguero P, Tarazona S, Conesa A. (2022). acorde unravels functionally interpretable networks of isoform co-usage from single cell data. Nature communications 1828. DOI: 10.1038/s41467-022-29497-w. The original package is licensed under the GPL-3 license.

Description

This function is used internally by acorde to perform the shuffling of simulated features for an individual cell type, as part of the co-expression simulation process. The function is called recursively by simulate_coexpression() to perform the simulation on a full scRNA-seq matrix.

Usage

shuffle_group_matrix(sim_data, feature_ids, group_pattern, ngroups)
shuffle_group_matrix(sim_data, feature_ids, group_pattern, ngroups)

Arguments

`sim_data`	A count matrix with features as rows and cells as columns. Feature IDs must be included in an additional column named `feature`.
`feature_ids`	A two-column `tibble` containing `top` and `bottom` columns, each including the feature IDs of features to be used as highly or lowly expressed when shuffling by the indicated expression pattern.
`group_pattern`	A logical vector, containing `TRUE` to indicate that high expression in that cell type is desired and `FALSE` if the opposite. The vector must be ordered as the cell types in `sim_data`.
`ngroups`	An integer indicating the number of groups that top and bottom features should be divided into. It is computed by dividing the number of features selected as highly/lowly expressed by the size of the clusters that are to be generated.

Value

An expression matrix, with the same characteristics as sim_data, and a number of features defined as the total amount of top/bottom features selected divided by the number of clusters for which co-expression patterns where supplied.

simulate coexpression

Description

Adapted from ACORDE (https://github.com/ConesaLab/acorde) to adapt to our data input type. Simulates coexpression of genes along celltypes

Usage

simulate_coexpression(
  sim_matrix,
  feature_no,
  cellTypes,
  patterns,
  cluster_size = NULL
)
simulate_coexpression(
  sim_matrix,
  feature_no,
  cellTypes,
  patterns,
  cluster_size = NULL
)

Arguments

`sim_matrix`	Matrix with rows as features and columns as cells
`feature_no`	Total number of features to be distributed between the coexpression clusters
`cellTypes`	list where the i-th element of the list contains the column indices for i-th experimental conditions. List must be a named list.
`patterns`	Tibble with TRUE FALSE depicting the cluster patterns to simulate. Generated by the user or by `make_cluster_patterns`.
`cluster_size`	OPTIONAL. It may be inputted by the user. By default, its the number of features divided by the number of patterns to generate.

Details

This function is a slightly modified copy of the 'simulate_coexpression' function from the 'Acorde' package (v1.0.0), originally developed by Arzalluz-Luque A, Salguero P, Tarazona S, Conesa A. (2022). acorde unravels functionally interpretable networks of isoform co-usage from single cell data. Nature communications 1828. DOI: 10.1038/s41467-022-29497-w. The original package is licensed under the GPL-3 license.

Value

the simulated coexpression

Simulate technical variability

Description

Function to simulate the technical variability (i.e. a multivariate hypergeometric on a gamma expression value array)

Usage

simulate_hyper(avgAbund, seqdepth = NULL, digits, max_val)
simulate_hyper(avgAbund, seqdepth = NULL, digits, max_val)

Arguments

`avgAbund`	array containing the intensity values for each feature. It describes the intensity of a single sample
`seqdepth`	sequencing depth (i.e. sample size of the MH)
`digits`	number of digits for random number generation
`max_val`	max value for random number generation

Details

This function is a copy of the 'simulate_hyper' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license.

Value

An array of length(avgAbund) elements representing the count values for the current sample

Create SPARSim simulation parameter

Description

Function to create a SPARSim simulation parameter. This function is a copy of the 'SPARSIM_create_simulation_parameter' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license. To simulate N feature (e.g. genes), user must specify N values of gene expression level and gene expression variability in the function input parameters intensity and variability, respectively. To simulate M samples (i.e. cells), user must specify M values of sample library size in the function input parameter library_size.

Usage

sparsim_create_simulation_parameter(
  intensity,
  variability,
  library_size,
  feature_names = NA,
  sample_names = NA,
  condition_name = NA,
  intensity_2 = NULL,
  variability_2 = NULL,
  p_bimod = NULL
)
sparsim_create_simulation_parameter(
  intensity,
  variability,
  library_size,
  feature_names = NA,
  sample_names = NA,
  condition_name = NA,
  intensity_2 = NULL,
  variability_2 = NULL,
  p_bimod = NULL
)

Arguments

`intensity`	Array of gene expression intensity values
`variability`	Array of gene expression variability values
`library_size`	Array of library size values
`feature_names`	Array of feature names. It must be of the same length of `intensity` array. If NA (default), feature will be automatically named "gene_1", "gene_2", ... "gene_<N>", where N = length(intensity)
`sample_names`	Array of sample names. It must be of the same length of `library_size` array. If NA (defatul), sample will be automatically named "<condition_name>_cell1", "<condition_name>_cell2", ..., "<condition_name>_cell<M>", where M = length(library_size)
`condition_name`	Name associated to the current experimental condition. If NA (default), it will be set to "cond<l1><l2>", where l1 and l2 are two random letters.
`intensity_2`	Array of gene expression intensity values for the second expression mode, if simulating genes with bimodal gene expression. Entries containing `NAs` will be ignored. If NULL (default), no bimodal gene expression is simulated.
`variability_2`	Array of gene expression variability values for the second expression mode, if simulating genes with bimodal gene expression. If NULL (default), no bimodal gene expression is simulated.
`p_bimod`	Array of bimodal gene expression probabilities; the i-th value indicates the probability `p` of the i-th gene to be expressed in the first mode (i.e. the one specified in the i-th entries of parameters `intensity` and `variability`); with probability `1-p` the i-th gene will be expressed in the second mode (i.e. the one specified in the i-th entries of parameters `intensity_2` and `variability_2`)

Details

User can optionally specify the names to assign at the single feature and sample to simulate (function input parameters feature_names and sample_names, respectively, as well as the name of the experimental condition (function input parameter condition_name). If the user does not specify such information, the function will set some default values.

To simulate T different experimental conditions in a single count table, then T different simulation parameters must be created.

Value

SPARSim simulation parameter describing one experimental condition

Estimate SPARSIm "intensity" parameter

Description

Function to estimate the intensity values from the genes in data. The intensity is computed as mean of normalized counts for each gene.

Usage

sparsim_estimate_intensity(data)
sparsim_estimate_intensity(data)

Arguments

data

normalized count data matrix (gene on rows, samples on columns). rownames(data) must contain gene names.

Details

This function is a copy of the 'SPARSIM_estimate_intensity' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license. This function is used in sparsim_estimate_parameter_from_data to compute SPARSim "intensity" parameter, given a real count table as input. If the count table contains more than one experimental condition, then the function is applied to each experimental conditions.

Value

An array of intensity values having N_genes elements (N_genes = nrow(data)). Array entries are named with gene names.

Estimate SPARSim "library size" parameter

Description

Function to estimate the library sizes from the samples in data.

Usage

sparsim_estimate_library_size(data)
sparsim_estimate_library_size(data)

Arguments

data

raw count data matrix (gene on rows, samples on columns)

Details

This function is a copy of the 'SPARSIM_estimate_library_size' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license. This function is used in sparsim_estimate_parameter_from_data to compute SPARSim "library size" parameter, given a real count table as input. If the count table contains more than one experimental condition, then the function is applied to each experimental conditions.

Value

An array of library size values having N_samples elements (N_samples = ncol(data))

Estimate SPARSim simulation parameter from a given count table

Description

Function to estimate SPARSim simulation parameters (intensity, variability and library sizes) from a real count table. If the real count table contains more than one experimental condition, it is possible to estimate the parameters for each experimental condition.

Usage

sparsim_estimate_parameter_from_data(raw_data, norm_data, conditions)
sparsim_estimate_parameter_from_data(raw_data, norm_data, conditions)

Arguments

`raw_data`	count matrix (gene on rows, samples on columns) containing raw count data
`norm_data`	count matrix (gene on rows, samples on columns) containing normalized count data
`conditions`	list where the i-th element of the list contains the column indices for i-th experimental conditions. List must be a named list.

Details

This function is a copy of the 'SPARSIM_estimate_parameter_from_data' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license.

Value

A SPARSim simulation parameters

Estimate SPARSim "variability" parameter

Description

Function to estimate the variability values from the genes in data.

Usage

sparsim_estimate_variability(data)
sparsim_estimate_variability(data)

Arguments

data

raw count data matrix (gene on rows, samples on columns)

Details

This function is a copy of the 'SPARSIM_estimate_variability' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license. This function is used in sparsim_estimate_parameter_from_data to compute SPARSim "variability" parameter, given a real count table as input. If the count table contains more than one experimental condition, then the function is applied to each experimental conditions.

Value

An array of variability values having N_genes elements (N_genes = nrow(data))

Function to simulate a raw count table

Description

This function is a copy of the 'SPARSIM_simulation' function from the 'SPARSim' package (v0.9.5), originally developed by Giacomo Baruzzo, Ilaria Patuzzi, Barbara Di Camillo (2020). The original package is licensed under the GPL-3 license.

Usage

sparsim_simulation(
  dataset_parameter,
  output_sim_param_matrices = FALSE,
  output_batch_matrix = FALSE,
  count_data_simulation_seed = NULL
)
sparsim_simulation(
  dataset_parameter,
  output_sim_param_matrices = FALSE,
  output_batch_matrix = FALSE,
  count_data_simulation_seed = NULL
)

Arguments

`dataset_parameter`	list containing, the intensity, variability and lib sizes of each experimental condition. It is the return value of "estimate_parameter_from_data" or could be created by the users
`output_sim_param_matrices`	boolean flag. If TRUE, the function will output two additional matrices, called abundance_matrix and variability_matrix, containing the gene intensities and gene variabilities used as simulation input. (Default: FALSE)
`output_batch_matrix`	boolean flag. If TRUE, the function will output an additional matrix, called batch_factors_matrix, containing the multiplicative factors used in batch effect simulation. (Default: FALSE)
`count_data_simulation_seed`	inherited from sparsim

Value

A list of 5 elements:

- count_matrix: the simulated count matrix (genes on rows, samples on columns)

- gene_matrix: the simulated gene expression levels (genes on rows, samples on columns)

- abundance_matrix: the input gene intensity values provided as input (genes on rows, samples on columns), if output_sim_param_matrices = TRUE. NULL otherwise.

- variability_matrix: the input gene variability values provided as input (genes on rows, samples on columns), if output_sim_param_matrices = TRUE. NULL otherwise.

- batch_factors_matrix: the multiplicative factor used in batch generation (genes on rows, samples on columns), if output_batch_matrix = TRUE. NULL otherwise.

Data to extract human TF

Description

Data to extract human TF

Usage

data("TF_human")
data("TF_human")

Format

vector of gene names

data.frame: gene names corresponding to TF and to Target genes

@source https://tflink.net/

Package 'MOSim'

Help Index

Data to showcase scRNA and scATAC-seq association

Description

Usage

Format

calculate_mean_per_list_df

Description

Usage

Arguments

Examples

check_patterns

Description

Usage

Arguments

Value

Examples

Discretize ChIP-Seq counts to simulate a binary dataset

Description

Usage

Arguments

Value

Examples

Retrieves the experimental design

Description

Usage

Arguments

Value

Examples

Check if a variable is declared.

Description

Usage

Arguments

Value

make_association_dataframe

Description

Usage

Arguments

Value

make_cluster_patterns

Description

Usage

Arguments

Value

Examples

match_gene_regulator

Description

Usage

Arguments

Examples

match_gene_regulator_cluster

Description

Usage

Arguments

Examples

mosim

Description

Usage

Arguments

Value

Examples

Set customized data for an omic.

Description

Usage

Arguments

Value

Examples

Retrieves the simulated data.

Description

Usage

Arguments

Value

Examples

Retrieves the settings used in a simulation

Description

Usage

Arguments

Value

Examples

Set the simulation settings for an omic.