Title: | Multiview Intercellular SpaTial modeling framework |
---|---|
Description: | mistyR is an implementation of the Multiview Intercellular SpaTialmodeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation, but also, the views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution. |
Authors: | Jovan Tanevski [cre, aut] , Ricardo Omar Ramirez Flores [ctb] , Philipp Schäfer [ctb] |
Maintainer: | Jovan Tanevski <[email protected]> |
License: | GPL-3 |
Version: | 1.15.0 |
Built: | 2024-12-29 06:10:02 UTC |
Source: | https://github.com/bioc/mistyR |
The juxtaview captures the expression of all markers within the immediate neighborhood of a spatial unit.
add_juxtaview( current.views, positions, neighbor.thr = 15, prefix = "", cached = FALSE, verbose = TRUE )
add_juxtaview( current.views, positions, neighbor.thr = 15, prefix = "", cached = FALSE, verbose = TRUE )
current.views |
the current view composition. |
positions |
a |
neighbor.thr |
a threshold value used to indicate the largest distance between two spatial units that can be considered as neighboring. |
prefix |
a prefix to add to the column names. |
cached |
a |
verbose |
a |
The neighborhood of each spatial unit is estimated by constructing a graph by
2D Delaunay triangulation following by removal of edges with length larger than
neighbor.thr
. For each spatial unit the juxtaview contains the sum of
expressions across its estimated neighbors for each marker.
A mistyR view composition with added juxtaview.
create_initial_view()
for
starting a view composition with an intraview only.
Other view composition functions:
add_paraview()
,
add_views()
,
create_initial_view()
,
create_view()
,
remove_views()
# Create a view composition of an intraview and a juxtaview. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_juxtaview(pos, neighbor.thr = 1.5) # preview str(misty.views[["juxtaview.1.5"]])
# Create a view composition of an intraview and a juxtaview. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_juxtaview(pos, neighbor.thr = 1.5) # preview str(misty.views[["juxtaview.1.5"]])
The paraview captures the expression of all markers in the broader tissue structure.
add_paraview( current.views, positions, l, zoi = 0, family = c("gaussian", "exponential", "linear", "constant"), approx = 1, nn = NULL, prefix = "", cached = FALSE, verbose = TRUE )
add_paraview( current.views, positions, l, zoi = 0, family = c("gaussian", "exponential", "linear", "constant"), approx = 1, nn = NULL, prefix = "", cached = FALSE, verbose = TRUE )
current.views |
the current view composition. |
positions |
a |
l |
effective radius of influence of expression in the broader tissue structure. |
zoi |
spatial units with distance smaller than the zone of indifference will not be taken into account when generating the paraview. |
family |
the family f functions used to generate weights. (see Details) |
approx |
rank of the Nyström approximation matrix. (see Details) |
nn |
the number of spatial units to be used for approximating the paraview using a fast nearest neighbor search. (see Details) |
prefix |
a prefix to add to the column names. |
cached |
a |
verbose |
a |
The paraview is generated by weighted sum of the expression of all spatial
units for each marker. The weights for each spatial unit i are dependent
on the family
which can be one of "gaussian", "exponential", "linear"
or "constant".
If "gaussian" the weights are calculated based on the distance to the spatial
unit j and the parameter l
using the radial basis function
The parameter l
here denotes the "effective" radius of influence.
If "exponential" the weights are calculated based on the distance to the spatial
unit j and the parameter l
using the exponential function
The parameter l
here denotes signaling length. For more information
consult Oyler-Yaniv et. al. Immunity 46(4) 2017.
If "linear" the weights are calculated based on the distance to the spatial
unit j and the parameter l
using the linear function
The parameter l
here denotes the intersect of the linear function. For
distances larger than l
the weight is equal to 0.
If "constant" the weights are always 1. The parameter l
here denotes
the number of nearest neighbors to take into account if nn
is not
defined.
Since the generation of the paraview requires the calculation of pairwise distances
of all spatial units it can take a significant amount of computation time. The
parameters approx
and nn
can be set to speed up the calculation
by approximation. The approximation can be achieved by using the Nyström
low-rank approximation method or by limiting the calculation of the paraview
to a number of nearest neighbors around each spatial unit.
If the value of approx
is between 0 and 1 it will be interpreted as
fraction of the number of spatial units. Discrete values above 1 will be
interpreted as the size of the approximation block. The number of nearest
neighbors nn
around each spatial unit are determined using a fast
nearest neighbor search.
If both approx
and nn
have non-null values, nn
has priority and an approximation based on fast nearest neighbor
search will be used to generate the paraview.
A mistyR view composition with added paraview with parameter l
.
create_initial_view()
for
starting a view composition with an intraview only.
Other view composition functions:
add_juxtaview()
,
add_views()
,
create_initial_view()
,
create_view()
,
remove_views()
# Create a view composition of an intraview and a paraview with radius 10. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10) # preview str(misty.views[["paraview.10"]])
# Create a view composition of an intraview and a paraview with radius 10. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10) # preview str(misty.views[["paraview.10"]])
Add one or more custom views to the current view composition.
add_views(current.views, new.views)
add_views(current.views, new.views)
current.views |
the current view composition. |
new.views |
a view or a list of views created with
|
A mistyR view composition containing an union of views from
current.views
and new.views
.
create_initial_view()
for
starting a view composition, with an intraview,
create_view()
for creating a custom view.
Other view composition functions:
add_juxtaview()
,
add_paraview()
,
create_initial_view()
,
create_view()
,
remove_views()
# create random views view1 <- data.frame(marker1 = rnorm(100, 10, 2), marker2 = rnorm(100, 15, 3)) view2 <- data.frame(marker1 = rnorm(100, 10, 5), marker2 = rnorm(100, 15, 5)) misty.views <- create_initial_view(view1) new.view <- create_view("dummyname", view2, "dname") add_views(misty.views, new.view) misty.views %>% add_views(create_view("dummyname", view2, "dname"))
# create random views view1 <- data.frame(marker1 = rnorm(100, 10, 2), marker2 = rnorm(100, 15, 3)) view2 <- data.frame(marker1 = rnorm(100, 10, 5), marker2 = rnorm(100, 15, 5)) misty.views <- create_initial_view(view1) new.view <- create_view("dummyname", view2, "dname") add_views(misty.views, new.view) misty.views %>% add_views(create_view("dummyname", view2, "dname"))
Purge the cache or clear the cached objects for a single sample.
clear_cache(id = NULL)
clear_cache(id = NULL)
id |
the unique id of the sample. |
The cached objects are removed from disk and cannot be retrieved. Whenever
possible specifying an id
is reccomended. If id = NULL
all
contents of the folder ‘.misty.temp’ will be removed.
None (NULL
)
clear_cache("b98ad35f4e671871cba35f2155228612") clear_cache()
clear_cache("b98ad35f4e671871cba35f2155228612") clear_cache()
Collect and aggregate performance, contribution and importance estimations
of a set of raw results produced by run_misty()
.
collect_results(folders)
collect_results(folders)
folders |
Paths to folders containing the raw results from
|
List of collected performance, contributions and importances per sample, performance and contribution statistics and aggregated importances.
Long format tibble
with measurements
of performance for each target and each sample.
Available performance measures are RMSE and variance explained
(R2) for a model containing only an intrinsic view
(intra.RMSE, intra.R2), model with all views
(multi.RMSE, multi.R2), gain of RMSE and gain of
variance explained of multi-view model over the intrisic model
where gain.RMSE is the relative decrease of RMSE in percent,
while gain.R2 is the absolute increase of variance explained
in percent. Each value represents the mean performance across
folds (k-fold cross-validation). The p values of a one sided
t-test of improvement of performance (p.RMSE, p.R2)
are also available as a measure.
Long format tibble
with summary
statistics (mean, standard deviation and coefficient of variation)
for all performance measures for each target over all samples.
Long format tibble
with the values
of the coefficients for each view in the meta-model, for each
target and each sample. The p values for the coefficient
for each view, under the null hypothesis of zero contribution to the
meta model are also available.
Long format tibble
with summary
statistics for all views per target over all samples. Including
mean coffecient value, fraction of contribution, mean and standard
deviation of p values.
List of view-specific predictor-target importance tables per sample. The importances in each table are standardized per target and weighted by the quantile of the coefficient for the target in that view. Columns other than Predictor represent target markers.
A list of aggregated view-specific predictor-target importance tables . Aggregation is reducing by mean over all samples.
run_misty()
to train models and
generate results.
# Train and collect results for 3 samples in synthetic library(dplyr) library(purrr) data("synthetic") misty.results <- synthetic[seq_len(3)] %>% imap_chr(~ create_initial_view(.x %>% select(-c(row, col, type))) %>% add_paraview(.x %>% select(row, col), l = 10) %>% run_misty(paste0("results/", .y))) %>% collect_results() str(misty.results)
# Train and collect results for 3 samples in synthetic library(dplyr) library(purrr) data("synthetic") misty.results <- synthetic[seq_len(3)] %>% imap_chr(~ create_initial_view(.x %>% select(-c(row, col, type))) %>% add_paraview(.x %>% select(row, col), l = 10) %>% run_misty(paste0("results/", .y))) %>% collect_results() str(misty.results)
This function is the first one to be called when building a mistyR workflow, starting from view composition. The initial view describes the intraview of the sample.
create_initial_view(data, unique.id = NULL)
create_initial_view(data, unique.id = NULL)
data |
A |
unique.id |
A |
An initial mistyR view composition containing an intraview list
item named described with abbreviation "intra" and data as provided
in data
and a misty.uniqueid list item containing the
provided or automatically calculated unique.id
. A cache folder for
the sample will be automatically created in the working directory as a
subfolder of ‘.misty.temp/’ with the same name as unique.id
.
Other view composition functions:
add_juxtaview()
,
add_paraview()
,
add_views()
,
create_view()
,
remove_views()
# Create an intrinsic view from the first sample in the dataset synthetic. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) create_initial_view(expr)
# Create an intrinsic view from the first sample in the dataset synthetic. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) create_initial_view(expr)
Create a custom view from a data.frame
or a tibble
.
create_view(name, data, abbrev = name)
create_view(name, data, abbrev = name)
name |
Name of the view. A character vector. |
data |
A |
abbrev |
Abbreviated name. A character vector. |
Creating a custom view does not add it to the current view composition.
A new mistyR view. A list with a single name
d item described by
the provided abbrev
iation and data containing the provided
data
.
add_views()
for adding created views
to a view composition.
Other view composition functions:
add_juxtaview()
,
add_paraview()
,
add_views()
,
create_initial_view()
,
remove_views()
# Create a view from the mean expression of the 10 nearest neighbors of # each cell. library(dplyr) library(purrr) library(distances) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # find the 10 nearest neighbors neighbors <- nearest_neighbor_search(distances(as.matrix(pos)), k = 11)[-1, ] # calculate the mean expression of the nearest neighbors for all markers # for each cell in expr nnexpr <- seq_len(nrow(expr)) %>% map_dfr(~ expr %>% slice(neighbors[, .x]) %>% colMeans()) create_view("nearest", nnexpr, "nn")
# Create a view from the mean expression of the 10 nearest neighbors of # each cell. library(dplyr) library(purrr) library(distances) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # find the 10 nearest neighbors neighbors <- nearest_neighbor_search(distances(as.matrix(pos)), k = 11)[-1, ] # calculate the mean expression of the nearest neighbors for all markers # for each cell in expr nnexpr <- seq_len(nrow(expr)) %>% map_dfr(~ expr %>% slice(neighbors[, .x]) %>% colMeans()) create_view("nearest", nnexpr, "nn")
Signature is a representation of each sample in the space of mistyR results.
extract_signature( misty.results, type = c("performance", "contribution", "importance"), trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
extract_signature( misty.results, type = c("performance", "contribution", "importance"), trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
misty.results |
a results list generated by
|
type |
type of signature to extract from the results. |
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
trim.measure |
the measure used for trimming. |
The performance signature of each sample is a concatenation of the estimated
values of variance explained using only the intraview, the variance explained
by the multiview model and the gain in variance explained for each marker.
The performance signature vector for each sample available in
misty.results
is of length .
The contribution signature of each sample is a concatenation of the estimated
fraction of contribution of each view for each marker.
The contribution signature vector for each sample available in
misty.results
is of length
.
The importance signature of each sample is a concatenation of the estimated
and weighted importances for each predictor-target marker pair from all views.
The importance signature vector for each sample available in
misty.results
is of length
.
A table with one row per sample from misty.results
representing
its signature.
collect_results()
to generate a
results list from raw results.
library(dplyr) misty.results <- list.files("results", full.names = TRUE) %>% collect_results() extract_signature(misty.results, "performance")
library(dplyr) misty.results <- list.files("results", full.names = TRUE) %>% collect_results() extract_signature(misty.results, "performance")
Select, remove (or duplicate) rows from all views in a composition by their row locations or according to conditions based on a specific view.
filter_views(current.views, rows, view = "intraview", ...)
filter_views(current.views, rows, view = "intraview", ...)
current.views |
the current view composition. |
rows |
row (integer) location; positive values to keep (duplicate) and/or negative to remove. |
view |
the name of the view to be used for filtering. |
... |
logical expressions defined in terms of the variables in
|
The values in rows
have priority over the other parameters. If
rows
doesn't contain integer values then filtering
is performed based on the view specified in view
and expressions
(...
) returning logical values
defined in terms of the variables in view
.
A mistyR view composition with filtered spatial units from all views.
<data-masking
>.
Other view manipulation functions:
rename_view()
,
select_markers()
# Create a view composition with an intraview and filter library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # compose misty.views <- create_initial_view(expr) # select only the first 10 spatial units and preview misty.views %>% filter_views(1:10) %>% str() # select only the units where the expression of ligA is larger than 0.5 # and preview misty.views %>% filter_views(NA, "intraview", ligA > 0.5) %>% str()
# Create a view composition with an intraview and filter library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # compose misty.views <- create_initial_view(expr) # select only the first 10 spatial units and preview misty.views %>% filter_views(1:10) %>% str() # select only the units where the expression of ligA is larger than 0.5 # and preview misty.views %>% filter_views(NA, "intraview", ligA > 0.5) %>% str()
The heatmap shows the interactions that are present and have importance above
a cutoff
value in the to.view
but but not in the from.view
.
plot_contrast_heatmap( misty.results, from.view, to.view, cutoff = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
plot_contrast_heatmap( misty.results, from.view, to.view, cutoff = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
misty.results |
a results list generated by
|
from.view , to.view
|
abbreviated name of the view. |
cutoff |
importance threshold. Importances below this value will be colored white in the heatmap and considered as not relevant. |
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
trim.measure |
the measure used for trimming. |
The misty.results
list (invisibly).
collect_results()
to generate a
results list from raw results.
Other plotting functions:
plot_contrast_results()
,
plot_improvement_stats()
,
plot_interaction_communities()
,
plot_interaction_heatmap()
,
plot_view_contributions()
all.samples <- list.dirs("results", recursive = FALSE) misty.results <- collect_results(all.samples) misty.results %>% plot_contrast_heatmap("intra", "para.10") misty.results %>% plot_contrast_heatmap("intra", "para.10", cutoff = 0.5)
all.samples <- list.dirs("results", recursive = FALSE) misty.results <- collect_results(all.samples) misty.results %>% plot_contrast_heatmap("intra", "para.10") misty.results %>% plot_contrast_heatmap("intra", "para.10", cutoff = 0.5)
Plot interexperiment contrast of views.
plot_contrast_results( misty.results.from, misty.results.to, views = NULL, cutoff.from = 1, cutoff.to = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
plot_contrast_results( misty.results.from, misty.results.to, views = NULL, cutoff.from = 1, cutoff.to = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
misty.results.from , misty.results.to
|
a results list generated by
|
views |
one or more abbreviated names of views. |
cutoff.from , cutoff.to
|
importance thresholds respective to the result lists. |
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
trim.measure |
the measure used for trimming. |
The heatmaps show the interactions that are present and have importance above
a cutoff.to
value in the views
of misty.results.to
but
not present or have importance below cutoff.from
in the views
of misty.results.from
.
The misty.results.from
list (invisibly).
collect_results()
to generate a
results list from raw results.
Other plotting functions:
plot_contrast_heatmap()
,
plot_improvement_stats()
,
plot_interaction_communities()
,
plot_interaction_heatmap()
,
plot_view_contributions()
# if for example the available samples come from different grades of tumors grade1.results <- collect_results(c("results/synthetic1", "results/synthetic2")) grade3.results <- collect_results("results/synthetic10") # highlight interactions present in grade 1 tumors but not in grade 3 tumors # in the paraview grade3.results %>% plot_contrast_results(grade1.results, views = "para.10") # see the loss of interactions in all views with lower sensitivity plot_contrast_results(grade3.results, grade1.results, cutoff.from = 0.75, cutoff.to = 0.5)
# if for example the available samples come from different grades of tumors grade1.results <- collect_results(c("results/synthetic1", "results/synthetic2")) grade3.results <- collect_results("results/synthetic10") # highlight interactions present in grade 1 tumors but not in grade 3 tumors # in the paraview grade3.results %>% plot_contrast_results(grade1.results, views = "para.10") # see the loss of interactions in all views with lower sensitivity plot_contrast_results(grade3.results, grade1.results, cutoff.from = 0.75, cutoff.to = 0.5)
Generates a plot of the mean (+- standard deviation) of the performance value per target across all samples from the results.
plot_improvement_stats( misty.results, measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE"), trim = -Inf )
plot_improvement_stats( misty.results, measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE"), trim = -Inf )
misty.results |
a results list generated by
|
measure |
performance measure to be plotted (See
|
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
The misty.results
list (invisibly).
collect_results()
to generate a
results list from raw results.
Other plotting functions:
plot_contrast_heatmap()
,
plot_contrast_results()
,
plot_interaction_communities()
,
plot_interaction_heatmap()
,
plot_view_contributions()
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_improvement_stats() misty.results <- collect_results(all.samples) misty.results %>% plot_improvement_stats(measure = "gain.RMSE") misty.results %>% plot_improvement_stats(measure = "intra.R2")
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_improvement_stats() misty.results <- collect_results(all.samples) misty.results %>% plot_improvement_stats(measure = "gain.RMSE") misty.results %>% plot_improvement_stats(measure = "intra.R2")
Identify and plot a graph of marker interaction communities.
plot_interaction_communities(misty.results, view, cutoff = 1)
plot_interaction_communities(misty.results, view, cutoff = 1)
misty.results |
a results list generated by
|
view |
abbreviated name of the view. |
cutoff |
importance threshold. Importances below this value will be colored white in the heatmap and considered as not relevant. |
The communities are identified using the Louvain algorithm. Communities can be extracted only from views that have the same predictor and target markers.
The misty.results
list (invisibly).
collect_results()
to generate a
results list from raw results.
Other plotting functions:
plot_contrast_heatmap()
,
plot_contrast_results()
,
plot_improvement_stats()
,
plot_interaction_heatmap()
,
plot_view_contributions()
all.samples <- list.dirs("results", recursive = FALSE) misty.results <- collect_results(all.samples) misty.results %>% plot_interaction_communities("intra") %>% plot_interaction_communities("para.10") misty.results %>% plot_interaction_communities("para.10", cutoff = 0.5)
all.samples <- list.dirs("results", recursive = FALSE) misty.results <- collect_results(all.samples) misty.results %>% plot_interaction_communities("intra") %>% plot_interaction_communities("para.10") misty.results %>% plot_interaction_communities("para.10", cutoff = 0.5)
Generate a heatmap with importances of predictor-target interaction.
plot_interaction_heatmap( misty.results, view, cutoff = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE"), clean = FALSE )
plot_interaction_heatmap( misty.results, view, cutoff = 1, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE"), clean = FALSE )
misty.results |
a results list generated by
|
view |
abbreviated name of the view. |
cutoff |
importance threshold. Importances below this value will be colored white in the heatmap and considered as not relevant. |
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
trim.measure |
the measure used for trimming. |
clean |
a |
The misty.results
list (invisibly).
collect_results()
to generate
a results list from raw results.
Other plotting functions:
plot_contrast_heatmap()
,
plot_contrast_results()
,
plot_improvement_stats()
,
plot_interaction_communities()
,
plot_view_contributions()
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_interaction_heatmap("intra") %>% plot_interaction_heatmap("para.10", cutoff = 0.5)
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_interaction_heatmap("intra") %>% plot_interaction_heatmap("para.10", cutoff = 0.5)
Generate a stacked barplot of the average view contribution fraction per target across all samples from the results.
plot_view_contributions( misty.results, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
plot_view_contributions( misty.results, trim = -Inf, trim.measure = c("gain.R2", "multi.R2", "intra.R2", "gain.RMSE", "multi.RMSE", "intra.RMSE") )
misty.results |
a results list generated by
|
trim |
display targets with performance value above (if R2 or gain) or below (otherwise) this value only. |
trim.measure |
the measure used for trimming. |
The misty.results
list (invisibly).
collect_results()
to generate a
results list from raw results.
Other plotting functions:
plot_contrast_heatmap()
,
plot_contrast_results()
,
plot_improvement_stats()
,
plot_interaction_communities()
,
plot_interaction_heatmap()
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_view_contributions()
all.samples <- list.dirs("results", recursive = FALSE) collect_results(all.samples) %>% plot_view_contributions()
Remove one or more views from the view composition.
remove_views(current.views, view.names)
remove_views(current.views, view.names)
current.views |
the current view composition. |
view.names |
the names of one or more views to be removed. |
The intraview and the unique id cannot be removed with this function.
A mistyR view composition with view.names
views removed.
Other view composition functions:
add_juxtaview()
,
add_paraview()
,
add_views()
,
create_initial_view()
,
create_view()
library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_juxtaview(pos, neighbor.thr = 1.5) %>% add_paraview(pos, l = 10) # preview str(misty.views) # remove juxtaview and preview misty.views %>% remove_views("juxtaview.1.5") %>% str() # remove juxtaview and paraview and preview misty.views %>% remove_views(c("juxtaview.1.5", "paraview.10")) %>% str()
library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_juxtaview(pos, neighbor.thr = 1.5) %>% add_paraview(pos, l = 10) # preview str(misty.views) # remove juxtaview and preview misty.views %>% remove_views("juxtaview.1.5") %>% str() # remove juxtaview and paraview and preview misty.views %>% remove_views(c("juxtaview.1.5", "paraview.10")) %>% str()
Rename view in a view composition
rename_view(current.views, old.name, new.name, new.abbrev = new.name)
rename_view(current.views, old.name, new.name, new.abbrev = new.name)
current.views |
the current view composition. |
old.name |
old name of the view. |
new.name |
new name of the view. |
new.abbrev |
new abbreviated name. |
A mistyR view composition with a renamed view.
Other view manipulation functions:
filter_views()
,
select_markers()
view1 <- data.frame(marker1 = rnorm(100, 10, 2), marker2 = rnorm(100, 15, 3)) view2 <- data.frame(marker1 = rnorm(100, 10, 5), marker2 = rnorm(100, 15, 5)) misty.views <- create_initial_view(view1) %>% add_views(create_view("originalname", view2, "on")) str(misty.views) # rename and preview misty.views %>% rename_view("originalname", "renamed", "rn") %>% str()
view1 <- data.frame(marker1 = rnorm(100, 10, 2), marker2 = rnorm(100, 15, 3)) view2 <- data.frame(marker1 = rnorm(100, 10, 5), marker2 = rnorm(100, 15, 5)) misty.views <- create_initial_view(view1) %>% add_views(create_view("originalname", view2, "on")) str(misty.views) # rename and preview misty.views %>% rename_view("originalname", "renamed", "rn") %>% str()
Trains multi-view models for all target markers, estimates the performance, the contributions of the view specific models and the importance of predictor markers for each target marker.
run_misty( views, results.folder = "results", seed = 42, target.subset = NULL, bypass.intra = FALSE, cv.folds = 10, cached = FALSE, append = FALSE, model.function = random_forest_model, ... )
run_misty( views, results.folder = "results", seed = 42, target.subset = NULL, bypass.intra = FALSE, cv.folds = 10, cached = FALSE, append = FALSE, model.function = random_forest_model, ... )
views |
view composition. |
results.folder |
path to the top level folder to store raw results. |
seed |
seed used for random sampling to ensure reproducibility. |
target.subset |
subset of targets to train models for. If |
bypass.intra |
a |
cv.folds |
number of cross-validation folds to consider for estimating the performance of the multi-view models |
cached |
a |
append |
a |
model.function |
a function which is used to model each view, default
model is |
... |
all additional parameters are passed to the chosen ML model for training the view-specific models |
If bypass.intra
is set to TRUE
all variable in the intraview
the intraview data will be treated as targets only. The baseline intraview
model in this case is a trivial model that predicts the average of each
target. If the intraview has only one variable this switch is automatically
set to TRUE
.
Default model to train the view-specific views is a Random Forest model
based on ranger()
–
run_misty(views, model.function = random_forest_model)
The following parameters are the default
configuration: num.trees = 100
, importance = "impurity"
,
num.threads = 1
, seed = seed
.
Gradient boosting is an alternative to model each view using gradient
boosting. The algorithm is based on xgb.train()
–
run_misty(views, model.function = gradient_boosting_model)
The following parameters are the default configuration: booster = "gbtree"
,
rounds = 10
, objective = "reg:squarederror"
. Set booster
to "gblinear"
for linear boosting.
Bagged MARS is an alternative to model each view using bagged MARS,
(multivariate adaptive spline regression models) trained with
bootstrap aggregation samples. The algorithm is based on
earth()
–
run_misty(views, model.function = bagged_mars_model)
The following parameters are the default configuration: degree = 2
.
Furthermore 50 base learners are used by default (pass n.bags
as
parameter via ...
to change this value).
MARS is an alternative to model each view using
multivariate adaptive spline regression model. The algorithm is based on
earth()
–
run_misty(views, model.function = mars_model)
The following parameters are the default configuration: degree = 2
.
Linear model is an alternative to model each view using a simple linear
model. The algorithm is based on lm()
–
run_misty(views, model.function = linear_model)
SVM is an alternative to model each view using a support vector
machines. The algorithm is based on ksvm()
–
run_misty(views, model.function = svm_model)
The following parameters are the default configuration: kernel = "vanilladot"
(linear kernel), C = 1
, type = "eps-svr"
.
MLP is an alternative to model each view using a multi-layer perceptron.
The alogorithm is based on mlp()
–
run_misty(views, model.function = mlp_model)
The following parameters are the default configuration: size = c(10)
(meaning we have 1 hidden layer with 10 units).
Path to the results folder that can be passed to
collect_results()
.
create_initial_view()
for
starting a view composition.
# Create a view composition of an intraview and a paraview with radius 10 then # run MISTy for a single sample. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10) # run with default parameters run_misty(misty.views)
# Create a view composition of an intraview and a paraview with radius 10 then # run MISTy for a single sample. library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # get the coordinates for each cell pos <- synthetic[[1]] %>% select(row, col) # compose misty.views <- create_initial_view(expr) %>% add_paraview(pos, l = 10) # run with default parameters run_misty(misty.views)
Select a subset of markers in a view
select_markers(current.views, view = "intraview", ...)
select_markers(current.views, view = "intraview", ...)
current.views |
the current view composition. |
view |
the name of the view to select markers for. |
... |
one or more select expressions
|
A mistyR view composition with selected markers in view
.
<tidy-select
>.
Other view manipulation functions:
filter_views()
,
rename_view()
# Create a view composition with an intraview and select library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # compose misty.views <- create_initial_view(expr) # select markers from the intraview not starting with lig and preview misty.views %>% select_markers("intraview", !starts_with("lig")) %>% str()
# Create a view composition with an intraview and select library(dplyr) # get the expression data data("synthetic") expr <- synthetic[[1]] %>% select(-c(row, col, type)) # compose misty.views <- create_initial_view(expr) # select markers from the intraview not starting with lig and preview misty.views %>% select_markers("intraview", !starts_with("lig")) %>% str()
Data generated from 10 random layouts of four cell types and empty space on 100-by-100 grid by simulating a two-dimensional cellular automata model that focuses on signaling events. Cell growth, division, motility and death are neglected. The intracellular processes involve two layers, first the ligand activation of signaling hubs and ligand production/secretion regulated by proteins. The model simulates the production, diffusion, degradation and interactions of 11 molecular species. Ligands are produced in each cell-type based on the activity level of their production nodes and then freely diffuse, degrade or interact with other cells on the grid. Other molecular species involved in signaling are localised in the intracellular space and their activity depends on ligand binding and intracellular wiring.
data("synthetic")
data("synthetic")
A named list
of length 10. Each list item is a tibble
that
corresponds to a simulation of one random layout with information about each
cell in rows described by the following 14 variables:
location of the cell on the grid
expression of ligands
expression of intracellular proteins
expression of regulatory proteins
cell type id
https://github.com/saezlab/misty_pipelines/