| Title: | Phenotypes Identification Using Mapper from topological data Analysis |
|---|---|
| Description: | The PIUMA package offers a tidy pipeline of Topological Data Analysis frameworks to identify and characterize communities in high-dimensional, heterogeneous data. |
| Authors: | Mattia Chiesa [aut, cre] (ORCID: <https://orcid.org/0000-0001-7427-9954>), Arianna Dagliati [aut] (ORCID: <https://orcid.org/0000-0002-5041-0409>), Alessia Gerbasi [aut] (ORCID: <https://orcid.org/0000-0003-4501-1777>), Giuseppe Albi [aut], Laura Ballarini [aut], Luca Piacentini [aut] (ORCID: <https://orcid.org/0000-0003-1022-4481>), Carlo Leonardi [aut] (ORCID: <https://orcid.org/0000-0001-5348-8300>) |
| Maintainer: | Mattia Chiesa <[email protected]> |
| License: | GPL-3 + file LICENSE |
| Version: | 1.9.0 |
| Built: | 2026-06-04 07:08:23 UTC |
| Source: | https://github.com/bioc/PIUMA |
Cluster Mapper nodes (x@graph$igraph) with a chosen community algorithm or an automatic selection based on predicted graph geometry (x@graph$predicted). Assign observations either by k-NN tie-breaking (default) or by pure topological label concatenation.
autoClusterMapper(
  x,
  method = c("automatic", "fast_greedy", "walktrap", "edge_betweenness", "optimal", "label_propagation"),
  k = 5L
)
x |
A |
method |
One of |
k |
Integer >=1 or |
In method = "automatic", the algorithm is chosen from the predicted geometry:
SF / CM: Use fast greedy modularity optimization.
WS: Use Walktrap (short random walks).
RGG: Use edge betweenness (bridge detection).
SBM: Prefer optimal (exact modularity) for small graphs; falls back for larger ones.
ER: Use label propagation (fast, parameter-free).
Isolated nodes (degree = 0) become singletons with unique labels.
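The geometry-to-method mapping listed above can be sketched as a plain lookup. This is only an illustration in base R: `pick_method` is a hypothetical helper, not a PIUMA function, and it ignores the size-dependent fallback mentioned for SBM.

```r
# Hypothetical helper (not part of PIUMA): map a predicted geometry label
# to the community-detection method named in the list above.
pick_method <- function(geometry) {
  switch(geometry,
    SF  = ,                     # falls through to the CM entry
    CM  = "fast_greedy",
    WS  = "walktrap",
    RGG = "edge_betweenness",
    SBM = "optimal",            # the real function falls back for large graphs
    ER  = "label_propagation",
    stop("unknown geometry label: ", geometry)
  )
}

vapply(c("SF", "WS", "ER"), pick_method, character(1))
```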
The input TDAobj, returned invisibly, with x@clustering updated:
nodes_cluster: Data frame with columns node, obs, cluster.
obs_cluster: Data frame with columns obs, cluster.
Carlo Leonardi, Mattia Chiesa
data(vascEC_norm)
data(vascEC_meta)
#df_TDA <- cbind(vascEC_meta, vascEC_norm)
#df_TDA <- makeTDAobj(df_TDA,outcomes = c("stage","zone"))
#df_TDA <- dfToDistance(df_TDA,'euclidean')
#df_TDA <- dfToProjection(df_TDA, "UMAP", nComp = 2)
#df_TDA <- mapperCore(df_TDA,
#                     nBins = 20, overlap = 0.3,
#                     mClustNode = 2, clustMeth = "kmeans")
#df_TDA <- jaccardMatrix(df_TDA)
#df_TDA <- setGraph(df_TDA)
#df_TDA <- predict_mapper_class(df_TDA)
#df_TDA <- autoClusterMapper(df_TDA,method = 'walktrap')
This function computes the average of the entropies for each node of a network.
checkNetEntropy(outcome_vect)
outcome_vect |
A vector containing the average outcome values for each node of a network. |
The average of the entropies is related to the amount of information stored in the network.
The network entropy, computed over all nodes of the network.
Mattia Chiesa, Laura Ballarini, Luca Piacentini
makeTDAobj,
dfToDistance,
dfToProjection,
mapperCore,
jaccardMatrix,
tdaDfEnrichment
# use example data:
set.seed(1)
entropy <- checkNetEntropy(round(runif(10), 0))
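To make "average of the entropies" concrete, here is a minimal base R sketch. It assumes each node's average outcome lies in [0, 1] and can be read as a Bernoulli probability, so that a binary Shannon entropy applies; this is one plausible reading, not necessarily the formula used inside checkNetEntropy. Both function names are hypothetical.

```r
# Binary Shannon entropy (in bits) of a probability p in [0, 1];
# by convention 0 * log2(0) is treated as 0.
binary_entropy <- function(p) {
  h <- -(p * log2(p) + (1 - p) * log2(1 - p))
  h[p == 0 | p == 1] <- 0
  h
}

# Hypothetical stand-in for the averaging step: mean entropy over nodes.
mean_node_entropy <- function(outcome_vect) {
  mean(binary_entropy(outcome_vect))
}

# Two "pure" nodes (0 and 1) and two maximally mixed nodes (0.5).
mean_node_entropy(c(0, 1, 0.5, 0.5))
```

Under this reading, nodes whose members all share the same outcome contribute zero entropy, while evenly mixed nodes contribute the maximum of one bit.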
This function assesses how well the network fits a scale-free model.
checkScaleFreeModel(x, showPlot = FALSE)
x |
A TDAobj object, processed by the |
showPlot |
Whether the plot has to be generated. Default: FALSE |
Scale-free networks show a strong negative correlation between k and p(k).
A list containing:
connectivity of the resulting graph
the estimated gamma value
the correlation between degree \(k\) and its distribution \(p(k)\).
The p-value of the correlation between the k and the degree distribution p(k).
The correlation between the logarithm (base 10) of k and the logarithm (base 10) of the degree distribution p(k).
The p-value of the correlation between the logarithm (base 10) of k and the logarithm (base 10) of the degree distribution p(k).
A composite score reflecting how strongly power‐law behavior coexists with graph cohesion, computed as the absolute product between cor(P(k)*k) and connectivity
Mattia Chiesa, Laura Ballarini, Luca Piacentini, Carlo Leonardi
makeTDAobj,
dfToDistance,
dfToProjection,
mapperCore,
jaccardMatrix
## use example data:
data(tda_test_data)
netModel <- checkScaleFreeModel(tda_test_data)
print(netModel)
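The quantities in the returned list can be illustrated on a toy degree sequence. The sketch below uses base R only and assumed data; it shows the general idea of correlating k with p(k) on raw and log10 scales and reading gamma off a log-log fit, and it is not the code used by checkScaleFreeModel.

```r
# Toy degree sequence (assumed data, not from a PIUMA graph).
deg <- c(rep(1, 16), rep(2, 8), rep(4, 4), rep(8, 2))

# Empirical degree distribution p(k).
tab <- table(deg)
k   <- as.numeric(names(tab))
pk  <- as.numeric(tab) / length(deg)

# Correlation on the raw and on the log10-log10 scale.
cor_k_pk       <- cor(k, pk)
cor_logk_logpk <- cor(log10(k), log10(pk))

# Slope of the log-log fit; gamma is minus the slope for p(k) ~ k^(-gamma).
fit   <- lm(log10(pk) ~ log10(k))
gamma <- -unname(coef(fit)[2])
```

Here p(k) halves every time k doubles, so the log-log relation is exactly linear and the fitted exponent is 1.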
A dataset to test the dfToProjection and dfToDistance functions of the PIUMA package.
data(df_test_proj)
A data frame with 15 rows (cells) and 15 columns (genes).
This function returns the distance matrix computed by using the Pearson's, Euclidean or Gower distance methods. The distances are computed between the rows of a data.frame in the classical form n x m, where n (rows) are observations and m (columns) are features.
dfToDistance(x, distMethod = c("euclidean", "gower", "pearson"))
x |
A TDAobj object, generated by makeTDAobj. Rows (n) and columns (m) should be, respectively, observations and features. |
distMethod |
The distance method used to calculate the distance matrix. "euclidean", "gower" and "pearson" values are allowed. Default: "euclidean". |
The starting TDAobj object, in which the computed distance matrix has been added (slot: 'dist_mat')
Mattia Chiesa, Laura Ballarini, Luca Piacentini
data(vascEC_norm)
data(vascEC_meta)
df_TDA <- cbind(vascEC_meta, vascEC_norm)
df_TDA <- makeTDAobj(df_TDA,outcomes = c("stage","zone"))
df_TDA <- dfToDistance(df_TDA,'euclidean')
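For intuition about row-wise distances, the following base R sketch computes a Euclidean distance and a correlation-based distance on a toy matrix. The form 1 - Pearson correlation is an assumption about how a "pearson" distance is commonly defined; the exact definition used by dfToDistance is not stated here.

```r
# Toy data: 4 observations (rows) x 3 features (columns).
m <- rbind(
  a = c(1, 2, 3),
  b = c(2, 4, 6),
  c = c(3, 2, 1),
  d = c(1, 1, 2)
)

# Euclidean distance between rows.
d_euclid <- as.matrix(dist(m, method = "euclidean"))

# Correlation-based distance between rows: 1 - Pearson correlation.
# cor() works on columns, so the matrix is transposed first.
d_pearson <- 1 - cor(t(m), method = "pearson")
```

Rows a and b are proportional, so their correlation distance is 0 even though their Euclidean distance is not; rows a and c are perfectly anti-correlated and sit at the maximum distance of 2.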
This function transforms data from a high-dimensional space into a low-dimensional space, wrapping six well-known dimensionality reduction methods: PCA, KPCA, t-SNE, UMAP, MDS, and Isomap. In topological data analysis, the identified components are commonly used as lenses.
dfToProjection(
  x,
  method = c("PCA", "UMAP", "TSNE", "MDS", "KPCA", "ISOMAP"),
  nComp = 2,
  centerPCA = FALSE,
  scalePCA = FALSE,
  umapNNeigh = 15,
  umapMinDist = 0.1,
  tsnePerpl = 30,
  tsneMaxIter = 300,
  kpcaKernel = c("rbfdot", "laplacedot", "polydot", "tanhdot", "besseldot", "anovadot", "vanilladot", "splinedot"),
  kpcaSigma = 0.1,
  kpcaDegree = 1,
  isomNNeigh = 5,
  showPlot = FALSE,
  vectColor = NULL
)
x |
A TDAobj object, generated by makeTDAobj |
method |
Name of the dimensionality reduction method to use. "PCA", "UMAP", "TSNE", "MDS", "KPCA" and "ISOMAP" values are allowed. Default: "PCA". |
nComp |
The number of components to be computed. Default: 2 |
centerPCA |
Whether the data should be centered before PCA. Default: FALSE |
scalePCA |
Whether the data should be scaled before PCA. Default: FALSE |
umapNNeigh |
The number of neighbors for UMAP. Default: 15 |
umapMinDist |
The minimum distance between points for UMAP. Default: 0.1 |
tsnePerpl |
Perplexity argument of t-SNE. Default: 30 |
tsneMaxIter |
The maximum number of iterations for t-SNE. Default: 300 |
kpcaKernel |
The type of kernel for kPCA. "rbfdot", "laplacedot", "polydot", "tanhdot", "besseldot", "anovadot", "vanilladot" and "splinedot" are allowed. Default: "rbfdot". |
kpcaSigma |
The 'sigma' argument for kPCA. Default: 0.1. |
kpcaDegree |
The 'degree' argument for kPCA. Default: 1. |
isomNNeigh |
The number of neighbors for Isomap. Default: 5. |
showPlot |
Whether the scatter plot of the first two principal components should be shown. Default: FALSE. |
vectColor |
Vector containing the variable used to color the scatter plot. Default: NULL. |
The starting TDAobj object, in which the principal components of projected data have been added (slot:'comp')
Mattia Chiesa, Laura Ballarini, Luca Piacentini
data(vascEC_norm)
data(vascEC_meta)
df_TDA <- cbind(vascEC_meta, vascEC_norm)
df_TDA <- makeTDAobj(df_TDA,outcomes = c("stage","zone"))
df_TDA <- dfToProjection(df_TDA,'PCA',nComp=2)
The method to get clusters from the clustering slot
getClusters(x)

## S4 method for signature 'TDAobj'
getClusters(x)
x |
a |
a data.frame
Carlo Leonardi
data(tda_test_data)
The method to get data from the comp slot
getComp(x)

## S4 method for signature 'TDAobj'
getComp(x)
x |
a |
a data.frame with the comp data
Mattia Chiesa
data(tda_test_data)
The method to get data from the dfMapper slot
getDfMapper(x)

## S4 method for signature 'TDAobj'
getDfMapper(x)
x |
a |
a data.frame with the dfMapper data
Mattia Chiesa
data(tda_test_data)
ex_out <- getDfMapper(tda_test_data)
The method to get data from the dist_mat slot
getDistMat(x)

## S4 method for signature 'TDAobj'
getDistMat(x)
x |
a |
a data.frame with the dist_mat data
Mattia Chiesa
data(tda_test_data)
ex_out <- getDistMat(tda_test_data)
The method to get igraph object from the graph slot
getGraph(x)

## S4 method for signature 'TDAobj'
getGraph(x)
x |
a |
an igraph object
Carlo Leonardi
data(tda_test_data)
The method to get data from the jacc slot
getJacc(x)

## S4 method for signature 'TDAobj'
getJacc(x)
x |
a |
a matrix with the jacc data
Mattia Chiesa
data(tda_test_data)
ex_out <- getJacc(tda_test_data)
The method to get metrics from the graph slot
getMetrics(x)

## S4 method for signature 'TDAobj'
getMetrics(x)
x |
a |
a vector
Carlo Leonardi, Mattia Chiesa
data(tda_test_data)
The method to get data from the node_data_mat slot
getNodeDataMat(x)

## S4 method for signature 'TDAobj'
getNodeDataMat(x)
x |
a |
a data.frame with the node_data_mat data
Mattia Chiesa
data(tda_test_data)
ex_out <- getNodeDataMat(tda_test_data)
The method to get data from the orig_data slot
getOrigData(x)

## S4 method for signature 'TDAobj'
getOrigData(x)
x |
a |
a data.frame with the original data
Mattia Chiesa
data(tda_test_data)
ex_out <- getOrigData(tda_test_data)
The method to get data from the outcome slot
getOutcome(x)

## S4 method for signature 'TDAobj'
getOutcome(x)
x |
a |
a data.frame with the outcome data
Mattia Chiesa
data(tda_test_data)
ex_out <- getOutcome(tda_test_data)
The method to get data from the outcomeFact slot
getOutcomeFact(x)

## S4 method for signature 'TDAobj'
getOutcomeFact(x)
x |
a |
a data.frame with the outcomeFact data
Mattia Chiesa
data(tda_test_data)
ex_out <- getOutcomeFact(tda_test_data)
The method to get data from the scaled_data slot
getScaledData(x)

## S4 method for signature 'TDAobj'
getScaledData(x)
x |
a |
a data.frame with the scaled data
Mattia Chiesa
data(tda_test_data)
ex_out <- getScaledData(tda_test_data)
This function computes the Jaccard index for each pair of
nodes contained in TDAobj, generated by the mapperCore
function. The resulting data.frame can be used to represent data as a
network, for instance, in Cytoscape.
jaccardMatrix(x)
x |
A TDAobj object, processed by the |
The Jaccard index measures the similarity of two nodes A and B and ranges from 0 to 1. If A and B share no members, their Jaccard index is 0 (reported as NA). If A and B share all members, their Jaccard index is 1. Hence, the higher the index, the more similar the two nodes. A Jaccard index other than NA between A and B means that an edge exists between A and B. The output matrix of Jaccard indexes can be used as an adjacency matrix. The resulting data.frame can be used to represent data as a network, for instance, in Cytoscape.
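The definition above can be sketched in a few lines of base R. The node contents and the helper `jaccard` are hypothetical; the snippet only illustrates the index and the convention that pairs with no shared members carry NA (no edge), and it is not the implementation of jaccardMatrix.

```r
# Jaccard index of two sets of observation identifiers.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical nodes, each a set of observation IDs.
nodes <- list(
  node_1 = c("obs1", "obs2", "obs3"),
  node_2 = c("obs2", "obs3", "obs4"),
  node_3 = c("obs5")
)

# Pairwise matrix; pairs with no shared members are set to NA (no edge).
jm <- outer(seq_along(nodes), seq_along(nodes),
            Vectorize(function(i, j) jaccard(nodes[[i]], nodes[[j]])))
dimnames(jm) <- list(names(nodes), names(nodes))
jm[jm == 0] <- NA
```

In this toy case node_1 and node_2 share two of four distinct observations, so they are linked with weight 0.5, while node_3 stays disconnected from both.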
The starting TDAobj object, in which the matrix of Jaccard indexes, calculated comparing each node of the 'dfMapper' slot, has been added (slot: 'jacc')
Mattia Chiesa, Laura Ballarini, Luca Piacentini
makeTDAobj,
dfToDistance,
dfToProjection,
mapperCore
## use example data:
data(tda_test_data)
jacc_mat <- jaccardMatrix(tda_test_data)
This function imports a data.frame and creates the object that stores all data needed for TDA analysis. Some preliminary preprocessing steps are also performed: the outcome variables are separated from the rest of the dataset, and the remaining data are rescaled to the 0-1 range.
makeTDAobj(df, outcomes)
df |
A data.frame representing a dataset in the classical n x m form. Rows (n) and columns (m) should be, respectively, observations and features. |
outcomes |
A string or vector of strings containing the names of the variables to be treated as 'outcomes' |
A TDA object containing:
orig_data A data.frame of original data (without outcomes)
scaled_data A data.frame of re-scaled data (without outcomes)
outcomeFact A data.frame of original outcomes
outcome A data.frame of original outcomes converted as numeric
comp A data.frame containing the components of projected data
dist_mat A data.frame containing the computed distance matrix
dfMapper A data.frame containing the nodes, with their elements, identified by TDA
jacc A matrix of Jaccard indexes between each pair of dfMapper nodes
node_data_mat A data.frame with the node size and the average value
graph A list containing the igraph object derived from Jaccard matrix and intermediary objects
clustering A list containing two data.frames indicating the clustering per node and per rows
Mattia Chiesa, Laura Ballarini, Luca Piacentini, Carlo Leonardi
## use example data:
data("vascEC_meta")
data("vascEC_norm")
df <- cbind(vascEC_meta, vascEC_norm)
res <- makeTDAobj(df, "zone")
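The two preprocessing steps described for makeTDAobj (separating outcomes, rescaling the rest to 0-1) can be sketched in base R as follows. The toy data and the helper `rescale01` are hypothetical, and per-column min-max scaling is an assumption; the package may rescale differently.

```r
# Toy data frame: one outcome column plus two features (assumed data).
df <- data.frame(
  zone = c("LV", "RV", "LV"),
  g1   = c(2, 4, 6),
  g2   = c(10, 10, 30)
)
outcomes <- "zone"

# Separate outcomes from features.
outcome_df <- df[, outcomes, drop = FALSE]
feature_df <- df[, setdiff(names(df), outcomes), drop = FALSE]

# Min-max rescaling of each feature to the [0, 1] range.
# Note: a constant column would produce NaN (division by zero).
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
scaled_df <- as.data.frame(lapply(feature_df, rescale01))
```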
This function imports a SummarizedExperiment object and creates
the object that stores all data needed for TDA analysis. Some
preliminary preprocessing steps are also performed: the outcome
variables are separated from the rest of the dataset, and the
remaining data are rescaled to the 0-1 range.
makeTDAobjFromSE(SE, outcomes)
SE |
A |
outcomes |
A string or vector of strings containing the names of the variables to be treated as 'outcomes' |
A TDA object containing:
orig_data A data.frame of original data (without outcomes)
scaled_data A data.frame of re-scaled data (without outcomes)
outcomeFact A data.frame of original outcomes
outcome A data.frame of original outcomes converted as numeric
comp A data.frame containing the components of projected data
dist_mat A data.frame containing the computed distance matrix
dfMapper A data.frame containing the nodes, with their elements, identified by TDA
jacc A matrix of Jaccard indexes between each pair of dfMapper nodes
node_data_mat A data.frame with the node size and the average value
graph A list containing the igraph object derived from Jaccard matrix and intermediary objects
clustering A list containing two data.frames indicating the clustering per node and per rows
Mattia Chiesa, Laura Ballarini, Luca Piacentini, Carlo Leonardi
## use example data:
data("vascEC_meta")
data("vascEC_norm")
suppressMessages(library(SummarizedExperiment))
dataSE <- SummarizedExperiment(
  assays = as.matrix(t(vascEC_norm)),
  colData = as.data.frame(vascEC_meta)
)
res <- makeTDAobjFromSE(dataSE, "zone")
This is a comprehensive function that performs the core
TDA Mapper algorithm with 2D lenses. It supports several
clustering methods. There is no restriction on nBins
and mClustNode, so the user can tune them in a parameter search.
mapperCore(
  x,
  nBins = 15,
  overlap = 0.4,
  mClustNode = 2,
  remEmptyNode = TRUE,
  clustMeth = c("kmeans", "HR", "DBSCAN", "OPTICS"),
  HRMethod = c("average", "complete")
)
x |
A TDAobj object, processed by the |
nBins |
The number of bins (i.e. the resolution of the cover). Default: 15. |
overlap |
The overlap between bins (i.e., the gain of the cover). Default: 0.4. |
mClustNode |
The number of clusters in each overlapping bin. Default: 2 |
remEmptyNode |
A logical value to remove or not the empty nodes from the resulting data.frame. Default: TRUE. |
clustMeth |
The clustering algorithm. "HR", "kmeans", "DBSCAN", and "OPTICS" are allowed. Default: "kmeans". |
HRMethod |
The name of the linkage criterion (when clustMeth="HR"). "average" and "complete" values are allowed. Default: "average". |
The starting TDAobj object, in which the result of mapper algorithm (inferred nodes with their elements) has been added (slot: 'dfMapper')
A data.frame containing the clusters, with their elements, identified by TDA.
Mattia Chiesa, Laura Ballarini, Luca Piacentini, Carlo Leonardi
makeTDAobj,
dfToDistance,
dfToProjection
# use example data:
data(vascEC_norm)
data(vascEC_meta)
df_TDA <- cbind(vascEC_meta, vascEC_norm)
df_TDA <- makeTDAobj(df_TDA,outcomes = c("stage","zone"))
df_TDA <- dfToDistance(df_TDA,'euclidean')
df_TDA <- dfToProjection(df_TDA, "PCA", nComp = 2)
df_TDA <- mapperCore(df_TDA, nBins = 5, overlap = 0.5, mClustNode = 2, clustMeth = "kmeans")
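To show how nBins and overlap shape a Mapper cover, the sketch below builds overlapping intervals over a single lens in base R. `make_cover` is a hypothetical helper using one common construction (equal-width intervals whose neighbours share a fixed fraction of their width); the formula mapperCore actually uses may differ, and with 2D lenses the same idea would be applied to each lens and the intervals combined.

```r
# Build nBins overlapping intervals that cover the range of a 1D lens.
# 'overlap' is the fraction of each interval shared with its neighbour.
make_cover <- function(lens, nBins = 15, overlap = 0.4) {
  rng   <- range(lens)
  width <- diff(rng) / (nBins - (nBins - 1) * overlap)
  step  <- width * (1 - overlap)
  lower <- rng[1] + (seq_len(nBins) - 1) * step
  data.frame(bin = seq_len(nBins), lower = lower, upper = lower + width)
}

cover <- make_cover(seq(0, 10, by = 0.5), nBins = 4, overlap = 0.5)
```

Observations falling in the shared part of two intervals are clustered in both bins, which is what later lets two nodes share members and be linked by an edge.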
Unsupervised learning methodologies can help identify specific phenotypes in large heterogeneous cohorts, such as clinical or -omics data. Among them, Topological Data Analysis (TDA) is a rapidly growing field that combines concepts from algebraic topology and computational geometry to analyze and extract meaningful information from complex and high-dimensional data sets. Moreover, TDA is a robust and effective methodology, able to preserve the intrinsic characteristics of data and the mutual relations among observations, depicting complex data in a graph-based representation. Indeed, by building topological models as networks, TDA allows complex diseases to be inspected in a continuous space, where subjects can fluctuate over the graph and belong, at the same time, to more than one adjacent node of the network. Overall, TDA offers a powerful set of tools to capture the underlying topological features of data, revealing essential patterns and relationships that might be hidden from traditional statistical techniques. The PIUMA package (Phenotypes Identification Using Mapper from topological data Analysis) implements all the main steps of a Topological Data Analysis. PIUMA is the Italian word for 'feather'.
See the package vignette by typing vignette("PIUMA") to discover
all the functions.
Mattia Chiesa, Laura Ballarini, Luca Piacentini
Useful links:
Report bugs at https://github.com/BioinfoMonzino/PIUMA/issues
Infer a geometry label for x@graph$igraph using fast heuristics.
Writes only x@graph$predicted$class (one of
c("SF","RGG","WS","ER","SBM","CM")).
Heuristics (hierarchical decision):
SF (relaxed): rely on checkScaleFreeModel(x). Declare
scale-free if at least one of the following holds:
|corlogklogpk| >= 0.55, |corkpk| >= 0.70,
1.6 <= gamma <= 3.6, or Connectivity >= 0.40;
alternatively accept SF if the product score
|corlogklogpk| * Connectivity >= 0.2.
WS: small-world index sigma > 1.2 with
C/C_ER >= 3 and L/L_ER <= 1.2.
RGG: very high clustering vs ER (C/C_ER >= 5),
longer paths (L/L_ER >= 1.3), and positive degree
assortativity (r >= 0.10).
ER: Poisson-like degree dispersion
VMR = var(k)/mean(k) ~ 1 (within 30%),
|C - p| <= 0.05, |r| <= 0.05, and
0.8 <= sigma <= 1.2.
SBM: strong modular structure, Q >= 0.40 with
>= 3 communities.
CM: heterogeneous degrees (var(k)/mean(k) > 2) with
clustering close to ER (|C - C_ER| <= 0.05); otherwise use a
sigma-based fallback (WS if sigma > 1.2, else ER).
The function sets only x@graph$predicted. It is intentionally
lightweight for fast computation.
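One ingredient of the heuristics above, the variance-to-mean ratio (VMR) of the degree sequence, is easy to illustrate in base R. The helpers `vmr` and `is_poisson_like` and the simulated degree vectors are hypothetical; the sketch covers only the "VMR within 30% of 1" and "var(k)/mean(k) > 2" checks, not the full decision hierarchy.

```r
# Variance-to-mean ratio of a degree sequence.
vmr <- function(k) var(k) / mean(k)

# TRUE when the VMR lies within 30% of 1 (i.e. between 0.7 and 1.3).
is_poisson_like <- function(k, tol = 0.30) abs(vmr(k) - 1) <= tol

set.seed(1)
k_er <- rpois(500, lambda = 4)       # Poisson degrees, as in an ER-like graph
k_sf <- c(rep(1, 450), rep(50, 50))  # a few hubs: strongly over-dispersed

is_poisson_like(k_er)
is_poisson_like(k_sf)
```

A Poisson-like sequence keeps its VMR near 1, whereas a hub-dominated sequence pushes it far above 2, which is the kind of degree heterogeneity the CM rule looks for.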
predict_mapper_class(x, verbose = FALSE)
x |
A |
verbose |
Logical; print the chosen label. Default |
The input TDAobj with x@graph$predicted set.
Carlo Leonardi, Mattia Chiesa
data(tda_test_data)
#tda_test_data <- predict_mapper_class(tda_test_data)
The method to set the comp slot
setComp(x, y)

## S4 method for signature 'TDAobj'
setComp(x, y)
x |
a |
y |
a data.frame with the comp data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the dfMapper slot
setDfMapper(x, y)

## S4 method for signature 'TDAobj'
setDfMapper(x, y)
x |
a |
y |
a data.frame with the dfMapper data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the dist_mat slot
setDistMat(x, y)

## S4 method for signature 'TDAobj'
setDistMat(x, y)
x |
a |
y |
a data.frame with the dist_mat data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set igraph object to the graph slot
setGraph(x)

## S4 method for signature 'TDAobj'
setGraph(x)
x |
a |
a TDAobj object
Carlo Leonardi
data(tda_test_data)
The method to set the jacc slot
setJacc(x, y)

## S4 method for signature 'TDAobj'
setJacc(x, y)
x |
a |
y |
a matrix with the jacc data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the node_data_mat slot
setNodeDataMat(x, y)

## S4 method for signature 'TDAobj'
setNodeDataMat(x, y)
x |
a |
y |
a data.frame with the node_data_mat data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the orig_data slot
setOrigData(x, y)

## S4 method for signature 'TDAobj'
setOrigData(x, y)
x |
a |
y |
a data.frame with the original data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the outcome slot
setOutcome(x, y)

## S4 method for signature 'TDAobj'
setOutcome(x, y)
x |
a |
y |
a data.frame with the outcome data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the outcomeFact slot
setOutcomeFact(x, y)

## S4 method for signature 'TDAobj'
setOutcomeFact(x, y)
x |
a |
y |
a data.frame with the outcomeFact data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
The method to set the scaled_data slot
setScaledData(x, y)

## S4 method for signature 'TDAobj'
setScaledData(x, y)
x |
a |
y |
a data.frame with the scaled data |
a TDAobj object
Mattia Chiesa
data(tda_test_data)
A TDAobj with data in all slots, for testing the PIUMA package.
data(tda_test_data)
A TDAobj.
This function computes the average value of additional features provided by the user and calculates the size of each node of the 'dfMapper' slot.
tdaDfEnrichment(x, df)
x |
A TDAobj object, processed by the |
df |
A data.frame with scaled values in the classical n x m form: rows (n) and columns (m) must be observations and features, respectively. |
The starting TDAobj object, in which a data.frame with additional information for each node has been added (slot: 'node_data_mat')
Mattia Chiesa, Laura Ballarini, Luca Piacentini
makeTDAobj,
dfToDistance,
dfToProjection,
mapperCore,
jaccardMatrix
## use example data:
data(tda_test_data)
data(df_test_proj)
enrich_mat_tda <- tdaDfEnrichment(tda_test_data, df_test_proj)
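The per-node summary described for tdaDfEnrichment (mean of each additional feature, plus node size) can be illustrated with base R on toy inputs. The node membership list and the feature table are assumed data, and the layout of the result is a guess at what a 'node_data_mat'-like table could look like, not the package's exact output.

```r
# Hypothetical node membership: each node lists the observations it contains.
nodes <- list(node_1 = c("c1", "c2"), node_2 = c("c2", "c3", "c4"))

# Additional features, one row per observation (assumed data).
df <- data.frame(
  f1 = c(0, 1, 1, 0.5),
  f2 = c(0.2, 0.4, 0.6, 0.8),
  row.names = c("c1", "c2", "c3", "c4")
)

# Per-node mean of every feature, plus the node size.
node_means <- t(vapply(nodes, function(ids) colMeans(df[ids, , drop = FALSE]),
                       numeric(ncol(df))))
node_data <- data.frame(node_means, size = lengths(nodes))
```

Observation c2 belongs to both nodes, so it contributes to both averages, which mirrors how overlapping Mapper nodes share members.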
The TDA object for storing TDA data
TDAobj class showClass("TDAobj")
orig_data: A data.frame of original data (without outcomes)
scaled_data: A data.frame of re-scaled data (without outcomes)
outcomeFact: A data.frame of original outcomes
outcome: A data.frame of original outcomes converted as numeric
comp: A data.frame containing the components of projected data
dist_mat: A data.frame containing the computed distance matrix
dfMapper: A data.frame containing the nodes, with their elements, identified by TDA
jacc: A matrix of Jaccard indexes between each pair of dfMapper nodes
node_data_mat: A data.frame with the node size and the average value of each feature
graph: A list containing the igraph object of your Jaccard matrix, metrics and intermediary objects
clustering: A data.frame containing clusters from TDA on nodes and cells
We tested PIUMA on a subset of the single-cell RNA sequencing dataset (GEO accession GSE193346) generated and published by Feng et al. (2022) in Nature Communications to demonstrate that distinct transcriptional profiles are present in specific cell types of each heart chamber, which were attributed roles in cardiac development. In this tutorial, our aim is to use PIUMA to identify sub-populations of vascular endothelial cells that can be associated with specific heart developmental stages. The original dataset consisted of three layers of heterogeneity: cell type, stage and zone (i.e., heart chamber). Our testing dataset was obtained by subsetting the vascular endothelial cells (cell type) from the Seurat object and extracting raw counts and metadata. We then filtered out lowly expressed genes and normalized the data with DaMiRseq.
data(vascEC_meta)
A data frame with 1180 rows (cells) and 2 columns (outcomes).
We tested PIUMA on a subset of the single-cell RNA sequencing dataset (GEO accession GSE193346) generated and published by Feng et al. (2022) in Nature Communications to demonstrate that distinct transcriptional profiles are present in specific cell types of each heart chamber, which were attributed roles in cardiac development. In this tutorial, our aim is to use PIUMA to identify sub-populations of vascular endothelial cells that can be associated with specific heart developmental stages. The original dataset consisted of three layers of heterogeneity: cell type, stage and zone (i.e., heart chamber). Our testing dataset was obtained by subsetting the vascular endothelial cells (cell type) from the Seurat object and extracting raw counts and metadata. We then filtered out lowly expressed genes and normalized the data with DaMiRseq.
data(vascEC_norm)
A matrix with 1180 rows (cells) and 838 columns (genes).