TreeSummarizedExperiment objects

Multiple TreeSummarizedExperiment (TSE) objects can be combined using rbind or cbind. Here, we create a toy TreeSummarizedExperiment object using makeTSE() (see ?makeTSE). Because the tree in the row/column tree slot is generated randomly by ape::rtree(), set.seed() is used to make the results reproducible.
library(TreeSummarizedExperiment)
set.seed(1)
# TSE: without the column tree
(tse_a <- makeTSE(include.colTree = FALSE))
## class: TreeSummarizedExperiment
## dim: 10 4
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (10 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
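The second summary that follows (dim: 20 4) is what we would expect from stacking tse_a on itself by row. A minimal sketch of that step, assuming the result is stored as tse_aa, the name used in the text further down:

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)

# combine two copies of the same TSE by row
(tse_aa <- rbind(tse_a, tse_a))
```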
## class: TreeSummarizedExperiment
## dim: 20 4
## metadata(0):
## assays(1): ''
## rownames(20): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (20 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
The generated tse_aa has 20 rows, twice as many as tse_a. The row tree in tse_aa is the same as that in tse_a.
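One way to check this claim is to compare the two row trees directly. The sketch below rebuilds both objects so that it runs on its own; the TRUE printed next is consistent with such a comparison.

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
tse_aa <- rbind(tse_a, tse_a)

# the row tree is shared, not duplicated
identical(rowTree(tse_aa), rowTree(tse_a))
```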
## [1] TRUE
If we rbind two TSEs (e.g., tse_a and tse_b) that have different row trees, the obtained TSE (e.g., tse_ab) will have two row trees.
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
# different row trees
identical(rowTree(tse_a), rowTree(tse_b))
## [1] FALSE
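The summary that follows reports two phylo trees in rowTree, which is what combining tse_a and tse_b by row should give. A minimal sketch of that step, assuming the result is stored as tse_ab as in the text:

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)

# the two inputs carry different row trees, so both are kept
(tse_ab <- rbind(tse_a, tse_b))
```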
## class: TreeSummarizedExperiment
## dim: 20 4
## metadata(0):
## assays(1): ''
## rownames(20): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (20 rows)
## rowTree: 2 phylo tree(s) (20 leaves)
## colLinks: NULL
## colTree: NULL
In the row link data, the whichTree column indicates which tree each row is mapped to. For tse_aa, there is only one tree, named phylo. For tse_ab, however, there are two trees (phylo and phylo.1).
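The two tables that follow look like the row links of tse_aa and tse_ab. A sketch of how they could be printed with rowLinks(), rebuilding the objects so the block is self-contained:

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
tse_aa <- rbind(tse_a, tse_a)
tse_ab <- rbind(tse_a, tse_b)

# one tree name for tse_aa, two for tse_ab
rowLinks(tse_aa)
rowLinks(tse_ab)
```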
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE phylo
## entity2 entity2 alias_2 2 TRUE phylo
## entity3 entity3 alias_3 3 TRUE phylo
## entity4 entity4 alias_4 4 TRUE phylo
## entity5 entity5 alias_5 5 TRUE phylo
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE phylo
## entity7 entity7 alias_7 7 TRUE phylo
## entity8 entity8 alias_8 8 TRUE phylo
## entity9 entity9 alias_9 9 TRUE phylo
## entity10 entity10 alias_10 10 TRUE phylo
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE phylo
## entity2 entity2 alias_2 2 TRUE phylo
## entity3 entity3 alias_3 3 TRUE phylo
## entity4 entity4 alias_4 4 TRUE phylo
## entity5 entity5 alias_5 5 TRUE phylo
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE phylo.1
## entity7 entity7 alias_7 7 TRUE phylo.1
## entity8 entity8 alias_8 8 TRUE phylo.1
## entity9 entity9 alias_9 9 TRUE phylo.1
## entity10 entity10 alias_10 10 TRUE phylo.1
The names of the trees can be accessed using rowTreeNames. If the input TSEs use the same name for their trees, rbind automatically creates valid and unique names by using make.names. tse_a and tse_b both use phylo as the name of their row tree. In tse_ab, the row tree that originates from tse_b is named phylo.1 instead.
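The four results printed next are consistent with querying the tree names of the combined and the input objects. A sketch, assuming that order of calls:

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
tse_aa <- rbind(tse_a, tse_a)
tse_ab <- rbind(tse_a, tse_b)

# tree names in the combined objects
rowTreeNames(tse_aa)
rowTreeNames(tse_ab)

# tree names in the input objects
rowTreeNames(tse_a)
rowTreeNames(tse_b)
```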
## [1] "phylo"
## [1] "phylo" "phylo.1"
## [1] "phylo"
## [1] "phylo"
Once the names of the trees are changed, the whichTree column in rowLinks() is updated accordingly.
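The table that follows uses the names tree1 and tree2. A sketch of a renaming that would produce them, using the rowTreeNames setter (the exact call used originally is an assumption):

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
tse_ab <- rbind(tse_a, tse_b)

# rename the two row trees; whichTree in the row links follows the new names
rowTreeNames(tse_ab) <- paste0("tree", 1:2)
rowLinks(tse_ab)
```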
## LinkDataFrame with 20 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE tree1
## entity2 entity2 alias_2 2 TRUE tree1
## entity3 entity3 alias_3 3 TRUE tree1
## entity4 entity4 alias_4 4 TRUE tree1
## entity5 entity5 alias_5 5 TRUE tree1
## ... ... ... ... ... ...
## entity6 entity6 alias_6 6 TRUE tree2
## entity7 entity7 alias_7 7 TRUE tree2
## entity8 entity8 alias_8 8 TRUE tree2
## entity9 entity9 alias_9 9 TRUE tree2
## entity10 entity10 alias_10 10 TRUE tree2
To run cbind, TSEs must agree in the row dimension. If the TSEs differ only in their row trees, the row tree and the row link data are dropped, with a warning.
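The two summaries that follow, the second preceded by a warning, are consistent with one cbind of identical TSEs and one cbind of TSEs with different row trees. A sketch of both calls (cb_same and cb_diff are illustrative names):

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)

# same row tree in both inputs
(cb_same <- cbind(tse_a, tse_a))

# different row trees: a warning about rowTree & rowLinks is expected
(cb_diff <- cbind(tse_a, tse_b))
```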
## class: TreeSummarizedExperiment
## dim: 10 8
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(8): sample1 sample2 ... sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (10 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
## Warning in cbind(...): rowTree & rowLinks differ in the provided TSEs.
## rowTree & rowLinks are dropped after 'cbind'
## class: TreeSummarizedExperiment
## dim: 10 8
## metadata(0):
## assays(1): ''
## rownames(10): entity1 entity2 ... entity9 entity10
## rowData names(2): var1 var2
## colnames(8): sample1 sample2 ... sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (10 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
We obtain a subset of tse_ab by extracting rows 11:15. These rows are all mapped to the same tree, tree2 (named phylo.1 before the trees were renamed). So, the rowTree slot of sse has only one tree.
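A sketch of the subsetting step with [, assuming the row trees of tse_ab have already been renamed to tree1 and tree2 as in the row links shown earlier:

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
tse_ab <- rbind(tse_a, tse_b)
rowTreeNames(tse_ab) <- paste0("tree", 1:2)

# rows 11:15 all come from tse_b, so only its tree is kept
(sse <- tse_ab[11:15, ])
rowLinks(sse)
```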
## class: TreeSummarizedExperiment
## dim: 5 4
## metadata(0):
## assays(1): ''
## rownames(5): entity1 entity2 entity3 entity4 entity5
## rowData names(2): var1 var2
## colnames(4): sample1 sample2 sample3 sample4
## colData names(2): ID group
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## rowLinks: a LinkDataFrame (5 rows)
## rowTree: 1 phylo tree(s) (10 leaves)
## colLinks: NULL
## colTree: NULL
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity1 entity1 alias_1 1 TRUE tree2
## entity2 entity2 alias_2 2 TRUE tree2
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
[ works not only as a getter but also as a setter to replace a subset of sse.
set.seed(3)
tse_c <- makeTSE(include.colTree = FALSE)
rowTreeNames(tse_c) <- "new_tree"
# the first two rows are from tse_c, and are mapped to 'new_tree'
sse[1:2, ] <- tse_c[5:6, ]
rowLinks(sse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
The TSE object can also be subset by nodes and/or trees using subsetByNode.
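The first two tables that follow are consistent with subsetting by tree only and by node only. A sketch of those two calls (sse_a and sse_b are assumed names), with sse rebuilt from the earlier steps so the block runs on its own:

```r
library(TreeSummarizedExperiment)

# rebuild sse from the earlier steps
set.seed(1)
tse_a <- makeTSE(include.colTree = FALSE)
set.seed(2)
tse_b <- makeTSE(include.colTree = FALSE)
tse_ab <- rbind(tse_a, tse_b)
rowTreeNames(tse_ab) <- paste0("tree", 1:2)
sse <- tse_ab[11:15, ]
set.seed(3)
tse_c <- makeTSE(include.colTree = FALSE)
rowTreeNames(tse_c) <- "new_tree"
sse[1:2, ] <- tse_c[5:6, ]

# by tree: keep the rows mapped to 'new_tree'
sse_a <- subsetByNode(x = sse, whichRowTree = "new_tree")
rowLinks(sse_a)

# by node: keep the rows mapped to node 5, whichever tree they belong to
sse_b <- subsetByNode(x = sse, rowNode = 5)
rowLinks(sse_b)
```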
## LinkDataFrame with 2 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
## LinkDataFrame with 2 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity5 entity5 alias_5 5 TRUE tree2
# by tree and node
sse_c <- subsetByNode(x = sse, rowNode = 5, whichRowTree = "tree2")
rowLinks(sse_c)
## LinkDataFrame with 1 row and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE tree2
By using colTree, we can add a column tree to sse, which previously had none.
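The NULL printed next is what the colTree getter returns when no column tree is present. A sketch on a stand-in object (tse is an illustrative name; any TSE built without a column tree should behave the same way):

```r
library(TreeSummarizedExperiment)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)

# no column tree has been set yet
colTree(tse)
```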
## NULL
library(ape)
set.seed(1)
col_tree <- rtree(ncol(sse))
# To use 'colTree' as a setter, the input tree should have node labels that
# match the column names of the TSE.
col_tree$tip.label <- colnames(sse)
colTree(sse) <- col_tree
colTree(sse)
##
## Phylogenetic tree with 4 tips and 3 internal nodes.
##
## Tip labels:
## sample1, sample2, sample3, sample4
##
## Rooted; includes branch lengths.
sse has two row trees. We can replace one of them with a new tree by specifying the whichTree argument of rowTree.
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_5 5 TRUE new_tree
## entity6 entity6 alias_6 6 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
# the new row tree
set.seed(1)
row_tree <- rtree(4)
row_tree$tip.label <- paste0("entity", 5:8)
# replace the tree named as the 'new_tree'
nse <- sse
rowTree(nse, whichTree = "new_tree") <- row_tree
rowLinks(nse)
## LinkDataFrame with 5 rows and 5 columns
## nodeLab nodeLab_alias nodeNum isLeaf whichTree
## <character> <character> <integer> <logical> <character>
## entity5 entity5 alias_1 1 TRUE new_tree
## entity6 entity6 alias_2 2 TRUE new_tree
## entity3 entity3 alias_3 3 TRUE tree2
## entity4 entity4 alias_4 4 TRUE tree2
## entity5 entity5 alias_5 5 TRUE tree2
In the row links, the first two rows now have new values in nodeNum and nodeLab_alias. The name in whichTree is unchanged, but the tree itself has been replaced.
# FALSE is expected
identical(rowTree(sse, whichTree = "new_tree"),
rowTree(nse, whichTree = "new_tree"))
## [1] FALSE
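The TRUE printed next is consistent with a second check, namely that the stored tree is the one that was supplied. A sketch of such a check on a stand-in object (tse, out and new_tree are illustrative names, and the seed is arbitrary):

```r
library(TreeSummarizedExperiment)
library(ape)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)
rowTreeNames(tse) <- "new_tree"

# replacement tree whose tip labels match the row names of the TSE
set.seed(5)
new_tree <- rtree(nrow(tse))
new_tree$tip.label <- rownames(tse)

out <- tse
rowTree(out, whichTree = "new_tree") <- new_tree

# TRUE is expected: the tree under the unchanged name is now the supplied one
identical(rowTree(out, whichTree = "new_tree"), new_tree)
```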
## [1] TRUE
If the nodes of the input tree and the rows of the TSE are named differently, users can match rows to nodes via changeTree by providing rowNodeLab.
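As an illustration, the sketch below replaces the row tree of a toy TSE with a tree whose tips keep ape's default labels (t1, t2, ...), so they do not match the row names (entity1, entity2, ...). The mapping in node_lab, where row i corresponds to tip t<i>, is a made-up assumption for the example; in practice it would come from the annotation of the data.

```r
library(TreeSummarizedExperiment)
library(ape)

set.seed(1)
tse <- makeTSE(include.colTree = FALSE)  # rows are named entity1 ... entity10

# hypothetical replacement tree; tips are labelled t1 ... t10 by rtree()
set.seed(4)
new_tree <- rtree(nrow(tse))

# hypothetical mapping: row i (entity<i>) corresponds to the tip labelled t<i>
node_lab <- paste0("t", seq_len(nrow(tse)))

# rowNodeLab tells changeTree which node each row belongs to
tse_new <- changeTree(x = tse, rowTree = new_tree, rowNodeLab = node_lab)
rowLinks(tse_new)
```

The row names of the TSE are left as they are; only the link between rows and nodes is defined through rowNodeLab.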
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggplot2_3.5.1 ggtree_3.15.0
## [3] ape_5.8 TreeSummarizedExperiment_2.15.0
## [5] Biostrings_2.75.0 XVector_0.46.0
## [7] SingleCellExperiment_1.28.0 SummarizedExperiment_1.36.0
## [9] Biobase_2.67.0 GenomicRanges_1.59.0
## [11] GenomeInfoDb_1.43.0 IRanges_2.41.0
## [13] S4Vectors_0.44.0 BiocGenerics_0.53.1
## [15] generics_0.1.3 MatrixGenerics_1.19.0
## [17] matrixStats_1.4.1 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.1.4 farver_2.1.2
## [4] fastmap_1.2.0 lazyeval_0.2.2 digest_0.6.37
## [7] lifecycle_1.0.4 tidytree_0.4.6 magrittr_2.0.3
## [10] compiler_4.4.1 rlang_1.1.4 sass_0.4.9
## [13] tools_4.4.1 utf8_1.2.4 yaml_2.3.10
## [16] knitr_1.48 labeling_0.4.3 S4Arrays_1.6.0
## [19] DelayedArray_0.33.1 aplot_0.2.3 abind_1.4-8
## [22] BiocParallel_1.41.0 withr_3.0.2 purrr_1.0.2
## [25] sys_3.4.3 grid_4.4.1 fansi_1.0.6
## [28] colorspace_2.1-1 scales_1.3.0 cli_3.6.3
## [31] rmarkdown_2.28 crayon_1.5.3 treeio_1.30.0
## [34] httr_1.4.7 cachem_1.1.0 zlibbioc_1.52.0
## [37] parallel_4.4.1 ggplotify_0.1.2 BiocManager_1.30.25
## [40] vctrs_0.6.5 yulab.utils_0.1.7 Matrix_1.7-1
## [43] jsonlite_1.8.9 gridGraphics_0.5-1 patchwork_1.3.0
## [46] maketools_1.3.1 jquerylib_0.1.4 tidyr_1.3.1
## [49] glue_1.8.0 codetools_0.2-20 gtable_0.3.6
## [52] UCSC.utils_1.2.0 munsell_0.5.1 tibble_3.2.1
## [55] pillar_1.9.0 htmltools_0.5.8.1 GenomeInfoDbData_1.2.13
## [58] R6_2.5.1 evaluate_1.0.1 lattice_0.22-6
## [61] highr_0.11 ggfun_0.1.7 bslib_0.8.0
## [64] Rcpp_1.0.13 SparseArray_1.6.0 nlme_3.1-166
## [67] xfun_0.48 fs_1.6.5 buildtools_1.0.0
## [70] pkgconfig_2.0.3