Title: | supraHex: a supra-hexagonal map for analysing tabular omics data |
---|---|
Description: | A supra-hexagonal map is a giant hexagon on a 2-dimensional grid that seamlessly consists of smaller hexagons. It is designed to train on, analyse and visualise high-dimensional omics input data. supraHex carries out gene clustering/meta-clustering and sample correlation, and provides intuitive visualisations to facilitate exploratory analysis. More importantly, it allows additional data to be overlaid onto the trained map to explore relations between the input and the additional data. With supraHex, it is therefore also possible to carry out multilayer omics data comparisons. Newly added utilities are advanced heatmap visualisation and tree-based analysis of sample relationships. Unique to this package, users can rapidly make sense of any tabular omics data, both scientifically and artistically, especially in a sample-specific fashion but without loss of information on large numbers of genes. |
Authors: | Hai Fang and Julian Gough |
Maintainer: | Hai Fang <[email protected]> |
License: | GPL-2 |
Version: | 1.45.0 |
Built: | 2024-10-31 05:45:27 UTC |
Source: | https://github.com/bioc/supraHex |
Human embryo dataset contains gene expression levels (5441 genes and 18 embryo samples) from Fang et al. (2010).
data(Fang)
Fang: a gene expression matrix of 5441 genes x 18 samples, involving six successive stages, each with three replicates.
Fang.sampleinfo: a matrix containing the information of the 18 samples for the expression matrix Fang. The three columns correspond to the sample information: "Name", "Stage" and "Replicate".
Fang.geneinfo: a matrix containing the information of the 5441 genes for the expression matrix Fang. The three columns correspond to the gene information: "AffyID", "EntrezGene" and "Symbol".
Fang et al. (2010). Transcriptome analysis of early organogenesis in human embryos. Developmental Cell, 19(1):174-84.
The leukemia dataset (learning set) contains gene expression levels (3051
genes and 38 patient samples) from Golub et al. (1999). This dataset
has been pre-processed in three steps: capping intensities to a floor of
100 and a ceiling of 16000; filtering by exclusion of genes with
$max/min \le 5$ or $(max - min) \le 500$, where max and min refer
respectively to the maximum and minimum intensities for a particular
gene across mRNA samples; and base-2 logarithmic transformation.
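The three pre-processing steps can be sketched in base R on a toy matrix. This is only an illustration, not the code used to build the dataset: the intensity values are made up, and the filtering thresholds (max/min <= 5 or max - min <= 500) are the ones commonly cited for this dataset and should be treated as an assumption here.

```r
# Toy intensity matrix (genes x samples); the values are made up for illustration.
x <- rbind(
  g1 = c(20, 150, 900, 40000),   # wide range: kept
  g2 = c(300, 320, 310, 305),    # nearly flat: filtered out
  g3 = c(100, 2000, 50, 700)     # kept
)

# 1) cap intensities into [100, 16000]
capped <- pmin(pmax(x, 100), 16000)

# 2) per-gene maximum and minimum across samples
gmax <- apply(capped, 1, max)
gmin <- apply(capped, 1, min)

# 3) exclude genes with max/min <= 5 or (max - min) <= 500 (assumed thresholds)
keep <- !((gmax / gmin <= 5) | (gmax - gmin <= 500))

# 4) base-2 logarithmic transformation of the retained genes
processed <- log2(capped[keep, , drop = FALSE])
print(rownames(processed))
```

In this toy case only g2 is dropped, because its capped intensities barely vary across samples.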
data(Golub)
Golub: a gene expression matrix of 3051 genes x 38 samples. These samples include 11 acute myeloid leukemia (AML) and 27 acute lymphoblastic leukemia (ALL) which can be further subtyped into 19 B-cell ALL and 8 T-cell ALL.
Golub et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, Vol. 286:531-537.
sBMH identifies the best-matching hexagons/rectangles (BMH) for the input data.
sBMH(sMap, data, which_bmh = c("best", "worst", "all"))
sMap |
an object of class "sMap" or a codebook matrix |
data |
a data frame or matrix of input data |
which_bmh |
which BMH is requested. It can be a vector consisting of any integer values from [1, nHex]. Alternatively, it can also be one of "best", "worst" and "all" choices. Here, "best" is equivalent to 1, "worst" to nHex, and "all" to seq(1,nHex) |
a list with following components:
bmh: the requested BMH matrix of dlen x length(which_bmh), where dlen is the total number of rows of the input data
qerr: the corresponding matrix of quantization errors (i.e., the distance between the input data and their BMH), with the same dimensions as "bmh" above
mqe: the mean quantization error for the "best" BMH
call: the call that produced this result
"which_bmh" upon request can be a vector consisting of any integer values from [1, nHex]
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")
# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")
# 4) define trainology at "rough" stage
sT_rough <- sTrainology(sMap=sI, data=data, stage="rough")
# 5) training at "rough" stage
sM_rough <- sTrainBatch(sMap=sI, data=data, sTrain=sT_rough)
# 6) define trainology at "finetune" stage
sT_finetune <- sTrainology(sMap=sI, data=data, stage="finetune")
# 7) training at "finetune" stage (continues from the "rough" map, using the "finetune" trainology)
sM_finetune <- sTrainBatch(sMap=sM_rough, data=data, sTrain=sT_finetune)
# 8) find the best-matching hexagons/rectangles for the input data
response <- sBMH(sMap=sM_finetune, data=data, which_bmh="best")
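To make the returned components concrete, the following base-R sketch mimics what a best-matching search produces on toy data. It is not the package's implementation: the codebook and data are random, Euclidean distance is assumed as the matching criterion, and all object names are hypothetical.

```r
set.seed(1)
codebook <- matrix(rnorm(7 * 3), nrow = 7)   # 7 hypothetical map nodes x 3 dimensions
data     <- matrix(rnorm(5 * 3), nrow = 5)   # 5 input rows x 3 dimensions

# Euclidean distance from every data row to every codebook row (5 x 7)
d2 <- outer(rowSums(data^2), rowSums(codebook^2), "+") - 2 * data %*% t(codebook)
d  <- sqrt(pmax(d2, 0))

# rank nodes per data row: column 1 holds the best match, the last column the worst
ranked <- t(apply(d, 1, order))

# analogue of which_bmh = "best": a dlen x 1 matrix of node indexes
bmh  <- ranked[, 1, drop = FALSE]
# quantization error: distance between each data row and its best-matching node
qerr <- matrix(d[cbind(seq_len(nrow(d)), bmh[, 1])], ncol = 1)
# mean quantization error over all data rows
mqe  <- mean(qerr)
```

Requesting other ranks amounts to picking other columns of `ranked`, which is why the result has one column per requested BMH.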
sCompReorder reorders component planes for the input map/data. It
returns an object of class "sReorder". It works by using a new map grid
(with sheet shape consisting of a rectangular lattice) to train
component plane vectors (either column-wise vectors of the
codebook/data matrix or the covariance matrix thereof). As a result,
similar component planes are placed closer to each other. To save
computational cost, it is highly recommended to use a trained map
(i.e. codebook matrix) as input if the data matrix is very large.
sCompReorder(
  sMap,
  xdim = NULL,
  ydim = NULL,
  amplifier = NULL,
  metric = c("none", "pearson", "spearman", "kendall", "euclidean", "manhattan", "cos", "mi"),
  init = c("linear", "uniform", "sample"),
  seed = 825,
  algorithm = c("sequential", "batch"),
  alphaType = c("invert", "linear", "power"),
  neighKernel = c("gaussian", "bubble", "cutgaussian", "ep", "gamma"),
  finetuneSustain = TRUE
)
sMap |
an object of class "sMap" or input data frame/matrix |
xdim |
an integer specifying x-dimension of the grid |
ydim |
an integer specifying y-dimension of the grid |
amplifier |
an integer specifying the amplifier (3 by default) of the number of component planes. The product of the component number and the amplifier constitutes the number of rectangles in the sheet grid |
metric |
distance metric used to define the similarity between component planes. It can be "none", which means directly using column-wise vectors of the codebook/data matrix. Otherwise, the covariance matrix is first calculated from the codebook/data matrix. The distance metric used for calculating the covariance matrix between component planes can be: "pearson" for Pearson correlation, "spearman" for Spearman rank correlation, "kendall" for Kendall tau rank correlation, "euclidean" for Euclidean distance, "manhattan" for cityblock distance, "cos" for cosine similarity, "mi" for mutual information. See sDistance for details |
init |
an initialisation method. It can be one of "uniform", "sample" and "linear" initialisation methods |
seed |
an integer specifying the seed |
algorithm |
the training algorithm. It can be one of "sequential" and "batch" algorithm. By default, it uses 'sequential' algorithm. If the input data contains a large number of samples but not a great amount of zero entries, then it is reasonable to use 'batch' algorithm for its fast computations (probably also without the compromise of accuracy) |
alphaType |
the alpha type. It can be one of "invert", "linear" and "power" alpha types |
neighKernel |
the training neighbor kernel. It can be one of "gaussian", "bubble", "cutgaussian", "ep" and "gamma" kernels |
finetuneSustain |
logical to indicate whether to sustain the "finetune" training. If TRUE, it will repeat the "finetune" stage until the mean quantization error gets worse. By default, it is set to TRUE |
an object of class "sReorder", a list with following components:
nHex: the total number of rectangles in the grid
xdim: x-dimension of the grid
ydim: y-dimension of the grid
uOrder: the unique order/placement for each component plane that is reordered to the "sheet"-shape grid with rectangular lattice
coord: a matrix of nHex x 2, with each row corresponding to the coordinates of each "uOrder" rectangle in the 2D map grid
call: the call that produced this result
All component planes are uniquely placed within a "sheet"-shape rectangle grid:
The placement of each component plane on the "sheet"-shape grid with rectangular lattice is determined iteratively, in an order from the best matched to the next compromised one.
If multiple components hit the same rectangle, the worse one is always sacrificed by moving it to its next best one, until all components are placed exclusively on a rectangle of their own.
The size of the "sheet"-shape rectangle grid depends on the input arguments:
The input parameters take priority in determining nHex in the following order: "xdim & ydim" > "nHex" > "data".
If both of xdim and ydim are given, $nHex = xdim \times ydim$.
If only data is input, $nHex = 5 \sqrt{dlen}$, where dlen is the number of rows of the input data.
After nHex is determined, the xy-dimensions of the rectangle grid are then determined according to the square root of the two biggest eigenvalues of the input data.
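The unique-placement rule can be illustrated with a small greedy procedure in base R. This is a sketch of the idea only, under the assumption that the quality of a match is given by a components x rectangles mismatch matrix (smaller is better); `place_unique` and `D` are hypothetical names, not part of the package.

```r
# Greedy unique placement: visit (component, rectangle) pairs from the best
# match to the worst, and give a component the first free rectangle it meets.
place_unique <- function(D) {
  # D: components x rectangles mismatch matrix (hypothetical input)
  stopifnot(ncol(D) >= nrow(D))
  placement <- rep(NA_integer_, nrow(D))
  ord  <- order(D)        # all pairs, best match first
  comp <- row(D)[ord]     # component index of each pair
  cell <- col(D)[ord]     # rectangle index of each pair
  for (k in seq_along(ord)) {
    i <- comp[k]
    j <- cell[k]
    # place component i only if it is still unplaced and rectangle j is free
    if (is.na(placement[i]) && !(j %in% placement)) placement[i] <- j
  }
  placement
}

# two components both prefer rectangle 1; the worse match moves to its next best
D <- rbind(c(0.1, 0.5, 0.9),
           c(0.2, 0.3, 0.8))
placement <- place_unique(D)
print(placement)
```

Because there are at least as many rectangles as components, every component eventually finds a free rectangle, which is why the amplifier enlarges the grid beyond the number of component planes.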
sTopology, sPipeline, sBMH, sDistance, visCompReorder
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")
# 2) get trained using by default setup
sMap <- sPipeline(data=data)
# 3) reorder component planes in different ways
# 3a) directly using column-wise vectors of codebook matrix
sReorder <- sCompReorder(sMap=sMap, amplifier=2, metric="none")
# 3b) according to covariance matrix of pearson correlation of codebook matrix
sReorder <- sCompReorder(sMap=sMap, amplifier=2, metric="pearson")
# 3c) according to covariance matrix of pearson correlation of input matrix
sReorder <- sCompReorder(sMap=data, amplifier=2, metric="pearson")
sDistance computes and returns the distance matrix between the rows of a
data matrix using a specified distance metric.
sDistance( data, metric = c("pearson", "spearman", "kendall", "euclidean", "manhattan", "cos", "mi", "binary") )
data |
a data frame or matrix of input data |
metric |
distance metric used to calculate a symmetric distance matrix. See 'Note' below for options available |
dist: a symmetric distance matrix of nRow x nRow, where nRow is the number of rows of input data matrix
The following distance metrics are supported:
"pearson": Pearson correlation. Note that two curves that have identical shape but different magnitude will still have a correlation of 1
"spearman": Spearman rank correlation. As a nonparametric version of the Pearson correlation, it calculates the correlation between the ranks of the data values in the two vectors (more robust against outliers)
"kendall": Kendall tau rank correlation. Compared to Spearman rank correlation, it goes a step further by using only the relative ordering to calculate the correlation. For all pairs of data points $(x_i, y_i)$ and $(x_j, y_j)$, it calls a pair of points either concordant ($N_c$ in total) if $(x_i - x_j)(y_i - y_j) > 0$, or discordant ($N_d$ in total) if $(x_i - x_j)(y_i - y_j) < 0$. Finally, it calculates the gamma coefficient $(N_c - N_d)/(N_c + N_d)$ as a measure of association which is highly resistant to tied data
"euclidean": Euclidean distance. Unlike the correlation-based distance measures, it takes the magnitude into account (input data should be suitably normalized)
"manhattan": Cityblock distance. The distance between two vectors is the sum of the absolute values of their differences along each coordinate dimension
"cos": Cosine similarity. As an uncentered version of Pearson correlation, it is a measure of similarity between two vectors of an inner product space, i.e., measuring the cosine of the angle between them (using a dot product and magnitude)
"mi": Mutual information (MI). $MI$ provides a general measure of dependencies between variables, in particular, positive, negative and nonlinear correlations. The calculation of $MI$ is implemented via applying an adaptive partitioning method for deriving equal-probability bins (i.e., each bin contains approximately the same number of data points). The number of bins is heuristically determined (the lower bound): $1 + \log_2(n)$, where n is the length of the vector. Because $MI$ increases with entropy, we normalize it to allow comparison of different pairwise clone similarities: $2\,MI/[H(x) + H(y)]$, where $H(x)$ and $H(y)$ stand for the entropy of the vectors $x$ and $y$, respectively
"binary": asymmetric binary (Jaccard distance index). The proportion of bits in which only one is on, among those in which at least one is on
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) calculate distance matrix using different metric
sMap <- sPipeline(data=data)
# 2a) using "pearson" metric
dist <- sDistance(data=data, metric="pearson")
# 2b) using "cos" metric
# dist <- sDistance(data=data, metric="cos")
# 2c) using "spearman" metric
# dist <- sDistance(data=data, metric="spearman")
# 2d) using "kendall" metric
# dist <- sDistance(data=data, metric="kendall")
# 2e) using "euclidean" metric
# dist <- sDistance(data=data, metric="euclidean")
# 2f) using "manhattan" metric
# dist <- sDistance(data=data, metric="manhattan")
# 2g) using "mi" metric
# dist <- sDistance(data=data, metric="mi")
# 2h) using "binary" metric
# dist <- sDistance(data=data, metric="binary")
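To make two of the metrics concrete, the sketch below computes a concordance-based gamma coefficient, (Nc - Nd)/(Nc + Nd), and the cosine similarity by hand in base R. It illustrates the standard definitions on two short made-up vectors and does not claim to reproduce the package's internal code; the variable names are hypothetical.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

# all index pairs i < j
p <- combn(length(x), 2)
s <- sign((x[p[1, ]] - x[p[2, ]]) * (y[p[1, ]] - y[p[2, ]]))
Nc <- sum(s > 0)                  # concordant pairs
Nd <- sum(s < 0)                  # discordant pairs
gamma <- (Nc - Nd) / (Nc + Nd)    # 8 concordant, 2 discordant -> 0.6

# cosine similarity: dot product divided by the product of magnitudes
cos_sim <- sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# centring both vectors first turns the same formula into the Pearson correlation
xc <- x - mean(x)
yc <- y - mean(y)
pearson <- sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))
```

With no tied values, as here, the gamma coefficient coincides with `cor(x, y, method = "kendall")`, and the centred cosine coincides with `cor(x, y)`.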
sDmat calculates the distance (measured in high-dimensional input space)
to neighbors (defined in 2D output space) for each of the
hexagons/rectangles.
sDmat(sMap, which_neigh = 1, distMeasure = c("median", "mean", "min", "max"))
sMap |
an object of class "sMap" |
which_neigh |
which neighbors in 2D output space are used for the calculation. By default, it is set to "1" for direct neighbors; "2" means neighbors no more than 2 steps away, and so on |
distMeasure |
distance measure used to calculate distances in high-dimensional input space |
dMat: a vector with the length of nHex. It stores the distance of each hexagon/rectangle from its output-space-defined neighbors, measured in high-dimensional input space
"which_neigh" is defined in output 2D space, but "distMeasure" is defined in high-dimensional input space
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) get trained using by default setup
sMap <- sPipeline(data=data)
# 3) calculate "median" distances in INPUT space to different neighbors in 2D OUTPUT space
# 3a) using direct neighbors in 2D OUTPUT space
dMat <- sDmat(sMap=sMap, which_neigh=1, distMeasure="median")
# 3b) using no more than 2-topological neighbors in 2D OUTPUT space
# dMat <- sDmat(sMap=sMap, which_neigh=2, distMeasure="median")
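The interplay between the two spaces can be sketched on a toy map in base R: neighbors are looked up in output space, while distances are measured between codebook vectors in input space. The 4-node chain, its codebook and the Euclidean metric are assumptions made for illustration only; they are not taken from the package.

```r
# hypothetical map of 4 nodes in a chain (1-2-3-4);
# adj[i, j] is TRUE when i and j are direct neighbours in OUTPUT space
adj <- matrix(FALSE, 4, 4)
adj[cbind(1:3, 2:4)] <- TRUE
adj <- adj | t(adj)

# hypothetical codebook: one row per node, living in a 2D INPUT space
codebook <- rbind(c(0, 0), c(1, 0), c(1, 2), c(5, 2))

# pairwise Euclidean distances between codebook vectors (INPUT space)
d_in <- unname(as.matrix(dist(codebook)))

# for each node, summarise the input-space distances to its output-space neighbours
dmat <- sapply(seq_len(nrow(adj)), function(i) median(d_in[i, adj[i, ]]))
print(dmat)
```

Swapping `median` for `mean`, `min` or `max` gives the analogue of the other distance measures, and widening `adj` to include second-order neighbours gives the analogue of a larger neighbourhood.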
sDmatCluster obtains clusters from a grid map. It returns an object of
class "sBase".
sDmatCluster(
  sMap,
  which_neigh = 1,
  distMeasure = c("mean", "median", "min", "max"),
  constraint = TRUE,
  clusterLinkage = c("average", "complete", "single", "bmh"),
  reindexSeed = c("hclust", "svd", "none")
)
sMap |
an object of class "sMap" |
which_neigh |
which neighbors in 2D output space are used for the calculation. By default, it is set to "1" for direct neighbors; "2" means neighbors no more than 2 steps away, and so on |
distMeasure |
distance measure used to calculate distances in high-dimensional input space. It can be one of "median", "mean", "min" and "max" measures |
constraint |
logical indicating whether a further constraint is applied. If TRUE, only consider those hexagons 1) with 2 or more neighbors; and 2) whose neighbors are not within minima already found (due to the same distance) |
clusterLinkage |
cluster linkage used to derive clusters. It can be "bmh", which accumulates a cluster just based on best-matching hexagons/rectangles but cannot ensure each cluster is continuous. Instead, each cluster is continuous when using the region-growing algorithm with one of "average", "complete" and "single" linkages |
reindexSeed |
the way to index seed. It can be "hclust" for reindexing seeds according to hierarchical clustering of patterns seen in seeds, "svd" for reindexing seeds according to svd of patterns seen in seeds, or "none" for seeds being simply increased by the hexagon indexes (i.e. always in an increasing order as hexagons radiate outwards) |
an object of class "sBase", a list with following components:
seeds: the vector storing cluster seeds, i.e., a list of local minima (in 2D output space) of the distance matrix (in input space). They are represented by the indexes of hexagons/rectangles
bases: the vector with the length of nHex storing the cluster memberships/bases, where nHex is the total number of hexagons/rectangles in the grid
ig: an igraph object storing neighbor relations between bases, with node attributes 'name' (base), 'index', 'xcoord' and 'ycoord' (based on seeds)
hclust: a hclust object storing tree-like relations between bases (based on seed model vectors)
call: the call that produced this result
The first item in the return "seeds" is the seed of the first cluster, whose members are those entries in the return "bases" that equal 1. The same relationship holds for the second item, and so on
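The correspondence between "seeds" and "bases" can be shown with two toy vectors in base R. The values below are invented for a 7-hexagon map with two clusters and only illustrate how the two components are meant to be read together.

```r
# hypothetical return values for a 7-hexagon map with two clusters
seeds <- c(4, 1)                   # hexagon indexes of the cluster seeds
bases <- c(2, 2, 1, 1, 1, 2, 2)    # cluster membership of each hexagon

# members of the k-th cluster are the hexagons whose base equals k
members <- lapply(seq_along(seeds), function(k) which(bases == k))
names(members) <- paste0("seed_", seeds)
print(members)
```

Here the first seed (hexagon 4) labels the cluster made of hexagons 3, 4 and 5, and the second seed (hexagon 1) labels hexagons 1, 2, 6 and 7.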
sPipeline, sDmatMinima, sBMH, sNeighDirect, sDistance, visDmatCluster
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) get trained using by default setup
sMap <- sPipeline(data=data)
# 3) partition the grid map into clusters based on different criteria
# 3a) based on "bmh" criterion
# sBase <- sDmatCluster(sMap=sMap, which_neigh=1, distMeasure="median", clusterLinkage="bmh")
# 3b) using region-growing algorithm with linkage "average"
sBase <- sDmatCluster(sMap=sMap, which_neigh=1, distMeasure="median", clusterLinkage="average")
# 4) visualise clusters/bases partitioned from the sMap
visDmatCluster(sMap,sBase)
sDmatMinima identifies local minima of the distance matrix (resulting
from sDmat). The criterion for being a local minimum is that the
distance associated with a hexagon/rectangle is always smaller than
that of its direct neighbors (i.e., 1-neighborhood).
sDmatMinima( sMap, which_neigh = 1, distMeasure = c("median", "mean", "min", "max"), constraint = TRUE )
sMap |
an object of class "sMap" |
which_neigh |
which neighbors in 2D output space are used for the calculation. By default, it is set to "1" for direct neighbors; "2" means neighbors no more than 2 steps away, and so on |
distMeasure |
distance measure used to calculate distances in high-dimensional input space. It can be one of "median", "mean", "min" and "max" measures |
constraint |
logical indicating whether a further constraint is applied. If TRUE, only consider those hexagons 1) with 2 or more neighbors; and 2) whose neighbors are not within minima already found (due to the same distance) |
minima: a vector storing the local minima (represented by the indexes of hexagons/rectangles)
Do not get confused by "which_neigh" and the criteria of being local minima. Both of them deal with 2D output space. However, "which_neigh" is used to assist in the calculation of distance matrix (so can be 1-neighborhood or more); instead, the criterion of being local minima is only 1-neighborhood in the strictest sense
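The 1-neighborhood criterion can be sketched in base R on a toy ring of six nodes. The neighbor list and the distance values are made up for illustration (in practice the values would come from a distance matrix such as the one returned by sDmat), and the extra "constraint" rules are deliberately left out.

```r
# hypothetical ring of 6 nodes: node i neighbours i-1 and i+1 (wrapping around)
n  <- 6
nb <- lapply(seq_len(n), function(i) c((i - 2) %% n, i %% n) + 1)

# hypothetical distance-matrix values, one per node
dmat <- c(0.9, 0.2, 0.7, 0.8, 0.3, 0.6)

# a node is a local minimum if its value is smaller than that of every direct neighbour
is_min <- sapply(seq_len(n), function(i) all(dmat[i] < dmat[nb[[i]]]))
minima <- which(is_min)
print(minima)
```

Nodes 2 and 5 qualify because each is strictly smaller than both of its direct neighbours; node 6 does not, since its neighbour 5 is smaller.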
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) get trained using by default setup
sMap <- sPipeline(data=data)
# 3) identify local minima of distance matrix based on "median" distances and direct neighbors
minima <- sDmatMinima(sMap=sMap, which_neigh=1, distMeasure="median")
sHexDist calculates Euclidean distances between each pair of
hexagons/rectangles in the 2D grid of an input "sTopol" or "sMap"
object. It returns a symmetric matrix containing pairwise distances.
sHexDist(sObj)
sObj |
an object of class "sTopol" or "sInit" or "sMap" |
dist: a symmetric matrix of nHex x nHex, containing pairwise distances, where nHex is the total number of hexagons/rectangles in the grid
The return matrix has rows/columns ordered in the same order as the "coord" matrix of the input object does.
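The same kind of matrix can be produced in base R from any coordinate matrix, which may help when checking the row/column ordering. The sketch below assumes a hypothetical "coord" matrix holding a central hexagon and its six direct neighbours at unit distance; it is not the package's grid constructor.

```r
# hypothetical "coord" matrix: a centre plus its 6 direct neighbours,
# ordered from the rightmost one anti-clockwise
angles <- (0:5) * pi / 3
coord  <- rbind(c(0, 0), cbind(cos(angles), sin(angles)))
colnames(coord) <- c("x", "y")

# pairwise Euclidean distances; rows/columns follow the row order of 'coord'
d <- as.matrix(dist(coord, method = "euclidean"))
print(round(d[1, ], 3))
```

Row 1 holds the distances from the centre, so every entry other than the diagonal is 1, while opposite neighbours (for example rows 2 and 5) are 2 apart.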
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10, mean=0, sd=1), nrow=100, ncol=10)
# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")
# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")
# 4) calculate distances between hexagons/rectangles in a 2D grid based on different objects
# 4a) based on an object of class "sTopol"
dist <- sHexDist(sObj=sTopol)
# 4b) based on an object of class "sInit" (as returned by sInitial)
dist <- sHexDist(sObj=sI)
sHexGrid defines a supra-hexagonal map grid. A supra-hexagon is a giant
hexagon, which seamlessly consists of smaller hexagons. Due to its
symmetric nature, it can be uniquely determined by specifying the
radius away from the grid centroid. This function takes as input the
grid radius (or the number of hexagons in the grid, which will be
adjusted to meet the definition of a supra-hexagon), and returns a
list (see 'Value' below) containing: the grid radius, the total number
of hexagons in the grid, the 2D coordinates of the grid centroid, the
step of each hexagon away from the grid centroid, and the 2D
coordinates of all hexagons in the grid.
sHexGrid(r = NULL, nHex = NULL)
r |
an integer specifying the radius in a supra-hexagonal grid |
nHex |
the number of input hexagons in the grid |
an object of class "sHex", a list with following components:
r: the grid radius
nHex: the total number of hexagons in the grid. It may differ from the input value; it is always no less than the input one, to ensure that a supra-hexagonal grid is exactly formed
centroid: the 2D coordinates of the grid centroid
stepCentroid: a vector with the length of nHex. It stores how many steps a hexagon is away from the grid centroid ('1' for the centroid itself). Starting with the centroid, it orders outward. Also, for those hexagons of the same step, it orders from the rightmost in an anti-clockwise direction
angleCentroid: a vector with the length of nHex. It stores the angle of a hexagon relative to the grid centroid ('0' for the centroid itself). For those hexagons of the same step, it orders from the rightmost in an anti-clockwise direction
coord: a matrix of nHex x 2 with each row specifying the 2D coordinates of a hexagon in the grid. The order of rows is the same as 'stepCentroid' above
call: the call that produced this result
The relationships among return values:
nHex = 1+6*r*(r-1)/2
centroid = coord[1,]
stepCentroid[1] = 1
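The link between the radius and the number of hexagons can be sketched in base R. The closed form below counts one central hexagon plus rings of 6, 12, 18, ... hexagons, with r = 1 denoting the centroid alone; it agrees with the figures quoted in this manual (a request for 50 hexagons yields 61), but the helper names `nhex_from_r` and `r_from_nhex` are hypothetical and not part of the package.

```r
# number of hexagons in a supra-hexagon of radius r (r = 1 is the centroid alone)
nhex_from_r <- function(r) 1 + 6 * r * (r - 1) / 2

# smallest radius whose supra-hexagon holds at least nHex hexagons
r_from_nhex <- function(nHex) {
  r <- 1
  while (nhex_from_r(r) < nHex) r <- r + 1
  r
}

sizes <- nhex_from_r(1:6)          # 1 7 19 37 61 91
r50   <- r_from_nhex(50)           # 5
n50   <- nhex_from_r(r50)          # 61, the adjusted grid size for a request of 50
n12   <- nhex_from_r(r_from_nhex(12))   # 19
```

This also shows why the returned nHex is never smaller than the requested one: the request is rounded up to the next radius that completes a full supra-hexagon.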
# The supra-hexagonal grid is exactly determined by specifying the radius.
sHex <- sHexGrid(r=2)
# The grid is determined according to the number of input hexagons (after being adjusted).
# The returned sHex$nHex is always no less than the input one.
# It ensures a supra-hexagonal grid is exactly formed.
sHex <- sHexGrid(nHex=12)
# Ignore input nHex if r is also given
sHex <- sHexGrid(r=3, nHex=100)
# By default, r=3 if no parameters are specified
sHex <- sHexGrid()
sHexGridVariant defines a variant of a supra-hexagonal map grid. In
essence, it is a subset of the supra-hexagon.
sHexGridVariant(
  r = NULL,
  nHex = NULL,
  shape = c("suprahex", "triangle", "diamond", "hourglass", "trefoil", "ladder", "butterfly", "ring", "bridge")
)
r |
an integer specifying the radius in a supra-hexagonal grid |
nHex |
the number of input hexagons in the grid |
shape |
the grid shape, either "suprahex" for the suprahex itself, or its variants (including "triangle" for the triangle-shaped variant, "diamond" for the diamond-shaped variant, "hourglass" for the hourglass-shaped variant, "trefoil" for the trefoil-shaped variant, "ladder" for the ladder-shaped variant, "butterfly" for the butterfly-shaped variant, "ring" for the ring-shaped variant, and "bridge" for the bridge-shaped variant) |
an object of class "sHex", a list with following components:
r: the grid radius
nHex: the total number of hexagons in the grid. It may differ from the input value; it is always no less than the input one, to ensure that a supra-hexagonal grid is exactly formed
centroid: the 2D coordinates of the grid centroid
stepCentroid: a vector with the length of nHex. It stores how many steps a hexagon is away from the grid centroid ('1' for the centroid itself). Starting with the centroid, it orders outward. Also, for those hexagons of the same step, it orders from the rightmost in an anti-clockwise direction
angleCentroid: a vector with the length of nHex. It stores the angle of a hexagon relative to the grid centroid ('0' for the centroid itself). For those hexagons of the same step, it orders from the rightmost in an anti-clockwise direction
coord: a matrix of nHex x 2 with each row specifying the 2D coordinates of a hexagon in the grid. The order of rows is the same as 'stepCentroid' above
call: the call that produced this result
none
# For "supraHex" shape itself
sHex <- sHexGridVariant(r=6, shape="suprahex")

## Not run:
library(ggplot2)
#geom_polygon(color="black", fill=NA)

# For "supraHex" shape itself
sHex <- sHexGridVariant(r=6, shape="suprahex")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_suprahex <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="suprahex (r=6; xdim=ydim=11)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "triangle" shape
sHex <- sHexGridVariant(r=6, shape="triangle")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_triangle <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="triangle (r=6; xdim=ydim=6)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "diamond" shape
sHex <- sHexGridVariant(r=6, shape="diamond")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_diamond <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="diamond (r=6; xdim=6, ydim=11)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "hourglass" shape
sHex <- sHexGridVariant(r=6, shape="hourglass")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_hourglass <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="hourglass (r=6; xdim=6, ydim=11)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "trefoil" shape
sHex <- sHexGridVariant(r=6, shape="trefoil")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_trefoil <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="trefoil (r=6; xdim=ydim=11)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "ladder" shape
sHex <- sHexGridVariant(r=6, shape="ladder")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_ladder <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="ladder (r=6; xdim=11, ydim=6)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "butterfly" shape
sHex <- sHexGridVariant(r=6, shape="butterfly")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_butterfly <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) +
  labs(title="butterfly (r=6; xdim=ydim=11)") +
  theme(plot.title=element_text(hjust=0.5,size=8))

# For "ring" shape
sHex <- sHexGridVariant(r=6, shape="ring")
df_polygon <- sHexPolygon(sHex)
df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord))
gp_ring <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") + geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) + labs(title="ring (r=6; xdim=ydim=11)") + theme(plot.title=element_text(hjust=0.5,size=8)) # For "bridge" shape sHex <- sHexGridVariant(r=6, shape="bridge") df_polygon <- sHexPolygon(sHex) df_coord <- data.frame(sHex$coord, index=1:nrow(sHex$coord)) gp_bridge <- ggplot(data=df_polygon, aes(x,y,group=index)) + geom_polygon(aes(fill=factor(stepCentroid%%2))) + coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") + geom_text(data=df_coord, aes(x,y,label=index), color="white", size=3) + labs(title="bridge (r=6; xdim=11, ydim=6)") + theme(plot.title=element_text(hjust=0.5,size=8)) # combined visuals library(gridExtra) grid.arrange(grobs=list(gp_suprahex, gp_ring, gp_diamond, gp_trefoil, gp_butterfly, gp_hourglass, gp_ladder, gp_bridge, gp_triangle), layout_matrix=rbind(c(1,1,2,2,3),c(1,1,2,2,3),c(4,4,5,5,6),c(4,4,5,5,6),c(7,7,8,8,9)), nrow=5, ncol=5) ## End(Not run)
sHexPolygon extracts the polygon location of each hexagon within a supra-hexagonal grid.
sHexPolygon(sObj, area.size = 1)
sObj | an object of class "sMap" or "sInit" or "sTopol" or "sHex" |
area.size | an integer or a vector specifying the area size of each hexagon |
a tibble of 7 columns ('index','x','y','node','edge','stepCentroid','angleCentroid') storing polygon location per hexagon. 'node' for nodes (including n1,n2,n3,n4,n5,n6), and 'edge' for a list-column where each is a tibble with a single column 'edge' containing two rows (such as edges 'e12' and 'e16' for the node 'n1').
None
sObj <- sTopology(xdim=4, ydim=4, lattice="hexa", shape="suprahex")
df_polygon <- sHexPolygon(sObj, area.size=1)
sInitial initialises an object of class "sInit" given a topology and input data. Specifically, it initialises the codebook matrix (in the input high-dimensional space). The returned object inherits the topology information (i.e., a "sTopol" object from sTopology), along with the initialised codebook matrix and the method used.
sInitial(data, sTopol, init = c("linear", "uniform", "sample"), seed = 825)
data | a data frame or matrix of input data |
sTopol | an object of class "sTopol" (see sTopology) |
init | an initialisation method. It can be one of "uniform", "sample" and "linear" initialisation methods |
seed | an integer specifying the seed |
an object of class "sInit", a list with the following components:
nHex: the total number of hexagons/rectangles in the grid
xdim: x-dimension of the grid
ydim: y-dimension of the grid
r: the hypothetical radius of the grid
lattice: the grid lattice
shape: the grid shape
coord: a matrix of nHex x 2, with each row corresponding to the coordinates of a hexagon/rectangle in the 2D map grid
init: an initialisation method
codebook: a codebook matrix of nHex x ncol(data), with each row corresponding to a prototype vector in input high-dimensional space
call: the call that produced this result
The initialisation methods include:
"uniform": the codebook matrix is uniformly initialised via randomly taking any values within the interval [min, max] of each column of input data
"sample": the codebook matrix is initialised via randomly sampling/selecting input data
"linear": the codebook matrix is linearly initialised along the first two greatest eigenvectors of input data
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using different methods
# 3a) using "uniform" method
sI_uniform <- sInitial(data=data, sTopol=sTopol, init="uniform")
# 3b) using "sample" method
# sI_sample <- sInitial(data=data, sTopol=sTopol, init="sample")
# 3c) using "linear" method
# sI_linear <- sInitial(data=data, sTopol=sTopol, init="linear")
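The three initialisation methods described for sInitial can be illustrated in base R. The following is only a sketch under stated assumptions, not the package's implementation: the helper name `init_codebook` is hypothetical, the "linear" branch assumes the grid coordinates are rescaled to [-1, 1] and stretched by the standard deviations of the two leading principal components, and the "sample" branch assumes sampling without replacement whenever the data has enough rows.

```r
# Hypothetical helper (not part of supraHex): build an nHex x ncol(data) codebook.
init_codebook <- function(data, coord, init = c("linear", "uniform", "sample"), seed = 825) {
  init <- match.arg(init)
  data <- as.matrix(data)
  nHex <- nrow(coord)
  set.seed(seed)
  if (init == "uniform") {
    # draw every column uniformly within that column's [min, max]
    rng <- apply(data, 2, range)
    codebook <- sapply(seq_len(ncol(data)), function(j) runif(nHex, rng[1, j], rng[2, j]))
  } else if (init == "sample") {
    # pick nHex input rows at random (with replacement only if there are too few rows)
    idx <- sample(nrow(data), nHex, replace = nHex > nrow(data))
    codebook <- data[idx, , drop = FALSE]
  } else {
    # place prototypes on the plane spanned by the two leading principal axes
    pc <- prcomp(data, center = TRUE, scale. = FALSE)
    axes <- pc$rotation[, 1:2, drop = FALSE]
    # assumption: rescale each grid coordinate to [-1, 1]
    scaled <- apply(coord, 2, function(v) 2 * (v - min(v)) / (max(v) - min(v)) - 1)
    codebook <- sweep(scaled %*% diag(pc$sdev[1:2]) %*% t(axes), 2, colMeans(data), "+")
  }
  unname(codebook)
}

# usage on a toy 4 x 3 grid
data <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
coord <- as.matrix(expand.grid(x = 1:4, y = 1:3))
cb <- init_codebook(data, coord, init = "linear")
dim(cb)
```

In every case the result has one row per grid node and one column per input dimension, which matches the documented shape of the `codebook` component.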
sMapOverlay overlays additional data onto the trained map for viewing the distribution of that additional data. It returns an object of class "sMap". It works by first estimating the hit histogram weighted by the neighborhood kernel, and then calculating the distribution of the additional data over the map (similarly weighted by the neighborhood kernel). The final overlaid distribution of additional data is normalised by the hit histogram.
sMapOverlay(sMap, data = NULL, additional)
sMap | an object of class "sMap" |
data | a data frame or matrix of input data or NULL |
additional | a numeric vector or numeric matrix used to overlay onto the trained map. Its length (if a vector) or number of rows (if a matrix) must equal the number of rows in the input data |
an object of class "sMap", a list with the following components:
nHex: the total number of hexagons/rectangles in the grid
xdim: x-dimension of the grid
ydim: y-dimension of the grid
r: the hypothetical radius of the grid
lattice: the grid lattice
shape: the grid shape
coord: a matrix of nHex x 2, with rows corresponding to the coordinates of all hexagons/rectangles in the 2D map grid
ig: the igraph object
polygon: a tibble of 7 columns ('x','y','index','node','edge','stepCentroid','angleCentroid') storing polygon location per hexagon
init: an initialisation method
neighKernel: the training neighborhood kernel
codebook: a codebook matrix of nHex x ncol(additional), with rows corresponding to overlaid vectors
hits: a vector of length nHex, each element giving the number of input data vectors that hit the corresponding hexagon/rectangle
mqe: the mean quantization error for the "best" BMH
data: an input data matrix
response: a tibble of 3 columns ('did' for rownames of input data matrix, 'index', and 'qerr' (quantization error; the distance to the "best" BMH))
call: the call that produced this result
Weighting by the neighborhood kernel avoids a rigid overlay that considers only the single best-matching map node, since an input data vector may have several closely matching nodes.
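The kernel-weighted overlay described for sMapOverlay can be sketched in base R as follows. This is an illustration rather than the package's code: the function name `overlay_sketch`, the Gaussian kernel form and its width `sigma` are assumptions, and `bmh` is assumed to hold the index of the best-matching node of each input row.

```r
# Hypothetical helper (not part of supraHex): kernel-weighted overlay of additional data.
overlay_sketch <- function(coord, bmh, additional, sigma = 1) {
  additional <- as.matrix(additional)
  # assumption: Gaussian neighbourhood kernel on distances between node coordinates
  d2 <- as.matrix(dist(coord))^2
  K <- exp(-d2 / (2 * sigma^2))
  # weight of every input vector (columns) at every map node (rows)
  W <- K[, bmh, drop = FALSE]
  # kernel-weighted hit histogram
  weighted_hits <- rowSums(W)
  # kernel-weighted sum of the additional data, normalised by the weighted hits
  (W %*% additional) / weighted_hits
}

# usage: three nodes in a row, four input vectors, one additional variable
coord <- cbind(x = c(0, 1, 2), y = c(0, 0, 0))
bmh <- c(1, 1, 3, 2)
overlay_sketch(coord, bmh, additional = c(0.2, 0.4, 1.5, 0.9))
```

Because each node's value is a weighted average, the overlaid values always stay within the range of the additional data.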
sPipeline, sBMH, sHexDist, visHexMulComp
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")

# 2) train with the default setup
sMap <- sPipeline(data=data)

# 3) overlay additional data onto the trained map
# here using the first two columns of the input "data" as "additional"
# codebook in "sOverlay" is the same as the first two columns of codebook in "sMap"
sOverlay <- sMapOverlay(sMap=sMap, data=data, additional=data[,1:2])

# 4) view the distribution of that additional data
visHexMulComp(sOverlay)
sNeighAny calculates neighbors at any distance for each hexagon/rectangle in a regular 2D grid. It returns a matrix with rows for the self, and columns for its neighbors at any distance.
sNeighAny(sObj)
sObj | an object of class "sTopol" or "sInit" or "sMap" |
aNeigh: a matrix of nHex x nHex, containing distance info in terms of any neighbors, where nHex is the total number of hexagons/rectangles in the grid
The returned matrix has rows for the self, and columns for its neighbors. Non-zero entries give the distance to each neighbor, and zeros mark the self-self entries. Its rows/columns are ordered in the same way as the "coord" matrix of the input object.
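One way to picture such an all-pairs neighbor matrix is to start from a direct-neighbor adjacency matrix and expand outwards one step at a time. The sketch below assumes that "distance" means the number of steps between grid nodes; the source does not state the unit, and the helper name `hop_distance` is hypothetical.

```r
# Hypothetical helper (not part of supraHex): step distances from a 0/1 adjacency matrix.
hop_distance <- function(adj) {
  n <- nrow(adj)
  steps <- matrix(Inf, n, n)
  diag(steps) <- 0
  reached <- diag(n) > 0      # pairs whose distance is already known
  frontier <- reached         # pairs reached in the previous round
  step <- 0
  repeat {
    step <- step + 1
    # nodes adjacent to the previous frontier that have not been reached yet
    nxt <- (((frontier * 1) %*% adj) > 0) & !reached
    if (!any(nxt)) break
    steps[nxt] <- step
    reached <- reached | nxt
    frontier <- nxt
  }
  steps
}

# usage: a 7-node supra-hexagon (one centre plus six surrounding hexagons)
angles <- (0:5) * pi / 3
coord <- rbind(c(0, 0), cbind(cos(angles), sin(angles)))
adj <- (abs(as.matrix(dist(coord)) - 1) < 1e-8) * 1
aNeigh <- hop_distance(adj)
```

As in the documented return value, the diagonal is zero and the rows and columns follow the order of the coordinate matrix.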
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")

# 4) calculate any neighbors based on different objects
# 4a) based on an object of class "sTopol"
aNeigh <- sNeighAny(sObj=sTopol)
# 4b) based on an object of class "sInit"
# aNeigh <- sNeighAny(sObj=sI)
sNeighDirect calculates direct neighbors for each hexagon/rectangle in a regular 2D grid. It returns a matrix with rows for the self, and columns for its direct neighbors.
sNeighDirect(sObj)
sObj | an object of class "sTopol" or "sInit" or "sMap" |
dNeigh: a matrix of nHex x nHex, containing presence/absence info in terms of direct neighbors, where nHex is the total number of hexagons/rectangles in the grid
The returned matrix has rows for the self, and columns for its direct neighbors. A "1" means the presence of a direct neighbor, and a "0" its absence. Its rows/columns are ordered in the same way as the "coord" matrix of the input object.
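A presence/absence matrix of this kind can be derived from grid coordinates alone. The sketch below assumes that the centres of adjacent hexagons/rectangles are exactly one unit apart, which the source does not state, and the helper name `direct_neighbours` is hypothetical.

```r
# Hypothetical helper (not part of supraHex): 0/1 direct-neighbour matrix from coordinates.
direct_neighbours <- function(coord, tol = 1e-8) {
  # assumption: neighbouring centres are one unit apart
  d <- as.matrix(dist(coord))
  dN <- (abs(d - 1) < tol) * 1L
  dimnames(dN) <- NULL
  dN
}

# usage: a 7-node supra-hexagon (one centre plus six surrounding hexagons)
angles <- (0:5) * pi / 3
coord <- rbind(c(0, 0), cbind(cos(angles), sin(angles)))
dNeigh <- direct_neighbours(coord)
rowSums(dNeigh)   # number of direct neighbours per hexagon
```

The centre touches all six surrounding hexagons, whereas each outer hexagon touches the centre and its two ring neighbours.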
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")

# 4) calculate direct neighbors based on different objects
# 4a) based on an object of class "sTopol"
dNeigh <- sNeighDirect(sObj=sTopol)
# 4b) based on an object of class "sInit"
# dNeigh <- sNeighDirect(sObj=sI)
sPipeline carries out ab initio training for the input data. It returns an object of class "sMap".
sPipeline(
  data,
  xdim = NULL,
  ydim = NULL,
  nHex = NULL,
  lattice = c("hexa", "rect"),
  shape = c("suprahex", "sheet", "triangle", "diamond", "hourglass", "trefoil",
    "ladder", "butterfly", "ring", "bridge"),
  scaling = 5,
  init = c("linear", "uniform", "sample"),
  seed = 825,
  algorithm = c("batch", "sequential"),
  alphaType = c("invert", "linear", "power"),
  neighKernel = c("gaussian", "bubble", "cutgaussian", "ep", "gamma"),
  finetuneSustain = FALSE,
  verbose = TRUE
)
data | a data frame or matrix of input data |
xdim | an integer specifying x-dimension of the grid |
ydim | an integer specifying y-dimension of the grid |
nHex | the number of hexagons/rectangles in the grid |
lattice | the grid lattice, either "hexa" for a hexagon or "rect" for a rectangle |
shape | the grid shape, either "suprahex" for a supra-hexagonal grid or "sheet" for a hexagonal/rectangle sheet. Also supported are suprahex's variants (including "triangle" for the triangle-shaped variant, "diamond" for the diamond-shaped variant, "hourglass" for the hourglass-shaped variant, "trefoil" for the trefoil-shaped variant, "ladder" for the ladder-shaped variant, "butterfly" for the butterfly-shaped variant, "ring" for the ring-shaped variant, and "bridge" for the bridge-shaped variant) |
scaling | the scaling factor. Only used when automatically estimating the grid dimension from the input data matrix. By default, it is 5 (big map). Other suggested values: 1 for a small map, and 3 for a medium map |
init | an initialisation method. It can be one of "uniform", "sample" and "linear" initialisation methods |
seed | an integer specifying the seed |
algorithm | the training algorithm. It can be either the "sequential" or the "batch" algorithm. By default, it uses the 'batch' algorithm purely because of its fast computations (probably also without compromising accuracy). However, it is highly recommended not to use the 'batch' algorithm if the input data contain lots of zeros, because the matrix multiplication used in the 'batch' algorithm can be problematic in this context. If ample computational resources are at hand, it is always safe to use the 'sequential' algorithm |
alphaType | the alpha type. It can be one of "invert", "linear" and "power" alpha types |
neighKernel | the training neighborhood kernel. It can be one of "gaussian", "bubble", "cutgaussian", "ep" and "gamma" kernels |
finetuneSustain | logical indicating whether to sustain the "finetune" training. If TRUE, the "finetune" stage is repeated until the mean quantization error gets worse. By default, it is set to FALSE |
verbose | logical indicating whether messages will be displayed on the screen. By default, it is set to TRUE so that messages are displayed |
an object of class "sMap", a list with the following components:
nHex: the total number of hexagons/rectangles in the grid
xdim: x-dimension of the grid
ydim: y-dimension of the grid
r: the hypothetical radius of the grid
lattice: the grid lattice
shape: the grid shape
coord: a matrix of nHex x 2, with rows corresponding to the coordinates of all hexagons/rectangles in the 2D map grid
ig: the igraph object
polygon: a tibble of 7 columns ('x','y','index','node','edge','stepCentroid','angleCentroid') storing polygon location per hexagon
init: an initialisation method
neighKernel: the training neighborhood kernel
codebook: a codebook matrix of nHex x ncol(data), with rows corresponding to prototype vectors in input high-dimensional space
hits: a vector of length nHex, each element giving the number of input data vectors that hit the corresponding hexagon/rectangle
mqe: the mean quantization error for the "best" BMH
data: an input data matrix (with rownames and colnames added if NULL)
response: a tibble of 3 columns ('did' for rownames of input data matrix, 'index', and 'qerr' (quantization error; the distance to the "best" BMH))
call: the call that produced this result
The pipeline sequentially consists of:
i) sTopology used to define the topology of a grid (with "suprahex" shape by default) according to the input data;
ii) sInitial used to initialise the codebook matrix given the pre-defined topology and the input data (by default using the "linear" initialisation method, the first option in the usage above);
iii) sTrainology and sTrainSeq or sTrainBatch used to get the grid map trained at both "rough" and "finetune" stages. If instructed, sustain the "finetune" training until the mean quantization error gets worse;
iv) sBMH used to identify the best-matching hexagons/rectangles (BMH) for the input data, and these response data are appended to the resulting object of "sMap" class.
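The last step of the pipeline, finding the best-matching hexagon for each input vector, can be sketched in base R. This is a simplified illustration rather than the behaviour of sBMH itself: the helper name `bmh_sketch` is hypothetical, Euclidean distance is assumed, and only the single best match per input vector is kept. The returned names mirror the documented 'index', 'qerr', 'hits' and 'mqe' outputs.

```r
# Hypothetical helper (not part of supraHex): best-matching node per input vector.
bmh_sketch <- function(data, codebook) {
  data <- as.matrix(data)
  codebook <- as.matrix(codebook)
  # squared Euclidean distance from every data row to every codebook row
  d2 <- outer(rowSums(data^2), rowSums(codebook^2), "+") - 2 * data %*% t(codebook)
  d2[d2 < 0] <- 0   # guard against tiny negative values from rounding
  # column with the smallest distance in each row
  index <- max.col(-d2, ties.method = "first")
  qerr <- sqrt(d2[cbind(seq_len(nrow(data)), index)])
  list(index = index,
       qerr = qerr,
       hits = tabulate(index, nbins = nrow(codebook)),
       mqe = mean(qerr))
}

# usage: two prototypes and three input vectors in 2D
codebook <- rbind(c(0, 0), c(10, 10))
data <- rbind(c(1, 0), c(9, 10), c(0, 1))
res <- bmh_sketch(data, codebook)
```

Here `hits` counts how many input vectors fall on each node and `mqe` averages the per-vector quantization errors.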
Hai Fang and Julian Gough. (2014) supraHex: an R/Bioconductor package for tabular omics data analysis using a supra-hexagonal map. Biochemical and Biophysical Research Communications, 443(1), 285-289.
sTopology, sInitial, sTrainology, sTrainSeq, sTrainBatch, sBMH, visHexMulComp
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")

## Not run: 
# 2) train with the default setup but with different neighborhood kernels
# 2a) with "gaussian" kernel
sMap <- sPipeline(data=data, neighKernel="gaussian")
# 2b) with "bubble" kernel
# sMap <- sPipeline(data=data, neighKernel="bubble")
# 2c) with "cutgaussian" kernel
# sMap <- sPipeline(data=data, neighKernel="cutgaussian")
# 2d) with "ep" kernel
# sMap <- sPipeline(data=data, neighKernel="ep")
# 2e) with "gamma" kernel
# sMap <- sPipeline(data=data, neighKernel="gamma")

# 3) visualise multiple component planes of a supra-hexagonal grid
visHexMulComp(sMap, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))

# 4) train with the default setup but using the shape "trefoil" and the "sequential" algorithm
sMap <- sPipeline(data=data, shape="trefoil", algorithm=c("batch","sequential")[2])
visHexMulComp(sMap, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))

library(ggraph)
ggraph(sMap$ig, layout=sMap$coord) +
  geom_edge_link() +
  geom_node_circle(aes(r=0.4),fill='white') +
  coord_fixed(ratio=1) +
  geom_node_text(aes(label=name), size=2)

## End(Not run)
sTopology defines the topology of a 2D map grid. The topological shape can be either a supra-hexagonal grid or a hexagonal/rectangle sheet. It returns an object of "sTopol" class, containing: the total number of hexagons/rectangles in the grid, the grid xy-dimensions, the grid lattice, the grid shape, and the 2D coordinates of all hexagons/rectangles in the grid. The 2D coordinates can be directly used to measure distances between any pair of lattice hexagons/rectangles.
sTopology(
  data = NULL,
  xdim = NULL,
  ydim = NULL,
  nHex = NULL,
  lattice = c("hexa", "rect"),
  shape = c("suprahex", "sheet", "triangle", "diamond", "hourglass", "trefoil",
    "ladder", "butterfly", "ring", "bridge"),
  scaling = 5
)
data | a data frame or matrix of input data |
xdim | an integer specifying x-dimension of the grid |
ydim | an integer specifying y-dimension of the grid |
nHex | the number of hexagons/rectangles in the grid |
lattice | the grid lattice, either "hexa" for a hexagon or "rect" for a rectangle |
shape | the grid shape, either "suprahex" for a supra-hexagonal grid or "sheet" for a hexagonal/rectangle sheet. Also supported are suprahex's variants (including "triangle" for the triangle-shaped variant, "diamond" for the diamond-shaped variant, "hourglass" for the hourglass-shaped variant, "trefoil" for the trefoil-shaped variant, "ladder" for the ladder-shaped variant, "butterfly" for the butterfly-shaped variant, "ring" for the ring-shaped variant, and "bridge" for the bridge-shaped variant) |
scaling | the scaling factor. Only used when automatically estimating the grid dimension from the input data matrix. By default, it is 5 (big map). Other suggested values: 1 for a small map, and 3 for a medium map |
an object of class "sTopol", a list with the following components:
nHex: the total number of hexagons/rectangles in the grid. It is not always the same as the input nHex (if any); see "Note" below for the explanation
xdim: x-dimension of the grid
ydim: y-dimension of the grid
r: the hypothetical radius of the grid
lattice: the grid lattice
shape: the grid shape
coord: a matrix of nHex x 2, with each row corresponding to the coordinates of a hexagon/rectangle in the 2D map grid
ig: the igraph object
call: the call that produced this result
The output of nHex depends on the input arguments and grid shape:
The input parameters used to determine nHex take priority in the following order: "xdim & ydim" > "nHex" > "data"
If both xdim and ydim are given, nHex = xdim*ydim for the "sheet" shape; r = (xdim+1)/2 and nHex = 1+6*r*(r-1)/2 for the "suprahex" shape
If only data is input, nHex = scaling*sqrt(dlen), where dlen is the number of rows of the input data, and scaling can be 5 (big map), 3 (medium map) and 1 (small map)
With nHex in hand, it depends on the grid shape:
For "sheet" shape, the xy-dimensions of the sheet grid are determined according to the square roots of the two biggest eigenvalues of the input data
For "suprahex" shape, see sHexGrid for calculating the grid radius r. The xdim (and ydim) is related to r via xdim = 2*r-1
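The sizing rules for the "suprahex" shape can be checked with a few lines of base R. The sketch assumes the relations r = (xdim+1)/2, nHex = 1+6*r*(r-1)/2 and xdim = 2*r-1, which agree with the figures quoted in the examples (r=3 gives xdim=ydim=5 and nHex=19; a requested nHex of 50 becomes 61). It further assumes that a requested nHex is rounded up to the next complete supra-hexagon. The helper names are hypothetical and this is not the code of sHexGrid.

```r
# Hypothetical helpers (not part of supraHex) for the "suprahex" sizing rules.

# number of hexagons in a supra-hexagon of radius r
suprahex_count <- function(r) 1 + 6 * r * (r - 1) / 2

# assumption: smallest radius whose supra-hexagon holds at least nHex hexagons
suprahex_radius <- function(nHex) {
  r <- 1
  while (suprahex_count(r) < nHex) r <- r + 1
  r
}

# priority order: "xdim & ydim" > "nHex" > "data" (through its row count dlen)
suprahex_size <- function(xdim = NULL, ydim = NULL, nHex = NULL, dlen = NULL, scaling = 5) {
  if (!is.null(xdim) && !is.null(ydim)) {
    r <- (xdim + 1) / 2
  } else if (!is.null(nHex)) {
    r <- suprahex_radius(nHex)
  } else if (!is.null(dlen)) {
    r <- suprahex_radius(scaling * sqrt(dlen))
  } else {
    r <- 3   # default quoted in the examples: r=3, nHex=19
  }
  list(r = r, nHex = suprahex_count(r), xdim = 2 * r - 1, ydim = 2 * r - 1)
}

# usage: 100 input rows with the default scaling of 5
str(suprahex_size(dlen = 100))
```

With 100 rows the requested size is 5*sqrt(100) = 50, and the next complete supra-hexagon has radius 5 and 61 hexagons.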
# For "suprahex" shape
sTopol <- sTopology(xdim=3, ydim=3, lattice="hexa", shape="suprahex")

# Error: "The suprahex shape grid only allows for hexagonal lattice"
# sTopol <- sTopology(xdim=3, ydim=3, lattice="rect", shape="suprahex")

# For "sheet" shape with hexagonal lattice
sTopol <- sTopology(xdim=3, ydim=3, lattice="hexa", shape="sheet")

# For "sheet" shape with rectangle lattice
sTopol <- sTopology(xdim=3, ydim=3, lattice="rect", shape="sheet")

# By default, nHex=19 (i.e., r=3; xdim=ydim=5) for "suprahex" shape
sTopol <- sTopology(shape="suprahex")

# By default, xdim=ydim=5 (i.e., nHex=25) for "sheet" shape
sTopol <- sTopology(shape="sheet")

# Determine the topology of a supra-hexagonal grid based on input data
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")
# sTopol <- sTopology(data=data, lattice="hexa", shape="trefoil")

# do visualisation
visHexMapping(sTopol,mappingType="indexes")

## Not run: 
library(ggplot2)
# another way to do visualisation
df_polygon <- sHexPolygon(sTopol)
df_coord <- data.frame(sTopol$coord, index=1:nrow(sTopol$coord))
gp <- ggplot(data=df_polygon, aes(x,y,group=index)) +
  geom_polygon(aes(fill=factor(stepCentroid%%2))) +
  coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") +
  geom_text(data=df_coord, aes(x,y,label=index), color="white")

library(ggraph)
ggraph(sTopol$ig, layout=sTopol$coord) +
  geom_edge_link() +
  geom_node_circle(aes(r=0.4),fill='white') +
  coord_fixed(ratio=1) +
  geom_node_text(aes(label=name), size=2)

## End(Not run)
# For "suprahex" shape sTopol <- sTopology(xdim=3, ydim=3, lattice="hexa", shape="suprahex") # Error: "The suprahex shape grid only allows for hexagonal lattice" # sTopol <- sTopology(xdim=3, ydim=3, lattice="rect", shape="suprahex") # For "sheet" shape with hexagonal lattice sTopol <- sTopology(xdim=3, ydim=3, lattice="hexa", shape="sheet") # For "sheet" shape with rectangle lattice sTopol <- sTopology(xdim=3, ydim=3, lattice="rect", shape="sheet") # By default, nHex=19 (i.e., r=3; xdim=ydim=5) for "suprahex" shape sTopol <- sTopology(shape="suprahex") # By default, xdim=ydim=5 (i.e., nHex=25) for "sheet" shape sTopol <- sTopology(shape="sheet") # Determine the topolopy of a supra-hexagonal grid based on input data # 1) generate an iid normal random matrix of 100x10 data <- matrix(rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10) # 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50, # but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex") # sTopol <- sTopology(data=data, lattice="hexa", shape="trefoil") # do visualisation visHexMapping(sTopol,mappingType="indexes") ## Not run: library(ggplot2) # another way to do visualisation df_polygon <- sHexPolygon(sTopol) df_coord <- data.frame(sTopol$coord, index=1:nrow(sTopol$coord)) gp <- ggplot(data=df_polygon, aes(x,y,group=index)) + geom_polygon(aes(fill=factor(stepCentroid%%2))) + coord_fixed(ratio=1) + theme_void() + theme(legend.position="none") + geom_text(data=df_coord, aes(x,y,label=index), color="white") library(ggraph) ggraph(sTopol$ig, layout=sTopol$coord) + geom_edge_link() + geom_node_circle(aes(r=0.4),fill='white') + coord_fixed(ratio=1) + geom_node_text(aes(label=name), size=2) ## End(Not run)
sTrainBatch
performs the batch training algorithm. It
requires three inputs: a "sMap" or "sInit" object, input data, and a
"sTrain" object specifying the training environment. The training is
implemented iteratively but, instead of choosing a single input vector,
the whole input matrix is used. In each training cycle, every row of the
input matrix first lands on the map by identifying its corresponding
winner hexagon/rectangle (BMH), and then the codebook matrix is updated
via the updating formula (see "Note" below for details). It returns an
object of class "sMap".
sTrainBatch(sMap, data, sTrain, verbose = TRUE)
sMap |
an object of class "sMap" or "sInit" |
data |
a data frame or matrix of input data |
sTrain |
an object of class "sTrain" |
verbose |
logical indicating whether messages are displayed on the screen. By default, it is set to TRUE |
an object of class "sMap", a list with following components:
nHex
: the total number of hexagons/rectangles in the grid
xdim
: x-dimension of the grid
ydim
: y-dimension of the grid
r
: the hypothetical radius of the grid
lattice
: the grid lattice
shape
: the grid shape
coord
: a matrix of nHex x 2, with each row corresponding
to the coordinates of a hexagon/rectangle in the 2D map grid
ig
: the igraph object
init
: an initialisation method
neighKernel
: the training neighborhood kernel
codebook
: a codebook matrix of nHex x ncol(data), with
each row corresponding to a prototype vector in input high-dimensional
space
call
: the call that produced this result
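Updating formula is: $m_i(t+1) = \frac{\sum_{j=1}^{dlen} h_{wi}(t)\, x_j}{\sum_{j=1}^{dlen} h_{wi}(t)}$, where
$t$ denotes the training time/step
$x_j$ is an input vector $j$ from the input data matrix (with $dlen$ rows in total)
$i$ and $w$ stand for the hexagon/rectangle $i$ and the winner BMH $w$, respectively
$m_i(t+1)$ is the prototype vector of the hexagon $i$ at time $t+1$
$h_{wi}(t)$ is the neighborhood kernel, a non-increasing function of i) the distance $d_{wi}$ between the hexagon/rectangle $i$ and the winner BMH $w$, and ii) the radius $\delta_t$ at time $t$. There are five kernels available:
For "gaussian" kernel, $h_{wi}(t) = e^{-d_{wi}^2/(2\delta_t^2)}$
For "cutgaussian" kernel, $h_{wi}(t) = e^{-d_{wi}^2/(2\delta_t^2)} \cdot (d_{wi} \le \delta_t)$
For "bubble" kernel, $h_{wi}(t) = (d_{wi} \le \delta_t)$
For "ep" kernel, $h_{wi}(t) = (1 - d_{wi}^2/\delta_t^2) \cdot (d_{wi} \le \delta_t)$
For "gamma" kernel, $h_{wi}(t) = 1/\Gamma(d_{wi}^2/(4\delta_t^2) + 2)$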
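The batch update can be sketched in a few lines of base R: every input row is assigned to its BMH, and each prototype is then replaced by the kernel-weighted mean of all input rows. This is an illustrative sketch, not the package's implementation. The function names, the toy 3-node grid and the kernel forms (conventional SOM definitions, with only three of the five kernels shown) are assumptions.

```r
# Neighborhood kernel as a function of grid distance d and radius delta.
# Assumed conventional forms; only three kernels are sketched here.
neigh_kernel <- function(d, delta, type = c("gaussian", "cutgaussian", "bubble")) {
  type <- match.arg(type)
  switch(type,
    gaussian    = exp(-d^2 / (2 * delta^2)),
    cutgaussian = exp(-d^2 / (2 * delta^2)) * (d <= delta),
    bubble      = 1 * (d <= delta)
  )
}

# One batch cycle (hypothetical helper): map every input row to its BMH, then
# replace each prototype by the kernel-weighted mean of all input rows.
batch_step <- function(data, codebook, grid_dist, delta, type = "gaussian") {
  # squared Euclidean distance between every input row and every prototype
  d2 <- outer(rowSums(data^2), rowSums(codebook^2), "+") - 2 * data %*% t(codebook)
  winner <- max.col(-d2, ties.method = "first")   # BMH index per input row
  # H[j, i] = kernel weight of hexagon i given the winner of input row j
  H <- neigh_kernel(grid_dist[winner, , drop = FALSE], delta, type)
  (t(H) %*% data) / colSums(H)                    # nHex x ncol(data)
}

# Toy run: 100 x 2 data on a hypothetical 3-node grid laid out on a line
set.seed(1)
data <- matrix(rnorm(100 * 2), ncol = 2)
grid_dist <- as.matrix(dist(cbind(c(0, 1, 2), 0)))
codebook <- data[1:3, , drop = FALSE]
codebook_new <- batch_step(data, codebook, grid_dist, delta = 1)
dim(codebook_new)
```

With the "bubble" or "cutgaussian" kernel a hexagon can receive zero total weight in a cycle; the sketch does not guard against that case.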
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")

# 4) define trainology at "rough" stage
sT_rough <- sTrainology(sMap=sI, data=data, stage="rough")

# 5) training at "rough" stage
sM_rough <- sTrainBatch(sMap=sI, data=data, sTrain=sT_rough)

# 6) define trainology at "finetune" stage
sT_finetune <- sTrainology(sMap=sI, data=data, stage="finetune")

# 7) training at "finetune" stage (starting from the rough-trained map)
sM_finetune <- sTrainBatch(sMap=sM_rough, data=data, sTrain=sT_finetune)
sTrainology
defines the train-ology (i.e., the
training environment/parameters). The trainology here refers to the
training algorithm, the training stage, the stage-specific parameters
(alpha type, initial alpha, initial radius, final radius and train
length), and the training neighbor kernel used. It returns an object of
class "sTrain".
sTrainology(
  sMap,
  data,
  algorithm = c("batch", "sequential"),
  stage = c("rough", "finetune", "complete"),
  alphaType = c("invert", "linear", "power"),
  neighKernel = c("gaussian", "bubble", "cutgaussian", "ep", "gamma")
)
sMap |
an object of class "sMap" or "sInit" |
data |
a data frame or matrix of input data |
algorithm |
the training algorithm. It can be one of "sequential" and "batch" algorithm |
stage |
the training stage. The training can be achieved using two stages (i.e., "rough" and "finetune") or one stage only (i.e., "complete") |
alphaType |
the alpha type. It can be one of "invert", "linear" and "power" alpha types |
neighKernel |
the training neighbor kernel. It can be one of "gaussian", "bubble", "cutgaussian", "ep" and "gamma" kernels |
an object of class "sTrain", a list with following components:
algorithm
: the training algorithm
stage
: the training stage
alphaType
: the alpha type
alphaInitial
: the initial alpha
radiusInitial
: the initial radius
radiusFinal
: the final radius
neighKernel
: the neighbor kernel
call
: the call that produced this result
Training stage-specific parameters:
"radiusInitial": it depends on the grid shape and training stage
For "sheet" shape: it equals $\max(1, \lceil \max(xdim, ydim)/8 \rceil)$ at "rough" or "complete" stage, and $\max(1, \lceil \max(xdim, ydim)/32 \rceil)$ at "finetune" stage
For "suprahex" shape: it equals $\max(1, \lceil r/2 \rceil)$ at "rough" or "complete" stage, and $\max(1, \lceil r/8 \rceil)$ at "finetune" stage
"radiusFinal": it depends on the training stage
At "rough" stage, it equals $radiusInitial/4$
At "finetune" or "complete" stage, it equals $1$
"trainLength": how many times the whole input data are presented for training. It depends on the training stage and training algorithm
At "rough" stage, it equals $\max(1, 10 \cdot trainDepth)$
At "finetune" stage, it equals $\max(1, 40 \cdot trainDepth)$
At "complete" stage, it equals $\max(1, 50 \cdot trainDepth)$
When using the "batch" algorithm and the trainLength equals 1 according to the above equation, the trainLength is forced to be 2 unless $radiusInitial$ equals $radiusFinal$
Where $trainDepth$ is the training depth, defined as $nHex/dlen$, i.e., how many hexagons/rectangles are used per the input data length (here $dlen$ refers to the number of rows)
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")

# 4) define trainology at different stages
# 4a) define trainology at "rough" stage
sT_rough <- sTrainology(sMap=sI, data=data, stage="rough")
# 4b) define trainology at "finetune" stage
sT_finetune <- sTrainology(sMap=sI, data=data, stage="finetune")
# 4c) define trainology using "complete" stage
sT_complete <- sTrainology(sMap=sI, data=data, stage="complete")
sTrainSeq
performs the sequential training algorithm.
It requires three inputs: a "sMap" or "sInit" object, input data, and a
"sTrain" object specifying the training environment. The training is
implemented iteratively, each training cycle consisting of: i) randomly
choosing one input vector; ii) determining the winner hexagon/rectangle
(BMH) according to the minimum distance of the codebook matrix to the input
vector; iii) updating the codebook matrix of the BMH and its neighbors via
the updating formula (see "Note" below for details). It also returns an
object of class "sMap".
sTrainSeq(sMap, data, sTrain, seed = 825, verbose = TRUE)
sMap |
an object of class "sMap" or "sInit" |
data |
a data frame or matrix of input data |
sTrain |
an object of class "sTrain" |
seed |
an integer specifying the seed |
verbose |
logical indicating whether messages are displayed on the screen. By default, it is set to TRUE |
an object of class "sMap", a list with following components:
nHex
: the total number of hexagons/rectangles in the grid
xdim
: x-dimension of the grid
ydim
: y-dimension of the grid
r
: the hypothetical radius of the grid
lattice
: the grid lattice
shape
: the grid shape
coord
: a matrix of nHex x 2, with each row corresponding
to the coordinates of a hexagon/rectangle in the 2D map grid
ig
: the igraph object
init
: an initialisation method
neighKernel
: the training neighborhood kernel
codebook
: a codebook matrix of nHex x ncol(data), with
each row corresponding to a prototype vector in input high-dimensional
space
call
: the call that produced this result
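Updating formula is: $m_i(t+1) = m_i(t) + \alpha(t)\, h_{wi}(t)\, [x(t) - m_i(t)]$, where
$t$ denotes the training time/step
$i$ and $w$ stand for the hexagon/rectangle $i$ and the winner BMH $w$, respectively
$x(t)$ is an input vector randomly chosen (from the input data) at time $t$
$m_i(t)$ and $m_i(t+1)$ are respectively the prototype vectors of the hexagon $i$ at time $t$ and $t+1$
$\alpha(t)$ is the learning rate at time $t$. There are three types of learning rate functions:
For "linear" function, $\alpha(t) = \alpha_0 (1 - t/T)$
For "power" function, $\alpha(t) = \alpha_0 (0.005/\alpha_0)^{t/T}$
For "invert" function, $\alpha(t) = \alpha_0 / (1 + 100\, t/T)$
Where $\alpha_0$ is the initial learning rate (typically, $\alpha_0 = 0.5$ at "rough" stage, $\alpha_0 = 0.05$ at "finetune" stage), $T$ is the length of training time/step (often being set to input data length, i.e., the total number of rows)
$h_{wi}(t)$ is the neighborhood kernel, a non-increasing function of i) the distance $d_{wi}$ between the hexagon/rectangle $i$ and the winner BMH $w$, and ii) the radius $\delta_t$ at time $t$. There are five kernels available:
For "gaussian" kernel, $h_{wi}(t) = e^{-d_{wi}^2/(2\delta_t^2)}$
For "cutgaussian" kernel, $h_{wi}(t) = e^{-d_{wi}^2/(2\delta_t^2)} \cdot (d_{wi} \le \delta_t)$
For "bubble" kernel, $h_{wi}(t) = (d_{wi} \le \delta_t)$
For "ep" kernel, $h_{wi}(t) = (1 - d_{wi}^2/\delta_t^2) \cdot (d_{wi} \le \delta_t)$
For "gamma" kernel, $h_{wi}(t) = 1/\Gamma(d_{wi}^2/(4\delta_t^2) + 2)$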
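A sequential training step can likewise be sketched in base R: one randomly chosen input vector pulls its BMH, and to a lesser extent the BMH's grid neighbors, towards itself, with a learning rate that decays over time. This is an illustrative sketch rather than the package's code. The function names, the toy 3-node grid, the gaussian kernel and the decay constants 0.005 and 100 (borrowed from common SOM practice) are assumptions.

```r
# Learning rate at step t out of tmax steps, starting from alpha0.
# The constants 0.005 and 100 are assumptions, not taken from the package.
learning_rate <- function(t, tmax, alpha0, type = c("linear", "power", "invert")) {
  type <- match.arg(type)
  switch(type,
    linear = alpha0 * (1 - t / tmax),
    power  = alpha0 * (0.005 / alpha0)^(t / tmax),
    invert = alpha0 / (1 + 100 * t / tmax)
  )
}

# One sequential step (hypothetical helper) with a gaussian kernel:
# move the BMH and its grid neighbors towards the input vector x.
seq_step <- function(x, codebook, grid_dist, alpha, delta) {
  w <- which.min(colSums((t(codebook) - x)^2))    # winner BMH for x
  h <- exp(-grid_dist[w, ]^2 / (2 * delta^2))     # kernel weight per hexagon
  x_mat <- matrix(x, nrow(codebook), length(x), byrow = TRUE)
  codebook + alpha * h * (x_mat - codebook)
}

# Toy run on a hypothetical 3-node grid laid out on a line
set.seed(825)
data <- matrix(rnorm(100 * 2), ncol = 2)
grid_dist <- as.matrix(dist(cbind(c(0, 1, 2), 0)))
codebook <- data[1:3, , drop = FALSE]
tmax <- nrow(data)
for (step in seq_len(tmax)) {
  x <- data[sample(nrow(data), 1), ]              # randomly chosen input vector
  codebook <- seq_step(x, codebook, grid_dist,
                       alpha = learning_rate(step, tmax, alpha0 = 0.5, type = "invert"),
                       delta = 1)
}
codebook
```

Unlike the batch algorithm, the result depends on the order in which input vectors are drawn, which is why a seed is fixed before the loop.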
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) from this input matrix, determine nHex=5*sqrt(nrow(data))=50,
# but it returns nHex=61, via "sHexGrid(nHex=50)", to make sure a supra-hexagonal grid
sTopol <- sTopology(data=data, lattice="hexa", shape="suprahex")

# 3) initialise the codebook matrix using "uniform" method
sI <- sInitial(data=data, sTopol=sTopol, init="uniform")

# 4) define trainology at "rough" stage
sT_rough <- sTrainology(sMap=sI, data=data, algorithm="sequential", stage="rough")

# 5) training at "rough" stage
sM_rough <- sTrainSeq(sMap=sI, data=data, sTrain=sT_rough)

# 6) define trainology at "finetune" stage
sT_finetune <- sTrainology(sMap=sI, data=data, algorithm="sequential", stage="finetune")

# 7) training at "finetune" stage (starting from the rough-trained map)
sM_finetune <- sTrainSeq(sMap=sM_rough, data=data, sTrain=sT_finetune)
sWriteData
writes out, for each row of the input data, its best-matching hexagon
and/or cluster base.
sWriteData(sMap, data, sBase = NULL, filename = NULL, keep.data = FALSE)
sMap |
an object of class "sMap" or a codebook matrix |
data |
a data frame or matrix of input data |
sBase |
an object of class "sBase" |
filename |
a character string naming a filename |
keep.data |
logical indicating whether to also write out the input data. By default, it is set to FALSE (the data are not kept), since keeping large data sets is expensive |
a data frame with following components:
ID
: ID for data. It inherits the rownames of data (if
exists). Otherwise, it is sequential integer values starting with 1 and
ending with dlen, the total number of rows of the input data
Hexagon_index
: the index for best-matching hexagons
Qerr_distance
: the quantisation error (distance) for
best-matching hexagons
Cluster_base
: optional, it is only appended when sBase is
given. It stores the cluster memberships/bases
data
: optional, it is only appended when keep.data is
true
If "filename" is not NULL, a tab-delimited text file will also be written out. If "sBase" is not NULL and comes from the "sMap" partition, then cluster bases are also appended. If "keep.data" is true, the data will be part of the output.
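Because the output is tab-delimited, it can be read back and summarised with base R alone. The sketch below does not call sWriteData; it builds a small stand-in data frame with the documented columns (ID, Hexagon_index, Qerr_distance), writes it to a temporary file, and tabulates hits per hexagon. The stand-in values, and the assumption that the file carries a header row and no row names, are illustrative only.

```r
# Hypothetical stand-in for the data frame returned by sWriteData
output <- data.frame(
  ID            = paste0("g", 1:6),
  Hexagon_index = c(1, 1, 2, 3, 3, 3),
  Qerr_distance = c(0.8, 1.1, 0.5, 0.9, 1.3, 0.7)
)

# Write a tab-delimited file (assumed layout: header row, no row names)
path <- tempfile(fileext = ".txt")
write.table(output, file = path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back and summarise per best-matching hexagon
res <- read.delim(path, stringsAsFactors = FALSE)
hits <- table(res$Hexagon_index)                               # rows per hexagon
mean_qerr <- tapply(res$Qerr_distance, res$Hexagon_index, mean) # mean error per hexagon
hits
mean_qerr
```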
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

# 2) train using the default setup
sMap <- sPipeline(data=data)

# 3) write data's BMH hitting the trained map
output <- sWriteData(sMap=sMap, data=data, filename="sData_output.txt")

# 4) partition the grid map into cluster bases
sBase <- sDmatCluster(sMap=sMap, which_neigh=1, distMeasure="median", clusterLinkage="average")

# 5) write data's BMH and cluster bases
output <- sWriteData(sMap=sMap, data=data, sBase=sBase, filename="sData_base_output.txt")
visColoralpha
adds transparency (alpha) to
colors.
visColoralpha(col, alpha)
col |
input colors. It can be a vector of R color specifications, such as a color name (as listed by colors()) or a hexadecimal string of the form "#rrggbb" or "#rrggbbaa" |
alpha |
numeric vector of values in the range [0, 1] for alpha transparency channel (0 means transparent and 1 means opaque) |
a vector of colors (after transparency has been added)
none
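For readers who want to see what adding an alpha channel amounts to, the following base R sketch produces "#rrggbbaa" strings from arbitrary color specifications using grDevices only. The helper name add_alpha is hypothetical, and the sketch is not claimed to match how visColoralpha is implemented internally.

```r
# Hypothetical base R helper: attach an alpha channel to colors
add_alpha <- function(col, alpha) {
  stopifnot(all(alpha >= 0 & alpha <= 1))
  rgb_mat <- grDevices::col2rgb(col)            # 3 x n matrix of 0-255 values
  alpha <- rep_len(alpha, ncol(rgb_mat))        # one alpha value per color
  grDevices::rgb(rgb_mat[1, ], rgb_mat[2, ], rgb_mat[3, ],
                 alpha = alpha * 255, maxColorValue = 255)
}

# Half-transparent versions of a named color and a hexadecimal color
add_alpha(c("red", "#0000FF"), alpha = 0.5)
```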
# 1) define "blue-white-red" colormap
palette.name <- visColormap(colormap="bwr")

# 2) use the return function "palette.name" to generate 10 colors spanning "bwr"
col <- palette.name(10)

# 3) add transparent (alpha=0.5)
cols <- visColoralpha(col, alpha=0.5)
visColorbar
defines a colorbar.
visColorbar(
  colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"),
  ncolors = 40,
  zlim = c(0, 1),
  gp = grid::gpar()
)
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified |
zlim |
the minimum and maximum z values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
gp |
an object of class gpar, typically the output from a call to the function gpar (i.e., a list of graphical parameter settings) |
invisibly
none
visColormap, visHexMulComp, visCompReorder
# draw "blue-white-red" colorbar
visColorbar(colormap="bwr")
visColormap
defines a colormap. It returns a
function that takes an integer argument specifying how many
colors to interpolate from the given colormap.
visColormap(
  colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb",
    "heat", "terrain", "topo", "cm")
)
colormap |
short name for the colormap. It can also be a function of 'colorRampPalette' |
palette.name
: a function that takes an integer argument
for generating that number of colors interpolating the given sequence
The input colormap includes:
"jet": jet colormap
"bwr": blue-white-red
"gbr": green-black-red
"wyr": white-yellow-red
"br": black-red
"yr": yellow-red
"wb": white-black
"rainbow": rainbow colormap, that is, red-yellow-green-cyan-blue-magenta
Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkblue-lightblue-lightyellow-darkorange", "darkgreen-white-darkviolet", "darkgreen-lightgreen-lightpink-darkred". A list of standard color names can be found in http://html-color-codes.info/color-names
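The hyphen-separated form can be mimicked in base R by splitting the string on hyphens and handing the pieces to grDevices::colorRampPalette, which also returns a function of the number of colors. This is a sketch of the idea only. The helper name make_palette is hypothetical, and it assumes every name in the string is a color that R itself recognises.

```r
# Hypothetical helper: build a palette function from a hyphen-separated string
make_palette <- function(colormap) {
  cols <- strsplit(colormap, "-", fixed = TRUE)[[1]]   # e.g. "blue" "white" "red"
  grDevices::colorRampPalette(cols)
}

# The returned function interpolates the requested number of colors
palette.name <- make_palette("blue-white-red")
palette.name(10)

# Longer sequences work the same way
make_palette("darkgreen-white-darkviolet")(5)
```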
# 1) define "blue-white-red" colormap
palette.name <- visColormap(colormap="bwr")

# 2) use the return function "palette.name" to generate 10 colors spanning "bwr"
palette.name(10)
visCompReorder
visualises multiple component
planes reordered within a sheet-shape rectangle grid.
visCompReorder(
  sMap,
  sReorder,
  margin = rep(0.1, 4),
  height = 7,
  title.rotate = 0,
  title.xy = c(0.45, 1),
  colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"),
  ncolors = 40,
  zlim = NULL,
  border.color = "transparent",
  gp = grid::gpar(),
  newpage = TRUE
)
sMap |
an object of class "sMap" |
sReorder |
an object of class "sReorder" |
margin |
margins as units of length 4 or 1 |
height |
a numeric value specifying the height of device |
title.rotate |
the rotation of the title |
title.xy |
the coordinates of the title |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified |
zlim |
the minimum and maximum z values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
border.color |
the border color for each hexagon |
gp |
an object of class "gpar". It is the output from a call to the function "gpar" (i.e., a list of graphical parameter settings) |
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
none
visVp, visHexComp, visColorbar, sCompReorder
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")

# 2) sMap resulting from the default setup
sMap <- sPipeline(data=data, shape=c("suprahex","trefoil")[2])

# 3) reorder component planes
sReorder <- sCompReorder(sMap=sMap, amplifier=2, metric="none")

# 4) visualise multiple component planes reordered within a sheet-shape rectangle grid
visCompReorder(sMap=sMap, sReorder=sReorder, margin=rep(0.1,4), height=7,
title.rotate=0, title.xy=c(0.45, 1), colormap="gbr", ncolors=10, zlim=c(-1,1),
border.color="transparent")
visDmatCluster
visualises clusters/bases
partitioned from a supra-hexagonal grid.
visDmatCluster(
  sMap,
  sBase,
  height = 7,
  margin = rep(0.1, 4),
  area.size = 1,
  gp = grid::gpar(cex = 0.8, font = 2, col = "black"),
  border.color = "transparent",
  fill.color = NULL,
  lty = 1,
  lwd = 1,
  lineend = "round",
  linejoin = "round",
  colormap = c("rainbow", "jet", "bwr", "gbr", "wyr", "br", "yr", "wb"),
  clip = c("on", "inherit", "off"),
  newpage = TRUE
)
sMap |
an object of class "sMap" |
sBase |
an object of class "sBase" |
height |
a numeric value specifying the height of device |
margin |
margins as units of length 4 or 1 |
area.size |
an integer or a vector specifying the area size of each hexagon |
gp |
an object of class "gpar". It is the output from a call to the function "gpar" (i.e., a list of graphical parameter settings) |
border.color |
the border color for each hexagon |
fill.color |
the filled color for each hexagon |
lty |
the line type for each hexagon. 0 for 'blank', 1 for 'solid', 2 for 'dashed', 3 for 'dotted', 4 for 'dotdash', 5 for 'longdash', 6 for 'twodash' |
lwd |
the line width for each hexagon |
lineend |
the line end style for each hexagon. It can be one of 'round', 'butt' and 'square' |
linejoin |
the line join style for each hexagon. It can be one of 'round', 'mitre' and 'bevel' |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
clip |
either "on" for clipping to the extent of this viewport, "inherit" for inheriting the clipping region from the parent viewport, or "off" to turn clipping off altogether |
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
none
sDmatCluster, sDmat, visColormap, visHexGrid
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)

## Not run:
# 2) train using the default setup
sMap <- sPipeline(data=data)

# 3) partition the grid map into clusters using region-growing algorithm
sBase <- sDmatCluster(sMap=sMap, which_neigh=1, distMeasure="median", clusterLinkage="average")

# 4) visualise clusters/bases partitioned from the sMap
visDmatCluster(sMap,sBase)
# 4a) also, the area size is proportional to the hits
visDmatCluster(sMap,sBase, area.size=log2(sMap$hits+1))
# 4b) also, the area size is inversely proportional to the map distance
dMat <- sDmat(sMap)
visDmatCluster(sMap,sBase, area.size=-1*log2(dMat))

# 5) customise the fill color and line type
my_color <- visColormap(colormap="PapayaWhip-pink-Tomato")(length(sBase$seeds))[sBase$bases]
my_lty <- (sBase$bases %% 2)
visDmatCluster(sMap,sBase, fill.color=my_color, lty=my_lty, border.color="black", lwd=2, area.size=0.9)
# also, the area size is inversely proportional to the map distance
visDmatCluster(sMap,sBase, fill.color=my_color, lty=my_lty, border.color="black", lwd=2, area.size=-1*log2(dMat))
## End(Not run)
visDmatHeatmap
visualises gene clusters/bases
partitioned from a supra-hexagonal grid using a heatmap.
visDmatHeatmap(
  sMap,
  data,
  sBase,
  base.color = "rainbow",
  base.separated.arg = NULL,
  base.legend.location = c("none", "bottomleft", "bottomright", "bottom", "left",
    "topleft", "top", "topright", "right", "center"),
  reorderRow = c("none", "hclust", "svd"),
  keep.data = FALSE,
  ...
)
sMap |
an object of class "sMap" or a codebook matrix |
data |
a data frame or matrix of input data |
sBase |
an object of class "sBase" |
base.color |
short name for the colormap used to encode bases (in row side bar). It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
base.separated.arg |
a list of main parameters used for styling bar separated lines. See 'Note' below for details on the parameters |
base.legend.location |
location of legend to describe bases. If "none", this legend will not be displayed |
reorderRow |
the way to reorder the rows within a base. It can be "none" for rows within a base being reordered by the hexagon indexes, "hclust" for rows within a base being reordered according to hierarchical clustering of patterns seen, "svd" for rows within a base being reordered according to svd of patterns seen |
keep.data |
logical indicating whether to also write out the input data. By default, it is set to FALSE (the data are not kept), since keeping large data sets is expensive |
... |
additional graphic parameters used in "visHeatmapAdv". For most parameters, please refer to https://www.rdocumentation.org/packages/gplots/topics/heatmap.2 |
a data frame with following components:
ID
: ID for data. It inherits the rownames of data (if
exists). Otherwise, it is sequential integer values starting with 1 and
ending with dlen, the total number of rows of the input data
Hexagon_index
: the index for best-matching hexagons
Cluster_base
: optional, it is only appended when sBase is
given. It stores the cluster memberships/bases
data
: optional, it is only appended when keep.data is
true
Note: the returned data has rows in the same order as visualised in the heatmap
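Once returned, the data frame can be summarised per base with ordinary data-frame tools. The sketch below uses a small mock of the documented columns (the values are hypothetical, not real output of visDmatHeatmap) to show one way of counting rows per base and collecting IDs per base.

```r
# mock of the documented return value (hypothetical values, not real output)
output <- data.frame(
  ID = paste0("gene", 1:6),
  Hexagon_index = c(3, 3, 7, 8, 12, 12),
  Cluster_base = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# number of rows falling into each base
base_sizes <- table(output$Cluster_base)

# IDs grouped by base, keeping the row order of the data frame
ids_by_base <- split(output$ID, output$Cluster_base)
```

Because the rows are documented to follow the heatmap order, `ids_by_base` would list IDs in the order they are drawn within each base.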
A list of parameters in "base.separated.arg":
"lty": the line type. Line types can either be specified as an integer (0=blank, 1=solid (default), 2=dashed, 3=dotted, 4=dotdash, 5=longdash, 6=twodash) or as one of the character strings "blank","solid","dashed","dotted","dotdash","longdash","twodash", where "blank" uses 'invisible lines' (i.e., does not draw them)
"lwd": the line width
"col": the line color
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
## Not run: 
# 2) train the map using the default setup
sMap <- sPipeline(data=data)
# 3) partition the grid map into clusters using region-growing algorithm
sBase <- sDmatCluster(sMap=sMap, which_neigh=1,
                      distMeasure="median", clusterLinkage="average")
# 4) heatmap visualisation
output <- visDmatHeatmap(sMap, data, sBase,
                         base.legend.location="bottomleft", labRow=NA)
## End(Not run)
visHeatmap
visualises an input data matrix as a
heatmap. Note: this heatmap displays the matrix in a bottom-to-top
direction
visHeatmap( data, scale = c("none", "row", "column"), row.metric = c("none", "pearson", "spearman", "kendall", "euclidean", "manhattan", "cos", "mi"), row.method = c("ward", "single", "complete", "average", "mcquitty", "median", "centroid"), column.metric = c("none", "pearson", "spearman", "kendall", "euclidean", "manhattan", "cos", "mi"), column.method = c("ward", "single", "complete", "average", "mcquitty", "median", "centroid"), colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), ncolors = 64, zlim = NULL, row.cutree = NULL, row.colormap = c("rainbow"), column.cutree = NULL, column.colormap = c("rainbow"), ... )
data |
an input gene-sample data matrix used for heatmap |
scale |
a character indicating when the input matrix should be centered and scaled. It can be one of "none" (no scaling), "row" (being scaled in the row direction), "column" (being scaled in the column direction) |
row.metric |
distance metric used to calculate the distance between rows. It can be one of "none" (i.e. no dendrogram between rows), "pearson", "spearman", "kendall", "euclidean", "manhattan", "cos" and "mi". See details at http://suprahex.r-forge.r-project.org/sDistance.html |
row.method |
the agglomeration method used to cluster rows. This should be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid". See 'Note' below for details |
column.metric |
distance metric used to calculate the distance between columns. It can be one of "none" (i.e. no dendrogram between columns), "pearson", "spearman", "kendall", "euclidean", "manhattan", "cos" and "mi". See details at http://suprahex.r-forge.r-project.org/sDistance.html |
column.method |
the agglomeration method used to cluster columns. This should be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid". See 'Note' below for details |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified over the colormap |
zlim |
the minimum and maximum z/pattern values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
row.cutree |
an integer scalar specifying the desired number of groups being cut from the row dendrogram. Note, this option is only enabled when the row dendrogram is built |
row.colormap |
short name for the colormap to color-code the row groups (i.e. sidebar colors used to annotate the rows) |
column.cutree |
an integer scalar specifying the desired number of groups being cut from the column dendrogram. Note, this option is only enabled when the column dendrogram is built |
column.colormap |
short name for the colormap to color-code the column groups (i.e. sidebar colors used to annotate the columns) |
... |
additional graphic parameters. Type ?heatmap for the complete list. |
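The interplay of colormap, ncolors and zlim can be sketched in base R. This is an illustration only, not the package's code: `parse_colormap` and `value_to_color` are hypothetical helper names, "blue-white-red" is used as the hyphen-separated form, and clamping out-of-range values to the ends of zlim is an assumption made for the sketch.

```r
# turn a hyphen-separated colormap name into a palette function
parse_colormap <- function(name) {
  grDevices::colorRampPalette(strsplit(name, "-", fixed = TRUE)[[1]])
}

# map values onto ncolors equispaced bins spanning zlim
value_to_color <- function(x, zlim, ncolors = 64, colormap = "blue-white-red") {
  cols <- parse_colormap(colormap)(ncolors)
  x <- pmin(pmax(x, zlim[1]), zlim[2])  # clamp out-of-range values (assumption)
  breaks <- seq(zlim[1], zlim[2], length.out = ncolors + 1)
  cols[findInterval(x, breaks, all.inside = TRUE)]
}

# values below/above zlim share the color of the nearest end of the range
cols <- value_to_color(c(-3, -2, 0, 2, 5), zlim = c(-2, 2))
```

A symmetric zlim such as c(-2,2) keeps the middle color of a diverging colormap aligned with zero, which is why the examples in this manual pair zlim=c(-2,2) with diverging colormaps.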
invisible
The following clustering methods are provided:
"ward": Ward's minimum variance method aims at finding compact, spherical clusters
"single": The single linkage method (which is closely related to the minimal spanning tree) adopts a 'friends of friends' clustering strategy
"complete": The complete linkage method finds similar clusters
"average","mcquitty","median","centroid": These methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods. The two methods "median" and "centroid" do not lead to a monotone distance measure; equivalently, the resulting dendrograms can have so-called inversions (which are hard to interpret)
# 1) generate data with an iid matrix of 100 x 9
data <- cbind(matrix(rnorm(100*3,mean=0,sd=1), nrow=100, ncol=3),
              matrix(rnorm(100*3,mean=0.5,sd=1), nrow=100, ncol=3),
              matrix(rnorm(100*3,mean=-0.5,sd=1), nrow=100, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
# 2) prepare colors for the column sidebar
lvs <- unique(colnames(data))
lvs_color <- visColormap(colormap="rainbow")(length(lvs))
my_ColSideColors <- sapply(colnames(data), function(x) lvs_color[x==lvs])
# 3) heatmap with row dendrogram (with 10 color-coded groups)
visHeatmap(data, row.metric="euclidean", row.method="average",
           colormap="gbr", zlim=c(-2,2), ColSideColors=my_ColSideColors,
           row.cutree=10, row.colormap="jet", labRow=NA)
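The remark in the Note for visHeatmap that "median" and "centroid" linkage can produce inversions can be checked with base R. The sketch below is a minimal illustration using stats::hclust directly (not the package's own wrappers) on three hand-picked points; applying centroid linkage to squared Euclidean distances follows the usual convention for that method.

```r
# three points forming a near-equilateral triangle
pts <- rbind(A = c(0, 0), B = c(1, 0), C = c(0.5, 0.9))

# centroid linkage is conventionally applied to squared Euclidean distances
d2 <- stats::dist(pts)^2

hc_centroid <- stats::hclust(d2, method = "centroid")
hc_average <- stats::hclust(d2, method = "average")

# merge heights in merge order: a decrease signals an inversion
inversion_centroid <- is.unsorted(hc_centroid$height)
inversion_average <- is.unsorted(hc_average$height)
```

With these points the second centroid merge happens at a lower height than the first, whereas average linkage yields non-decreasing heights, which is why dendrograms from the former are harder to read.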
visHeatmapAdv
visualises an input data matrix using an
advanced heatmap. It allows for adding multiple sidecolors in both
columns and rows. Besides, the sidecolor can be automatically added via
cutting the dendrogram into groups. Note: this heatmap displays the matrix in a
top-to-bottom direction
visHeatmapAdv( data, scale = c("none", "row", "column"), Rowv = TRUE, Colv = TRUE, dendrogram = c("both", "row", "column", "none"), dist.metric = c("euclidean", "pearson", "spearman", "kendall", "manhattan", "cos", "mi"), linkage.method = c("complete", "ward", "single", "average", "mcquitty", "median", "centroid"), colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), ncolors = 64, zlim = NULL, RowSideColors = NULL, row.cutree = NULL, row.colormap = c("jet"), ColSideColors = NULL, column.cutree = NULL, column.colormap = c("jet"), ... )
data |
an input gene-sample data matrix used for heatmap |
scale |
a character indicating when the input matrix should be centered and scaled. It can be one of "none" (no scaling), "row" (being scaled in the row direction), "column" (being scaled in the column direction) |
Rowv |
determines if and how the row dendrogram should be reordered. By default, it is TRUE, which implies the dendrogram is computed and reordered based on row means. If NULL or FALSE, then no dendrogram is computed and no reordering is done. If a dendrogram, then it is used "as-is", i.e. without any reordering. If a vector of integers, then the dendrogram is computed and reordered based on the order of the vector |
Colv |
determines if and how the column dendrogram should be reordered. It has the same options as the Rowv argument above; additionally, when x is a square matrix, Colv = "Rowv" means that columns should be treated identically to the rows |
dendrogram |
character string indicating whether to draw 'none', 'row', 'column' or 'both' dendrograms. Defaults to 'both'. However, if Rowv (or Colv) is FALSE or NULL and dendrogram is 'both', then a warning is issued and Rowv (or Colv) arguments are honoured |
dist.metric |
distance metric used to calculate the distance between columns (or rows). It can be one of "euclidean", "pearson", "spearman", "kendall", "manhattan", "cos" and "mi". See details at http://suprahex.r-forge.r-project.org/sDistance.html |
linkage.method |
the agglomeration (linkage) method used to cluster columns (or rows). This should be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid". See 'Note' below for details |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified over the colormap |
zlim |
the minimum and maximum z/pattern values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
RowSideColors |
NULL or a matrix of "numRowsidebars" X nrow(x), where "numRowsidebars" stands for the number of sidebars annotating rows of x. This matrix contains the color names for vertical sidebars. By default, it is set to NULL. In this case, sidebars in rows can still be enabled by cutting the row dendrogram into several clusters (see the next two parameters) |
row.cutree |
an integer scalar specifying the desired number of groups being cut from the row dendrogram. Note, this option is only enabled when RowSideColors is NULL |
row.colormap |
short name for the colormap to color-code the row groups (i.e. sidebar colors used to annotate the rows) |
ColSideColors |
NULL or a matrix of ncol(x) X "numColsidebars", where "numColsidebars" stands for the number of sidebars annotating the columns of x. This matrix contains the color names for horizontal sidebars. By default, it is set to NULL. In this case, sidebars in columns can still be enabled by cutting the column dendrogram into several clusters (see the next two parameters) |
column.cutree |
an integer scalar specifying the desired number of groups being cut from the column dendrogram. Note, this option is only enabled when the column dendrogram is built |
column.colormap |
short name for the colormap to color-code the column groups (i.e. sidebar colors used to annotate the columns) |
... |
additional graphic parameters. For most parameters, please refer to https://www.rdocumentation.org/packages/gplots/topics/heatmap.2. For example, the parameters "srtRow" and "srtCol" control the angle of row/column labels, i.e. string rotation (in degrees from horizontal: by default, 45 degrees for the column and 0 degrees for the row). The parameters "offsetRow" and "offsetCol" indicate the number of character-width spaces to place between row/column labels and the edge of the plotting region. Unique to this function, the two parameters "RowSideWidth" and "RowSideLabelLocation" respectively indicate the fraction of the row side width and the location (either bottom or top) of the row side labelling; the two parameters "ColSideHeight" and "ColSideLabelLocation" indicate the column side height and the location (either left or right) of the column side labelling; and the two parameters "RowSideBox" and "ColSideBox" indicate whether boxes are drawn around the sidebars. |
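How a cut dendrogram turns into a row sidebar (the idea behind row.cutree and row.colormap) can be sketched with base R. This is an illustration under assumptions, not the function's internals: grDevices::rainbow stands in for the package colormap, and the resulting matrix simply follows the layout documented for RowSideColors (number of sidebars X number of rows).

```r
set.seed(825)
data <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)

# cluster the rows and cut the dendrogram into k groups
k <- 3
hc_row <- stats::hclust(stats::dist(data), method = "complete")
groups <- stats::cutree(hc_row, k = k)

# one color per group; rainbow() stands in for the package colormap (assumption)
group_colors <- grDevices::rainbow(k)[groups]

# lay the colors out as documented for RowSideColors:
# number of sidebars (rows) x number of data rows (columns)
RowSideColors <- matrix(group_colors, nrow = 1,
                        dimnames = list("Cluster", NULL))
```

A second sidebar would be added as another row of this matrix, whereas ColSideColors is documented with the transposed layout (one column per sidebar).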
invisible
The following clustering/linkage methods are provided:
"ward": Ward's minimum variance method aims at finding compact, spherical clusters
"single": The single linkage method (which is closely related to the minimal spanning tree) adopts a 'friends of friends' clustering strategy
"complete": The complete linkage method finds similar clusters
"average","mcquitty","median","centroid": These methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods. The two methods "median" and "centroid" do not lead to a monotone distance measure; equivalently, the resulting dendrograms can have so-called inversions (which are hard to interpret)
# 1) generate data with an iid matrix of 100 x 9
data <- cbind(matrix(rnorm(100*3,mean=0,sd=1), nrow=100, ncol=3),
              matrix(rnorm(100*3,mean=0.5,sd=1), nrow=100, ncol=3),
              matrix(rnorm(100*3,mean=-0.5,sd=1), nrow=100, ncol=3))
colnames(data) <- c("S1_R1","S1_R2","S1_R3","S2_R1","S2_R2","S2_R3",
                    "S3_R1","S3_R2","S3_R3")
# 2) heatmap after clustering both rows and columns
# 2a) shown with row and column dendrograms
visHeatmapAdv(data, dendrogram="both", colormap="gbr", zlim=c(-2,2),
              KeyValueName="log2(Ratio)",
              add.expr=abline(v=(1:(ncol(data)+1))-0.5,col="white"),
              lmat=rbind(c(4,3), c(2,1)), lhei=c(1,5), lwid=c(1,3))
# 2b) shown with row dendrogram only
visHeatmapAdv(data, dendrogram="row", colormap="gbr", zlim=c(-2,2))
# 2c) shown with column dendrogram only
visHeatmapAdv(data, dendrogram="column", colormap="gbr", zlim=c(-2,2))
# 3) heatmap after only clustering rows (with 2 color-coded groups)
visHeatmapAdv(data, Colv=FALSE, colormap="gbr", zlim=c(-2,2),
              row.cutree=2, row.colormap="jet", labRow=NA)
# 4) prepare colors for the column sidebar
# color for stages (S1-S3)
stages <- sub("_.*","",colnames(data))
sta_lvs <- unique(stages)
sta_color <- visColormap(colormap="rainbow")(length(sta_lvs))
col_stages <- sapply(stages, function(x) sta_color[x==sta_lvs])
# color for replicates (R1-R3)
replicates <- sub(".*_","",colnames(data))
rep_lvs <- unique(replicates)
rep_color <- visColormap(colormap="rainbow")(length(rep_lvs))
col_replicates <- sapply(replicates, function(x) rep_color[x==rep_lvs])
# combine both color vectors
ColSideColors <- cbind(col_stages,col_replicates)
colnames(ColSideColors) <- c("Stages","Replicates")
# 5) heatmap without clustering on rows and columns but with the two sidebars in columns
visHeatmapAdv(data, Rowv=FALSE, Colv=FALSE, colormap="gbr", zlim=c(-2,2),
              density.info="density", tracecol="yellow",
              ColSideColors=ColSideColors, ColSideHeight=0.5,
              ColSideLabelLocation="right")
# 6) legends
legend(0,0.8, legend=rep_lvs, col=rep_color, lty=1, lwd=5, cex=0.6,
       box.col="transparent", horiz=FALSE)
legend(0,0.6, legend=sta_lvs, col=sta_color, lty=1, lwd=5, cex=0.6,
       box.col="transparent", horiz=FALSE)
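The dist.metric and linkage.method arguments of visHeatmapAdv jointly determine the dendrograms. As a rough base-R sketch of that combination, the code below turns Pearson correlations between columns into distances and feeds them to average linkage. Using 1 - r as the distance is a common convention assumed here for illustration; the definition actually used by the package is the one given at the sDistance page referenced above.

```r
set.seed(825)
# toy gene-sample matrix: 10 genes (rows) x 9 samples (columns)
data <- matrix(rnorm(10 * 9), nrow = 10, ncol = 9)
colnames(data) <- paste0("S", 1:9)

# correlation-based distance between columns (assumption: distance = 1 - r)
d_col <- stats::as.dist(1 - stats::cor(data, method = "pearson"))

# agglomerate the columns with average linkage
hc_col <- stats::hclust(d_col, method = "average")

# column order as it would appear along the dendrogram
col_order <- colnames(data)[hc_col$order]
```

Swapping `method = "pearson"` for "spearman" or "kendall" in stats::cor gives the rank-based variants named in the dist.metric argument.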
visHexAnimate
animates multiple component planes
of a supra-hexagonal grid. The output can be a pdf file containing a
list of frames/images, an mp4 video file or a gif file. To support video
output, the software 'ffmpeg' must first be installed (with
its path added to the system PATH variable; see Note). To support gif
output, the software 'ImageMagick' must first be installed (with
its path added to the system PATH variable; see Note).
visHexAnimate( sMap, which.components = NULL, filename = "visHexAnimate", filetype = c("pdf", "mp4", "gif"), image.type = c("jpg", "png"), sec_per_frame = 1, margin = rep(0.1, 4), height = 7, title.rotate = 0, title.xy = c(0.45, 1), colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), ncolors = 40, zlim = NULL, border.color = "transparent", gp = grid::gpar() )
sMap |
an object of class "sMap" |
which.components |
an integer vector specifying which components will be visualised. By default, it is NULL, meaning all components will be visualised |
filename |
the without-extension part of the name of the output file. By default, it is 'visHexAnimate' |
filetype |
the type of the output file, i.e. the extension of the output file name. It can be one of 'pdf' for the pdf file, 'mp4' for the mp4 video file, or 'gif' for the gif file |
image.type |
the type of the image files temporarily generated. It can be either 'jpg' or 'png'. These temporary image files are used for producing the mp4/gif output file. The option exists because sometimes only one of the image types is supported, so that you can choose the one that works |
sec_per_frame |
a numeric value specifying how long (seconds) it takes to stream a frame/image. This argument only works when producing mp4 video or gif file. |
margin |
margins as units of length 4 or 1 |
height |
a numeric value specifying the height of device |
title.rotate |
the rotation of the title |
title.xy |
the coordinates of the title |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified |
zlim |
the minimum and maximum z values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
border.color |
the border color for each hexagon |
gp |
an object of class gpar, typically the output from a call to the function gpar (i.e., a list of graphical parameter settings) |
If specifying the output file name (see argument 'filename' above), the output file is either 'filename.pdf' or 'filename.mp4' or 'filename.gif' in the current working directory. If no output file name specified, by default the output file is either 'visHexAnimate.pdf' or 'visHexAnimate.mp4' or 'visHexAnimate.gif'
When producing mp4 video, this function requires the installation of the software 'ffmpeg' at https://www.ffmpeg.org. Shell command lines for ffmpeg installation in Terminal (for both Linux and Mac) are:
1) wget -O ffmpeg.tar.gz http://www.ffmpeg.org/releases/ffmpeg-2.7.1.tar.gz
2) mkdir ~/ffmpeg && tar xvfz ffmpeg.tar.gz -C ~/ffmpeg --strip-components=1
3) cd ~/ffmpeg
4a) # Assuming you want installation with a ROOT (sudo) privilege: ./configure --disable-yasm
4b) # Assuming you want local installation without ROOT (sudo) privilege: ./configure --disable-yasm --prefix=$HOME/ffmpeg
5) make
6) make install
7) # add the system PATH variable to your ~/.bash_profile file if you follow 4b) route: export PATH=$HOME/ffmpeg:$PATH
8) # make sure ffmpeg has been installed successfully: ffmpeg -h
When producing gif file, this function requires the installation of the software 'ImageMagick' at http://www.imagemagick.org. Shell command lines for ImageMagick installation in Terminal are:
1) wget http://www.imagemagick.org/download/ImageMagick.tar.gz
2) mkdir ~/ImageMagick && tar xvzf ImageMagick.tar.gz -C ~/ImageMagick --strip-components=1
3) cd ~/ImageMagick
4) ./configure --prefix=$HOME/ImageMagick
5) make
6) make install
7) # add the system PATH variable to your ~/.bash_profile file.
For Linux:
export MAGICK_HOME=$HOME/ImageMagick
export PATH=$MAGICK_HOME/bin:$PATH
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MAGICK_HOME/lib
For Mac:
export MAGICK_HOME=$HOME/ImageMagick
export PATH=$MAGICK_HOME/bin:$PATH
export DYLD_LIBRARY_PATH=$MAGICK_HOME/lib/
8a) # check configuration: convert -list configure
8b) # check image format supported: identify -list format
Tips:
Prior to 4), please make sure libjpeg and libpng are installed. If NOT, for Mac try this: brew install libjpeg libpng
To check whether ImageMagick does work, please get additional information from: identify -list format and convert -list configure
For details, please refer to http://www.imagemagick.org/script/advanced-unix-installation.php
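Before requesting mp4 or gif output, it can help to check from within R whether the external tools named in this Note are visible on the PATH. The sketch below uses base R's Sys.which; `pick_filetype` is a hypothetical helper (not part of the package) that falls back to pdf when the required tool is missing. Checking for the `convert` command is an assumption based on the commands shown above; other ImageMagick installations may expose a different command name.

```r
# locate the external tools on the PATH; an empty string means "not found"
tools <- Sys.which(c("ffmpeg", "convert"))

# hypothetical helper: fall back to pdf when the required tool is missing
pick_filetype <- function(preferred = c("pdf", "mp4", "gif"), found = tools) {
  preferred <- match.arg(preferred)
  # external tool needed for each file type (pdf needs none)
  needs <- c(pdf = NA, mp4 = "ffmpeg", gif = "convert")[[preferred]]
  if (is.na(needs) || nzchar(found[[needs]])) preferred else "pdf"
}

filetype <- pick_filetype("mp4")
# visHexAnimate(sMap, filename="visHexAnimate", filetype=filetype)
```

The commented call shows where the chosen value would be passed; it requires a trained sMap object and is therefore left unevaluated here.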
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
              matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
              matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
## Not run: 
# 2) get sMap using the default setup
sMap <- sPipeline(data=data)
# 3) animate sMap
# output as a pdf file
visHexAnimate(sMap, filename="visHexAnimate", filetype="pdf")
# output as an mp4 file
visHexAnimate(sMap, filename="visHexAnimate", filetype="mp4")
# output as a gif file
visHexAnimate(sMap, filename="visHexAnimate", filetype="gif")
## End(Not run)
visHexBarplot
visualises the codebook matrix using a
barplot for all hexagons or a specific one
visHexBarplot( sObj, which.hexagon = NULL, which.hexagon.highlight = NULL, height = 7, margin = rep(0.1, 4), colormap = c("customized", "bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), customized.color = "red", zeropattern.color = "gray", gp = grid::gpar(cex = 0.7, font = 1, col = "black"), bar.text.cex = 0.8, bar.text.srt = 90, newpage = TRUE )
sObj |
an object of class "sMap" or "sTopol" or "sInit" |
which.hexagon |
the integer specifying which hexagon to display. If NULL, all hexagons will be visualised |
which.hexagon.highlight |
an integer vector specifying which hexagons are labelled. If NULL, all hexagons will be labelled |
height |
a numeric value specifying the height of device |
margin |
margins as units of length 4 or 1 |
colormap |
short name for the predefined colormap, and "customized" for custom input (see the next 'customized.color'). The predefined colormap can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
customized.color |
the customized color for pattern visualisation |
zeropattern.color |
the color for the zero horizontal line |
gp |
an object of class "gpar". It is the output from a call to the function "gpar" (i.e., a list of graphical parameter settings) |
bar.text.cex |
a numerical value giving the amount by which bar text should be magnified relative to the default (i.e., 1) |
bar.text.srt |
a numerical value giving the angle by which bar text should be orientated |
newpage |
logical to indicate whether to open a new page. By default, it is set to TRUE to open a new page |
invisible
none
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
              matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
              matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
# 2) get sMap using the default setup
sMap <- sPipeline(data=data)
# 3) plot codebook patterns using different types
# 3a) for all hexagons
visHexBarplot(sMap)
# 3b) only for the first hexagon
visHexBarplot(sMap, which.hexagon=1)
visHexComp
visualises a supra-hexagonal grid in
the context of a viewport
visHexComp( sMap, comp, margin = rep(0.6, 4), area.size = 1, colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), ncolors = 40, zlim = c(0, 1), border.color = "transparent", newpage = TRUE )
sMap |
an object of class "sMap" |
comp |
a component/column of codebook matrix from an object "sMap" |
margin |
margins as units of length 4 or 1 |
area.size |
an integer or a vector specifying the area size of each hexagon |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified |
zlim |
the minimum and maximum z values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
border.color |
the border color for each hexagon |
newpage |
a logical to indicate whether or not to open a new page |
invisible
none
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")
# 2) sMap resulted from using by default setup
sMap <- sPipeline(data=data)
# 3) visualise the first component plane with a supra-hexagonal grid
visHexComp(sMap, comp=sMap$codebook[,1], colormap="jet", ncolors=100, zlim=c(-1,1))
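How zlim and ncolors interact can be illustrated with a small base-R sketch. The helper value_to_color below is hypothetical (it is not part of supraHex), and clamping out-of-range values to the end colours is an assumption made only for this illustration.

```r
# Hypothetical helper, not a supraHex function: assign each value one of
# `ncolors` colours, each colour covering an equispaced interval of `zlim`.
value_to_color <- function(z, zlim = c(0, 1), ncolors = 40,
                           colors = c("blue", "white", "red")) {
  palette <- grDevices::colorRampPalette(colors)(ncolors)
  # Assumption: values outside zlim are clamped to the nearest end colour
  z <- pmin(pmax(z, zlim[1]), zlim[2])
  # ncolors intervals need ncolors + 1 equispaced break points
  breaks <- seq(zlim[1], zlim[2], length.out = ncolors + 1)
  idx <- findInterval(z, breaks, all.inside = TRUE)
  palette[idx]
}

# Three colours over [-1, 1]: one colour per third of the range
value_to_color(c(-1, 0, 1), zlim = c(-1, 1), ncolors = 3)
```

With a narrow zlim, extreme codebook values all collapse onto the two end colours, which is why the examples here pass zlim=c(-1,1) for standard-normal data.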
visHexGrid is supposed to visualise a supra-hexagonal grid
visHexGrid( hbin, area.size = 1, border.color = NULL, fill.color = NULL, lty = 1, lwd = 1, lineend = "round", linejoin = "round" )
hbin |
an object of class "hexbin" |
area.size |
an integer or a vector specifying the area size of each hexagon |
border.color |
the border color for each hexagon |
fill.color |
the filled color for each hexagon |
lty |
the line type for each hexagon. 0 for 'blank', 1 for 'solid', 2 for 'dashed', 3 for 'dotted', 4 for 'dotdash', 5 for 'longdash', 6 for 'twodash' |
lwd |
the line width for each hexagon |
lineend |
the line end style for each hexagon. It can be one of 'round', 'butt' and 'square' |
linejoin |
the line join style for each hexagon. It can be one of 'round', 'mitre' and 'bevel' |
invisible
none
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")
# 2) sMap resulted from using by default setup
sMap <- sPipeline(data=data)
# 3) create an object of "hexbin" class from sMap
dat <- data.frame(sMap$coord)
xdim <- sMap$xdim
ydim <- sMap$ydim
hbin <- hexbin::hexbin(dat$x, dat$y, xbins=xdim-1, shape=sqrt(0.75)*ydim/xdim)
# 4) visualise hbin object
vp <- hexbin::hexViewport(hbin)
visHexGrid(hbin)
visHexMapping is supposed to visualise various mapping items within a supra-hexagonal grid
visHexMapping( sObj, mappingType = c("indexes", "hits", "dist", "antidist", "bases", "customized"), labels = NULL, height = 7, margin = rep(0.1, 4), area.size = 1, gp = grid::gpar(cex = 0.7, font = 1, col = "black"), border.color = NULL, fill.color = "transparent", lty = 1, lwd = 1, lineend = "round", linejoin = "round", clip = c("on", "inherit", "off"), newpage = TRUE )
sObj |
an object of class "sMap" or "sInit" or "sTopol" |
mappingType |
the mapping type, can be "indexes", "hits", "dist", "antidist", "bases", and "customized" |
labels |
NULL or a vector with the length of nHex |
height |
a numeric value specifying the height of device |
margin |
margins as units of length 4 or 1 |
area.size |
an integer or a vector specifying the area size of each hexagon |
gp |
an object of class "gpar". It is the output from a call to the function "gpar" (i.e., a list of graphical parameter settings) |
border.color |
the border color for each hexagon |
fill.color |
the filled color for each hexagon |
lty |
the line type for each hexagon. 0 for 'blank', 1 for 'solid', 2 for 'dashed', 3 for 'dotted', 4 for 'dotdash', 5 for 'longdash', 6 for 'twodash' |
lwd |
the line width for each hexagon |
lineend |
the line end style for each hexagon. It can be one of 'round', 'butt' and 'square' |
linejoin |
the line join style for each hexagon. It can be one of 'round', 'mitre' and 'bevel' |
clip |
either "on" for clipping to the extent of this viewport, "inherit" for inheriting the clipping region from the parent viewport, or "off" to turn clipping off altogether |
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
The mappingType includes:
"indexes": the index of hexagons in a supra-hexagonal grid
"hits": the number of input data vectors hitting the hexagons
"dist": distance (in high-dimensional input space) to neighbors (defined in 2D output space)
"antidist": the opposite version of "dist"
"bases": clusters partitioned from the sMap
"customized": displaying input "labels"
sDmat, sDmatCluster, visHexGrid
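To make the "hits" mapping type concrete, the following base-R sketch counts how many input vectors land on each hexagon. The helper count_hits is hypothetical (not a supraHex function), and it assumes that an input vector "hits" the hexagon whose codebook vector is nearest in Euclidean distance.

```r
# Hypothetical helper, not a supraHex function: count how many input vectors
# have each codebook vector (hexagon) as their best-matching unit.
count_hits <- function(data, codebook) {
  nHex <- nrow(codebook)
  # Assumption: best match = smallest (squared) Euclidean distance
  bmu <- apply(data, 1, function(x) {
    which.min(colSums((t(codebook) - x)^2))
  })
  # tabulate() keeps hexagons with zero hits in the result
  tabulate(bmu, nbins = nHex)
}

# Toy codebook with three hexagons and four input vectors
codebook <- rbind(c(0, 0), c(1, 1), c(5, 5))
data <- rbind(c(0.1, 0), c(0.9, 1.2), c(1, 1), c(4, 6))
count_hits(data, codebook)
```

The resulting vector has one entry per hexagon, in the same order as the "indexes" mapping.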
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
# 2) sMap resulted from using by default setup
sMap <- sPipeline(data=data)
# 3) visualise supported mapping items within a supra-hexagonal grid
# 3a) for indexes of hexagons
visHexMapping(sMap, mappingType="indexes", fill.color="transparent")
# 3b) for the number of input data vectors hitting the hexagons
visHexMapping(sMap, mappingType="hits", fill.color=NULL)
# 3c) for distance (in high-dimensional input space) to neighbors (defined in 2D output space)
visHexMapping(sMap, mappingType="dist")
# 3d) for clusters/bases partitioned from the sMap
visHexMapping(sMap, mappingType="bases")
visHexMulComp is supposed to visualise multiple component planes of a supra-hexagonal grid
visHexMulComp( sMap, which.components = NULL, rect.grid = NULL, margin = rep(0.1, 4), height = 7, title.rotate = 0, title.xy = c(0.45, 1), colormap = c("bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), ncolors = 40, zlim = NULL, border.color = "transparent", gp = grid::gpar(), newpage = TRUE )
sMap |
an object of class "sMap" |
which.components |
an integer vector specifying which components will be visualised. By default, it is NULL meaning all components will be visualised |
rect.grid |
a vector specifying the number of rows and columns for a rectangle grid wherein the component planes are placed. By default, it is NULL (decided on according to the number of component planes that will be visualised) |
margin |
margins as units of length 4 or 1 |
height |
a numeric value specifying the height of device |
title.rotate |
the rotation of the title |
title.xy |
the coordinates of the title |
colormap |
short name for the colormap. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
ncolors |
the number of colors specified |
zlim |
the minimum and maximum z values for which colors should be plotted, defaulting to the range of the finite values of z. Each of the given colors will be used to color an equispaced interval of this range. The midpoints of the intervals cover the range, so that values just outside the range will be plotted |
border.color |
the border color for each hexagon |
gp |
an object of class gpar, typically the output from a call to the function gpar (i.e., a list of graphical parameter settings) |
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
none
visVp, visHexComp, visColorbar
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
# 2) sMap resulted from using by default setup
sMap <- sPipeline(data=data)
# 3) visualise multiple component planes of a supra-hexagonal grid
visHexMulComp(sMap, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))
# 3a) visualise only the first 6 component planes
visHexMulComp(sMap, which.components=1:6, colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))
# 3b) visualise only the first 6 component planes within the rectangle grid of 3 X 2
visHexMulComp(sMap, which.components=1:6, rect.grid=c(3,2), colormap="jet", ncolors=20, zlim=c(-1,1), gp=grid::gpar(cex=0.8))
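When rect.grid is left as NULL, the layout is derived from the number of component planes. The exact rule is not stated here, so the sketch below shows only one plausible choice (a near-square grid); choose_rect_grid is a hypothetical helper and should not be read as the package's own logic.

```r
# Hypothetical rule, not taken from supraHex: pick a near-square layout
# c(rows, columns) with enough cells for n component planes.
choose_rect_grid <- function(n) {
  n_col <- ceiling(sqrt(n))
  n_row <- ceiling(n / n_col)
  c(n_row, n_col)
}

choose_rect_grid(9)  # nine planes, as in the 1000 x 9 example data
choose_rect_grid(6)  # six planes, as selected by which.components=1:6
```

Passing rect.grid explicitly, as in step 3b of the examples, overrides whatever default layout would otherwise be chosen.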
visHexPattern is supposed to visualise codebook matrix or input patterns within a supra-hexagonal grid.
visHexPattern( sObj, plotType = c("lines", "bars", "radars"), pattern = NULL, height = 7, margin = rep(0.1, 4), colormap = c("customized", "bwr", "jet", "gbr", "wyr", "br", "yr", "rainbow", "wb"), customized.color = "red", alterntive.color = c("transparent", "gray"), zeropattern.color = "gray", legend = TRUE, legend.cex = 0.8, legend.label = NULL, newpage = TRUE )
sObj |
an object of class "sMap" or "sTopol" or "sInit" |
plotType |
the plot type, can be "lines" for line/point graph, "bars" for bar graph, "radars" for radar graph |
pattern |
By default, it sets to "NULL" for the codebook matrix. It is intended for the user-input patterns, i.e., a matrix with the dimension of nHex x nPattern, where nHex is the number of hexagons and nPattern is the number of elements for each pattern |
height |
a numeric value specifying the height of device |
margin |
margins as units of length 4 or 1 |
colormap |
short name for the predefined colormap, and "customized" for custom input (see the next 'customized.color'). The predefined colormap can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names |
customized.color |
the customized color for pattern visualisation |
alterntive.color |
the alternative color used to indicate the hexagon layout |
zeropattern.color |
the color for the zero horizontal line |
legend |
logical to indicate whether to add the legend |
legend.cex |
a numerical value giving the amount by which legend text should be magnified relative to the default (i.e., 1) |
legend.label |
a vector specifying the legend label. By default, it is NULL for using column names of the codebook matrix (or the matrix given by the parameter 'pattern') |
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
The "plotType" includes:
"lines": line plot. If multiple colors are given, the points are also plotted. When the pattern involves both positive and negative values, the zero horizontal line is also shown
"bars": bar plot. When the pattern involves both positive and negative values, the zero horizontal line is in the middle of the hexagon; otherwise at the top of the hexagon for all negative values, and at the bottom for all positive values
"radars": radar plot. Each radar diagram represents one pattern, wherein each element value is proportional to the distance from the center. Note, it starts on the right and winds counterclockwise around the circle
# 1) generate data with an iid matrix of 1000 x 9
data <- cbind(matrix(rnorm(1000*3,mean=0,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=0.5,sd=1), nrow=1000, ncol=3),
matrix(rnorm(1000*3,mean=-0.5,sd=1), nrow=1000, ncol=3))
colnames(data) <- c("S1","S1","S1","S2","S2","S2","S3","S3","S3")
# 2) sMap resulted from using by default setup
sMap <- sPipeline(data=data)
# 3) plot codebook patterns using different types
# 3a) line plot
visHexPattern(sMap, plotType="lines")
# 3b) bar plot
visHexPattern(sMap, plotType="bars")
# 3c) radar plot
visHexPattern(sMap, plotType="radars")
# 4) plot user-input patterns using different types
# 4a) generate pattern data with two different groups "S" and "T"
nHex <- sMap$nHex
pattern <- cbind(matrix(runif(nHex*3,min=0,max=1), nrow=nHex, ncol=3),
matrix(runif(nHex*3,min=1,max=2), nrow=nHex, ncol=3))
colnames(pattern) <- c("S1","S2","S3","T1","T2","T3")
# 4b) for line plot
visHexPattern(sMap, plotType="lines", pattern=pattern, customized.color="red", zeropattern.color="gray")
# 4c) for bar plot
visHexPattern(sMap, plotType="bars", pattern=pattern, customized.color=rep(c("red","green"),each=3))
visHexPattern(sMap, plotType="bars", pattern=pattern, customized.color=rep(c("red","green"),each=3), legend.label=c("S","T"))
# 4d) for radar plot
visHexPattern(sMap, plotType="radars", pattern=pattern, customized.color=rep(c("red","green"),each=3))
visHexPattern(sMap, plotType="radars", pattern=pattern, customized.color=rep(c("red","green"),each=3), legend.label=c("S","T"))
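The placement rule for the zero line in "bars" plots can be written down as a short base-R sketch. The helper zero_line_position is hypothetical (not a supraHex function); it applies the stated rule to whatever values are passed in, and treating an all-zero pattern like the all-negative case is an assumption.

```r
# Hypothetical helper, not a supraHex function: where the zero line sits
# for a "bars" plot, following the rule stated for plotType="bars".
zero_line_position <- function(pattern) {
  if (any(pattern > 0) && any(pattern < 0)) {
    "middle"   # mixed signs: bars go both up and down from the centre
  } else if (all(pattern <= 0)) {
    "top"      # all negative values: bars hang down from the top
  } else {
    "bottom"   # all positive values: bars rise from the bottom
  }
}

zero_line_position(c(-0.4, 0.2, 1.1))
zero_line_position(c(0.3, 1.5, 2.0))
```

For the user-input pattern in step 4a of the examples, all values are drawn from positive ranges, so the zero line would sit at the bottom under this rule.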
visKernels is supposed to visualize a series of neighborhood kernels, each of which is a non-increasing function of: i) the distance between the hexagon/rectangle and the winner, and ii) the radius at a given time.
visKernels(newpage = TRUE)
newpage |
logical to indicate whether to open a new page. By default, it sets to true for opening a new page |
invisible
There are five kernels that are currently supported:
For "gaussian" kernel,
For "cutgaussian" kernel,
For "bubble" kernel,
For "ep" kernel,
For "gamma" kernel,
These kernels above are displayed within a plot for each fixed radius. Two different radii (i.e., 1 and 2) are illustrated.
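The kernel formulas themselves did not survive in this rendering of the page. As a rough guide only, the sketch below uses forms that are commonly given for these five kernel names in the self-organising map literature; they are assumptions and may differ from the definitions used inside supraHex. Here d is the distance between a hexagon and the winner, and r is the radius.

```r
# Assumed textbook forms of the five kernels; not copied from supraHex.
kernels <- list(
  gaussian    = function(d, r) exp(-d^2 / (2 * r^2)),
  cutgaussian = function(d, r) exp(-d^2 / (2 * r^2)) * (d <= r),
  bubble      = function(d, r) as.numeric(d <= r),
  ep          = function(d, r) pmax(0, 1 - d^2 / r^2),
  gamma       = function(d, r) 1 / base::gamma(d^2 / (4 * r^2) + 2)
)

d <- seq(0, 4, by = 0.5)
# One row per kernel, one column per distance, for a fixed radius of 1
h <- t(sapply(kernels, function(k) k(d, r = 1)))
round(h, 3)
```

Under these assumed forms every kernel equals 1 at the winner itself (d = 0) and never increases with distance, which matches the non-increasing property stated in the description.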
# visualise currently supported five kernels
visKernels()
visTreeBootstrap is supposed to build the tree, perform bootstrap analysis and visualise the bootstrapped tree. It returns an object of class "phylo". For easy downstream analysis, the bootstrapped tree is rerooted either at the internal node with the minimum bootstrap/confidence value or at any customised internal node.
visTreeBootstrap( data, algorithm = c("nj", "fastme.ols", "fastme.bal"), metric = c("euclidean", "pearson", "spearman", "cos", "manhattan", "kendall", "mi", "binary"), num.bootstrap = 100, consensus = FALSE, consensus.majority = 0.5, reroot = "min.bootstrap", plot.phylo.arg = NULL, nodelabels.arg = NULL, visTree = TRUE, verbose = TRUE, ... )
data |
an input data matrix used to build the tree. The built tree describes the relationships between rows of input matrix |
algorithm |
the tree-building algorithm. It can be one of "nj" for the neighbor-joining tree estimation, "fastme.ols" for the minimum evolution algorithm with ordinary least-squares (OLS) fitting of a metric to a tree structure, and "fastme.bal" for the minimum evolution algorithm under a balanced (BAL) weighting scheme |
metric |
distance metric used to calculate a distance matrix between rows of input matrix. It can be: "pearson" for pearson correlation, "spearman" for spearman rank correlation, "kendall" for kendall tau rank correlation, "euclidean" for euclidean distance, "manhattan" for cityblock distance, "cos" for cosine similarity, "mi" for mutual information |
num.bootstrap |
an integer specifying the number of bootstrap replicates |
consensus |
logical to indicate whether to return the consensus tree. By default, it sets to false for not doing so. Note: if true, there will be no visualisation of the bootstrapped tree |
consensus.majority |
a numeric value between 0.5 and 1 (or between 50 and 100) giving the proportion for a clade to be represented in the consensus tree |
reroot |
determines if and how the bootstrapped tree should be rerooted. By default, it is "min.bootstrap", which implies that the bootstrapped tree will be rerooted at the internal node with the minimum bootstrap/confidence value. If it is an integer between 1 and the number of internal nodes, the tree will be rerooted at the internal node with this index value |
plot.phylo.arg |
a list of main parameters used in the function "ape::plot.phylo" http://rdrr.io/cran/ape/man/plot.phylo.html. See 'Note' below for details on the parameters |
nodelabels.arg |
a list of main parameters used in the function "ape::nodelabels" http://rdrr.io/cran/ape/man/nodelabels.html. See 'Note' below for details on the parameters |
visTree |
logical to indicate whether the bootstrap tree will be visualised. By default, it sets to true for display. Note, the consensus tree can not be enabled for visualisation |
verbose |
logical to indicate whether the messages will be displayed on the screen. By default, it sets to true for display |
... |
additional "ape::plot.phylo" parameters |
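The metric argument determines how a distance matrix between rows is obtained before tree building. The base-R sketch below illustrates the idea for a subset of the listed metrics. The helper row_distance is hypothetical (not a supraHex function), and converting a correlation or cosine similarity s into a distance as 1 - s is an assumption, since the conversion is not spelled out here.

```r
# Hypothetical helper, not a supraHex function: distance matrix between rows.
# Assumption: similarity-type metrics s are converted to distances as 1 - s.
row_distance <- function(data, metric = c("euclidean", "manhattan",
                                          "pearson", "spearman",
                                          "kendall", "cos")) {
  metric <- match.arg(metric)
  if (metric %in% c("euclidean", "manhattan")) {
    as.matrix(stats::dist(data, method = metric))
  } else if (metric == "cos") {
    unit <- data / sqrt(rowSums(data^2))  # scale each row to unit length
    1 - unit %*% t(unit)
  } else {
    1 - stats::cor(t(data), method = metric)
  }
}

set.seed(1)
x <- matrix(rnorm(5 * 10), nrow = 5,
            dimnames = list(paste0("S", 1:5), NULL))
round(row_distance(x, "pearson"), 2)
```

Because the tree describes relationships between rows, samples stored as columns need transposing first, as done with data <- t(data) in the examples.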
an object of class "phylo". It can return a bootstrapped tree or a consensus tree (if enabled). When a bootstrapped tree is returned (also visualised by default), the "phylo" object has a list with the following components:
Nnode: the number of internal nodes
node.label: the labels for internal nodes. Here, each internal node is associated with the bootstrap value
tip.label: the labels for tip nodes. Tip labels come from the row names of the input matrix, but are not necessarily in the same order as they appear in the input matrix
edge: a two-column matrix describing the links between tree nodes (including internal and tip nodes)
edge.length: a vector indicating the edge length in the 'edge'
Note: the tree structure is indexed with 1:Ntip for tip nodes, and (Ntip+1):(Ntip+Nnode) for internal nodes, where Ntip is the number of tip nodes and Nnode is the number of internal nodes.
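The node indexing convention can be tried out without building a tree. The sketch below uses a mock list that mimics only the relevant "phylo" fields; all values are made up for illustration, and the empty label standing for a node without a bootstrap value is an assumption. Ntip and Nnode denote the numbers of tip and internal nodes.

```r
# Mock of the relevant "phylo" fields; the values are made up for illustration
tree <- list(
  tip.label  = c("S3", "S1", "S4", "S2", "S5"),
  node.label = c("", "85", "20", "60"),
  Nnode      = 4
)

Ntip  <- length(tree$tip.label)   # tip nodes are indexed 1:Ntip
Nnode <- tree$Nnode               # internal nodes follow the tips
internal_index <- (Ntip + 1):(Ntip + Nnode)

# Keep internal nodes whose bootstrap value exceeds 30; labels that are not
# numbers (here the empty string) become NA and are dropped by which()
bs <- suppressWarnings(as.numeric(tree$node.label))
flag <- which(bs > 30)
data.frame(node = internal_index[flag], bootstrap = bs[flag])
```

The node and bootstrap columns obtained this way correspond to the node and text entries that can be passed through nodelabels.arg.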
When a consensus tree is returned (no visualisation), the "phylo" object has a list with the following components:
Nnode: the number of internal nodes
tip.label: the labels for tip nodes. Tip labels come from the row names of the input matrix, but are not necessarily in the same order as they appear in the input matrix
edge: a two-column matrix describing the links between tree nodes (including internal and tip nodes)
A list of main parameters used in the function "ape::plot.phylo":
"type": a character string specifying the type of phylogeny to be drawn; it must be one of "phylogram" (the default), "cladogram", "fan", "unrooted", "radial" or any unambiguous abbreviation of these
"direction": a character string specifying the direction of the tree. Four values are possible: "rightwards" (the default), "leftwards", "upwards", and "downwards"
"lab4ut": (= labels for unrooted trees) a character string specifying the display of tip labels for unrooted trees: either "horizontal" where all labels are horizontal (the default), or "axial" where the labels are displayed in the axis of the corresponding terminal branches. This option has an effect only if type = "unrooted"
"edge.color": a vector of mode character giving the colours used to draw the branches of the plotted phylogeny. These are taken to be in the same order as the component edge of phy. If fewer colours are given than the length of edge, then the colours are recycled
"edge.width": a numeric vector giving the width of the branches of the plotted phylogeny. These are taken to be in the same order as the component edge of phy. If fewer widths are given than the length of edge, then these are recycled
"edge.lty": same as the previous argument but for line types; 1: plain, 2: dashed, 3: dotted, 4: dotdash, 5: longdash, 6: twodash
"font": an integer specifying the type of font for the labels: 1 (plain text), 2 (bold), 3 (italic, the default), or 4 (bold italic)
"cex": a numeric value giving the factor scaling of the tip and node labels (Character EXpansion). The default is to take the current value from the graphical parameters
"adj": a numeric specifying the justification of the text strings of the labels: 0 (left-justification), 0.5 (centering), or 1 (right-justification). This option has no effect if type="unrooted". If NULL (the default) the value is set with respect of direction (see details)
"srt": a numeric giving how much the labels are rotated in degrees (negative values are allowed resulting in clock-like rotation); the value has an effect respectively to the value of direction (see Examples). This option has no effect if type="unrooted"
"no.margin": a logical. If TRUE, the margins are set to zero and the plot uses all the space of the device
"label.offset": a numeric giving the space between the nodes and the tips of the phylogeny and their corresponding labels. This option has no effect if type="unrooted"
"rotate.tree": for "fan", "unrooted", or "radial" trees: the rotation of the whole tree in degrees (negative values are accepted)
A list of main parameters used in the function "ape::nodelabels":
"text": a vector of mode character giving the text to be printed. By default, the labels for internal nodes (see "node.label"), that is, the bootstrap values associated with internal nodes
"node": a vector of mode numeric giving the numbers of the nodes where the text or the symbols are to be printed. By default, indexes for internal nodes, that is, (Ntip+1):(Ntip+Nnode), where Ntip is the number of tip nodes and Nnode is the number of internal nodes
"adj": one or two numeric values specifying the horizontal and vertical, respectively, justification of the text or symbols. By default, the text is centered horizontally and vertically. If a single value is given, this alters only the horizontal position of the text
"frame": a character string specifying the kind of frame to be printed around the text. This must be one of "rect" (the default), "circle", "none", or any unambiguous abbreviation of these
"cex": a numeric value giving the factor scaling of the tip and node labels (Character EXpansion). The default is to take the current value from the graphical parameters
"font": an integer specifying the type of font for the labels: 1 (plain text), 2 (bold), 3 (italic, the default), or 4 (bold italic)
"col": a character string giving the color to be used for the text or the plotting symbols; this is eventually recycled
"bg": a character string giving the color to be used for the background of the text frames or of the plotting symbols if it applies; this is eventually recycled. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names
# 1) generate an iid normal random matrix of 100x10
data <- matrix(rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")
data <- t(data)
## Not run:
# 2) build neighbor-joining tree with bootstrap values and visualise it by default
visTreeBootstrap(data)
# 3) only display those internal nodes with bootstrap values > 30
# 3a) generate the bootstrapped tree (without visualisation)
tree_bs <- visTreeBootstrap(data, visTree=FALSE)
# 3b) look at the bootstrap values and ordered row names of input matrix
# the bootstrap values
tree_bs$node.label
# ordered row names of input matrix
tree_bs$tip.label
# 3c) determine internal nodes that should be displayed
Ntip <- length(tree_bs$tip.label) # number of tip nodes
Nnode <- length(tree_bs$node.label) # number of internal nodes
# keep nodes whose bootstrap value is available and greater than 30
flag <- which(as.numeric(tree_bs$node.label) > 30 & !is.na(tree_bs$node.label))
text <- tree_bs$node.label[flag]
node <- Ntip + (1:Nnode)[flag]
visTreeBootstrap(data, nodelabels.arg=list(text=text,node=node))
# 4) obtain the consensus tree
tree_cons <- visTreeBootstrap(data, consensus=TRUE, num.bootstrap=10)
## End(Not run)
visTreeBSclust is supposed to obtain clusters from a bootstrapped tree.
visTreeBSclust( tree_bs, bootstrap.cutoff = 80, max.fraction = 1, min.size = 3, visTree = TRUE, plot.phylo.arg = NULL, nodelabels.arg = NULL, verbose = TRUE, ... )
tree_bs |
a "phylo" object storing a bootstrapped tree |
bootstrap.cutoff |
an integer specifying the bootstrap cutoff used to derive clusters |
max.fraction |
the maximum fraction of leaves contained in a cluster |
min.size |
the minimum number of leaves contained in a cluster |
visTree |
logical to indicate whether the tree will be visualised. By default, it sets to true for display |
plot.phylo.arg |
a list of main parameters used in the function "ape::plot.phylo" http://rdrr.io/cran/ape/man/plot.phylo.html. See 'Note' below for details on the parameters |
nodelabels.arg |
a list of main parameters used in the function "ape::nodelabels" http://rdrr.io/cran/ape/man/nodelabels.html. See 'Note' below for details on the parameters |
verbose |
logical to indicate whether the messages will be displayed on the screen. By default, it sets to true for display |
... |
additional "ape::plot.phylo" parameters |
a data frame with the following components:
Samples: the labels for tip nodes (samples)
Clusters: the clusters each tip node belongs to; unassigned tip nodes will be in the cluster called 'C0'
Clans: the internal node id for each cluster
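How bootstrap.cutoff, max.fraction and min.size could jointly decide cluster membership is sketched below in base R. Everything here is hypothetical: assign_clusters is not a supraHex function, the candidate clades are made up, and whether the real comparisons are strict or inclusive is an assumption. The output mimics the three columns Samples, Clusters and Clans.

```r
# Hypothetical inputs: tip labels and candidate clades (internal node id,
# bootstrap value, member tips). None of this comes from supraHex itself.
tips <- paste0("S", 1:8)
clades <- list(
  list(node = 10L, bootstrap = 95, members = c("S1", "S2", "S3")),
  list(node = 12L, bootstrap = 60, members = c("S4", "S5", "S6")),  # below cutoff
  list(node = 14L, bootstrap = 90, members = c("S7", "S8"))         # too small
)

assign_clusters <- function(tips, clades, bootstrap.cutoff = 80,
                            max.fraction = 1, min.size = 3) {
  out <- data.frame(Samples = tips, Clusters = "C0", Clans = NA_integer_,
                    stringsAsFactors = FALSE)
  k <- 0
  for (cl in clades) {
    size <- length(cl$members)
    # Assumption: inclusive comparisons for all three thresholds
    keep <- cl$bootstrap >= bootstrap.cutoff &&
      size >= min.size &&
      size <= max.fraction * length(tips)
    if (keep) {
      k <- k + 1
      hit <- out$Samples %in% cl$members
      out$Clusters[hit] <- paste0("C", k)
      out$Clans[hit] <- cl$node
    }
  }
  out
}

res <- assign_clusters(tips, clades)
res
```

In this toy input only the first clade passes all three filters, so S1 to S3 form one cluster and the remaining tips stay in 'C0'.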
A list of main parameters used in the function "ape::plot.phylo":
"type": a character string specifying the type of phylogeny to be drawn; it must be one of "phylogram" (the default), "cladogram", "fan", "unrooted", "radial" or any unambiguous abbreviation of these
"direction": a character string specifying the direction of the tree. Four values are possible: "rightwards" (the default), "leftwards", "upwards", and "downwards"
"lab4ut": (= labels for unrooted trees) a character string specifying the display of tip labels for unrooted trees: either "horizontal" where all labels are horizontal (the default), or "axial" where the labels are displayed in the axis of the corresponding terminal branches. This option has an effect only if type = "unrooted"
"edge.color": a vector of mode character giving the colours used to draw the branches of the plotted phylogeny. These are taken to be in the same order as the component edge of phy. If fewer colours are given than the length of edge, then the colours are recycled
"edge.width": a numeric vector giving the width of the branches of the plotted phylogeny. These are taken to be in the same order as the component edge of phy. If fewer widths are given than the length of edge, then these are recycled
"edge.lty": same as the previous argument but for line types; 1: plain, 2: dashed, 3: dotted, 4: dotdash, 5: longdash, 6: twodash
"font": an integer specifying the type of font for the labels: 1 (plain text), 2 (bold), 3 (italic, the default), or 4 (bold italic)
"cex": a numeric value giving the factor scaling of the tip and node labels (Character EXpansion). The default is to take the current value from the graphical parameters
"adj": a numeric specifying the justification of the text strings of the labels: 0 (left-justification), 0.5 (centering), or 1 (right-justification). This option has no effect if type="unrooted". If NULL (the default) the value is set with respect of direction (see details)
"srt": a numeric giving how much the labels are rotated in degrees (negative values are allowed resulting in clock-like rotation); the value has an effect respectively to the value of direction (see Examples). This option has no effect if type="unrooted"
"no.margin": a logical. If TRUE, the margins are set to zero and the plot uses all the space of the device
"label.offset": a numeric giving the space between the nodes and the tips of the phylogeny and their corresponding labels. This option has no effect if type="unrooted"
"rotate.tree": for "fan", "unrooted", or "radial" trees: the rotation of the whole tree in degrees (negative values are accepted
A list of main parameters used in the function "ape::nodelabels":
"text": a vector of mode character giving the text to be printed. By default, the labels for internal nodes (see "node.label"), that is, the bootstrap values associated with internal nodes
"node": a vector of mode numeric giving the numbers of the nodes
where the text or the symbols are to be printed. By default, indexes
for internal nodes, that is, (+1):(
+
),
where
is the number of tip nodes and
for the
number of internal nodes
"adj": one or two numeric values specifying the horizontal and vertical, respectively, justification of the text or symbols. By default, the text is centered horizontally and vertically. If a single value is given, this alters only the horizontal position of the text
"frame": a character string specifying the kind of frame to be printed around the text. This must be one of "rect" (the default), "circle", "none", or any unambiguous abbreviation of these
"cex": a numeric value giving the factor scaling of the tip and node labels (Character EXpansion). The default is to take the current value from the graphical parameters
"font": an integer specifying the type of font for the labels: 1 (plain text), 2 (bold), 3 (italic, the default), or 4 (bold italic)
"col": a character string giving the color to be used for the text or the plotting symbols; this is eventually recycled
"bg": a character string giving the color to be used for the background of the text frames or of the plotting symbols if it applies; this is eventually recycled. It can be one of "jet" (jet colormap), "bwr" (blue-white-red colormap), "gbr" (green-black-red colormap), "wyr" (white-yellow-red colormap), "br" (black-red colormap), "yr" (yellow-red colormap), "wb" (white-black colormap), and "rainbow" (rainbow colormap, that is, red-yellow-green-cyan-blue-magenta). Alternatively, any hyphen-separated HTML color names, e.g. "blue-black-yellow", "royalblue-white-sandybrown", "darkgreen-white-darkviolet". A list of standard color names can be found in http://html-color-codes.info/color-names
# 1) generate an iid normal random matrix of 100x10
data <- matrix( rnorm(100*10,mean=0,sd=1), nrow=100, ncol=10)
colnames(data) <- paste(rep('S',10), seq(1:10), sep="")
data <- t(data)
## Not run:
# 2) build neighbor-joining tree with bootstrap values and visualise it by default
tree_bs <- visTreeBootstrap(data)
# 3) obtain clusters from a bootstrapped tree
res <- visTreeBSclust(tree_bs, bootstrap.cutoff=80)
## hide tip labels and modify the font of internal node labels
res <- visTreeBSclust(tree_bs, bootstrap.cutoff=80, nodelabels.arg=list(cex=0.4), show.tip.label=FALSE)
## End(Not run)
visVp is supposed to create viewports, which describe rectangular regions on a graphics device and define a number of coordinate systems for each of the supra-hexagonal grids.
visVp( height = 7, xdim = 1, ydim = 1, colNum = 1, rowNum = 1, gp = grid::gpar(), newpage = TRUE )
height |
a numeric value specifying the height of the device |
xdim |
an integer specifying the x-dimension of the grid |
ydim |
an integer specifying the y-dimension of the grid |
colNum |
an integer specifying the number of columns |
rowNum |
an integer specifying the number of rows |
gp |
an object of class gpar, typically the output from a call to the function gpar (i.e., a list of graphical parameter settings) |
newpage |
logical to indicate whether to open a new page. By default, it is set to TRUE for opening a new page |
vpnames |
an R object of "viewport" class |
none
# 1) create 5x5 viewports
vpnames <- visVp(colNum=5, rowNum=5)
# 2) look at names of these viewports
vpnames
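To give a feel for what a grid of viewports looks like, here is a minimal sketch built directly on the grid package. It is not the implementation of visVp: the helper make_vp_grid and the naming scheme ("R1C1", "R1C2", ...) are hypothetical, and the viewports are only constructed, not pushed onto a graphics device.

```r
library(grid)

# Hypothetical helper: one named viewport per cell of a rowNum x colNum layout
make_vp_grid <- function(rowNum = 1, colNum = 1) {
    cells <- expand.grid(col = seq_len(colNum), row = seq_len(rowNum))
    lapply(seq_len(nrow(cells)), function(i) {
        viewport(
            layout.pos.row = cells$row[i],
            layout.pos.col = cells$col[i],
            name = paste0("R", cells$row[i], "C", cells$col[i])
        )
    })
}

# parent viewport carrying the layout that the cells refer to
parent <- viewport(layout = grid.layout(nrow = 5, ncol = 5), name = "parent")

# 5x5 cell viewports and their names
vps <- make_vp_grid(rowNum = 5, colNum = 5)
vp_names <- vapply(vps, function(v) v$name, character(1))
print(vp_names)
```

In an actual drawing session, the parent would be pushed first with pushViewport, and each cell viewport pushed and popped in turn while its panel is drawn.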
The Arabidopsis embryo dataset contains gene expression levels (3625 genes and 7 embryo samples) from Xiang et al. (2011). This dataset has been pre-processed: capping intensities at a floor of 777.6; base-2 logarithmic transformation; row/gene centering; and keeping genes with at least 2-fold changes (in any stage) as compared to the average over embryo stages.
data(Xiang)
Xiang: a gene expression matrix of 3625 genes x 7 stage samples. These embryo stages are: zygote, quadrant, globular, heart, torpedo, bent, and mature.
Xiang et al. (2011) Genome-wide analysis reveals gene expression and metabolic network dynamics during embryo development in Arabidopsis. Plant Physiol, 156(1):346-356.
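The pre-processing steps listed for the Xiang dataset can be sketched on a small made-up matrix. This is one reading of the description, not the authors' script: the floor is taken to mean raising values below 777.6 to 777.6, and "at least 2-fold" is taken to mean an absolute centred log2 value of at least 1 in any stage. All intensities below are invented.

```r
# Made-up raw intensities: 4 genes x 3 stages (values invented)
raw <- matrix(
    c( 100,  900,  5000,
       800,  820,   790,
      3000,  600, 12000,
      1000, 1000,  1000),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("g", 1:4), c("stage1", "stage2", "stage3"))
)

floored  <- pmax(raw, 777.6)             # 1) floor intensities at 777.6
logged   <- log2(floored)                # 2) base-2 logarithm
centered <- logged - rowMeans(logged)    # 3) row/gene centering

# 4) keep genes changing at least 2-fold (|centred log2| >= 1) in any stage
keep <- apply(abs(centered), 1, max) >= 1
processed <- centered[keep, , drop = FALSE]
print(rownames(processed))
```

Because the centring subtracts each gene's mean over stages, the 2-fold filter compares every stage against the gene's average rather than against a fixed reference stage.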