Title: | Phylogeny-based sequence clustering with site polymorphism |
---|---|
Description: | Using site polymorphism is one of the ways to cluster DNA/protein sequences but it is possible for the sequences with the same polymorphism on a single site to be genetically distant. This package is aimed at clustering sequences using site polymorphism and their corresponding phylogenetic trees. By considering their location on the tree, only the structurally adjacent sequences will be clustered. However, the adjacent sequences may not necessarily have the same polymorphism. So a branch-and-bound like algorithm is used to minimize the entropy representing the purity of site polymorphism of each cluster. |
Authors: | Chengyang Ji [aut, cre, cph] , Hangyu Zhou [ths], Aiping Wu [ths] |
Maintainer: | Chengyang Ji <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.23.0 |
Built: | 2024-11-27 05:13:15 UTC |
Source: | https://github.com/bioc/sitePath |
The function is a way to get position of the resulting sites
from SNPsites
, fixationSites
and
parallelSites
. The numbering is consistent with what's being
set by setSiteNumbering
allSitesName(x, ...) ## S3 method for class 'SNPsites' allSitesName(x, ...) ## S3 method for class 'sitesMinEntropy' allSitesName(x, ...) ## S3 method for class 'fixationSites' allSitesName(x, ...) ## S3 method for class 'parallelSites' allSitesName(x, ...) ## S3 method for class 'paraFixSites' allSitesName(x, type = c("paraFix", "fixation", "parallel"), ...)
allSitesName(x, ...) ## S3 method for class 'SNPsites' allSitesName(x, ...) ## S3 method for class 'sitesMinEntropy' allSitesName(x, ...) ## S3 method for class 'fixationSites' allSitesName(x, ...) ## S3 method for class 'parallelSites' allSitesName(x, ...) ## S3 method for class 'paraFixSites' allSitesName(x, type = c("paraFix", "fixation", "parallel"), ...)
x |
The object containing the sites from analysis |
... |
Other arguments |
type |
Return fixation or parallel sites |
An integer vector for sites position
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta') snp <- SNPsites(tree) allSitesName(snp)
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta') snp <- SNPsites(tree) allSitesName(snp)
Convert return of functions in sitePath
package to a
data.frame
so can be better worked with. The group name for
each tip is the same as groupTips
.
A fixationSites
object will output the mutation
name of the fixation and the cluster name before and after the mutation.
An SNPsites
object will output the tip name with
the SNP and its position.
An parallelSites
object will output the tip name
with the group name and mutation info.
## S3 method for class 'fixationSites' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'SNPsites' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'parallelSites' as.data.frame(x, row.names = NULL, optional = FALSE, ...)
## S3 method for class 'fixationSites' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'SNPsites' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S3 method for class 'parallelSites' as.data.frame(x, row.names = NULL, optional = FALSE, ...)
x |
The object to be converted to |
row.names |
Unimplemented. |
optional |
Unimplemented. |
... |
Other arguments. |
A data.frame
object.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixations <- fixationSites(lineagePath(tree)) as.data.frame(fixations)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixations <- fixationSites(lineagePath(tree)) as.data.frame(fixations)
The functions in sitePath
usually include the results on
more than one site. The function extractSite
can be used to extract
the predicted result on a single site.
extractSite(x, site, ...) ## S3 method for class 'fixationSites' extractSite(x, site, ...)
extractSite(x, site, ...) ## S3 method for class 'fixationSites' extractSite(x, site, ...)
x |
A |
site |
A site included in the result. |
... |
Other arguments |
The predicted result of a single site
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) mutations <- fixationSites(lineagePath(tree)) extractSite(mutations, 139)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) mutations <- fixationSites(lineagePath(tree)) extractSite(mutations, 139)
The result of fixationSites
and sitePath
contains all the possible sites with fixation mutation. The function
extractTips
retrieves the name of the tips involved in the fixation.
For lineagePath
, the function extractTips
groups all the tree tips according to the amino acid/nucleotide of the
site
.
For parallelSites
and sitePara
object, the
function extractTips
retrieve all the tips with parallel mutation.
extractTips(x, ...) ## S3 method for class 'lineagePath' extractTips(x, site, ...) ## S3 method for class 'sitesMinEntropy' extractTips(x, site, ...) ## S3 method for class 'fixationSites' extractTips(x, site, select = 1, ...) ## S3 method for class 'sitePath' extractTips(x, select = 1, ...) ## S3 method for class 'parallelSites' extractTips(x, site, ...) ## S3 method for class 'sitePara' extractTips(x, ...)
extractTips(x, ...) ## S3 method for class 'lineagePath' extractTips(x, site, ...) ## S3 method for class 'sitesMinEntropy' extractTips(x, site, ...) ## S3 method for class 'fixationSites' extractTips(x, site, select = 1, ...) ## S3 method for class 'sitePath' extractTips(x, select = 1, ...) ## S3 method for class 'parallelSites' extractTips(x, site, ...) ## S3 method for class 'sitePara' extractTips(x, ...)
x |
A |
... |
Other arguments |
site |
A site predicted to experience fixation. |
select |
For a site, there theoretically might be more than one fixation on different lineages. You may use this argument to extract for a specific fixation of a site. The default is the first fixation of the site. |
Tree tips grouped as list
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) mutations <- fixationSites(lineagePath(tree)) extractTips(mutations, 139)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) mutations <- fixationSites(lineagePath(tree)) extractTips(mutations, 139)
The fixation of insertions of deletions.
fixationIndels(x, ...) ## S3 method for class 'sitesMinEntropy' fixationIndels(x, ...)
fixationIndels(x, ...) ## S3 method for class 'sitesMinEntropy' fixationIndels(x, ...)
x |
The return from |
... |
Other arguments. |
A fixationIndels
object.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixationIndels(sitesMinEntropy(tree))
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixationIndels(sitesMinEntropy(tree))
The tips are clustered according to the fixation sites. The transition of fixation sites will be plotted as a phylogenetic tree. The length of each branch represents the number of fixation mutation between two clusters. The name of the tree tips indicate the number of sequences in the cluster.
fixationPath(x, ...) ## S3 method for class 'sitesMinEntropy' fixationPath(x, minEffectiveSize = NULL, ...) ## S3 method for class 'fixationSites' fixationPath(x, minEffectiveSize = NULL, ...)
fixationPath(x, ...) ## S3 method for class 'sitesMinEntropy' fixationPath(x, minEffectiveSize = NULL, ...) ## S3 method for class 'fixationSites' fixationPath(x, minEffectiveSize = NULL, ...)
x |
The return from |
... |
Further arguments passed to or from other methods. |
minEffectiveSize |
The minimum size for a tip cluster. |
An fixationPath
object
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) paths <- lineagePath(tree) mutations <- fixationSites(paths) fixationPath(mutations)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) paths <- lineagePath(tree) mutations <- fixationSites(paths) fixationPath(mutations)
After finding the lineagePath
of a phylogenetic
tree, fixationSites
uses the result to find those sites that show
fixation on some, if not all, of the lineages. The number of tips before
and after the fixation mutation is expected to be more than
minEffectiveSize
. Also, the fixation will be skipped if the amino
acid/nucleotide is gap or ambiguous character. A lineage has to have at
least one fixation mutation to be reported.
fixationSites(paths, ...) ## S3 method for class 'lineagePath' fixationSites( paths, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... ) ## S3 method for class 'sitesMinEntropy' fixationSites(paths, ...) ## S3 method for class 'paraFixSites' fixationSites(paths, ...)
fixationSites(paths, ...) ## S3 method for class 'lineagePath' fixationSites( paths, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... ) ## S3 method for class 'sitesMinEntropy' fixationSites(paths, ...) ## S3 method for class 'paraFixSites' fixationSites(paths, ...)
paths |
A |
... |
further arguments passed to or from other methods. |
minEffectiveSize |
The minimum number of tips in a group. |
searchDepth |
The function uses heuristic search but the termination of
the search cannot be intrinsically decided. |
method |
The strategy for predicting the fixation. The basic approach is entropy minimization and can be achieved by adding or removing fixation point, or by comparing the two. |
A fixationSites
object.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixationSites(lineagePath(tree))
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) fixationSites(lineagePath(tree))
The tips between divergent nodes or fixation mutations on the lineages are each gathered as group.
groupTips(tree, ...) ## S3 method for class 'phyMSAmatched' groupTips( tree, similarity = NULL, simMatrix = NULL, forbidTrivial = TRUE, tipnames = TRUE, ... ) ## S3 method for class 'lineagePath' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'sitesMinEntropy' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'fixationSites' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'fixationPath' groupTips(tree, tipnames = TRUE, ...)
groupTips(tree, ...) ## S3 method for class 'phyMSAmatched' groupTips( tree, similarity = NULL, simMatrix = NULL, forbidTrivial = TRUE, tipnames = TRUE, ... ) ## S3 method for class 'lineagePath' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'sitesMinEntropy' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'fixationSites' groupTips(tree, tipnames = TRUE, ...) ## S3 method for class 'fixationPath' groupTips(tree, tipnames = TRUE, ...)
tree |
The return from |
... |
Other arguments. |
similarity |
This decides how minor SNPs are to remove. If provided as
fraction between 0 and 1, then the minimum number of SNP will be total tips
times |
simMatrix |
Deprecated and will not have effect. |
forbidTrivial |
Does not allow trivial trimming. |
tipnames |
If return tips as integer or tip names. |
groupTips
returns grouping of tips.
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) groupTips(tree)
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) groupTips(tree)
The raw protein sequences were downloaded from NCBI database and aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/).
h3n2_align_reduced
is a truncated version of
h3n2_align
data(h3n2_align) data(h3n2_align_reduced)
data(h3n2_align) data(h3n2_align_reduced)
an alignment
object
an alignment
object
Tree was built from h3n2_align
using RAxML
(http://www.exelixis-lab.org/) with default settings.
h3n2_tree_reduced
is a truncated version of
h3n2_tree
data(h3n2_tree) data(h3n2_tree_reduced)
data(h3n2_tree) data(h3n2_tree_reduced)
a phylo
object
a phylo
object
lineagePath
finds the lineages of a phylogenetic tree
providing the corresponding sequence alignment. This is done by finding
'major SNPs' which usually accumulate along the evolutionary pathways.
sneakPeek
is intended to plot 'similarity' (actually the
least percentage of 'major SNP') as a threshold against number of output
lineagePath. This plot is intended to give user a rough view about how many
lineages they could expect from the 'similarity' threshold in the function
lineagePath
. The number of lineagePath is preferably not be
too many or too few. The result excludes where the number of lineagePath is
greater than number of tips divided by 20 or user-defined maxPath. The zero
lineagePath result will also be excluded.
When used on the return of sneakPeek
, a
lineagePath
with the closest similarity
will be retrieved
from the returned value.
similarity
has no effect when using on
paraFixSites
object
lineagePath(tree, similarity, ...) ## S3 method for class 'phylo' lineagePath( tree, similarity = NULL, alignment = NULL, seqType = c("AA", "DNA", "RNA"), reference = NULL, gapChar = "-", minSkipSize = NULL, ... ) ## S3 method for class 'treedata' lineagePath(tree, ...) ## S3 method for class 'phyMSAmatched' lineagePath( tree, similarity = NULL, simMatrix = NULL, forbidTrivial = TRUE, ... ) sneakPeek(tree, step = 9, maxPath = NULL, minPath = 0, makePlot = TRUE) ## S3 method for class 'sneakPeekedPaths' lineagePath(tree, similarity, ...) ## S3 method for class 'paraFixSites' lineagePath(tree, similarity = NULL, ...)
lineagePath(tree, similarity, ...) ## S3 method for class 'phylo' lineagePath( tree, similarity = NULL, alignment = NULL, seqType = c("AA", "DNA", "RNA"), reference = NULL, gapChar = "-", minSkipSize = NULL, ... ) ## S3 method for class 'treedata' lineagePath(tree, ...) ## S3 method for class 'phyMSAmatched' lineagePath( tree, similarity = NULL, simMatrix = NULL, forbidTrivial = TRUE, ... ) sneakPeek(tree, step = 9, maxPath = NULL, minPath = 0, makePlot = TRUE) ## S3 method for class 'sneakPeekedPaths' lineagePath(tree, similarity, ...) ## S3 method for class 'paraFixSites' lineagePath(tree, similarity = NULL, ...)
tree |
The return from |
similarity |
The parameter for identifying phylogenetic pathway using SNP. If
provided as fraction between 0 and 1, then the minimum number of SNP will
be total tips times |
... |
Other arguments. |
alignment |
An |
seqType |
The type of the sequence in the alignment file. The default is "AA" for amino acid. The other options are "DNA" and "RNA". |
reference |
Name of reference for site numbering. The name has to be one of the sequences' name. The default uses the intrinsic alignment numbering |
gapChar |
The character to indicate gap. The numbering will skip the
|
minSkipSize |
The minimum number of tips to have gap or ambiguous amino acid/nucleotide for a site to be ignored in other analysis. This will not affect the numbering. The default is 0.8. |
simMatrix |
Deprecated and will not have effect. |
forbidTrivial |
Does not allow trivial trimming. |
step |
the 'similarity' window for calculating and plotting. To better see the impact of threshold on path number. The default is 10. |
maxPath |
maximum number of path to return show in the plot. The number of path in the raw tree can be far greater than trimmed tree. To better see the impact of threshold on path number. This is preferably specified. The default is one 20th of tree tip number. |
minPath |
minimum number of path to return show in the plot. To better see the impact of threshold on path number. The default is 1. |
makePlot |
Whether make a plot when return. |
Lineage path represent by node number.
sneakPeek
return the similarity threhold against number of
lineagePath. There will be a simple dot plot between threshold and path
number if makePlot
is TRUE.
data('zikv_tree') data('zikv_align') tree <- addMSA(zikv_tree, alignment = zikv_align) lineagePath(tree) sneakPeek(tree, step = 3) x <- sneakPeek(tree, step = 3) lineagePath(x, similarity = 0.05)
data('zikv_tree') data('zikv_align') tree <- addMSA(zikv_tree, alignment = zikv_align) lineagePath(tree) sneakPeek(tree, step = 3) x <- sneakPeek(tree, step = 3) lineagePath(x, similarity = 0.05)
The operation between the results of fixationSites
and parallelSites
.
paraFixSites(x, ...) ## S3 method for class 'phylo' paraFixSites( x, alignment = NULL, seqType = c("AA", "DNA", "RNA"), Nmin = NULL, reference = NULL, gapChar = "-", minSkipSize = NULL, ... ) ## S3 method for class 'treedata' paraFixSites(x, ...) ## S3 method for class 'lineagePath' paraFixSites( x, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... ) ## S3 method for class 'sitesMinEntropy' paraFixSites( x, category = c("intersect", "union", "parallelOnly", "fixationOnly"), minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... )
paraFixSites(x, ...) ## S3 method for class 'phylo' paraFixSites( x, alignment = NULL, seqType = c("AA", "DNA", "RNA"), Nmin = NULL, reference = NULL, gapChar = "-", minSkipSize = NULL, ... ) ## S3 method for class 'treedata' paraFixSites(x, ...) ## S3 method for class 'lineagePath' paraFixSites( x, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... ) ## S3 method for class 'sitesMinEntropy' paraFixSites( x, category = c("intersect", "union", "parallelOnly", "fixationOnly"), minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... )
x |
A |
... |
further arguments passed to or from other methods. |
alignment |
An |
seqType |
The type of the sequence in the alignment file. The default is "AA" for amino acid. The other options are "DNA" and "RNA". |
Nmin |
The parameter for identifying phylogenetic pathway using SNP. If
provided as fraction between 0 and 1, then the minimum number of SNP will
be total tips times |
reference |
Name of reference for site numbering. The name has to be one of the sequences' name. The default uses the intrinsic alignment numbering |
gapChar |
The character to indicate gap. The numbering will skip the
|
minSkipSize |
The minimum number of tips to have gap or ambiguous amino acid/nucleotide for a site to be ignored in other analysis. This will not affect the numbering. The default is 0.8. |
minEffectiveSize |
The minimum number of tips in a group. |
searchDepth |
The function uses heuristic search but the termination of
the search cannot be intrinsically decided. |
method |
The strategy for predicting the fixation. The basic approach is entropy minimization and can be achieved by adding or removing fixation point, or by comparing the two. |
category |
Could be |
minSNP |
The minimum number of mutations to be qualified as parallel on at least two lineages. The default is 1. |
mutMode |
The strategy for finding parallel site. The default |
A paraFixSites
object.
data(zikv_tree_reduced) data(zikv_align_reduced) paraFixSites(zikv_tree_reduced, alignment = zikv_align_reduced)
data(zikv_tree_reduced) data(zikv_align_reduced) paraFixSites(zikv_tree_reduced, alignment = zikv_align_reduced)
A site may have mutated on parallel lineages. Mutation can occur
on the same site across the phylogenetic lineages solved by
lineagePath
. The site will be considered mutated in parallel
if the mutation occurs on the non-overlap part of more than two lineages.
The amino acid/nucleotide before and after the mutation can be allowed
different on different lineages or only the exact same mutations are
considered.
parallelSites(x, ...) ## S3 method for class 'lineagePath' parallelSites( x, minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... ) ## S3 method for class 'sitesMinEntropy' parallelSites( x, minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... ) ## S3 method for class 'paraFixSites' parallelSites(x, ...)
parallelSites(x, ...) ## S3 method for class 'lineagePath' parallelSites( x, minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... ) ## S3 method for class 'sitesMinEntropy' parallelSites( x, minSNP = NULL, mutMode = c("all", "exact", "pre", "post"), ... ) ## S3 method for class 'paraFixSites' parallelSites(x, ...)
x |
A |
... |
The arguments in |
minSNP |
The minimum number of mutations to be qualified as parallel on at least two lineages. The default is 1. |
mutMode |
The strategy for finding parallel site. The default |
A parallelSites
object
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) paths <- lineagePath(tree) x <- sitesMinEntropy(paths) parallelSites(x)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) paths <- lineagePath(tree) x <- sitesMinEntropy(paths) parallelSites(x)
addMSA
wraps read.alignment
function in
seqinr
package and helps match names in tree and sequence
alignment. Either provide the file path to an alignment file and its format
or an alignment object from the return of read.alignment
function. If both the file path and alignment object are given, the
function will use the sequence in the alignment file.
addMSA(tree, ...) ## S3 method for class 'phylo' addMSA( tree, msaPath = "", msaFormat = c("fasta", "clustal", "phylip", "mase", "msf"), alignment = NULL, seqType = c("AA", "DNA", "RNA"), ... ) ## S3 method for class 'treedata' addMSA(tree, ...)
addMSA(tree, ...) ## S3 method for class 'phylo' addMSA( tree, msaPath = "", msaFormat = c("fasta", "clustal", "phylip", "mase", "msf"), alignment = NULL, seqType = c("AA", "DNA", "RNA"), ... ) ## S3 method for class 'treedata' addMSA(tree, ...)
tree |
A |
... |
Other arguments. |
msaPath |
The file path to the multiple sequence alignment file. |
msaFormat |
The format of the multiple sequence alignment file. The
internal uses the |
alignment |
An |
seqType |
The type of the sequence in the alignment file. The default is "AA" for amino acid. The other options are "DNA" and "RNA". |
Since 1.5.12, the function returns a phyMSAmatched
object to
avoid S3 methods used on phylo
(better encapsulation).
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta')
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta')
The plot function to visualize the return of functions in the
package. The underlying function applies ggplot2
. The
function name plot
is used to keep the compatibility with previous
versions, but they do not behave like the generic plot
function since 1.5.4.
A phyMSAmatched
object will be plotted as a tree
diagram.
A lineagePath
object will be plotted as a tree
diagram and paths are black solid line while the trimmed nodes and tips
will use gray dashed line.
A parallelSites
object will be plotted as original
phylogenetic tree marked with parallel mutations attached as dot plot.
A fixationSites
object will be plotted as original
phylogenetic tree marked with fixation substitutions.
A sitePath
object can be extracted by using
extractSite
on the return of fixationSites
.
A fixationIndels
object will be plotted as
original phylogenetic tree marked with indel fixation.
A fixationPath
object will be plotted as a
phylo
object. The tips are clustered according to the fixation
sites. The transition of fixation sites will be plotted as a phylogenetic
tree. The length of each branch represents the number of fixation mutation
between two clusters.
## S3 method for class 'phyMSAmatched' plot(x, y = TRUE, ...) ## S3 method for class 'lineagePath' plot(x, y = TRUE, showTips = FALSE, ...) ## S3 method for class 'parallelSites' plot(x, y = TRUE, ...) ## S3 method for class 'fixationSites' plot(x, y = TRUE, tipsGrouping = NULL, ...) ## S3 method for class 'sitePath' plot(x, y = NULL, select = NULL, showTips = FALSE, ...) ## S3 method for class 'fixationIndels' plot(x, y = TRUE, ...) ## S3 method for class 'fixationPath' plot(x, y = TRUE, ...)
## S3 method for class 'phyMSAmatched' plot(x, y = TRUE, ...) ## S3 method for class 'lineagePath' plot(x, y = TRUE, showTips = FALSE, ...) ## S3 method for class 'parallelSites' plot(x, y = TRUE, ...) ## S3 method for class 'fixationSites' plot(x, y = TRUE, tipsGrouping = NULL, ...) ## S3 method for class 'sitePath' plot(x, y = NULL, select = NULL, showTips = FALSE, ...) ## S3 method for class 'fixationIndels' plot(x, y = TRUE, ...) ## S3 method for class 'fixationPath' plot(x, y = TRUE, ...)
x |
The object to plot. |
y |
Whether to show the fixation mutation between clusters. For
|
... |
Other arguments. Since 1.5.4, the function uses
|
showTips |
Whether to plot the tip labels. The default is |
tipsGrouping |
A |
select |
For a |
A ggplot object to make the plot.
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) plot(tree) paths <- lineagePath(tree) plot(paths) parallel <- parallelSites(paths) plot(parallel) fixations <- fixationSites(paths) plot(fixations) sp <- extractSite(fixations, 139) plot(sp) x <- fixationPath(fixations) plot(x)
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) plot(tree) paths <- lineagePath(tree) plot(paths) parallel <- parallelSites(paths) plot(parallel) fixations <- fixationSites(paths) plot(fixations) sp <- extractSite(fixations, 139) plot(sp) x <- fixationPath(fixations) plot(x)
Visualize the results of paraFixSites
plotFixationSites(x, ...) ## S3 method for class 'fixationSites' plotFixationSites(x, site = NULL, ...) ## S3 method for class 'paraFixSites' plotFixationSites(x, site = NULL, ...)
plotFixationSites(x, ...) ## S3 method for class 'fixationSites' plotFixationSites(x, site = NULL, ...) ## S3 method for class 'paraFixSites' plotFixationSites(x, site = NULL, ...)
x |
return from |
... |
further arguments passed to or from other methods. |
site |
the number of the site according to
|
A ggplot
object.
data(zikv_tree_reduced) data(zikv_align_reduced) paraFix <- paraFixSites(zikv_tree_reduced, alignment = zikv_align_reduced) plotFixationSites(paraFix)
data(zikv_tree_reduced) data(zikv_align_reduced) paraFix <- paraFixSites(zikv_tree_reduced, alignment = zikv_align_reduced) plotFixationSites(paraFix)
The mutated sites for each tip in a phylogenetic tree will be represented as colored dots positioned by their site number.
plotMutSites(x, ...) ## S3 method for class 'SNPsites' plotMutSites(x, showTips = FALSE, ...) ## S3 method for class 'lineagePath' plotMutSites(x, ...) ## S3 method for class 'parallelSites' plotMutSites(x, ...) ## S3 method for class 'fixationSites' plotMutSites(x, ...) ## S3 method for class 'paraFixSites' plotMutSites( x, widthRatio = 0.75, fontSize = 3.88, dotSize = 1, lineSize = 0.5, ... )
plotMutSites(x, ...) ## S3 method for class 'SNPsites' plotMutSites(x, showTips = FALSE, ...) ## S3 method for class 'lineagePath' plotMutSites(x, ...) ## S3 method for class 'parallelSites' plotMutSites(x, ...) ## S3 method for class 'fixationSites' plotMutSites(x, ...) ## S3 method for class 'paraFixSites' plotMutSites( x, widthRatio = 0.75, fontSize = 3.88, dotSize = 1, lineSize = 0.5, ... )
x |
An |
... |
Other arguments |
showTips |
Whether to plot the tip labels. The default is |
widthRatio |
The width ratio between tree plot and SNP plot |
fontSize |
The font size of the mutation label in tree plot |
dotSize |
The dot size of SNP in SNP plot |
lineSize |
The background line size in SNP plot |
A tree plot with SNP as dots for each tip.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) plotMutSites(SNPsites(tree))
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) plotMutSites(SNPsites(tree))
Visualize the results of paraFixSites
plotParallelSites(x, ...) ## S3 method for class 'parallelSites' plotParallelSites(x, site = NULL, ...) ## S3 method for class 'paraFixSites' plotParallelSites(x, site = NULL, ...)
plotParallelSites(x, ...) ## S3 method for class 'parallelSites' plotParallelSites(x, site = NULL, ...) ## S3 method for class 'paraFixSites' plotParallelSites(x, site = NULL, ...)
x |
return from |
... |
further arguments passed to or from other methods. |
site |
the number of the site according to
|
A ggplot
object.
data(zikv_tree) data(zikv_align) paraFix <- paraFixSites(zikv_tree, alignment = zikv_align) plotParallelSites(paraFix)
data(zikv_tree) data(zikv_align) paraFix <- paraFixSites(zikv_tree, alignment = zikv_align) plotParallelSites(paraFix)
Plot and color the tree according to amino acid/nucleotide of
the selected site. The color scheme depends on the seqType
set in
addMSA
function.
For lineagePath
, the tree will be colored
according to the amino acid of the site. The color scheme tries to assign
distinguishable color for each amino acid.
For parallelSites
, the tree will be colored
according to the amino acid of the site if the mutation is not fixed.
For fixationSites
, it will color the ancestral
tips in red, descendant tips in blue and excluded tips in grey.
plotSingleSite(x, site, ...) ## S3 method for class 'lineagePath' plotSingleSite(x, site, showPath = TRUE, showTips = FALSE, ...) ## S3 method for class 'sitesMinEntropy' plotSingleSite(x, site, ...) ## S3 method for class 'parallelSites' plotSingleSite(x, site, showPath = TRUE, ...) ## S3 method for class 'fixationSites' plotSingleSite(x, site, select = NULL, ...)
plotSingleSite(x, site, ...) ## S3 method for class 'lineagePath' plotSingleSite(x, site, showPath = TRUE, showTips = FALSE, ...) ## S3 method for class 'sitesMinEntropy' plotSingleSite(x, site, ...) ## S3 method for class 'parallelSites' plotSingleSite(x, site, showPath = TRUE, ...) ## S3 method for class 'fixationSites' plotSingleSite(x, site, select = NULL, ...)
x |
The object to plot. |
site |
For |
... |
Other arguments. Since 1.5.4, the function uses
|
showPath |
If plot the lineage result from |
showTips |
Whether to plot the tip labels. The default is |
select |
Select which fixation path in to plot. The default is NULL which will plot all the fixations. |
Since 1.5.4, the function returns a ggplot object so on longer
behaviors like the generic plot
function.
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) paths <- lineagePath(tree) plotSingleSite(paths, 139) fixations <- fixationSites(paths) plotSingleSite(fixations, 139)
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) paths <- lineagePath(tree) plotSingleSite(paths, 139) fixations <- fixationSites(paths) plotSingleSite(fixations, 139)
The raw sequences were downloaded from GISAID database (https://www.gisaid.org/) and aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/) with default settings.
data(sars2_align)
data(sars2_align)
an alignment
object
Tree was built from sars2_align
using RAxML
(http://www.exelixis-lab.org/) with default settings. The tip
EPI_ISL_402125
was used as the outgroup to root the tree.
data(sars2_tree)
data(sars2_tree)
a phylo
object
A reference sequence can be used to define a global site numbering scheme for multiple sequence alignment. The gap in the reference sequence will be skipped for the numbering. Also, the site that is gap or amino acid/nucleotide for too many tips will be ignored but won't affect numbering.
setSiteNumbering(x, reference, gapChar, ...) ## S3 method for class 'phyMSAmatched' setSiteNumbering(x, reference = NULL, gapChar = "-", minSkipSize = NULL, ...)
setSiteNumbering(x, reference, gapChar, ...) ## S3 method for class 'phyMSAmatched' setSiteNumbering(x, reference = NULL, gapChar = "-", minSkipSize = NULL, ...)
x |
The object to set site numbering. It could be a
|
reference |
Name of reference for site numbering. The name has to be one of the sequences' name. The default uses the intrinsic alignment numbering |
gapChar |
The character to indicate gap. The numbering will skip the
|
... |
Further arguments passed to or from other methods. |
minSkipSize |
The minimum number of tips to have gap or ambiguous amino acid/nucleotide for a site to be ignored in other analysis. This will not affect the numbering. The default is 0.8. |
The input x
with numbering mapped to reference
.
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta') setSiteNumbering(tree)
data(zikv_tree) msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath') tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta') setSiteNumbering(tree)
Get similarity between aligned sequences with gap ignored.
similarityMatrix(tree)
similarityMatrix(tree)
tree |
The return from |
A diagonal matrix of similarity between sequences.
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) simMatrix <- similarityMatrix(tree)
data(zikv_tree) data(zikv_align) tree <- addMSA(zikv_tree, alignment = zikv_align) simMatrix <- similarityMatrix(tree)
These functions are provided for compatibility with older versions of ‘sitePath’ only, and will be defunct at the next release.
The following functions are deprecated and will be made defunct; use the replacement indicated below:
multiFixationSites:
fixationSites
After finding the lineagePath
of a phylogenetic
tree, sitesMinEntropy
perform entropy minimization on every site of
the sequence to group the tips according to amino acid/nucleotide.
sitesMinEntropy(x, ...) ## S3 method for class 'lineagePath' sitesMinEntropy( x, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... )
sitesMinEntropy(x, ...) ## S3 method for class 'lineagePath' sitesMinEntropy( x, minEffectiveSize = NULL, searchDepth = 1, method = c("compare", "insert", "delete"), ... )
x |
A |
... |
further arguments passed to or from other methods. |
minEffectiveSize |
The minimum number of tips in a group. |
searchDepth |
The function uses heuristic search but the termination of
the search cannot be intrinsically decided. |
method |
The strategy for predicting the fixation. The basic approach is entropy minimization and can be achieved by adding or removing fixation point, or by comparing the two. |
A sitesMinEntropy
object.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) sitesMinEntropy(lineagePath(tree))
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) sitesMinEntropy(lineagePath(tree))
Single nucleotide polymorphism (SNP) in the whole package refers
to variation of amino acid. SNPsite
will try to find SNP in the
multiple sequence alignment. A reference sequence and gap character may be
specified to number the site.
SNPsites(tree, ...) ## S3 method for class 'phyMSAmatched' SNPsites(tree, minSNP = NULL, ...)
SNPsites(tree, ...) ## S3 method for class 'phyMSAmatched' SNPsites(tree, minSNP = NULL, ...)
tree |
A |
... |
Other arguments |
minSNP |
Minimum number of a mutation to be a SNP. The default is 10th of the total tree tips. |
A SNPsites
object.
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) SNPsites(tree)
data(zikv_tree_reduced) data(zikv_align_reduced) tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced) SNPsites(tree)
The raw protein sequences were downloaded from ViPR database (https://www.viprbrc.org/) and aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/). with default settings.
zikv_align_reduced
is a truncated version of
zikv_align
data(zikv_align) data(zikv_align_reduced)
data(zikv_align) data(zikv_align_reduced)
an alignment
object
an alignment
object
Tree was built from zikv_align
using RAxML
(http://www.exelixis-lab.org/) with default settings. The tip
ANK57896 was used as outgroup to root the tree.
zikv_tree_reduced
is a truncated version of
zikv_tree
data(zikv_tree) data(zikv_tree_reduced)
data(zikv_tree) data(zikv_tree_reduced)
a phylo
object
a phylo
object