Title: | Cartesian plot and contingency test on 16S Microbial data |
---|---|
Description: | eudysbiome is a package that annotates differential genera as harmful or harmless, based on their ability to contribute to host diseases (as indicated in the literature), or as unknown when their genus classification is ambiguous. The package also statistically measures the eubiotic (harmless genera increase or harmful genera decrease) or dysbiotic (harmless genera decrease or harmful genera increase) impact of a given treatment or environmental change on the (gastrointestinal, GI) microbiome in comparison to the microbiome of the reference condition. |
Authors: | Xiaoyuan Zhou, Christine Nardini |
Maintainer: | Xiaoyuan Zhou <[email protected]> |
License: | GPL-2 |
Version: | 1.37.0 |
Built: | 2024-11-04 06:05:24 UTC |
Source: | https://github.com/bioc/eudysbiome |
Assigns taxonomic paths to unclassified SSU rRNA sequences by executing classify.seqs in Mothur with the 'Wang' approach.
assignTax(fasta, template = NULL, taxonomy = NULL, ksize = 8, iters = 100, cutoff = 80, processors = 1, dir.out = "assignTax_out")
fasta |
a fasta file of rRNA sequences to be assigned with taxonomies, e.g. a set of sequences picked as the representatives of OTUs. |
template |
a fasta file of rRNA reference sequences; by default, "Silva_119_provisional_release.zip" is downloaded from the "qiime" directory of the SILVA archive and "Silva_119_rep_set97.fna" is extracted from it, a representative set of SILVA rRNA references (release 119) at 97% sequence identity. |
taxonomy |
a taxonomic path file mapping to the template file; by default, the taxonomy matching "Silva_119_rep_set97.fna", which is stored in the package, is loaded. |
ksize, iters, cutoff, processors |
parameters passed to classify.seqs in Mothur. |
dir.out |
the directory to which the assignment files are written; default to "assignTax_out". |
This function performs 'classify.seqs' by running Mothur in command line mode, so a Mothur executable must be installed on your computer. For Unix users, the absolute path of Mothur should be added to the PATH environment variable and exported. For Windows users, the Mothur executable (with the .exe extension) must be present on your disk.
Two files are written under dir.out: a *.taxonomy file, which contains a taxonomic path for each sequence, and a *.tax.summary file, which contains a taxonomic outline indicating the number of sequences found at each level (kingdom to species).
A list containing the following components:
exitStatus: an error code ('0' for success) given by the execution of the system Mothur commands, see system.
stderr, stdout: standard error and standard output from executing the Mothur command 'classify.seqs'.
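Because assignTax runs Mothur as an external command, it can be useful to confirm that the executable is discoverable before calling it. The following is a minimal sketch in base R, not part of eudysbiome: the helper name find_mothur is hypothetical, and the commented-out call assumes a placeholder file "otus.fasta".

```r
# Hypothetical helper (not part of eudysbiome): look for a Mothur
# executable on the PATH using base R's Sys.which().
find_mothur <- function(candidates = c("mothur", "mothur.exe")) {
  hits <- Sys.which(candidates)   # "" for every name that is not found
  hits <- hits[nzchar(hits)]
  if (length(hits) == 0) NA_character_ else unname(hits[1])
}

mothur_path <- find_mothur()
if (is.na(mothur_path)) {
  message("Mothur was not found on the PATH; assignTax() could not run it.")
} else {
  message("Mothur found at: ", mothur_path)
  # Not executed here: "otus.fasta" is a placeholder file name.
  # res <- assignTax(fasta = "otus.fasta", dir.out = "assignTax_out")
  # if (res$exitStatus != 0) cat(res$stderr, sep = "\n")
}
```

Checking exitStatus after the call, as in the commented lines, follows from the documented return value: '0' indicates success.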
This function plots a Cartesian plane of genus abundance differences across the tested conditions (y-axis) against the harmful/harmless nature of each genus (negative/positive x-axis). The upper-right and lower-left quadrants therefore indicate a eubiotic microbial impact, and the lower-right and upper-left quadrants a dysbiotic impact.
Cartesian(x,log2 = TRUE,micro.anno = NULL, comp.anno = NULL, pch = 16, point.col = NULL,point.alpha = 0.6,ylim = NULL, xlab = NULL,ylab = NULL,vlty = 2, hlty = 1, srt = 60, font = 3, adj = c(1,1), xaxis = NULL, yaxis = NULL, legend = TRUE, box = TRUE,box.col = c("darkblue","yellow"), ...)
x |
a data frame or numeric matrix of microbial abundance variations from which the plot is produced. Rows indicate the differential microbes, columns indicate the pair-wise conditions. |
log2 |
logical, specifying if x values should be log2 converted; default to TRUE. |
micro.anno |
a character vector annotating all row microbes in x. |
comp.anno |
a character vector of condition names for the column pair-wise comparisons; it must have the same length as the number of comparisons; default to the pair-wise comparisons. |
pch |
a vector of point types (the pch graphical parameter). |
point.col |
a vector of colors for the points. |
point.alpha |
alpha (transparency) value for the points. |
ylim |
limits for the y axis. |
xlab |
a title for the x axis. |
ylab |
a title for the y axis. |
vlty, hlty |
line types of the vertical and horizontal lines that divide the plane along the x-axis and y-axis, respectively. |
srt, font, adj |
graphical parameters for the text on the x-axis. |
xaxis |
a character or expression vector specifying the labels of the x axis, drawn with text; default to the row names of x. |
yaxis |
a character or expression vector specifying the labels of y axis by axis; default to |
legend |
logical, specifying if the legend should be added to the plot; default to TRUE. |
box |
logical, specifying if the quadrants should be highlighted by boxes; default to TRUE. |
box.col |
a vector of colors for the upper-right and lower-left quadrants and for the lower-right and upper-left quadrants, respectively; default to "darkblue" and "yellow". If only one color is specified, the other one can be |
... |
additional parameters passed to the default method, or by it to |
The Cartesian plane plot
data(microDiff)
attach(microDiff)
# set the margins and keep the previous settings so they can be restored
oldpar = par(mar = c(6,5.1,4.1,6))
Cartesian(x = data, log2 = TRUE, micro.anno = micro.anno, pch = 16, comp.anno = comp.anno, point.col = c("blue","purple","orange"))
par(oldpar)
detach(microDiff)
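The quadrant logic of the plot can be illustrated without the package. The sketch below, in base R, labels each genus as eubiotic or dysbiotic from the sign of its abundance difference and its annotation. The genus names and values are invented for illustration, and the code is not the implementation used by Cartesian.

```r
# Toy values, invented for illustration only.
diff_abund <- c(genusA = 1.8, genusB = -0.7, genusC = 2.1, genusD = -1.2)
annotation <- c("harmless", "harmless", "harmful", "harmful")

# Eubiotic: a harmless genus increases or a harmful genus decreases
# (upper-right and lower-left quadrants); everything else is labeled
# dysbiotic. A difference of exactly 0 would need separate handling.
eubiotic <- (annotation == "harmless" & diff_abund > 0) |
            (annotation == "harmful"  & diff_abund < 0)
impact <- ifelse(eubiotic, "eubiotic", "dysbiotic")
names(impact) <- names(diff_abund)
print(impact)
```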
Computes the frequencies of the contingency table as the cumulated microbial abundance differences classified into each condition and eubiotic/dysbiotic impact term. The table is used by contingencyTest to examine the significance of the association (contingency) between conditions and impacts.
contingencyCount(x, micro.anno=NULL, comp.anno=NULL)
x |
See Cartesian. |
micro.anno |
See Cartesian. |
comp.anno |
See Cartesian. |
Eubiotic impact is measured by variations of increased harmless and decreased harmful microbes, while the dysbiotic impact is measured by the decreased harmless and increased harmful microbes.
The frequencies of condition-impact terms in contingency table
data(microDiff)
attach(microDiff)
microCount = contingencyCount(x = data, micro.anno = micro.anno, comp.anno = comp.anno)
detach(microDiff)
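The accumulation described above can be sketched in base R. The example assumes that absolute abundance differences are summed per comparison into an eubiotic (EI) and a dysbiotic (DI) column; whether contingencyCount transforms the values in other ways is not stated here, so treat this as an illustration with invented data rather than the package's implementation.

```r
# Toy abundance differences (rows: genera, columns: comparisons); invented values.
x <- matrix(c( 2, -1,
              -3,  4,
               1,  2,
              -2, -1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("genus", 1:4), c("A-C", "B-C")))
micro.anno <- c("harmless", "harmful", "harmful", "harmless")

# +1 where a change counts as eubiotic, -1 where dysbiotic, 0 for no change.
# The annotation vector is recycled down the columns, i.e. one value per row.
direction <- sign(x) * ifelse(micro.anno == "harmless", 1, -1)

toy_count <- cbind(EI = colSums(abs(x) * (direction > 0)),
                   DI = colSums(abs(x) * (direction < 0)))
print(toy_count)
```

Each row of the result corresponds to one condition comparison, which matches the m by 2 layout expected by contingencyTest.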
Performs Chi-squared test or Fisher's exact test for testing the significance of association between conditions and eubiotic/dysbiotic impacts in a contingency table.
contingencyTest(microCount, chisq = TRUE, fisher = TRUE, alternative = c("greater"))
microCount |
an m by 2 data frame or numeric matrix holding a contingency table with the frequencies under each condition-impact term; it can be produced by contingencyCount. |
chisq, fisher |
logical, indicating if the Chi-squared test or Fisher's exact test should be performed. |
alternative |
the alternative hypothesis; only used when fisher is TRUE. |
The Chi-squared test assesses whether the proportions of eubiotic frequencies differ between two conditions; Fisher's exact test further assesses whether one condition is more likely to be associated with a eubiotic impact. For more details, refer to chisq.test and fisher.test.
A list with the following components:
Chisq: Chi-squared test results for each pair-wise condition.
Chisq.p: the p-values of the Chi-squared tests for all pair-wise conditions.
Fisher: Fisher's exact test results for each pair-wise condition.
Fisher.p: the p-values of the Fisher's exact tests for all pair-wise conditions.
contingencyCount, fisher.test, chisq.test
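For a single pair of conditions, the two tests reduce to the base R functions chisq.test and fisher.test applied to a 2 by 2 table. The sketch below uses invented counts and only the stats package; it illustrates the underlying tests and is not the code of contingencyTest.

```r
# Toy 2 x 2 contingency table (invented counts).
# Rows: condition comparisons; columns: eubiotic (EI) and dysbiotic (DI) impact.
toy <- matrix(c(30, 10,
                12, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("A-C", "B-C"), c("EI", "DI")))

# Do the EI/DI proportions differ between the two rows?
chi <- chisq.test(toy)

# One-sided Fisher's exact test: is the odds ratio greater than 1,
# i.e. is the first row more strongly associated with EI than the second?
fis <- fisher.test(toy, alternative = "greater")

print(c(chisq.p = chi$p.value, fisher.p = fis$p.value))
```

With alternative = "greater", the row order matters: swapping the two rows tests the opposite direction.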
data(microCount)
test = contingencyTest(microCount, chisq = TRUE, fisher = TRUE, alternative = "greater")
chisq.p = test[["Chisq.p"]]
fisher.p = test[["Fisher.p"]]
Differential microbes in Genus-Species table
A data frame containing 10 differential genera and the species they include, to be annotated as "harmful" or "harmless".
data(diffGenera)
A data frame with 26 rows and 2 columns: Genus and Species.
Manually curated genera annotation table
A data frame containing 260 genera annotated as "harmful" and the harmful species included in these genera.
data(harmGenera)
A data frame with 900 rows and 3 columns: Genus, Species and the references.
Annotates the given genera as harmful or harmless based on either the manually curated harmful Genus-Species table in the harmGenera data of this package or a user-defined table.
microAnnotate(microbe, species = TRUE, annotated.micro = NULL)
microbe |
a genus list to be annotated; for more accurate annotation, it is recommended to provide a Genus-Species data frame listing the genera and the species they include, see tableSpecies. |
species |
logical, specifying if the species are provided in microbe. |
annotated.micro |
the annotated genera used to annotate microbe. |
The annotated genera.
#load the genera to be annotated
library(eudysbiome)
data(diffGenera)
#load the curated Genus-Species annotation table
data(harmGenera)
microAnnotate(microbe = diffGenera, species = TRUE, annotated.micro = harmGenera)
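The idea of annotating against a reference table can be sketched as a simple lookup in base R. The tables below use placeholder names, and the matching rule (a genus is flagged when one of its listed species appears in the reference) is an assumption made for illustration; the rules applied by microAnnotate may differ.

```r
# Hypothetical reference table of harmful genus/species pairs (placeholder names).
harm_ref <- data.frame(Genus   = c("GenusA", "GenusA", "GenusB"),
                       Species = c("GenusA alpha", "GenusA beta", "GenusB gamma"),
                       stringsAsFactors = FALSE)

# Hypothetical genera to annotate, with the species observed in each.
query <- data.frame(Genus   = c("GenusA", "GenusB", "GenusC"),
                    Species = c("GenusA alpha", "GenusB delta", "GenusC epsilon"),
                    stringsAsFactors = FALSE)

# Genus-level match: the genus appears anywhere in the reference.
genus_hit <- query$Genus %in% harm_ref$Genus

# Species-level match: the exact genus/species pair appears in the reference.
pair_hit <- paste(query$Genus, query$Species) %in%
            paste(harm_ref$Genus, harm_ref$Species)

print(data.frame(query, genus_hit, pair_hit))
```

The difference between the two columns for GenusB shows why supplying species can give a more specific annotation than matching on genus alone.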
A matrix containing the counts of differential microbes classified into each condition-eubiotic/dysbiotic impact couple. Rows represent the condition comparisons, columns represent the eubiotic and dysbiotic impacts.
data(microCount)
A data frame with 2 rows and 2 variables:
EI: eubiotic impact
DI: dysbiotic impact
The table can be produced by the contingencyCount function.
A list containing: i) a data frame of 10 differential genera with abundance differences among 3 condition comparisons, in which rows represent the differential microbes and columns represent the comparisons; ii) genera annotations for the 10 differential genera; iii) pre-defined condition comparison names.
data(microDiff)
A list
This function extracts only the Genus-Species data from the assigned taxonomic paths output by assignTax and constructs a table containing the classified genera and the species included in each of them.
tableSpecies(tax.file, microbe)
tax.file |
a taxonomy file with SSU rRNA sequence names and assigned taxonomic paths, see the "*.taxonomy" file output by assignTax. |
microbe |
a character vector specifying the genera used to construct the Genus-Species data frame |
The resulting Genus-Species table can be used as input to microAnnotate for more accurate genus annotation, which annotates genera as harmful or harmless based on their ability to contribute to diseases of mammalian hosts.
A Genus-Species data frame containing only the genera specified by microbe and the species they include.
#a table with the "Lactobacillus" and "Bacteroides" genera and the included species
genera = c("Lactobacillus","Bacteroides")
#not executed
#tableSpecies(tax.file = "test.taxExtract.wang.taxonomy", microbe = genera)
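To make the extraction step concrete, the sketch below parses taxonomy lines in base R. It assumes a layout in which each line holds a sequence name, a tab, and a semicolon-separated path with a confidence score in parentheses after each rank, with genus and species in the sixth and seventh positions. The lines and names are invented, real files may be laid out differently, and this is not the code of tableSpecies.

```r
# Invented taxonomy lines in the assumed layout: name <TAB> rank(conf);rank(conf);...
tax_lines <- c(
  "seq1\tBacteria(100);PhylumX(100);ClassX(99);OrderX(98);FamilyX(95);GenusA(90);GenusA alpha(85);",
  "seq2\tBacteria(100);PhylumY(100);ClassY(99);OrderY(97);FamilyY(96);GenusB(92);GenusB beta(81);",
  "seq3\tBacteria(100);PhylumX(100);ClassX(99);OrderX(98);FamilyX(95);GenusA(91);GenusA alpha(88);"
)

# Split off the sequence name, then split the path into ranks.
paths <- vapply(strsplit(tax_lines, "\t", fixed = TRUE), `[`, character(1), 2)
ranks <- strsplit(paths, ";", fixed = TRUE)

# Drop the trailing confidence score, e.g. "GenusA(90)" becomes "GenusA".
ranks <- lapply(ranks, function(r) sub("\\([0-9]+\\)$", "", r))

# Assumption: genus is the 6th rank and species the 7th.
tab <- data.frame(Genus   = vapply(ranks, `[`, character(1), 6),
                  Species = vapply(ranks, `[`, character(1), 7),
                  stringsAsFactors = FALSE)

# Keep only the requested genera and remove duplicate Genus-Species pairs.
genera <- c("GenusA")
tab <- unique(tab[tab$Genus %in% genera, ])
print(tab)
```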