Title: | Provides structure and functions for the analysis of OTU data |
---|---|
Description: | Provides a platform for Operational Taxonomic Unit based analysis |
Authors: | Daniel Beck, Matt Settles, and James A. Foster |
Maintainer: | Daniel Beck <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.57.0 |
Built: | 2024-10-30 09:19:50 UTC |
Source: | https://github.com/bioc/OTUbase |
The OTUbase Base class for OTU data
Package: | OTUbase |
Type: | Package |
Version: | 0.1.0 |
Date: | 2010-04-05 |
License: | Artistic-2.0 |
LazyLoad: | yes |
OTUbase includes a number of OTUset type classes which provide structure for OTU-based data. These classes allow the user to store information that may be useful in the analysis of OTUs. Slots are provided for sequence and quality values, OTU classifications, sample identifications, and metadata associated with samples and OTUs. In addition, basic functions are provided for the analysis and visualization of the data. Classification data is also supported through the TAXset classes.
Authors: Daniel Beck - [email protected], Matt Settles - [email protected], and James Foster
Maintainer: Daniel Beck - [email protected]
An_introduction_to_OTUbase.pdf
This class provides a way to store and manipulate operational taxonomic unit data. ".OTUset" is inherited by "OTUsetQ", "OTUsetF", and "OTUsetB". The user will want to use "OTUsetQ" when quality data is available, "OTUsetF" when sequence data (without quality data) is available, and "OTUsetB" when only OTU and sample data are available.
OTUsetB includes slots id, sampleID, otuID, sampleData, assignmentData.
OTUsetF includes slots id, sampleID, otuID, sampleData, assignmentData, sread.
OTUsetQ includes slots id, sampleID, otuID, sampleData, assignmentData, sread, quality.
Methods include:
id: provides access to the id slot of object
sampleID: provides access to the sampleID slot of object
otuID: provides access to the otuID slot of object
sampleData: provides access to the sampleData slot of object
assignmentData: provides access to the assignmentData slot of object
sread: provides access to the sread slot of object
quality: provides access to the quality slot of object
seqnames: returns the first word of the id line. Intended to extract the sequence name from other sequence information.
nsamples: returns the number of samples in an OTUset object
notus: returns the number of OTUs in an OTUset object
show, signature(object=".OTUset"): provides a brief summary of the object, including its class, number of sequences, number of samples, and number of OTUs.
showClass(".OTUset")
showMethods(class=".OTUset")
showClass("OTUsetQ")
This class provides a way to store and manipulate read-classification data. ".TAXset" is inherited by "TAXsetQ", "TAXsetF", and "TAXsetB". The user will want to use "TAXsetQ" when quality data is available, "TAXsetF" when sequence data (without quality data) is available, and "TAXsetB" when only classification and sample data are available.
TAXsetB includes slots id, sampleID, tax, sampleData, assignmentData.
TAXsetF includes slots id, sampleID, tax, sampleData, assignmentData, sread.
TAXsetQ includes slots id, sampleID, tax, sampleData, assignmentData, sread, quality.
Methods include:
id: provides access to the id slot of object
sampleID: provides access to the sampleID slot of object
tax: provides access to the tax slot of object
sampleData: provides access to the sampleData slot of object
assignmentData: provides access to the assignmentData slot of object
sread: provides access to the sread slot of object
quality: provides access to the quality slot of object
seqnames: returns the first word of the id line. Intended to extract the sequence name from other sequence information.
nsamples: returns the number of samples in a TAXset object
show, signature(object=".TAXset"): provides a brief summary of the object, including its class, number of sequences, and number of samples.
showClass(".TAXset")
showMethods(class=".TAXset")
showClass("TAXsetQ")
abundance generates an abundance table. This table can be either weighted or unweighted.
abundance(object, ...)
object | An OTUset or a TAXset object |
... | Additional arguments. These depend on whether the object is an OTUset or a TAXset object. |
These are other arguments passed to abundance:
taxCol: When generating the abundance from a TAXset object, taxCol selects the column of the tax dataframe from which to calculate the abundance.
assignmentCol: When generating the abundance from an OTUset object, assignmentCol selects a column of the assignmentData dataframe to use when calculating abundance. This overrides the default of creating an abundance table of the OTUs and instead creates an abundance table of a column in the assignmentData dataframe.
sampleCol: Generates the abundance table using a column in the sampleData dataframe instead of the default of using the sampleID.
collab: An optional parameter that selects a column of the sampleData dataframe to use when labeling the columns of the abundance table.
weighted: By default this is FALSE. When set to TRUE, abundance returns proportional abundances.
The returned value will be a data.frame.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## calculate abundance
abundance(soginOTU, collab="Site")
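The difference between an unweighted and a weighted table can be illustrated without OTUbase. The following is a minimal base-R sketch, not the package's implementation: the `otu` and `sample` vectors are made-up per-read labels, and it assumes that "proportional" means the proportion of reads within each sample (column).

```r
# Hypothetical per-read data: one OTU label and one sample label per read.
otu    <- c("otu1", "otu1", "otu2", "otu2", "otu2", "otu3")
sample <- c("A",    "B",    "A",    "A",    "B",    "B")

# Unweighted table: raw read counts, OTUs as rows and samples as columns.
counts <- table(otu, sample)
unweighted <- as.data.frame.matrix(counts)

# Weighted table: each column is divided by its total, so every sample sums to 1.
# (Assumption: "proportional" means proportion within a sample.)
weighted <- as.data.frame.matrix(prop.table(counts, margin = 2))

print(unweighted)
print(weighted)
```

Both results are data.frames, matching the return type stated above.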
These functions provide access to some of the slots of OTUset and TAXset objects. otuID returns the otuID slot of OTUset objects. sampleID returns the sampleID slot of both OTUset and TAXset objects. tax and tax<- return and replace the tax slot of TAXset objects.
sampleID(object, ...)
otuID(object, ...)
tax(object, ...)
tax(object)<-value
object | An OTUset or a TAXset object |
value | The replacement value for |
... | Added for completeness. Enables the passing of arguments. |
sampleID and otuID return a character.
tax returns a data.frame.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## get the sampleID slot
sampleID(soginOTU)
## get the otuID slot
otuID(soginOTU)
These accessors access and replace the assignmentData slot of OTUbase objects. assignmentData is an AnnotatedDataFrame. assignmentData and assignmentData<- access and replace this AnnotatedDataFrame. assignmentLabels and assignmentLabels<- access and replace the labels of this AnnotatedDataFrame. aData and aData<- access and replace the dataframe component of the AnnotatedDataFrame.
assignmentNames returns the assignment names present in the assignmentData slot.
aData(object,...)
aData(object)<-value
assignmentData(object,...)
assignmentData(object)<-value
assignmentLabels(object,...)
assignmentLabels(object)<-value
assignmentNames(object,...)
object | An OTUset or a TAXset object |
value | The replacement value for |
... | Added for completeness. Enables the passing of arguments. |
aData returns a dataframe.
assignmentData returns an AnnotatedDataFrame.
assignmentLabels returns a character.
assignmentNames returns a character.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## get the aData dataframe
aData(soginOTU)
## get the assignmentData slot
assignmentData(soginOTU)
This function is a wrapper for the vegan function vegdist and for hclust. It allows the user to cluster samples using a number of different distance measures and clustering methods. Please see the documentation for vegdist and hclust for a more in-depth explanation.
clusterSamples(object, ...)
object | An OTUset or a TAXset object |
... | Additional arguments. These depend on whether the object is an OTUset or a TAXset object. |
These are other arguments passed to clusterSamples. For further information on specific arguments, please see abundance, vegdist, or hclust.
taxCol: Column of the tax slot dataframe on which to cluster (unique to TAXset objects). Passed to the abundance function.
assignmentCol: Column of the assignmentData dataframe used to classify sequences for clustering. This overrides the default of using the OTUs to cluster samples. This is passed to the abundance function.
collab: Specifies a column of the sampleData dataframe that will provide the sample labels for the cluster analysis. This is passed to the abundance function.
distmethod: The distance method to be used. This value is passed to the vegdist function. The default is the Bray-Curtis distance.
clustermethod: The clustering method to be used. This value is passed to the hclust function. The default is complete clustering.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## cluster samples
clusterSamples(soginOTU, collab="Site", distmethod="jaccard")
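The default pipeline described above (Bray-Curtis distance followed by complete-linkage clustering) can be sketched with base R alone. This is an illustration under stated assumptions, not the package's code: the `abund` matrix is made up, and the Bray-Curtis dissimilarity is computed by a small hypothetical helper (`bray_curtis`) rather than by vegan, so the block runs without vegan installed.

```r
# Hypothetical abundance matrix: OTUs as rows, samples as columns.
abund <- matrix(c(10, 0, 5,
                   8, 1, 6,
                   0, 9, 1),
                nrow = 3,
                dimnames = list(c("otu1", "otu2", "otu3"),
                                c("S1", "S2", "S3")))

# Bray-Curtis dissimilarity between two abundance vectors
# (hypothetical helper, standing in for a vegan call).
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Pairwise dissimilarities between samples (columns).
samples <- colnames(abund)
d <- matrix(0, length(samples), length(samples),
            dimnames = list(samples, samples))
for (i in samples) {
  for (j in samples) {
    d[i, j] <- bray_curtis(abund[, i], abund[, j])
  }
}

# Complete-linkage hierarchical clustering on the sample dissimilarities.
tree <- hclust(as.dist(d), method = "complete")
print(tree$merge)
```

In this toy data S1 and S2 have similar OTU profiles, so they are joined first; `plot(tree)` would draw the corresponding dendrogram.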
These are other functions available. Caution is advised when using them. Some are still in development and others only work on specific objects (OTUset or TAXset).
getOTUs(object, colnum, value, exact)
getSamples(object, colnum, value, exact)
o_diversity(object, ...)
o_estimateR(object, ...)
object | An OTUset or a TAXset object. |
colnum | The column of the |
value | The desired value. |
exact | If exact=T |
... | Other arguments. Often these are passed to |
getOTUs Returns OTU names that match given values in the assignmentData dataframe.
getSamples Returns sample names that match given values in the sampleData dataframe.
o_diversity Wrapper for vegan's diversity function.
o_estimateR Wrapper for vegan's estimateR function.
otuseqplot Plots the samples according to number of OTUs and number of sequences.
otusize Returns the size of each OTU.
otuspersample Lists the number of OTUs in each sample.
rseqplot Plots the samples by estimated richness and number of sequences.
seqspersample Returns the number of sequences in each sample.
sharedotus Returns the number of OTUs shared between samples.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
getSamples(soginOTU, colnum="Site", value="Labrador", exact=FALSE)
o_estimateR(soginOTU)
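The summary quantities listed above (OTU size, sequences per sample, OTUs per sample, shared OTUs) can be sketched with base R on made-up per-read labels. The definitions used here are assumptions for illustration and may differ in detail from what otusize, seqspersample, otuspersample, and sharedotus actually compute.

```r
# Hypothetical per-read data: one OTU label and one sample label per read.
otu    <- c("otu1", "otu1", "otu2", "otu2", "otu2", "otu3")
sample <- c("A",    "B",    "A",    "A",    "B",    "B")

# Number of reads in each OTU.
otu_size <- table(otu)

# Number of reads in each sample.
seqs_per_sample <- table(sample)

# Presence/absence matrix (1 if the OTU occurs in the sample, else 0).
counts  <- unclass(table(otu, sample))
present <- (counts > 0) * 1

# Number of distinct OTUs observed in each sample.
otus_per_sample <- colSums(present)

# Sample-by-sample matrix: off-diagonal entries count OTUs found in both samples.
shared <- crossprod(present)

print(otu_size)
print(seqs_per_sample)
print(otus_per_sample)
print(shared)
```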
Various functions. notus returns the number of OTUs in an OTUset object. nsamples returns the number of samples in either an OTUset or a TAXset object. seqnames returns the sequence names of the OTUset or TAXset object without the extra information commonly present with the id.
notus(object, ...)
nsamples(object, ...)
seqnames(object, ...)
object | An OTUset or a TAXset object. |
... | Other arguments. These are currently nonfunctional. |
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## get the number of OTUs
notus(soginOTU)
## get the number of samples
nsamples(soginOTU)
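The "first word of the id line" behaviour described for seqnames can be illustrated with a one-line base-R substitution. This is a sketch on made-up id strings, not the package's own code; it assumes the sequence name is everything before the first whitespace character.

```r
# Hypothetical id lines, as they might appear after the ">" in a FASTA header.
ids <- c("read001 length=250 sample=A", "read002 length=248", "read003")

# Drop everything from the first whitespace character onward.
first_word <- sub("\\s.*$", "", ids)

print(first_word)
```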
This function reads in data and creates an OTUset object
readOTUset(dirPath, otufile, level, fastafile, qualfile, samplefile, sampleADF, assignmentADF, sADF.names, aADF.names, rdp=F, otufiletype)
dirPath | The directory path where the data files are located. This is the current directory by default. |
otufile | The OTU file. The only format currently supported is the Mothur format for .list files. |
level | The OTU clustering level. By default this is 0.03. This level must correspond to levels present in the otufile. |
fastafile | The fasta file. This is read in by ShortRead. |
qualfile | The quality file. This is read in by ShortRead. |
samplefile | The sample file. Currently this must be in Mothur format (.groups). |
sampleADF | The sample metadata file. This is in AnnotatedDataFrame format. |
assignmentADF | The assignment metadata file (the OTU metadata). This is generally in AnnotatedDataFrame format although it is also possible to read in an RDP classification file if there is only one read classification for each cluster and |
sADF.names | The column of the sampleADF file that has the sample names. |
aADF.names | The column of the assignmentADF file that has the assignment names. |
rdp | By default this is FALSE. Change to TRUE if assignmentADF is an RDP classification file. The RDP file must be in the fixed format. |
otufiletype | The type of OTU file. Takes values "mothur", "cdhit", and "blastclust". Defaults to "mothur". |
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
soginOTU
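To show why level must match a level present in the otufile, here is a minimal base-R sketch that picks one level out of Mothur-style .list content. It is an illustration only and does not reproduce how readOTUset parses files. It assumes the headerless layout (one line per level, tab-separated: level label, number of OTUs, then one field per OTU holding comma-separated sequence names); `list_lines` and `parse_list_level` are hypothetical names.

```r
# Hypothetical contents of a Mothur-style .list file, one string per line.
list_lines <- c("unique\t3\tseqA\tseqB\tseqC",
                "0.03\t2\tseqA,seqB\tseqC")

# Return a data.frame mapping each sequence name to an OTU index at one level.
parse_list_level <- function(lines, level) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  labels <- vapply(fields, `[`, character(1), 1)
  idx <- match(level, labels)
  if (is.na(idx)) {
    stop("level '", level, "' not found; available levels: ",
         paste(labels, collapse = ", "))
  }
  # Drop the level label and the OTU count, then split each OTU's members.
  members <- strsplit(fields[[idx]][-(1:2)], ",", fixed = TRUE)
  data.frame(id = unlist(members),
             otuID = rep(seq_along(members), lengths(members)),
             stringsAsFactors = FALSE)
}

assignments <- parse_list_level(list_lines, "0.03")
print(assignments)
```

Requesting a level that is absent from the file (for example "0.05" here) stops with an error listing the available levels.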
Function to read in data and create a TAXset object
readTAXset(dirPath, taxfile, namefile, fastafile, qualfile, samplefile, sampleADF, assignmentADF, sADF.names, aADF.names, type, ...)
dirPath | The directory path where the data files are located. This is the current directory by default. |
taxfile | The classification file. The default format is RDP's fixed format. |
namefile | A names file in the Mothur format. This is used to add removed unique sequences back into the dataset. |
fastafile | The fasta file. This is read in by ShortRead. |
qualfile | The quality file. This is read in by ShortRead. |
samplefile | The sample file. Currently this must be in Mothur format (.groups). |
sampleADF | The sample metadata file. This is in AnnotatedDataFrame format. |
assignmentADF | The assignment metadata file (the OTU metadata). This is in AnnotatedDataFrame format. |
sADF.names | The column of the sampleADF file that has the sample names. |
aADF.names | The column of the assignmentADF file that has the assignment names. |
type | This is the type of taxfile. By default this is the RDP fixed format. However, if |
... | Additional arguments passed to |
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into TAXset object
soginTAX <- readTAXset(dirPath=dirPath, samplefile="sogin.groups", fastafile="sogin.fasta", taxfile="sogin.unique.fix.tax", namefile="sogin.names", sampleADF="sample_metadata.txt")
soginTAX
These functions access and replace the sampleData slot of OTUbase objects. sampleData and sampleData<- access and replace the AnnotatedDataFrame sampleData. sampleLabels and sampleLabels<- access and replace the labels of this AnnotatedDataFrame. sData and sData<- access and replace the dataframe component of the AnnotatedDataFrame.
sData(object,...)
sData(object)<-value
sampleData(object,...)
sampleData(object)<-value
sampleLabels(object,...)
sampleLabels(object)<-value
object | An OTUset or a TAXset object |
value | The replacement value for |
... | Added for completeness. Enables the passing of arguments. |
sData returns a dataframe.
sampleData returns an AnnotatedDataFrame.
sampleLabels returns a character.
assignmentNames returns a character.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## get the sData dataframe
sData(soginOTU)
## get the sampleData slot
sampleData(soginOTU)
Function to get a subset of an OTUset object.
subOTUset(object, samples, otus)
object | An OTUset object |
samples | A list of sample names |
otus | A list of OTU names |
subOTUset returns an OTUset.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into OTUset object
soginOTU <- readOTUset(dirPath=dirPath, level="0.03", samplefile="sogin.groups", fastafile="sogin.fasta", otufile="sogin.unique.filter.fn.list", sampleADF="sample_metadata.txt")
## get subset of soginOTU
subOTUset(soginOTU, samples=getSamples(soginOTU, colnum="Site", value="Labrador", exact=FALSE))
Function to get a subset of a TAXset object.
subTAXset(object, samples)
object | A TAXset object |
samples | A list of sample names |
subTAXset returns a TAXset.
## locate directory with data
dirPath <- system.file("extdata/Sogin_2006", package="OTUbase")
## read in data into TAXset object
soginTAX <- readTAXset(dirPath=dirPath, samplefile="sogin.groups", fastafile="sogin.fasta", taxfile="sogin.unique.fix.tax", namefile="sogin.names", sampleADF="sample_metadata.txt")
## get subset of soginTAX
subTAXset(soginTAX, samples=getSamples(soginTAX, colnum="Site", value="Labrador", exact=FALSE))