Package: dada2 1.35.0

Benjamin Callahan

dada2: Accurate, high-resolution sample inference from amplicon sequencing data

The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.

Authors: Benjamin Callahan, Paul McMurdie, Susan Holmes

dada2_1.35.0.tar.gz
dada2_1.35.0.zip (r-4.5), dada2_1.35.0.zip (r-4.4), dada2_1.35.0.zip (r-4.3)
dada2_1.35.0.tgz (r-4.5-x86_64), dada2_1.35.0.tgz (r-4.5-arm64), dada2_1.35.0.tgz (r-4.4-x86_64), dada2_1.35.0.tgz (r-4.4-arm64), dada2_1.35.0.tgz (r-4.3-x86_64), dada2_1.35.0.tgz (r-4.3-arm64)
dada2_1.35.0.tar.gz (r-4.5-noble), dada2_1.35.0.tar.gz (r-4.4-noble)
dada2_1.35.0.tgz (r-4.4-emscripten), dada2_1.35.0.tgz (r-4.3-emscripten)
dada2.pdf | dada2.html
dada2/json (API)
NEWS

# Install 'dada2' in R:

install.packages('dada2', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))
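
Once the package is installed, the pipeline summarized in the description above (filter, learn error rates, denoise, merge pairs, build the sequence table, remove chimeras) can be sketched roughly as follows. This is a minimal illustration modeled on the package's introductory vignette, not a recommended analysis: it assumes the small example fastq files shipped with dada2 under extdata (sam1F.fastq.gz and sam1R.fastq.gz), and the trimming and filtering values are placeholders that would need tuning for real data.

```r
library(dada2)

# Assumption: example paired-end reads bundled with the package.
fnF <- system.file("extdata", "sam1F.fastq.gz", package = "dada2")
fnR <- system.file("extdata", "sam1R.fastq.gz", package = "dada2")

# Filtered reads are written to temporary files.
filtF <- tempfile(fileext = ".fastq.gz")
filtR <- tempfile(fileext = ".fastq.gz")

# 1. Quality filter and trim (illustrative parameter values only).
filterAndTrim(fwd = fnF, filt = filtF, rev = fnR, filt.rev = filtR,
              trimLeft = 10, truncLen = c(240, 200),
              maxN = 0, maxEE = 2, compress = TRUE)

# 2. Dereplicate the filtered reads.
derepF <- derepFastq(filtF)
derepR <- derepFastq(filtR)

# 3. Learn the error rates from the data.
errF <- learnErrors(derepF, multithread = FALSE)
errR <- learnErrors(derepR, multithread = FALSE)

# 4. Infer sequence variants in each read direction.
dadaF <- dada(derepF, err = errF, multithread = FALSE)
dadaR <- dada(derepR, err = errR, multithread = FALSE)

# 5. Merge the denoised forward and reverse reads.
merged <- mergePairs(dadaF, derepF, dadaR, derepR)

# 6. Build the sample-by-sequence table, then remove chimeras (bimeras).
seqtab <- makeSequenceTable(list(sam1 = merged))
seqtab.nochim <- removeBimeraDenovo(seqtab, method = "consensus")
```

Taxonomic classification with assignTaxonomy would follow from the resulting table, but it requires a reference training FASTA file, which is omitted here.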

Bug tracker: https://github.com/benjjneb/dada2/issues

Uses libs:

c++ - GNU Standard C++ Library v3

Datasets:

errBalancedF - An empirical error matrix.
errBalancedR - An empirical error matrix.
tperr1 - An empirical error matrix.

On Bioconductor: dada2-1.35.0 (bioc 3.21), dada2-1.34.0 (bioc 3.20)

immunooncology microbiome sequencing classification metagenomics amplicon bioconductor bioinformatics metabarcoding taxonomy cpp

13.17 score, 487 stars, 4 packages, 3.0k scripts, 4.5k downloads, 274 mentions, 37 exports, 82 dependencies

Last updated 5 months ago from: 3d68997ee4. Checks: 1 OK, 8 WARNING, 3 NOTE. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 21 2025
R-4.5-win-x86_64	WARNING	Mar 21 2025
R-4.5-mac-x86_64	WARNING	Mar 21 2025
R-4.5-mac-aarch64	WARNING	Mar 21 2025
R-4.5-linux-x86_64	WARNING	Mar 21 2025
R-4.4-win-x86_64	WARNING	Mar 21 2025
R-4.4-mac-x86_64	WARNING	Mar 21 2025
R-4.4-mac-aarch64	WARNING	Mar 21 2025
R-4.4-linux-x86_64	WARNING	Mar 21 2025
R-4.3-win-x86_64	NOTE	Mar 21 2025
R-4.3-mac-x86_64	NOTE	Mar 21 2025
R-4.3-mac-aarch64	NOTE	Mar 21 2025

Exports: addSpecies assignSpecies assignTaxonomy collapseNoMismatch dada derepFastq fastqFilter fastqPairedFilter filterAndTrim getDadaOpt getErrors getSequences getUniques inflateErr isBimera isBimeraDenovo isBimeraDenovoTable isPhiX isShiftDenovo learnErrors loessErrfun makeSequenceTable mergePairs mergeSequenceTables noqualErrfun nwalign nwhamming PacBioErrfun plotComplexity plotErrors plotQualityProfile rc removeBimeraDenovo removePrimers seqComplexity setDadaOpt uniquesToFasta

Dependencies: abind askpass BH Biobase BiocGenerics BiocParallel Biostrings bitops cli codetools colorspace cpp11 crayon curl DelayedArray deldir fansi farver formatR futile.logger futile.options generics GenomeInfoDb GenomeInfoDbData GenomicAlignments GenomicRanges ggplot2 glue gtable httr hwriter interp IRanges isoband jpeg jsonlite labeling lambda.r lattice latticeExtra lifecycle magrittr MASS Matrix MatrixGenerics matrixStats mgcv mime munsell nlme openssl pillar pkgconfig plyr png pwalign R6 RColorBrewer Rcpp RcppEigen RcppParallel reshape2 Rhtslib rlang Rsamtools S4Arrays S4Vectors scales ShortRead snow SparseArray stringi stringr SummarizedExperiment sys tibble UCSC.utils utf8 vctrs viridisLite withr XVector

Introduction to dada2

Benjamin J Callahan, Joey McMurdie, Susan Holmes

Rendered from dada2-intro.Rmd using knitr::rmarkdown on Mar 21 2025.

Last update: 2018-10-10
Started: 2016-02-23

Help page	Topics
DADA2 package	dada2-package
Add species-level annotation to a taxonomic table.	addSpecies
Taxonomic assignment to the species level by exact matching.	assignSpecies
Classifies sequences against reference training dataset.	assignTaxonomy
Change concatenation of dada-class objects to list construction.	c,dada-method
Change concatenation of derep-class objects to list construction.	c,derep-method
Combine together sequences that are identical up to shifts and/or length.	collapseNoMismatch
High resolution sample inference from amplicon data.	dada
The object class returned by 'dada'	dada-class
A class representing dereplicated sequences	derep-class
Read in and dereplicate a fastq file.	derepFastq
An empirical error matrix.	errBalancedF
An empirical error matrix.	errBalancedR
Filter and trim a fastq file.	fastqFilter
Filters and trims paired forward and reverse fastq files.	fastqPairedFilter
Filter and trim fastq file(s).	filterAndTrim
Get DADA options	getDadaOpt
Extract already computed error rates.	getErrors
Get vector of sequences from input object.	getSequences
Get the uniques-vector from the input object.	getUniques
Inflates an error rate matrix by a specified factor, while accounting for saturation.	inflateErr
Determine if input sequence is a bimera of putative parent sequences.	isBimera
Identify bimeras from collections of unique sequences.	isBimeraDenovo
Identify bimeras in a sequence table.	isBimeraDenovoTable
Determine if input sequence(s) match the phiX genome.	isPhiX
Identify sequences that are identical to a more abundant sequence up to an overall shift.	isShiftDenovo
Learns the error rates from an input list, or vector, of file names or a list of 'derep-class' objects.	learnErrors
Use a loess fit to estimate error rates from transition counts.	loessErrfun
Construct a sample-by-sequence observation matrix.	makeSequenceTable
Merge denoised forward and reverse reads.	mergePairs
Merge two or more sample-by-sequence observation matrices.	mergeSequenceTables
Deactivate renaming of dada-class objects.	names<-,dada,ANY-method
Deactivate renaming of derep-class objects.	names<-,derep,ANY-method
Estimate error rates for each type of transition while ignoring quality scores.	noqualErrfun
Needleman-Wunsch alignment.	nwalign
Hamming distance after Needleman-Wunsch alignment.	nwhamming
Estimate error rates from transition counts in PacBio CCS data.	PacBioErrfun
Plot sequence complexity profile of a fastq file.	plotComplexity
Plot observed and estimated error rates.	plotErrors
Plot quality profile of a fastq file.	plotQualityProfile
Reverse complement DNA sequences.	rc
Remove bimeras from collections of unique sequences.	removeBimeraDenovo
Removes primers and orients reads in a consistent direction.	removePrimers
Determine if input sequence(s) are low complexity.	seqComplexity
Set DADA options	setDadaOpt
Method extensions to show for dada2 objects.	show,dada-method show,derep-method
An empirical error matrix.	tperr1
The named integer vector format used to represent collections of unique DNA sequences.	uniques-vector
Write a uniques vector to a FASTA file	uniquesToFasta
Writes a named character vector of DNA sequences to a fasta file. Values are the sequences, and names are used for the id lines.	writeFasta,character-method
