Package: dada2 1.35.0

Benjamin Callahan

dada2: Accurate, high-resolution sample inference from amplicon sequencing data

The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes demultiplexed fastq files as input and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment of 16S rRNA gene fragments is available by exact matching.
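The pipeline described above can be sketched for a single paired-end sample roughly as follows. This is a minimal illustration, not a recommended configuration: it assumes the small example fastq files that ship in the package's extdata directory (sam1F.fastq.gz and sam1R.fastq.gz), and the trimming and filtering parameters are placeholder values that would need tuning for real data.

```r
library(dada2)

# Example paired-end reads bundled with the package (assumed file names)
fnF <- system.file("extdata", "sam1F.fastq.gz", package = "dada2")
fnR <- system.file("extdata", "sam1R.fastq.gz", package = "dada2")

# Filtered reads are written to temporary files
filtF <- tempfile(fileext = ".fastq.gz")
filtR <- tempfile(fileext = ".fastq.gz")

# 1. Quality filter and trim (illustrative parameter values)
filterAndTrim(fwd = fnF, filt = filtF, rev = fnR, filt.rev = filtR,
              trimLeft = 10, truncLen = c(240, 200),
              maxN = 0, maxEE = 2, compress = TRUE)

# 2. Dereplicate the filtered reads
derepF <- derepFastq(filtF)
derepR <- derepFastq(filtR)

# 3. Learn error rates, then infer sequence variants
errF <- learnErrors(derepF, multithread = FALSE)
errR <- learnErrors(derepR, multithread = FALSE)
dadaF <- dada(derepF, err = errF, multithread = FALSE)
dadaR <- dada(derepR, err = errR, multithread = FALSE)

# 4. Merge forward and reverse reads
merged <- mergePairs(dadaF, derepF, dadaR, derepR)

# 5. Build the sample-by-sequence table and remove chimeras
seqtab <- makeSequenceTable(list(sam1 = merged))
seqtab.nochim <- removeBimeraDenovo(seqtab, multithread = FALSE)

dim(seqtab.nochim)  # rows are samples, columns are sequence variants
```

With several samples, the same steps would typically be applied to vectors of file names, and the per-sample results passed to makeSequenceTable as a named list.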

Authors: Benjamin Callahan <[email protected]>, Paul McMurdie, Susan Holmes

dada2_1.35.0.tar.gz
dada2_1.35.0.zip (r-4.5), dada2_1.35.0.zip (r-4.4), dada2_1.35.0.zip (r-4.3)
dada2_1.35.0.tgz (r-4.4-x86_64), dada2_1.35.0.tgz (r-4.4-arm64), dada2_1.35.0.tgz (r-4.3-x86_64), dada2_1.35.0.tgz (r-4.3-arm64)
dada2_1.35.0.tar.gz (r-4.5-noble), dada2_1.35.0.tar.gz (r-4.4-noble)
dada2_1.35.0.tgz (r-4.4-emscripten), dada2_1.35.0.tgz (r-4.3-emscripten)
dada2.pdf | dada2.html
dada2/json (API)
NEWS

# Install 'dada2' in R:
install.packages('dada2', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/benjjneb/dada2/issues

Uses libs:
  • c++: GNU Standard C++ Library v3

On Bioconductor: dada2-1.35.0 (bioc 3.21), dada2-1.34.0 (bioc 3.20)

immunooncology, microbiome, sequencing, classification, metagenomics, amplicon, bioconductor, bioinformatics, metabarcoding, taxonomy, cpp

13.06 score, 473 stars, 4 packages, 2.2k scripts, 4.8k downloads, 274 mentions, 37 exports, 83 dependencies

Last updated 2 months ago from: 3d68997ee4. Checks: OK: 1, WARNING: 5, NOTE: 3. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 29 2024
R-4.5-win-x86_64 | WARNING | Nov 29 2024
R-4.5-linux-x86_64 | WARNING | Nov 29 2024
R-4.4-win-x86_64 | WARNING | Nov 29 2024
R-4.4-mac-x86_64 | WARNING | Nov 29 2024
R-4.4-mac-aarch64 | WARNING | Nov 29 2024
R-4.3-win-x86_64 | NOTE | Nov 29 2024
R-4.3-mac-x86_64 | NOTE | Nov 29 2024
R-4.3-mac-aarch64 | NOTE | Nov 29 2024

Exports: addSpecies, assignSpecies, assignTaxonomy, collapseNoMismatch, dada, derepFastq, fastqFilter, fastqPairedFilter, filterAndTrim, getDadaOpt, getErrors, getSequences, getUniques, inflateErr, isBimera, isBimeraDenovo, isBimeraDenovoTable, isPhiX, isShiftDenovo, learnErrors, loessErrfun, makeSequenceTable, mergePairs, mergeSequenceTables, noqualErrfun, nwalign, nwhamming, PacBioErrfun, plotComplexity, plotErrors, plotQualityProfile, rc, removeBimeraDenovo, removePrimers, seqComplexity, setDadaOpt, uniquesToFasta
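Some of the smaller exported helpers can be tried without any sequencing data. The sketch below uses two made-up DNA strings (s1 and s2 are illustrative values, not taken from the package) to show reverse complementing and Needleman-Wunsch alignment with default settings.

```r
library(dada2)

# Two made-up sequences that differ by a single substitution
s1 <- "ACGTTGCATGCAAGTC"
s2 <- "ACGTTGGATGCAAGTC"

# Reverse complement of a DNA string
rc("AACG")

# Needleman-Wunsch alignment: returns the two aligned strings
al <- nwalign(s1, s2)
al

# Hamming distance between the sequences after alignment
nwhamming(s1, s2)
```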

Dependencies: abind, askpass, BH, Biobase, BiocGenerics, BiocParallel, Biostrings, bitops, cli, codetools, colorspace, cpp11, crayon, curl, DelayedArray, deldir, fansi, farver, formatR, futile.logger, futile.options, generics, GenomeInfoDb, GenomeInfoDbData, GenomicAlignments, GenomicRanges, ggplot2, glue, gtable, httr, hwriter, interp, IRanges, isoband, jpeg, jsonlite, labeling, lambda.r, lattice, latticeExtra, lifecycle, magrittr, MASS, Matrix, MatrixGenerics, matrixStats, mgcv, mime, munsell, nlme, openssl, pillar, pkgconfig, plyr, png, pwalign, R6, RColorBrewer, Rcpp, RcppEigen, RcppParallel, reshape2, Rhtslib, rlang, Rsamtools, S4Arrays, S4Vectors, scales, ShortRead, snow, SparseArray, stringi, stringr, SummarizedExperiment, sys, tibble, UCSC.utils, utf8, vctrs, viridisLite, withr, XVector, zlibbioc

Introduction to dada2

Rendered from dada2-intro.Rmd using knitr::rmarkdown on Nov 29 2024.

Last update: 2018-10-10
Started: 2016-02-23

Readme and manuals

Help Manual

Help page | Topics
DADA2 package | dada2-package
Add species-level annotation to a taxonomic table. | addSpecies
Taxonomic assignment to the species level by exact matching. | assignSpecies
Classifies sequences against reference training dataset. | assignTaxonomy
Change concatenation of dada-class objects to list construction. | c,dada-method
Change concatenation of derep-class objects to list construction. | c,derep-method
Combine together sequences that are identical up to shifts and/or length. | collapseNoMismatch
High resolution sample inference from amplicon data. | dada
The object class returned by 'dada' | dada-class
A class representing dereplicated sequences | derep-class
Read in and dereplicate a fastq file. | derepFastq
An empirical error matrix. | errBalancedF
An empirical error matrix. | errBalancedR
Filter and trim a fastq file. | fastqFilter
Filters and trims paired forward and reverse fastq files. | fastqPairedFilter
Filter and trim fastq file(s). | filterAndTrim
Get DADA options | getDadaOpt
Extract already computed error rates. | getErrors
Get vector of sequences from input object. | getSequences
Get the uniques-vector from the input object. | getUniques
Inflates an error rate matrix by a specified factor, while accounting for saturation. | inflateErr
Determine if input sequence is a bimera of putative parent sequences. | isBimera
Identify bimeras from collections of unique sequences. | isBimeraDenovo
Identify bimeras in a sequence table. | isBimeraDenovoTable
Determine if input sequence(s) match the phiX genome. | isPhiX
Identify sequences that are identical to a more abundant sequence up to an overall shift. | isShiftDenovo
Learns the error rates from an input list, or vector, of file names or a list of 'derep-class' objects. | learnErrors
Use a loess fit to estimate error rates from transition counts. | loessErrfun
Construct a sample-by-sequence observation matrix. | makeSequenceTable
Merge denoised forward and reverse reads. | mergePairs
Merge two or more sample-by-sequence observation matrices. | mergeSequenceTables
Deactivate renaming of dada-class objects. | names<-,dada,ANY-method
Deactivate renaming of derep-class objects. | names<-,derep,ANY-method
Estimate error rates for each type of transition while ignoring quality scores. | noqualErrfun
Needleman-Wunsch alignment. | nwalign
Hamming distance after Needleman-Wunsch alignment. | nwhamming
Estimate error rates from transition counts in PacBio CCS data. | PacBioErrfun
Plot sequence complexity profile of a fastq file. | plotComplexity
Plot observed and estimated error rates. | plotErrors
Plot quality profile of a fastq file. | plotQualityProfile
Reverse complement DNA sequences. | rc
Remove bimeras from collections of unique sequences. | removeBimeraDenovo
Removes primers and orients reads in a consistent direction. | removePrimers
Determine if input sequence(s) are low complexity. | seqComplexity
Set DADA options | setDadaOpt
Method extensions to show for dada2 objects. | show,dada-method show,derep-method
An empirical error matrix. | tperr1
The named integer vector format used to represent collections of unique DNA sequences. | uniques-vector
Write a uniques vector to a FASTA file | uniquesToFasta
Writes a named character vector of DNA sequences to a fasta file. Values are the sequences, and names are used for the id lines. | writeFasta,character-method
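
The taxonomy-related help pages listed above (assignTaxonomy, assignSpecies, addSpecies) can be combined along the following lines. This is a hedged sketch that assumes the small example files shipped in the package's extdata directory (example_seqs.fa, example_train_set.fa.gz and example_species_assignment.fa.gz); a real analysis would substitute a full reference database.

```r
library(dada2)

# Example query sequences and reference files (assumed bundled file names)
seqs <- getSequences(system.file("extdata", "example_seqs.fa",
                                 package = "dada2"))
training_fasta <- system.file("extdata", "example_train_set.fa.gz",
                              package = "dada2")
species_fasta <- system.file("extdata", "example_species_assignment.fa.gz",
                             package = "dada2")

# Classify each sequence with the naive Bayesian classifier
taxa <- assignTaxonomy(seqs, training_fasta)

# Add species-level annotation by exact matching
taxa.spec <- addSpecies(taxa, species_fasta)

# One row per input sequence, one column per taxonomic level
unname(taxa.spec)
```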