Package: doubletrouble 1.7.0

Fabrício Almeida-Silva

doubletrouble: Identification and classification of duplicated genes

doubletrouble identifies duplicated genes from whole-genome protein sequences and classifies them by mode of duplication. The duplication modes are: i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD); and v. dispersed duplication (DD). Transposed duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). For a simpler classification scheme, duplicates can instead be classified into SD-derived and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates for duplicate pairs (Ka, nonsynonymous substitutions per nonsynonymous site, and Ks, synonymous substitutions per synonymous site), find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.

Authors: Fabrício Almeida-Silva [aut, cre], Yves Van de Peer [aut]

doubletrouble_1.7.0.tar.gz
doubletrouble_1.7.0.zip (r-4.5), doubletrouble_1.7.0.zip (r-4.4), doubletrouble_1.5.4.zip (r-4.3)
doubletrouble_1.7.0.tgz (r-4.4-any), doubletrouble_1.5.4.tgz (r-4.3-any)
doubletrouble_1.7.0.tar.gz (r-4.5-noble), doubletrouble_1.7.0.tar.gz (r-4.4-noble)
doubletrouble_1.7.0.tgz (r-4.4-emscripten), doubletrouble_1.5.4.tgz (r-4.3-emscripten)
doubletrouble.pdf | doubletrouble.html
doubletrouble/json (API)
NEWS

# Install 'doubletrouble' in R:
install.packages('doubletrouble', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))
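
Once the package is installed, the classification workflow described above can be sketched with the bundled example datasets. This is a minimal sketch under stated assumptions, not the authors' documented workflow: it assumes that `syntenet::process_input()` (from the syntenet dependency) is the way to prepare sequences and annotation, that `classify_gene_pairs()` accepts `annotation`, `blast_list`, and `scheme` arguments, that `classify_genes()` takes the resulting list of gene pairs, and that the S. cerevisiae list element is named `"Scerevisiae"`. Check the help pages for the exact signatures before relying on it.

```r
library(doubletrouble)

# Example data shipped with the package (see the Datasets list)
data(yeast_seq)      # protein sequences of S. cerevisiae and C. glabrata
data(yeast_annot)    # genome annotation of the same two species
data(diamond_intra)  # intraspecies DIAMOND output for S. cerevisiae

# Assumption: syntenet::process_input() harmonizes sequences and annotation
pdata <- syntenet::process_input(yeast_seq, yeast_annot)

# Assumption: the S. cerevisiae element is named "Scerevisiae"
annot <- pdata$annotation["Scerevisiae"]

# Simpler (binary) scheme: SD-derived vs. SSD-derived gene pairs
# Assumption: argument names `annotation`, `blast_list`, and `scheme`
pairs_binary <- classify_gene_pairs(
    annotation = annot,
    blast_list = diamond_intra,
    scheme = "binary"
)

# Assign each gene a unique mode of duplication
genes_binary <- classify_genes(pairs_binary)

# Inspect the first classified pairs and genes
head(pairs_binary[[1]])
head(genes_binary[[1]])
```

The binary scheme is used here because it only needs intraspecies similarity search results; the finer-grained schemes need additional inputs such as interspecies DIAMOND output (`diamond_inter`).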


Bug tracker: https://github.com/almeidasilvaf/doubletrouble/issues

Datasets:
  • cds_scerevisiae - Coding sequences (CDS) of S. cerevisiae
  • diamond_inter - Interspecies DIAMOND output for yeast species
  • diamond_intra - Intraspecies DIAMOND output for S. cerevisiae
  • fungi_kaks - Duplicate pairs and Ka, Ks, and Ka/Ks values for fungi species
  • gmax_ks - Duplicate pairs and Ks values for Glycine max
  • yeast_annot - Genome annotation of the yeast species S. cerevisiae and C. glabrata
  • yeast_seq - Protein sequences of the yeast species S. cerevisiae and C. glabrata

On Bioconductor: doubletrouble-1.7.0 (bioc 3.21), doubletrouble-1.6.0 (bioc 3.20)

software, wholegenome, comparativegenomics, functionalgenomics, phylogenetics, network, classification, bioinformatics, comparative-genomics, gene-duplication, molecular-evolution, whole-genome-duplication

6.23 score 11 stars 17 scripts 175 downloads 16 exports 134 dependencies

Last updated 2 months ago from: 4db21ce076. Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 29 2024
R-4.5-win | OK | Nov 29 2024
R-4.5-linux | OK | Nov 29 2024
R-4.4-win | OK | Nov 29 2024
R-4.4-mac | OK | Nov 29 2024
R-4.3-win | OK | Oct 08 2024
R-4.3-mac | OK | Oct 08 2024

Exports: classify_gene_pairs, classify_genes, duplicates2counts, find_ks_peaks, get_anchors_list, get_intron_counts, get_segmental, get_tandem_proximal, get_transposed, get_transposed_classes, pairs2kaks, plot_duplicate_freqs, plot_ks_distro, plot_ks_peaks, plot_rates_by_species, split_pairs_by_peak

Dependencies: abind, ade4, AnnotationDbi, ape, askpass, BH, Biobase, BiocGenerics, BiocIO, BiocParallel, Biostrings, bit, bit64, bitops, blob, brio, cachem, callr, cli, coda, codetools, colorspace, cpp11, crayon, curl, DBI, DelayedArray, desc, diffobj, digest, doParallel, dplyr, evaluate, fansi, farver, fastmap, foreach, formatR, fs, futile.logger, futile.options, generics, GenomeInfoDb, GenomeInfoDbData, GenomicAlignments, GenomicFeatures, GenomicRanges, ggnetwork, ggplot2, ggrepel, glue, gtable, httr, igraph, intergraph, IRanges, isoband, iterators, jsonlite, KEGGREST, labeling, lambda.r, lattice, lifecycle, magrittr, MASS, Matrix, MatrixGenerics, matrixStats, mclust, memoise, mgcv, mime, MSA2dist, munsell, network, nlme, openssl, pheatmap, pillar, pixmap, pkgbuild, pkgconfig, pkgload, plogr, png, praise, processx, ps, purrr, pwalign, R6, RColorBrewer, Rcpp, RcppArmadillo, RcppThread, RCurl, restfulr, Rhtslib, rjson, rlang, rprojroot, Rsamtools, RSQLite, rtracklayer, S4Arrays, S4Vectors, scales, segmented, seqinr, sna, snow, sp, SparseArray, statnet.common, stringi, stringr, SummarizedExperiment, syntenet, sys, testthat, tibble, tidyr, tidyselect, UCSC.utils, utf8, vctrs, viridisLite, waldo, withr, XML, XVector, yaml, zlibbioc

Identification and classification of duplicated genes

Rendered from doubletrouble_vignette.Rmd using knitr::rmarkdown on Nov 29 2024.

Last update: 2024-10-07
Started: 2022-10-04

Readme and manuals

Help Manual

Help page | Topics
Coding sequences (CDS) of S. cerevisiae | cds_scerevisiae
Classify duplicate gene pairs based on their modes of duplication | classify_gene_pairs
Classify genes into unique modes of duplication | classify_genes
Interspecies DIAMOND output for yeast species | diamond_inter
Intraspecies DIAMOND output for S. cerevisiae | diamond_intra
Get a duplicate count matrix for each genome | duplicates2counts
Find peaks in a Ks distribution with Gaussian Mixture Models | find_ks_peaks
Duplicate pairs and Ka, Ks, and Ka/Ks values for fungi species | fungi_kaks
Get a list of anchor pairs for each species | get_anchors_list
Get a data frame of intron counts per gene | get_intron_counts
Classify gene pairs derived from segmental duplications | get_segmental
Classify gene pairs derived from tandem and proximal duplications | get_tandem_proximal
Classify gene pairs originating from transposon-derived duplications | get_transposed
Classify TRD genes as derived from either DNA transposons or retrotransposons | get_transposed_classes
Duplicate pairs and Ks values for Glycine max | gmax_ks
Calculate Ka, Ks, and Ka/Ks from duplicate gene pairs | pairs2kaks
Plot frequency of duplicates per mode for each species | plot_duplicate_freqs
Plot distribution of synonymous substitution rates (Ks) | plot_ks_distro
Plot histogram of Ks distribution with peaks | plot_ks_peaks
Plot distributions of substitution rates (Ka, Ks, or Ka/Ks) per species | plot_rates_by_species
Split gene pairs based on their Ks peaks | split_pairs_by_peak
Genome annotation of the yeast species S. cerevisiae and C. glabrata | yeast_annot
Protein sequences of the yeast species S. cerevisiae and C. glabrata | yeast_seq