Package 'TrIdent'

Title: TrIdent - Transduction Identification
Description: The `TrIdent` R package automates the analysis of transductomics data by detecting, classifying, and characterizing read coverage patterns associated with potential transduction events. Transductomics is a DNA sequencing-based method for the detection and characterization of transduction events in pure cultures and complex communities. Transductomics relies on mapping sequencing reads from a viral-like particle (VLP)-fraction of a sample to contigs assembled from the metagenome (whole-community) of the same sample. Reads from bacterial DNA carried by VLPs will map back to the bacterial contigs of origin creating read coverage patterns indicative of ongoing transduction.
Authors: Jessie Maier [aut, cre] (ORCID: <https://orcid.org/0009-0001-8575-5386>), Yixuan Yang [aut, ctb] (ORCID: <https://orcid.org/0009-0003-5064-6512>), Jorden Rabasco [aut, ctb] (ORCID: <https://orcid.org/0000-0002-6971-6678>), Craig Gin [aut] (ORCID: <https://orcid.org/0000-0002-7447-663X>), Benjamin Callahan [aut] (ORCID: <https://orcid.org/0000-0002-8752-117X>), Manuel Kleiner [aut, ths] (ORCID: <https://orcid.org/0000-0001-6904-0287>)
Maintainer: Jessie Maier <[email protected]>
License: GPL-2
Version: 1.5.1
Built: 2026-05-31 15:24:35 UTC
Source: https://github.com/bioc/TrIdent


Plot read coverage graphs of contigs classified as Prophage-like, Sloping, or HighCovNoPattern

Description

Plot the read coverages of a contig and its associated pattern-match for Prophage-like, Sloping and HighCovNoPattern classifications. Returns a list of ggplot objects.

Usage

plotTrIdentResults(
  VLPpileup,
  WCpileup,
  TrIdentResults,
  onlyPlot,
  logScale = FALSE,
  saveFilesTo
)

Arguments

VLPpileup

VLP-fraction pileup file generated by mapping sequencing reads from a sample's ultra-purified VLP-fraction to the sample's whole-community metagenome assembly. The pileup file MUST have the following format:
* V1: Contig accession
* V2: Mapped read coverage values averaged over 100 bp windows
* V3: Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig.
* V4: Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.

WCpileup

Whole-community pileup file generated by mapping sequencing reads from a sample's whole community to the sample's whole-community metagenome assembly. The pileup file MUST have the following format:
* V1: Contig accession
* V2: Mapped read coverage values averaged over 100 bp windows
* V3: Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig.
* V4: Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.
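As an illustration of the four-column pileup layout described above, the following sketch builds a toy pileup in base R and checks it against that layout. The helper `checkPileupFormat()`, the contig names, and the coverage values are hypothetical and are not part of TrIdent; the checks only mirror the column descriptions given here.

```r
# Toy pileup with two contigs, coverage averaged over 100 bp windows.
# Contig names and coverage values are made up for illustration.
toyPileup <- data.frame(
  V1 = rep(c("NODE_1", "NODE_2"), times = c(4, 3)),  # contig accession
  V2 = c(12.5, 14.0, 0, 3.2, 40.1, 38.7, 41.3),       # mean coverage per window
  V3 = c(0, 100, 200, 300, 0, 100, 200),              # restarts at 0 per contig
  V4 = seq(0, 600, by = 100),                         # does not restart
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a TrIdent function): returns TRUE when the
# data frame matches the layout described above, FALSE otherwise.
checkPileupFormat <- function(pileup, step = 100) {
  if (!is.data.frame(pileup) || ncol(pileup) != 4) return(FALSE)
  names(pileup) <- c("V1", "V2", "V3", "V4")
  numericOk <- is.numeric(pileup$V2) && is.numeric(pileup$V3) &&
    is.numeric(pileup$V4)
  if (!numericOk) return(FALSE)
  # V3 must start at 0 within every contig and advance by `step`
  perContigOk <- vapply(
    split(pileup$V3, pileup$V1),
    function(pos) pos[1] == 0 && all(diff(pos) == step),
    logical(1)
  )
  # V4 must keep increasing across contig boundaries (no restart)
  runningOk <- all(diff(pileup$V4) > 0)
  all(perContigOk) && runningOk
}

checkPileupFormat(toyPileup)
```

A check of this kind can be run on both the VLP-fraction and whole-community pileups before plotting or classification.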

TrIdentResults

Output from 'TrIdentClassifier()'.

onlyPlot

Optional. Restricts plotting to contigs with one of the classifications "Prophage-like", "Sloping", or "HighCovNoPattern".

logScale

TRUE or FALSE, display VLP-fraction read coverage in log10 scale. Default is FALSE.

saveFilesTo

Optional. Path to the directory where output should be saved. A folder will be made within the provided directory to store results.

Value

A list of ggplot objects.

Examples

data("VLPFractionSamplePileup")
data("WholeCommunitySamplePileup")
data("TrIdentSampleOutput")

patternMatches <- plotTrIdentResults(
  VLPpileup = VLPFractionSamplePileup,
  WCpileup = WholeCommunitySamplePileup,
  TrIdentResults = TrIdentSampleOutput
)

Identify potential specialized transduction events on contigs classified as Prophage-like

Description

Search contigs classified as Prophage-like for dense read coverage outside of the pattern-match borders that may indicate specialized transduction. Returns a list whose first object is a summary table and whose second object is a list of plots showing the associated specialized transduction search results. A green plot indicates that the contig has been identified as having potential specialized transduction.

Usage

specializedTransductionID(
  VLPpileup,
  TrIdentResults,
  specificContig,
  noReadCov = 500,
  specTransLength = 2000,
  logScale = FALSE,
  verbose = TRUE,
  SaveFilesTo
)

Arguments

VLPpileup

VLP-fraction pileup file generated by mapping sequencing reads from a sample's ultra-purified VLP-fraction mapped to the sample's whole-community metagenome assembly. The pileup file MUST have the following format: * V1: Contig accession * V2: Mapped read coverage values averaged over 100 bp windows * V3: Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig. * V4: Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.

TrIdentResults

Output from 'TrIdentClassifier()'.

specificContig

Optional. Search a single contig classified as Prophage-like (e.g., "NODE_1").

noReadCov

Number of basepairs of zero read coverage encountered before specialized transduction searching stops. Default is 500. Must be at least 100.

specTransLength

Number of basepairs of non-zero read coverage needed for specialized transduction to be considered. Default is 2000. Must be at least 100.
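To make the interplay of 'noReadCov' and 'specTransLength' concrete, the sketch below walks outward from a pattern-match border over 100 bp windows. It is a simplified illustration under stated assumptions, not the search that 'specializedTransductionID()' performs: the helper `scanOutward()` is hypothetical, and counting all non-zero windows seen before the stopping gap is an assumption about how the two thresholds combine.

```r
# Hypothetical helper (not a TrIdent function). `cov` holds coverage values
# for consecutive 100 bp windows, ordered outward from a pattern-match border.
scanOutward <- function(cov, noReadCov = 500, specTransLength = 2000,
                        binSize = 100) {
  zeroRunBp <- 0  # bp of consecutive zero coverage in the current gap
  nonZeroBp <- 0  # bp of non-zero coverage seen before the search stops
  for (value in cov) {
    if (value == 0) {
      zeroRunBp <- zeroRunBp + binSize
      if (zeroRunBp >= noReadCov) break  # gap reached noReadCov: stop searching
    } else {
      zeroRunBp <- 0
      nonZeroBp <- nonZeroBp + binSize
    }
  }
  list(nonZeroBp = nonZeroBp, candidate = nonZeroBp >= specTransLength)
}

# 25 covered windows, then a run of zero-coverage windows, then more coverage
flank <- c(rep(8, 25), rep(0, 6), rep(5, 10))
scanOutward(flank)
```

In this toy case the search stops inside the zero-coverage run, so the coverage beyond the gap is never counted. Raising 'noReadCov' would let the search bridge longer gaps.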

logScale

TRUE or FALSE, display VLP-fraction read coverage in log10 scale. Default is FALSE.

verbose

TRUE or FALSE. Print progress messages to console. Default is TRUE.

SaveFilesTo

Provide a path to the directory you wish to save output to. 'specializedTransductionID()' will make a folder within the provided directory to store results.

Value

A list containing two objects: a summary table and a list of plots.

Examples

data("VLPFractionSamplePileup")
data("TrIdentSampleOutput")

specTransduction <- specializedTransductionID(
  VLPpileup = VLPFractionSamplePileup,
  TrIdentResults = TrIdentSampleOutput
)

specTransductionNODE62 <- specializedTransductionID(
  VLPpileup = VLPFractionSamplePileup,
  TrIdentResults = TrIdentSampleOutput,
  specificContig = "NODE_62"
)

Classify contigs as Prophage-like, Sloping, HighCovNoPattern, and NoPattern

Description

Performs all pattern-matching and summarizes the results in a list. The first item in the list is a table of summary information for all contigs that passed through pattern-matching (i.e., were not filtered out). The second item is a table of summary information for all contigs that were classified via pattern-matching. The third item contains the pattern-match information associated with each contig in the previous table. The fourth item is a table of the contigs that were filtered out prior to pattern-matching. The fifth item is the windowSize used for the search.

Usage

TrIdentClassifier(
  VLPpileup,
  WCpileup,
  windowSize = 1000,
  minBlockSize = 10000,
  maxBlockSize = Inf,
  minContigLength = 30000,
  minSlope = 0.001,
  minSlopeSize = 20000,
  minHCNPRatio = 2,
  VLPReads,
  WCReads,
  verbose = TRUE,
  searchMethod = "grid",
  DirectMaxEval = 100,
  SaveFilesTo
)

Arguments

VLPpileup

VLP-fraction pileup file generated by mapping sequencing reads from a sample's ultra-purified VLP-fraction to the sample's whole-community metagenome assembly. The pileup file MUST have the following format:
* V1: Contig accession
* V2: Mapped read coverage values averaged over 100 bp windows
* V3: Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig.
* V4: Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.

WCpileup

Whole-community pileup file generated by mapping sequencing reads from a sample's whole community to the sample's whole-community metagenome assembly. The pileup file MUST have the following format:
* V1: Contig accession
* V2: Mapped read coverage values averaged over 100 bp windows
* V3: Starting position (bp) of each 100 bp window. Restarts from 0 at the start of each new contig.
* V4: Starting position (bp) of each 100 bp window. Does NOT restart at the start of each new contig.

windowSize

The number of basepairs to average read coverage values over. Options are 100, 200, 500, 1000 ONLY. Default is 1000.
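The effect of 'windowSize' can be pictured as re-averaging the 100 bp pileup bins into larger windows. The sketch below does this for a single contig in base R. The helper `rebinCoverage()` is hypothetical and is not how TrIdent is known to implement the step; it only illustrates the averaging described above.

```r
# Hypothetical helper (not a TrIdent function): average 100 bp coverage
# values into larger windows for a single contig.
rebinCoverage <- function(cov100, windowSize = 1000) {
  stopifnot(windowSize %in% c(100, 200, 500, 1000))
  binsPerWindow <- windowSize / 100
  # Assign each 100 bp bin to the larger window that contains it
  windowIndex <- (seq_along(cov100) - 1) %/% binsPerWindow
  data.frame(
    position = unique(windowIndex) * windowSize,  # window start (bp)
    coverage = as.numeric(tapply(cov100, windowIndex, mean))
  )
}

# 25 bins of 100 bp each (2500 bp); the last window is only partly filled
cov100 <- c(rep(10, 10), rep(30, 10), rep(20, 5))
rebinCoverage(cov100, windowSize = 1000)
```

Larger windows smooth the coverage signal and reduce the number of values to search, at the cost of positional resolution.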

minBlockSize

The minimum size (in bp) of the Prophage-like block pattern. Default is 10000. Must be at least 1000.

maxBlockSize

The maximum size (in bp) of the Prophage-like block pattern. Default is Inf (no maximum).

minContigLength

The minimum contig size (in bp) to perform pattern-matching on. Must be at least 25000. Default is 30000.
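A length filter of this kind can be sketched directly on a pileup. The example below assumes a contig's length can be approximated from the V3 column as the start of its last 100 bp window plus 100; that approximation, the helper `contigsLongEnough()`, and the contig names are all hypothetical and not taken from TrIdent.

```r
# Toy pileup: NODE_A spans 30,000 bp, NODE_B only 5,000 bp (names are made up)
toyPileup <- data.frame(
  V1 = rep(c("NODE_A", "NODE_B"), times = c(300, 50)),
  V2 = 10,
  V3 = c(seq(0, 29900, by = 100), seq(0, 4900, by = 100)),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a TrIdent function): approximate each contig's
# length as the start of its last 100 bp window plus 100, then keep the
# contigs that reach minContigLength.
contigsLongEnough <- function(pileup, minContigLength = 30000) {
  contigLengths <- vapply(split(pileup$V3, pileup$V1), max, numeric(1)) + 100
  names(contigLengths)[contigLengths >= minContigLength]
}

contigsLongEnough(toyPileup)
```

Contigs that fall below the threshold would be the ones reported as filtered out prior to pattern-matching.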

minSlope

The minimum slope value to test for sloping patterns. Default is 0.001 (i.e., minimum change of 10x read coverage over 100,000 bp).

minSlopeSize

The minimum width (in bp) of sloping patterns. Default and absolute minimum is 20000.

minHCNPRatio

The minimum VLP:WC ratio value used for HighCovNoPattern classifications. Default is 2 (i.e., the median VLP-fraction read coverage must be at least 2x the median WC read coverage for a contig to be classified as HighCovNoPattern).
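The ratio test described above amounts to comparing two medians. The sketch below shows the arithmetic on made-up coverage values. The helper `passesHCNPRatio()` is hypothetical, and returning NA when the WC median is zero is an assumption, not documented TrIdent behavior.

```r
# Hypothetical helper (not a TrIdent function): compare median coverages.
passesHCNPRatio <- function(vlpCov, wcCov, minHCNPRatio = 2) {
  wcMedian <- median(wcCov)
  # Ratio is undefined for a zero WC median; returning NA is an assumption
  if (wcMedian == 0) return(NA)
  median(vlpCov) / wcMedian >= minHCNPRatio
}

vlpCov <- c(40, 55, 48, 60, 52)  # median is 52
wcCov  <- c(20, 22, 19, 25, 21)  # median is 21
passesHCNPRatio(vlpCov, wcCov)
```

Here the ratio is 52 / 21, roughly 2.5, so the toy contig clears the default threshold of 2 but would not clear a threshold of 3.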

VLPReads

Optional, the number of VLP-fraction reads used for mapping and creation of pileup.

WCReads

Optional, the number of WC reads used for mapping and creation of pileup.

verbose

TRUE or FALSE. Print progress messages to console. Default is TRUE.

searchMethod

Search method to use. Either "grid" for the original grid search or "direct" for DIRECT global optimization. Default is "grid".

DirectMaxEval

Maximum number of DIRECT evaluations to make. Default is 100.

SaveFilesTo

Optional. Path to the directory where output should be saved. A folder will be made within the provided directory to store results.

Value

A list containing five objects.

Examples

data("VLPFractionSamplePileup")
data("WholeCommunitySamplePileup")

TrIdent_results <- TrIdentClassifier(
  VLPpileup = VLPFractionSamplePileup,
  WCpileup = WholeCommunitySamplePileup
)