| Title: | A toolkit for single-cell immune receptor profiling |
|---|---|
| Description: | scRepertoire is a toolkit for processing and analyzing single-cell T-cell receptor (TCR) and immunoglobulin (Ig). The scRepertoire framework supports use of 10x, AIRR, BD, MiXCR, TRUST4, and WAT3R single-cell formats. The functionality includes basic clonal analyses, repertoire summaries, distance-based clustering and interaction with the popular Seurat and SingleCellExperiment/Bioconductor R single-cell workflows. |
| Authors: | Nick Borcherding [aut, cre], Qile Yang [aut], Ksenia Safina [aut], Justin Reimertz [ctb] |
| Maintainer: | Nick Borcherding <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 2.9.0 |
| Built: | 2026-05-12 17:18:19 UTC |
| Source: | https://github.com/bioc/scRepertoire |
This function adds variables to the product of combineTCR(),
or combineBCR() to be used in later visualizations.
For each element, the function will add a column (labeled by
variable.name) with the variable. The length of the
variables parameter needs to match the length of the
combined object.
addVariable(input.data, variable.name = NULL, variables = NULL)addVariable(input.data, variable.name = NULL, variables = NULL)
input.data |
The product of |
variable.name |
A character string that defines the new variable to add. |
variables |
A character vector defining the desired column value for
each list element. Must match the length of the product of |
input.data list with the variable column added to each element.
combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) combined <- addVariable(combined, variable.name = "Type", variables = rep(c("B", "L"), 4))combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) combined <- addVariable(combined, variable.name = "Type", variables = rep(c("B", "L"), 4))
View the proportional contribution of clones by Seurat or SCE object
meta data after combineExpression(). The visualization
is based on the ggalluvial package, which requires the aesthetics
to be part of the axes that are visualized. Therefore, alpha, facet,
and color should be part of the the axes you wish to view or will
add an additional stratum/column to the end of the graph.
alluvialClones( sc.data, clone.call = NULL, chain = "both", y.axes = NULL, color = NULL, facet = NULL, alpha = NULL, top.clones = NULL, min.freq = 0, highlight.clones = NULL, highlight.color = "red", stratum.width = 0.2, flow.alpha = 0.5, show.labels = TRUE, label.size = 2, order.strata = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )alluvialClones( sc.data, clone.call = NULL, chain = "both", y.axes = NULL, color = NULL, facet = NULL, alpha = NULL, top.clones = NULL, min.freq = 0, highlight.clones = NULL, highlight.color = "red", stratum.width = 0.2, flow.alpha = 0.5, show.labels = TRUE, label.size = 2, order.strata = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
sc.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
y.axes |
The columns that will separate the proportional visualizations. |
color |
The column header or clone(s) to be highlighted. |
facet |
The column label to separate. |
alpha |
The column header to have gradated opacity. |
top.clones |
Show only the top N clones by frequency. If |
min.freq |
Minimum frequency threshold for displaying flows. Clones appearing fewer than this many times are filtered out. |
highlight.clones |
Character vector of specific clone sequences to highlight. These clones will be colored distinctly while others are shown in gray. |
highlight.color |
Color to use for highlighted clones (default: "red"). |
stratum.width |
Width of the stratum bars (default: 0.2). |
flow.alpha |
Transparency of the flows (default: 0.5). Highlighted clones use full opacity. |
show.labels |
If |
label.size |
Text size for stratum labels (default: 2). |
order.strata |
Named list specifying the order of levels within each stratum. Names should match column names in y.axes. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing categorical distribution of clones, or a
data.frame if export.table = TRUE.
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident, 1,3) # Using alluvialClones() alluvialClones(scRep_example, clone.call = "gene", y.axes = c("Patient", "ident"), color = "ident") # Show only top 50 most frequent clones alluvialClones(scRep_example, clone.call = "aa", y.axes = c("Patient", "ident"), top.clones = 50) # Highlight specific clones alluvialClones(scRep_example, clone.call = "aa", y.axes = c("Patient", "ident"), highlight.clones = c("CVVSDNTGGFKTIF_CASSVRRERANTGELFF"))# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident, 1,3) # Using alluvialClones() alluvialClones(scRep_example, clone.call = "gene", y.axes = c("Patient", "ident"), color = "ident") # Show only top 50 most frequent clones alluvialClones(scRep_example, clone.call = "aa", y.axes = c("Patient", "ident"), top.clones = 50) # Highlight specific clones alluvialClones(scRep_example, clone.call = "aa", y.axes = c("Patient", "ident"), highlight.clones = c("CVVSDNTGGFKTIF_CASSVRRERANTGELFF"))
The annotateInvariant() function identifies potential mucosal-associated
invariant T (MAIT``) cells or invariant natural killer T (iNKT') cells from
single-cell sequencing datasets based on their characteristic TCR usage.
It extracts TCR chain information from the provided single-cell
data, checks it against known invariant T-cell receptor criteria for either
MAIT or iNKT cells, and returns a score indicating the presence (1) or
absence (0) of these invariant cell populations for each individual cell.
The function supports data from mouse and human samples, providing a
convenient method to annotate specialized T-cell subsets within single-cell
analyses.
annotateInvariant( input.data, type = c("MAIT", "iNKT"), species = c("mouse", "human") )annotateInvariant( input.data, type = c("MAIT", "iNKT"), species = c("mouse", "human") )
input.data |
The product of |
type |
Character specifying the type of invariant cells to
annotate ( |
species |
Character specifying the species ('mouse' or 'human'). |
A single-cell object or list with the corresponding annotation scores (0 or 1) added.
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using annotateInvariant() annotateInvariant(input.data = scRep_example, type = "MAIT", species = "human") annotateInvariant(input.data = scRep_example, type = "iNKT", species = "human")# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using annotateInvariant() annotateInvariant(input.data = scRep_example, type = "MAIT", species = "human") annotateInvariant(input.data = scRep_example, type = "iNKT", species = "human")
Displays the number of clones at specific frequencies by sample
or group. Visualization can either be a line graph (
scale = FALSE) using calculated numbers or density
plot (scale = TRUE). Multiple sequencing runs can
be group together using the group parameter. If a matrix
output for the data is preferred, set
export.table = TRUE.
clonalAbundance( input.data, clone.call = NULL, chain = "both", scale = FALSE, group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalAbundance( input.data, clone.call = NULL, chain = "both", scale = FALSE, group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
scale |
Converts the graphs into density plots in order to show relative distributions. |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing clonal abundance by group, or a
data.frame if export.table = TRUE.
Nick Borcherding, Justin Reimertz
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalAbundance() clonalAbundance(combined, clone.call = "gene", scale = FALSE)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalAbundance() clonalAbundance(combined, clone.call = "gene", scale = FALSE)
The metric seeks to quantify how individual clones are skewed towards
a specific cellular compartment or cluster. A clone bias of 1 -
indicates that a clone is composed of cells from a single
compartment or cluster, while a clone bias of 0 - matches the
background subtype distribution. Please read and cite the following
manuscript
if using clonalBias().
clonalBias( sc.data, clone.call = NULL, split.by = NULL, group.by = NULL, n.boots = 20, min.expand = 10, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalBias( sc.data, clone.call = NULL, split.by = NULL, group.by = NULL, n.boots = 20, min.expand = 10, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
sc.data |
The single-cell object after |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
split.by |
The variable to use for calculating the baseline frequencies. For example, "Type" for lung vs peripheral blood comparison |
group.by |
A column header in the metadata that bias will be based on. |
n.boots |
number of bootstraps to downsample. |
min.expand |
clone frequency cut off for the purpose of comparison. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
ggplot scatter plot with clone bias
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident,1,3) # Using clonalBias() clonalBias(scRep_example, clone.call = "aa", split.by = "Patient", group.by = "seurat_clusters", n.boots = 5, min.expand = 2)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident,1,3) # Using clonalBias() clonalBias(scRep_example, clone.call = "aa", split.by = "Patient", group.by = "seurat_clusters", n.boots = 5, min.expand = 2)
This function adds a clonal grouping variable (cloneSize) to the output
of combineTCR(), combineBCR(), or combineExpression(). It calculates
the clonal frequency and proportion, then bins clones into categories based
on customizable thresholds. This is useful for categorizing clones prior to
downstream analysis or visualization.
clonalBin( input.data, clone.call = NULL, chain = "both", group.by = NULL, proportion = TRUE, clone.size = NULL, cloneCall = NULL, cloneSize = NULL )clonalBin( input.data, clone.call = NULL, chain = "both", group.by = NULL, proportion = TRUE, clone.size = NULL, cloneCall = NULL, cloneSize = NULL )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata to group the analysis
by (e.g., "sample", "treatment"). If |
proportion |
Whether to use proportion ( |
clone.size |
The bins for the grouping based on proportion or frequency.
If proportion is |
cloneCall |
|
cloneSize |
A list of data frames with clonal frequency, clonal proportion, and cloneSize columns added.
Nick Borcherding
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Adding clonal bins with default settings (proportion-based) combined <- clonalBin(combined) # Adding clonal bins based on frequency combined <- clonalBin(combined, proportion = FALSE, clone.size = c(Rare = 1, Small = 5, Medium = 20, Large = 100, Hyperexpanded = 500)) # Using a custom grouping variable combined <- addVariable(combined, variable.name = "Type", variables = rep(c("B", "L"), 4)) combined <- clonalBin(combined, group.by = "Type")# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Adding clonal bins with default settings (proportion-based) combined <- clonalBin(combined) # Adding clonal bins based on frequency combined <- clonalBin(combined, proportion = FALSE, clone.size = c(Rare = 1, Small = 5, Medium = 20, Large = 100, Hyperexpanded = 500)) # Using a custom grouping variable combined <- addVariable(combined, variable.name = "Type", variables = rep(c("B", "L"), 4)) combined <- clonalBin(combined, group.by = "Type")
This function clusters TCRs or BCRs based on the edit distance or alignment
score of their CDR3 sequences. It can operate on either nucleotide (nt)
or amino acid (aa) sequences and can optionally enforce that clones share
the same V and/or J genes. The output can be the input object with an added
metadata column for cluster IDs, a sparse adjacency matrix, or an igraph
graph object representing the cluster network.
clonalCluster( input.data, chain = "TRB", sequence = "aa", threshold = 0.85, group.by = NULL, dist.type = NULL, dist.mat = NULL, normalize = "length", gap.open = NULL, gap.extend = NULL, cluster.method = "components", cluster.prefix = "cluster.", use.V = TRUE, use.J = FALSE, export.adj.matrix = NULL, export.graph = NULL, dist_type = NULL, dist_mat = NULL, gap_open = NULL, gap_extend = NULL, exportAdjMatrix = NULL, exportGraph = NULL )clonalCluster( input.data, chain = "TRB", sequence = "aa", threshold = 0.85, group.by = NULL, dist.type = NULL, dist.mat = NULL, normalize = "length", gap.open = NULL, gap.extend = NULL, cluster.method = "components", cluster.prefix = "cluster.", use.V = TRUE, use.J = FALSE, export.adj.matrix = NULL, export.graph = NULL, dist_type = NULL, dist_mat = NULL, gap_open = NULL, gap_extend = NULL, exportAdjMatrix = NULL, exportGraph = NULL )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Use |
sequence |
Clustering based on either |
threshold |
The similarity threshold. If < 1, treated as normalized similarity (higher is stricter). If >= 1, treated as raw edit distance (lower is stricter). |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
dist.type |
The distance metric to use. Options: |
dist.mat |
The substitution matrix to use for alignment-based metrics
( |
normalize |
Method for normalizing distances. Options: |
gap.open |
Penalty for opening a gap in alignment metrics (default: -10). |
gap.extend |
Penalty for extending a gap in alignment metrics (default: -1). |
cluster.method |
The clustering algorithm to use. Defaults to |
cluster.prefix |
A character prefix to add to the cluster names (e.g., "cluster."). |
use.V |
If |
use.J |
If |
export.adj.matrix |
If |
export.graph |
If |
dist_type |
|
dist_mat |
|
gap_open |
|
gap_extend |
|
exportAdjMatrix |
|
exportGraph |
The clustering process is as follows:
The function retrieves the relevant chain data from the input object.
It calculates the distance between all sequences within each group
(or across the entire dataset if group.by is NULL).
An edge list is constructed, connecting sequences that meet the similarity
threshold.
The threshold parameter behaves differently based on its value:
threshold < 1 (e.g., 0.85): Interpreted as a normalized
distance. A higher value means greater similarity is required.
threshold >= 1 (e.g., 2): Interpreted as a maximum raw edit
distance. A lower value means greater similarity is required.
Distance Metrics:
Levenshtein/Hamming/Damerau: Standard edit distance calculations.
Alignment (NW/SW): If dist.type is "nw" (Needleman-Wunsch) or
"sw" (Smith-Waterman), alignment scores are calculated using the
specified substitution matrix (dist.mat). These scores are converted
to a distance-like metric for clustering.
An igraph graph is built from the edge list.
A clustering algorithm is run on the graph (default: connected components).
Depending on the export parameters, one of the following:
An amended input.data object with a new metadata column containing cluster IDs (default).
An igraph object if export.graph = TRUE.
A sparse dgCMatrix object if export.adj.matrix = TRUE.
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Standard Levenshtein clustering (85% similarity) sub_combined <- clonalCluster(combined[c(1,2)], chain = "TRA", sequence = "aa", threshold = 0.85) # Alignment-based clustering using BLOSUM80 sub_combined_nw <- clonalCluster(combined[c(1,2)], chain = "TRA", dist.type = "nw", dist.mat = "BLOSUM80", threshold = 0.85) # Export the graph object instead graph_obj <- clonalCluster(combined[c(1,2)], chain = "TRA", export.graph = TRUE)# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Standard Levenshtein clustering (85% similarity) sub_combined <- clonalCluster(combined[c(1,2)], chain = "TRA", sequence = "aa", threshold = 0.85) # Alignment-based clustering using BLOSUM80 sub_combined_nw <- clonalCluster(combined[c(1,2)], chain = "TRA", dist.type = "nw", dist.mat = "BLOSUM80", threshold = 0.85) # Export the graph object instead graph_obj <- clonalCluster(combined[c(1,2)], chain = "TRA", export.graph = TRUE)
This function visualizes the relative abundance of specific clones across different samples or groups. It is useful for tracking how the proportions of top clones change between conditions. The output can be an alluvial plot to trace clonal dynamics or an area plot to show compositional changes.
clonalCompare( input.data, clone.call = NULL, chain = "both", samples = NULL, clones = NULL, top.clones = NULL, highlight.clones = NULL, relabel.clones = FALSE, group.by = NULL, order.by = NULL, graph = "alluvial", proportion = TRUE, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalCompare( input.data, clone.call = NULL, chain = "both", samples = NULL, clones = NULL, top.clones = NULL, highlight.clones = NULL, relabel.clones = FALSE, group.by = NULL, order.by = NULL, graph = "alluvial", proportion = TRUE, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
samples |
The specific samples to isolate for visualization. |
clones |
The specific clonal sequences of interest |
top.clones |
The top number of clonal sequences per group. (e.g., top.clones = 5) |
highlight.clones |
Clonal sequences to highlight, if present, all other clones returned will be grey |
relabel.clones |
Simplify the legend of the graph by returning clones that are numerically indexed |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
graph |
The type of plot to generate. Accepted values are |
proportion |
If |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing proportions of clones by groupings, or a
data.frame if export.table = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalCompares() clonalCompare(combined, top.clones = 5, samples = c("P17B", "P17L"), clone.call = "aa")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalCompares() clonalCompare(combined, top.clones = 5, samples = c("P17B", "P17L"), clone.call = "aa")
This function calculates a specified diversity metric for samples or groups within a dataset. To control for variations in library size, the function can perform bootstrapping with downsampling. It resamples each group to the size of the smallest group and calculates the diversity metric across multiple iterations, returning the mean value.
clonalDiversity( input.data, clone.call = NULL, metric = "shannon", chain = "both", group.by = NULL, order.by = NULL, x.axis = NULL, export.table = NULL, palette = "inferno", n.boots = 100, return.boots = FALSE, skip.boots = FALSE, cloneCall = NULL, exportTable = NULL, ... )clonalDiversity( input.data, clone.call = NULL, metric = "shannon", chain = "both", group.by = NULL, order.by = NULL, x.axis = NULL, export.table = NULL, palette = "inferno", n.boots = 100, return.boots = FALSE, skip.boots = FALSE, cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
metric |
The diversity metric to calculate. Must be a single string from the list of available metrics (see Details). |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
x.axis |
An additional metadata variable to group samples along the x-axis. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
n.boots |
The number of bootstrap iterations to perform (default is 100). |
return.boots |
If |
skip.boots |
If |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The function operates by first splitting the dataset by the specified group.by
variable.
Downsampling and Bootstrapping: To make a fair comparison between groups of different sizes, diversity metrics often require normalization. This function implements this by downsampling.
It determines the number of clones in the smallest group.
For each group, it performs n.boots iterations (default = 100).
In each iteration, it randomly samples the clones (with replacement) down to the size of the smallest group.
It calculates the selected diversity metric on this downsampled set.
The final reported diversity value is the mean of the results from all bootstrap iterations.
This process can be disabled by setting skip.boots = TRUE.
Available Diversity Metrics (metric): The function uses a registry of metrics imported from the immApex package. You can select one of the following:
"shannon": Shannon's Entropy. See shannon_entropy.
"inv.simpson": Inverse Simpson Index. See inv_simpson.
"gini.simpson": Gini-Simpson Index. See gini_simpson.
"norm.entropy": Normalized Shannon Entropy. See norm_entropy.
"pielou": Pielou's Evenness (same as norm.entropy). See pielou_evenness.
"ace": Abundance-based Coverage Estimator. See ace_richness.
"chao1": Chao1 Richness Estimator. See chao1_richness.
"gini": Gini Coefficient for inequality. See gini_coef.
"d50": The number of top clones making up 50% of the library. See d50_dom.
"hill0", "hill1", "hill2": Hill numbers of order 0, 1, and 2. See hill_q.
A ggplot object visualizing the diversity metric, or a data.frame if
export.table = TRUE.
Andrew Malone, Nick Borcherding, Nathan Vanderkraan
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Calculate Shannon diversity, grouped by sample clonalDiversity(combined, clone.call = "gene", metric = "shannon") # Calculate Inverse Simpson without bootstrapping clonalDiversity(combined, clone.call = "aa", metric = "inv.simpson", skip.boots = TRUE)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Calculate Shannon diversity, grouped by sample clonalDiversity(combined, clone.call = "gene", metric = "shannon") # Calculate Inverse Simpson without bootstrapping clonalDiversity(combined, clone.call = "aa", metric = "inv.simpson", skip.boots = TRUE)
This function calculates the space occupied by clone proportions.
The grouping of these clones is based on the parameter clone.size,
at default, clone.size will group the clones into bins of Rare = 0
to 0.0001, Small = 0.0001 to 0.001, etc. To adjust the proportions,
change the number or labeling of the clone.size parameter. If a matrix
output for the data is preferred, set export.table = TRUE.
clonalHomeostasis( input.data, clone.size = NULL, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneSize = NULL, cloneCall = NULL, exportTable = NULL, ... )clonalHomeostasis( input.data, clone.size = NULL, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneSize = NULL, cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.size |
The cut points of the proportions. |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneSize |
|
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing clonal homeostasis, or a data.frame if
export.table = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) clonalHomeostasis(combined, clone.call = "gene")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) clonalHomeostasis(combined, clone.call = "gene")
This function displays either the nucleotide nt or amino
acid aa sequence length. The sequence length visualized
can be selected using the chains parameter, either the combined clone
(both chains) or across all single chains. Visualization can either
be a histogram or if scale = TRUE, the output will
be a density plot. Multiple sequencing runs can be group together
using the group.by parameter.
clonalLength( input.data, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, scale = FALSE, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalLength( input.data, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, scale = FALSE, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
scale |
Converts the graphs into density plots in order to show relative distributions. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing the distributions by length, or a data.frame if
export.table = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) clonalLength(combined, clone.call = "aa", chain = "both")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) clonalLength(combined, clone.call = "aa", chain = "both")
This function generates a network based on clonal proportions of an indicated identity and then superimposes the network onto a single-cell object dimensional reduction plot.
clonalNetwork( sc.data, clone.call = NULL, chain = "both", reduction = "umap", group.by = "ident", filter.clones = NULL, filter.identity = NULL, filter.proportion = NULL, filter.graph = FALSE, export.clones = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportClones = NULL, exportTable = NULL, ... )clonalNetwork( sc.data, clone.call = NULL, chain = "both", reduction = "umap", group.by = "ident", filter.clones = NULL, filter.identity = NULL, filter.proportion = NULL, filter.graph = FALSE, export.clones = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportClones = NULL, exportTable = NULL, ... )
sc.data |
The single-cell object after |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
reduction |
The name of the dimensional reduction of the single-cell object. |
group.by |
A column header in the metadata or lists to group the analysis by (e.g., "sample", "treatment"). This will be the nodes overlaid onto the graph. |
filter.clones |
Use to select the top n clones (e.g., |
filter.identity |
Display the network for a specific level of the indicated identity. |
filter.proportion |
Remove clones from the network below a specific proportion. |
filter.graph |
Remove the reciprocal edges from the half of the graph, allowing for cleaner visualization. |
export.clones |
Exports a table of clones that are shared across multiple identity groups and ordered by the total number of clone copies. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportClones |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
ggplot object
## Not run: # Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalNetwork() clonalNetwork(scRep_example, reduction = "umap", group.by = "seurat_clusters") ## End(Not run)## Not run: # Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalNetwork() clonalNetwork(scRep_example, reduction = "umap", group.by = "seurat_clusters") ## End(Not run)
View the count of clones frequency group in Seurat or SCE object
meta data after combineExpression(). The visualization
will take the new meta data variable cloneSize and
plot the number of cells with each designation using a secondary
variable, like cluster. Credit to the idea goes to Drs. Carmona
and Andreatta and their work with ProjectTIL.
clonalOccupy( sc.data, x.axis = "ident", label = TRUE, facet.by = NULL, order.by = NULL, proportion = FALSE, na.include = FALSE, export.table = NULL, palette = "inferno", exportTable = NULL, ... )clonalOccupy( sc.data, x.axis = "ident", label = TRUE, facet.by = NULL, order.by = NULL, proportion = FALSE, na.include = FALSE, export.table = NULL, palette = "inferno", exportTable = NULL, ... )
sc.data |
The single-cell object after |
x.axis |
The variable in the meta data to graph along the x.axis. |
label |
Include the number of clone in each category by x.axis variable |
facet.by |
The column header used for faceting the graph |
order.by |
A character vector defining the desired order of elements
of the |
proportion |
Convert the stacked bars into relative proportion |
na.include |
Visualize NA values or not |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
Stacked bar plot of counts of cells by clone frequency group
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalOccupy clonalOccupy(scRep_example, x.axis = "ident") table <- clonalOccupy(scRep_example, x.axis = "ident", export.table = TRUE)# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalOccupy clonalOccupy(scRep_example, x.axis = "ident") table <- clonalOccupy(scRep_example, x.axis = "ident", export.table = TRUE)
This functions allows for the calculation and visualizations of
various overlap metrics for clones. The methods include overlap
coefficient (overlap), Morisita's overlap index
(morisita), Jaccard index (jaccard), cosine
similarity (cosine) or the exact number of clonal
overlap (raw).
clonalOverlap( input.data, clone.call = NULL, method = c("overlap", "morisita", "jaccard", "cosine", "raw"), chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalOverlap( input.data, clone.call = NULL, method = c("overlap", "morisita", "jaccard", "cosine", "raw"), chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
method |
The method to calculate the |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The formulas for the indices are as follows:
Overlap Coefficient:
Raw Count Overlap:
Morisita Index:
Jaccard Index:
Cosine Similarity:
Where:
and are the abundances of species in groups A and B, respectively.
A ggplot object visualizing clonal overlap or a data.frame if
exportTable = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalOverlap() clonalOverlap(combined, clone.call = "aa", method = "jaccard")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalOverlap() clonalOverlap(combined, clone.call = "aa", method = "jaccard")
This function allows the user to visualize the clonal expansion by overlaying the cells with specific clonal frequency onto the dimensional reduction plots in Seurat. Credit to the idea goes to Drs Andreatta and Carmona and their work with ProjectTIL.
clonalOverlay( sc.data, reduction = NULL, cut.category = "clonalFrequency", cutpoint = 30, bins = 25, pt.size = 0.5, pt.alpha = 1, facet.by = NULL, ... )clonalOverlay( sc.data, reduction = NULL, cut.category = "clonalFrequency", cutpoint = 30, bins = 25, pt.size = 0.5, pt.alpha = 1, facet.by = NULL, ... )
sc.data |
The single-cell object after |
reduction |
The dimensional reduction to visualize. |
cut.category |
Meta data variable of the single-cell object to use for filtering. |
cutpoint |
The overlay cut point to include, this corresponds to the cut.category variable in the meta data of the single-cell object. |
bins |
The number of contours to the overlay |
pt.size |
The point size for plotting (default is 0.5) |
pt.alpha |
The alpha value for plotting (default is 1) |
facet.by |
meta data variable to facet the comparison |
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing distributions of clones along a dimensional reduction within the single-cell object
Francesco Mazziotta, Nick Borcherding
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalOverlay() clonalOverlay(scRep_example, reduction = "umap", cutpoint = 3, bins = 5)# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using clonalOverlay() clonalOverlay(scRep_example, reduction = "umap", cutpoint = 3, bins = 5)
This function calculates the relative clonal space occupied by the
clones. The grouping of these clones is based on the parameter
clonalSplit, at default, clonalSplit will group the clones
into bins of 1:10, 11:100, 101:1001, etc. To adjust the clones
selected, change the numbers in the variable split. If a matrix output
for the data is preferred, set exportTable = TRUE.
clonalProportion( input.data, clonal.split = NULL, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", clonalSplit = NULL, cloneCall = NULL, exportTable = NULL, ... )clonalProportion( input.data, clonal.split = NULL, clone.call = NULL, chain = "both", group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", clonalSplit = NULL, cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clonal.split |
The cut points for the specific clones, default = c(10, 100, 1000, 10000, 30000, 1e+05) |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
clonalSplit |
|
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object dividing space occupied by ranks of clones or a
data.frame if exportTable = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalProportion() clonalProportion(combined, clone.call = "gene")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalProportion() clonalProportion(combined, clone.call = "gene")
This function quantifies unique clones. The unique clones can be either reported as a raw output or scaled to the total number of clones recovered using the scale parameter.
clonalQuant( input.data, clone.call = NULL, chain = "both", scale = FALSE, group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalQuant( input.data, clone.call = NULL, chain = "both", scale = FALSE, group.by = NULL, order.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
scale |
Converts the graphs into percentage of unique clones |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing the total or relative number of clones
or a data.frame if exportTable = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalQuant() clonalQuant(combined, clone.call="strict", scale = TRUE)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalQuant() clonalQuant(combined, clone.call="strict", scale = TRUE)
This functions uses the Hill numbers of order q: species richness (q = 0),
Shannon diversity (q = 1), the exponential of Shannon entropy and Simpson
diversity (q = 2, the inverse of Simpson concentration) to compute diversity
estimates for rarefaction and extrapolation. The function relies on the
iNEXT::iNEXT() R package. Please read and cite the
manuscript
if using this function. The input into the iNEXT calculation is abundance,
incidence-based calculations are not supported.
clonalRarefaction( input.data, clone.call = NULL, chain = "both", group.by = NULL, plot.type = 1, hill.numbers = 0, n.boots = 20, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalRarefaction( input.data, clone.call = NULL, chain = "both", group.by = NULL, plot.type = 1, hill.numbers = 0, n.boots = 20, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
plot.type |
sample-size-based rarefaction/extrapolation curve
( |
hill.numbers |
The Hill numbers to be plotted out (0 - species richness, 1 - Shannon, 2 - Simpson) |
n.boots |
The number of bootstrap replicates used to derive confidence intervals for the diversity estimates. More replicates can provide a more reliable measure of statistical variability. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalRarefaction() clonalRarefaction(combined[c(1,2)], clone.call = "gene", n.boots = 3)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalRarefaction() clonalRarefaction(combined[c(1,2)], clone.call = "gene", n.boots = 3)
This function produces a scatter plot directly comparing the specific clones between two samples. The clones will be categorized by counts into singlets or expanded, either exclusive or shared between the selected samples.
clonalScatter( input.data, clone.call = NULL, x.axis = NULL, y.axis = NULL, chain = "both", dot.size = "total", group.by = NULL, graph = "proportion", export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalScatter( input.data, clone.call = NULL, x.axis = NULL, y.axis = NULL, chain = "both", dot.size = "total", group.by = NULL, graph = "proportion", export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
x.axis |
name of the list element to appear on the x.axis. |
y.axis |
name of the list element to appear on the y.axis. |
chain |
The TCR/BCR chain to use. Use |
dot.size |
either total or the name of the list element to use for size of dots. |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
graph |
graph either the clonal "proportion" or "count". |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing clonal dynamics between two groupings or
a data.frame if exportTable = TRUE.
#Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalScatter() clonalScatter(combined, x.axis = "P17B", y.axis = "P17L", graph = "proportion")#Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalScatter() clonalScatter(combined, x.axis = "P17B", y.axis = "P17L", graph = "proportion")
This function produces a hierarchical clustering of clones by sample using discrete gamma-GPD spliced threshold model. If using this model please read and cite powerTCR (more info available at PMID: 30485278).
clonalSizeDistribution( input.data, clone.call = NULL, chain = "both", method = "ward.D2", threshold = 1, group.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )clonalSizeDistribution( input.data, clone.call = NULL, chain = "both", method = "ward.D2", threshold = 1, group.by = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
method |
The clustering parameter for the dendrogram. |
threshold |
Numerical vector containing the thresholds the grid search was performed over. |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The probability density function (pdf) for the Generalized Pareto Distribution (GPD) is given by:
Where:
is a location parameter
is a scale parameter
is a shape parameter
if and if
The probability density function (pdf) for the Gamma Distribution is given by:
Where:
is the shape parameter
is the scale parameter
is the gamma function of
A ggplot object visualizing dendrogram of clonal size distribution
or a data.frame if exportTable = TRUE.
Hillary Koch
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalSizeDistribution() clonalSizeDistribution(combined, clone.call = "strict", method="ward.D2")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using clonalSizeDistribution() clonalSizeDistribution(combined, clone.call = "strict", method="ward.D2")
This function consolidates a list of BCR sequencing results to the level
of the individual cell barcodes. Using the samples and ID parameters,
the function will add the strings as prefixes to prevent issues with
repeated barcodes. The resulting new barcodes will need to match the
Seurat or SCE object in order to use, combineExpression(). Unlike
combineTCR(), combineBCR produces a column CTstrict based on the
edit distance clustering from clonalCluster(). The CTstrict column
is formatted as Heavy_Light (underscore-separated) for downstream
compatibility. Connected clones are labeled with cluster.X, while
unconnected clones (singlets) are labeled with the V gene and CDR3
sequence (e.g., IGHV3-64.CAKSYS..._IGKV3-15.CQQYSN...).
combineBCR( input.data, samples = NULL, ID = NULL, chain = "IGH", sequence = "nt", dist.type = NULL, dist.mat = NULL, normalize = "length", gap.open = NULL, gap.extend = NULL, call.related.clones = TRUE, group.by = NULL, threshold = 0.85, cluster.method = "components", use.V = TRUE, use.J = TRUE, remove.na = NULL, remove.multi = NULL, filter.multi = NULL, filter.nonproductive = NULL, removeNA = NULL, removeMulti = NULL, filterMulti = NULL, filterNonproductive = NULL, dist_type = NULL, dist_mat = NULL, gap_open = NULL, gap_extend = NULL )combineBCR( input.data, samples = NULL, ID = NULL, chain = "IGH", sequence = "nt", dist.type = NULL, dist.mat = NULL, normalize = "length", gap.open = NULL, gap.extend = NULL, call.related.clones = TRUE, group.by = NULL, threshold = 0.85, cluster.method = "components", use.V = TRUE, use.J = TRUE, remove.na = NULL, remove.multi = NULL, filter.multi = NULL, filter.nonproductive = NULL, removeNA = NULL, removeMulti = NULL, filterMulti = NULL, filterNonproductive = NULL, dist_type = NULL, dist_mat = NULL, gap_open = NULL, gap_extend = NULL )
input.data |
List of filtered contig annotations or outputs from
|
samples |
A character vector of sample labels. Must be the same length as the input list. |
ID |
An optional character vector for additional sample identifiers. |
chain |
The chain to use for clustering when |
sequence |
The sequence type ( |
dist.type |
The distance metric to use. Options: |
dist.mat |
The substitution matrix to use for alignment-based metrics
( |
normalize |
Method for normalizing distances. Options: |
gap.open |
Penalty for opening a gap in alignment metrics (default: -10). |
gap.extend |
Penalty for extending a gap in alignment metrics (default: -1). |
call.related.clones |
Logical. If |
group.by |
The column header used for to group clones. If ('NULL“), clusters will be calculated across samples. |
threshold |
The similarity threshold passed to |
cluster.method |
The clustering algorithm to use. Defaults to |
use.V |
Logical. If |
use.J |
Logical. If |
remove.na |
This will remove any chain without values. |
remove.multi |
Logical. If |
filter.multi |
Logical. If |
filter.nonproductive |
Logical. If |
removeNA |
|
removeMulti |
|
filterMulti |
|
filterNonproductive |
|
dist_type |
|
dist_mat |
|
gap_open |
|
gap_extend |
A list of data frames, where each data frame represents a sample. Each row corresponds to a unique cell barcode, with columns detailing the BCR chains and the assigned clone ID.
# Data derived from the 10x Genomics intratumoral NSCLC B cells BCR <- read.csv("https://www.borch.dev/uploads/contigs/b_contigs.csv") combined <- combineBCR(BCR, samples = "Patient1", threshold = 0.85)# Data derived from the 10x Genomics intratumoral NSCLC B cells BCR <- read.csv("https://www.borch.dev/uploads/contigs/b_contigs.csv") combined <- combineBCR(BCR, samples = "Patient1", threshold = 0.85)
This function adds the immune receptor information to the Seurat or
SCE object to the meta data. By default this function also calculates
the frequencies and proportion of the clones by sequencing
run (group.by = NULL). To change how the frequencies/proportions
are calculated, select a column header for the group.by variable.
Importantly, before using combineExpression() ensure the
barcodes of the single-cell object object match the barcodes in the output
of the combineTCR() or combineBCR().
combineExpression( input.data, sc.data, clone.call = NULL, chain = "both", group.by = NULL, proportion = TRUE, filter.na = NULL, clone.size = NULL, add.label = NULL, cloneCall = NULL, cloneSize = NULL, filterNA = NULL, addLabel = NULL )combineExpression( input.data, sc.data, clone.call = NULL, chain = "both", group.by = NULL, proportion = TRUE, filter.na = NULL, clone.size = NULL, add.label = NULL, cloneCall = NULL, cloneSize = NULL, filterNA = NULL, addLabel = NULL )
input.data |
The product of |
sc.data |
The Seurat or Single-Cell Experiment (SCE) object to attach |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in lists to group the analysis
by (e.g., "sample", "treatment"). If |
proportion |
Whether to proportion ( |
filter.na |
Method to subset Seurat/SCE object of barcodes without clone information |
clone.size |
The bins for the grouping based on proportion or frequency.
If proportion is |
add.label |
This will add a label to the frequency header, allowing the user to try multiple group.by variables or recalculate frequencies after subsetting the data. |
cloneCall |
|
cloneSize |
|
filterNA |
|
addLabel |
Single-cell object with clone information added to meta data information
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example)# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example)
This function consolidates a list of TCR sequencing results to
the level of the individual cell barcodes. Using the samples and
ID parameters, the function will add the strings as prefixes to
prevent issues with repeated barcodes. The resulting new barcodes will
need to match the Seurat or SCE object in order to use,
combineExpression(). Several levels of filtering exist -
remove.na, remove.multi, or filter.multi are parameters
that control how the function deals with barcodes with multiple chains
recovered.
combineTCR( input.data, samples = NULL, ID = NULL, remove.na = NULL, remove.multi = NULL, filter.multi = NULL, filter.nonproductive = NULL, removeNA = NULL, removeMulti = NULL, filterMulti = NULL, filterNonproductive = NULL )combineTCR( input.data, samples = NULL, ID = NULL, remove.na = NULL, remove.multi = NULL, filter.multi = NULL, filter.nonproductive = NULL, removeNA = NULL, removeMulti = NULL, filterMulti = NULL, filterNonproductive = NULL )
input.data |
List of filtered contig annotations or
outputs from |
samples |
The labels of samples (recommended). |
ID |
The additional sample labeling (optional). |
remove.na |
This will remove any chain without values. |
remove.multi |
This will remove barcodes with greater than 2 chains. |
filter.multi |
This option will allow for the selection of the 2 corresponding chains with the highest expression for a single barcode. |
filter.nonproductive |
This option will allow for the removal of nonproductive chains if the variable exists in the contig data. Default is set to TRUE to remove nonproductive contigs. |
removeNA |
|
removeMulti |
|
filterMulti |
|
filterNonproductive |
List of clones for individual cell barcodes
combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L"))combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L"))
A list of 8 filtered_contig_annotations.csv files
outputted from 10X Cell Ranger. More information on the
data can be found in the following
manuscript.
This function reprocess and forms a list of contigs for downstream analysis
in scRepertoire, createHTOContigList() take the filtered contig
annotation output and the single-cell RNA object to create the list.
If using an integrated single-cell object, it is recommended to split the
object by sequencing run and remove extra prefixes and suffixes on the
barcode before using createHTOContigList(). Alternatively,
the variable multi.run can be used to separate a list of contigs
by a meta data variable. This may have issues with the repeated barcodes.
createHTOContigList(contig, sc.data, group.by = NULL, multi.run = NULL)createHTOContigList(contig, sc.data, group.by = NULL, multi.run = NULL)
contig |
The filtered contig annotation file from multiplexed experiment |
sc.data |
The Seurat or Single-Cell Experiment object. |
group.by |
One or more meta data headers to create the contig list based on. If more than one header listed, the function combines them into a single variable. |
multi.run |
If using integrated single-cell object, the meta data variable that indicates the sequencing run. |
Returns a list of contigs as input for combineBCR()
or combineTCR()
## Not run: filtered.contig <- read.csv(".../Sample/outs/filtered_contig_annotations.csv") contig.list <- createHTOContigList(contig = filtered.contig, sc.data = Seurat.Obj, group.by = "HTO_maxID") ## End(Not run)## Not run: filtered.contig <- read.csv(".../Sample/outs/filtered_contig_annotations.csv") contig.list <- createHTOContigList(contig = filtered.contig, sc.data = Seurat.Obj, group.by = "HTO_maxID") ## End(Not run)
Exports clonal information (gene sequences, amino acids, nucleotides) from scRepertoire objects into a file or a data frame. The output format can be tailored for compatibility with different analysis workflows.
exportClones( input.data, format = "paired", group.by = NULL, write.file = TRUE, dir = NULL, file.name = "clones.csv" )exportClones( input.data, format = "paired", group.by = NULL, write.file = TRUE, dir = NULL, file.name = "clones.csv" )
input.data |
The product of |
format |
The format for exporting clones.
Options are: |
group.by |
The variable in the metadata to use for grouping. If |
write.file |
If |
dir |
The directory where the output file will be saved. Defaults to the current working directory. |
file.name |
The name of the file to be saved. |
The format parameter determines the structure of the output:
paired: Exports a data frame where each row represents a barcode,
with paired chain information (amino acid, nucleotide, genes) in separate
columns.
airr: Exports a data frame that adheres to the Adaptive Immune
Receptor Repertoire (AIRR) Community format, with each row representing
a single receptor chain.
TCRMatch: Exports a data frame specifically for the TCRMatch
algorithm, containing the TRB chain amino acid sequence and clonal
frequency.
tcrpheno: Exports a data frame compatible with the tcrpheno
pipeline, with TRA and TRB chains in separate columns.
immunarch: Exports a list containing a data frame and metadata
formatted for use with the immunarch package.
A data frame or list in the specified format, either returned to the R environment or saved as a CSV file.
Jonathan Noonan, Nick Borcherding
## Not run: #Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B", "P19L", "P20B", "P20L")) # Export as a paired data frame and save to a file exportClones(combined, format = "paired", file.name = "paired_clones.csv") # Return an AIRR-formatted data frame to the environment airr_df <- exportClones(combined, format = "airr", write.file = FALSE) ## End(Not run)## Not run: #Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B", "P19L", "P20B", "P20L")) # Export as a paired data frame and save to a file exportClones(combined, format = "paired", file.name = "paired_clones.csv") # Return an AIRR-formatted data frame to the environment airr_df <- exportClones(combined, format = "airr", write.file = FALSE) ## End(Not run)
This function will take the meta data from the product of
combineExpression() and generate a relational data frame to
be used for a chord diagram. Each chord will represent the number of
clones unique and shared across the multiple group.by variable.
If using the downstream circlize R package, please read and cite the
following manuscript.
If looking for more advanced ways for circular visualizations, there
is a great cookbook
for the circlize package.
getCirclize( sc.data, clone.call = NULL, group.by = NULL, method = c("unique", "abundance", "jaccard", "overlap"), proportion = FALSE, symmetric = TRUE, include.self = TRUE, include.metadata = FALSE, min.shared = 0, top.links = NULL, filter.sectors = NULL, palette = "inferno", cloneCall = NULL )getCirclize( sc.data, clone.call = NULL, group.by = NULL, method = c("unique", "abundance", "jaccard", "overlap"), proportion = FALSE, symmetric = TRUE, include.self = TRUE, include.metadata = FALSE, min.shared = 0, top.links = NULL, filter.sectors = NULL, palette = "inferno", cloneCall = NULL )
sc.data |
The single-cell object after |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
group.by |
A column header (or vector of column headers for hierarchical
grouping) in the metadata to group the analysis by (e.g., "sample", "treatment").
If |
method |
The method for calculating link values: |
proportion |
Calculate the relationship by unique clones ( |
symmetric |
If |
include.self |
Include counting the clones within a single group.by comparison. |
include.metadata |
If |
min.shared |
Minimum number of shared clones to include a link (default 0). |
top.links |
Keep only the top N links by value. If |
filter.sectors |
Character vector of sectors to include. If |
palette |
Colors to use for sector color suggestions - input any hcl.pals. |
cloneCall |
A data frame of shared clones between groups formatted for
chordDiagram. If include.metadata = TRUE, returns
a list with links (the edge data frame), sectors (sector-level statistics),
and colors (suggested colors for each sector).
Dillon Corvino, Nick Borcherding
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) # Getting data frame output for Circlize circles <- getCirclize(scRep_example, group.by = "seurat_clusters") # Multi-level grouping for hierarchical chord diagrams scRep_example$Patient <- substring(scRep_example$orig.ident, 1, 3) circles <- getCirclize(scRep_example, group.by = c("Patient", "seurat_clusters")) # Get rich output with sector metadata result <- getCirclize(scRep_example, group.by = "seurat_clusters", include.metadata = TRUE)# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) # Getting data frame output for Circlize circles <- getCirclize(scRep_example, group.by = "seurat_clusters") # Multi-level grouping for hierarchical chord diagrams scRep_example$Patient <- substring(scRep_example$orig.ident, 1, 3) circles <- getCirclize(scRep_example, group.by = c("Patient", "seurat_clusters")) # Get rich output with sector metadata result <- getCirclize(scRep_example, group.by = "seurat_clusters", include.metadata = TRUE)
This function identifies potential doublets by finding common barcodes between TCR and BCR outputs. It extracts unique barcodes from each list of dataframes, finds the intersection of the barcodes, and joins the resulting data.
getContigDoublets(tcrOutput, bcrOutput)getContigDoublets(tcrOutput, bcrOutput)
tcrOutput |
Output of |
bcrOutput |
Output of |
A dataframe of barcodes that exist in both the TCR and BCR data, with
columns from both sets of data. There will be an additional column
contigType of type factor with levels 'TCR' and 'BCR' indicating the
origin of the contig - this will be the new first column.
If there are no doublets, the returned data.frame will have the same colnames but no rows.
Use a specific clonal sequence to highlight on top of the dimensional reduction in single-cell object.
highlightClones(sc.data, clone.call = NULL, sequence = NULL, cloneCall = NULL)highlightClones(sc.data, clone.call = NULL, sequence = NULL, cloneCall = NULL)
sc.data |
The single-cell object to attach after
|
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
sequence |
The specific sequence or sequence to highlight |
cloneCall |
Single-cell object object with new meta data column for indicated clones
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using highlightClones() scRep_example <- highlightClones(scRep_example, clone.call= "aa", sequence = c("CVVSDNTGGFKTIF_CASSVRRERANTGELFF"))# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) # Using combineExpresion() scRep_example <- combineExpression(combined, scRep_example) # Using highlightClones() scRep_example <- highlightClones(scRep_example, clone.call= "aa", sequence = c("CVVSDNTGGFKTIF_CASSVRRERANTGELFF"))
This function loads and processes contig data from various single-cell
immune receptor sequencing formats. It reads data from a directory
(recursively) or from an already loaded list/data frame, transforms it to a
common structure, and returns a list of contigs ready for downstream analysis
with combineTCR() or combineBCR().
Supported file formats and their expected file names:
10X: "filtered_contig_annotations.csv"
AIRR: "airr_rearrangement.tsv"
BD: "Contigs_AIRR.tsv"
Dandelion: "all_contig_dandelion.tsv"
Immcantation: "_data.tsv" (or similar)
“JSON': ".json"
ParseBio: "barcode_report.tsv"
MiXCR: "clones.tsv"
TRUST4: "barcode_report.tsv"
WAT3R: "barcode_results.csv"
loadContigs(input, format = "10X")loadContigs(input, format = "10X")
input |
A directory path containing contig files or a list/data frame of pre-loaded contig data. |
format |
A string specifying the data format. Must be one of:
|
A list of contigs formatted for use with combineTCR() or
combineBCR(). Rows containing only NA values (aside from the barcode)
are dropped.
TRUST4 <- read.csv("https://www.borch.dev/uploads/contigs/TRUST4_contigs.csv") contig.list <- loadContigs(TRUST4, format = "TRUST4") BD <- read.csv("https://www.borch.dev/uploads/contigs/BD_contigs.csv") contig.list <- loadContigs(BD, format = "BD") WAT3R <- read.csv("https://www.borch.dev/uploads/contigs/WAT3R_contigs.csv") contig.list <- loadContigs(WAT3R, format = "WAT3R")TRUST4 <- read.csv("https://www.borch.dev/uploads/contigs/TRUST4_contigs.csv") contig.list <- loadContigs(TRUST4, format = "TRUST4") BD <- read.csv("https://www.borch.dev/uploads/contigs/BD_contigs.csv") contig.list <- loadContigs(BD, format = "BD") WAT3R <- read.csv("https://www.borch.dev/uploads/contigs/WAT3R_contigs.csv") contig.list <- loadContigs(WAT3R, format = "WAT3R")
This function the proportion of amino acids along the residues of the CDR3 amino acid sequence.
percentAA( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, export.table = NULL, palette = "inferno", exportTable = NULL, ... )percentAA( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, export.table = NULL, palette = "inferno", exportTable = NULL, ... )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
aa.length |
The maximum length of the CDR3 amino acid sequence. |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object visualizing amino acid by proportion or a data.frame if
exportTable = TRUE.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using percentAA() percentAA(combined, chain = "TRB", aa.length = 20)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using percentAA() percentAA(combined, chain = "TRB", aa.length = 20)
This function quantifies and visualizes the usage of V, D, or J genes, or gene pairings across T or B cell clones.
percentGeneUsage( input.data, chain = "TRB", genes = "TRBV", group.by = NULL, order.by = NULL, summary.fun = c("percent", "proportion", "count"), plot.type = "heatmap", export.table = NULL, palette = "inferno", exportTable = NULL, ... ) vizGenes( input.data, x.axis = "TRBV", y.axis = NULL, group.by = NULL, plot = "heatmap", order.by = NULL, summary.fun = c("percent", "proportion", "count"), export.table = NULL, palette = "inferno", exportTable = NULL ) percentGenes( input.data, chain = "TRB", gene = "Vgene", group.by = NULL, order.by = NULL, export.table = NULL, summary.fun = c("percent", "proportion", "count"), palette = "inferno", exportTable = NULL ) percentVJ( input.data, chain = "TRB", group.by = NULL, order.by = NULL, summary.fun = c("percent", "proportion", "count"), export.table = NULL, palette = "inferno", exportTable = NULL )percentGeneUsage( input.data, chain = "TRB", genes = "TRBV", group.by = NULL, order.by = NULL, summary.fun = c("percent", "proportion", "count"), plot.type = "heatmap", export.table = NULL, palette = "inferno", exportTable = NULL, ... ) vizGenes( input.data, x.axis = "TRBV", y.axis = NULL, group.by = NULL, plot = "heatmap", order.by = NULL, summary.fun = c("percent", "proportion", "count"), export.table = NULL, palette = "inferno", exportTable = NULL ) percentGenes( input.data, chain = "TRB", gene = "Vgene", group.by = NULL, order.by = NULL, export.table = NULL, summary.fun = c("percent", "proportion", "count"), palette = "inferno", exportTable = NULL ) percentVJ( input.data, chain = "TRB", group.by = NULL, order.by = NULL, summary.fun = c("percent", "proportion", "count"), export.table = NULL, palette = "inferno", exportTable = NULL )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Accepted values: |
genes |
A character vector specifying the gene loci to analyze. Can be a single gene e.g., "TRBV" or "IGHJ" or a pair for genes analysis (e.g., c("TRBV", "TRAV"), or "TRBV", "TRBJ"). |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
summary.fun |
Character string choosing the summary statistic -
|
plot.type |
The type of plot to return: |
export.table |
If |
palette |
Colors to use in visualization - input any |
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
x.axis |
Gene segments to separate the x-axis, such as |
y.axis |
Variable to separate the y-axis, can be both categorical
or other gene gene segments, such as |
plot |
The type of plot to return - heatmap or barplot. |
gene |
|
A ggplot object displaying a heatmap or bar plot of gene usage.
If exportTable = TRUE, a matrix or data frame of the raw data is returned.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Visualize single gene (TRBV) usage as a heatmap, grouped by sample percentGeneUsage(combined, genes = "TRBV", group.by = "sample", plot.type = "heatmap", summary.fun = "percent") # Visualize single gene (TRBV) usage as a barplot, grouped by sample percentGeneUsage(combined, genes = "TRBV", group.by = "sample", plot.type = "barplot", summary.fun = "count") # Visualize paired gene (TRBV-TRBJ) usage as a heatmap percentGeneUsage(combined[1:2], genes = c("TRBV", "TRBJ"), group.by = "sample", plot.type = "heatmap", summary.fun = "proportion") # Export the raw data table for single gene usage trbv_usage_table <- percentGeneUsage(combined, genes = "TRBV", group.by = "sample", export.table = TRUE, summary.fun = "count") # Export the raw data table for paired gene usage trbv_trbj_usage_table <- percentGeneUsage(combined, genes = c("TRBV", "TRBJ"), group.by = "sample", export.table = TRUE, summary.fun = "percent") # Visualize paired gene (TRAV-TRAJ) usage as a heatmap vizGenes(combined[1:2], x.axis = "TRAV", y.axis = "TRAJ", group.by = "sample", summary.fun = "count") # Visualize cross-chain gene pairing (TRBV-TRAV) vizGenes(combined[1:2], x.axis = "TRBV", y.axis = "TRAV", group.by = "sample", summary.fun = "percent") # Quantify and visualize TRA V-gene usage as a heatmap percentGenes(combined, chain = "TRA", gene = "Vgene", group.by = "sample", summary.fun = "percent") # Quantify TRA J-gene usage and export the table traj_usage_table <- percentGenes(combined, chain = "TRA", gene = "Jgene", group.by = "sample", export.table = TRUE, summary.fun = "count") # Quantify and visualize TRB V-J gene pairings as a heatmap percentVJ(combined[1:2], chain = "TRB", group.by = "sample", summary.fun = "percent") # 2. Quantify TRA V-J gene pairings and export the table trav_traj_table <- percentVJ(combined, chain = "TRA", group.by = "sample", export.table = TRUE, summary.fun = "proportion")# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Visualize single gene (TRBV) usage as a heatmap, grouped by sample percentGeneUsage(combined, genes = "TRBV", group.by = "sample", plot.type = "heatmap", summary.fun = "percent") # Visualize single gene (TRBV) usage as a barplot, grouped by sample percentGeneUsage(combined, genes = "TRBV", group.by = "sample", plot.type = "barplot", summary.fun = "count") # Visualize paired gene (TRBV-TRBJ) usage as a heatmap percentGeneUsage(combined[1:2], genes = c("TRBV", "TRBJ"), group.by = "sample", plot.type = "heatmap", summary.fun = "proportion") # Export the raw data table for single gene usage trbv_usage_table <- percentGeneUsage(combined, genes = "TRBV", group.by = "sample", export.table = TRUE, summary.fun = "count") # Export the raw data table for paired gene usage trbv_trbj_usage_table <- percentGeneUsage(combined, genes = c("TRBV", "TRBJ"), group.by = "sample", export.table = TRUE, summary.fun = "percent") # Visualize paired gene (TRAV-TRAJ) usage as a heatmap vizGenes(combined[1:2], x.axis = "TRAV", y.axis = "TRAJ", group.by = "sample", summary.fun = "count") # Visualize cross-chain gene pairing (TRBV-TRAV) vizGenes(combined[1:2], x.axis = "TRBV", y.axis = "TRAV", group.by = "sample", summary.fun = "percent") # Quantify and visualize TRA V-gene usage as a heatmap percentGenes(combined, chain = "TRA", gene = "Vgene", group.by = "sample", summary.fun = "percent") # Quantify TRA J-gene usage and export the table traj_usage_table <- percentGenes(combined, chain = "TRA", gene = "Jgene", group.by = "sample", export.table = TRUE, summary.fun = "count") # Quantify and visualize TRB V-J gene pairings as a heatmap percentVJ(combined[1:2], chain = "TRB", group.by = "sample", summary.fun = "percent") # 2. Quantify TRA V-J gene pairings and export the table trav_traj_table <- percentVJ(combined, chain = "TRA", group.by = "sample", export.table = TRUE, summary.fun = "proportion")
This function calculates and visualizes the frequency of k-mer motifs for either nucleotide (nt) or amino acid (aa) sequences. It produces a heatmap showing the relative composition of the most variable motifs across samples or groups.
percentKmer( input.data, chain = "TRB", clone.call = NULL, group.by = NULL, order.by = NULL, motif.length = 3, min.depth = 3, top.motifs = 30, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )percentKmer( input.data, chain = "TRB", clone.call = NULL, group.by = NULL, order.by = NULL, motif.length = 3, min.depth = 3, top.motifs = 30, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Use |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
motif.length |
The length of the kmer to analyze |
min.depth |
Minimum count a motif must reach to be retained in the
output ( |
top.motifs |
Return the n most variable motifs as a function of median absolute deviation |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The function first calculates k-mer frequencies for each sample/group. By default, it then identifies the 30 most variable motifs based on the Median Absolute Deviation (MAD) across all samples and displays their frequencies in a heatmap.
A ggplot object displaying a heatmap of motif percentages.
If exportTable = TRUE, a matrix of the raw data is returned.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using percentKmer() percentKmer(combined, chain = "TRB", motif.length = 3)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using percentKmer() percentKmer(combined, chain = "TRB", motif.length = 3)
This function the diversity amino acids along the residues of the CDR3
amino acid sequence. Please see clonalDiversity() for more information
on the underlying methods for diversity/entropy calculations. Positions
without variance will have a value reported as 0 for the purposes of comparison.
positionalEntropy( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, method = "norm.entropy", export.table = NULL, palette = "inferno", exportTable = NULL, ... )positionalEntropy( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, method = "norm.entropy", export.table = NULL, palette = "inferno", exportTable = NULL, ... )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
aa.length |
The maximum length of the CDR3 amino acid sequence. |
method |
The method to calculate the entropy/diversity -
|
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
A ggplot object displaying entropy or diversity by amino acid position.
If exportTable = TRUE, a matrix of the raw data is returned.
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using positionalEntropy() positionalEntropy(combined, chain = "TRB", aa.length = 20)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using positionalEntropy() positionalEntropy(combined, chain = "TRB", aa.length = 20)
This function analyzes the physicochemical properties of amino acids at each position along the CDR3 sequence. It calculates the mean property value and the 95% confidence interval for each position across one or more groups, visualizing the results as a line plot with a confidence ribbon.
positionalProperty( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, method = "atchleyFactors", export.table = NULL, palette = "inferno", exportTable = NULL, ... )positionalProperty( input.data, chain = "TRB", group.by = NULL, order.by = NULL, aa.length = 20, method = "atchleyFactors", export.table = NULL, palette = "inferno", exportTable = NULL, ... )
input.data |
The product of |
chain |
The TCR/BCR chain to use. Use |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
order.by |
A character vector defining the desired order of elements
of the |
aa.length |
The maximum length of the CDR3 amino acid sequence. |
method |
Character string (one of the supported names)
Defaults to |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals |
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The function uses one of several established physicochemical property scales to convert amino acid sequences into numerical vectors. More information for the individual methods can be found at the following citations:
atchleyFactors: citation
crucianiProperties: citation
FASGAI: citation
kideraFactors: citation
MSWHIM: citation
ProtFP: citation
stScales: citation
tScales: citation
VHSE: citation
zScales: citation
A ggplot object displaying property by amino acid position.
If exportTable = TRUE, a matrix of the raw data is returned.
Florian Bach, Nick Borcherding
# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using positionalProperty() positionalProperty(combined, chain = "TRB", method = "atchleyFactors", aa.length = 20)# Making combined contig data combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Using positionalProperty() positionalProperty(combined, chain = "TRB", method = "atchleyFactors", aa.length = 20)
Most single-cell workflows use highly-expressed and highly-variable genes for the initial calculation of PCA and subsequent dimensional reduction. This function will remove the TCR and/or BCR genes from the variable features in a Seurat object or from a vector (potentially generated by the Bioconductor scran workflow).
quietVDJgenes(input.data, ...) quietTCRgenes(input.data, ...) ## Default S3 method: quietTCRgenes(input.data, ...) ## S3 method for class 'Seurat' quietTCRgenes(input.data, assay = NULL, ...) quietBCRgenes(input.data, ...) ## Default S3 method: quietBCRgenes(input.data, ...) ## S3 method for class 'Seurat' quietBCRgenes(input.data, assay = NULL, ...)quietVDJgenes(input.data, ...) quietTCRgenes(input.data, ...) ## Default S3 method: quietTCRgenes(input.data, ...) ## S3 method for class 'Seurat' quietTCRgenes(input.data, assay = NULL, ...) quietBCRgenes(input.data, ...) ## Default S3 method: quietBCRgenes(input.data, ...) ## S3 method for class 'Seurat' quietBCRgenes(input.data, assay = NULL, ...)
input.data |
Single-cell object in Seurat format or vector of variable genes to use in reduction |
... |
Reserved for future arguments |
assay |
The Seurat assay slot to use to remove immune receptor genes from, NULL value will default to the default assay |
Seurat object or vector list with TCR genes removed.
Nicky de Vrij, Nikolaj Pagh, Nick Borcherding, Qile Yang
example <- quietVDJgenes(scRep_example) scRep <- quietTCRgenes(scRep_example) ibex_example <- quietBCRgenes(scRep_example)example <- quietVDJgenes(scRep_example) scRep <- quietTCRgenes(scRep_example) ibex_example <- quietBCRgenes(scRep_example)
The object is compatible with contig_list and the TCR
sequencing data can be added with combineExpression. The data is
from 4 patients with acute respiratory distress, with samples taken
from both the lung and peripheral blood. More information on the
data can be found in the following
manuscript.
This function utilizes the STARTRAC approach to calculate T cell
diversity metrics based on the work of Zhang et al. (2018, Nature)
PMID: 30479382. It can compute
three distinct indices: clonal expansion (expa), cross-tissue migration
(migr), and state transition (tran).
StartracDiversity( sc.data, clone.call = NULL, chain = "both", index = c("expa", "migr", "tran"), type = NULL, group.by = NULL, pairwise = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )StartracDiversity( sc.data, clone.call = NULL, chain = "both", index = c("expa", "migr", "tran"), type = NULL, group.by = NULL, pairwise = NULL, export.table = NULL, palette = "inferno", cloneCall = NULL, exportTable = NULL, ... )
sc.data |
The single-cell object after |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
chain |
The TCR/BCR chain to use. Use |
index |
A character vector specifying which indices to calculate. Options: "expa", "migr", "tran". Default is all three. |
type |
The metadata variable that specifies tissue type for migration analysis. |
group.by |
A column header in the metadata or lists to group the analysis
by (e.g., "sample", "treatment"). If |
pairwise |
The metadata column to be used for pairwise comparisons.
Set to the |
export.table |
If |
palette |
Colors to use in visualization - input any hcl.pals. |
cloneCall |
|
exportTable |
|
... |
Additional arguments passed to the ggplot theme |
The function requires a type variable in the metadata, which specifies the
tissue origin or any other categorical variable for migration analysis.
Indices:
expa (Clonal Expansion): Measures the extent of clonal
proliferation within a T cell cluster. It is calculated as
1 - normalized Shannon entropy. A higher value indicates greater
expansion of a few clones.
migr (Cross-Tissue Migration): Quantifies the movement of
clonal T cells across different tissues (as defined by the type
parameter). It is based on the entropy of a clonotype's distribution
across tissues.
tran (State Transition): Measures the developmental transition of clonal T cells between different functional clusters. It is based on the entropy of a clonotype's distribution across clusters.
Pairwise Analysis:
The pairwise parameter enables the calculation of migration or transition
between specific pairs of tissues or clusters, respectively.
For migration (index = "migr"), set pairwise to the type column
(e.g., pairwise = "Type").
For transition (index = "tran"), set pairwise to "cluster".
A ggplot object visualizing STARTRAC diversity metrics or data.frame if
exportTable = TRUE.
Liangtao Zheng
# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident,1,3) scRep_example$Type <- substring(scRep_example$orig.ident,4,4) # Calculate a single index (expansion) StartracDiversity(scRep_example, type = "Type", group.by = "Patient", index = "expa") # Calculate pairwise transition StartracDiversity(scRep_example, type = "Type", group.by = "Patient", index = "tran", pairwise = "cluster")# Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) scRep_example$Patient <- substring(scRep_example$orig.ident,1,3) scRep_example$Type <- substring(scRep_example$orig.ident,4,4) # Calculate a single index (expansion) StartracDiversity(scRep_example, type = "Type", group.by = "Patient", index = "expa") # Calculate pairwise transition StartracDiversity(scRep_example, type = "Type", group.by = "Patient", index = "tran", pairwise = "cluster")
This function allows for the subsetting of the product of
combineTCR() or combineBCR()
by the name of the individual list element.
subsetClones(input.data, name, variables = NULL)subsetClones(input.data, name, variables = NULL)
input.data |
The product of |
name |
The column header/name to use for subsetting. |
variables |
The values to subset by, must be in the in the variable. |
list of contigs that have been filtered for the name parameter
combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) subset <- subsetClones(combined, name = "sample", variables = c("P17B"))combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) subset <- subsetClones(combined, name = "sample", variables = c("P17B"))
This function creates a chord diagram visualization of shared clones between
groups using the circlize package. It provides a convenient wrapper around
getCirclize() that handles the circlize plotting code automatically.
vizCirclize( sc.data, clone.call = NULL, group.by = NULL, method = c("unique", "jaccard", "overlap"), proportion = FALSE, directional = FALSE, self.link = 1, include.self = TRUE, transparency = 0.5, link.visible = TRUE, annotate.sectors = TRUE, min.shared = 0, palette = "inferno", sector.colors = NULL, export.table = FALSE )vizCirclize( sc.data, clone.call = NULL, group.by = NULL, method = c("unique", "jaccard", "overlap"), proportion = FALSE, directional = FALSE, self.link = 1, include.self = TRUE, transparency = 0.5, link.visible = TRUE, annotate.sectors = TRUE, min.shared = 0, palette = "inferno", sector.colors = NULL, export.table = FALSE )
sc.data |
The single-cell object after |
clone.call |
Defines the clonal sequence grouping. Accepted values
are: |
group.by |
A column header (or vector of column headers for hierarchical grouping) in the metadata to group the analysis by. |
method |
The method for calculating link values: |
proportion |
Calculate the relationship by unique clones ( |
directional |
If |
self.link |
How to handle self-links. |
include.self |
Include self-links (clones within a single group). Default is |
transparency |
Transparency of the chord links (0-1). Default is |
link.visible |
If |
annotate.sectors |
If |
min.shared |
Minimum number of shared clones to include a link (default 0). |
palette |
Colors to use for sectors - input any hcl.pals. |
sector.colors |
Named vector of colors for specific sectors. Overrides palette. |
export.table |
If |
This function requires the circlize package to be installed. If circlize is not available, the function will return the data that would be used for plotting and provide instructions for manual plotting.
The chord diagram shows relationships between groups (sectors) where the width of each chord represents the number or proportion of shared clones between the connected groups.
Invisibly returns the circlize data list. If circlize is not installed
or export.table = TRUE, returns the data that would be used for plotting.
Nick Borcherding
getCirclize() for generating the underlying data
## Not run: # Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) # Simple chord diagram vizCirclize(scRep_example, group.by = "seurat_clusters") # Directional chord diagram with arrows vizCirclize(scRep_example, group.by = "seurat_clusters", directional = TRUE) # Multi-level grouping scRep_example$Patient <- substring(scRep_example$orig.ident, 1, 3) vizCirclize(scRep_example, group.by = c("Patient", "seurat_clusters")) ## End(Not run)## Not run: # Getting the combined contigs combined <- combineTCR(contig_list, samples = c("P17B", "P17L", "P18B", "P18L", "P19B","P19L", "P20B", "P20L")) # Getting a sample of a Seurat object scRep_example <- get(data("scRep_example")) scRep_example <- combineExpression(combined, scRep_example) # Simple chord diagram vizCirclize(scRep_example, group.by = "seurat_clusters") # Directional chord diagram with arrows vizCirclize(scRep_example, group.by = "seurat_clusters", directional = TRUE) # Multi-level grouping scRep_example$Patient <- substring(scRep_example$orig.ident, 1, 3) vizCirclize(scRep_example, group.by = c("Patient", "seurat_clusters")) ## End(Not run)