Title: | Analysis Tools for 10X V(D)J Data |
---|---|
Description: | This package provides functions for handling and analyzing immune receptor repertoire data, such as produced by the CellRanger V(D)J pipeline. This includes reading the data into R, merging it with paired single-cell data, quantifying clonotype abundances, calculating diversity metrics, and producing common plots. It implements the E-M Algorithm for clonotype assignment, along with other methods, which makes use of ambiguous cells for improved quantification. |
Authors: | Kelly Street [aut, cre] , Mercedeh Movassagh [aut] , Jill Lundell [aut] , Jared Brown [ctb], Linglin Huang [ctb], Mingzhi Ye [ctb] |
Maintainer: | Kelly Street <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.9.0 |
Built: | 2024-12-27 04:45:33 UTC |
Source: | https://github.com/bioc/VDJdive |
abundanceVDJ
creates a dot plot using ggplot that shows the
number of reads for each clonotype in each sample and labels the most
abundant clonotypes.
abundanceVDJ(x, ...) ## S4 method for signature 'clonoStats' abundanceVDJ(x, annotate = 5, title = NULL)
abundanceVDJ(x, ...) ## S4 method for signature 'clonoStats' abundanceVDJ(x, annotate = 5, title = NULL)
x |
A |
... |
additional arguments. |
annotate |
An integer that specifies how many of the most abundant clonotypes should be annotated on the plot. |
title |
Character vector with an optional title. If FALSE, no title is generated. |
Returns a ggplot
plot with a dot plot that shows the
abundance of the clonotypes in each sample. The most abundant
clonotypes are annotated on the plot and ordered from most abundant
to least abundant.
data('contigs') x <- clonoStats(contigs) abundanceVDJ(x)
data('contigs') x <- clonoStats(contigs) abundanceVDJ(x)
Matches CellRanger V(D)J data to paired data in an existing
SingleCellExperiment
object.
addVDJtoSCE(samples, sce, ...) ## S4 method for signature 'SplitDataFrameList,SingleCellExperiment' addVDJtoSCE(samples, sce, sample.names = NULL, barcode = "Barcode") ## S4 method for signature 'character,SingleCellExperiment' addVDJtoSCE(samples, sce, sample.names = names(samples), barcode = "Barcode")
addVDJtoSCE(samples, sce, ...) ## S4 method for signature 'SplitDataFrameList,SingleCellExperiment' addVDJtoSCE(samples, sce, sample.names = NULL, barcode = "Barcode") ## S4 method for signature 'character,SingleCellExperiment' addVDJtoSCE(samples, sce, sample.names = names(samples), barcode = "Barcode")
samples |
A character vector containing one or more directory names,
each corresponding to a 10X sample. Each directory should contain a file
named |
sce |
A |
... |
additional arguments. |
sample.names |
A character vector of length equal to |
barcode |
The column name from the |
Matching cell barcodes between data objects can cause problems,
because different methods have different ways of ensuring barcodes are
unique across all samples. This function and readVDJcontigs
follow the naming conventions of read10xCounts
,
where the sample index (in the samples
vector) is appended to each
cell barcode, to ensure that each barcode is unique, across all samples. If
sce
was created by a different method, such as conversion from a
Seurat
object, you may need to check the barcode naming convention.
A SingleCellExperiment
object
with an element named "contigs"
added to the colData
,
representing the V(D)J data.
# load example V(D)J data data('contigs') # make SCE object with matching barcodes and sample IDs ncells <- 24 u <- matrix(rpois(1000 * ncells, 5), ncol = ncells) barcodes <- vapply(contigs[,'barcode'], function(x){ x[1] }, 'A') samples <- vapply(contigs[,'sample'], function(x){ x[1] }, 'A') sce <- SingleCellExperiment::SingleCellExperiment(assays = list(counts = u), colData = data.frame(Barcode = barcodes, sample = samples)) sce <- addVDJtoSCE(contigs, sce) sce$contigs
# load example V(D)J data data('contigs') # make SCE object with matching barcodes and sample IDs ncells <- 24 u <- matrix(rpois(1000 * ncells, 5), ncol = ncells) barcodes <- vapply(contigs[,'barcode'], function(x){ x[1] }, 'A') samples <- vapply(contigs[,'sample'], function(x){ x[1] }, 'A') sce <- SingleCellExperiment::SingleCellExperiment(assays = list(counts = u), colData = data.frame(Barcode = barcodes, sample = samples)) sce <- addVDJtoSCE(contigs, sce) sce$contigs
barVDJ
creates a barplot using ggplot that shows the
number of reads in the sample and colors the sample in accordance to the
amount of diversity.
barVDJ(x, ...) ## S4 method for signature 'Matrix' barVDJ(x, title = NULL, legend = FALSE) ## S4 method for signature 'matrix' barVDJ(x, ...) ## S4 method for signature 'clonoStats' barVDJ(x, ...)
barVDJ(x, ...) ## S4 method for signature 'Matrix' barVDJ(x, title = NULL, legend = FALSE) ## S4 method for signature 'matrix' barVDJ(x, ...) ## S4 method for signature 'clonoStats' barVDJ(x, ...)
x |
A |
... |
additional arguments. |
title |
Character vector with an optional title. If FALSE, no title is generated. |
legend |
If TRUE, a legend will be included with the plot. If FALSE, no legend is included in the plot. |
Returns a ggplot
plot with a barplot that shows the
abundance of the clonotypes. The coloring indicates the number of cells
for each clonotype with darker colors being clonotypes with a single cell
(singletons) and lighter colors having more cells with that clonotype
(expanded clonotype).
data('contigs') x <- clonoStats(contigs) barVDJ(x)
data('contigs') x <- clonoStats(contigs) barVDJ(x)
boxVDJ
creates a box plot of the specified diversity.
boxVDJ(d, ...) ## S4 method for signature 'matrix' boxVDJ( d, sampleGroups = NULL, method = c("shannon", "simpson", "invsimpson", "chao1", "chaobunge"), title = NULL, legend = FALSE )
boxVDJ(d, ...) ## S4 method for signature 'matrix' boxVDJ( d, sampleGroups = NULL, method = c("shannon", "simpson", "invsimpson", "chao1", "chaobunge"), title = NULL, legend = FALSE )
d |
A |
... |
additional arguments. |
sampleGroups |
A |
method |
Identifies the type of diversity that is to be plotted. |
title |
Character vector with an optional title. |
legend |
If TRUE, a legend will be included with the plot. If FALSE, no legend is included in the plot. |
Returns a ggplot
plot with a box plot that shows the
diversity for each sample. A box plot is created for each of the
grouping variables. The individual diversity measures are
plotted on the box plots.
data('contigs') x <- clonoStats(contigs) d <- calculateDiversity(x) sampleGroups <- data.frame(Sample = c("sample1", "sample2"), Group = c("Cancer", "Normal")) boxVDJ(d, sampleGroups = sampleGroups, method = "shannon", title = "Shannon diversity", legend = FALSE)
data('contigs') x <- clonoStats(contigs) d <- calculateDiversity(x) sampleGroups <- data.frame(Sample = c("sample1", "sample2"), Group = c("Cancer", "Normal")) boxVDJ(d, sampleGroups = sampleGroups, method = "shannon", title = "Shannon diversity", legend = FALSE)
This function uses various methods to estimate the clonotypic diversity of samples based on a matrix of clonotype abundances (samples are columns).
calculateDiversity(x, ...) ## S4 method for signature 'clonoStats' calculateDiversity( x, methods = c("all", "nCells", "nClonotypes", "shannon", "normentropy", "invsimpson", "ginisimpson", "chao1", "chaobunge"), ... ) ## S4 method for signature 'SingleCellExperiment' calculateDiversity(x, ...)
calculateDiversity(x, ...) ## S4 method for signature 'clonoStats' calculateDiversity( x, methods = c("all", "nCells", "nClonotypes", "shannon", "normentropy", "invsimpson", "ginisimpson", "chao1", "chaobunge"), ... ) ## S4 method for signature 'SingleCellExperiment' calculateDiversity(x, ...)
x |
A matrix of abundance values where rows are features (clonotypes)
and columns are samples. This is created with |
... |
Additional arguments passed to external calculation methods. |
methods |
A character vector specifying which diversity measures to use
(default = |
Available methods are total cells with appropriate TCR data
('nCells'
, not a diversity measure, but a useful point of
comparison), total clonotypes ('nClonotypes'
), Shannon entropy
('shannon'
), Simpson index ('simpson'
), inverse Simpson index
('invsimpson'
), Chao1 richness ('chao1'
), and Chao-Bunge
richness ('chaobunge'
). A special value of 'all'
is also
allowed, which will run all methods listed above.
The 'chao1'
and 'chaobunge'
estimates assume all
abundances are integers. When this is not the case for the input matrix,
k
, all values are multiplied by the scaling_factor
and
rounded to the nearest integer. The resulting estimate is then divided by
scaling_factor
to return to the original scale. The
'shannon'
, 'simpson'
, and 'invsimpson'
methods work
with any input type.
A matrix of diversity estimates for each sample. Note that the
'chaobunge'
method also includes an estimate of the standard error.
data('contigs') x <- clonoStats(contigs) calculateDiversity(x)
data('contigs') x <- clonoStats(contigs) calculateDiversity(x)
Assign clonotype labels to cells and produce two summary tables:
the clonotypes x samples
table of abundances and the counts x
samples
table of clonotype frequencies.
clonoStats(x, ...) ## S4 method for signature 'SplitDataFrameList' clonoStats( x, group = "sample", type = NULL, assignment = FALSE, method = "EM", lang = c("cpp", "r"), thresh = 0.01, iter.max = 1000, BPPARAM = SerialParam() ) ## S4 method for signature 'SingleCellExperiment' clonoStats(x, contigs = "contigs", group = "sample", ...) ## S4 method for signature 'clonoStats' clonoStats(x, group = NULL, lang = c("cpp", "r"))
clonoStats(x, ...) ## S4 method for signature 'SplitDataFrameList' clonoStats( x, group = "sample", type = NULL, assignment = FALSE, method = "EM", lang = c("cpp", "r"), thresh = 0.01, iter.max = 1000, BPPARAM = SerialParam() ) ## S4 method for signature 'SingleCellExperiment' clonoStats(x, contigs = "contigs", group = "sample", ...) ## S4 method for signature 'clonoStats' clonoStats(x, group = NULL, lang = c("cpp", "r"))
x |
A |
... |
additional arguments. |
group |
character. The name of the column in |
type |
character. The type of VDJ data (one of |
assignment |
logical. Whether or not to return the full |
method |
character. Which method to use for assigning cell-level
clonotypes. Options are |
lang |
character. Indicates which implementation of certain methods to
use. The EM algorithm is implemented in both pure R ( |
thresh |
Numeric threshold for convergence of the EM algorithm.
Indicates the maximum allowable deviation in a count between updates. Only
used if |
iter.max |
Maximum number of iterations for the EM algorithm. Only used
if |
BPPARAM |
A BiocParallelParam object specifying the
parallel backend for distributed clonotype assignment operations (split by
|
contigs |
character. When |
Assign cells (with at least one V(D)J contig) to clonotypes and
produce summary tables that can be used for downstream analysis. Clonotype
assignment can be handled in multiple ways depending on the choice of
"method"
:
"EM"
: Cells are assigned probabilistically to their most
likely clonotype(s) with the Expectation-Maximization (EM) algorithm. For
ambiguous cells, this leads to proportional (non-integer) assignment across
multiple clonotypes and a frequency table of (non-integer) expected
counts.
"unique"
: Cells are assigned a clonotype if (and only if)
they can be uniquely assigned a single clonotype. For a T cell, this means
having exactly one alpha chain and one beta chain.
"CellRanger"
: Clonotype labels are taken from contig data
and matched across samples.
column name in contig data
: Similar to "unique"
, but
additionally, cells with multiples of a particular chain are assigned a
"dominant" clonotype based on which contig has the higher value in this
column (typical choices being "umis"
or "reads"
).
type of chain in contig data
: Clonotypes are based entirely
on this type of chain (eg. "TRA"
or "TRB"
) and cells may be
assigned to multiple clonotypes, if multiples of that chain are present.
The "EM"
, "unique"
, and UMI/read-based quantification
methods all define a clonotype as a pair of specific chains (alpha and beta
for T cells, heavy and light for B cells). Unlike other methods, the EM
algorithm assigns clonotypes probabilistically, which can lead to
non-integer counts for cells with ambiguous information (ie. only an alpha
chain, or two alphas and one beta chain).
We highly recommend providing information on each cell's sample of origin, as this can speed up computation and provide more accurate results. This is particularly important for the EM algorithm, which shares information across cells in the same group, so splitting by sample can improve accuracy by removing extraneous clonotypes from the set of possibilities for a particular cell.
Returns an object of class clonoStats
, containing group-level
clonotype summaries. May optionally include a sparse matrix of cell-level
assignment information, if assignment = TRUE
. If x
is a
SingleCellExperiment
object, this output is added to the metadata.
data('contigs') clonoStats(contigs)
data('contigs') clonoStats(contigs)
clonoStats
object classThe clonoStats
class is designed to hold the output of
the clonoStats
function. This always includes two group-level
summaries: clonotype abundances and clonotype frequencies. "Group" most
often refers to sample of origin, but may alternatively refer to any
partitioning of cells, such as clusters. Clonotype names are stored
efficiently in two factors. Additionally, a large, sparse matrix of each
cell's clonotype assignment may be included.
## S4 method for signature 'clonoStats' show(object) clonoNames(object) ## S4 method for signature 'clonoStats' clonoNames(object) ## S4 method for signature 'SingleCellExperiment' clonoNames(object) clonoAbundance(object) ## S4 method for signature 'clonoStats' clonoAbundance(object) ## S4 method for signature 'SingleCellExperiment' clonoAbundance(object) clonoFrequency(object) ## S4 method for signature 'clonoStats' clonoFrequency(object) ## S4 method for signature 'SingleCellExperiment' clonoFrequency(object) clonoAssignment(object) ## S4 method for signature 'clonoStats' clonoAssignment(object) ## S4 method for signature 'SingleCellExperiment' clonoAssignment(object) clonoGroup(object) ## S4 method for signature 'clonoStats' clonoGroup(object) ## S4 method for signature 'SingleCellExperiment' clonoGroup(object)
## S4 method for signature 'clonoStats' show(object) clonoNames(object) ## S4 method for signature 'clonoStats' clonoNames(object) ## S4 method for signature 'SingleCellExperiment' clonoNames(object) clonoAbundance(object) ## S4 method for signature 'clonoStats' clonoAbundance(object) ## S4 method for signature 'SingleCellExperiment' clonoAbundance(object) clonoFrequency(object) ## S4 method for signature 'clonoStats' clonoFrequency(object) ## S4 method for signature 'SingleCellExperiment' clonoFrequency(object) clonoAssignment(object) ## S4 method for signature 'clonoStats' clonoAssignment(object) ## S4 method for signature 'SingleCellExperiment' clonoAssignment(object) clonoGroup(object) ## S4 method for signature 'clonoStats' clonoGroup(object) ## S4 method for signature 'SingleCellExperiment' clonoGroup(object)
object |
a |
An object of class clonoStats
.
show(clonoStats)
: a short summary of a clonoStats
object.
clonoNames()
: Get the full clonotype names
clonoAbundance()
: Get the clonotype abundance matrix (clonotype
counts per group).
clonoFrequency()
: Get the matrix of clonotype frequencies
(singletons, doubletons, etc. per group).
clonoAssignment()
: Get the matrix of cell-level clonotype
assignments (mapping cells to clonotypes).
clonoGroup()
: Get the factor variable of group labels
abundance
Summary table of clonotype abundances within each group (sample). Provides specific information about how often each clonotype is observed in each group.
frequency
Summary table of clonotype frequencies within each group (sample). Provides a summary of clonotype abundances (ie. number of singletons, doubletons, etc.).
group
Factor variable giving the group of origin for each cell.
assignment
Optional matrix of clonotype assignments in each individual
cell. Rows correspond to cells and row sums are all 0 or 1. When using
clonoStats
with method = "unique"
or method =
"CellRanger"
, all values will be 0 or 1. With method = "EM"
,
fractional values are allowed, representing the assignment confidence.
names1
Factor listing the first component of each clonotype name (alpha chains, for TCR data).
names2
Factor listing the second component of each clonotype name (beta chains, for TCR data).
data('contigs') cs <- clonoStats(contigs) cs
data('contigs') cs <- clonoStats(contigs) cs
Data are a small subset of the TCR-seq data from the paper, "Progressive immune dysfunction with advancing disease stage in renal cell carcinoma" (Braun et al. 2021). The full dataset can be obtained from dbGap phs002252.v1.p1.
data(contigs)
data(contigs)
A SplitDataFrameList with six elements. Each list element
contains the TCR-seq data for a single cell in a DFrame
.
Each DFrame
has 19 variables and as many rows as there
are contigs for the cell.
The variables in the dataset are the same as those in the contig_annotations.csv file created by 10X. The meaning of each variable label is specified at https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/output/annotation, but they are also summarized below:
Cell barcode for the contig in the list element.
True or False value indicating whether the barcode was called as a cell.
Unique identifier for this contig.
True or False value indicating whether the contig was called as high-confidence (unlikely to be a chimeric sequence or some other artifact).
The contig sequence length in nucleotides.
The chain associated with this contig; for example, TRA, TRB, IGK, IGL, or IGH. A value of "Multi" indicates that segments from multiple chains were present.
The highest-scoring V segment, for example, TRAV1-1.
The highest-scoring D segment, for example, TRBD1.
The highest-scoring J segment, for example, TRAJ1-1.
The highest-scoring C segment, for example, TRAC.
If the contig was declared as full-length.
If the contig was declared as productive.
The predicted CDR3 amino acid sequence.
The predicted CDR3 nucleotide sequence.
The number of reads aligned to this contig.
The number of distinct UMIs aligned to this contig.
The ID of the clonotype to which this cell barcode was assigned.
The ID of the consensus sequence to which this contig was assigned.
Sample identifier. The data for contigs come from two different samples.
Braun, David A., et al. "Progressive immune dysfunction with advancing disease stage in renal cell carcinoma." Cancer cell 39, no. 5 (2021): 632-648.
data('contigs') x <- clonoStats(contigs)
data('contigs') x <- clonoStats(contigs)
pieVDJ
creates a list of pie charts created using ggplot
that shows the the level of expansion in each clonotype.
pieVDJ(x, ...) ## S4 method for signature 'Matrix' pieVDJ(x, legend = "bottom") ## S4 method for signature 'matrix' pieVDJ(x, ...) ## S4 method for signature 'clonoStats' pieVDJ(x, ...)
pieVDJ(x, ...) ## S4 method for signature 'Matrix' pieVDJ(x, legend = "bottom") ## S4 method for signature 'matrix' pieVDJ(x, ...) ## S4 method for signature 'clonoStats' pieVDJ(x, ...)
x |
A |
... |
additional arguments. |
legend |
Can take on the values use in the legend.position command in ggplot ("left","top", "right", "bottom", or a numeric vector) to indicate where the legend should be placed. If left NULL, no legend will be created. |
If x
contains more than one sample, a list of pie charts
will be returned. If x
contains only one sample, the pie chart
will be returned. The coloring indicates the number of cells
for each clonotype with darker colors being clonotypes with a single cell
(singletons) and lighter colors having more cells with that clonotype
(expanded clonotype).
data('contigs') x <- clonoStats(contigs) pieVDJ(x)
data('contigs') x <- clonoStats(contigs) pieVDJ(x)
Creates a SplitDataFrameList
(see
DataFrameList
) from a character vector of directory
names corresponding to the output of the CellRanger V(D)J pipeline.
readVDJcontigs(samples, ...) ## S4 method for signature 'character' readVDJcontigs(samples, sample.names = names(samples))
readVDJcontigs(samples, ...) ## S4 method for signature 'character' readVDJcontigs(samples, sample.names = names(samples))
samples |
A character vector containing one or more directory names,
each corresponding to a 10X sample. Each directory should contain a file
named |
... |
additional arguments. |
sample.names |
A character vector of length equal to |
The resulting list of DataFrame
s contains all the data in
filtered_contig_annotations.csv
, split by cell barcode. Note that
the index of each sample in samples
is concatenated to the cell
barcodes, so that cells from different samples cannot have identical
barcodes.
A SplitDataFrameList
object containing data on all contigs,
grouped by cell barcode.
# write the example data to a temporary directory example(writeVDJcontigs) # specify sample locations and read in data samples <- file.path(loc, c('sample1','sample2')) contigs <- readVDJcontigs(samples)
# write the example data to a temporary directory example(writeVDJcontigs) # specify sample locations and read in data samples <- file.path(loc, c('sample1','sample2')) contigs <- readVDJcontigs(samples)
This function uses the Breakaway method to estimate the
clonotype richness (total number of clonotypes) present in each group of a
clonoStats
object.
runBreakaway(x, ...) ## S4 method for signature 'clonoStats' runBreakaway(x, nof1 = FALSE, ...)
runBreakaway(x, ...) ## S4 method for signature 'clonoStats' runBreakaway(x, nof1 = FALSE, ...)
x |
A |
... |
Additional arguments passed to |
nof1 |
logical. Indicates whether to use the |
A list of alpha_estimate
objects, one per group, containing
detailed results of running the Breakaway estimator on the vector of
clonotype frequencies from that group.
Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics.
Willis, A. (2015). Species richness estimation with high diversity but spurious singletons. arXiv.
data('contigs') x <- clonoStats(contigs, method = 'unique') runBreakaway(x)
data('contigs') x <- clonoStats(contigs, method = 'unique') runBreakaway(x)
Perform Principal Components Analysis (PCA) on the matrix of sample-level clonotype abundances. In the context of clonotype analysis, this is a form of beta diversity.
runVDJPCA(x, ...) ## S4 method for signature 'clonoStats' runVDJPCA(x, unit = c("samples", "clonotypes"))
runVDJPCA(x, ...) ## S4 method for signature 'clonoStats' runVDJPCA(x, unit = c("samples", "clonotypes"))
x |
A matrix of abundance values where rows are features (clonotypes) and columns are samples. |
... |
additional arguments. |
unit |
Character value indicating whether the unit of interest is
|
A list with class "prcomp"
. The component x
stores the
reduced-dimensional representation of the data. For a full description, see
prcomp
.
data('contigs') x <- clonoStats(contigs) runVDJPCA(x)
data('contigs') x <- clonoStats(contigs) runVDJPCA(x)
scatterVDJ
creates a scatterplot that shows the
abundance of the sample on the x-axis and the evenness on the y-axis.
scatterVDJ(d, ...) ## S4 method for signature 'matrix' scatterVDJ(d, sampleGroups = NULL, title = NULL, legend = FALSE)
scatterVDJ(d, ...) ## S4 method for signature 'matrix' scatterVDJ(d, sampleGroups = NULL, title = NULL, legend = FALSE)
d |
A |
... |
additional arguments. |
sampleGroups |
A |
title |
Character vector with an optional title. |
legend |
If TRUE, a legend will be included with the plot. If FALSE, no legend is included in the plot. |
Returns a ggplot
plot with a scatterplot that shows the
abundance for each sample on the x-axis and the evenness for each sample
on the y-axis. Richness can be expressed as the total number of unique
clonotypes in the sample or as the breakaway diversity measure (Willis and
Bunge 2015), which estimates the total number of unique clonotypes in the
population. Evenness is measured as the normalized entropy, which
is a measure of how evenly cells are distributed across the different
clonotypes. Evenness is a measure between 0 and 1 that is independent of
the number of cells in a sample.
Diversity measures such as Shannon entropy contain information about
both the evenness and the abundance of a sample, but because both
characteristics are combined into one number, comparison between
samples or groups of samples is difficult. Other measures, such as the
breakaway measure of diversity, only express the abundance of the sample
and not the evenness. The scatterplot shows how evenness and abundance
differs between each sample and between each group of samples.
data('contigs') x <- clonoStats(contigs) d <- calculateDiversity(x) sampleGroups <- data.frame(Sample = c("sample1", "sample2"), Group = c("Cancer", "Normal")) scatterVDJ(d, sampleGroups = NULL, title = "Evenness-abundance plot", legend = TRUE)
data('contigs') x <- clonoStats(contigs) d <- calculateDiversity(x) sampleGroups <- data.frame(Sample = c("sample1", "sample2"), Group = c("Cancer", "Normal")) scatterVDJ(d, sampleGroups = NULL, title = "Evenness-abundance plot", legend = TRUE)
Takes a matrix of cell-level clonotype counts and splits them into a list of group-specific counts (typically samples).
splitClonotypes(x, by, ...) ## S4 method for signature 'Matrix' splitClonotypes(x, by) ## S4 method for signature 'matrix' splitClonotypes(x, by) ## S4 method for signature 'SingleCellExperiment' splitClonotypes(x, by, contigs = "contigs", clonoStats = "clonoStats") ## S4 method for signature 'clonoStats' splitClonotypes(x, by)
splitClonotypes(x, by, ...) ## S4 method for signature 'Matrix' splitClonotypes(x, by) ## S4 method for signature 'matrix' splitClonotypes(x, by) ## S4 method for signature 'SingleCellExperiment' splitClonotypes(x, by, contigs = "contigs", clonoStats = "clonoStats") ## S4 method for signature 'clonoStats' splitClonotypes(x, by)
x |
A |
by |
A character vector or factor by which to split the clonotype
counts. If |
... |
additional arguments. |
contigs |
character. If |
clonoStats |
character. If |
A list of Matrix
objects providing the cell-level clonotype
assignments for each unique value of by
(if by
denotes sample
labels, each matrix in the list will contain the cells from a single
sample).
example(addVDJtoSCE) x <- clonoStats(contigs, assignment = TRUE) splitClonotypes(x, by = sce$sample)
example(addVDJtoSCE) x <- clonoStats(contigs, assignment = TRUE) splitClonotypes(x, by = sce$sample)
Takes a matrix of cell-level clonotype counts and sums them within groups (typically samples).
summarizeClonotypes(x, by, ...) ## S4 method for signature 'Matrix' summarizeClonotypes( x, by, mode = c("sum", "tab"), lang = c("r", "cpp"), BPPARAM = SerialParam() ) ## S4 method for signature 'SingleCellExperiment' summarizeClonotypes( x, by = "sample", contigs = "contigs", clonoStats = "clonoStats", ... ) ## S4 method for signature 'matrix' summarizeClonotypes(x, by, ...) ## S4 method for signature 'clonoStats' summarizeClonotypes(x, by, ...)
summarizeClonotypes(x, by, ...) ## S4 method for signature 'Matrix' summarizeClonotypes( x, by, mode = c("sum", "tab"), lang = c("r", "cpp"), BPPARAM = SerialParam() ) ## S4 method for signature 'SingleCellExperiment' summarizeClonotypes( x, by = "sample", contigs = "contigs", clonoStats = "clonoStats", ... ) ## S4 method for signature 'matrix' summarizeClonotypes(x, by, ...) ## S4 method for signature 'clonoStats' summarizeClonotypes(x, by, ...)
x |
A (usually sparse) matrix of cell-level clonotype counts (cells are
rows and clonotypes are columns). Alternatively, a
|
by |
A character vector or factor by which to summarize the clonotype
counts. If |
... |
additional arguments. |
mode |
Type of summarization to perform. Default is |
lang |
Indicates which implementation of the |
BPPARAM |
A BiocParallelParam object specifying the
parallel backend for distributed clonotype assignment operations (split by
|
contigs |
character. If |
clonoStats |
character. If |
If mode = 'sum'
, returns a matrix clonotype abundances where
each row corresponds to a clonotype and each column a value of by
(if by
denotes sample labels, this is a matrix of sample-level
clonotype counts). If mode = 'tab'
, returns a matrix of clonotype
frequencies, where each row corresponds to a frequency (singletons,
doubletons, etc.) and each column a value of by
.
example(addVDJtoSCE) x <- clonoStats(contigs, assignment = TRUE) summarizeClonotypes(x, by = sce$sample)
example(addVDJtoSCE) x <- clonoStats(contigs, assignment = TRUE) summarizeClonotypes(x, by = sce$sample)
Write V(D)J data to a series of directories, each containing a CSV file with the data for an individual sample.
writeVDJcontigs(path, x, ...) ## S4 method for signature 'character,SplitDataFrameList' writeVDJcontigs(path, x)
writeVDJcontigs(path, x, ...) ## S4 method for signature 'character,SplitDataFrameList' writeVDJcontigs(path, x)
path |
A string containing the path to the output directory. |
x |
A |
... |
additional arguments. |
Creates various subdirectories of the directory specified in path
.
Each subdirectory is named for a sample found in the dataset, x
, and
contains a CSV filed named filtered_contig_annotations.csv
.
data('contigs') loc <- tempdir() writeVDJcontigs(loc, contigs)
data('contigs') loc <- tempdir() writeVDJcontigs(loc, contigs)