Title: | Decomposition of individual tumors into mutational signatures by signature refitting |
---|---|
Description: | Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor. |
Authors: | Rosario M. Piro [aut, cre], Sandra Krueger [ctb] |
Maintainer: | Rosario M. Piro <[email protected]> |
License: | GPL-2 |
Version: | 2.23.0 |
Built: | 2024-10-30 05:45:48 UTC |
Source: | https://github.com/bioc/decompTumor2Sig |
The decompTumor2Sig package uses quadratic programming to decompose the somatic mutation catalog from an individual tumor sample (or multiple individual tumor samples) into a set of given mutational signatures (either of the "Alexandrov model" by Alexandrov et al, Nature 500(7463):415-421, 2013, or the "Shiraishi model" by Shiraishi et al, PLoS Genet 11(12):e1005657, 2015), thus computing weights (or "exposures") that reflect the contributions of the signatures to the mutation load of the tumor.
Package: | decompTumor2Sig |
Type: | Package |
Version: | 2.13.1 |
Date: | 2022-05-09 |
License: | GPL (>=2) |
The package provides the following functions:
adjustSignaturesForRegionSet(): | adjust (normalize) mutational |
signatures for use with mutation data from a | |
specific, limited subset of genomic regions | |
(e.g., for targetted sequencing). | |
composeGenomesFromExposures(): | (re-)construct tumor genome mutation |
frequencies from the signatures and | |
their corresponding exposures, or | |
contributions. | |
computeExplainedVariance(): | determine the variance explained by |
estimated signature contributions | |
(i.e., exposures to signatures). | |
convertAlexandrov2Shiraishi(): | convert a set of Alexandrov |
signatures to Shiraishi signatures. | |
convertGenomesFromVRanges(): | convert a genome or set of genomes |
from a VariantAnnotation::VRanges
|
|
object. | |
decomposeTumorGenomes(): | determine the weights/contributions of |
a set of signatures to each of a set of | |
individual tumor genomes. | |
determineSignatureDistances(): | for a given signature |
compute its distances to each of a set | |
of target signatures. | |
downgradeShiraishiSignatures(): | downgrade Shiraishi signatures |
by removing flanking bases and/or the | |
transcription direction. | |
evaluateDecompositionQuality(): | evaluate the quality of a |
decomposition by comparing the | |
re-composed (=re-constructed) tumor | |
mutation frequencies to those actually | |
observed in the tumor genome. | |
getGenomesFromMutFeatData(): | extract the genomes from a |
MutationFeatureData object as |
|
provided by, for example, | |
pmsignature::readMPFile . |
|
getSignaturesFromEstParam(): | extract a set of signatures from an |
EstimatedParameters object as |
|
returned by function getPMSignature
|
|
of the pmsignature package. |
|
isAlexandrovSet(): | checks whether the input list is |
compatible with the Alexandrov format | |
(probability vectors). | |
isExposureSet(): | checks whether the input list is |
compatible with exposure output obtained | |
from decomposeTumorGenomes . |
|
isShiraishiSet(): | checks whether the input list is |
compatible with the Shiraishi format | |
(matrices or data.frames of | |
probabilities). | |
isSignatureSet(): | checks whether the input list is |
compatible with either the Alexandrov | |
or Shiraishi format. | |
mapSignatureSets(): | find a mapping from one signature |
set to another. | |
plotDecomposedContribution(): | plot the decomposition of a |
genome into mutational signatures | |
(i.e., the contributions of, or | |
exposures to, the signatures). | |
plotExplainedVariance(): | plot the variance of a genome's |
mutation patterns which can be | |
explained with an increasing number | |
of signatures. | |
plotMutationDistribution(): | plot a single signature or the |
mutation frequency data for a single | |
genome. | |
readAlexandrovSignatures(): | read Alexandrov signatures in the |
COSMIC format from a flat file or URL. | |
readGenomesFromMPF(): | read a genome or set of genomes from a |
Mutation Position Format (MPF) file. | |
readGenomesFromVCF(): | read a genome or set of genomes from a |
Variant Call Format (VCF) file. | |
readShiraishiSignatures(): | read Shiraishi signatures from |
flat files. | |
sameSignatureFormat(): | checks whether two input lists are sets |
of signatures of the same format. | |
Rosario M. Piro, Politecnico di Milano [aut, cre]
Sandra Krueger, Freie Universitaet Berlin [ctb]
Maintainer: Rosario M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
'adjustSignaturesForRegionSet()' takes a set of signatures that have been orginally defined with respect to the nucleotide frequencies within a specific reference genome or region (e.g., by deriving them from whole genome mutation data) and adjusts or normalizes them to the often different nucleotide frequencies of another specific subset of genomic regions.
adjustSignaturesForRegionSet(signatures, regionsTarget, regionsOriginal=NULL, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19)
adjustSignaturesForRegionSet(signatures, regionsTarget, regionsOriginal=NULL, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19)
signatures |
(Mandatory) Signatures to be adjusted to the nucleotide
frequencies of the genomic regions defined by the parameter |
regionsTarget |
(Mandatory) |
regionsOriginal |
(Optional) |
refGenome |
(Optional) Reference genome sequence from which to
compute the nucleotide frequencies. Default: |
This may be useful, for example, to perform signature refitting (using
decomposeTumorGenomes
) for mutation data from targetted
sequencing (e.g., only a subset of genes), whole exome sequencing (only
exonic regions), or other limited subsets of the genome with particular
nucleotide frequencies.
For Alexandrov-type signatures, the important frequencies are those of the whole sequence patterns (e.g., trinucleotides) whose central base can be mutated. Therefore, adjustment factors for individual mutation types (e.g., A[C>T]G) are computed by comparing the corresponding sequence pattern frequencies (e.g., ACG) between the original reference regions (e.g., whole genome) and the target regions (e.g., target regions of whole exome sequencing).
In the Shiraishi-type signature model, individual bases of the sequence patterns are considered as independent features. Thus, to compute nucleotide frequencies for such signatures, the frequencies of the sequence patterns (computed as for Alexandrov-type signatures) are broken down to single-nucleotide frequencies for the individual positions of the patterns.
In both cases, after the appropriate adjustment of individual features, signatures are re-normalized such that overall probabilites sum up to 1.
A set of adjusted mutational signatures in the same format as those
specified for signatures
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### get gene annotation for the default reference genome (hg19) library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### get a GRanges object for gene promoters (-2000 to +200 bases from TSS) ### [taking only the first 1000 for testing purpose] library(GenomicRanges) regionsTarget <- promoters(txdb, upstream=2000, downstream=200)[seq(1000)] ### assume these signatures were derived only from mutation data from ### exons on chromosome X [not true; just for illustrative purpose] filter <- list(tx_chrom = c("chrX")) regionsOriginal <- exons(txdb, filter=filter) ### adjust signatures according to nucleotide frequencies in the target ### subset of the genome sign_adj <- adjustSignaturesForRegionSet(signatures, regionsTarget, regionsOriginal)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### get gene annotation for the default reference genome (hg19) library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### get a GRanges object for gene promoters (-2000 to +200 bases from TSS) ### [taking only the first 1000 for testing purpose] library(GenomicRanges) regionsTarget <- promoters(txdb, upstream=2000, downstream=200)[seq(1000)] ### assume these signatures were derived only from mutation data from ### exons on chromosome X [not true; just for illustrative purpose] filter <- list(tx_chrom = c("chrX")) regionsOriginal <- exons(txdb, filter=filter) ### adjust signatures according to nucleotide frequencies in the target ### subset of the genome sign_adj <- adjustSignaturesForRegionSet(signatures, regionsTarget, regionsOriginal)
'composeGenomesFromExposures()' re-composes (or predicts) tumor genomes
(i.e., their mutation frequencies) from the given mutational signatures and
their corresponding exposures, or contributions. The (re-)composition is
performed by computing the weighted sum of the mutational signatures, where
the weights are the exposures (=contributions) of the corresponding
signatures. This can, for example, be used to verify that a decomposition
obtained from decomposeTumorGenomes
is meaningful.
composeGenomesFromExposures(exposures, signatures)
composeGenomesFromExposures(exposures, signatures)
exposures |
(Mandatory) A single vector or list of vectors containing
the estimated signature contributions/exposures as computed by the function
|
signatures |
(Mandatory) The list of signatures (vectors, data frames or matrices) for which the exposures were obtained. Each of the list objects represents one mutational signature. Vectors are used for Alexandrov signatures, data frames or matrices for Shiraishi signatures. |
A list of "predicted" genomes, i.e., the frequencies of their mutational patterns computed as weighted sums of the mutational signatures, where the weights correspond to the contributions of, i.e., exposures to, the corresponding signatures.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### re-compose (predict) tumor genome features from exposures predGenomes <- composeGenomesFromExposures(exposures, signatures)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### re-compose (predict) tumor genome features from exposures predGenomes <- composeGenomesFromExposures(exposures, signatures)
'computeExplainedVariance()' computes for one or more tumors the variance which is explained by the estimated contributions (exposures) of a set of signatures when compared to the observed genomes.
computeExplainedVariance(exposures, signatures, genomes)
computeExplainedVariance(exposures, signatures, genomes)
exposures |
(Mandatory) A single vector or list of vectors containing
the estimated signature contributions/exposures as provided by the function
|
signatures |
(Mandatory) The list of signatures (vectors, data frames or matrices) for which the exposures were obtained. Each of the list objects represents one mutational signature. Vectors are used for Alexandrov signatures, data frames or matrices for Shiraishi signatures. |
genomes |
(Mandatory) Can be either a vector, a data frame or a matrix
(for an individual tumor genome), or a list of one of these object types
(for multiple tumors). Each tumor genome must be of the same form as the
|
A numeric vector of explained variances, one for each genome.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
plotExplainedVariance
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### compute explained variance for the tumor genomes computeExplainedVariance(exposures, signatures, genomes)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### compute explained variance for the tumor genomes computeExplainedVariance(exposures, signatures, genomes)
'convertAlexandrov2Shiraishi()' converts a set Alexandrov signatures to the Shiraishi model, summing the respective frequencies of base changes, and upstream and downstream flanking bases. In most cases, the resulting Shiraishi signatures don't provide information on the transcription strand, as this is not part of the standard Alexandrov signatures. While the conversion is mainly thought for signatures, it actually works also for mutation frequency data from genomes which have the same format. [Attention: this conversion entails a loss of specificity and the applicability of Shiraishi signatures derived from Alexandrov signatures has not been extensively explored!]
convertAlexandrov2Shiraishi(signatures)
convertAlexandrov2Shiraishi(signatures)
signatures |
(Mandatory) A list of Alexandrov signatures with named
elements as produced by |
A list of Shiraishi signatures that can be used for
decomposeTumorGenomes
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
readAlexandrovSignatures
readShiraishiSignatures
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov)
VRanges
object'convertGenomesFromVRanges()' converts the SNVs of a single tumor genome
(sample) or a set of genomes from a VRanges
object (package
VariantAnnotation
) and determines the mutation frequencies according
to a specific model of mutational signatures (Alexandrov or Shiraishi),
such that the resulting format can be used as genomes input for
decomposeTumorGenomes
.
convertGenomesFromVRanges(vranges, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
convertGenomesFromVRanges(vranges, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
vranges |
(Mandatory) The |
numBases |
(Mandatory) Total number of bases (mutated base and flanking bases) to be used for sequence patterns. Must be odd. Default: 5 |
type |
(Mandatory) Signature model or type ( |
trDir |
(Mandatory) Specifies whether the transcription direction is
taken into account in the signature model. If so, only mutations within
genomic regions with a defined transcription direction can be considered.
Default: |
enforceUniqueTrDir |
(Optional) Used only if |
refGenome |
(Mandatory) The reference genome ( |
transcriptAnno |
(Optional) Transcript annotation ( |
verbose |
(Optional) Print information about reading and processing
the mutation data. Default: |
A list containing the genomes in terms of frequencies of the
mutated sequence patterns. This list of genomes can be used for
decomposeTumorGenomes
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
readGenomesFromVCF
readGenomesFromMPF
getGenomesFromMutFeatData
### load the reference genome and the transcript annotation database refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### take the breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") ### get the corresponding VRanges object (using the VariantAnnotation ### package) library(VariantAnnotation) vr <- readVcfAsVRanges(gfile, genome="hg19") ### convert the VRanges object to the decompTumor2Sig format genomes <- convertGenomesFromVRanges(vr, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
### load the reference genome and the transcript annotation database refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### take the breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") ### get the corresponding VRanges object (using the VariantAnnotation ### package) library(VariantAnnotation) vr <- readVcfAsVRanges(gfile, genome="hg19") ### convert the VRanges object to the decompTumor2Sig format genomes <- convertGenomesFromVRanges(vr, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
'decomposeTumorGenomes()' is the core function of this package. It decomposes tumor genomes into a given set of mutational signatures by computing their contributions (exposures) to the mutational load via quadratic programming. The function takes a set of mutational signatures and the mutation features of one or more tumor genomes and computes weights, i.e., contributions for each of the signatures in each individual genome. Alternatively, the function can determine for each genome only a subset of signatures whose contributions are sufficient to exceed a user-given minimum threshold for the explained variance of the genome's mutation patterns.
decomposeTumorGenomes(genomes, signatures, minExplainedVariance=NULL, minNumSignatures=2, maxNumSignatures=NULL, greedySearch=FALSE, constrainToMaxContribution=FALSE, tolerance=0.1, verbose=FALSE)
decomposeTumorGenomes(genomes, signatures, minExplainedVariance=NULL, minNumSignatures=2, maxNumSignatures=NULL, greedySearch=FALSE, constrainToMaxContribution=FALSE, tolerance=0.1, verbose=FALSE)
genomes |
(Mandatory) Can be either a vector, a data frame or a
matrix (for an individual tumor genome), or a list of one of these
object types (for multiple tumors). Each tumor genome must be of the
same form as the |
signatures |
(Mandatory) A list of vectors, data frames or matrices. Each of the objects represents one mutational signature. Vectors are used for Alexandrov signatures, data frames or matrices for Shiraishi signatures. |
minExplainedVariance |
(Optional) If |
minNumSignatures |
(Optional) Used if |
maxNumSignatures |
(Optional) If |
greedySearch |
(Optional) Used only in case |
constrainToMaxContribution |
(Optional) [Note: this is EXPERIMENTAL
and is usually not needed!] If |
tolerance |
(Optional) If |
verbose |
(Optional) If |
A list of signature weight vectors (also called 'exposures'), one
for each tumor genome. E.g., the first vector element of the first list
object is the weight/contribution of the first signature to the first
tumor genome. IMPORTANT: If minExplainedVariance
is specified, then
the exposures of a genome will NOT be returned if the minimum explained
variance is not reached within the requested minimum and maximum numbers
of signatures (minNumSignatures
and maxNumSignatures
)! The
corresponding exposure vector will be set to NULL
.
Rosario M. Piro, Politecnico di Milano
Sandra Krueger, Freie Universitaet Berlin
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load reference genome refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=3, type="Alexandrov", trDir=FALSE, refGenome=refGenome, verbose=FALSE) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### (for further examples on searching subsets, please see the vignette)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load reference genome refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=3, type="Alexandrov", trDir=FALSE, refGenome=refGenome, verbose=FALSE) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### (for further examples on searching subsets, please see the vignette)
'determineSignatureDistances()' determines all distances (i.e., differences) between a given signature (of type Alexandrov or Shiraishi) and a set of target signatures (of the same type). This can help to compare signatures that have been determined in different ways or from different datasets. Different distance measures can be used (see details below).
determineSignatureDistances(fromSignature, toSignatures, method="euclidean")
determineSignatureDistances(fromSignature, toSignatures, method="euclidean")
fromSignature |
(Mandatory) A single signature of the Alexandrov (vector) or Shiraishi type (data frame or matrix). |
toSignatures |
(Mandatory) The list of target signatures for which to
compute the distances to |
method |
(Optional) The distance measure to be used. This can be one
of the following: |
Distances that can be used are:
"frobenius" |
Forbenius distance between real-valued matrices |
(or Shiraishi signatures) A and B : |
|
F = sqrt(trace( (A-B) %*% t(A-B) )) |
|
"rss" |
Residual sum of squares (i.e., squared error): |
rss = sum((A-B)^2) |
|
"euclidean" |
(see ?dist for details) |
"maximum" |
(see ?dist for details) |
"manhattan" |
(see ?dist for details) |
"canberra" |
(see ?dist for details) |
"binary" |
(see ?dist for details) |
"minkowski" |
(see ?dist for details) |
A signature-named vector containing all distances. This vector has the same order as the target signature list, so it is not sorted according to distance.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
mapSignatureSets
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to Shiraishi signatures signAlex2Shi <- convertAlexandrov2Shiraishi(signAlexandrov) ### define an arbitrary signature just for testing ### (similar to signature 1) testSig <- matrix(c(0.1, 0, 0.7, 0.1, 0.1, 0, 0.3, 0.2, 0.3, 0.2, 0, 0, 0.2, 0.1, 0.5, 0.2, 0, 0), nrow=3, byrow=TRUE) ### compute distances of the test signature to the converted ### Alexandrov signatures from COSMIC determineSignatureDistances(testSig, signAlex2Shi, method="frobenius")
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to Shiraishi signatures signAlex2Shi <- convertAlexandrov2Shiraishi(signAlexandrov) ### define an arbitrary signature just for testing ### (similar to signature 1) testSig <- matrix(c(0.1, 0, 0.7, 0.1, 0.1, 0, 0.3, 0.2, 0.3, 0.2, 0, 0, 0.2, 0.1, 0.5, 0.2, 0, 0), nrow=3, byrow=TRUE) ### compute distances of the test signature to the converted ### Alexandrov signatures from COSMIC determineSignatureDistances(testSig, signAlex2Shi, method="frobenius")
'downgradeShiraishiSignatures()' downgrades/trims signatures of the Shiraishi type by discarding flanking bases (reducing the length of the sequence pattern) and/or the transcription direction. The downgrade doesn't pose a problem because the flanking bases and the transcription direction are considered as independent features according to the Shiraishi model of mutational signatures.
downgradeShiraishiSignatures(signatures, numBases=NULL, removeTrDir=FALSE)
downgradeShiraishiSignatures(signatures, numBases=NULL, removeTrDir=FALSE)
signatures |
(Mandatory) A list of Shiraishi signatures that need to be downgraded/trimmed. |
numBases |
(Conditionally optional) The total number of bases
(mutated base plus flanking bases around the mutated base) that should
be kept. All further flanking bases farther away from the mutated bases
are dropped. If specified, |
removeTrDir |
(Conditionally optional) Logical value that specifies
whether information on the transcript direction should be dropped (if
present at all). At least one of |
A list of Shiraishi signatures that have been accordingly downgraded.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
### Load 15 Shiraishi signatures obtained from 435 tumor genomes from ### Alexandrov et al. (number of bases: 5, transcription direction: yes) sfile <- system.file("extdata", "Alexandrov_PMID_23945592_435_tumors-pmsignature-15sig.Rdata", package="decompTumor2Sig") load(sfile) ### downgrade the signatures to include only 3 bases and drop the ### transcription direction downgradeShiraishiSignatures(signatures, numBases=3, removeTrDir=TRUE)
### Load 15 Shiraishi signatures obtained from 435 tumor genomes from ### Alexandrov et al. (number of bases: 5, transcription direction: yes) sfile <- system.file("extdata", "Alexandrov_PMID_23945592_435_tumors-pmsignature-15sig.Rdata", package="decompTumor2Sig") load(sfile) ### downgrade the signatures to include only 3 bases and drop the ### transcription direction downgradeShiraishiSignatures(signatures, numBases=3, removeTrDir=TRUE)
'evaluateDecompositionQuality()' evaluates the quality of the decomposition
into exposures of a single tumor. The function evaluates the quality of a
decomposition obtained from the function decomposeTumorGenomes
by comparing the re-composed (=re-constructed) tumor genome mutation
frequencies to those actually observed in the tumor genome. Tumor genome
mutation frequencies are reconstructed using
composeGenomesFromExposures
and the results can optionally be plotted.
evaluateDecompositionQuality(exposure, signatures, genome, plot=FALSE)
evaluateDecompositionQuality(exposure, signatures, genome, plot=FALSE)
exposure |
(Mandatory) A single vector containing the estimated
signature contributions, or exposures, of a single tumor as provided by
|
signatures |
(Mandatory) The list of signatures (vectors, data frames or matrices) for which the exposures were obtained. Each of the list objects represents one mutational signature. Vectors are used for Alexandrov signatures, data frames or matrices for Shiraishi signatures. |
genome |
(Mandatory) A single tumor genome in form of mutation
frequencies specified either in the Alexandrov or the Shiraishi format
(must match the format used for |
plot |
(Optional) If |
A named list object containing measurements for the Pearson
correlation coefficient between the reconstructed and observed mutation
frequencies, and the explained variance; or alternatively, a plot with
these measurements (see option plot
above).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
composeGenomesFromExposures
computeExplainedVariance
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### evaluate the decomposition by comparing to the original data evaluateDecompositionQuality(exposures[[1]], signatures, genomes[[1]])
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### evaluate the decomposition by comparing to the original data evaluateDecompositionQuality(exposures[[1]], signatures, genomes[[1]])
MutationFeatureData
.'getGenomesFromMutFeatData()' takes a MutationFeatureData
object
(mutation count data) as read by the 'pmsignature
' package (e.g.,
by pmsignature::readMPFile
, version 0.3.0) and extracts the mutation
counts of the genomes therein. For passing the genomes to
decomposeTumorGenomes
, the mutation counts must be normalized to
mutation frequencies, which is done by default. [IMPORTANT: set
normalize
to FALSE
only if you are interested in full
integer counts, but do not pass unnormalized counts to
decomposeTumorGenomes
!]
getGenomesFromMutFeatData(mutFeatData, normalize=TRUE)
getGenomesFromMutFeatData(mutFeatData, normalize=TRUE)
mutFeatData |
(Mandatory) A |
normalize |
(Optional) Boolean value to specify whether to normalize
the mutation count data to mutation fractions between 0 and 1. This is
the default and NECESSARY in case you want to pass the return value to
|
A list of mutation frequencies (or mutation counts if not
normalized), one object per genome. The format is either according to the
Shiraishi or the Alexandrov model, depending on how the mutation data was
loaded with pmsignature
.
Rosario M. Piro, Politecnico di Milano
Sandra Krueger, Freie Universitaet Berlin
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
### get breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) in the format produced by ### pmsignature (PMID: 26630308) pmsigdata <- system.file("extdata", "Nik-Zainal_PMID_22608084-pmsignature-G.Rdata", package="decompTumor2Sig") load(pmsigdata) ### extract the genomes from the pmsignature G object genomes <- getGenomesFromMutFeatData(G, normalize=TRUE)
### get breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) in the format produced by ### pmsignature (PMID: 26630308) pmsigdata <- system.file("extdata", "Nik-Zainal_PMID_22608084-pmsignature-G.Rdata", package="decompTumor2Sig") load(pmsigdata) ### extract the genomes from the pmsignature G object genomes <- getGenomesFromMutFeatData(G, normalize=TRUE)
EstimatedParameters
object.'getSignaturesFromEstParam()' takes an EstimatedParameters
object
(signatures data) as computed by the 'pmsignature
' package (by
pmsignature::getPMSignature
; version 0.3.0) and extracts the
signature information. The signatures can then be passed to
decomposeTumorGenomes
.
getSignaturesFromEstParam(Param)
getSignaturesFromEstParam(Param)
Param |
(Mandatory) A |
A list of Shiraishi signatures, one object per signature. Please
see readShiraishiSignatures
or the decompTumor2Sig
vignette
for more information on the format of Shiraishi signatures.
Rosario M. Piro, Politecnico di Milano
Sandra Krueger, Freie Universitaet Berlin
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
readShiraishiSignatures
### load signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) in the format produced by ### pmsignature (PMID: 26630308) pmsigdata <- system.file("extdata", "Nik-Zainal_PMID_22608084-pmsignature-Param.Rdata", package="decompTumor2Sig") load(pmsigdata) ### extract the signatures from the pmsignature Param object signatures <- getSignaturesFromEstParam(Param)
### load signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) in the format produced by ### pmsignature (PMID: 26630308) pmsigdata <- system.file("extdata", "Nik-Zainal_PMID_22608084-pmsignature-Param.Rdata", package="decompTumor2Sig") load(pmsigdata) ### extract the signatures from the pmsignature Param object signatures <- getSignaturesFromEstParam(Param)
'isAlexandrovSet()' checks whether the input object is a set (list) of numeric objects compatible with the Alexandrov format (probability vectors; sum up to 1). NOTE: These can also be genomes compatible with the Alexandrov format!
isAlexandrovSet(x)
isAlexandrovSet(x)
x |
Object to be checked. |
Logical value (true or false).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
readAlexandrovSignatures
isSignatureSet
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() isAlexandrovSet(signAlexandrov)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() isAlexandrovSet(signAlexandrov)
'isExposureSet()' checks whether the input object is a set (list) of
numeric objects compatible with exposure output obtained from
decomposeTumorGenomes
.
isExposureSet(x)
isExposureSet(x)
x |
Object to be checked. |
Logical value (true or false).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load reference genome refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=3, type="Alexandrov", trDir=FALSE, refGenome=refGenome, verbose=FALSE) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) isExposureSet(exposures)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load reference genome refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=3, type="Alexandrov", trDir=FALSE, refGenome=refGenome, verbose=FALSE) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) isExposureSet(exposures)
'isShiraishiSet()' checks whether the input object is a set (list) of numeric objects compatible with the Shiraishi format (matrices or data.frames of probabilities; 6 columns, each row sums up to 1). NOTE: These can also be genomes compatible with the Shiraishi format!
isShiraishiSet(x)
isShiraishiSet(x)
x |
Object to be checked. |
Logical value (true or false).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
isSignatureSet
readShiraishiSignatures
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov) isShiraishiSet(signShiraishi)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov) isShiraishiSet(signShiraishi)
'isSignatureSet()' checks whether the input object is a set (list) of
numeric objects compatible with either the Alexandrov format (probability
vectors; see isAlexandrovSet
) or the Shiraishi format (matrices or
data.frames of probabilities; see isShiraishiSet
).
NOTE: These can also be genomes compatible with one of the two formats!
isSignatureSet(x)
isSignatureSet(x)
x |
Object to be checked. |
Logical value (true or false).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
isAlexandrovSet
isShiraishiSet
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() isSignatureSet(signAlexandrov)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() isSignatureSet(signAlexandrov)
'mapSignatureSets()' determines a mapping from one set of signatures to another. Both Alexandrov and Shiraishi signatures can be handled, but both sets must be of the same type. The mapping can either be a unique (one-to-one) mapping or identify best matches while allowing multiple signatures to be mapped to the same target signature if it is the best match for more than one signature. Different distance measures can be used (see details below).
mapSignatureSets(fromSignatures, toSignatures, method="euclidean", unique=FALSE)
mapSignatureSets(fromSignatures, toSignatures, method="euclidean", unique=FALSE)
fromSignatures |
(Mandatory) A set (list) of signatures of the
Alexandrov (vector) or Shiraishi type (data frame or matrix), that has
to be mapped to the signatures of a second set ( |
toSignatures |
(Mandatory) The set (list) of signatures to which the
set of |
method |
(Optional) The distance measure to be used. This can be one
of the following: |
unique |
(Optional) If set to |
Distances that can be used are:
"frobenius" |
Forbenius distance between real-valued matrices |
(or Shiraishi signatures) A and B : |
|
F = sqrt(trace( (A-B) %*% t(A-B) )) |
|
"rss" |
Residual sum of squares (i.e., squared error): |
rss = sum((A-B)^2) |
|
"euclidean" |
(see ?dist for details) |
"maximum" |
(see ?dist for details) |
"manhattan" |
(see ?dist for details) |
"canberra" |
(see ?dist for details) |
"binary" |
(see ?dist for details) |
"minkowski" |
(see ?dist for details) |
A vector having as elements the mapped signatures of
toSignatures
, and as names the signatures of fromSignatures
with which they have been associated.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
determineSignatureDistances
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to Shiraishi signatures signAlex2Shi <- convertAlexandrov2Shiraishi(signAlexandrov) ### define a small set of arbitrary signatures just for testing ### (similar to signatures 1, 5 and 13, respectively) test1 <- matrix(c( 0.1, 0, 0.7, 0.1, 0.1, 0, 0.3, 0.2, 0.3, 0.2, 0, 0, 0.2, 0.1, 0.5, 0.2, 0, 0 ), nrow=3, byrow=TRUE) test2 <- matrix(c( 0.1, 0.1, 0.3, 0.1, 0.3, 0.1, 0.3, 0.25, 0.2, 0.25, 0, 0, 0.3, 0.2, 0.2, 0.3, 0, 0 ), nrow=3, byrow=TRUE) test3 <- matrix(c( 0.1, 0.7, 0.2, 0, 0, 0, 0, 0, 0, 1.0, 0, 0, 0.5, 0.1, 0, 0.4, 0, 0 ), nrow=3, byrow=TRUE) fromSig <- list(sig1=test1, sig2=test2, sig3=test3) ### compute distances of the test signature to the converted ### Alexandrov signatures from COSMIC mapSignatureSets(fromSig, signAlex2Shi, method="frobenius", unique=TRUE)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to Shiraishi signatures signAlex2Shi <- convertAlexandrov2Shiraishi(signAlexandrov) ### define a small set of arbitrary signatures just for testing ### (similar to signatures 1, 5 and 13, respectively) test1 <- matrix(c( 0.1, 0, 0.7, 0.1, 0.1, 0, 0.3, 0.2, 0.3, 0.2, 0, 0, 0.2, 0.1, 0.5, 0.2, 0, 0 ), nrow=3, byrow=TRUE) test2 <- matrix(c( 0.1, 0.1, 0.3, 0.1, 0.3, 0.1, 0.3, 0.25, 0.2, 0.25, 0, 0, 0.3, 0.2, 0.2, 0.3, 0, 0 ), nrow=3, byrow=TRUE) test3 <- matrix(c( 0.1, 0.7, 0.2, 0, 0, 0, 0, 0, 0, 1.0, 0, 0, 0.5, 0.1, 0, 0.4, 0, 0 ), nrow=3, byrow=TRUE) fromSig <- list(sig1=test1, sig2=test2, sig3=test3) ### compute distances of the test signature to the converted ### Alexandrov signatures from COSMIC mapSignatureSets(fromSig, signAlex2Shi, method="frobenius", unique=TRUE)
'plotDecomposedContribution()' plots the decomposition of a tumor genome,
i.e., the contributions/exposures obtained from decomposeTumorGenomes
for a set of signatures.
plotDecomposedContribution(decomposition, signatures=NULL, removeNA=TRUE)
plotDecomposedContribution(decomposition, signatures=NULL, removeNA=TRUE)
decomposition |
(Mandatory) A decomposition vector (exposure vector) obtained for a single tumor genome. |
signatures |
(Optional) A list object containing the signatures used to compute the decomposition. If specified, the signature labels used in the plot will be taken from the element names of the list; otherwise signature names will be taken from the exposure object (decomposition) or named from sign_1 to sign_N. |
removeNA |
(Optional) If |
Returns (or draws) a plot of the decomposed tumor genome (i.e., contributions of the single signatures).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### plot signature composition of the first genome plotDecomposedContribution(exposures[[1]], signatures=NULL)
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Alexandrov_3bases.Rdata", package="decompTumor2Sig") load(gfile) ### compute exposures exposures <- decomposeTumorGenomes(genomes, signatures, verbose=FALSE) ### plot signature composition of the first genome plotDecomposedContribution(exposures[[1]], signatures=NULL)
'plotExplainedVariance()' plots the explained variance of a single tumor
genome's mutation patterns as a function of the number of signatures
(increasing subsets of signatures) used for decomposition. For each
number K of signatures, the highest variance explained by possible
subsets of K signatures will be plotted (full or greedy search, see below).
This can help to evaluate what minimum threshold for the explained variance
can be used to decompose tumor genomes with the function
decomposeTumorGenomes
.
plotExplainedVariance(genome, signatures, minExplainedVariance=NULL, minNumSignatures=2, maxNumSignatures=NULL, greedySearch=FALSE)
plotExplainedVariance(genome, signatures, minExplainedVariance=NULL, minNumSignatures=2, maxNumSignatures=NULL, greedySearch=FALSE)
genome |
(Mandatory) The mutation load of a single genome in
Alexandrov- of Shiraishi-format, i.e. as vector or matrix. The format
must be the same as the one used for the |
signatures |
(Mandatory) The list of signatures (vectors, data frames or matrices) which are to be evaluated. Each of the list objects represents one mutational signature. Vectors are used for Alexandrov signatures, data frames or matrices for Shiraishi signatures. |
minExplainedVariance |
(Optional) If a numeric value between 0 and 1
is specified, the plot highlights the smallest subset of signatures which
is sufficient to explain at least the specified fraction of the variance
of the genome's mutation patterns. If, for example,
|
minNumSignatures |
(Optional) The plot will be generated only for
K>= |
maxNumSignatures |
(Optional) The plot will be generated only for
K<= |
greedySearch |
(Optional) If |
Returns (or draws) a plot of the explained variance as a function of the number of signatures.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
computeExplainedVariance
### get 15 pre-processed Shiraishi signatures computed (object 'signatures') ### from 435 tumor genomes Alexandrov et al (PMID: 23945592) ### using the pmsignature package sfile <- system.file("extdata", "Alexandrov_PMID_23945592_435_tumors-pmsignature-15sig.Rdata", package="decompTumor2Sig") load(sfile) ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Shiraishi_5bases_trDir.Rdata", package="decompTumor2Sig") load(gfile) ### plot the explained variance for 2 to 6 signatures of the first genome plotExplainedVariance(genomes[[1]], signatures, minExplainedVariance=0.98, minNumSignatures=2, maxNumSignatures=6)
### get 15 pre-processed Shiraishi signatures computed (object 'signatures') ### from 435 tumor genomes Alexandrov et al (PMID: 23945592) ### using the pmsignature package sfile <- system.file("extdata", "Alexandrov_PMID_23945592_435_tumors-pmsignature-15sig.Rdata", package="decompTumor2Sig") load(sfile) ### load preprocessed breast cancer genomes (object 'genomes') from ### Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-genomes-Shiraishi_5bases_trDir.Rdata", package="decompTumor2Sig") load(gfile) ### plot the explained variance for 2 to 6 signatures of the first genome plotExplainedVariance(genomes[[1]], signatures, minExplainedVariance=0.98, minNumSignatures=2, maxNumSignatures=6)
'plotMutationDistribution()' plots a single signature or the mutation frequency data for a single genome. This works for signatures or genome data of both the Shiraishi and the Alexandrov type.
plotMutationDistribution(mutData, colors = NULL, strip = NULL)
plotMutationDistribution(mutData, colors = NULL, strip = NULL)
mutData |
(Mandatory) The signature or genome mutation frequency data to be plotted. This can either be a matrix (Shiraishi model) or a numeric vector (Alexandrov model). |
colors |
Vector of colors to be used for the base change data. For
Alexandrov-type data, this vector must contain six elements (one per base
change). For Shiraishi-type data, this vector must contain four elements
(one per base). If |
strip |
Background color for strip labels; used only for
Alexandrov-type data. If |
Returns (or draws) a plot according to the Alexandrov or Shiraishi model of mutational signatures.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
### Attention: using plotMutationDistribution requires the package ### pmsignature to be installed! ### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### plot the first Alexandrov signature plotMutationDistribution(signatures[[1]]) ### read four Shiraishi signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) from flat files sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"), package="decompTumor2Sig") signatures <- readShiraishiSignatures(sigfiles) ### plot the first Shiraishi signature plotMutationDistribution(signatures[[1]])
### Attention: using plotMutationDistribution requires the package ### pmsignature to be installed! ### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures() ### plot the first Alexandrov signature plotMutationDistribution(signatures[[1]]) ### read four Shiraishi signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) from flat files sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"), package="decompTumor2Sig") signatures <- readShiraishiSignatures(sigfiles) ### plot the first Shiraishi signature plotMutationDistribution(signatures[[1]])
'readAlexandrovSignatures()' reads a set of Alexandrov-type signatures (COSMIC format) from a flat file or URL. Signatures must be specified in the tab-separated format used by the COSMIC website for signatures version 2 (March 2015), the comma-separated format used for signatures version 3 (May 2019), the Microsoft Excel 2007+ sheet used for version 3.1, or the tab-sperated format used for version 3.2 (see Details below). Excel sheets cannot be read from an URL and must be downloaded first.
readAlexandrovSignatures(file)
readAlexandrovSignatures(file)
file |
(Mandatory) Can be a file name or an URL for download.
Default: |
For details on the accepted signature formats, see below or
http://cancer.sanger.ac.uk/cosmic/signatures_v2 ->
"Download signatures" for version 2,
https://www.synapse.org/#!Synapse:syn12009743 for version 3,
https://cancer.sanger.ac.uk/sigs-assets-20/COSMIC_Mutational_Signatures_v3.1.xlsx
for version 3.1,
and https://cancer.sanger.ac.uk/signatures/ for version 3.2.
For versions 3, 3.1 and 3.2, only Single Base Substitution (SBS)
signatures can be used.
COSMIC format for Alexandrov signatures, version 2:
Subst. | Trinucleotide | Mutation Type | Signature 1 | Signature 2 | ... |
C>A | ACA | A[C>A]A | 0.0110983262 | 0.0006827082 | ... |
C>A | ACC | A[C>A]C | 0.0091493407 | 0.0006191072 | ... |
C>A | ACG | A[C>A]G | 0.0014900705 | 0.0000992790 | ... |
C>A | ACT | A[C>A]T | 0.0062338852 | 0.0003238914 | ... |
[...] | |||||
T>G | TTG | T[T>G]G | 0.0020310769 | 0.0002066152 | ... |
T>G | TTT | T[T>G]T | 0.0040301281 | 0.0000235982 | ... |
COSMIC/Synapse format for Alexandrov signatures, version 3 and 3.1:
Type,SubType,SBS1,SBS2,SBS3,SBS4,SBS5,SBS6, ... |
C>A,ACA,8.86E-04,5.80E-07,2.08E-02,4.22E-02,1.20E-02,4.25E-04, ... |
C>A,ACC,2.28E-03,1.48E-04,1.65E-02,3.33E-02,9.44E-03,5.24E-04, ... |
C>A,ACG,1.77E-04,5.23E-05,1.75E-03,1.56E-02,1.85E-03,5.20E-05, ... |
C>A,ACT,1.28E-03,9.78E-05,1.22E-02,2.95E-02,6.61E-03,1.80E-04, ... |
[...] |
T>G,TTG,5.83E-04,9.54E-05,8.05E-03,2.32E-03,6.94E-03,3.24E-04, ... |
T>G,TTT,2.23E-16,2.23E-16,1.05E-02,5.68E-04,1.35E-02,1.01E-03, ... |
Version 3.1 has assentially the same format as version 3, but is distributed as an Excel spread sheet.
COSMIC/Synapse format for Alexandrov signatures, version 3.2:
Type | SBS1 | SBS2 | ... |
A[C>A]A | 0.0110983262 | 0.0006827082 | ... |
A[C>A]C | 0.0091493407 | 0.0006191072 | ... |
A[C>A]G | 0.0014900705 | 0.0000992790 | ... |
A[C>A]T | 0.0062338852 | 0.0003238914 | ... |
[...] | |||
T[T>G]G | 0.0020310769 | 0.0002066152 | ... |
T[T>G]T | 0.0040301281 | 0.0000235982 | ... |
A list of Alexandrov signatures that can be used for
decomposeTumorGenomes
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
readShiraishiSignatures
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures()
### get Alexandrov signatures from COSMIC signatures <- readAlexandrovSignatures()
'readGenomesFromMPF()' reads somatic mutations of a single tumor genome (sample) or a set of genomes from an MPF file (Mutation Position Format; see details below) and determines the mutation frequencies according to a specific model of mutational signatures (Alexandrov or Shiraishi).
readGenomesFromMPF(file, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
readGenomesFromMPF(file, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
file |
(Mandatory) The name of the MPF file (can be compressed with
|
numBases |
(Mandatory) Total number of bases (mutated base and flanking bases) to be used for sequence patterns. Must be odd. Default: 5 |
type |
(Mandatory) Signature model or type ( |
trDir |
(Mandatory) Specifies whether the transcription direction is
taken into account in the signature model. If so, only mutations within
genomic regions with a defined transcription direction can be considered.
Default: |
enforceUniqueTrDir |
(Optional) Used only if |
refGenome |
(Mandatory) The reference genome ( |
transcriptAnno |
(Optional) Transcript annotation ( |
verbose |
(Optional) Print information about reading and processing the
mutation data. Default: |
An MPF file has the following format (one line per mutation and patient/sample):
[sampleID]<tab>[chrom]<tab>[position]<tab>[ref_bases]<tab>[alt_bases]
A list containing the genomes in terms of frequencies of the mutated
sequence patterns. This list of genomes can be used for
decomposeTumorGenomes
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
readGenomesFromVCF
getGenomesFromMutFeatData
### load reference genome and transcript annotation (if direction is needed) refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", package="decompTumor2Sig") genomes <- readGenomesFromMPF(gfile, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
### load reference genome and transcript annotation (if direction is needed) refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", package="decompTumor2Sig") genomes <- readGenomesFromMPF(gfile, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
'readGenomesFromVCF()' reads somatic mutations of a single tumor genome (sample) or a set of genomes from a VCF file (Variant Call Format) and determines the mutation frequencies according to a specific model of mutational signatures (Alexandrov or Shiraishi).
readGenomesFromVCF(file, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
readGenomesFromVCF(file, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, transcriptAnno= TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, verbose=TRUE)
file |
(Mandatory) The name of the VCF file (can be compressed with
|
numBases |
(Mandatory) Total number of bases (mutated base and flanking bases) to be used for sequence patterns. Must be odd. Default: 5 |
type |
(Mandatory) Signature model or type ( |
trDir |
(Mandatory) Specifies whether the transcription direction is
taken into account in the signature model. If so, only mutations within
genomic regions with a defined transcription direction can be considered.
Default: |
enforceUniqueTrDir |
(Optional) Used only if |
refGenome |
(Mandatory) The reference genome ( |
transcriptAnno |
(Optional) Transcript annotation ( |
verbose |
(Optional) Print information about reading and processing the
mutation data. Default: |
A list containing the genomes in terms of frequencies of the mutated
sequence patterns. This list of genomes can be used for
decomposeTumorGenomes
.
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
decomposeTumorGenomes
readGenomesFromMPF
getGenomesFromMutFeatData
### load reference genome and transcript annotation (if direction is needed) refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
### load reference genome and transcript annotation (if direction is needed) refGenome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 transcriptAnno <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene ### read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-VCF-convertedfromMPF.vcf.gz", package="decompTumor2Sig") genomes <- readGenomesFromVCF(gfile, numBases=5, type="Shiraishi", trDir=TRUE, enforceUniqueTrDir=TRUE, refGenome=refGenome, transcriptAnno=transcriptAnno, verbose=FALSE)
'readShiraishiSignatures()' reads one or more Shiraishi-type signatures from flat files (one file per signature). The signatures must be specified as matrices without headers and row names (see details below).
readShiraishiSignatures(files)
readShiraishiSignatures(files)
files |
(Mandatory) Can be a single file name, a vector of file names, or a list of file names. |
Format (see Shiraishi et al. PLoS Genetics 11(12):e1005657, 2015):
First line: Frequencies of the base changes C>A, C>G, C>T, T>A, T>C, and T>G
Following 2k lines (for k up- and downstream flanking bases): Frequencies of the bases A, C, G, and T, followed by two 0 values
Final line (only if transcription direction is considered): Frequencies of occurrences on the transcription strand, and on the opposite strand, followed by four 0 values.
Example:
1.8874e-14 | 0.10974 | 0.045918 | 0.11308 | 0.07429 | 0.65697 |
3.8079e-01 | 0.12215 | 0.191456 | 0.30561 | 0.00000 | 0.00000 |
1.5311e-01 | 0.34214 | 0.179774 | 0.32497 | 0.00000 | 0.00000 |
1.2378e-01 | 0.10243 | 0.163461 | 0.61032 | 0.00000 | 0.00000 |
3.4891e-01 | 0.15346 | 0.156687 | 0.34094 | 0.00000 | 0.00000 |
5.6435e-01 | 0.43565 | 0.000000 | 0.00000 | 0.00000 | 0.00000 |
A list of Shiraishi signatures that can be used for
decomposeTumorGenomes
.
Rosario M. Piro, Politecnico di Milano
Sandra Krueger, Freie Universitaet Berlin
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
readAlexandrovSignatures
getSignaturesFromEstParam
### read four Shiraishi signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) from flat files sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"), package="decompTumor2Sig") signatures <- readShiraishiSignatures(sigfiles)
### read four Shiraishi signatures for breast cancer genomes from ### Nik-Zainal et al (PMID: 22608084) from flat files sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"), package="decompTumor2Sig") signatures <- readShiraishiSignatures(sigfiles)
'sameSignatureFormat()' checks whether two input object are sets (lists) of numeric objects both compatible with the same signature format (probability vectors for Alexandrov signatures and probability matrices or data.frames for Shiraishi signatures). For Shiraishi signatures also the number of flanking bases and the presence of transcription-strand information are compared. For Alexandrov signatures also the number of triplet changes are compared.
sameSignatureFormat(x, y)
sameSignatureFormat(x, y)
x |
First object to be checked. |
y |
Second object to be checked. |
Logical value (true or false).
Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario
M. Piro
E-Mail: <[email protected]> or <[email protected]>
http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational
signatures active in individual tumors. BMC Bioinformatics
20(Suppl 4):152.
decompTumor2Sig
isAlexandrovSet
isShiraishiSet
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov) sameSignatureFormat(signAlexandrov, signShiraishi)
### get Alexandrov signatures from COSMIC signAlexandrov <- readAlexandrovSignatures() ### convert them to the Shiraishi model signShiraishi <- convertAlexandrov2Shiraishi(signAlexandrov) sameSignatureFormat(signAlexandrov, signShiraishi)