Title: | RESOLVE: An R package for the efficient analysis of mutational signatures from cancer genomes |
---|---|
Description: | Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise both through cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures have become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor, and may highlight markers for therapeutic response. Mutational signatures computational analysis presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, casting doubt on them being computational artifacts rather than due to mutagenic processes. Last, approaches for signatures assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) the efficient extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures. |
Authors: | Daniele Ramazzotti [aut] , Luca De Sano [cre, aut] |
Maintainer: | Luca De Sano <[email protected]> |
License: | file LICENSE |
Version: | 1.9.0 |
Built: | 2024-11-18 04:13:27 UTC |
Source: | https://github.com/bioc/RESOLVE |
Germline replication error estimated in Rahbari, Raheleh, et al. (2016).
data(background)
data(background)
Vector of rates
Vector of rates for the 96 trinucleotides
Nat Genet. 2016 Feb;48(2):126-133 (https://www.nature.com/articles/ng.3469).
Background replication error signature derived from COSMIC SBS5.
data(background2)
data(background2)
Vector of rates
Vector of rates for the 96 trinucleotides
COSMIC database (https://cancer.sanger.ac.uk/cosmic/signatures) v3.
Reduced version of the dataset of counts of copy numbers detected in TCGA tumors published in Steele, Christopher D., et al. (2022).
data(cn_example_reduced)
data(cn_example_reduced)
Reduced version of the counts of copy numbers in the format compatible with the import function
Reduced version of the counts of copy numbers for 5 tumors and 48 copy number classes in the format compatible with the import function
Nature. 2022 Jun;606(7916):984-991 (https://www.nature.com/articles/s41586-022-04738-6).
Create Copy Numbers (CNs) counts matrix from input data. This function has been derived from: https://github.com/UCL-Research-Department-of-Pathology/panConusig/blob/main/R/setup_CNsigs.R
getCNCounts(data)
getCNCounts(data)
data |
A data.frame with copy number data having 6 columns: sample name, chromosome, start position, end position, major CN, minor CN. |
A matrix with Copy Numbers (CNs) counts per patient.
data(cn_example_reduced) res <- getCNCounts(data = cn_example_reduced)
data(cn_example_reduced) res <- getCNCounts(data = cn_example_reduced)
Create Small Insertions and Deletions (IDs) counts matrix from input data.
getIDCounts(data, reference = NULL)
getIDCounts(data, reference = NULL)
data |
A data.frame with variants having 6 columns: sample name, chromosome, start position, end position, ref, alt. |
reference |
A BSgenome object with the reference genome to be used. |
A matrix with Small Insertions and Deletions (IDs) counts per patient.
library('BSgenome.Hsapiens.1000genomes.hs37d5') data(id_example_reduced) res <- getIDCounts(data = id_example_reduced, reference = BSgenome.Hsapiens.1000genomes.hs37d5)
library('BSgenome.Hsapiens.1000genomes.hs37d5') data(id_example_reduced) res <- getIDCounts(data = id_example_reduced, reference = BSgenome.Hsapiens.1000genomes.hs37d5)
Create Multi-Nucleotide Variants (MNVs) counts matrix from input data.
getMNVCounts(data, predefined_dbs_mbs = FALSE)
getMNVCounts(data, predefined_dbs_mbs = FALSE)
data |
A data.frame with variants having 6 columns: sample name, chromosome, start position, end position, ref, alt. |
predefined_dbs_mbs |
Boolean. As defined by the function get_mut_type from the package MutationalPatterns, it specifies whether dbs and mbs mutations have been predefined in the input data. This function by default assumes that dbs and mbs mutations are present in the vcf as snvs, which are positioned next to each other. If your dbs/mbs mutations are called separately, you should set this argument to TRUE. |
A matrix with Multi-Nucleotide Variants (MNVs) counts per patient.
data(ssm560_reduced) res <- getMNVCounts(data = ssm560_reduced)
data(ssm560_reduced) res <- getMNVCounts(data = ssm560_reduced)
Create Single Base Substitutions (SBS) counts matrix from input data for a provided reference genome.
getSBSCounts(data, reference = NULL)
getSBSCounts(data, reference = NULL)
data |
A data.frame with variants having 6 columns: sample name, chromosome, start position, end position, ref, alt. |
reference |
A BSgenome object with the reference genome to be used to retrieve flanking bases. |
A matrix with Single Base Substitutions (SBS) counts per patient.
library('BSgenome.Hsapiens.1000genomes.hs37d5') data(ssm560_reduced) res <- getSBSCounts(data = ssm560_reduced, reference = BSgenome.Hsapiens.1000genomes.hs37d5)
library('BSgenome.Hsapiens.1000genomes.hs37d5') data(ssm560_reduced) res <- getSBSCounts(data = ssm560_reduced, reference = BSgenome.Hsapiens.1000genomes.hs37d5)
Plot observed Copy Number (CN) counts for different groups of patients.
groupsCNPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
groupsCNPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
counts |
Matrix with Copy Number (CN) counts data. |
groups |
List where names are groups labels and elements are patients labels corresponding to rownames in counts. |
normalize |
Boolean value; shall I normalize observed counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['groups.CN.plot']][['counts']] groups <- plot_data_examples[['groups.CN.plot']][['groups']] groupsCNPlot(counts=counts,groups=groups)
data(plot_data_examples) counts <- plot_data_examples[['groups.CN.plot']][['counts']] groups <- plot_data_examples[['groups.CN.plot']][['groups']] groupsCNPlot(counts=counts,groups=groups)
Plot observed Copy Number (Reduced, CX) counts for different groups of patients.
groupsCXPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
groupsCXPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
counts |
Matrix with Copy Number (Reduced, CX) counts data. |
groups |
List where names are groups labels and elements are patients labels corresponding to rownames in counts. |
normalize |
Boolean value; shall I normalize observed counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['groups.CX.plot']][['counts']] groups <- plot_data_examples[['groups.CX.plot']][['groups']] groupsCXPlot(counts=counts,groups=groups)
data(plot_data_examples) counts <- plot_data_examples[['groups.CX.plot']][['counts']] groups <- plot_data_examples[['groups.CX.plot']][['groups']] groupsCXPlot(counts=counts,groups=groups)
Plot observed Small Insertions and Deletions (ID) counts for different groups of patients.
groupsIDPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
groupsIDPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
counts |
Matrix with Small Insertions and Deletions (ID) counts data. |
groups |
List where names are groups labels and elements are patients labels corresponding to rownames in counts. |
normalize |
Boolean value; shall I normalize observed counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['groups.ID.plot']][['counts']] groups <- plot_data_examples[['groups.ID.plot']][['groups']] groupsIDPlot(counts=counts,groups=groups)
data(plot_data_examples) counts <- plot_data_examples[['groups.ID.plot']][['counts']] groups <- plot_data_examples[['groups.ID.plot']][['groups']] groupsIDPlot(counts=counts,groups=groups)
Plot observed Multi-Nucleotide Variants (MNVs) counts for different groups of patients.
groupsMNVPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
groupsMNVPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
counts |
Matrix with Multi-Nucleotide Variants (MNVs) counts data. |
groups |
List where names are groups labels and elements are patients labels corresponding to rownames in counts. |
normalize |
Boolean value; shall I normalize observed counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['groups.MNV.plot']][['counts']] groups <- plot_data_examples[['groups.MNV.plot']][['groups']] groupsMNVPlot(counts=counts,groups=groups)
data(plot_data_examples) counts <- plot_data_examples[['groups.MNV.plot']][['counts']] groups <- plot_data_examples[['groups.MNV.plot']][['groups']] groupsMNVPlot(counts=counts,groups=groups)
Plot observed Single Base Substitutions (SBS) counts for different groups of patients.
groupsSBSPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
groupsSBSPlot(counts, groups, normalize = TRUE, xlabels = FALSE)
counts |
Matrix with Single Base Substitutions (SBS) counts data. |
groups |
List where names are groups labels and elements are patients labels corresponding to rownames in counts. |
normalize |
Boolean value; shall I normalize observed counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['groups.SBS.plot']][['counts']] groups <- plot_data_examples[['groups.SBS.plot']][['groups']] groupsSBSPlot(counts=counts,groups=groups)
data(plot_data_examples) counts <- plot_data_examples[['groups.SBS.plot']][['counts']] groups <- plot_data_examples[['groups.SBS.plot']][['groups']] groupsSBSPlot(counts=counts,groups=groups)
Reduced version of the dataset of counts of indels detected in 3 samples from Osorio, Fernando G., et al. (2018).
data(id_example_reduced)
data(id_example_reduced)
Reduced version of the counts of indels in the format compatible with the import function
Reduced version of the counts of indels for 3 samples and 83 indel classes in the format compatible with the import function
Cell Rep. 2018 Nov 27;25(9):2308-2316.e4 (10.1016/j.celrep.2018.11.014).
Dataset of counts of the point mutations detected in 560 breast tumors published in Nik-Zainal, Serena, et al. (2016).
data(patients)
data(patients)
Counts of the point mutations
Counts of point mutations for 560 tumors and 96 trinucleotides
Nature. 2016 Jun 2;534(7605):47-54 (https://www.nature.com/articles/nature17676).
Plot Copy Number (CN) counts for a set of given patients.
patientsCNPlot( cn_data_counts, samples = rownames(cn_data_counts), freq = FALSE, xlabels = FALSE )
patientsCNPlot( cn_data_counts, samples = rownames(cn_data_counts), freq = FALSE, xlabels = FALSE )
cn_data_counts |
Copy Number counts matrix. |
samples |
Name of the samples. This should match a rownames in cn_data_counts |
freq |
Boolean value; shall I display rates instead of counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['patients.CN.plot']][['counts']] patientsCNPlot(cn_data_counts=counts,samples=rownames(counts)[seq_len(2)])
data(plot_data_examples) counts <- plot_data_examples[['patients.CN.plot']][['counts']] patientsCNPlot(cn_data_counts=counts,samples=rownames(counts)[seq_len(2)])
Plot Copy Number (Reduced, CX) counts for a set of given patients.
patientsCXPlot( cn_data_counts, samples = rownames(cn_data_counts), freq = FALSE, xlabels = FALSE )
patientsCXPlot( cn_data_counts, samples = rownames(cn_data_counts), freq = FALSE, xlabels = FALSE )
cn_data_counts |
Copy Number counts matrix. |
samples |
Name of the samples. This should match a rownames in cn_data_counts |
freq |
Boolean value; shall I display rates instead of counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['patients.CX.plot']][['counts']] patientsCXPlot(cn_data_counts=counts,samples=rownames(counts)[seq_len(2)])
data(plot_data_examples) counts <- plot_data_examples[['patients.CX.plot']][['counts']] patientsCXPlot(cn_data_counts=counts,samples=rownames(counts)[seq_len(2)])
Plot Small Insertions and Deletions (ID) counts for a set of given patients.
patientsIDPlot( id_data_counts, samples = rownames(id_data_counts), freq = FALSE, xlabels = FALSE )
patientsIDPlot( id_data_counts, samples = rownames(id_data_counts), freq = FALSE, xlabels = FALSE )
id_data_counts |
Small Insertions and Deletions counts matrix. |
samples |
Name of the samples. This should match a rownames in id_data_counts |
freq |
Boolean value; shall I display rates instead of counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['patients.ID.plot']][['counts']] patientsIDPlot(id_data_counts=counts,samples=rownames(counts)[seq_len(2)])
data(plot_data_examples) counts <- plot_data_examples[['patients.ID.plot']][['counts']] patientsIDPlot(id_data_counts=counts,samples=rownames(counts)[seq_len(2)])
Plot Multi-Nucleotide Variants (MNVs) counts for a set of given patients.
patientsMNVPlot( multi_nucleotides_counts, samples = rownames(multi_nucleotides_counts), freq = FALSE, xlabels = FALSE )
patientsMNVPlot( multi_nucleotides_counts, samples = rownames(multi_nucleotides_counts), freq = FALSE, xlabels = FALSE )
multi_nucleotides_counts |
Multi-Nucleotide counts matrix. |
samples |
Name of the samples. This should match a rownames in multi_nucleotides_counts |
freq |
Boolean value; shall I display rates instead of counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['patients.MNV.plot']][['counts']] patientsMNVPlot(multi_nucleotides_counts=counts,samples=rownames(counts)[seq_len(2)])
data(plot_data_examples) counts <- plot_data_examples[['patients.MNV.plot']][['counts']] patientsMNVPlot(multi_nucleotides_counts=counts,samples=rownames(counts)[seq_len(2)])
Plot Single Base Substitutions (SBS) counts for a set of given patients.
patientsSBSPlot( trinucleotides_counts, samples = rownames(trinucleotides_counts), freq = FALSE, xlabels = FALSE )
patientsSBSPlot( trinucleotides_counts, samples = rownames(trinucleotides_counts), freq = FALSE, xlabels = FALSE )
trinucleotides_counts |
Trinucleotides counts matrix. |
samples |
Name of the samples. This should match a rownames in trinucleotides_counts. |
freq |
Boolean value; shall I display rates instead of counts? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) counts <- plot_data_examples[['patients.SBS.plot']][['counts']] patientsSBSPlot(trinucleotides_counts=counts,samples=rownames(counts)[seq_len(2)])
data(plot_data_examples) counts <- plot_data_examples[['patients.SBS.plot']][['counts']] patientsSBSPlot(trinucleotides_counts=counts,samples=rownames(counts)[seq_len(2)])
List data structure to run examples.
data(plot_data_examples)
data(plot_data_examples)
List data structure to run examples
List data structure to run examples
List data structure to run examples.
Perform the assignment of K somatic mutational signatures provided as input to samples given a set of observed counts x. This function can be used to estimate different types of mutational signatures such as: SBS (single base substitutions) and MNV (multi-nucleotide variant) (see Degasperi, Andrea, et al. 'Substitution mutational signatures in whole-genome–sequenced cancers in the UK population.' Science 376.6591 (2022): abl9283), CX (chromosomal instability) (see Drews, Ruben M., et al. 'A pan-cancer compendium of chromosomal instability.' Nature 606.7916 (2022): 976-983) and CN (copy number) signatures (see Steele, Christopher D., et al. 'Signatures of copy number alterations in human cancer.' Nature 606.7916 (2022): 984-991).
signaturesAssignment(x, beta)
signaturesAssignment(x, beta)
x |
Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, CN or CN counts; in the case of SBS it would be an n patients x 96 trinucleotides matrix. |
beta |
Matrix of the discovered signatures to be used for the assignment. |
A list with the discovered signatures. It includes 3 elements: alpha: matrix of the discovered exposure values. beta: matrix of the discovered signatures. unexplained_mutations: number of unexplained mutations per sample.
data(background) data(patients) set.seed(12345) beta <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesAssignment(x = patients[seq_len(3),seq_len(2)], beta = beta$beta[[1]])
data(background) data(patients) set.seed(12345) beta <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesAssignment(x = patients[seq_len(3),seq_len(2)], beta = beta$beta[[1]])
Plot the inferred Copy Number (CN) mutational signatures.
signaturesCNPlot(beta, useRowNames = FALSE, xlabels = FALSE)
signaturesCNPlot(beta, useRowNames = FALSE, xlabels = FALSE)
beta |
Matrix with the inferred mutational signatures. |
useRowNames |
Boolean value; shall I use the rownames from beta as names for the signatures? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) beta <- plot_data_examples[['signatures.CN.plot']][['beta']] signaturesCNPlot(beta=beta)
data(plot_data_examples) beta <- plot_data_examples[['signatures.CN.plot']][['beta']] signaturesCNPlot(beta=beta)
Perform the assessment of different signaturesDecomposition solutions by cross-validation for K (beta, as estimated by signaturesDecomposition) somatic mutational signatures given a set of observations x and discovered signatures beta.
signaturesCV( x, beta, normalize_counts = FALSE, cross_validation_entries = 0.01, cross_validation_iterations = 5, cross_validation_repetitions = 100, num_processes = Inf, verbose = TRUE )
signaturesCV( x, beta, normalize_counts = FALSE, cross_validation_entries = 0.01, cross_validation_iterations = 5, cross_validation_repetitions = 100, num_processes = Inf, verbose = TRUE )
x |
Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, CN or CN counts; in the case of SBS it would be an n patients x 96 trinucleotides matrix. |
beta |
A set of inferred signatures as returned by signaturesDecomposition function. |
normalize_counts |
If true, the input counts matrix x is normalized such that the patients have the same number of mutation. |
cross_validation_entries |
Percentage of cells in the counts matrix to be replaced by 0s during cross-validation. |
cross_validation_iterations |
For each configuration, the first time the signatures are fitted form a matrix with a percentage of values replaced by 0s. This may result in poor fit/results. Then, we perform predictions of these entries and replace them with such predicted values. This parameter is the number of restarts to be performed to improve this estimate and obtain more stable solutions. |
cross_validation_repetitions |
Number of time cross-validation should be repeated. Higher values result in better estimate, but are computationally more expensive. |
num_processes |
Number of processes to be used during parallel execution. To execute in single process mode, this parameter needs to be set to either NA or NULL. |
verbose |
Boolean. Shall I print information messages? |
A list of 2 elements: estimates and summary. Here, cv_estimates reports the mean squared error for each configuration of performed cross-validation; rank_estimates reports mean and median values for each value of K.
data(background) data(patients) set.seed(12345) sigs <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesCV(x = patients[seq_len(3),seq_len(2)], beta = sigs$beta, cross_validation_iterations = 2, cross_validation_repetitions = 2, num_processes = 1)
data(background) data(patients) set.seed(12345) sigs <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesCV(x = patients[seq_len(3),seq_len(2)], beta = sigs$beta, cross_validation_iterations = 2, cross_validation_repetitions = 2, num_processes = 1)
Plot the inferred Copy Number (Reduced, CX) mutational signatures.
signaturesCXPlot(beta, useRowNames = FALSE, xlabels = FALSE)
signaturesCXPlot(beta, useRowNames = FALSE, xlabels = FALSE)
beta |
Matrix with the inferred mutational signatures. |
useRowNames |
Boolean value; shall I use the rownames from beta as names for the signatures? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) beta <- plot_data_examples[['signatures.CX.plot']][['beta']] signaturesCXPlot(beta=beta)
data(plot_data_examples) beta <- plot_data_examples[['signatures.CX.plot']][['beta']] signaturesCXPlot(beta=beta)
Perform signatures discovery and rank estimation for a range of K somatic mutational signatures given a set of observed counts x. This function can be used to estimate different types of mutational signatures such as: SBS (single base substitutions) and MNV (multi-nucleotide variant) (see Degasperi, Andrea, et al. 'Substitution mutational signatures in whole-genome–sequenced cancers in the UK population.' Science 376.6591 (2022): abl9283), CX (chromosomal instability) (see Drews, Ruben M., et al. 'A pan-cancer compendium of chromosomal instability.' Nature 606.7916 (2022): 976-983) and CN (copy number) signatures (see Steele, Christopher D., et al. 'Signatures of copy number alterations in human cancer.' Nature 606.7916 (2022): 984-991).
signaturesDecomposition( x, K, background_signature = NULL, normalize_counts = FALSE, nmf_runs = 100, num_processes = Inf, verbose = TRUE )
signaturesDecomposition( x, K, background_signature = NULL, normalize_counts = FALSE, nmf_runs = 100, num_processes = Inf, verbose = TRUE )
x |
Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, CN or CN counts; in the case of SBS it would be an n patients x 96 trinucleotides matrix. |
K |
Either one value or a range of numeric values (each of them greater than 0) indicating the number of signatures to be considered. |
background_signature |
Background signature to be used. |
normalize_counts |
If true, the input counts matrix x is normalized such that the patients have the same number of mutation. |
nmf_runs |
Number of iteration (minimum 1) of NMF to be performed for a robust estimation of beta. |
num_processes |
Number of processes to be used during parallel execution. To execute in single process mode, this parameter needs to be set to either NA or NULL. |
verbose |
Boolean. Shall I print information messages? |
A list with the discovered signatures and related rank measures. It includes 5 elements: alpha: list of matrices of the discovered exposure values for each possible rank in the range K. beta: list of matrices of the discovered signatures for each possible rank in the range K. unexplained_mutations: number of unexplained mutations per sample. cosine_similarity: cosine similarity comparing input data x and predictions for each rank in the range K. measures: a data.frame containing the quality measures for each possible rank in the range K.
data(background) data(patients) set.seed(12345) res <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1)
data(background) data(patients) set.seed(12345) res <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1)
Plot the inferred Small Insertions and Deletions (ID) mutational signatures.
signaturesIDPlot(beta, useRowNames = FALSE, xlabels = FALSE)
signaturesIDPlot(beta, useRowNames = FALSE, xlabels = FALSE)
beta |
Matrix with the inferred mutational signatures. |
useRowNames |
Boolean value; shall I use the rownames from beta as names for the signatures? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) beta <- plot_data_examples[['signatures.ID.plot']][['beta']] signaturesIDPlot(beta=beta)
data(plot_data_examples) beta <- plot_data_examples[['signatures.ID.plot']][['beta']] signaturesIDPlot(beta=beta)
Plot the inferred Multi-Nucleotide Variants (MNVs) mutational signatures.
signaturesMNVPlot(beta, useRowNames = FALSE, xlabels = FALSE)
signaturesMNVPlot(beta, useRowNames = FALSE, xlabels = FALSE)
beta |
Matrix with the inferred mutational signatures. |
useRowNames |
Boolean value; shall I use the rownames from beta as names for the signatures? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) beta <- plot_data_examples[['signatures.MNV.plot']][['beta']] signaturesMNVPlot(beta=beta)
data(plot_data_examples) beta <- plot_data_examples[['signatures.MNV.plot']][['beta']] signaturesMNVPlot(beta=beta)
Plot the inferred Single Base Substitutions (SBS) mutational signatures.
signaturesSBSPlot(beta, useRowNames = FALSE, xlabels = FALSE)
signaturesSBSPlot(beta, useRowNames = FALSE, xlabels = FALSE)
beta |
Matrix with the inferred mutational signatures. |
useRowNames |
Boolean value; shall I use the rownames from beta as names for the signatures? |
xlabels |
Boolean value; shall I display x labels? |
A ggplot2 object.
data(plot_data_examples) beta <- plot_data_examples[['signatures.SBS.plot']][['beta']] signaturesSBSPlot(beta=beta)
data(plot_data_examples) beta <- plot_data_examples[['signatures.SBS.plot']][['beta']] signaturesSBSPlot(beta=beta)
Perform a robust estimation of alpha coefficients by bootstrap to reach a certain level of cosine similarity given a set of observed counts x and discovered signatures beta.
signaturesSignificance( x, beta, cosine_thr = 0.95, min_contribution = 0.05, pvalue_thr = 0.05, nboot = 100, num_processes = Inf, verbose = TRUE )
signaturesSignificance( x, beta, cosine_thr = 0.95, min_contribution = 0.05, pvalue_thr = 0.05, nboot = 100, num_processes = Inf, verbose = TRUE )
x |
Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, CN or CN counts; in the case of SBS it would be an n patients x 96 trinucleotides matrix. |
beta |
Discovered signatures to be used for the fit of alpha. |
cosine_thr |
Level of cosine similarity to be reached for the fit of alpha. |
min_contribution |
Minimum contribution of a signature to be considered significant. |
pvalue_thr |
Pvalue level to be used to assess significance. |
nboot |
Number of bootstrap iterations to be performed. |
num_processes |
Number of processes to be used during parallel execution. To execute in single process mode, this parameter needs to be set to either NA or NULL. |
verbose |
Boolean. Shall I print information messages? |
A list with the bootstrap estimates. It includes 5 elements: alpha: matrix of the discovered exposure values considering significant signatures as estimated by bootstrap. beta: matrix of the discovered signatures. unexplained_mutations: number of unexplained mutations per sample. goodness_fit: vector reporting cosine similarities between predictions and observations. bootstrap_estimates: list of matrices reporting results by bootstrap estimates.
data(background) data(patients) set.seed(12345) beta <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesSignificance(x = patients[seq_len(3),seq_len(2)], beta = beta$beta[[1]], cosine_thr = 0.95, min_contribution = 0.05, pvalue_thr = 0.05, nboot = 5, num_processes = 1)
data(background) data(patients) set.seed(12345) beta <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)], K = 3:4, background_signature = background[seq_len(2)], nmf_runs = 2, num_processes = 1) set.seed(12345) res <- signaturesSignificance(x = patients[seq_len(3),seq_len(2)], beta = beta$beta[[1]], cosine_thr = 0.95, min_contribution = 0.05, pvalue_thr = 0.05, nboot = 5, num_processes = 1)
Reduced version of the dataset of counts of the point mutations detected in 560 breast tumors published in Nik-Zainal, Serena, et al. (2016).
data(ssm560_reduced)
data(ssm560_reduced)
Reduced version of the counts of the point mutations in the format compatible with the import function
Reduced version of the counts of point mutations for 560 tumors and 96 trinucleotides in the format compatible with the import function
Nature. 2016 Jun 2;534(7605):47-54 (https://www.nature.com/articles/nature17676).