Intra-tumor heterogeneity (ITH) is now thought to be a key factor that results in the therapeutic failures and drug resistance, which have arose increasing attention in cancer research. Here, we present an R package, MesKit, for characterizing cancer genomic ITH and inferring the history of tumor evolutionary. MesKit provides a wide range of analyses including ITH evaluation, enrichment, signature, clone evolution analysis via implementation of well-established computational and statistical methods. The source code and documents are freely available through Github (https://github.com/Niinleslie/MesKit). We also developed a shiny application to provide easier analysis and visualization.
In R console, enter citation("MesKit")
.
_MesKit: A Tool Kit for Dissecting Cancer Evolution of Multi-region Tumor Biopsies through Somatic Alterations (In production)
To analyze with MesKit, you need to provide:
*.maf / *.maf.gz
). RequiredNote: Tumor_Sample_Barcode
should be
consistent in all input files, respectively.
Mutation Annotation Format (MAF) files are tab-delimited text files
with aggregated mutations information from VCF Files. The input MAF file
could be gz compressed, and allowed values of
Variant_Classification
column can be found at Mutation
Annotation Format Page.
The following fields are required to be contained in the MAF files with MesKit.
Mandatory fields:
Hugo_Symbol
, Chromosome
,
Start_Position
, End_Position
,
Variant_Classification
, Variant_Type
,
Reference_Allele
, Tumor_Seq_Allele2
,
Ref_allele_depth
, Alt_allele_depth
,
VAF
, Tumor_Sample_Barcode
Note:
Tumor_Sample_Barcode
of each sample should be
unique.VAF
(variant allele frequencie) can be on the scale
0-1 or 0-100.Example MAF file
## Hugo_Symbol Chromosome Start_Position End_Position Variant_Classification
## 1 KLHL17 1 899515 899515 Silent
## 2 MFSD6 2 191301342 191301342 Missense_Mutation
## 3 KIAA0319 6 24596837 24596837 Missense_Mutation
## Variant_Type Reference_Allele Tumor_Seq_Allele2 Ref_allele_depth
## 1 SNP C T 85
## 2 SNP G A 53
## 3 SNP C T 6
## Alt_allele_depth VAF Tumor_Sample_Barcode
## 1 1 0.012 V402_P_1
## 2 0 0.000 V750_P_2
## 3 0 0.000 V750_BM_3
Clinical data file contains clinical information about each patient
and their tumor samples, and mandatory fields are
Tumor_Sample_Barcode
, Tumor_ID
,
Patient_ID
, and Tumor_Sample_Label
.
Example clinical data file
## Tumor_Sample_Barcode Tumor_ID Patient_ID Tumor_Sample_Label
## 1 V402_P_1 P V402 P_1
## 2 V402_P_2 P V402 P_2
## 3 V402_P_3 P V402 P_3
## 4 V402_P_4 P V402 P_4
## 5 V402_BM_1 BM V402 BM_1
By default, there are six mandatory fields in input CCF file:
Patient_ID
, Tumor_Sample_Barcode
,
Chromosome
, Start_Position
, CCF
and CCF_Std
/CCF_CI_High
(required when
identifying clonal/subclonal mutations). The Chromosome
field of your MAF file and CCF file should be in the same format (both
in number or both start with “chr”). Notably,
Reference_Allele
and Tumor_Seq_Allele2
are
also required if you want include contains INDELs in the CCF file.
Example CCF file
## Patient_ID Tumor_Sample_Barcode Chromosome Start_Position CCF CCF_Std
## 1 V402 V402_P_1 1 899515 0.031 0.126
## 2 V402 V402_P_1 1 982996 0.031 0.117
## 3 V402 V402_P_1 1 2452742 0.125 0.239
## 4 V402 V402_P_1 1 6203883 0.422 0.750
## 5 V402 V402_P_1 1 11106655 0.094 0.324
The segmentation file is a tab-delimited file with the following columns:
Patient_ID
- patient IDTumor_Sample_Barcode
- tumor sample barcode of
samplesChromosome
- chromosome name or IDStart_Position
- genomic start position of segments
(1-indexed)End_Position
- genomic end position of segments
(1-indexed)SegmentMean/CopyNumber
- segment mean value or absolute
integer copy numberMinor_CN
- copy number of minor allele
OptionalMajor_CN
- copy number of major allele
OptionalTumor_Sample_Label
- the specific label of each tumor
sample. OptionalNote: Positions are in base pair units.
Example Segmentation file
## Patient_ID Tumor_Sample_Barcode Chromosome Start_Position End_Position
## 1 V402 V402_P_1 1 1 1650882
## 2 V402 V402_P_1 1 1650883 33159352
## 3 V402 V402_P_1 1 33159353 33610373
## 4 V402 V402_P_1 1 33610374 88509894
## 5 V402 V402_P_1 1 88509895 89462108
## CopyNumber Minor_CN Major_CN Tumor_Sample_Label
## 1 2 1 1 P_1
## 2 1 0 1 P_1
## 3 3 0 3 P_1
## 4 2 1 1 P_1
## 5 2 0 2 P_1
readMaf
function creates Maf/MafList objects by reading
MAF files, clinical files and cancer cell fraction (CCF) data (optional
but recommended). Parameter refBuild
is used to set
reference genome version for Homo sapiens reference
("hg18"
, "hg19"
or "hg38"
). You
should set use.indel.ccf = TRUE
when your
ccfFile
contains INDELs apart from SNVs.
maf.File <- system.file("extdata/", "CRC_HZ.maf", package = "MesKit")
ccf.File <- system.file("extdata/", "CRC_HZ.ccf.tsv", package = "MesKit")
clin.File <- system.file("extdata", "CRC_HZ.clin.txt", package = "MesKit")
# Maf object with CCF information
maf <- readMaf(mafFile = maf.File,
ccfFile = ccf.File,
clinicalFile = clin.File,
refBuild = "hg19")
In order to explore the genomic alterations during cancer progression
with multi-region sequencing approach, we provided
classifyMut
function to categorize mutations. The
classification is based on shared pattern or clonal status (CCF data is
required) of mutations, which can be specified by class
option. Additionally, classByTumor
can be used to reveal
the mutational profile within tumors.
# Driver genes of CRC collected from [IntOGen] (https://www.intogen.org/search) (v.2020.2)
driverGene.file <- system.file("extdata/", "IntOGen-DriverGenes_COREAD.tsv", package = "MesKit")
driverGene <- as.character(read.table(driverGene.file)$V1)
mut.class <- classifyMut(maf, class = "SP", patient.id = 'V402')
head(mut.class)
## Patient_ID Tumor_Sample_Barcode Mut_ID Mutation_Type
## 1 V402 V402_P_1 KLHL17:1:899515:C:T Shared
## 2 V402 V402_P_1 AGRN:1:982996:G:A Shared
## 3 V402 V402_P_1 PANK4:1:2452742:C:A Shared
## 4 V402 V402_P_1 CHD5:1:6203883:G:C Private
## 5 V402 V402_P_1 MASP2:1:11106655:C:T Shared
## 6 V402 V402_P_1 AADACL4:1:12704595:G:- Private
plotMutProfile
function can visualize the mutational
profile of tumor samples.
The plotCNA
function can characterize the CNA landscape
across samples based on copy number data from segmentation algorithms.
Besides, MesKit provides options to integrate GISTIC2 results, which can
be obtained from http://gdac.broadinstitute.org.
Please make sure the genome version based on these results is consistent
with refBuild
of the Maf/MafList object .
# Read segment file
segCN <- system.file("extdata", "CRC_HZ.seg.txt", package = "MesKit")
# Read gistic output files
all.lesions <- system.file("extdata", "COREAD_all_lesions.conf_99.txt", package = "MesKit")
amp.genes <- system.file("extdata", "COREAD_amp_genes.conf_99.txt", package = "MesKit")
del.genes <- system.file("extdata", "COREAD_del_genes.conf_99.txt", package = "MesKit")
seg <- readSegment(segFile = segCN, gisticAllLesionsFile = all.lesions,
gisticAmpGenesFile = amp.genes, gisticDelGenesFile = del.genes)
## --Processing COREAD_amp_genes.conf_99.txt
## --Processing COREAD_del_genes.conf_99.txt
## --Processing COREAD_all_lesions.conf_99.txt
## Key: <Patient_ID, Tumor_Sample_Barcode>
## Patient_ID Tumor_Sample_Barcode Tumor_Sample_Label Cytoband Cytoband_pos
## <char> <char> <char> <char> <int>
## 1: V402 V402_BM_1 BM_1 1p36.11 27117862
## 2: V402 V402_BM_1 BM_1 1q31.1 184887288
## 3: V402 V402_BM_1 BM_1 1q31.1 184887288
## 4: V402 V402_BM_1 BM_1 1q31.1 184887288
## 5: V402 V402_BM_1 BM_1 1q31.1 184887288
## Chromosome Start_Position End_Position CopyNumber Type Gistic_type
## <char> <num> <num> <num> <char> <char>
## 1: 1 16916297 31853222 1 Loss Del
## 2: 1 142720237 144392892 4 Amplification AMP
## 3: 1 144716031 144856440 4 Amplification AMP
## 4: 1 144856441 144874871 4 Amplification AMP
## 5: 1 144874872 144879232 4 Amplification AMP
## Minor_CN Major_CN
## <int> <int>
## 1: 0 1
## 2: 1 3
## 3: 1 3
## 4: 2 2
## 5: 1 3
The mathScore
function estimates ITH per sample using
mutant-allele tumor heterogeneity (MATH) approach 1. Typically, the higher the MATH score is, more
heterogeneous a tumor sample is 2 3. For MRS, this function can estimate the MATH score
within tumor based on the merged VAF when
withTumor = TRUE
.
## $V402
## Patient_ID Tumor_Sample_Barcode MATH_Score
## 1 V402 V402_BM_1 123.550
## 2 V402 V402_BM_2 105.900
## 3 V402 V402_BM_3 101.664
## 4 V402 V402_BM_4 108.724
## 5 V402 V402_P_1 80.191
## 6 V402 V402_P_2 99.897
## 7 V402 V402_P_3 100.515
## 8 V402 V402_P_4 92.337
The ccfAUC
function calculates the area under the curve
(AUC) of the cumulative density function based on the CCFs per tumor.
Tumors with higher AUC values are considered to be more heterogeneous 4.
## [1] "AUC.value" "CCF.density.plot"
For a single region/tumor, mutCluster
function clusters
mutations in one dimension by VAFs or CCFs based on a Gaussian finite
mixture model (using mclust). This function only focuses on heterozygous
mutations within copy-number neutral and loss of heterozygosity
(LOH)-free regions when clustering VAFs, as copy number aberrations will
alter the fraction of reads bearing mutations. Note that, “low VAF/CCF”
populations might be a mixture of subclones mutations represent admixed
polyphyletic lineages 14.
To quantify the genetic divergence of ITH between regions or tumors, we introduced two classical metrics derived from population genetics, which were Wright’s fixation index (Fst) 5 6 7 and Nei’s genetic distance 8.
# calculate the Fst of brain metastasis from V402
calFst(maf, patient.id = 'V402', plot = TRUE, use.tumorSampleLabel = TRUE,
withinTumor = TRUE, number.cex = 10)[["V402_BM"]]
## $Fst.avg
## [1] 0.03600727
##
## $Fst.pair
## BM_1 BM_2 BM_3 BM_4
## BM_1 1.00000000 0.03825504 0.05133115 0.02173604
## BM_2 0.03825504 1.00000000 0.04722441 0.02393010
## BM_3 0.05133115 0.04722441 1.00000000 0.03356684
## BM_4 0.02173604 0.02393010 0.03356684 1.00000000
##
## $Fst.plot
# calculate the Nei's genetic distance of brain metastasis from V402
calNeiDist(maf, patient.id = 'V402', use.tumorSampleLabel = TRUE,
withinTumor = TRUE, number.cex = 10)[["V402_BM"]]
## $Nei.dist.avg
## [1] 0.01008835
##
## $Nei.dist
## BM_1 BM_2 BM_3 BM_4
## BM_1 1.000000000 0.007271303 0.01751193 0.004240568
## BM_2 0.007271303 1.000000000 0.01573008 0.004302810
## BM_3 0.017511933 0.015730084 1.00000000 0.011473404
## BM_4 0.004240568 0.004302810 0.01147340 1.000000000
##
## $Nei.plot
Metastasis remains poorly understood despite its critical clinical significance, and the understanding of metastasis process offers supplementary information for clinical treatments.
Distinct patterns of monoclonal versus polyclonal seeding based on
the CCFs of somatic mutations between sample/tumor pairs.
compareCCF
function returns a result list of pairwise CCF
of mutations, which are identified across samples from a single patient.
Recently, this method has been widely used to deduce the potential
metastatic route between different paired tumor lesions 9 10.
ccf.list <- compareCCF(maf, pairByTumor = TRUE, min.ccf = 0.02,
use.adjVAF = TRUE, use.indel = FALSE)
V402_P_BM <- ccf.list$V402$`P-BM`
# visualize via smoothScatter R package
graphics::smoothScatter(matrix(c(V402_P_BM[, 3], V402_P_BM[, 4]),ncol = 2),
xlim = c(0, 1), ylim = c(0, 1),
colramp = colorRampPalette(c("white", RColorBrewer::brewer.pal(9, "BuPu"))),
xlab = "P", ylab = "BM")
## show driver genes
gene.idx <- which(V402_P_BM$Hugo_Symbol %in% driverGene)
points(V402_P_BM[gene.idx, 3:4], cex = 0.6, col = 2, pch = 2)
text(V402_P_BM[gene.idx, 3:4], cex = 0.7, pos = 1,
V402_P_BM$Hugo_Symbol[gene.idx])
title("V402 JSI = 0.341", cex.main = 1.5)
The Jaccard similarity index (JSI) can be used to calculate
mutational similarity between regions, which is defined as the ratio of
shared variants to all variants for sample pairs 11. Users can distinguish monoclonal versus polyclonal
seeding in metastases (including lymph node metastases and distant
metastases) via calJSI
function, and higher JSI values
indicate the higher possibility of polyclonal seeding 12.
JSI.res <- calJSI(maf, patient.id = 'V402', pairByTumor = TRUE, min.ccf = 0.02,
use.adjVAF = TRUE, use.indel = FALSE, use.tumorSampleLabel = TRUE)
names(JSI.res)
## [1] "JSI.multi" "JSI.pair"
## [1] 0.3410853
## P BM
## P 1.0000000 0.3410853
## BM 0.3410853 1.0000000
The subclonal mutant allele frequencies of a tumor follow a simple
power-law distribution predicted by neutral growth 13. Users can evaluate whether a tumor follows neutral
evolution or not under strong selection via the testNeutral
function. Tumors with R2 >= R2.threshold
(Default: 0.98) are considered to follow neutral evolution. Besides,
this function can also generate the model fitting plot of each sample if
the argument plot
is set toTRUE
. Please note
that this analysis has been superseded by mobster 14.
neutralResult <- testNeutral(maf, min.mut.count = 10, patient.id = 'V402', use.tumorSampleLabel = TRUE)
neutralResult$neutrality.metrics
## Patient_ID Tumor_Sample_Barcode Eligible_Mut_Count Area
## 1 V402 P_1 10 0.2139765
## 2 V402 P_2 21 0.2193398
## 3 V402 P_3 21 0.1486480
## 4 V402 P_4 30 0.2616017
## Kolmogorov_Distance Mean_Distance R2 Type
## 1 0.5881459 0.1811801 0.9316982 non-neutral
## 2 0.4452888 0.2116813 0.9217920 non-neutral
## 3 0.3045113 0.1350263 0.9629610 non-neutral
## 4 0.4464912 0.2578185 0.9095160 non-neutral
With MesKit, phylogenetic tree construction for each individual is
based on the binary present/absence matrix of mutations across all tumor
regions.
Based on the Maf object, getPhyloTree
function reconstructs
phylogenetic tree in different methods, including “NJ” (Neibor-Joining)
, “MP” (maximum parsimony), “ML” (maximum likelihood), “FASTME.ols” and
“FASTME.bal”, which can be set by controlling the method
parameter. The phylogenetic tree would be stored in a
phyloTree/phyloTreeList
object, and it can be further used
for functional exploration, mutational signature analysis and tree
visualization.
Comparison between phylogenetic trees can reveal consensus patterns
of tumor evolution. The compareeTree
function computes
distances between phylogenetic trees constructed through different
methods, and it returns a vector containing four distances by
treedist
from phangorn
R package. See treedist
for details.
tree.NJ <- getPhyloTree(maf, patient.id = 'V402', method = "NJ")
tree.MP <- getPhyloTree(maf, patient.id = 'V402', method = "MP")
# compare phylogenetic trees constructed by two approaches
compareTree(tree.NJ, tree.MP, plot = TRUE, use.tumorSampleLabel = TRUE)
## 4 clades are common between two trees.
## $compare.dist
## Symmetric.difference KF-branch distance Path difference
## 4.000000 111.858196 5.291503
## Weighted path difference
## 678.297716
##
## $compare.plot
Users can conduct functional exploration based on
phyloTree
objects. Below is an example showing how to
perform KEGG enrichment analysis via ClusterProfiler
by extracting genes in certain categories from a phylotree object.
library(org.Hs.eg.db)
library(clusterProfiler)
# Pathway enrichment analysis
V402.branches <- getMutBranches(phyloTree)
# pathway enrichment for private mutated genes of the primary tumor in patient V402
V402_Public <- V402.branches[V402.branches$Mutation_Type == "Private_P", ]
geneIDs = suppressMessages(bitr(V402_Public$Hugo_Symbol, fromType="SYMBOL",
toType=c("ENTREZID"), OrgDb="org.Hs.eg.db"))
KEGG_V402_Private_P = enrichKEGG(
gene = geneIDs$ENTREZID,
organism = 'hsa',
keyType = 'kegg',
pvalueCutoff = 0.05,
)
dotplot(KEGG_V402_Private_P)
The sequence context of the base substitutions can be retrieved from the corresponding reference genome to construct a mutation matrix with counts for all 96 trinucleotide changes using “mut_matrix”. Subsequently, the 6 base substitution type spectrum can be plotted with “plot_spectrum” , which can be divided into several sample groups.
mutTrunkBranch
function calculates the fraction of
branch/trunk mutations occurring in each of the six types of base
substitution types (C>A, C>G, C>T, T>A, T>C, T>G) and
conducts two-sided Fisher’s exact tests. For C>T mutations, it can be
further classified as C>T at CpG sites or other sites by setting
CT = TRUE
. This function provides option plot
to show the distribution of branch/trunk mutations. Substitutions types
with significant difference between trunk and branch mutations are
marked with Asterisks.
## [1] "mutTrunkBranch.res" "mutTrunkBranch.plot"
## Patient_ID Group Trunk Branch P_Value
## 1 V402 C>A 0.09523810 0.06581741 0.51626823
## 2 V402 C>G 0.11904762 0.04033970 0.03806057
## 3 V402 C>T at CpG 0.45238095 0.59660297 0.07412823
## 4 V402 C>T other 0.19047619 0.19320594 1.00000000
## 5 V402 T>A 0.04761905 0.01486200 0.16324670
## 6 V402 T>C 0.07142857 0.07218684 1.00000000
## 7 V402 T>G 0.02380952 0.01698514 0.53933795
triMatrix
function generates a mutation count matrix of
96 trinucleotides based on somatic SNVs per sample, which can be later
fed into the fitSignatures
function to estimates the
optimal contributions of known signatures to reconstruct a mutational
profile. Besides, the signature matrix can be specified
("cosmic_v2"
, "exome_cosmic_v3"
and
"nature2013"
) or provided by users via
signaturesRef
parameter. plotMutSigProfile
function can be utilized to visualize both original mutational profile
and reconstructed mutational profile.
trimatrix_V402 <- triMatrix(phyloTree, level = 5)
# Visualize the 96 trinucleodide mutational profile
plotMutSigProfile(trimatrix_V402)[[1]]
The plotPhyloTree
functionan of MesKit implemented an
auto-layout algorithm to visualize rooted phylogenetic trees with
annotations. The branches can be either colored according to
classification of mutations or putative known signatures by
branchCol
parameter. Argument show.bootstrap
is provided to show the support values of internal nodes. Additionally,
a phylogenetic tree along with a heatmap of mutation profile (via
mutHeatmap
functionan) enable a better depiction of
mutational patterns in tumor phylogeny.
# A phylogenetic tree along with binary and CCF heatmap of mutations
phylotree_V402 <- plotPhyloTree(phyloTree, use.tumorSampleLabel = TRUE)
binary_heatmap_V402 <- mutHeatmap(maf, min.ccf = 0.04, use.ccf = FALSE, patient.id = "V402", use.tumorSampleLabel = TRUE)
CCF_heatmap_V402 <- mutHeatmap(maf, use.ccf = TRUE, patient.id = "V402",
min.ccf = 0.04, use.tumorSampleLabel = TRUE)
cowplot::plot_grid(phylotree_V402, binary_heatmap_V402,
CCF_heatmap_V402, nrow = 1, rel_widths = c(1.5, 1, 1))
The guidance video for MesKit Shiny APP can be found at http://meskit.renlab.org/video.html.
Mroz, E. A., & Rocco, J. W. (2013). MATH, a novel measure of intratumor genetic heterogeneity, is high in poor-outcome classes of head and neck squamous cell carcinoma. Oral oncology, 49(3), 211–215. https://doi.org/10.1016/j.oraloncology.2012.09.007
Mroz, E. A., Tward, A. D., Pickering, C. R., Myers, J. N., Ferris, R. L., & Rocco, J. W. (2013). High intratumor genetic heterogeneity is related to worse outcome in patients with head and neck squamous cell carcinoma. Cancer, 119(16), 3034–3042. https://doi.org/10.1002/cncr.28150
Mroz, E. A., Tward, A. D., Hammon, R. J., Ren, Y., & Rocco, J. W. (2015). Intra-tumor genetic heterogeneity and mortality in head and neck cancer: analysis of data from the Cancer Genome Atlas. PLoS medicine, 12(2), e1001786. https://doi.org/10.1371/journal.pmed.1001786
Charoentong, P., Finotello, F., Angelova, M., Mayer, C., Efremova, M., Rieder, D., Hackl, H., & Trajanoski, Z. (2017). Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell reports, 18(1), 248–262. https://doi.org/10.1016/j.celrep.2016.12.019
Weir, B. S., & Cockerham, C. C. (1984). ESTIMATING F-STATISTICS FOR THE ANALYSIS OF POPULATION STRUCTURE. Evolution; international journal of organic evolution, 38(6), 1358–1370. https://doi.org/10.1111/j.1558-5646.1984.tb05657.x
Bhatia, G., Patterson, N., Sankararaman, S., & Price, A. L. (2013). Estimating and interpreting FST: the impact of rare variants. Genome research, 23(9), 1514–1521. https://doi.org/10.1101/gr.154831.113
Holsinger, K. E., & Weir, B. S. (2009). Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nature reviews. Genetics, 10(9), 639–650. https://doi.org/10.1038/nrg2611
Puente, X. S., Pinyol, M., Quesada, V., Conde, L., Ordóñez, G. R., Villamor, N., Escaramis, G., Jares, P., Beà, S., González-Díaz, M., Bassaganyas, L., Baumann, T., Juan, M., López-Guerra, M., Colomer, D., Tubío, J. M., López, C., Navarro, A., Tornador, C., Aymerich, M., … Campo, E. (2011). Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature, 475(7354), 101–105. https://doi.org/10.1038/nature10113
Hu, Z., Ding, J., Ma, Z., Sun, R., Seoane, J. A., Scott Shaffer, J., Suarez, C. J., Berghoff, A. S., Cremolini, C., Falcone, A., Loupakis, F., Birner, P., Preusser, M., Lenz, H. J., & Curtis, C. (2019). Quantitative evidence for early metastatic seeding in colorectal cancer. Nature genetics, 51(7), 1113–1122. https://doi.org/10.1038/s41588-019-0423-x
Xue, R., Chen, L., Zhang, C., Fujita, M., Li, R., Yan, S. M., Ong, C. K., Liao, X., Gao, Q., Sasagawa, S., Li, Y., Wang, J., Guo, H., Huang, Q. T., Zhong, Q., Tan, J., Qi, L., Gong, W., Hong, Z., Li, M., … Zhang, N. (2019). Genomic and Transcriptomic Profiling of Combined Hepatocellular and Intrahepatic Cholangiocarcinoma Reveals Distinct Molecular Subtypes. Cancer cell, 35(6), 932–947.e8. https://doi.org/10.1016/j.ccell.2019.04.007
Makohon-Moore, A. P., Zhang, M., Reiter, J. G., Bozic, I., Allen, B., Kundu, D., Chatterjee, K., Wong, F., Jiao, Y., Kohutek, Z. A., Hong, J., Attiyeh, M., Javier, B., Wood, L. D., Hruban, R. H., Nowak, M. A., Papadopoulos, N., Kinzler, K. W., Vogelstein, B., & Iacobuzio-Donahue, C. A. (2017). Limited heterogeneity of known driver gene mutations among the metastases of individual patients with pancreatic cancer. Nature genetics, 49(3), 358–366. https://doi.org/10.1038/ng.3764
Hu, Z., Li, Z., Ma, Z., & Curtis, C. (2020). Multi-cancer analysis of clonality and the timing of systemic spread in paired primary tumors and metastases. Nature genetics, 52(7), 701–708. https://doi.org/10.1038/s41588-020-0628-z
Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A., & Sottoriva, A. (2016). Identification of neutral tumor evolution across cancer types. Nature genetics, 48(3), 238–244. https://doi.org/10.1038/ng.3489
Caravagna, G., Heide, T., Williams, M. J., Zapata, L., Nichol, D., Chkhaidze, K., Cross, W., Cresswell, G. D., Werner, B., Acar, A., Chesler, L., Barnes, C. P., Sanguinetti, G., Graham, T. A., & Sottoriva, A. (2020). Subclonal reconstruction of tumors by using machine learning and population genetics. Nature genetics, 52(9), 898–907. https://doi.org/10.1038/s41588-020-0675-5
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] ComplexHeatmap_2.23.0 BSgenome.Hsapiens.UCSC.hg19_1.4.3
## [3] BSgenome_1.75.0 rtracklayer_1.67.0
## [5] BiocIO_1.17.1 Biostrings_2.75.2
## [7] XVector_0.47.0 GenomicRanges_1.59.1
## [9] GenomeInfoDb_1.43.2 clusterProfiler_4.15.1
## [11] org.Hs.eg.db_3.20.0 AnnotationDbi_1.69.0
## [13] IRanges_2.41.2 S4Vectors_0.45.2
## [15] Biobase_2.67.0 BiocGenerics_0.53.3
## [17] generics_0.1.3 MesKit_1.17.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 sys_3.4.3
## [3] jsonlite_1.8.9 shape_1.4.6.1
## [5] magrittr_2.0.3 ggtangle_0.0.5
## [7] farver_2.1.2 rmarkdown_2.29
## [9] GlobalOptions_0.1.2 fs_1.6.5
## [11] zlibbioc_1.52.0 vctrs_0.6.5
## [13] Rsamtools_2.23.1 memoise_2.0.1
## [15] RCurl_1.98-1.16 ggtree_3.15.0
## [17] S4Arrays_1.7.1 htmltools_0.5.8.1
## [19] curl_6.0.1 SparseArray_1.7.2
## [21] gridGraphics_0.5-1 sass_0.4.9
## [23] pracma_2.4.4 KernSmooth_2.23-24
## [25] bslib_0.8.0 plyr_1.8.9
## [27] cachem_1.1.0 GenomicAlignments_1.43.0
## [29] buildtools_1.0.0 igraph_2.1.2
## [31] lifecycle_1.0.4 iterators_1.0.14
## [33] pkgconfig_2.0.3 Matrix_1.7-1
## [35] R6_2.5.1 fastmap_1.2.0
## [37] gson_0.1.0 MatrixGenerics_1.19.0
## [39] GenomeInfoDbData_1.2.13 clue_0.3-66
## [41] digest_0.6.37 aplot_0.2.3
## [43] enrichplot_1.27.1 colorspace_2.1-1
## [45] patchwork_1.3.0 RSQLite_2.3.9
## [47] labeling_0.4.3 fansi_1.0.6
## [49] abind_1.4-8 httr_1.4.7
## [51] mgcv_1.9-1 compiler_4.4.2
## [53] bit64_4.5.2 withr_3.0.2
## [55] doParallel_1.0.17 BiocParallel_1.41.0
## [57] DBI_1.2.3 R.utils_2.12.3
## [59] DelayedArray_0.33.3 rjson_0.2.23
## [61] tools_4.4.2 ape_5.8
## [63] R.oo_1.27.0 glue_1.8.0
## [65] quadprog_1.5-8 restfulr_0.0.15
## [67] nlme_3.1-166 GOSemSim_2.33.0
## [69] cluster_2.1.8 reshape2_1.4.4
## [71] fgsea_1.33.0 gtable_0.3.6
## [73] R.methodsS3_1.8.2 tidyr_1.3.1
## [75] data.table_1.16.4 utf8_1.2.4
## [77] ggrepel_0.9.6 foreach_1.5.2
## [79] pillar_1.9.0 stringr_1.5.1
## [81] yulab.utils_0.1.8 circlize_0.4.16
## [83] splines_4.4.2 dplyr_1.1.4
## [85] treeio_1.31.0 lattice_0.22-6
## [87] bit_4.5.0.1 tidyselect_1.2.1
## [89] GO.db_3.20.0 maketools_1.3.1
## [91] knitr_1.49 SummarizedExperiment_1.37.0
## [93] xfun_0.49 matrixStats_1.4.1
## [95] stringi_1.8.4 UCSC.utils_1.3.0
## [97] lazyeval_0.2.2 ggfun_0.1.8
## [99] yaml_2.3.10 evaluate_1.0.1
## [101] codetools_0.2-20 tibble_3.2.1
## [103] qvalue_2.39.0 ggplotify_0.1.2
## [105] cli_3.6.3 munsell_0.5.1
## [107] jquerylib_0.1.4 Rcpp_1.0.13-1
## [109] png_0.1-8 XML_3.99-0.17
## [111] parallel_4.4.2 ggplot2_3.5.1
## [113] blob_1.2.4 mclust_6.1.1
## [115] DOSE_4.1.0 bitops_1.0-9
## [117] phangorn_2.12.1 tidytree_0.4.6
## [119] scales_1.3.0 ggridges_0.5.6
## [121] purrr_1.0.2 crayon_1.5.3
## [123] GetoptLong_1.0.5 rlang_1.1.4
## [125] cowplot_1.1.3 fastmatch_1.1-4
## [127] KEGGREST_1.47.0