Title: | Infers clonal composition of a tumor |
---|---|
Description: | Clomial fits binomial distributions to counts obtained from Next Gen Sequencing data of multiple samples of the same tumor. The trained parameters can be interpreted to infer the clonal structure of the tumor. |
Authors: | Habil Zare and Alex Hu |
Maintainer: | Habil Zare <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.43.0 |
Built: | 2024-11-23 06:19:36 UTC |
Source: | https://github.com/bioc/Clomial |
Clomial fits binomial distributions to counts obtained from Next Gen Sequencing data of multiple samples of the same tumor. The trained parameters can be interpreted to infer the clonal structure of the tumor.
Package: | Clomial |
Type: | Package |
Version: | 0.99.0 |
Date: | 2014-02-11 |
License: | GPL (>= 2) |
The main function is Clomial(), which requires two matrices, Dt and Dc, among its inputs. They contain the counts of the alternative allele and the total number of processed reads, respectively. Their rows correspond to the genomic loci, and their columns correspond to the samples. Several models should be trained using different initial values to escape from local optima, and the best one in terms of the likelihood can be chosen with the choose.best() function.
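As a rough illustration of the input layout described here, the following base R sketch builds two toy matrices and checks them before any fitting. The locus names, sample names, and counts are made up for illustration and are not taken from the package data.

```r
# Toy input: rows are genomic loci, columns are samples.
# All names and counts are hypothetical.
Dc <- matrix(c(100, 120,  90,
               200, 180, 210),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("locus1", "locus2"), c("S1", "S2", "S3")))
Dt <- matrix(c( 0, 30, 45,
               10, 90,  0),
             nrow = 2, byrow = TRUE, dimnames = dimnames(Dc))

# Basic sanity checks before fitting.
stopifnot(identical(dim(Dt), dim(Dc)))   # same loci and samples in both matrices
stopifnot(all(Dt >= 0), all(Dt <= Dc))   # alternative reads cannot exceed total reads

# Observed alternative allele frequency per locus and sample.
freq <- Dt / Dc
print(round(freq, 2))
```

Matrices that pass these checks have the shape that the Dt and Dc arguments are described as expecting.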
Habil Zare and Alex Hu
Maintainer: Habil Zare <[email protected]>
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, choose.best, Clomial.iterate, Clomial.likelihood, compute.bic, breastCancer
set.seed(1)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
ClomialResult <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=4,doParal=FALSE,binomTryNum=2)
chosen <- choose.best(models=ClomialResult$models)
M1 <- chosen$bestModel
print("Genotypes:")
print(round(M1$Mu))
print("Clone frequencies:")
print(M1$P)
Counts data from multiple samples of a single primary breast cancer obtained by deep, next-generation sequencing.
The data set consists of two matrices, Dt and Dc, which contain the counts of the alternative alleles and the total number of counts at each genomic locus for every tumor sample, respectively.
data(breastCancer)
A list containing 2 matrices.
Each matrix contains counts of reads mapped to 17 genomic loci for 12 tumor samples where the column A5-2 corresponds to the normal sample.
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
data(breastCancer)
breastCancer$Dt
Given the output of the Clomial function, the likelihoods of all models are compared and the best model is determined.
choose.best(models, U = NULL, PTrue = NULL, compareTo = NULL, upto = "All", doTalk=FALSE)
models |
The models trained by |
U |
The optional genotype matrix used for comparison. |
PTrue |
The optional clone frequency matrix used for comparison. |
compareTo |
The index of the model against which all other models are
compared. Set to |
upto |
The models with index less than this value are considered. Set to "All" to include every model. |
doTalk |
If TRUE, information on number of analyzed models is reported. |
If compareTo, U, and PTrue are NULL, no comparison is done and the function runs considerably faster.
A list will be made with the following entries:
err |
A list with 2 entries, err$P and err$U, which are the vectors of clonal frequency errors and genotype errors, respectively. |
Li |
A vector of the best obtained log-likelihood for each model. |
bestInd |
The index of the best model in terms of log-likelihood. |
comparison |
If |
bestModel |
The best model in terms of log-likelihood. |
seconds |
A vector of the time taken, in seconds, to train each model. |
When the number of assumed clones, C, is greater than 6, the comparison is time-consuming because all possible permutations of clones must be considered. The running time grows by a factor of C!.
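To give a sense of how quickly the number of clone orderings grows, here is a small base R sketch that tabulates C! for a few values of C. It only counts permutations and does not call any Clomial function.

```r
# Number of clone orderings a brute-force comparison has to examine.
C <- 3:8
perm_counts <- data.frame(C = C, permutations = factorial(C))
print(perm_counts)
```

Going from 6 to 8 clones multiplies the number of orderings by 56, which is why the comparison is best skipped (by leaving compareTo, U, and PTrue as NULL) when it is not needed.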
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, Clomial.likelihood, Clomial.iterate
set.seed(4)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
ClomialResult <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=4,doParal=FALSE,binomTryNum=5)
chosen <- choose.best(models=ClomialResult$models)
M1 <- chosen$bestModel
print("Genotypes:")
round(M1$Mu)
print("Clone frequencies:")
M1$P
bestInd <- chosen$bestInd
plot(chosen$Li,ylab="Log-likelihood",type="l")
points(x=bestInd,y=chosen$Li[bestInd],col="red",pch=19)
Trains several models with EM, each from different initial values, to escape from local optima. The best one in terms of the likelihood can later be chosen with the choose.best() function.
Clomial(Dt = NULL, Dc = NULL, DcDtFile = NULL, C, doParal=FALSE, outPrefix = NULL, binomTryNum = 1000, maxIt = 100, llCutoff = 0.001, jobNamePrefix = "Bi", qstatWait = 2, fitBinomJobFile = NULL, jobShare = 10, ignoredSample = c(), fliProb=0.05, conservative=TRUE, doTalk=FALSE)
Dt |
A matrix which contains the counts of the alternative allele where rows correspond to the genomic loci, and columns correspond to the samples. |
Dc |
A matrix which contains the counts of the total number of mapped reads where rows correspond to the genomic loci, and columns correspond to the samples. |
DcDtFile |
A file from which the data can optionally be loaded. It should contain the matrices Dc and Dt. |
C |
The assumed number of clones. |
doParal |
Boolean where TRUE means, in Linux, models with different initialization are trained in parallel on a cluster using qsub. |
outPrefix |
A prefix for the path to save the results. |
binomTryNum |
The number of models trained using different initialization. |
maxIt |
The maximum number of EM iterations. |
llCutoff |
EM iterations stop if the relative improvement in the log-likelihood is not more than this threshold. |
jobNamePrefix |
If run in parallel, this prefix will be used to name the jobs on the cluster. |
qstatWait |
The waiting time between qstat commands to assess the number of running and waiting jobs. |
fitBinomJobFile |
If run in parallel, this is the script which loads data, trains a model using a random initialization, and saves the results. |
jobShare |
If run in parallel, the job_share option of qsub determines the priority of jobs over other submitted jobs. |
ignoredSample |
A vector of indices of samples which will be ignored in training. Used by experts only to measure the stability of the results. |
fliProb |
A "flipping probability" used for noise injection which can be
disabled when |
conservative |
Boolean where TRUE means noise will be injected only if likelihood is improved after an EM iteration, otherwise the original Mu matrix will be used for the next iteration. For expert use only. |
doTalk |
If TRUE, information on the EM optimization iterations is reported. |
The likelihood of the model, given the hidden variables and the parameters, can be computed based on a combination of binomial distributions. The likelihood increases in each EM iteration; however, because of local optima, several models should be tried using different random initializations. For a higher number of assumed clones, C, the parameter binomTryNum should be increased because the dimension of the search space grows linearly with C.
Returns a list containing the entry called models, which is a list of length equal to binomTryNum where each element is a trained model.
For each trained model, Mu models the matrix of genotypes, where rows and columns correspond to genomic loci and clones, respectively. Also, P is the matrix of clonal frequency, where rows and columns correspond to clones and samples, respectively.
The first column of P corresponds to the normal clone.
The history of Mu, P, and the log-likelihood over iterations is saved in the lists Mus, Ps, and Likelihoods, respectively.
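The following base R sketch illustrates how the best of several trained models could be picked from their log-likelihood histories. The model objects here are hypothetical stand-ins with made-up numbers and only a Likelihoods entry; in practice choose.best() should be used on the real output.

```r
# Hypothetical stand-ins for trained models: only the Likelihoods history is filled in.
models <- list(
  list(Likelihoods = c(-520.3, -498.7, -497.9)),
  list(Likelihoods = c(-610.0, -480.2, -479.5)),
  list(Likelihoods = c(-505.1, -501.4, -501.3))
)

# Best log-likelihood reached by each model.
Li <- sapply(models, function(m) max(m$Likelihoods))

# Index of the model with the highest log-likelihood, and the model itself.
bestInd <- which.max(Li)
bestModel <- models[[bestInd]]

print(Li)
print(bestInd)
```

In this toy list the second model starts from the worst initial log-likelihood but ends with the best one, which is the situation that training from several initializations is meant to exploit.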
The parallel mode works only in Linux, and when qsub and qstat commands are available on a cluster.
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, choose.best, Clomial.iterate, compute.bic, breastCancer
set.seed(1)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
ClomialResult <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=4,binomTryNum=2)
chosen <- choose.best(models=ClomialResult$models)
M1 <- chosen$bestModel
print("Genotypes:")
round(M1$Mu)
print("Clone frequencies:")
M1$P
Data sets are simulated based on binomial distribution using random parameters for the model. The accuracy of the EM procedure can be estimated by comparing the inferred parameters vs. the known ones which were used to generate the data.
Clomial.generate.data(N, C, S, averageCoverage, mutFraction, doSample1Normal = FALSE,erroRate=0,doCheckDc=TRUE)
N |
The number of genomic loci. |
C |
The number of clones. |
S |
The number of samples. |
averageCoverage |
The average coverage per locus per sample. |
mutFraction |
Should be in the range 0-1. Each locus in every sample can be mutated with this probability. |
doSample1Normal |
If TRUE, no contamination with the tumor content is allowed for
the normal sample. I.e. the first column of the generated |
erroRate |
The sequencing noise can be simulated by assigning a positive value to this parameter, which is the probability of reading a normal allele as the alternative allele, and vice versa. |
doCheckDc |
If TRUE, generation will be repeated until no row of Dc is all zeros, to guarantee that all loci have positive coverage in at least one sample. |
See the reference below for details.
A list will be made with the following entries:
Dc |
A matrix of simulated coverage for all loci and samples. |
Dt |
A matrix of alternative allele counts for all loci and samples. |
Ptrue |
The true clone frequency matrix used for generating the data. |
U |
The true genotype matrix used for generating the data. |
Likelihood |
The log-likelihood of the model with the true parameters. |
Phi |
The matrix of the second parameters of the binomial distributions; each entry is the probability that a read contains the variant allele at a locus in a sample. |
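A minimal base R sketch of this kind of simulation is given below. It is not the package's implementation: it assumes binary genotypes, Poisson-distributed coverage, and a variant read probability of Phi = (U %*% P) / 2, all of which are assumptions of this sketch. The variable names mirror the entries listed above.

```r
set.seed(1)
N <- 5; C <- 3; S <- 4            # loci, clones, samples (toy sizes)
averageCoverage <- 1000

# Binary genotypes; the first clone is taken to be normal (no mutations).
U <- cbind(0, matrix(rbinom(N * (C - 1), size = 1, prob = 0.5), N, C - 1))

# Clone frequencies: each column (sample) is normalized to sum to 1.
P <- matrix(runif(C * S), C, S)
P <- sweep(P, 2, colSums(P), "/")

# Assumed probability that a read carries the variant allele.
Phi <- (U %*% P) / 2

# Simulated coverage, then alternative allele counts drawn from binomials.
Dc <- matrix(rpois(N * S, lambda = averageCoverage), N, S)
Dt <- matrix(rbinom(N * S, size = Dc, prob = Phi), N, S)

print(Dc)
print(Dt)
```

Because U and P are known here, parameters inferred from Dc and Dt can be compared against them to estimate accuracy.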
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
set.seed(1)
simulated <- Clomial.generate.data(N=20, C=4, S=10, averageCoverage=1000, mutFraction=0.1)
simulated$Dc
Given the data and the initial values for the model parameters, runs EM iterations until convergence of the Clomial model.
Clomial.iterate(Dt, Dc, Mu, P, maxIt=100, U = NULL, PTrue = NULL, llCutoff = 10^(-3), computePFunction = compute.P.reparam, doSilentOptim = TRUE, doTalk = TRUE, doLog = TRUE, debug = FALSE, noiseReductionRate = 0.01, fliProb=0.05,conservative=TRUE)
maxIt |
The maximum number of EM iterations. |
Dt |
A matrix which contains the counts of the alternative allele where rows correspond to the genomic loci, and columns correspond to the samples. |
Dc |
A matrix which contains the counts of the total number of mapped reads where rows correspond to the genomic loci, and columns correspond to the samples. |
Mu |
The initial value for the Mu matrix which models the genotypes, where rows and columns correspond to genomic loci and clones, respectively. |
P |
The initial matrix of clonal frequency where rows and columns correspond to clones and samples, respectively. |
U |
The true value for |
PTrue |
The true value for |
llCutoff |
EM iterations stop if the relative improvement in the log-likelihood is not more than this threshold. |
computePFunction |
The function used for updating |
doSilentOptim |
If TRUE, the optimization messages will not be reported. |
doTalk |
If FALSE, the function will be run in silent mode. |
doLog |
Highly recommended to set to TRUE. Then, the computations will be done in log space to avoid numerical issues. |
debug |
If TRUE, the debug mode will be turned on. |
noiseReductionRate |
The noise will be reduced by this rate after each EM iteration. |
fliProb |
A "flipping probability" used for noise injection which can be
disabled when |
conservative |
Boolean where TRUE means noise will be injected only if likelihood is improved after an EM iteration, otherwise the original Mu matrix will be used for the next iteration. For expert use only. |
Injecting noise can be done by assigning a positive value to fliProb, and can be disabled by fliProb=0.
Noise injection is recommended for training models with a high number of clones (>4).
A list will be made with the following entries:
Qs |
The history of matrices containing the posterior
|
Ps |
The history of |
Mus |
The history of |
Mu |
The value of |
P |
The value of |
llCutoff |
The threshold used to decide convergence. |
LRatio |
The final relative improvement in the log-likelihood which led to convergence. |
Likelihoods |
The history of log-likelihoods. |
fliProb |
The final value of |
timeTaken |
An object of class “difftime” which reports the total computational time for EM iterations. |
endTaken |
An object of class “POSIXct” (see DateTimeClasses) which reports the time EM iterations finished. |
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, breastCancer
## rowMins() and rowMaxs() below are provided by the matrixStats package.
library(matrixStats)
set.seed(1)
## Getting data:
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
freq1 <- Dt/Dc
N <- nrow(Dc)
S <- ncol(Dc)
Cnum <- 4 ## assumed number of clones.
## Random initialization:
random1 <- runif(n=N*(Cnum-1),min=rowMins(freq1)*0.9,max=rowMaxs(freq1)*1.1)
random1[random1>1] <- 1
random1[random1<0] <- 0
Mu <- matrix(random1,N,Cnum-1)
Mu <- cbind( matrix(0,N,1), Mu )
rownames(Mu) <- rownames(Dc)
colnames(Mu) <- paste("C",1:Cnum,sep="")
P <- matrix(runif(Cnum*S),Cnum,S)
rownames(P) <- colnames(Mu)
colnames(P) <- colnames(Dc)
## Normalizing P so that each column sums to 1:
for( t in 1:S ){
  s <- sum(P[,t])
  P[,t] <- P[,t]/s
}##End for.
## Running EM:
model1 <- Clomial.iterate(Dt=Dt, Dc=Dc, Mu=Mu, P=P)
print("Genotypes:")
round(model1$Mu)
print("Clone frequencies:")
model1$P
Computes the expected complete data log-likelihood of a Clomial model over all possible values of the hidden variables.
Clomial.likelihood(Dc, Dt, Mu, P)
Dt |
A matrix which contains the counts of the alternative allele where rows correspond to the genomic loci, and columns correspond to the samples. |
Dc |
A matrix which contains the counts of the total number of mapped reads where rows correspond to the genomic loci, and columns correspond to the samples. |
Mu |
The matrix which models the genotypes, where rows and columns correspond to genomic loci and clones, respectively. |
P |
The matrix of clonal frequency where rows and columns correspond to clones and samples, respectively. |
By assuming that the genomic loci and the samples are independent given the model parameters, the computation is simplified by first summing over the samples for a locus, and then summing over all the loci. This strategy avoids exploring the exponentially huge probability space.
A list will be made with the following entries:
ll |
The expectation of complete log-likelihood over the hidden variables. |
llS |
A vector of computed log-likelihoods at all loci. |
The likelihood is computed assuming the heterozygosity is 2.
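The idea of summing binomial log-likelihoods over samples and then over loci can be sketched in base R as follows. This is a simplified stand-in, not the package's computation: it treats the genotypes as fixed binary values rather than taking an expectation over hidden variables, and it assumes the variant read probability is Phi = (U %*% P) / 2. The function name toy.loglik and all data are hypothetical.

```r
# Simplified binomial log-likelihood for fixed binary genotypes U and
# clone frequencies P, assuming Phi = (U %*% P) / 2 (an assumption of this sketch).
toy.loglik <- function(Dc, Dt, U, P) {
  Phi <- (U %*% P) / 2
  ll_matrix <- dbinom(Dt, size = Dc, prob = Phi, log = TRUE)
  dim(ll_matrix) <- dim(Dt)        # keep the loci-by-samples layout
  llS <- rowSums(ll_matrix)        # sum over samples for each locus
  list(ll = sum(llS), llS = llS)   # then sum over all loci
}

# Toy data: 2 loci, 2 clones, 2 samples.
U  <- cbind(c(0, 0), c(1, 0))               # only locus 1 is mutated, in clone 2
P  <- matrix(c(0.6, 0.4, 0.2, 0.8), 2, 2)   # columns are samples and sum to 1
Dc <- matrix(100, 2, 2)
Dt <- matrix(c(20, 0, 40, 0), 2, 2)

res <- toy.loglik(Dc = Dc, Dt = Dt, U = U, P = P)
print(res)
```

The returned entries are named ll and llS only to echo the entries listed above; their values are not expected to match those of Clomial.likelihood().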
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, choose.best, compute.bic, breastCancer
set.seed(1)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
ClomialResult <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=4,doParal=FALSE,binomTryNum=1)
model1 <- ClomialResult$models[[1]]
likelihood <- Clomial.likelihood(Dc=Dc, Dt=Dt, Mu=model1$Mu, P=model1$P)$ll
print(likelihood)
Pre-computed results of the Clomial function are provided for demo purposes. The data set contains 1000 models trained on count data from multiple samples of a single primary breast cancer obtained by deep, next-generation sequencing.
data(Clomial1000)
Clomial1000[["models"]] is the list of trained models. Each model is the output of the Clomial.iterate() function on the breastCancer data, assuming there are 4 clones.
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
Clomial, Clomial.iterate, choose.best, breastCancer
data(Clomial1000)
chosen <- choose.best(models=Clomial1000$models)
M1 <- chosen$bestModel
print("Genotypes:")
round(M1$Mu)
print("Clone frequencies:")
M1$P
bestInd <- chosen$bestInd
plot(chosen$Li,ylab="Log-likelihood",type="l")
points(x=bestInd,y=chosen$Li[bestInd],col="red",pch=19)
Computes the Bayesian Information Criterion (BIC) for a Clomial model, which might be useful to estimate the number of clones. A "significantly" smaller BIC is usually interpreted as a better fit to the data.
compute.bic(Dc, Dt, Mu, P)
Dt |
A matrix which contains the counts of the alternative allele where rows correspond to the genomic loci, and columns correspond to the samples. |
Dc |
A matrix which contains the counts of the total number of mapped reads where rows correspond to the genomic loci, and columns correspond to the samples. |
Mu |
The matrix which models the genotypes, where rows and columns correspond to genomic loci and clones, respectively. |
P |
The matrix of clonal frequency where rows and columns correspond to clones and samples, respectively. |
The Bayesian Information Criterion (BIC) for a model is computed by subtracting twice the expected log-likelihood from the number of free parameters of the model times the logarithm of the total number of observations. For a Clomial model, we have BIC = (NC+SC-S)log(sum(Dc))-2L, where L is the expected log-likelihood, N is the number of genomic loci, C is the assumed number of clones, S is the number of samples, and sum(Dc) is the total number of observed reads.
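The formula can be written out directly in base R. The sketch below is a plain transcription of BIC = (NC+SC-S)log(sum(Dc))-2L; the helper name toy.bic, the coverage matrix, and the log-likelihood values are all made up for illustration and are not results from the package.

```r
# Direct transcription of the BIC formula for a Clomial model.
# L is the (expected) log-likelihood; toy.bic is a hypothetical helper name.
toy.bic <- function(L, N, C, S, Dc) {
  k <- N * C + S * C - S      # number of free parameters
  n <- sum(Dc)                # total number of observed reads
  k * log(n) - 2 * L
}

# Made-up inputs: 17 loci, 12 samples, uniform coverage of 1000 reads.
Dc <- matrix(1000, nrow = 17, ncol = 12)
bic3 <- toy.bic(L = -5200, N = 17, C = 3, S = 12, Dc = Dc)
bic4 <- toy.bic(L = -5100, N = 17, C = 4, S = 12, Dc = Dc)
print(c(C3 = bic3, C4 = bic4))
```

Each extra clone adds N + S free parameters, so a larger C lowers the BIC only when the gain in log-likelihood outweighs that penalty.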
A list will be made with the following entries:
bic |
The BIC value. |
aic |
The AIC value. |
obsNum |
The total number of observed reads. |
Theoretically, a method such as the Bayesian information criterion (BIC) or the Akaike information criterion (AIC) may be applied to estimate the number of clones. However, in practice, the outcome of such approaches should be interpreted with great caution because some of the underlying assumptions of the statistical analysis may not necessarily hold for a given model. For example, while a "small" improvement in the BIC is generally considered a sign to stop making the model more complicated, making such decisions is very subjective and requires relying on thresholds with little statistical basis.
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
set.seed(1)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
bics <- c()
Clomial3 <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=3,doParal=FALSE,binomTryNum=1)
model3 <- Clomial3$models[[1]]
bics[3] <- compute.bic(Dc=Dc,Dt=Dt, Mu=model3$Mu, P=model3$P)$bic
Clomial4 <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=4,doParal=FALSE,binomTryNum=1)
model4 <- Clomial4$models[[1]]
bics[4] <- compute.bic(Dc=Dc,Dt=Dt, Mu=model4$Mu, P=model4$P)$bic
print(bics)
## 4 is a better estimate for the number of clones.
Given the true genotype and frequency matrices, finds the permutation of genotypes matrix which best matches the true genotypes and returns the corresponding errors.
compute.errors(Mu, U, P, PTrue)
Mu |
The matrix which models the genotypes, where rows and columns correspond to genomic loci and clones, respectively. |
U |
The true genotype matrix defined similar to |
P |
The matrix of clonal frequency where rows and columns correspond to clones and samples, respectively. |
PTrue |
The true clonal frequency matrix defined similar to |
Computing the error is useful for estimating the performance of inference on simulated data, and for comparing different trained models. Genotype and frequency errors are defined as the normalized l1-error in reconstructing the genotype and clone frequency matrices, respectively, where by normalized l1-error we mean the sum of absolute values of an error matrix divided by the size of the matrix.
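The normalized l1-error and the search over clone permutations can be sketched in base R as below. This is an illustrative brute-force version under the assumption that the best permutation is the one minimizing the genotype error; the helper names (l1.error, all.perms, best.match) and the toy matrices are hypothetical and not part of the package.

```r
# Normalized l1-error: sum of absolute differences divided by the matrix size.
l1.error <- function(A, B) sum(abs(A - B)) / length(A)

# All permutations of a vector, as a list (only practical for short vectors).
all.perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (rest in all.perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], rest)
    }
  }
  out
}

# Try every clone ordering and keep the one with the smallest genotype error.
best.match <- function(Mu, U, P, PTrue) {
  best <- NULL
  for (perm in all.perms(seq_len(ncol(Mu)))) {
    uErr <- l1.error(Mu[, perm, drop = FALSE], U)
    if (is.null(best) || uErr < best$UError) {
      best <- list(perm = perm, UError = uErr,
                   PErrorAbsolute = l1.error(P[perm, , drop = FALSE], PTrue))
    }
  }
  best
}

# Toy example: the inferred clones are the true ones in a different order,
# with one slightly noisy genotype entry.
U     <- cbind(c(0, 0, 0), c(1, 1, 0), c(0, 1, 1))
PTrue <- rbind(c(0.5, 0.2), c(0.3, 0.3), c(0.2, 0.5))
Mu <- U[, c(1, 3, 2)]; Mu[1, 1] <- 0.1
P  <- PTrue[c(1, 3, 2), ]

res <- best.match(Mu = Mu, U = U, P = P, PTrue = PTrue)
print(res)
```

The loop visits every ordering of the clones, so its cost grows factorially with the number of clones.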
A list will be made with the following entries:
UError |
The |
discretizedUError |
The |
.
PErrorAbsolute |
The normalized |
PErrorRelative |
Each entry of the error clone frequency
matrix is normalized by the corresponding entry in
|
The use of UError and PErrorAbsolute is recommended.
Computing the error is not feasible for more than 7 clones because the number of possible permutations is factorial in the number of clones, which grows extremely fast. Such input will trigger an error message.
Habil Zare
Inferring clonal composition from multiple sections of a breast cancer, Zare et al., PLoS Computational Biology 10.7 (2014): e1003703.
set.seed(1)
data(breastCancer)
Dc <- breastCancer$Dc
Dt <- breastCancer$Dt
bics <- c()
ClomialResult <- Clomial(Dc=Dc,Dt=Dt,maxIt=20,C=3,doParal=FALSE,binomTryNum=2)
model1 <- ClomialResult$models[[1]]
model2 <- ClomialResult$models[[2]]
## Comparing 2 trained models:
compute.errors(Mu=model1$Mu,U=model2$Mu,P=model1$P,PTrue=model2$P)