Title: | Selecting the number of mutational signatures using a perplexity-based measure and cross-validation |
---|---|
Description: | A package to suggest the number of mutational signatures in a collection of somatic mutations by calculating the cross-validated perplexity score. |
Authors: | Zhi Yang [aut, cre], Yuichi Shiraishi [ctb] |
Maintainer: | Zhi Yang <[email protected]> |
License: | GPL-3 |
Version: | 1.19.0 |
Built: | 2024-11-17 06:14:38 UTC |
Source: | https://github.com/bioc/selectKSigs |
A function for calculating the log-likelihood from the data and parameters
calcPMSLikelihood(p, y)
p |
this variable includes the parameters for mutation signatures and membership parameters |
y |
this variable includes the information on the mutation features, the number of mutation signatures specified and so on |
a value
Calculate the likelihood of the test data given parameters estimated from the training data
Calculate_Likelihood_test(train, test, paramG)
train |
a MutationFeatureData S4 class output of training data. |
test |
a MutationFeatureData S4 class output of test data. |
paramG |
an estimatedParameters S4 class with estimated parameters |
the likelihood of the test data
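As a rough illustration of how a held-out likelihood relates to the perplexity score named in the package description, the sketch below converts a total test log-likelihood into perplexity. It assumes the likelihood is on the natural-log scale and is normalised by the number of test mutations; `perplexity`, `loglik` and `n_mut` are hypothetical names and are not part of the package.

```r
# Hypothetical helper, not a selectKSigs function.
# loglik: total log-likelihood of the held-out (test) mutations
# n_mut:  number of held-out mutations
perplexity <- function(loglik, n_mut) {
  stopifnot(is.numeric(loglik), is.numeric(n_mut), n_mut > 0)
  exp(-loglik / n_mut)
}

# Toy input: 100 test mutations, each given probability 1/4 by the model.
toy_loglik <- 100 * log(0.25)
toy_perplexity <- perplexity(toy_loglik, 100)
```

Under this definition a lower perplexity means the fitted model assigns higher probability to the held-out data.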
Restore the converted parameter F for turboEM
convertFromTurbo_F(turboF, fdim, signatureNum, isBackground)
turboF |
F (converted for turboEM) |
fdim |
a vector specifying the number of possible values for each mutation signature |
signatureNum |
the number of mutation signatures |
isBackground |
the logical value indicating whether a background mutation feature is included |
a vector
Restore the converted parameter Q for turboEM
convertFromTurbo_Q(turboQ, signatureNum, sampleNum)
turboQ |
Q (converted for turboEM) |
signatureNum |
the number of mutation signatures |
sampleNum |
the number of cancer genomes |
a vector
Convert the parameter F into a form that turboEM can handle
convertToTurbo_F(vF, fdim, signatureNum, isBackground)
vF |
F (converted to a vector) |
fdim |
a vector specifying the number of possible values for each mutation signature |
signatureNum |
the number of mutation signatures |
isBackground |
the logical value indicating whether a background mutation feature is included |
a vector
Convert the parameter Q into a form that turboEM can handle
convertToTurbo_Q(vQ, signatureNum, sampleNum)
vQ |
Q (converted to a vector) |
signatureNum |
the number of mutation signatures |
sampleNum |
the number of cancer genomes |
a vector
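The round trip between Q and its turboEM form can be sketched in base R. This is an illustration only: it assumes that the conversion drops the last membership of each sample (which is redundant because memberships sum to 1) and that restoring recomputes it. The functions `to_turbo_q` and `from_turbo_q` are hypothetical stand-ins, and the element ordering used by the package's compiled routines may differ.

```r
# Illustrative stand-ins, not the package's compiled routines.
# Q is a sampleNum x signatureNum matrix whose rows sum to 1.
to_turbo_q <- function(Q) {
  # Assumption: keep only the free parameters (all but the last column).
  as.vector(Q[, -ncol(Q), drop = FALSE])
}

from_turbo_q <- function(turbo_q, signature_num, sample_num) {
  free <- matrix(turbo_q, nrow = sample_num, ncol = signature_num - 1)
  # Recover the dropped column from the sum-to-one constraint.
  unname(cbind(free, 1 - rowSums(free)))
}

Q <- matrix(c(0.2, 0.5, 0.3,
              0.6, 0.1, 0.3), nrow = 2, byrow = TRUE)
turbo_q <- to_turbo_q(Q)
restored <- from_turbo_q(turbo_q, signature_num = 3, sample_num = 2)
```

Working with the reduced vector means an optimiser only sees unconstrained free parameters, while the full matrix can always be rebuilt afterwards.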
Run cross-validation for each candidate number of signatures and output the resulting measures
cv_PMSignature(inputG, Kfold = 3, nRep = 3, Klimit = 8)
inputG |
a MutationFeatureData S4 class. |
Kfold |
an integer specifying the number of cross-validation folds. |
nRep |
an integer specifying the number of replications. |
Klimit |
an integer specifying the maximum number of signatures to evaluate. |
a matrix of measures
load(system.file("extdata/sample.rdata", package = "selectKSigs"))
results <- cv_PMSignature(G, Kfold = 3)
Get the status of using the background signature
getBG(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
the status of using the background signature
Get the count data in a matrix
getCounts(object)
object |
the MutationFeatureData class |
the count data in a matrix
Get a matrix of mutational exposures of signatures
getExposures(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
a matrix of mutational exposures of signatures
Get a vector of possible features
getFeatures(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
a vector of possible features
Get a matrix of feature vector list
getFeatureVec(object)
object |
the MutationFeatureData class |
a matrix of feature vector list
Get the number of signatures
getK(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
the number of signatures in pmgetSignature
in HiLDA
Get the values of the log-likelihood
getLL(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
likelihood values estimated by pmgetSignature
in HiLDA
Calculate the value of the log-likelihood for given parameters
getLogLikelihoodC( vPatternList, vSparseCount, vF, vQ, fdim, signatureNum, sampleNum, patternNum, samplePatternNum, isBackground, vF0 )
vPatternList |
The list of possible mutation features (converted to a vector) |
vSparseCount |
The table showing (mutation feature, sample, number of mutations) (converted to a vector) |
vF |
F (converted to a vector) |
vQ |
Q (converted to a vector) |
fdim |
a vector specifying the number of possible values for each mutation signature |
signatureNum |
the number of mutation signatures |
sampleNum |
the number of cancer genomes |
patternNum |
the number of possible combinations of all the mutation features |
samplePatternNum |
the number of possible combinations of samples and mutation patterns |
isBackground |
the logical value indicating whether a background mutation feature is included |
vF0 |
the background mutation features |
a value
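To make the quantity concrete, here is a simplified base R sketch of a mixed-membership log-likelihood. It is not the package's compiled implementation: it assumes the counts are held in a dense sample-by-pattern matrix and that each signature is already expanded into a probability vector over patterns, whereas the function above works on vectorised F, Q and a sparse count table. The names `log_likelihood`, `counts`, `Q` and `P` are hypothetical.

```r
# Simplified sketch, not the package's C++ routine.
# counts: sample x pattern matrix of mutation counts
# Q:      sample x signature membership matrix (rows sum to 1)
# P:      signature x pattern probability matrix (rows sum to 1)
log_likelihood <- function(counts, Q, P) {
  mix <- Q %*% P      # probability of each pattern within each sample
  nz <- counts > 0    # skip empty cells so that 0 * log(0) never occurs
  sum(counts[nz] * log(mix[nz]))
}

counts <- matrix(c(3, 1,
                   0, 4), nrow = 2, byrow = TRUE)
Q <- matrix(c(1.0, 0.0,
              0.5, 0.5), nrow = 2, byrow = TRUE)
P <- matrix(c(0.75, 0.25,
              0.25, 0.75), nrow = 2, byrow = TRUE)
ll <- log_likelihood(counts, Q, P)
```

In this toy input the first sample uses only the first signature and the second sample mixes both equally, so the value equals 3 log(0.75) + log(0.25) + 4 log(0.5).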
Get the sample list
getSamplelist(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
the sample list of named elements.
Get the sample list
getSamplelistG(object)
object |
the MutationFeatureData class |
the sample list of named elements.
Get an array of signature feature distributions
getSignatures(object)
object |
the EstimatedParameters class (the result of pmgetSignature) |
an array of signature feature distributions
Get the status of specifying the transcription bias
getTranscription(object)
object |
the MutationFeatureData class |
the status of specifying the transcription bias
Output the training data or test data
select_kth_fold(inputG, k, f_s, folds, include)
inputG |
a MutationFeatureData S4 class output by pmsignature. |
k |
an integer specifying the number of cross-validation folds. |
f_s |
a primary key of combining the feature pattern and sample ID. |
folds |
the assignment to each fold. |
include |
a boolean indicator of whether to include the kth fold or not. |
a MutationFeatureData S4 class that either includes or excludes the kth fold.
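The roles of `folds` and `include` can be sketched on a plain data frame. This is a hedged illustration rather than the package's code: it assumes each record (for example one feature-pattern and sample combination) carries a fold label, and it treats `k` as the index of the fold being selected. The functions `assign_folds` and `select_fold` and the toy `records` table are hypothetical.

```r
# Hypothetical helpers operating on a data frame instead of an S4 object.
assign_folds <- function(n, Kfold = 3, seed = 1) {
  set.seed(seed)
  # Near-equal fold sizes, shuffled across the n records.
  sample(rep(seq_len(Kfold), length.out = n))
}

select_fold <- function(records, k, folds, include) {
  # include = TRUE keeps only fold k (test data);
  # include = FALSE keeps everything except fold k (training data).
  keep <- if (include) folds == k else folds != k
  records[keep, , drop = FALSE]
}

records <- data.frame(f_s = paste0("pattern", 1:10), count = 1:10)
folds <- assign_folds(nrow(records), Kfold = 3)
test_set <- select_fold(records, k = 1, folds = folds, include = TRUE)
train_set <- select_fold(records, k = 1, folds = folds, include = FALSE)
```

Calling the selector twice with the same `k` and opposite `include` values yields a disjoint test and training pair that together cover every record.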
Output the maximum potential scale reduction statistic of all parameters estimated
splitG(inputG, Kfold = 3)
inputG |
a MutationFeatureData S4 class output by pmsignature. |
Kfold |
an integer specifying the number of cross-validation folds. |
a matrix made of perplexity from the results of cross-validation.
load(system.file("extdata/sample.rdata", package = "selectKSigs"))
G_split <- splitG(G, Kfold = 3)