Title: | Variable Selection in Finite Mixture of AFT Regression and FMR Models |
---|---|
Description: | The package obtains parameter estimation, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net. |
Authors: | Farhad Shokoohi [aut, cre] |
Maintainer: | Farhad Shokoohi <[email protected]> |
License: | GPL-3 |
Version: | 1.17.0 |
Built: | 2024-10-30 07:53:28 UTC |
Source: | https://github.com/bioc/fmrs |
The package obtains parameter estimation, i.e., maximum
likelihood estimators (MLE), via the Expectation-Maximization (EM)
algorithm for the Finite Mixture of Regression (FMR) models with Normal
distribution, and MLE for the Finite Mixture of Accelerated Failure Time
Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull
distributions via the EM algorithm and the Newton-Raphson algorithm (for
Weibull distribution).
More importantly, the package obtains the maximum penalized likelihood
(MPLE) for both FMR and FMAFTR models (collectively called FMRs).
A component-wise tuning parameter selection based on a component-wise BIC is
implemented in the package.
Furthermore, this package provides Ridge Regression and Elastic Net.
The fmrs.mle
method provides MLE for FMRs models.
The fmrs.tunsel
method provides component-wise tuning
parameters. The fmrs.varsel
method provides variable selection
for FMRs models.
fmrs
methodsfmrs.mle
,
fmrs.tunsel
,
fmrs.varsel
,
fmrs.gendata
.
fmrs
objectsfmrsfit-class
,
fmrstunpar-class
Provides the estimated BIC of an FMRs
model from
an fmrsfit-class
BIC(object, ...) ## S4 method for signature 'fmrsfit' BIC(object, ...)
BIC(object, ...) ## S4 method for signature 'fmrsfit' BIC(object, ...)
object |
|
... |
Other possible arguments |
A numeric value
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) BIC(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) BIC(res.mle)
Provides the estimated regression coefficients from the
fitted FMRs
model from an fmrsfit-class
coefficients(object, ...) ## S4 method for signature 'fmrsfit' coefficients(object, ...)
coefficients(object, ...) ## S4 method for signature 'fmrsfit' coefficients(object, ...)
object |
|
... |
Other possible arguments |
A numeric array of dimension-(nCov+1)
-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) coefficients(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) coefficients(res.mle)
Provides the estimated dispersions of the fitted FMRs
model from an fmrsfit-class
dispersion(object, ...) ## S4 method for signature 'fmrsfit' dispersion(object, ...)
dispersion(object, ...) ## S4 method for signature 'fmrsfit' dispersion(object, ...)
object |
|
... |
Other possible arguments |
A numeric array of dimension-(nCov+1)
-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) dispersion(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) dispersion(res.mle)
Provides the fitted response of the fitted FMRs
model
from an fmrsfit-class
fitted(object, ...) ## S4 method for signature 'fmrsfit' fitted(object, ...)
fitted(object, ...) ## S4 method for signature 'fmrsfit' fitted(object, ...)
object |
|
... |
Other possible arguments |
A numeric array of dimension-nObs
-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(fitted(res.mle))
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(fitted(res.mle))
Generates a data set from Finite Mixture of AFT regression models or Finite Mixture of Regression models under the specified setting.
fmrs.gendata(nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, ...) ## S4 method for signature 'ANY' fmrs.gendata( nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, disFamily = "lnorm" )
fmrs.gendata(nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, ...) ## S4 method for signature 'ANY' fmrs.gendata( nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, disFamily = "lnorm" )
nObs |
A numeric value represents sample size |
nComp |
A numeric value represents the order mixture in |
nCov |
A numeric value represents the number of covariates in design matrix |
coeff |
A vector of all regression coefficients including
intercepts. It must be a vector of length
|
dispersion |
A vector of positive values for dispersion parameters of
sub-distributions in |
mixProp |
A vector of mixing proportions which their sum must be one |
rho |
A numeric value in [-1, 1] which represents the correlation between covariates of design matrix |
umax |
A numeric value represents the upper bound in Uniform distribution for censoring |
... |
Other possible options |
disFamily |
A sub-distribution family. The options
are |
A list including response, covariates and censoring variables
Farhad Shokoohi <[email protected]>
Other lnorm, norm, weibull:
fmrs.mle()
,
fmrs.tunsel()
,
fmrs.varsel()
set.seed(1980) K = 2 D = 10 n = 500 REP = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')
set.seed(1980) K = 2 D = 10 n = 500 REP = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')
Provides MLE for Finite Mixture of Accelerated Failure Time Regression Models or Finite Mixture of Regression Models. It also provides Ridge Regression.
fmrs.mle(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.mle( y, delta, x, nComp = 2, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, lambRidge = 0, nIterEM = 400, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, activeset )
fmrs.mle(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.mle( y, delta, x, nComp = 2, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, lambRidge = 0, nIterEM = 400, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, activeset )
y |
Responses (observations) |
delta |
Censoring indicator vector |
x |
Design matrix (covariates) |
nComp |
Order (Number of components) of mixture model |
... |
Other possible options |
disFamily |
A sub-distribution family. The options
are |
initCoeff |
Vector of initial values for regression coefficients including intercepts |
initDispersion |
Vector of initial values for standard deviations |
initmixProp |
Vector of initial values for proportion of components |
lambRidge |
A positive value for tuning parameter in Ridge Regression or Elastic Net |
nIterEM |
Maximum number of iterations for EM algorithm |
nIterNR |
Maximum number of iterations for Newton-Raphson algorithm |
conveps |
A positive value for avoiding NaN in computing divisions |
convepsEM |
A positive value for threshold of convergence in EM algorithm |
convepsNR |
A positive value for threshold of convergence in Newton-Raphson algorithm |
NRpor |
A positive integer for maximum number of searches in NR algorithm |
activeset |
A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model |
Finite mixture of AFT regression models are represented as
follows. Let be the survival time with non-negative values,
and
be a
-dimensional vector of covariates that may have an effect
on
.
If the survival time is subject to right censoring, then the observed
response time is
,
where
,
is logarithm of the censoring time
and
is the censoring indicator.
We say that
follows a finite mixture
of AFT regression models of order
if the
conditional density of
given
has the form
where and
are respectively the density and
survival functions of
,
and
are
respectively the density and survival functions of
; and
for a known link function
,
with
and
with
.
The log-likelihood of a sample of size $n$ is formed
as
Note that we assume the censoring distribution is non-informative and hence won't play any role in the estimation process. We use EM and Newton-Raphson algorithms in our method to find the maximizer of above Log-Likelihood.
An fmrsfit-class
that includes parameter
estimates of the specified FMRs
model
Farhad Shokoohi <[email protected]>
Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies
Other lnorm, norm, weibull:
fmrs.gendata()
,
fmrs.tunsel()
,
fmrs.varsel()
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) summary(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) summary(res.mle)
Provides component-wise tuning parameters using BIC for Finite Mixture of Accelerated Failure Time Regression Models and Finite Mixture of Regression Models.
fmrs.tunsel(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.tunsel( y, delta, x, nComp, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, penFamily = "lasso", lambRidge = 0, nIterEM = 400, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, gamMixPor = 1, activeset, lambMCP, lambSICA, cutpoint = 0.05, LambMin = 0.01, LambMax = 1, nLamb = 100 )
fmrs.tunsel(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.tunsel( y, delta, x, nComp, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, penFamily = "lasso", lambRidge = 0, nIterEM = 400, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, gamMixPor = 1, activeset, lambMCP, lambSICA, cutpoint = 0.05, LambMin = 0.01, LambMax = 1, nLamb = 100 )
y |
Responses (observations) |
delta |
Censoring indicator vector |
x |
Design matrix (covariates) |
nComp |
Order (Number of components) of mixture model |
... |
Other possible options |
disFamily |
A sub-distribution family. The options
are |
initCoeff |
Vector of initial values for regression coefficients including intercepts |
initDispersion |
Vector of initial values for standard deviations |
initmixProp |
Vector of initial values for proportion of components |
penFamily |
Penalty name that is used in variable selection method.
The available options are |
lambRidge |
A positive value for tuning parameter in Ridge Regression or Elastic Net |
nIterEM |
Maximum number of iterations for EM algorithm |
nIterNR |
Maximum number of iterations for Newton-Raphson algorithm |
conveps |
A positive value for avoiding NaN in computing divisions |
convepsEM |
A positive value for threshold of convergence in EM algorithm |
convepsNR |
A positive value for threshold of convergence in NR algorithm |
NRpor |
A positive interger for maximum number of searches in NR algorithm |
gamMixPor |
Proportion of mixing parameters in the penalty. The
value must be in the interval [0,1]. If |
activeset |
A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model |
lambMCP |
A positive numbers for |
lambSICA |
A positive numbers for |
cutpoint |
A positive value for shrinking small values of parameter estimations in the EM algorithm toward zero |
LambMin |
A positive value for minimum value of tuning parameter |
LambMax |
A positive value for maximum value of tuning parameter |
nLamb |
A positive value for number of tuning parameter |
The maximizer of penalized Log-Likelihood depends on selecting a
set of good tuning parameters which is a rather thorny issue. We
choose a value in an equally spaced set of values in
for a pre-specified
that maximize the component-wise
BIC,
where is the active set excluding the intercept
and
is its size. This approach is much faster than using an
nComp
by nComp
grid to select the set to
maximize the penallized Log-Likelihood.
An fmrstunpar-class
that includes
component-wise tuning parameter estimates that can be used in
variable selection procedure.
Farhad Shokoohi <[email protected]>
Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies
Other lnorm, norm, weibull:
fmrs.gendata()
,
fmrs.mle()
,
fmrs.varsel()
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') show(res.lam)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') show(res.lam)
Provides variable selection and penalized MLE for
Finite Mixture of Accelerated Failure Time Regression (FMAFTR) Models
and Finite Mixture of Regression (FMR
) Models.
It also provide Ridge Regression and Elastic Net.
fmrs.varsel(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.varsel( y, delta, x, nComp, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, penFamily = "lasso", lambPen, lambRidge = 0, nIterEM = 2000, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, gamMixPor = 1, activeset, lambMCP, lambSICA, cutpoint = 0.05 )
fmrs.varsel(y, delta, x, nComp, ...) ## S4 method for signature 'ANY' fmrs.varsel( y, delta, x, nComp, disFamily = "lnorm", initCoeff, initDispersion, initmixProp, penFamily = "lasso", lambPen, lambRidge = 0, nIterEM = 2000, nIterNR = 2, conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08, NRpor = 2, gamMixPor = 1, activeset, lambMCP, lambSICA, cutpoint = 0.05 )
y |
Responses (observations) |
delta |
Censoring indicators |
x |
Design matrix (covariates) |
nComp |
Order (Number of components) of mixture model |
... |
Other possible options |
disFamily |
A sub-distribution family. The options
are |
initCoeff |
Vector of initial values for regression coefficients including intercepts |
initDispersion |
Vector of initial values for standard deviations |
initmixProp |
Vector of initial values for proportion of components |
penFamily |
Penalty name that is used in variable selection method
The available options are |
lambPen |
A vector of positive numbers for tuning parameters |
lambRidge |
A positive value for tuning parameter in Ridge Regression or Elastic Net |
nIterEM |
Maximum number of iterations for EM algorithm |
nIterNR |
Maximum number of iterations for Newton-Raphson algorithm |
conveps |
A positive value for avoiding NaN in computing divisions |
convepsEM |
A positive value for threshold of convergence in EM algorithm |
convepsNR |
A positive value for threshold of convergence in NR algorithm |
NRpor |
A positive interger for maximum number of searches in NR algorithm |
gamMixPor |
Proportion of mixing parameters in the penalty. The
value must be in the interval [0,1]. If |
activeset |
A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model |
lambMCP |
A positive numbers for |
lambSICA |
A positive numbers for |
cutpoint |
A positive value for shrinking small values of parameter estimations in the EM algorithm tward zero |
The penalized likelihood of a finite mixture of AFT regression models is written as
where
In the M step of EM algorithm the function
is maximized. Since the penalty function is singular at origin, we use a local quadratic approximation (LQA) for the penalty as follows,
Therefore maximizing is
equivalent to maximizing the
function
In case of Log-Normal sub-distributions, the maximizers of
functions are as follows. Given the data and current estimates of
parameters, the maximizers are
where
and
is equal to
For the Weibull distribution, on the other hand, we have
,
where
is the penalized version of hessian matrix
and
is the penalized version of vector of first derivatives evaluated
at
.
Farhad Shokoohi <[email protected]>
Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies
Other lnorm, norm, weibull:
fmrs.gendata()
,
fmrs.mle()
,
fmrs.tunsel()
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = ncomp(res.mle), disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') res.var <- fmrs.varsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = ncomp(res.mle), disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso', lambPen = slot(res.lam, 'lambPen')) summary(res.var)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = ncomp(res.mle), disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') res.var <- fmrs.varsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = ncomp(res.mle), disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso', lambPen = slot(res.lam, 'lambPen')) summary(res.var)
is an S4 class represents a fitted of FMRs model
resulted from running fmrs.mle
or fmrs.varsel
y
A length-nobs
numeric vector
delta
A length-nobs
numeric vector
x
A dimension-nobs
-ncov
numeric matrix
nobs
A length-one numeric vector
ncov
A length-one numeric vector
ncomp
A length-one numeric vector
coefficients
A length-(ncov+1)
-ncomp
numeric matrix
dispersion
A length-ncomp
numeric vector
mixProp
A length-ncomp
numeric vector
logLik
A length-one numeric vector
BIC
A length-one numeric vector
nIterEMconv
A length-one numeric vector
disFamily
A length-one character vector
penFamily
A length-one character vector
lambPen
A length-ncomp
numeric vector
lambRidge
A length-one numeric vector
MCPGam
A length-one numeric vector
SICAGam
A length-one numeric vector
model
A length-one character vector
fitted
A dimension-nobs
-ncomp
numeric matrix
residuals
A dimension-nobs
-ncomp
numeric matrix
weights
A dimension-nobs
-ncomp
numeric matrix
activeset
A dimension-ncov+1
-ncomp
0-1 matrix
selectedset
A dimension-ncov
-ncomp
0-1 matrix
An S4 class to represent estimated optimal lambdas resulted
from running fmrs.tunsel
ncov
A length-one numeric vector
ncomp
A length-one numeric vector
lambPen
A dimension-one-ncomp
numeric array
MCPGam
A length-one numeric vector
SICAGam
A length-one numeric vector
disFamily
A length-one character vector
penFamily
A length-one character vector
lambRidge
A length-one numeric vector
model
A length-one character vector
activeset
A dimension-nobs
-ncomp
0-1 matrix
Provides the estimated logLikelihood of an FMRs
model
from an fmrsfit-class
logLik(object, ...) ## S4 method for signature 'fmrsfit' logLik(object, ...)
logLik(object, ...) ## S4 method for signature 'fmrsfit' logLik(object, ...)
object |
|
... |
Other possible arguments |
A numeric value
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) logLik(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) logLik(res.mle)
Provides the estimated mixing proportions of an FMRs
model form an fmrsfit-class
mixProp(object, ...) ## S4 method for signature 'fmrsfit' mixProp(object, ...)
mixProp(object, ...) ## S4 method for signature 'fmrsfit' mixProp(object, ...)
object |
|
... |
Other possible arguments |
A numeric vector of length-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) mixProp(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) mixProp(res.mle)
Provides the order of an FMRs
model from
an fmrsfit-class
ncomp(object, ...) ## S4 method for signature 'fmrsfit' ncomp(object, ...)
ncomp(object, ...) ## S4 method for signature 'fmrsfit' ncomp(object, ...)
object |
|
... |
Other possible arguments |
An integer value
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) ncomp(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) ncomp(res.mle)
Provides the number of covariates of an FMRs
model from
an fmrsfit-class
ncov(object, ...) ## S4 method for signature 'fmrsfit' ncov(object, ...)
ncov(object, ...) ## S4 method for signature 'fmrsfit' ncov(object, ...)
object |
|
... |
Other possible arguments |
An integer value
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) ncov(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) ncov(res.mle)
Provides the number of observations in an FMRs
model
from an fmrsfit-class
nobs(object, ...) ## S4 method for signature 'fmrsfit' nobs(object, ...)
nobs(object, ...) ## S4 method for signature 'fmrsfit' nobs(object, ...)
object |
|
... |
Other possible arguments |
An integer value
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) nobs(res.mle)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) nobs(res.mle)
Provides the residuals of the fitted FMRs
model from
an fmrsfit-class
residuals(object, ...) ## S4 method for signature 'fmrsfit' residuals(object, ...)
residuals(object, ...) ## S4 method for signature 'fmrsfit' residuals(object, ...)
object |
|
... |
Other possible arguments |
A numeric array of dimension-nObs
-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(residuals(res.mle))
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(residuals(res.mle))
Displays the fitted FMRs
model by showing the estimated
coefficients, dispersions and mixing proportions
Display the selected component-wise tuning parameters
summary(object, ...) summary(object, ...) ## S4 method for signature 'fmrsfit' summary(object, ...) ## S4 method for signature 'fmrstunpar' summary(object, ...)
summary(object, ...) summary(object, ...) ## S4 method for signature 'fmrsfit' summary(object, ...) ## S4 method for signature 'fmrstunpar' summary(object, ...)
object |
|
... |
Other possible arguments |
Summary of the fitted FMRs
model
Summary of the selected component-wise tuning parameters
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) summary(res.mle) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') summary(res.lam)
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) summary(res.mle) res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle), penFamily = 'adplasso') summary(res.lam)
Provides the weights of fitted observations for
each observation under all components of an FMRs
model
weights(object, ...) ## S4 method for signature 'fmrsfit' weights(object, ...)
weights(object, ...) ## S4 method for signature 'fmrsfit' weights(object, ...)
object |
|
... |
Other possible arguments |
A numeric array of dimension-nObs
-nComp
Farhad Shokoohi <[email protected]>
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(weights(res.mle))
set.seed(1980) K = 2 D = 10 n = 500 sig = c(1, 1) piM = c(0.4, 0.6) r1 = 0.5 coeff1 = c( 2, 2, -1, -2, 1, 2, 0, 0, 0, 0, 0) coeff2 = c(-1, -1, 1, 2, 0, 0, 0, 0, -1, 2, -2) Um = 40 dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2), dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm') res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K, disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K), initmixProp = rep(1/K, K)) head(weights(res.mle))