Package 'fmrs'

Title: Variable Selection in Finite Mixture of AFT Regression and FMR Models
Description: The package obtains parameter estimation, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net.
Authors: Farhad Shokoohi [aut, cre]
Maintainer: Farhad Shokoohi <shokoohi@icloud.com>
License: GPL-3
Version: 1.17.0
Built: 2024-12-29 05:36:26 UTC
Source: https://github.com/bioc/fmrs

Help Index


Variable Selection in Finite Mixture of AFT Regression and FMR Models

Description

The package obtains parameter estimation, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net. The fmrs.mle method provides MLE for FMRs models. The fmrs.tunsel method provides component-wise tuning parameters. The fmrs.varsel method provides variable selection for FMRs models.

fmrs methods

fmrs.mle, fmrs.tunsel, fmrs.varsel, fmrs.gendata.

fmrs objects

fmrsfit-class, fmrstunpar-class


BIC method

Description

Provides the estimated BIC of an FMRs model from an fmrsfit-class

Usage

BIC(object, ...)

## S4 method for signature 'fmrsfit'
BIC(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric value

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

BIC(res.mle)

coefficients method

Description

Provides the estimated regression coefficients from the fitted FMRs model from an fmrsfit-class

Usage

coefficients(object, ...)

## S4 method for signature 'fmrsfit'
coefficients(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric array of dimension-(nCov+1)-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

coefficients(res.mle)

dispersion method

Description

Provides the estimated dispersions of the fitted FMRs model from an fmrsfit-class

Usage

dispersion(object, ...)

## S4 method for signature 'fmrsfit'
dispersion(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric array of dimension-(nCov+1)-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

dispersion(res.mle)

fitted method

Description

Provides the fitted response of the fitted FMRs model from an fmrsfit-class

Usage

fitted(object, ...)

## S4 method for signature 'fmrsfit'
fitted(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric array of dimension-nObs-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

head(fitted(res.mle))

fmrs.gendata method

Description

Generates a data set from Finite Mixture of AFT regression models or Finite Mixture of Regression models under the specified setting.

Usage

fmrs.gendata(nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, ...)

## S4 method for signature 'ANY'
fmrs.gendata(
  nObs,
  nComp,
  nCov,
  coeff,
  dispersion,
  mixProp,
  rho,
  umax,
  disFamily = "lnorm"
)

Arguments

nObs

A numeric value represents sample size

nComp

A numeric value represents the order mixture in FMRs

nCov

A numeric value represents the number of covariates in design matrix

coeff

A vector of all regression coefficients including intercepts. It must be a vector of length nComp *(nCov+1).

dispersion

A vector of positive values for dispersion parameters of sub-distributions in FMRs models

mixProp

A vector of mixing proportions which their sum must be one

rho

A numeric value in [-1, 1] which represents the correlation between covariates of design matrix

umax

A numeric value represents the upper bound in Uniform distribution for censoring

...

Other possible options

disFamily

A sub-distribution family. The options are 'norm' for FMR models, 'lnorm' for mixture of AFT regression models with Log-Normal sub-distributions,'weibull' for mixture of AFT regression models with Weibull sub-distributions

Value

A list including response, covariates and censoring variables

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

See Also

Other lnorm, norm, weibull: fmrs.mle(), fmrs.tunsel(), fmrs.varsel()

Examples

set.seed(1980)
K = 2
D = 10
n = 500
REP = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

fmrs.mle method

Description

Provides MLE for Finite Mixture of Accelerated Failure Time Regression Models or Finite Mixture of Regression Models. It also provides Ridge Regression.

Usage

fmrs.mle(y, delta, x, nComp, ...)

## S4 method for signature 'ANY'
fmrs.mle(
  y,
  delta,
  x,
  nComp = 2,
  disFamily = "lnorm",
  initCoeff,
  initDispersion,
  initmixProp,
  lambRidge = 0,
  nIterEM = 400,
  nIterNR = 2,
  conveps = 1e-08,
  convepsEM = 1e-08,
  convepsNR = 1e-08,
  NRpor = 2,
  activeset
)

Arguments

y

Responses (observations)

delta

Censoring indicator vector

x

Design matrix (covariates)

nComp

Order (Number of components) of mixture model

...

Other possible options

disFamily

A sub-distribution family. The options are 'norm' for FMR models, 'lnorm' for mixture of AFT regression models with Log-Normal sub-distributions,'weibull' for mixture of AFT regression models with Weibull sub-distributions

initCoeff

Vector of initial values for regression coefficients including intercepts

initDispersion

Vector of initial values for standard deviations

initmixProp

Vector of initial values for proportion of components

lambRidge

A positive value for tuning parameter in Ridge Regression or Elastic Net

nIterEM

Maximum number of iterations for EM algorithm

nIterNR

Maximum number of iterations for Newton-Raphson algorithm

conveps

A positive value for avoiding NaN in computing divisions

convepsEM

A positive value for threshold of convergence in EM algorithm

convepsNR

A positive value for threshold of convergence in Newton-Raphson algorithm

NRpor

A positive integer for maximum number of searches in NR algorithm

activeset

A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model

Details

Finite mixture of AFT regression models are represented as follows. Let XX be the survival time with non-negative values, and z=(z1,,zd)\boldsymbol{z} =(z_{1}, \ldots, z_{d})^{\top} be a dd-dimensional vector of covariates that may have an effect on XX. If the survival time is subject to right censoring, then the observed response time is T=min{Y,C}T=\min \{Y, C\}, where Y=logXY=\log X, CC is logarithm of the censoring time and δ=I{y<c}\delta=I_{\{y<c\}} is the censoring indicator. We say that V=(T,δ,z)V=(T,\delta,\boldsymbol z) follows a finite mixture of AFT regression models of order KK if the conditional density of (T,δ)(T,\delta) given z\boldsymbol z has the form

f(t,δ;z,Ψ)=k=1Kπk[fY(t;θk(z),σk)]δ[SY(t;θk(z),σk)]1δ[fC(t)]1δ[SC(t)]δf(t,\delta;\boldsymbol{z},\boldsymbol\Psi) =\sum\limits_{k=1}^{K}\pi_{k}[f_Y(t;\theta_{k}(\boldsymbol z), \sigma_{k})]^{\delta}[S_Y(t;\theta_{k}(\boldsymbol z) ,\sigma_{k})]^{1-\delta}[f_{C}(t)]^{1-\delta}[S_{C}(t)]^{\delta}

where fY(.)f_Y(.) and SY(.)S_Y(.) are respectively the density and survival functions of YY, fC(.)f_C(.) and SC(.)S_C(.) are respectively the density and survival functions of CC; and θk(z)=h(β0k+zβk){\theta}_{k}(\boldsymbol{z})=h(\beta_{0k}+\boldsymbol{z}^{\top} \boldsymbol\beta_{k}) for a known link function h(.)h(.), Ψ=(π1,,πK,β01,,β0K,β1,,βK,σ1,,σK)\boldsymbol\Psi=(\pi_{1},\ldots,\pi_{K},\beta_{01},\ldots, \beta_{0K},\boldsymbol\beta_{1}, \ldots,\boldsymbol\beta_{K},\sigma_{1}, \ldots,\sigma_{K})^{\top} with βk=(βk1,βk2,,βkd)\boldsymbol\beta_{k}= (\beta_{k1},\beta_{k2},\ldots,\beta_{kd})^{\top} and 0<πk<10<\pi_{k}<1 with k=1Kπk=1\sum_{k=1}^{K}\pi_{k}=1. The log-likelihood of a sample of size $n$ is formed as

n(Ψ)=i=1nlogk=1Kπk[fY(ti,θk(zi),σk)]δi[SY(ti,θk(zi),σk)]1δi.\ell_{n}(\boldsymbol\Psi) = \sum\limits_{i=1}^{n}\log\sum\limits_{k=1}^{K}\pi_{k}\left[f_Y(t_{i}, \theta_{k}({\boldsymbol z}_{i}),\sigma_{k}) \right]^{\delta_{i}} \left[S_Y(t_{i},\theta_{k}({\boldsymbol z}_{i}), \sigma_{k})\right]^{1-\delta_{i}}.

Note that we assume the censoring distribution is non-informative and hence won't play any role in the estimation process. We use EM and Newton-Raphson algorithms in our method to find the maximizer of above Log-Likelihood.

Value

An fmrsfit-class that includes parameter estimates of the specified FMRs model

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

References

Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies

See Also

Other lnorm, norm, weibull: fmrs.gendata(), fmrs.tunsel(), fmrs.varsel()

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

summary(res.mle)

fmrs.tunsel method

Description

Provides component-wise tuning parameters using BIC for Finite Mixture of Accelerated Failure Time Regression Models and Finite Mixture of Regression Models.

Usage

fmrs.tunsel(y, delta, x, nComp, ...)

## S4 method for signature 'ANY'
fmrs.tunsel(
  y,
  delta,
  x,
  nComp,
  disFamily = "lnorm",
  initCoeff,
  initDispersion,
  initmixProp,
  penFamily = "lasso",
  lambRidge = 0,
  nIterEM = 400,
  nIterNR = 2,
  conveps = 1e-08,
  convepsEM = 1e-08,
  convepsNR = 1e-08,
  NRpor = 2,
  gamMixPor = 1,
  activeset,
  lambMCP,
  lambSICA,
  cutpoint = 0.05,
  LambMin = 0.01,
  LambMax = 1,
  nLamb = 100
)

Arguments

y

Responses (observations)

delta

Censoring indicator vector

x

Design matrix (covariates)

nComp

Order (Number of components) of mixture model

...

Other possible options

disFamily

A sub-distribution family. The options are 'norm' for FMR models, 'lnorm' for mixture of AFT regression models with Log-Normal sub-distributions, 'weibull' for mixture of AFT regression models with Weibull sub-distributions,

initCoeff

Vector of initial values for regression coefficients including intercepts

initDispersion

Vector of initial values for standard deviations

initmixProp

Vector of initial values for proportion of components

penFamily

Penalty name that is used in variable selection method. The available options are 'lasso', 'adplasso', 'mcp', 'scad', 'sica' and 'hard'.

lambRidge

A positive value for tuning parameter in Ridge Regression or Elastic Net

nIterEM

Maximum number of iterations for EM algorithm

nIterNR

Maximum number of iterations for Newton-Raphson algorithm

conveps

A positive value for avoiding NaN in computing divisions

convepsEM

A positive value for threshold of convergence in EM algorithm

convepsNR

A positive value for threshold of convergence in NR algorithm

NRpor

A positive interger for maximum number of searches in NR algorithm

gamMixPor

Proportion of mixing parameters in the penalty. The value must be in the interval [0,1]. If gamMixPor = 0, the penalty structure is no longer mixture.

activeset

A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model

lambMCP

A positive numbers for mcp's extra tuning parameter

lambSICA

A positive numbers for sica's extra tuning parameter

cutpoint

A positive value for shrinking small values of parameter estimations in the EM algorithm toward zero

LambMin

A positive value for minimum value of tuning parameter

LambMax

A positive value for maximum value of tuning parameter

nLamb

A positive value for number of tuning parameter

Details

The maximizer of penalized Log-Likelihood depends on selecting a set of good tuning parameters which is a rather thorny issue. We choose a value in an equally spaced set of values in (0,λmax)(0, \lambda_{max}) for a pre-specified λmax\lambda_{max} that maximize the component-wise BIC,

λ^k=argmaxλkBICk(λk)=argmaxλk{k,nc(Ψ^λk,k)dλk,klog(n)},\hat\lambda_{k} ={argmax}_{\lambda_{k}}BIC_k(\lambda_{k})= {argmax}_{\lambda_{k}}\left\{\ell^{c}_{k, n} (\hat{\boldsymbol\Psi}_{\lambda_{k}, k}) - |d_{\lambda_{k},k}| \log (n)\right\},

where dλk,k={j:β^λk,kj0,j=1,,d}d_{\lambda_{k},k}=\{j:\hat{\beta}_{\lambda_{k},kj}\neq 0, j=1,\ldots,d\} is the active set excluding the intercept and dλk,k|d_{\lambda_{k},k}| is its size. This approach is much faster than using an nComp by nComp grid to select the set λ\boldsymbol\lambda to maximize the penallized Log-Likelihood.

Value

An fmrstunpar-class that includes component-wise tuning parameter estimates that can be used in variable selection procedure.

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

References

Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies

See Also

Other lnorm, norm, weibull: fmrs.gendata(), fmrs.mle(), fmrs.varsel()

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)),
initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle),
penFamily = 'adplasso')

show(res.lam)

fmrs.varsel method

Description

Provides variable selection and penalized MLE for Finite Mixture of Accelerated Failure Time Regression (FMAFTR) Models and Finite Mixture of Regression (FMR) Models. It also provide Ridge Regression and Elastic Net.

Usage

fmrs.varsel(y, delta, x, nComp, ...)

## S4 method for signature 'ANY'
fmrs.varsel(
  y,
  delta,
  x,
  nComp,
  disFamily = "lnorm",
  initCoeff,
  initDispersion,
  initmixProp,
  penFamily = "lasso",
  lambPen,
  lambRidge = 0,
  nIterEM = 2000,
  nIterNR = 2,
  conveps = 1e-08,
  convepsEM = 1e-08,
  convepsNR = 1e-08,
  NRpor = 2,
  gamMixPor = 1,
  activeset,
  lambMCP,
  lambSICA,
  cutpoint = 0.05
)

Arguments

y

Responses (observations)

delta

Censoring indicators

x

Design matrix (covariates)

nComp

Order (Number of components) of mixture model

...

Other possible options

disFamily

A sub-distribution family. The options are 'norm' for FMR models, 'lnorm' for mixture of AFT regression models with Log-Normal sub-distributions, 'weibull' for mixture of AFT regression models with Weibull sub-distributions

initCoeff

Vector of initial values for regression coefficients including intercepts

initDispersion

Vector of initial values for standard deviations

initmixProp

Vector of initial values for proportion of components

penFamily

Penalty name that is used in variable selection method The available options are 'lasso', 'adplasso', 'mcp', 'scad', 'sica' and 'hard'.

lambPen

A vector of positive numbers for tuning parameters

lambRidge

A positive value for tuning parameter in Ridge Regression or Elastic Net

nIterEM

Maximum number of iterations for EM algorithm

nIterNR

Maximum number of iterations for Newton-Raphson algorithm

conveps

A positive value for avoiding NaN in computing divisions

convepsEM

A positive value for threshold of convergence in EM algorithm

convepsNR

A positive value for threshold of convergence in NR algorithm

NRpor

A positive interger for maximum number of searches in NR algorithm

gamMixPor

Proportion of mixing parameters in the penalty. The value must be in the interval [0,1]. If gamMixPor = 0, the penalty structure is no longer mixture.

activeset

A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model

lambMCP

A positive numbers for mcp's extra tuning parameter

lambSICA

A positive numbers for sica's extra tuning parameter

cutpoint

A positive value for shrinking small values of parameter estimations in the EM algorithm tward zero

Details

The penalized likelihood of a finite mixture of AFT regression models is written as

~n(Ψ)=n(Ψ)pλn(Ψ)\tilde\ell_{n}(\boldsymbol\Psi) =\ell_{n}(\boldsymbol\Psi) - \mathbf{p}_{\boldsymbol\lambda_{n}}(\boldsymbol\Psi)

where

pλn(Ψ)=k=1Kπkα{j=1dpλn,k(βkj)}.\mathbf{p}_{\boldsymbol\lambda_{n}}(\boldsymbol\Psi) = \sum\limits_{k=1}^{K}\pi_{k}^\alpha\left\{ \sum\limits_{j=1}^{d}p_{\lambda_{n,k}}(\beta_{kj}) \right\}.

In the M step of EM algorithm the function

Q~(Ψ,Ψ(m))=k=1KQ~k(Ψk,Ψk(m))=k=1K[Qk(Ψk,Ψk(m))πkα{j=1dpλn,k(βkj)}]\tilde{Q}(\boldsymbol\Psi,\boldsymbol\Psi^{(m)}) =\sum\limits_{k=1}^{K} \tilde{Q}_{k}(\boldsymbol\Psi_k, \boldsymbol\Psi^{(m)}_k) = \sum\limits_{k=1}^{K} \left[{Q}_{k}(\boldsymbol\Psi_k, \boldsymbol\Psi^{(m)}_k) - \pi_{k}^\alpha\left\{ \sum\limits_{j=1}^{d}p_{\lambda_{n,k}}(\beta_{kj}) \right\}\right]

is maximized. Since the penalty function is singular at origin, we use a local quadratic approximation (LQA) for the penalty as follows,

pk,λn(β,β(m))=(πk(m))αj=1d{pλn,k(βkj(m))+pλn,k(βkj(m))2βkj(m)(βkj2βkj(m)2)}.\mathbf{p}^\ast_{k,\boldsymbol\lambda_{n}} (\boldsymbol\beta,\boldsymbol\beta^{(m)}) =(\pi_{k}^{(m)})^{\alpha}\sum\limits_{j=1}^{d}\left\{ p_{\lambda_{n,k}}(\beta_{kj}^{(m)}) + { p^{\prime}_{\lambda_{n,k}} (\beta_{kj}^{(m)}) \over 2\beta_{kj}^{(m)}}(\beta_{kj}^{2} - {\beta_{kj}^{(m)}}^{2}) \right\}.

Therefore maximizing QQ is equivalent to maximizing the function

Q(Ψ,Ψ(m))=k=1KQk(Ψk,Ψk(m))=k=1K[Qk(Ψk,Ψk(m))pk,λn(β,β(m))].{Q}^\ast(\boldsymbol\Psi,\boldsymbol\Psi^{(m)}) =\sum\limits_{k=1}^{K} {Q}^\ast_{k}(\boldsymbol\Psi_k, \boldsymbol\Psi^{(m)}_k) = \sum\limits_{k=1}^{K} \left[{Q}_{k}(\boldsymbol\Psi_k,\boldsymbol\Psi^{(m)}_k)- \mathbf{p}^\ast_{k,\boldsymbol\lambda_{n}}(\boldsymbol\beta, \boldsymbol\beta^{(m)})\right].

In case of Log-Normal sub-distributions, the maximizers of QkQ_k functions are as follows. Given the data and current estimates of parameters, the maximizers are

βk(m+1)=(zτk(m)z+ϖk(βkj(m)))1zτk(m)Tk(m),{\boldsymbol\beta}^{(m+1)}_{k} =({\boldsymbol z}^{\prime}\boldsymbol\tau^{(m)}_{k}{\boldsymbol z}+ \varpi_{k}(\boldsymbol\beta_{kj}^{(m)}))^{-1}{\boldsymbol z}^{\prime} \boldsymbol\tau^{(m)}_{k}T^{(m)}_{k},

where ϖk(βkj(m))=diag((πk(m+1))αpλn,k(βkj(m))βkj(m))\varpi_{k}(\boldsymbol\beta_{kj}^{(m)})={diag} \left(\left(\pi_{k}^{(m+1)}\right)^\alpha \frac{{p}^{\prime}_{\lambda_{n},k}(\boldsymbol\beta_{kj}^{(m)})} {\boldsymbol\beta_{kj}^{(m)}}\right) and σk(m+1)\sigma_{k}^{(m+1)} is equal to

σk(m+1)=i=1nτik(m)(tik(m)ziβk(m))2i=1nτik(m)[δi+(1δi){A(wik(m))[A(wik(m))wik(m)]}].\sigma_{k}^{(m+1)} =\sqrt{\frac{\sum\limits_{i=1}^{n}\tau^{(m)}_{ik} (t^{(m)}_{ik} -{\boldsymbol z}_{i}\boldsymbol\beta^{(m)}_{k})^{2}} {\sum\limits_{i=1}^{n}\tau^{(m)}_{ik} {\left[\delta_{i} +(1-\delta_{i})\{A(w^{(m)}_{ik})[A(w^{(m)}_{ik})- w^{(m)}_{ik}]\}\right]}}}.

For the Weibull distribution, on the other hand, we have Ψ~k(m+1)=Ψ~k(m)0.5κ[Hkp,(m)]1Ikp,(m)\tilde{\boldsymbol\Psi}^{(m+1)}_k =\tilde{\boldsymbol\Psi}^{(m)}_k - 0.5^{\kappa}\left[{H_{k}^{p,(m)}}\right]^{-1}I_{k}^{p,(m)}, where Hkp=Hk+h(Ψk)H^p_{k}=H_k+h(\boldsymbol\Psi_k) is the penalized version of hessian matrix and Ikp=Ik+h(Ψk)ΨkI^p_{k}=I_k+h(\boldsymbol\Psi_k)\boldsymbol\Psi_k is the penalized version of vector of first derivatives evaluated at Ψ~k(m)\tilde{\boldsymbol\Psi}_k^{(m)}.

Value

fmrsfit-class

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

References

Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies

See Also

Other lnorm, norm, weibull: fmrs.gendata(), fmrs.mle(), fmrs.tunsel()

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta,
nComp = ncomp(res.mle), disFamily = 'lnorm',
initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle),
initmixProp = mixProp(res.mle), penFamily = 'adplasso')

res.var <- fmrs.varsel(y = dat$y, x = dat$x, delta = dat$delta,
nComp = ncomp(res.mle), disFamily = 'lnorm',
initCoeff = c(coefficients(res.mle)), initDispersion = dispersion(res.mle),
initmixProp = mixProp(res.mle), penFamily = 'adplasso',
lambPen = slot(res.lam, 'lambPen'))

summary(res.var)

An S4 class to represent a fitted FMRs model

Description

is an S4 class represents a fitted of FMRs model resulted from running fmrs.mle or fmrs.varsel

Slots

y

A length-nobs numeric vector

delta

A length-nobs numeric vector

x

A dimension-nobs-ncov numeric matrix

nobs

A length-one numeric vector

ncov

A length-one numeric vector

ncomp

A length-one numeric vector

coefficients

A length-(ncov+1)-ncomp numeric matrix

dispersion

A length-ncomp numeric vector

mixProp

A length-ncomp numeric vector

logLik

A length-one numeric vector

BIC

A length-one numeric vector

nIterEMconv

A length-one numeric vector

disFamily

A length-one character vector

penFamily

A length-one character vector

lambPen

A length-ncomp numeric vector

lambRidge

A length-one numeric vector

MCPGam

A length-one numeric vector

SICAGam

A length-one numeric vector

model

A length-one character vector

fitted

A dimension-nobs-ncomp numeric matrix

residuals

A dimension-nobs-ncomp numeric matrix

weights

A dimension-nobs-ncomp numeric matrix

activeset

A dimension-ncov+1-ncomp 0-1 matrix

selectedset

A dimension-ncov-ncomp 0-1 matrix


An S4 class to represent estimated optimal lambdas

Description

An S4 class to represent estimated optimal lambdas resulted from running fmrs.tunsel

Slots

ncov

A length-one numeric vector

ncomp

A length-one numeric vector

lambPen

A dimension-one-ncomp numeric array

MCPGam

A length-one numeric vector

SICAGam

A length-one numeric vector

disFamily

A length-one character vector

penFamily

A length-one character vector

lambRidge

A length-one numeric vector

model

A length-one character vector

activeset

A dimension-nobs-ncomp 0-1 matrix


logLik method

Description

Provides the estimated logLikelihood of an FMRs model from an fmrsfit-class

Usage

logLik(object, ...)

## S4 method for signature 'fmrsfit'
logLik(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric value

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

logLik(res.mle)

mixProp method

Description

Provides the estimated mixing proportions of an FMRs model form an fmrsfit-class

Usage

mixProp(object, ...)

## S4 method for signature 'fmrsfit'
mixProp(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric vector of length-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

mixProp(res.mle)

ncomp method

Description

Provides the order of an FMRs model from an fmrsfit-class

Usage

ncomp(object, ...)

## S4 method for signature 'fmrsfit'
ncomp(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

An integer value

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

ncomp(res.mle)

ncov method

Description

Provides the number of covariates of an FMRs model from an fmrsfit-class

Usage

ncov(object, ...)

## S4 method for signature 'fmrsfit'
ncov(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

An integer value

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

ncov(res.mle)

nobs method

Description

Provides the number of observations in an FMRs model from an fmrsfit-class

Usage

nobs(object, ...)

## S4 method for signature 'fmrsfit'
nobs(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

An integer value

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

nobs(res.mle)

residuals method

Description

Provides the residuals of the fitted FMRs model from an fmrsfit-class

Usage

residuals(object, ...)

## S4 method for signature 'fmrsfit'
residuals(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric array of dimension-nObs-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

head(residuals(res.mle))

summary method

Description

Displays the fitted FMRs model by showing the estimated coefficients, dispersions and mixing proportions

Display the selected component-wise tuning parameters

Usage

summary(object, ...)

summary(object, ...)

## S4 method for signature 'fmrsfit'
summary(object, ...)

## S4 method for signature 'fmrstunpar'
summary(object, ...)

Arguments

object

An fmrsfit-class or fmrstunpar-class

...

Other possible arguments

Value

Summary of the fitted FMRs model

Summary of the selected component-wise tuning parameters

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

summary(res.mle)
res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta,
nComp = K, disFamily = 'lnorm', initCoeff = c(coefficients(res.mle)),
initDispersion = dispersion(res.mle), initmixProp = mixProp(res.mle),
penFamily = 'adplasso')

summary(res.lam)

weights method

Description

Provides the weights of fitted observations for each observation under all components of an FMRs model

Usage

weights(object, ...)

## S4 method for signature 'fmrsfit'
weights(object, ...)

Arguments

object

An fmrsfit-class

...

Other possible arguments

Value

A numeric array of dimension-nObs-nComp

Author(s)

Farhad Shokoohi <shokoohi@icloud.com>

Examples

set.seed(1980)
K = 2
D = 10
n = 500
sig = c(1, 1)
piM = c(0.4, 0.6)
r1 = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
Um = 40

dat <- fmrs.gendata(nObs = n, nComp = K, nCov = D, coeff = c(coeff1, coeff2),
dispersion = sig, mixProp = piM, rho = r1, umax = Um, disFamily = 'lnorm')

res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta, nComp = K,
disFamily = 'lnorm', initCoeff = rnorm(K*D+K), initDispersion = rep(1, K),
initmixProp = rep(1/K, K))

head(weights(res.mle))