Package 'semisup' reference manual

Title:	Semi-Supervised Mixture Model
Description:	Implements a parametric semi-supervised mixture model. The permutation test detects markers with main or interactive effects, without distinguishing them. Possible applications include genome-wide association analysis and differential expression analysis.
Authors:	Armin Rauschenberger [aut, cre]
Maintainer:	Armin Rauschenberger <[email protected]>
License:	GPL-3
Version:	1.31.0
Built:	2025-02-28 03:46:51 UTC
Source:	https://github.com/bioc/semisup

Semi-supervised mixture model

Description

This R package implements the semi-supervised mixture model. Use mixtura for model fitting, and scrutor for hypothesis testing.

Getting started

Please type the following commands:
utils::vignette("semisup")
?semisup::mixtura
?semisup::scrutor

More information

A Rauschenberger, RX Menezes, MA van de Wiel, NM van Schoor, and MA Jonker (2020). "Semi-supervised mixture test for detecting markers associated with a quantitative trait", Manuscript in preparation.

[email protected]

Model fitting

Description

This function fits a semi-supervised mixture model. It simultaneously estimates two mixture components, and assigns the unlabelled observations to these.

Usage

mixtura(y, z, dist = "norm",
        phi = NULL, pi = NULL, gamma = NULL,
        test = NULL, iter = 100, kind = 0.05,
        debug = TRUE, ...)
mixtura(y, z, dist = "norm",
        phi = NULL, pi = NULL, gamma = NULL,
        test = NULL, iter = 100, kind = 0.05,
        debug = TRUE, ...)

Arguments

`y`	observations: numeric vector of length `n`
`z`	class labels: integer vector of length `n`, with entries `0`, `1` and `NA`
`dist`	distributional assumption: character `"norm"` (Gaussian), `"nbinom"` (negative bionomial), or `"zinb"` (zero-inflated negative binomial)
`phi`	dispersion parameters: numeric vector of length `q`, or `NULL`
`pi`	zero-inflation parameter(s): numeric vector of length `q`, or `NULL`
`gamma`	offset: numeric vector of length `n`, or `NULL`
`test`	resampling procedure: character `"perm"` (permutation) or `"boot"` (parametric bootstrap), or `NULL`
`iter`	(maximum) number of resampling iterations : positive integer, or `NULL`
`kind`	resampling accuracy: numeric between `0` and `1`, or `NULL`; all `p`-values above `kind` are approximate
`debug`	verification of arguments: `TRUE` or `FALSE`
`...`	settings `EM` algorithm: `starts`, `it.em` and `epsilon` (see `arguments`)

Details

By default, phi and pi are estimated by the maximum likelihood method, and gamma is replaced by a vector of ones.

Value

This function fits and compares a one-component (H0) and a two-component (H1) mixture model.

`posterior`	probability of belonging to class 1: numeric vector of length `n`
`converge`	path of the log-likelihood: numeric vector with maximum length `it.em`
`estim0`	parameter estimates under `H0`: data frame
`estim1`	parameter estimates under `H1`: data frame
`loglik0`	log-likelihood under `H0`: numeric
`loglik1`	log-likelihood under `H1`: numeric
`lrts`	likelihood-ratio test statistic: positive numeric
`p.value`	`H0` versus `H1`: numeric between `0` and `1`, or `NULL`

Reference

Examples

# data simulation
n <- 100
z <- rep(0:1,each=n/2)
y <- rnorm(n=n,mean=2,sd=1)
z[(n/4):n] <- NA

# model fitting
mixtura(y,z,dist="norm",test="perm")

# data simulation
n <- 100
z <- rep(0:1,each=n/2)
y <- rnorm(n=n,mean=2,sd=1)
z[(n/4):n] <- NA

# model fitting
mixtura(y,z,dist="norm",test="perm")

Hypothesis testing

Description

This function tests whether the unlabelled observations come from a mixture of two distributions.

Usage

scrutor(Y, Z, dist = "norm",
        phi = NULL, pi = NULL, gamma = NULL,
        test = "perm", iter = NULL, kind = NULL,
        debug = TRUE, ...)
scrutor(Y, Z, dist = "norm",
        phi = NULL, pi = NULL, gamma = NULL,
        test = "perm", iter = NULL, kind = NULL,
        debug = TRUE, ...)

Arguments

`Y`	observations: numeric vector of length `n`, or numeric matrix with `n` rows (samples) and `q` columns (variables)
`Z`	class labels: numeric vector of length `n`, or numeric matrix with `n` rows (samples) and `p` columns (variables), with entries `0` and `NA`
`dist`	distributional assumption: character `"norm"` (Gaussian), `"nbinom"` (negative bionomial), or `"zinb"` (zero-inflated negative binomial)
`phi`	dispersion parameter(s): numeric vector of length `q`, or `NULL` (`norm:` none, `nbinom:` MLE)
`pi`	zero-inflation parameter(s): numeric vector of length `q`, or `NULL` (`norm:` none,`nbinom:` MLE)
`gamma`	offset: numeric vector of length `n`, or `NULL`
`test`	resampling procedure: character `"perm"` (permutation) or `"boot"` (parametric bootstrap), or `NULL`
`iter`	(maximum) number of resampling iterations : positive integer, or `NULL`
`kind`	resampling accuracy: numeric between `0` and `1`, or `NULL`; all `p`-values above `kind` are approximate
`debug`	verification of arguments: `TRUE` or `FALSE`
`...`	settings `EM` algorithm: `starts`, `it.em` and `epsilon` (see `arguments`)

Details

By default, phi and pi are estimated by the maximum likelihood method, and gamma is replaced by a vector of ones.

Value

This function tests a one-component (H0) against a two-component mixture model (H1).

`y`	index observations
`z`	index class labels
`lrts`	test statistic
`p.value`	`p`-value

Reference

Examples

# data simulation
n <- 100
z <- rep(0:1,each=n/2)
y <- rnorm(n=n,mean=2*z,sd=1)
z[(n/4):n] <- NA

# hypothesis testing
scrutor(y,z,dist="norm")

# data simulation
n <- 100
z <- rep(0:1,each=n/2)
y <- rnorm(n=n,mean=2*z,sd=1)
z[(n/4):n] <- NA

# hypothesis testing
scrutor(y,z,dist="norm")

Package 'semisup'

Help Index

Semi-supervised mixture model

Description

Getting started

More information

Model fitting

Description

Usage

Arguments

Details

Value

Reference

See Also

Examples

Hypothesis testing

Description

Usage

Arguments

Details

Value

Reference

See Also

Examples