Title: | Genotype Conditional Association TEST |
---|---|
Description: | GCAT is an association test for genome wide association studies that controls for population structure under a general class of trait models. This test conditions on the trait, which makes it immune to confounding by unmodeled environmental factors. Population structure is modeled via logistic factors, which are estimated using the `lfa` package. |
Authors: | Wei Hao [aut], Minsun Song [aut], Alejandro Ochoa [aut, cre], John D. Storey [aut] |
Maintainer: | Alejandro Ochoa <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.7.0 |
Built: | 2024-10-30 07:57:49 UTC |
Source: | https://github.com/bioc/gcatest |
This function fits two logistic models at each locus of a given genotype
matrix and, assuming that the models are nested, calculates the delta
deviance between the two.
This general function is intended for testing models in a broad setting; for
the specific problem of genetic association, the interfaces of gcat() and
gcat.stat() are more user-friendly.
delta_deviance_lf(X, LF0, LF1)
X |
A matrix of SNP genotypes, i.e. an integer matrix of 0's,
1's, 2's and |
LF0 |
Logistic factors for null model. |
LF1 |
Logistic factors for alternative model. |
The vector of delta deviance values, one per locus of X.
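To make the idea of a delta deviance between nested logistic models concrete, here is a minimal base-R sketch for a single locus. It is not the package's implementation: the toy data, the variable names (`structure_factor`, `trait`, `x`), and the use of `glm()` are all assumptions made for illustration. Each genotype is treated as two Bernoulli trials, and the design matrices carry an explicit intercept column.

```r
# Hypothetical toy data for ONE locus (not the package's sim_geno / sim_trait)
set.seed(1)
n <- 200
structure_factor <- rnorm(n)  # stands in for a single logistic factor
trait <- rnorm(n)             # stands in for the trait

# Allele frequency depends on structure only, so the trait is unassociated here
p <- plogis(-0.5 + 0.8 * structure_factor)
x <- rbinom(n, size = 2, prob = p)  # genotype coded 0, 1, 2

# Null and alternative design matrices, with an explicit intercept column
LF0 <- cbind(structure_factor, intercept = 1)
LF1 <- cbind(LF0, trait)

# Genotype as (allele count, 2 - allele count) binomial response
y <- cbind(x, 2 - x)
fit0 <- glm(y ~ LF0 - 1, family = binomial())  # null model
fit1 <- glm(y ~ LF1 - 1, family = binomial())  # alternative model

# Delta deviance: how much the extra column reduces the deviance
delta_deviance <- deviance(fit0) - deviance(fit1)
delta_deviance
```

Because the null model is nested in the alternative, the resulting value should be non-negative up to numerical tolerance. Repeating this fit for every row of a genotype matrix would give one value per locus, which is the shape of output described above.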
library(lfa)
library(gcatest) # provides sim_geno, sim_trait and delta_deviance_lf

# make example data smaller so example is fast
# goes from 1000 to 100 individuals
indexes <- sample.int( ncol(sim_geno), 100 )
sim_geno <- sim_geno[ , indexes ]
sim_trait <- sim_trait[ indexes ]

# now run LFA and get delta deviances for trait assoc
# (recapitulating `gcat.stat` in this case)
LF <- lfa(sim_geno, 3)
LF0 <- LF # structure is null
LF1 <- cbind(LF, sim_trait) # trait is alt
devdiff_assoc <- delta_deviance_lf(sim_geno, LF0, LF1)

# can instead do delta deviances for structure only
LF0 <- cbind(rep.int(1, ncol(sim_geno))) # intercept only is null
LF1 <- LF # structure is alt, no trait
devdiff_struc <- delta_deviance_lf(sim_geno, LF0, LF1)
Performs the GCAT association test between SNPs and trait, returning p-values.
gcat(X, LF, trait, adjustment = NULL)
gcatest(X, LF, trait, adjustment = NULL)
gcat.stat(X, LF, trait, adjustment = NULL)
X |
A matrix of SNP genotypes, i.e. an integer matrix of 0's,
1's, 2's and |
LF |
matrix of logistic factors from |
trait |
vector |
adjustment |
matrix of adjustment variables |
vector of p-values
gcatest(): Alias of gcat
gcat.stat(): returns the association statistics instead of the p-value.
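The relationship between a statistic and its p-value can be sketched in base R. This assumes the statistic is a delta deviance referred to a chi-squared distribution with one degree of freedom (the alternative model adds one trait column to the null); that assumption, and the hypothetical `stat` values below, are illustrative and not taken from the package output.

```r
# Hypothetical association statistics for five loci (illustrative values only)
stat <- c(0.02, 1.5, 3.84, 12.1, 30.0)

# Upper-tail chi-squared probability with 1 degree of freedom (assumed df)
p_values <- pchisq(stat, df = 1, lower.tail = FALSE)

# Larger statistics map to smaller p-values
data.frame(stat = stat, p_value = p_values)
```

Under this assumption a statistic near 3.84 corresponds to a p-value near 0.05, which gives a rough sense of scale when reading the statistics directly.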
Song, M, Hao, W, Storey, JD (2015). Testing for genetic associations in arbitrarily structured populations. Nat. Genet., 47, 5:550-4.
library(lfa)
library(gcatest) # provides sim_geno, sim_trait, gcat and gcat.stat

# make example data smaller so example is fast
# goes from 1000 to 100 individuals
indexes <- sample.int( ncol(sim_geno), 100 )
sim_geno <- sim_geno[ , indexes ]
sim_trait <- sim_trait[ indexes ]

# now run LFA and GCATest
LF <- lfa(sim_geno, 3)
gcat_p <- gcat(sim_geno, LF, sim_trait)
gcat_stat <- gcat.stat(sim_geno, LF, sim_trait)
10,000 SNPs, 1,000 individuals, first five SNPs are associated.
sim_geno
a matrix of 0's, 1's and 2's for the genotypes
simulated genotype matrix
10,000 SNPs, 1,000 individuals, first five SNPs are associated.
sim_trait
a vector of traits
simulated traits