Title: | GWAS Incorporating Networks |
---|---|
Description: | martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork. |
Authors: | Hector Climente-Gonzalez [aut, cre] , Chloe-Agathe Azencott [aut] |
Maintainer: | Hector Climente-Gonzalez <[email protected]> |
License: | GPL-3 |
Version: | 1.27.0 |
Built: | 2024-11-04 06:06:09 UTC |
Source: | https://github.com/bioc/martini |
Creates a network of SNPs where each SNP is connected as in the GM network and, in addition, to all the other SNPs pertaining to any interactor of the gene it is mapped to. Corresponds to the gene-interaction (GI) network described by Azencott et al.
get_GI_network( gwas, organism = 9606, snpMapping = snp2ensembl(gwas, organism), ppi = get_gxg("biogrid", organism, flush), col_ppi = c("gene1", "gene2"), col_genes = c("snp", "gene"), flush = FALSE )
gwas |
A SnpMatrix object with the GWAS information. |
organism |
Tax ID of the studied organism. The default is 9606 (human). |
snpMapping |
A data.frame informing how SNPs map to genes. It contains at least two columns: the SNP id and a gene it maps to. Each row corresponds to one gene-SNP mapping. Unless column names are specified using col_genes, the columns must be named "snp" and "gene" (the defaults of col_genes). |
ppi |
A data.frame describing protein-protein interactions with at least two columns. Gene ids must be the ones contained in snpMapping. Unless column names are specified using col_ppi, the columns must be named "gene1" and "gene2" (the defaults of col_ppi). |
col_ppi |
Optional, length-2 character vector with the names of the two columns involving the protein-protein interactions. |
col_genes |
Optional, length-2 character vector with the names of the two columns involving the SNP-gene mapping. The first element is the column of the SNP, and the second is the column of the gene. |
flush |
Remove cached results? Boolean value. |
An igraph network of the GI network of the SNPs.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
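The edges that the GI network adds on top of the GM network can be sketched in base R as two joins between the interaction table and the SNP-gene mapping. The toy data, the helper name gi_edges and the edge-list output are assumptions made for illustration; get_GI_network itself returns an igraph object and may be implemented differently.

```r
# Hypothetical inputs, using the default column names from the usage above.
snp_mapping <- data.frame(
  snp  = c("rs1", "rs2", "rs3", "rs4"),
  gene = c("A", "A", "B", "C"),
  stringsAsFactors = FALSE
)
ppi <- data.frame(gene1 = "A", gene2 = "B", stringsAsFactors = FALSE)

# Pair every SNP of gene1 with every SNP of gene2, for each interaction.
gi_edges <- function(snp_mapping, ppi) {
  left <- merge(ppi, snp_mapping, by.x = "gene1", by.y = "gene")
  names(left)[names(left) == "snp"] <- "from"
  both <- merge(left, snp_mapping, by.x = "gene2", by.y = "gene")
  names(both)[names(both) == "snp"] <- "to"
  both[, c("from", "to")]
}

gi_extra <- gi_edges(snp_mapping, ppi)
print(gi_extra)
```

In this toy case the two SNPs of gene A are each linked to the single SNP of gene B, while the SNP of gene C, which has no interactor, gains no edges.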
Creates a network of SNPs where each SNP is connected as in the GS network and, in addition, to all the other SNPs pertaining to the same gene. Corresponds to the gene membership (GM) network described by Azencott et al.
get_GM_network( gwas, organism = 9606, snpMapping = snp2ensembl(gwas, organism), col_genes = c("snp", "gene") )
gwas |
A SnpMatrix object with the GWAS information. |
organism |
Tax ID of the studied organism. The default is 9606 (human). |
snpMapping |
A data.frame informing how SNPs map to genes. It contains at least two columns: the SNP id and a gene it maps to. Each row corresponds to one gene-SNP mapping. Unless column names are specified using col_genes, the columns must be named "snp" and "gene" (the defaults of col_genes). |
col_genes |
Optional, length-2 character vector with the names of the two columns involving the SNP-gene mapping. The first element is the column of the SNP, and the second is the column of the gene. |
An igraph network of the GM network of the SNPs.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
get_GM_network(minigwas, snpMapping = minisnpMapping)
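The gene-membership rule, which links all SNPs mapped to the same gene, can be sketched in base R as follows. The toy mapping and the helper name gm_edges are assumptions for illustration; the sketch only produces the edges added on top of the GS network and returns a plain edge list rather than an igraph object.

```r
# Hypothetical SNP-to-gene mapping, using the default column names.
snp_mapping <- data.frame(
  snp  = c("rs1", "rs2", "rs3", "rs4"),
  gene = c("A", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# All unordered pairs of SNPs mapped to the same gene.
gm_edges <- function(snp_mapping) {
  per_gene <- lapply(split(snp_mapping$snp, snp_mapping$gene), function(ids) {
    ids <- unique(ids)
    if (length(ids) < 2) return(NULL)  # a single SNP yields no edge
    pairs <- t(combn(ids, 2))
    data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
  })
  do.call(rbind, unname(per_gene))
}

gm_extra <- gm_edges(snp_mapping)
print(gm_extra)
```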
Creates a network of SNPs where each SNP is connected to its adjacent SNPs in the genome sequence. Corresponds to the genomic sequence (GS) network described by Azencott et al.
get_GS_network(gwas)
gwas |
A SnpMatrix object with the GWAS information. |
An igraph network of the GS network of the SNPs.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
get_GS_network(minigwas)
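The adjacency rule behind the GS network can be sketched in base R without martini or igraph. The map data.frame, its column names (chr, snp, pos) and the helper gs_edges are assumptions made for illustration; the package's internal representation may differ.

```r
# Hypothetical map: one row per SNP with chromosome, id and position.
map <- data.frame(
  chr = c(1, 1, 1, 2, 2),
  snp = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  pos = c(100, 250, 300, 50, 75),
  stringsAsFactors = FALSE
)

# Link each SNP to the next one along the same chromosome.
gs_edges <- function(map) {
  map <- map[order(map$chr, map$pos), ]
  per_chr <- lapply(split(map$snp, map$chr), function(ids) {
    data.frame(from = head(ids, -1), to = tail(ids, -1),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, unname(per_chr))
}

gs <- gs_edges(map)
print(gs)
```

SNPs on different chromosomes are never linked here, so each chromosome forms its own chain.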
Takes a map file and:
column 1: Used as the chromosome column in the BED file.
column 4: Used as start and end in the BED data.frame (as we work with SNPs).
gwas2bed(gwas)
gwas |
A SnpMatrix object with the GWAS information. |
A BED data.frame.
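The column mapping described above can be sketched in base R. The toy map (laid out like a PLINK map file, with the chromosome in column 1 and the position in column 4), the helper name map2bed and the output column names are assumptions for illustration.

```r
# Hypothetical map: chromosome, SNP id, genetic distance, position.
map <- data.frame(
  chr = c(1, 1, 2),
  snp = c("rs1", "rs2", "rs3"),
  cm  = c(0, 0, 0),
  pos = c(100, 250, 50),
  stringsAsFactors = FALSE
)

# Column 1 becomes the chromosome; column 4 is used as both start and end,
# because a SNP spans a single position.
map2bed <- function(map) {
  data.frame(chr = map[[1]], start = map[[4]], end = map[[4]])
}

bed <- map2bed(map)
print(bed)
```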
Include linkage disequilibrium information in the SNP network. The weight of the edges will be lower the higher the linkage is.
ldweight_edges(net, ld, method = "inverse")
net |
A SNP network. |
ld |
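A matrix of pairwise linkage-disequilibrium values between SNPs, such as the one returned by snpStats::ld. |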
method |
How to incorporate linkage-disequilibrium values into the network. |
A copy of net where the edges are weighted according to linkage disequilibrium.
ld <- snpStats::ld(minigwas[['genotypes']], depth = 2, stats = "R.squared")
# don't weight edges for which LD cannot be calculated
ld[is.na(ld)] <- 0
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
ldGi <- ldweight_edges(gi, ld)
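One way to make edge weights decrease as linkage grows is sketched below in base R. The formula 1 / (1 + ld) is an assumption chosen for illustration, not necessarily what method = "inverse" computes; the toy LD matrix, the edge list and the helper inverse_ld_weight are also hypothetical.

```r
# Hypothetical LD matrix (e.g. r-squared) with SNP ids as dimnames.
ids <- c("rs1", "rs2", "rs3")
ld <- matrix(0, 3, 3, dimnames = list(ids, ids))
ld["rs1", "rs2"] <- 0.8
ld["rs2", "rs3"] <- 0.1

edges <- data.frame(from = c("rs1", "rs2"), to = c("rs2", "rs3"),
                    stringsAsFactors = FALSE)

# Assumed weighting: the higher the LD of an edge, the lower its weight.
inverse_ld_weight <- function(edges, ld) {
  r2 <- ld[cbind(edges$from, edges$to)]  # look up LD for each edge
  edges$weight <- 1 / (1 + r2)
  edges
}

weighted <- inverse_ld_weight(edges, ld)
print(weighted)
```

With this choice an edge with no measurable LD keeps weight 1, which is consistent with setting missing LD values to 0 in the example above.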
Run the maxflow algorithm.
maxflow(A, As, At)
A |
A sparse matrix with the connectivity. |
As |
A vector containing the edges to the source. |
At |
A vector containing the edges to the sink. |
A list with a vector indicating whether each feature was selected, and the objective score.
Run the mincut algorithm.
mincut_c(c, eta, lambda, W)
c |
A vector with the association of each SNP with the phenotype. |
eta |
A numeric with the value of the eta parameter. |
lambda |
A numeric with the value of the lambda parameter. |
W |
A sparse matrix with the connectivity. |
A list with a vector indicating whether each feature was selected, and the objective score.
Small GWAS example.
A list with 3 items:
Genotype and phenotype information.
Simulated network.
Result of running find_cones with gwas and net.
data(minigwas)
# access different elements
minigwas[["genotypes"]]
minigwas[["map"]]
minigwas[["fam"]]
data.frame describing pairs of proteins that interact for minigwas.
data(minippi)
head(minippi)
data.frame that maps SNPs from minigwas to their gene.
data(minisnpMapping)
head(minisnpMapping)
Create a circular ideogram of the network results using the circlize package (Gu et al., 2014).
plot_ideogram(gwas, net, covars = data.frame(), genome = "hg19")
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
genome |
Abbreviations of the genome to use: hg19 for human (default), mm10 for mouse, etc. |
A circular ideogram, including the manhattan plot, and the interactions between the selected SNPs.
Gu, Z., Gu, L., Eils, R., Schlesner, M., & Brors, B. (2014). circlize implements and enhances circular visualization in R. Bioinformatics, 30(19), 2811-2812. https://doi.org/10.1093/bioinformatics/btu393
Finds the SNPs maximally associated with a phenotype while being connected in an underlying network.
scones( gwas, net, eta, lambda, covars = data.frame(), score = c("chi2", "glm", "r2"), family = c("binomial", "poisson", "gaussian", "gamma"), link = c("logit", "log", "identity", "inverse") )
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
eta |
Value of the eta parameter. |
lambda |
Value of the lambda parameter. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
score |
Association score to measure association between genotype and phenotype. Possible values: chi2 (default), glm, r2. |
family |
A string defining the generalized linear model family. This should match one of "binomial", "poisson", "gaussian" or "gamma". See snp.rhs.tests for details. |
link |
A string defining the link function for the GLM. This should match one of "logit", "log", "identity" or "inverse". See snp.rhs.tests for details. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
scones(minigwas, gi, 10, 1)
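To see how eta and lambda shape the selection, the SConES objective from Azencott et al. (2013), c'f - lambda * f'Lf - eta * sum(f) with L the graph Laplacian and f a 0/1 indicator vector, can be maximized by brute force on a tiny network. The base R sketch below is for intuition only: the scores and the network are made up, the helper names are hypothetical, and exhaustive enumeration is exponential in the number of SNPs, whereas the package relies on a graph cut.

```r
# Toy problem: 4 SNPs on a path 1-2-3-4 (hypothetical scores and network).
c_scores <- c(5, 4, 0.5, 3)
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)               # symmetric adjacency matrix
L <- diag(rowSums(W)) - W   # graph Laplacian

# Association minus connectivity penalty minus sparsity penalty.
objective <- function(f, c_scores, L, eta, lambda) {
  sum(c_scores * f) - lambda * as.numeric(t(f) %*% L %*% f) - eta * sum(f)
}

# Score every possible 0/1 selection and keep the best one.
brute_force_scones <- function(c_scores, L, eta, lambda) {
  d <- length(c_scores)
  candidates <- as.matrix(expand.grid(rep(list(0:1), d)))
  scores <- apply(candidates, 1, objective,
                  c_scores = c_scores, L = L, eta = eta, lambda = lambda)
  as.logical(candidates[which.max(scores), ])
}

with_network    <- brute_force_scones(c_scores, L, eta = 1, lambda = 1)
without_network <- brute_force_scones(c_scores, L, eta = 1, lambda = 0)
print(with_network)
print(without_network)
```

With lambda = 0 the third SNP is dropped because its score is below eta; with lambda = 1 it is kept, because excluding it would cut two edges of the path.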
Finds the features maximally associated with a phenotype while being connected in an underlying network.
scones_(X, y, featnames, net, eta, lambda)
X |
n x d design matrix |
y |
Vector of length n with the outcomes |
featnames |
Vector of length d with the feature names |
net |
An igraph network that connects the SNPs. |
eta |
Value of the eta parameter. |
lambda |
Value of the lambda parameter. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
X <- as(minigwas[['genotypes']], 'numeric')
X <- X + matrix(rnorm(2500, sd = 0.1), nrow(X), ncol(X))
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
scones_(X, minigwas[['fam']]$affected, minigwas[['map']]$snp, gi, 10, 1)
Finds the SNPs maximally associated with a phenotype while being connected in an underlying network. Select the hyperparameters by cross-validation.
scones.cv( gwas, net, covars = data.frame(), score = c("chi2", "glm", "r2"), criterion = c("stability", "bic", "aic", "aicc", "global_clustering", "local_clustering"), etas = numeric(), lambdas = numeric(), family = c("binomial", "poisson", "gaussian", "gamma"), link = c("logit", "log", "identity", "inverse"), max_prop_snp = 0.5 )
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
score |
Association score to measure association between genotype and phenotype. Possible values: chi2 (default), glm, r2. |
criterion |
String with the function to measure the quality of a split. |
etas |
Numeric vector with the etas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
lambdas |
Numeric vector with the lambdas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
family |
A string defining the generalized linear model family. This should match one of "binomial", "poisson", "gaussian" or "gamma". See snp.rhs.tests for details. |
link |
A string defining the link function for the GLM. This should match one of "logit", "log", "identity" or "inverse". See snp.rhs.tests for details. |
max_prop_snp |
Maximum proportion of SNPs accepted in the model (between 0 and 1). Larger solutions will be discarded. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
scones.cv(minigwas, gi)
scones.cv(minigwas, gi, score = "glm")
Finds the features maximally associated with a phenotype while being connected in an underlying network. Select the hyperparameters by cross-validation.
scones.cv_(X, y, featnames, net)
X |
n x d design matrix |
y |
Vector of length n with the outcomes |
featnames |
Vector of length d with the feature names |
net |
An igraph network that connects the SNPs. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
X <- as(minigwas[['genotypes']], 'numeric')
X <- X + matrix(rnorm(2500, sd = 0.1), nrow(X), ncol(X))
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
scones.cv_(X, minigwas[['fam']]$affected, minigwas[['map']]$snp, gi)
Finds the SNPs maximally associated with a phenotype while being connected in an underlying network (Azencott et al., 2013).
search_cones( gwas, net, encoding = "additive", sigmod = FALSE, covars = data.frame(), associationScore = c("chi2", "glm"), modelScore = c("stability", "bic", "aic", "aicc", "global_clustering", "local_clustering"), etas = numeric(), lambdas = numeric() )
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
encoding |
SNP encoding (unused argument). |
sigmod |
Boolean. If TRUE, use the Sigmod variant of SConES, meant to prioritize tightly connected clusters of SNPs. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
associationScore |
Association score to measure association between genotype and phenotype. |
modelScore |
String with the function to measure the quality of a split. |
etas |
Numeric vector with the etas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
lambdas |
Numeric vector with the lambdas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
Azencott, C. A., Grimm, D., Sugiyama, M., Kawahara, Y., & Borgwardt, K. M. (2013). Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics, 29(13), 171-179. https://doi.org/10.1093/bioinformatics/btt238
## Not run:
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
search_cones(minigwas, gi)
search_cones(minigwas, gi, encoding = "recessive")
search_cones(minigwas, gi, associationScore = "skat")
## End(Not run)
Finds the SNPs maximally associated with a phenotype while being connected in an underlying network.
sigmod( gwas, net, eta, lambda, covars = data.frame(), score = c("chi2", "glm", "r2"), family = c("binomial", "poisson", "gaussian", "gamma"), link = c("logit", "log", "identity", "inverse") )
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
eta |
Value of the eta parameter. |
lambda |
Value of the lambda parameter. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
score |
Association score to measure association between genotype and phenotype. Possible values: chi2 (default), glm, r2. |
family |
A string defining the generalized linear model family. This should match one of "binomial", "poisson", "gaussian" or "gamma". See snp.rhs.tests for details. |
link |
A string defining the link function for the GLM. This should match one of "logit", "log", "identity" or "inverse". See snp.rhs.tests for details. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
Liu, Y., Brossard, M., Roqueiro, D., Margaritte-Jeannin, P., Sarnowski, C., Bouzigon, E., Demenais, F. (2017). SigMod: an exact and efficient method to identify a strongly interconnected disease-associated module in a gene network. Bioinformatics, 33(10), 1536–1544. https://doi.org/10.1093/bioinformatics/btx004
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
sigmod(minigwas, gi, 10, 1)
Finds the features maximally associated with a phenotype while being connected in an underlying network.
sigmod_(X, y, featnames, net, eta, lambda)
X |
n x d design matrix |
y |
Vector of length n with the outcomes |
featnames |
Vector of length d with the feature names |
net |
An igraph network that connects the SNPs. |
eta |
Value of the eta parameter. |
lambda |
Value of the lambda parameter. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
X <- as(minigwas[['genotypes']], 'numeric')
X <- X + matrix(rnorm(2500, sd = 0.1), nrow(X), ncol(X))
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
sigmod_(X, minigwas[['fam']]$affected, minigwas[['map']]$snp, gi, 10, 1)
Finds the SNPs maximally associated with a phenotype while being connected in an underlying network. Select the hyperparameters by cross-validation.
sigmod.cv( gwas, net, covars = data.frame(), score = c("chi2", "glm", "r2"), criterion = c("stability", "bic", "aic", "aicc", "global_clustering", "local_clustering"), etas = numeric(), lambdas = numeric(), family = c("binomial", "poisson", "gaussian", "gamma"), link = c("logit", "log", "identity", "inverse"), max_prop_snp = 0.5 )
gwas |
A SnpMatrix object with the GWAS information. |
net |
An igraph network that connects the SNPs. |
covars |
A data frame with the covariates. It must contain a column 'sample' containing the sample IDs, and an additional column for each covariate. |
score |
Association score to measure association between genotype and phenotype. Possible values: chi2 (default), glm, r2. |
criterion |
String with the function to measure the quality of a split. |
etas |
Numeric vector with the etas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
lambdas |
Numeric vector with the lambdas to explore in the grid search. If omitted, it is automatically created based on the association scores. |
family |
A string defining the generalized linear model family. This should match one of "binomial", "poisson", "gaussian" or "gamma". See snp.rhs.tests for details. |
link |
A string defining the link function for the GLM. This should match one of "logit", "log", "identity" or "inverse". See snp.rhs.tests for details. |
max_prop_snp |
Maximum proportion of SNPs accepted in the model (between 0 and 1). Larger solutions will be discarded. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
Liu, Y., Brossard, M., Roqueiro, D., Margaritte-Jeannin, P., Sarnowski, C., Bouzigon, E., Demenais, F. (2017). SigMod: an exact and efficient method to identify a strongly interconnected disease-associated module in a gene network. Bioinformatics, 33(10), 1536–1544. https://doi.org/10.1093/bioinformatics/btx004
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
sigmod.cv(minigwas, gi)
sigmod.cv(minigwas, gi, score = "glm")
Finds the features maximally associated with a phenotype while being connected in an underlying network. Select the hyperparameters by cross-validation.
sigmod.cv_(X, y, featnames, net)
X |
n x d design matrix |
y |
Vector of length n with the outcomes |
featnames |
Vector of length d with the feature names |
net |
An igraph network that connects the SNPs. |
A copy of the SnpMatrix$map data.frame, with the following additions:
c: contains the univariate association score for every single SNP.
selected: logical vector indicating if the SNP was selected by SConES or not.
module: integer with the number of the module the SNP belongs to.
X <- as(minigwas[['genotypes']], 'numeric')
X <- X + matrix(rnorm(2500, sd = 0.1), nrow(X), ncol(X))
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
sigmod.cv_(X, minigwas[['fam']]$affected, minigwas[['map']]$snp, gi)
Randomly selects interconnected genes as causal, then selects a proportion of the SNPs in those genes as causal.
simulate_causal_snps(net, ngenes = 20, pcausal = 1)
net |
An igraph gene-interaction (GI) network that connects the SNPs. |
ngenes |
Number of causal genes. |
pcausal |
Number between 0 and 1, proportion of the SNPs in causal genes that are causal themselves. |
A vector with the ids of the simulated causal SNPs.
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
simulate_causal_snps(gi, ngenes=2)
Simulates a phenotype from a GWAS experiment and a specified set of causal SNPs. If the data is qualitative, only controls are used.
simulate_phenotype( gwas, snps, h2, model = "additive", effectSize = rnorm(length(snps)), qualitative = FALSE, ncases, ncontrols, prevalence )
gwas |
A SnpMatrix object with the GWAS information. |
snps |
Character vector with the SNP ids of the causal SNPs. Must match SNPs in gwas[["map"]][["snp.name"]]. |
h2 |
Heritability of the phenotype (between 0 and 1). |
model |
String specifying the genetic model under the phenotype. Accepted values: "additive". |
effectSize |
Numeric vector with the same length as the number of causal SNPs. It indicates the effect size of each of the SNPs; if absent, they are sampled from a normal distribution. |
qualitative |
Bool indicating if the phenotype is qualitative or not (quantitative). |
ncases |
Integer specifying the number of cases to simulate in a qualitative phenotype. Required if qualitative = TRUE. |
ncontrols |
Integer specifying the number of controls to simulate in a qualitative phenotype. Required if qualitative = TRUE. |
prevalence |
Value between 0 and 1 specifying the population prevalence of the disease. Note that ncases cannot be greater than prevalence * number of samples. Required if qualitative = TRUE. |
A copy of the GWAS experiment with the new phenotypes in gwas[["fam"]][["affected"]].
Inspired by the GCTA simulation tool: http://cnsgenomics.com/software/gcta/Simu.html.
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
causal <- simulate_causal_snps(gi, ngenes = 2)
simulate_phenotype(minigwas, causal, h2 = 1)
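The role of h2 in a quantitative simulation can be sketched in base R following the GCTA convention, where the residual variance equals the genetic variance times (1 / h2 - 1). The genotypes, the effect sizes and the helper simulate_quantitative are assumptions for illustration; unlike GCTA, the sketch does not standardize the genotypes, and it is not claimed to match the internals of simulate_phenotype.

```r
set.seed(1)

# Hypothetical genotypes: 5000 samples, 3 causal SNPs coded 0/1/2.
n <- 5000
X <- matrix(rbinom(n * 3, size = 2, prob = 0.3), nrow = n)
effects <- c(1, -0.5, 0.8)

# Additive model: phenotype = genetic value + Gaussian noise, with the
# noise variance chosen so that var(genetic) / var(phenotype) is about h2.
simulate_quantitative <- function(X, effects, h2) {
  g <- as.numeric(X %*% effects)
  residual_var <- var(g) * (1 / h2 - 1)
  e <- rnorm(length(g), mean = 0, sd = sqrt(residual_var))
  g + e
}

g <- as.numeric(X %*% effects)
y_half <- simulate_quantitative(X, effects, h2 = 0.5)
y_full <- simulate_quantitative(X, effects, h2 = 1)

# Realized proportion of phenotypic variance explained by genetics.
print(var(g) / var(y_half))
```

With h2 = 1 the residual variance is zero, so the phenotype is fully determined by the causal SNPs, as in the example above.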
Returns the nodes matching some condition.
subvert(net, attr, values, affirmative = TRUE)
net |
An igraph network. |
attr |
An attribute of the vertices. |
values |
Possible values of attr. |
affirmative |
Logical. States if a condition must be its affirmation (e.g. all nodes with gene name "X"), or its negation (all nodes not with gene name "X"). |
The vertices with attribute equal to any of the values in values.
gi <- get_GI_network(minigwas, snpMapping = minisnpMapping, ppi = minippi)
martini:::subvert(gi, "gene", "A")
martini:::subvert(gi, "name", c("1A1", "1A3"))
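The effect of the affirmative flag can be sketched in base R on a plain table of vertex attributes. The toy table and the helper select_vertices are assumptions for illustration; subvert itself operates on an igraph network.

```r
# Hypothetical vertex attributes: SNP name and the gene it maps to.
vertices <- data.frame(
  name = c("1A1", "1A2", "2B1"),
  gene = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Return the names of the vertices whose attribute matches (or, with
# affirmative = FALSE, does not match) any of the given values.
select_vertices <- function(vertices, attr, values, affirmative = TRUE) {
  hit <- vertices[[attr]] %in% values
  if (!affirmative) hit <- !hit
  vertices$name[hit]
}

in_gene_a     <- select_vertices(vertices, "gene", "A")
not_in_gene_a <- select_vertices(vertices, "gene", "A", affirmative = FALSE)
print(in_gene_a)
print(not_in_gene_a)
```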
Wrap design matrix and outcome vector into a pseudo SnpMatrix object.
wrap_Xy(X, y, featnames, net)
X |
n x d design matrix |
y |
Vector of length n with the outcomes |
featnames |
Vector of length d with the feature names |
net |
An igraph network that connects the SNPs. |
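A rough idea of what such a wrapper could produce is sketched below in base R. The layout is an assumption that mirrors the genotypes, map and fam elements of minigwas, with the snp.name and affected columns mentioned elsewhere in this manual; the helper wrap_sketch is hypothetical, ignores the net argument, and is not the package's implementation.

```r
# Hypothetical data: 4 samples, 3 features.
X <- matrix(c(0, 1, 2, 1,
              2, 0, 1, 1,
              0, 0, 1, 2), nrow = 4)
y <- c(1, 2, 2, 1)
featnames <- c("f1", "f2", "f3")

# Assumed layout: a list with the same element names as minigwas.
wrap_sketch <- function(X, y, featnames) {
  stopifnot(nrow(X) == length(y), ncol(X) == length(featnames))
  colnames(X) <- featnames
  list(
    genotypes = X,
    map = data.frame(snp.name = featnames, stringsAsFactors = FALSE),
    fam = data.frame(affected = y)
  )
}

pseudo_gwas <- wrap_sketch(X, y, featnames)
str(pseudo_gwas)
```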