Title: | Gene set analysis methods for SNP association p-values that lie in genes in given gene sets |
---|---|
Description: | Gene set analysis methods exist to combine SNP-level association p-values into gene sets, calculating a single association p-value for each gene set. This package implements two such methods that require only the calculated SNP p-values, the gene set(s) of interest, and a correlation matrix (if desired). One method (GLOSSI) requires independent SNPs and the other (VEGAS) can take into account correlation (LD) among the SNPs. Built-in plotting functions are available to help users visualize results. |
Authors: | Caitlin McHugh, Jessica Larson, and Jason Hackney |
Maintainer: | Caitlin McHugh <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.39.0 |
Built: | 2024-12-03 06:32:39 UTC |
Source: | https://github.com/bioc/cpvSNP |
Plots association test p-values, highlighting p-values within gene sets.
assocPvalBySetPlot(pvals, set, title = "", geneSetColor = "red", xlab = expression(-log[10]~p-value), ylab = "Density", xlim = NULL, ylim = NULL, ...)
assocPvalBySetPlot(pvals, set, title = "", geneSetColor = "red", xlab = expression(-log[10]~p-value), ylab = "Density", xlim = NULL, ylim = NULL, ...)
pvals |
A numerical vector of p-values with names corresponding to elements listed in the geneIds slot in the set object. |
set |
An object of class |
title |
A character string containing the title of the plot. Default is "", a blank title. |
geneSetColor |
A character vector holding the color the
highlighted SNPs in the set object should be. Default is
|
xlab |
A character vector with the label for the x-axis on
the plot. Default is |
ylab |
A character vector holding the label for the y-axis
on the plot. Default is |
xlim |
A vector of two numeric values with the limits of plotting on the x-axis. By default, the range of the x-axis values will be determined from the data. |
ylim |
A vector with two numeric values used as limits for plotting on the y-axis. By default, the range of the y-axis values will be determined from the data. |
... |
Further arguments to be passed to the plotting methods, such as graphical parameters. |
Creates a density plot of all -log10(p-value), and overlaying a density plot of -log10(p-value) for SNPs within a gene set of interest.
Creates a plot.
Jason Hackney, Jessica Larson, Caitlin McHugh [email protected]
Creates a GRanges object used for SNP set analysis.
createArrayData(arrayData, positionName = NULL, chromosomeName = "chromosome", chromosomeNameConvention = "NCBI", verbose = TRUE)
createArrayData(arrayData, positionName = NULL, chromosomeName = "chromosome", chromosomeNameConvention = "NCBI", verbose = TRUE)
arrayData |
A data.frame containing array data from
which the |
positionName |
The name of the column in the filepathData object that holds position data for each probe. By default, this value is NULL and there are columns Start and End which hold this information. |
chromosomeName |
The name of the column in the filepathData
object that holds chromosome data for each probe. By default,
this value is |
chromosomeNameConvention |
The naming convention used for the
chromosomes, either |
verbose |
A logical argument indicating whether output should be printed. The default is FALSE. |
This function takes a data.frame and creates a GRanges object used for SNP set analysis.
A GRanges object.
Jason Hackney, Jessica Larson, Caitlin McHugh [email protected]
data(geneSetAnalysis) head(geneSetAnalysis$arrayData) arrayDataGR <- createArrayData(geneSetAnalysis[["arrayData"]], positionName="Position")
data(geneSetAnalysis) head(geneSetAnalysis$arrayData) arrayDataGR <- createArrayData(geneSetAnalysis[["arrayData"]], positionName="Position")
degreesOfFreedom
~~This function returns the degrees of freedom used to calculate p-values from the corresponding GeneSet
or a list of degrees of freedom for a GeneSetCollection
.
degreesOfFreedom(object)
degreesOfFreedom(object)
object |
An object of type |
Defined methods include:
signature(object = "GeneSetResult")
Returns the degrees of freedom for a GeneSetResult
object
signature(object = "VEGASResult")
Returns the degrees of freedom for a VEGASResult
object
signature(object = "GLOSSIResult")
Returns the degrees of freedom for a GLOSSIResult
object
signature(object = "GeneSetResultCollection")
Returns a list of degrees of freedom for a GeneSetResultCollection
objects (1 for each set)
signature(object = "VEGASResultCollection")
Returns a list of degrees of freedom for a VEGASResultCollection
objects (1 for each set)
signature(object = "GLOSSIResultCollection")
Returns a list of degrees of freedom for a GLOSSIResultCollection
objects (1 for each set)
Returns an integer or a list of integers indicating the degrees of freedom in the corresponding object.
Caitlin McHugh
VEGASResult
-class, pValue
## Not run: degreesOfFreedom( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
## Not run: degreesOfFreedom( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
This dataset provides simulated association p-values for an example set of SNP rsIDs, including their chromosome and position along the chromosome. The data in this file are all SNPs that lie on chromosome 1.
exampleArrayData
exampleArrayData
A data.frame containing 119,487 rows and the following 4 columns:
P | simulated p-values |
SNP | SNP rsIDs |
Position | SNP position |
chromosome | SNP chromosome |
A list including the following components:
arrayData
– a data.frame of dimension 119487 rows and 4 columns including P-values, SNP ids, position and chromosome for a GWAS experiment;
geneSets
– a GeneSetCollection containing gene Entrez ids;
ldMat
– a matrix of dimension 871 x 871 holding linkage disequilibrium values;
indepSNPs
– a list containing 302 independent SNP ids;
geneSetAnalysis
geneSetAnalysis
data(geneSetAnalysis) head(geneSetAnalysis$arrayData)
data(geneSetAnalysis) head(geneSetAnalysis$arrayData)
geneSetName
~~This function returns the gene set name for a specified GeneSet
or a list of gene set names for a GeneSetCollection
.
geneSetName(object)
geneSetName(object)
object |
An object of type |
Defined methods include:
signature(object = "GeneSetResult")
Returns the gene set name for a specified GeneSetResult
object
signature(object = "VEGASResult")
Returns the gene set name for a specified VEGASResult
object
signature(object = "GLOSSIResult")
Returns the gene set name for a specified GLOSSIResult
object
signature(object = "GeneSetResultCollection")
Returns a list of gene set names for a GeneSetResultCollection
object (1 for each set)
signature(object = "VEGASResultCollection")
Returns a list of gene set names for a VEGASResultCollection
object (1 for each set)
signature(object = "GLOSSIResultCollection")
Returns a list of gene set names for a GLOSSIResultCollection
object (1 for each set)
Returns the name or a list of names of the gene sets in the corresponding object.
Caitlin McHugh
VEGASResult
-class, pValue
## Not run: geneSetName( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
## Not run: geneSetName( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
"GeneSetResult"
Objects of this class store results from running GeneSet methods.
Objects can be created by calls of glossi
or vegas
.
geneSetName
:Object of class "character"
, the name of the geneSet
pValue
:Object of class "numeric"
, the p-value
degreesOfFreedom
:Object of class "integer"
, the degrees of freedom used to calculate the p-value
statistic
:Object of class "numeric"
, the test statistic for the set
No methods defined with class "GeneSetResult" in the signature.
In the code snippets below, object
is a GeneSetResult object.
geneSetName(object)
:The name of the gene set.
pValue(object)
:A numeric p-value.
degreesOfFreedom(object)
:Integer value of the degrees of freedom used in the statistical test.
statistic(object)
:A numeric value corresponding to the statistic calculated from the test performed.
Caitlin McHugh
GeneSetResultCollection
, GLOSSIResult
showClass("GeneSetResult")
showClass("GeneSetResult")
"GeneSetResultCollection"
Objects of this class store results from running gene set methods such as GLOSSI or VEGAS. GeneSetResultCollection objects contain a list of GeneSetResult objects (one result for each GeneSet)
Objects can be created by calls of glossi
or vegas
.
No methods defined with class "GeneSetResultCollection" in the signature.
In the code snippets below, object
is a GeneSetResultCollection object.
geneSetName(object)
:A list of the names of the gene sets.
pValue(object)
:A list of numeric p-values.
degreesOfFreedom(object)
:List of integer values of the degrees of freedom used in the statistical tests performed on each gene set.
statistic(object)
:A list of numeric values corresponding to the statistics calculated from the tests performed on each gene set.
Caitlin McHugh
GeneSetResult
, GLOSSIResultCollection
showClass("GeneSetResultCollection")
showClass("GeneSetResultCollection")
This gene set collection object provides four gene sets of varying sizes to be used as examples of how to run the cpvSNP
package functions.
geneSets
geneSets
A GeneSetCollection
object of length 4. The values stored in the geneIds
slot are Entrez ids. Gene sets 1-3 were randomly selected with sizes 100, 200 and 50, respectively. The fourth GeneSet holds genes in which the simulated p-values for corresponding SNPs are significant.
Translates a set of gene ids to their corresponding SNPs.
geneToSNPList(geneList, arrayData, genes, maxgap = 20000, quiet = TRUE)
geneToSNPList(geneList, arrayData, genes, maxgap = 20000, quiet = TRUE)
geneList |
An object of class |
arrayData |
A |
genes |
A |
maxgap |
An integer value indicating the flanking region around the gene for which SNPs are considered part of the gene. Default is 20000 (20kb). |
quiet |
A logical variable indicating whether warnings should be output from the SNP to gene mapping step. |
Translates a GeneSet or GeneSetCollection of gene Entrez ids to the corresponding SNPs that lie within a prespecified region of the gene.
A GeneSetCollection
object containing all SNP ids that
lie within the genes listed in geneList.
Jason Hackney, Jessica Larson, Caitlin McHugh [email protected]
data(geneSetAnalysis) head(geneSetAnalysis$arrayData) arrayDataGR <- createArrayData(geneSetAnalysis[["arrayData"]], positionName="Position") library(TxDb.Hsapiens.UCSC.hg19.knownGene) genesHg19 <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene) geneSets <- geneSetAnalysis[["geneSets"]] snpsGSC <- geneToSNPList(geneSets, arrayDataGR, genesHg19)
data(geneSetAnalysis) head(geneSetAnalysis$arrayData) arrayDataGR <- createArrayData(geneSetAnalysis[["arrayData"]], positionName="Position") library(TxDb.Hsapiens.UCSC.hg19.knownGene) genesHg19 <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene) geneSets <- geneSetAnalysis[["geneSets"]] snpsGSC <- geneToSNPList(geneSets, arrayDataGR, genesHg19)
Calculates the p-value representing the association of the set with the phenotype of interest, assuming all items in the set are independent, using the GLOSSI method.
glossi(pvals, snp.gsc)
glossi(pvals, snp.gsc)
pvals |
A numerical vector of p-values with names
corresponding to elements listed in the |
snp.gsc |
An object of class |
This function calculates a p-value for sets of SNPs that reside within a gene set collection. We calculate the chi- square p-values and statistic by applying Fisher's transformation to the observed p-values.
An object with the corresponding GLOSSI results. If
snp.gsc is a GeneSetCollection
(i.e., multiple sets of
interest), then the corresponding GLOSSIResultCollection
is returned. If snp.gsc is a GeneSet
, a GLOSSIResult
object will be returned.
Jason Hackney, Jessica Larson, Caitlin McHugh [email protected]
Chai, High-Seng and Sicotte, Hughes et al. GLOSSI: a method to assess the association of genetic loci-sets with complex diseases. BMC Bioinformatics, 2009.
set.seed(30) pvals <- runif(100) names(pvals) <- paste0("rs", 1:100) snpGS5 <- GeneSet(geneIds=names(pvals)[1:5], setName="set5") res <- glossi (pvals, snpGS5)
set.seed(30) pvals <- runif(100) names(pvals) <- paste0("rs", 1:100) snpGS5 <- GeneSet(geneIds=names(pvals)[1:5], setName="set5") res <- glossi (pvals, snpGS5)
A helper function to calculate the chi-squared statistic corresponding to an observed set of p-values.
glossiMarginal(pval, set)
glossiMarginal(pval, set)
pval |
A vector of p-values of length equal to the number
of geneIds in the |
set |
An element of a |
This function calculates a chi-squared statistic from a set of pvalues.
An object of type GLOSSIResult
.
Jason Hackney, Jessica Larson
set.seed(30) pvals <- runif(100) names(pvals) <- paste0("rs", 1:100) snpGS5 <- GeneSet(geneIds=names(pvals)[1:5], setName="set5") res <- glossiMarginal (pvals, snpGS5)
set.seed(30) pvals <- runif(100) names(pvals) <- paste0("rs", 1:100) snpGS5 <- GeneSet(geneIds=names(pvals)[1:5], setName="set5") res <- glossiMarginal (pvals, snpGS5)
"GLOSSIResult"
Objects of this class store results from running GLOSSI.
Objects can be created by calls of glossi
.
geneSetName
:Object of class "character"
, the name of the geneSet
pValue
:Object of class "numeric"
, the p-value
degreesOfFreedom
:Object of class "integer"
, the degrees of freedom used to calculate the p-value
statistic
:Object of class "numeric"
, the test statistic for the set
No methods defined with class "GLOSSIResult" in the signature.
In the code snippets below, object
is a GLOSSIResult object.
geneSetName(object)
:The name of the gene set.
pValue(object)
:A numeric p-value.
degreesOfFreedom(object)
:Integer value of the degrees of freedom used in the statistical test.
statistic(object)
:A numeric value corresponding to the statistic calculated from the test performed.
Caitlin McHugh
GLOSSIResultCollection
, VEGASResult
showClass("GLOSSIResult")
showClass("GLOSSIResult")
"GLOSSIResultCollection"
Objects of this class store results from running GLOSSI. GLOSSIResultCollection objects contain a list of GLOSSIResult objects (one result for each GeneSet)
Objects can be created by calls of glossi
.
No methods defined with class "GLOSSIResult" in the signature.
In the code snippets below, object
is a GLOSSIResultCollection object.
geneSetName(object)
:A list of the names of the gene sets.
pValue(object)
:A list of numeric p-values.
degreesOfFreedom(object)
:List of integer values of the degrees of freedom used in the statistical tests performed on each gene set.
statistic(object)
:A list of numeric values corresponding to the statistics calculated from the tests performed on each gene set.
Caitlin McHugh
GLOSSIResult
, VEGASResultCollection
showClass("GLOSSIResultCollection")
showClass("GLOSSIResultCollection")
Plots association test p-values against number of SNPs per gene set.
plotPvals(set, title = "", xlab = "SNPs per Gene Set", ylab = "PValue", ...)
plotPvals(set, title = "", xlab = "SNPs per Gene Set", ylab = "PValue", ...)
set |
An object of class |
title |
A character string containing the title of the plot. Default is "", a blank title. |
xlab |
A character vector with the label for the x-axis on
the plot. Default is |
ylab |
A character vector holding the label for the y-axis
on the plot. Default is |
... |
Further arguments to be passed to the plotting methods, such as graphical parameters. |
Creates a plot of all p-values against the number of SNPs in the gene set, for all sets within a collection.
Creates a plot.
Caitlin McHugh [email protected]
pValue
~~This function returns the calculated p-value for a specified GeneSet
or a list of p-values for a GeneSetCollection
.
pValue(object)
pValue(object)
object |
An object of type |
Defined methods include:
signature(object = "VEGASResult")
Returns the calculated p-value for a specified VEGASResult
object
signature(object = "GeneSetResult")
Returns the calculated p-value for a specified GeneSetResult
object
signature(object = "GLOSSIResult")
Returns the calculated p-value for a specified GLOSSIResult
object
signature(object = "VEGASResultCollection")
Returns a list of calculated p-values for a VEGASResultCollection
object (1 for each set)
signature(object = "GeneSetResultCollection")
Returns a list of calculated p-values for a GeneSetResultCollection
object (1 for each set)
signature(object = "GLOSSIResultCollection")
Returns a list of calculated p-values for a GLOSSIResultCollection
object (1 for each set)
Returns a decimal p-value corresponding to the gene set or a list of p-values corresponding to the gene set collection.
Caitlin McHugh
VEGASResult
-class, geneSetName
## Not run: pValue( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
## Not run: pValue( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
Simulates a value from a chi-squared distribution with a specified correlation structure.
simulate_chisq(corr_matrix, num_sims)
simulate_chisq(corr_matrix, num_sims)
corr_matrix |
A square matrix of correlations. |
num_sims |
An integer value for the number of simulations to be performed. |
This function simulates a value from a chi-squared distribution with a specified correlation structure and degrees of freedom corresponding to the number of rows in the correlation matrix.
A numeric value corresponding to a chi-squared statistic simulated from a chi-squared-like distribution with specified correlation structure and degrees of freedom corresponding to the number of rows in the correlation matrix.
Caitlin McHugh [email protected]
simulatedStats
~~This function returns the simulated statistics for a specified GeneSet
or a list of simulated statistics for a GeneSetCollection
.
simulatedStats(object)
simulatedStats(object)
object |
An object of type |
Defined methods include:
signature(object = "VEGASResult")
Returns the simulated statistics for a specified VEGASResult
object
signature(object = "VEGASResultCollection")
Returns a list of simulated statistics for a VEGASResultCollection
object (1 for each set)
Returns decimal values of simulated statistics from a VEGAS function call or a list of simulated statistics from a VEGAS function call on a gene set collection.
Caitlin McHugh
VEGASResult
-class, observedStat
## Not run: simulatedStats( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
## Not run: simulatedStats( vegas(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
statistic
~~This function returns the calculated statistic for a specified GeneSet
or a list of calculated statistics for a GeneSetCollection
.
statistic(object)
statistic(object)
object |
An object of type |
Defined methods include:
signature(object = "GeneSetResult")
Returns the calculated statistic for a specified GeneSetResult
object
signature(object = "GeneSetResultCollection")
Returns a list of calculated statistics for a GeneSetResultCollection
object (1 for each set)
signature(object = "VEGASResult")
Returns the calculated statistic for a specified VEGASResult
object
signature(object = "VEGASResultCollection")
Returns a list of calculated statistics for a VEGASResultCollection
object (1 for each set)
signature(object = "GLOSSIResult")
Returns the calculated statistic for a specified GLOSSIResult
object
signature(object = "GLOSSIResultCollection")
Returns a list of calculated statistics for a GLOSSIResultCollection
object (1 for each set)
Returns decimal values or a list of decimal values corresponding to calculated statistics from a GLOSSI function call.
Caitlin McHugh
GLOSSIResult
-class, pValue
## Not run: statistic( glossi(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
## Not run: statistic( glossi(geneSet, assoc_table, ldMatrix) ) ## End(Not run)
show
in Package base ~~~~ Methods for function show
in package base ~~
Defined methods include:
signature(x = "VEGASResult")
,
signature(x = "GLOSSIResult")
These methods display the corresponding statistics and p-values for the corresponding VEGAS or GLOSSI analysis.
Caitlin McHugh
Calculates the p-value representing the association of the set with the phenotype of interest.
vegas(set, assoc_table, ldMatrix, num_sims = 1000, correction = TRUE, seed = NULL, verbose = FALSE)
vegas(set, assoc_table, ldMatrix, num_sims = 1000, correction = TRUE, seed = NULL, verbose = FALSE)
set |
A |
assoc_table |
An object of class |
ldMatrix |
A square, symmetric matrix of LD values, with each row and column corresponding to each of the items in the set. The diagonal entries should be 1, indicating the LD between an item in the set and itself is 1. |
num_sims |
A positive integer value for the number of simulations to be performed. Default is 1000. |
correction |
A logical argument indicating whether a value of one should be added to the numerator and denominator when calculating the p-value based upon the simulated statistics. By default, the correction is added and this argument is TRUE. |
seed |
An integer argument indicating what the random seed should be set to. This allows for replication of results. The default is NULL, and a random seed will be set internally. |
verbose |
A logical argument indicating whether output should be printed. The default is FALSE. |
This function calculates a p-value for sets of SNPs that reside within a gene set collection. We calculate the null distribution by taking into account the observed correlation among the SNPs and simulating a specified number of statistics from which the resulting p-value is calculated.
An object with the corresponding VEGAS results. If set
is a GeneSetCollection
(i.e., multiple sets of
interest), then the corresponding VEGASResultCollection
is returned. If set is a GeneSet
, a VEGASResult
object will be returned.
Caitlin McHugh [email protected]
Liu, Jimmy Z. and Mcrae, Allan F. et al. A Versatile Gene-Based Test for Genome-Wide Association Studies. The American Journal of Human Genetics, 2010.
Calculates the p-value representing the association of the set with the phenotype of interest.
vegasMarginal(pvals, ld_matrix, num_sims, correction = TRUE, seed = NULL, verbose = FALSE)
vegasMarginal(pvals, ld_matrix, num_sims, correction = TRUE, seed = NULL, verbose = FALSE)
pvals |
A vector of p-values corresponding to items in the set. |
ld_matrix |
A square, symmetric matrix of LD values, with each row and column corresponding to each of the items in the set. The diagonal entries should be 1, indicating the LD between an item in the set and itself is 1. |
num_sims |
An integer value for the number of simulations to be performed. |
correction |
A logical argument indicating whether a value of one should be added to the numerator when calculating the p-value based upon the simulated statistics. By default, the correction is added. An argument of FALSE will not add one to the numerator. |
seed |
An integer argument indicating what the random seed should be set to. This allows for replication of results. The default is NULL, and a random seed will be set internally. |
verbose |
A logical argument indicating whether periodic output should be printed. Defaults to FALSE, indicating no output will be printed. |
This is a helper function to calculate the p-value for a set
of SNPs that reside within a gene set collection. The
correlation among the SNPs is taken into account by the LD
matrix. The resulting p-value is calculated from a null
distribution that is simulated num_sims
times based upon
the specified correlation structure.
A VEGASResult
object with the corresponding
VEGAS results.
Caitlin McHugh [email protected]
"VEGASResult"
Objects of this class store results from running VEGAS.
Objects can be created by calls of vegas
.
geneSetName
:Object of class "character"
, the name of the geneSet
pValue
:Object of class "numeric"
, the p-value
degreesOfFreedom
:Object of class "integer"
, the degrees of freedom used to calculate the p-value
statistic
:Object of class "numeric"
, the test statistic for the set
simulatedStats
:Object of class "vector"
, the simulated statistics for the set used to calculate the p-value
No methods defined with class "VEGASResult" in the signature.
In the code snippets below, object
is a VEGASResult object.
geneSetName(object)
:The name of the gene set.
pValue(object)
:A numeric p-value.
degreesOfFreedom(object)
:Integer value of the degrees of freedom used in the statistical test.
statistic(object)
:A numeric value corresponding to the statistic calculated from the test performed.
Caitlin McHugh
VEGASResultCollection
, GLOSSIResult
showClass("VEGASResult")
showClass("VEGASResult")
"VEGASResultCollection"
Objects of this class store results from running VEGAS. VEGASResultCollection objects contain a list of VEGASResult objects (one result for each GeneSet)
Objects can be created by calls of vegas
.
No methods defined with class "VEGASResult" in the signature.
In the code snippets below, object
is a VEGASResultCollection object.
geneSetName(object)
:A list of the names of the gene sets.
pValue(object)
:A list of numeric p-values.
degreesOfFreedom(object)
:List of integer values of the degrees of freedom used in the statistical tests performed on each gene set.
statistic(object)
:A list of numeric values corresponding to the statistics calculated from the tests performed on each gene set.
Caitlin McHugh
VEGASResult
, GLOSSIResultCollection
showClass("VEGASResultCollection")
showClass("VEGASResultCollection")