Title: | GWAS trait-associated SNP enrichment analyses in genomic intervals |
---|---|
Description: | traseR performs GWAS trait-associated SNP enrichment analyses in genomic intervals using different hypothesis testing approaches, and provides functions to explore and visualize the results. |
Authors: | Li Chen, Zhaohui S. Qin |
Maintainer: | Li Chen <[email protected]> |
License: | GPL |
Version: | 1.37.0 |
Built: | 2024-10-31 06:26:54 UTC |
Source: | https://github.com/bioc/traseR |
Perform GWAS trait-associated SNP enrichment analyses in genomic intervals. Explore and visualize the results.
Package: | traseR |
Type: | Package |
Version: | 1.0 |
Date: | 2015-11-18 |
License: | GPL |
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
A GRanges object, CEU, containing 5% of all SNPs from the CEU population, sampled so that the genome-wide SNP density matches that of all CEU SNPs.
data(CEU)
The data frame CEU contains three columns:
SNP_ID | SNP rs number |
seqnames | Chromosome number associated with rs number |
ranges | Chromosomal position, in base pairs, associated with rs number |
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
A group of functions that generate plots to visualize trait-associated SNPs.
plotContext(snpdb, region = NULL, keyword = NULL, pvalue = 1e-3)
plotPvalue(snpdb, region = NULL, keyword = NULL, plot.type = c("densityplot", "boxplot"), pvalue = 1e-3, xymax = 50)
plotSNP(snpdb, snpid, ext = 10000)
plotGene(snpdb, gene, ext = 10000)
plotInterval(snpdb, interval, ext = 10000)
snpdb | A GRanges object or data frame of GWAS trait-associated SNPs downloaded from the dbGaP and NHGRI public databases and kept up to date with the latest release. The data frame contains the following columns, |
region | A data frame of genomic intervals with three columns: chromosome, genomic start position, genomic end position. |
keyword | The keyword is used when a specific trait is of interest. If |
snpid | SNP rs number |
gene | Gene name |
pvalue | SNPs with a p-value less than this threshold are used for analyses. |
plot.type | Either "densityplot" or "boxplot" |
ext | Number of base pairs to extend upstream and downstream |
xymax | The maximum range on the x-axis and y-axis |
interval | A data frame giving a genomic interval: chromosome, genomic start position, genomic end position |
plotContext | A pie chart of the distribution of SNP functional classes |
plotPvalue | A density plot of -logPvalue of trait-associated SNPs |
plotSNP | A plot of the trait-associated SNP on its chromosome |
plotGene | A plot of the gene and any nearby trait-associated SNPs |
plotInterval | A plot of the chromosome interval with any nearby genes and trait-associated SNPs |
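The `interval` and `ext` arguments can be pictured with a small base R sketch: a one-row data frame describes the interval, and the plotting window is widened by `ext` base pairs on each side. The clamping at position 1 and the `window` name are assumptions made for illustration, not the package's internal logic.

```r
# One-row interval in the chr/start/end layout described above
interval <- data.frame(chr = "chrX", start = 152633780, end = 152737085,
                       stringsAsFactors = FALSE)
ext <- 10000  # bp added upstream and downstream

# Hypothetical widened window; start is clamped so it never drops below 1
window <- interval
window$start <- pmax(1, interval$start - ext)
window$end   <- interval$end + ext
window
```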
Li Chen <[email protected]>, Zhaohui Qin <[email protected]>
library(traseR)
data(taSNP)
# Distribution of SNP functional classes for traits matching the keyword
plotContext(snpdb = taSNP, keyword = "Autoimmune")
# Gene, SNP and interval views, each extended by 50 kb where ext is given
plotGene(snpdb = taSNP, gene = "ZFP92", ext = 50000)
plotSNP(snpdb = taSNP, snpid = "rs766420", ext = 50000)
plotInterval(snpdb = taSNP, data.frame(chr = "chrX", start = 152633780, end = 152737085))
Print the outcome of taSNP enrichment analyses: the overall taSNP enrichment, the trait-specific taSNP enrichment, and the trait-class-specific taSNP enrichment.
## S3 method for class 'traseR'
print(x, isTopK1 = FALSE, topK1 = 10, isTopK2 = FALSE, topK2 = 10, trait.threshold = 10, traitclass.threshold = 10, ...)
x | Object returned from |
isTopK1 | If isTopK1 is TRUE, the top K1 traits are printed; otherwise, traits with a p-value below the Bonferroni correction threshold are printed. Default is FALSE. |
topK1 | The top K1 traits are printed. Default is 10. |
isTopK2 | If isTopK2 is TRUE, the top K2 trait classes are printed; otherwise, trait classes with a p-value below the Bonferroni correction threshold are printed. Default is FALSE. |
topK2 | The top K2 trait classes are printed. Default is 10. |
trait.threshold | Traits above this threshold are reported. Default is 10. |
traitclass.threshold | Trait classes above this threshold are reported. Default is 10. |
... | Other parameters to |
Prints a data frame of traits ranked by p-value.
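The two selection modes described for `isTopK1` (top K versus Bonferroni threshold) can be illustrated on a toy table in base R. The trait names and p-values below are invented, and the significance level of 0.05 is an assumption; the level the package uses is not stated here.

```r
# Toy per-trait results (invented values, not package output)
tb <- data.frame(Trait   = c("A", "B", "C", "D"),
                 p.value = c(1e-5, 0.004, 0.03, 0.6),
                 stringsAsFactors = FALSE)
tb <- tb[order(tb$p.value), ]        # rank traits by p-value

alpha <- 0.05                        # assumed significance level
bonf  <- alpha / nrow(tb)            # Bonferroni threshold over 4 tests

sig  <- tb[tb$p.value < bonf, ]      # analogue of isTopK1 = FALSE
top2 <- head(tb, 2)                  # analogue of isTopK1 = TRUE, topK1 = 2
```

With these invented values both selections happen to keep the same two traits; on real results they generally differ.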
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
library(traseR)
data(taSNP)
data(Tcell)
# Run the enrichment analysis, then print the overall, trait and trait-class results
x <- traseR(snpdb = taSNP, region = Tcell)
print(x)
A group of functions that retrieve trait-associated SNPs based on the input.
queryKeyword(snpdb, region = NULL, keyword = NULL, returnby = c("SNP_ID", "trait"), pvalue = 1e-3)
queryGene(snpdb, genes = NULL)
querySNP(snpdb, snpid, region = NULL)
snpdb | A GRanges object or data frame of GWAS trait-associated SNPs downloaded from the dbGaP and NHGRI public databases and kept up to date with the latest release. The data frame contains the following columns, |
region | A data frame of genomic intervals with three columns: chromosome, genomic start position, genomic end position. |
keyword | The keyword is used when a specific trait is of interest. If |
snpid | SNP rs number |
genes | Gene names |
pvalue | SNPs with a p-value less than this threshold are used for analyses. |
returnby | Either SNP or trait. If |
queryKeyword: Return a data frame of traits containing the keyword
queryGene: Return a data frame of traits associated with the gene
querySNP: Return a data frame of traits associated with the SNP
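The idea behind a keyword query can be sketched in base R on a toy data frame that borrows the `Trait`, `SNP_ID` and `p.value` column names from `taSNP`. The rows are invented, and case-insensitive matching on the `Trait` column is an assumption for illustration; `queryKeyword` itself may match differently.

```r
# Toy stand-in for the SNP table (invented rows)
toy <- data.frame(
  Trait   = c("Autoimmune Diseases", "Asthma", "Autoimmune Diseases"),
  SNP_ID  = c("rs1", "rs2", "rs3"),
  p.value = c(1e-6, 1e-8, 0.01),
  stringsAsFactors = FALSE
)

keyword <- "autoimmune"
pvalue  <- 1e-3

# Keep rows whose trait mentions the keyword and whose p-value passes the threshold
keep    <- grepl(keyword, toy$Trait, ignore.case = TRUE) & toy$p.value < pvalue
matched <- toy[keep, ]
matched$SNP_ID
```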
Li Chen <[email protected]>, Zhaohui Qin <[email protected]>
library(traseR)
data(taSNP)
data(Tcell)
# Query by trait keyword within the T cell regions, by gene names, and by rs numbers
x <- queryKeyword(snpdb = taSNP, region = Tcell, keyword = "Autoimmune", returnby = "SNP_ID")
x <- queryGene(snpdb = taSNP, genes = c("AGRN", "UBE2J2", "SSU72"))
x <- querySNP(snpdb = taSNP, snpid = c("rs3766178", "rs880051"))
A GRanges object, taSNP, containing trait-associated SNPs from dbGaP and NHGRI, downloaded from the Association Results Browser.
data(taSNP)
The data frame taSNP contains the following columns:
Trait | Name of the trait |
Trait Class | Trait class formed from the phenotype tree; closely related traits are grouped into one class |
SNP_ID | SNP rs number |
p.value | GWAS SNP p-value |
seqnames | Chromosome |
ranges | Chromosome position |
Context | SNP functional class |
GENE_NAME | Nearest gene name |
GENE_START | Gene start genomic position |
GENE_END | Gene end genomic position |
GENE_STRAND | Gene strand |
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
A GRanges object, taSNPLD, containing SNPs in linkage disequilibrium (>0.8) with the trait-associated SNPs from dbGaP and NHGRI.
data(taSNPLD)
The data frame taSNPLD contains the following columns:
SNP_ID | SNP rs number |
seqnames | Chromosome number associated with rs number |
ranges | Chromosomal position, in base pairs, associated with rs number |
Trait | Trait the SNP is associated with |
Trait Class | Trait class formed from the phenotype tree; closely related traits are grouped into one class |
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
A GRanges object, Tcell, holding genomic intervals described by chromosome, genomic start position and genomic end position.
data(Tcell)
The data frame Tcell contains the following columns:
seqnames | Chromosome id |
ranges | Chromosome position |
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
Perform GWAS trait-associated SNP enrichment analyses in genomic intervals using different approaches.
traseR(snpdb, region, snpdb.bg = NULL, keyword = NULL,
       rankby = c("pvalue", "odds.ratio"),
       test.method = c("binomial", "fisher", "chisq", "nonparametric"),
       alternative = c("greater", "less", "two.sided"),
       ntimes = 100, nbatch = 1, trait.threshold = 0,
       traitclass.threshold = 0, pvalue = 1e-3)
snpdb | A GRanges object. It can be the GWAS trait-associated SNPs downloaded from the dbGaP and NHGRI public databases, which are kept up to date with the latest release. The data frame contains the following columns, |
region | A GRanges object or data frame of genomic intervals with three columns: chromosome, genomic start position, genomic end position. |
snpdb.bg | A GRanges object containing non-trait-associated SNPs. If specified, these SNPs are used as the background for statistical testing instead of the whole genome. |
keyword | The keyword is used when a specific trait is of interest. If |
rankby | Traits can be ranked by either p-value or odds.ratio, based on the enrichment level of trait-associated SNPs in genomic intervals. |
test.method | Several hypothesis testing options are provided: |
alternative | Indicates the alternative hypothesis. If |
ntimes | The number of shuffles in one batch. See |
nbatch | The number of batches. The product of |
trait.threshold | Only traits with more SNPs than this threshold are tested. |
traitclass.threshold | Only trait classes with more SNPs than this threshold are tested. |
pvalue | SNPs with a p-value less than this threshold are used for analyses. |
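As a rough illustration of what a `binomial` test with `alternative = "greater"` asks, the base R sketch below checks whether more trait-associated SNPs fall inside the intervals than interval coverage alone would predict. All counts and sizes are invented, and using the covered fraction of the genome as the null success probability is an assumption made for illustration; the package's internal computation may differ.

```r
# Hypothetical inputs (not taken from the package data)
hits        <- 30     # taSNPs overlapping the genomic intervals
n_snps      <- 1000   # taSNPs passing the p-value threshold
interval_bp <- 6e7    # total length of the intervals, in bp
genome_bp   <- 3e9    # assumed genome size, in bp

# Null model: each SNP lands in the intervals with probability = covered fraction
p_null <- interval_bp / genome_bp

# One-sided exact binomial test for enrichment
res <- binom.test(hits, n_snps, p = p_null, alternative = "greater")
res$p.value
```

Switching `alternative` to `"less"` would test for depletion instead, and `"two.sided"` for any departure from the null proportion.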
Returns a list containing three data frames. The data frame tb.all contains the results of the enrichment analysis for all trait-associated SNPs in the genomic intervals. The data frame tb1 contains the results of the enrichment analyses for each trait separately. The data frame tb2 contains the results of the enrichment analyses for each trait class separately.
The data frame tb1 has the following columns:
Trait | Name of trait |
p.value | P-value calculated from hypothesis testing |
q.value | Adjusted p-value from multiple testing using FDR correction |
odds.ratio | Odds ratio calculated from the number of trait-associated SNPs in genomic intervals, the number of trait-associated SNPs across the whole genome, the genomic interval size (bp) and the genome size (bp) |
taSNP.hits | Number of trait-associated SNPs in genomic intervals |
taSNP.num | Number of SNPs for the specific trait |
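The `q.value` and `odds.ratio` columns can be pictured with a short base R sketch. The p-values and counts are invented. The Benjamini-Hochberg method is assumed as the FDR correction, and the odds ratio formula below is only one plausible way to combine the four quantities listed above; neither is confirmed as the package's exact computation.

```r
# FDR adjustment of invented per-trait p-values (Benjamini-Hochberg assumed)
p <- c(0.001, 0.02, 0.03, 0.5)
q <- p.adjust(p, method = "BH")

# Hypothetical odds ratio: odds of a taSNP inside the intervals
# versus odds of a taSNP in the rest of the genome
hits        <- 30     # taSNPs inside the intervals
n_snps      <- 1000   # taSNPs genome-wide
interval_bp <- 6e7    # interval size, in bp
genome_bp   <- 3e9    # assumed genome size, in bp

odds_in    <- hits / (interval_bp - hits)
odds_out   <- (n_snps - hits) / (genome_bp - interval_bp - (n_snps - hits))
odds_ratio <- odds_in / odds_out
```

Under this sketch an odds ratio above 1 indicates that the intervals carry more taSNPs per base pair than the rest of the genome.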
Li Chen <[email protected]>, Zhaohui S. Qin <[email protected]>
print.traseR
library(traseR)
data(taSNP)
data(Tcell)
# Test taSNP enrichment in the T cell regions with the default settings
x <- traseR(snpdb = taSNP, region = Tcell)
print(x)