| Title: | Loci2path: regulatory annotation of genomic intervals based on tissue-specific expression QTLs |
|---|---|
| Description: | loci2path performs statistics-rigorous enrichment analysis of eQTLs in genomic regions of interest. Using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB. |
| Authors: | Tianlei Xu |
| Maintainer: | Tianlei Xu <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 1.33.0 |
| Built: | 2026-06-02 08:31:39 UTC |
| Source: | https://github.com/bioc/loci2path |
Demo Data to show how to perform eQTL-geneset enrichment query.
data(loci2path.demo)data(loci2path.demo)
An object of class geneSet of length 1.
eqtl.set.list A list of eQTLset objects; eQTL data are collected from
GTeX.
biocarta A Geneset object; Geneset are from Broad Institute's MSigDB:
curated gene set, category 'cp': BIOCARTA
query.gr A Genomic Region object; Query regions are from immunoBase,
crohn's disease.
data(loci2path.demo)data(loci2path.demo)
This function perform enrichment test between one eQTL set and one gene set
check.geneid(e.set, g.set)check.geneid(e.set, g.set)
e.set |
an eqtlSet object; the eQTL set to be queried against |
g.set |
an object of geneSet class; the gene set to be tested |
a data.frame shows the number of genes from
(1) eqtl Set (2) gene Set (3) shared
check.geneid(eset.list$Skin, biocarta)check.geneid(eset.list$Skin, biocarta)
eqtlSet Class contains information for eqtl-gene association, gene identifier, position of SNPs, etc.
tissue(x) eqtlId(x) eqtlRange(x) eqtlGene(x) ## S4 method for signature 'eqtlSet' tissue(x) ## S4 method for signature 'eqtlSet' eqtlId(x) ## S4 method for signature 'eqtlSet' eqtlRange(x) ## S4 method for signature 'eqtlSet' eqtlGene(x)tissue(x) eqtlId(x) eqtlRange(x) eqtlGene(x) ## S4 method for signature 'eqtlSet' tissue(x) ## S4 method for signature 'eqtlSet' eqtlId(x) ## S4 method for signature 'eqtlSet' eqtlRange(x) ## S4 method for signature 'eqtlSet' eqtlGene(x)
x |
An eqtlSet object |
Object of class eqtlSet
tissuecharacter; name of the cell/tissue of the eQTL study
eqtlIdcharacter; name of the SNPs
eqtlRangeGenomicRanges; position of the SNPs
genecharacter; gene identifier
require(GenomicRanges) brain.file <- system.file("extdata", "eqtl/brain.gtex.txt", package="loci2path") tab <- read.table(brain.file, stringsAsFactors=FALSE, header=TRUE) eqtlRange <- GRanges(seqnames=Rle(tab$snp.chr), ranges=IRanges(start=tab$snp.pos, width=1)) brain.eset <- eqtlSet(tissue="brain", eqtlId=tab$snp.id, eqtlRange=eqtlRange, gene=as.character(tab$gene.entrez.id)) tissue(brain.eset) head(eqtlId(brain.eset)) eqtlRange(brain.eset) head(eqtlGene(brain.eset))require(GenomicRanges) brain.file <- system.file("extdata", "eqtl/brain.gtex.txt", package="loci2path") tab <- read.table(brain.file, stringsAsFactors=FALSE, header=TRUE) eqtlRange <- GRanges(seqnames=Rle(tab$snp.chr), ranges=IRanges(start=tab$snp.pos, width=1)) brain.eset <- eqtlSet(tissue="brain", eqtlId=tab$snp.id, eqtlRange=eqtlRange, gene=as.character(tab$gene.entrez.id)) tissue(brain.eset) head(eqtlId(brain.eset)) eqtlRange(brain.eset) head(eqtlGene(brain.eset))
Demo Data to show how to perform eQTL-geneset enrichment query.
data(loci2path.demo)data(loci2path.demo)
An object of class list of length 3.
eset.list A list of eQTLset objects; eQTL data are collected from
GTeX.
biocarta A Geneset object; Geneset are from Broad Institute's
MSigDB: curated gene set, category 'cp': BIOCARTA
query.gr A Genomic Region object; Query regions are from
immunoBase, Psoriasis disease.
data(loci2path.demo)data(loci2path.demo)
geneSet Class contains information for names of gene sets and a list of gene sets
numGene(x) description(x) geneSetList(x) ## S4 method for signature 'geneSet' numGene(x) ## S4 method for signature 'geneSet' description(x) ## S4 method for signature 'geneSet' geneSetList(x)numGene(x) description(x) geneSetList(x) ## S4 method for signature 'geneSet' numGene(x) ## S4 method for signature 'geneSet' description(x) ## S4 method for signature 'geneSet' geneSetList(x)
x |
An geneSet object |
Object of class geneSet
numGenenumeric; the total number of all genes; This number is used in enrichment tests
descriptionvector of character; additional information for gene sets, such as names, URLs, a short description, etc.
geneSetListlist; a list of gene sets; each member is a vector containing a group of gene identifiers
biocarta.link.file <- system.file("extdata", "geneSet/biocarta.txt", package="loci2path") biocarta.link <- read.delim(biocarta.link.file, header=FALSE, stringsAsFactors=FALSE) biocarta.set.file <- system.file("extdata", "geneSet/biocarta.set.txt", package="loci2path") set.geneid <- read.table(biocarta.set.file, stringsAsFactors=FALSE) set.geneid <- strsplit(set.geneid[,1], split=",") names(set.geneid) <- biocarta.link[,1] biocarta <- geneSet( geneSetList=set.geneid, description=biocarta.link[,2], numGene=31847) numGene(biocarta) head(description(biocarta)) head(geneSetList(biocarta))biocarta.link.file <- system.file("extdata", "geneSet/biocarta.txt", package="loci2path") biocarta.link <- read.delim(biocarta.link.file, header=FALSE, stringsAsFactors=FALSE) biocarta.set.file <- system.file("extdata", "geneSet/biocarta.set.txt", package="loci2path") set.geneid <- read.table(biocarta.set.file, stringsAsFactors=FALSE) set.geneid <- strsplit(set.geneid[,1], split=",") names(set.geneid) <- biocarta.link[,1] biocarta <- geneSet( geneSetList=set.geneid, description=biocarta.link[,2], numGene=31847) numGene(biocarta) head(description(biocarta)) head(geneSetList(biocarta))
This function generate the enrichment heatmap using pheatmap package.
getHeatmap(res, ...) ## S4 method for signature 'loci2pathResult' getHeatmap(res, main = "", test.method = c("gene", "eqtl", "glm"), filter.quantile = 0.5, max.ptw.gene = 5000)getHeatmap(res, ...) ## S4 method for signature 'loci2pathResult' getHeatmap(res, main = "", test.method = c("gene", "eqtl", "glm"), filter.quantile = 0.5, max.ptw.gene = 5000)
res |
query result from function query.egset.list() |
... |
additional params |
main |
title of the heatmap, default is "" |
test.method |
Choose which enrichment test should be used to retrive p-values from. Options include:"gene"(default, gene-based fisher's exact test),"eqtl" (eqtl based fisher's exact test), "glm" (ordered query) |
filter.quantile |
Filter option; choose the max quantile of all p-values being kept in the matrix; default is 0.5, which means p-values larger than median p-values are discarded |
max.ptw.gene |
Filter option; minimum number of genes in a pathway; default is 5000 (pathway with >5000 genes are not included in the matrix) |
pathways |
frequent pathways |
tissues |
frequent tissues |
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getHeatmap(result)result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getHeatmap(result)
This function extracts the enrichment matrix from eQTL list query result. The rows of the matrixs are pathways; and the columns of the matrixs are tissues/cell lines of the eQTL sets. P-Values from enrichment tests are summarized in this matrix
getMat(res, ...) ## S4 method for signature 'loci2pathResult' getMat(res, test.method = c("gene", "eqtl", "glm"), filter.quantile = 0.5, max.ptw.gene = 5000)getMat(res, ...) ## S4 method for signature 'loci2pathResult' getMat(res, test.method = c("gene", "eqtl", "glm"), filter.quantile = 0.5, max.ptw.gene = 5000)
res |
query result from function query.egset.list() |
... |
additional params |
test.method |
Choose which enrichment test should be used to retrive p-values from. Options include:"gene"(default, gene-based fisher's exact test),"eqtl" (eqtl based fisher's exact test), "glm" (ordered query) |
filter.quantile |
Filter option; choose the max quantile of all p-values being kept in the matrix; default is 0.5, which means p-values larger than median p-values are discarded |
max.ptw.gene |
Filter option; minimum number of genes in a pathway; default is 5000 (pathway with >5000 genes are not included in the matrix) |
p-value matrix collected from enrichment result table
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) mat <- getMat(result, test.method="gene")result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) mat <- getMat(result, test.method="gene")
This function extracts the pathway description from geneSet object.
getPathDescription(res, ...) ## S4 method for signature 'loci2pathResult' getPathDescription(res, geneset)getPathDescription(res, ...) ## S4 method for signature 'loci2pathResult' getPathDescription(res, geneset)
res |
query result from function query.egset.list() |
... |
additional params |
geneset |
A |
a vector of gene set description from geneSet description
slot
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) path.des <- getPathDescription(result, biocarta)result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) path.des <- getPathDescription(result, biocarta)
This function extracts the enrichment p-value distribution from eQTL list query result. P-values from different tissues/cell types are organized, and QQ-plot is generated against uniform distribution
getPval(res, ...) ## S4 method for signature 'loci2pathResult' getPval(res, test.method = c("gene", "eqtl", "glm"))getPval(res, ...) ## S4 method for signature 'loci2pathResult' getPval(res, test.method = c("gene", "eqtl", "glm"))
res |
query result from function query.egset.list() |
... |
additional params |
test.method |
Choose which enrichment test should be used to retrive p-values from. Options include:"gene"(default, gene-based fisher's exact test),"eqtl" (eqtl based fisher's exact test), "glm" (ordered query) |
generate pval distribution plot
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getPval(result, test.method="gene")result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getPval(result, test.method="gene")
This function extracts the tissue degree from eQTL list query result for each pathway.
getTissueDegree(res, ...) ## S4 method for signature 'loci2pathResult' getTissueDegree(res, loci)getTissueDegree(res, ...) ## S4 method for signature 'loci2pathResult' getTissueDegree(res, loci)
res |
query result from function query.egset.list() |
... |
additional params |
loci |
a list of eqtlSet; each member should be an eqtlSet object |
gene.tissue.map |
shows mapping:gene<->tissue |
gene.tissue.degree |
shows tissue degree for each gene |
mean.tissue.degree |
shows the average tissue digree for each pathway in the result table |
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) tissue.degree=getTissueDegree(result, eset.list) head(tissue.degree$gene.tissue.map) head(tissue.degree$gene.tissue.degree) head(tissue.degree$mean.tissue.degree)result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) tissue.degree=getTissueDegree(result, eset.list) head(tissue.degree$gene.tissue.map) head(tissue.degree$gene.tissue.degree) head(tissue.degree$mean.tissue.degree)
This function draw the enrichment heatmap using
wordcloud package.
getWordcloud(res, ...) ## S4 method for signature 'loci2pathResult' getWordcloud(res, min.freq.tissue = 5, min.freq.gset = 5, max.words = 50)getWordcloud(res, ...) ## S4 method for signature 'loci2pathResult' getWordcloud(res, min.freq.tissue = 5, min.freq.gset = 5, max.words = 50)
res |
query result from function query.egset.list() |
... |
additional params |
min.freq.tissue |
minimum frequency of tissue/cell to be plotted in the word cloud |
min.freq.gset |
minimum frequency of geneset to be plotted in the word cloud |
max.words |
maximum words to be generated |
empty
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getWordcloud(result, min.freq.tissue=2, min.freq.gset=1)result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) getWordcloud(result, min.freq.tissue=2, min.freq.gset=1)
loci2pathResult Class contains information for the query result from
query function query. Result object contains a ranked
pathway table, and a vector of gene names that are associated with loci
covered by query regions
resultTable(x) coveredGene(x) ## S4 method for signature 'loci2pathResult' resultTable(x) ## S4 method for signature 'loci2pathResult' coveredGene(x)resultTable(x) coveredGene(x) ## S4 method for signature 'loci2pathResult' resultTable(x) ## S4 method for signature 'loci2pathResult' coveredGene(x)
x |
An geneSet object |
Object of CLass loci2pathResult
resultTabledata.frame; contains enrichment statistics, summary of eQTL and gene numbers, pathway names and gene names, etc.
coveredGenelist; each member is a vector of genes associated with one tissue, whose associating loci are covered by query regions
result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) result resultTable(result) # a data.frame for enriched pathways coveredGene(result)result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) result resultTable(result) # a data.frame for enriched pathways coveredGene(result)
This is the main function for loci2path query. Query can be made on either pathway enrichment or tissue-specificity, depending on the input Class. See Details for more.
query(query.gr, loci, path, ...) ## S4 method for signature 'GenomicRanges,list,ANY' query(query.gr, loci, N = 2897310462) ## S4 method for signature 'GenomicRanges,eqtlSet,geneSet' query(query.gr, loci, path, query.score = NULL, verbose = FALSE) ## S4 method for signature 'GenomicRanges,list,geneSet' query(query.gr, loci, path, query.score = NULL, parallel = FALSE, verbose = FALSE)query(query.gr, loci, path, ...) ## S4 method for signature 'GenomicRanges,list,ANY' query(query.gr, loci, N = 2897310462) ## S4 method for signature 'GenomicRanges,eqtlSet,geneSet' query(query.gr, loci, path, query.score = NULL, verbose = FALSE) ## S4 method for signature 'GenomicRanges,list,geneSet' query(query.gr, loci, path, query.score = NULL, parallel = FALSE, verbose = FALSE)
query.gr |
a GenomicRange object, representing query regions |
loci |
a list of eqtlSet; each member should be an eqtlSet; Or it can be a single eqtlSet. |
path |
Pathways or geneSets to be tested for enrichment |
... |
additional params |
N |
the total number of non-N nucleotides in the genome; default N=2897310462 is for hg19 |
query.score |
optional, set to NULL if the regions are not ordered. If the query regions are ordered, query.score is the quantity based on which the regions are ordered |
verbose |
bool; whether to show eqtlSet/geneSet summary information; default is FALSE |
parallel |
bool; whether to enable parallel computing; default is FALSE |
The user need to specify
Query region;
loci; one or more eQTL set; this is usually more than one eQTL set. Only multiple eQTL set derived from different cells/tissues will show cell/tissue specificity.
path; pre-defined Pathways, or gene sets. the gene sets that enrichment tests would be performed to.
loci must be provided; path is optional. When path is
missing, the tissue-specificity query for the regions is performed.
The most common case for loci is an eQTL set list.
This function perform enrichment test between one eQTL set and a group of
gene sets. Usually query are based on eQTL set list, rather than only one
eQTL set. Several result exploring functions (getMat,
getHeatmap, getPval, etc...) are designed for query result
from eQTL set list and gene sets. The class loci2pathResult is also
designed for eQTL set list query result only. The result returns a
loci2pathResult only the class of loci is a list
of eqtlSet.
If user input one eQTL set as argument loci, a simple
list object will be returned for specific research purpose.
a data.frame showing the tissue enrichment of the query regions by binomial test.
a list; result.table is the major result table showing
enrichment assessment;
cover.gene is the vector showing the genes from the eqtl Sets
covered by the query region(s)
a loci2pathResult class object
loci2pathResult
gr.tissue <- query(query.gr, eset.list) #build one eqtlset skin.eset <- eset.list$Skin #query one egset res.one <- query(query.gr, skin.eset, biocarta) #enrichment result table res.one$result.table #all the genes associated with eQTLs covered by the query region res.one$cover.gene result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) #enrichment result table resultTable(result) #all the genes associated with eQTLs covered by the query region coveredGene(result)gr.tissue <- query(query.gr, eset.list) #build one eqtlset skin.eset <- eset.list$Skin #query one egset res.one <- query(query.gr, skin.eset, biocarta) #enrichment result table res.one$result.table #all the genes associated with eQTLs covered by the query region res.one$cover.gene result <- query(query.gr=query.gr, loci=eset.list, path=biocarta) #enrichment result table resultTable(result) #all the genes associated with eQTLs covered by the query region coveredGene(result)
Demo Data to show how to perform eQTL-geneset enrichment query.
data(loci2path.demo)data(loci2path.demo)
An object of class GRanges of length 47.
eqtl.set.list A list of eQTLset objects; eQTL data are collected from
GTeX.
biocarta A Geneset object; Geneset are from Broad Institute's MSigDB:
curated gene set, category 'cp': BIOCARTA
query.gr A Genomic Region object; Query regions are from immunoBase,
crohn's disease.
data(loci2path.demo)data(loci2path.demo)