| Title: | Disease Ontology Semantic and Enrichment analysis |
|---|---|
| Description: | This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data. |
| Authors: | Guangchuang Yu [aut, cre], Li-Gen Wang [ctb], Vladislav Petyuk [ctb], Giovanni Dall'Olio [ctb] |
| Maintainer: | Guangchuang Yu <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 4.7.0 |
| Built: | 2026-05-29 10:31:02 UTC |
| Source: | https://github.com/bioc/DOSE |
semantic similarity between two gene clusters
clusterSim( cluster1, cluster2, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
cluster1 |
a vector of gene IDs |
cluster2 |
another vector of gene IDs |
ont |
one of "HDO", "HPO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
One of "Resnik", "Lin", "Rel", "Jiang" and "Wang" methods. |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
given two gene clusters, this function calculates semantic similarity between them.
similarity
Yu Guangchuang
## Not run:
cluster1 <- c("835", "5261", "241", "994")
cluster2 <- c("307", "308", "317", "321", "506", "540", "378", "388", "396")
clusterSim(cluster1, cluster2)
## End(Not run)
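The four `combine` options can be illustrated on a toy term-by-term similarity matrix. The sketch below assumes the conventions commonly used for these names (overall maximum, overall mean, the larger of the row-wise and column-wise mean best matches, and the best-match average); `combine_scores` is a hypothetical helper written for this illustration, not a DOSE function, and the matrix values are made up.

```r
# Toy similarity matrix: rows are DO terms of cluster 1, columns those of cluster 2.
# The values are invented for illustration only.
term_sim <- matrix(
  c(0.8, 0.2, 0.1,
    0.3, 0.6, 0.4),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("t1", "t2"), c("u1", "u2", "u3"))
)

# Hypothetical helper (not part of DOSE) mirroring the four 'combine' options.
combine_scores <- function(m, combine = c("max", "avg", "rcmax", "BMA")) {
  combine <- match.arg(combine)
  row_max <- apply(m, 1, max, na.rm = TRUE)  # best match for each row term
  col_max <- apply(m, 2, max, na.rm = TRUE)  # best match for each column term
  switch(combine,
    max   = max(m, na.rm = TRUE),
    avg   = mean(m, na.rm = TRUE),
    rcmax = max(mean(row_max), mean(col_max)),
    BMA   = (sum(row_max) + sum(col_max)) / (length(row_max) + length(col_max))
  )
}

scores <- sapply(c("max", "avg", "rcmax", "BMA"),
                 function(x) combine_scores(term_sim, x))
print(round(scores, 3))
```

With this matrix, "BMA" sits between "avg" and "max" because it averages only the best match of each term rather than every pair.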
compute information content
computeIC(ont = "HDO")
ont |
one of "HDO", "HPO" and "MPO" |
Guangchuang Yu https://yulab-smu.top
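As a rough illustration of what information content means, the sketch below uses the conventional definition: the negative log of the probability that a gene is annotated to a term or one of its descendants. Whether `computeIC` uses exactly this form (or applies further normalization) is an assumption; the term names and counts are invented.

```r
# Toy counts: number of genes annotated to each term or any of its descendants.
# Term names and counts are invented; the root term covers every annotated gene.
annot_count <- c(root = 100, disease_A = 50, disease_B = 10, disease_C = 1)

# Information content as the negative log of the annotation probability.
p  <- annot_count / annot_count[["root"]]
ic <- -log(p)

print(round(ic, 3))
```

Rarely annotated (more specific) terms receive higher information content, and the root term receives zero.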
Shared parameters for DOSE functions
gene |
a vector of entrez gene id |
organism |
one of "hsa" and "mmu" |
ont |
one of "HDO", "HPO" or "MPO" |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
geneList |
order ranked geneList |
exponent |
weight of each step |
nPerm |
number of permutations |
verbose |
print message or not |
adaptive |
logical, use adaptive permutation or not (default: FALSE) |
minPerm |
minimum number of permutations for adaptive mode (default: 1000) |
maxPerm |
maximum number of permutations for adaptive mode (default: 10000) |
method |
method of GSEA, one of "multilevel", "permute", "sample" |
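The `pAdjustMethod` names listed above match the method names accepted by base R's `stats::p.adjust`, and `minGSSize`/`maxGSSize` describe a size filter on gene sets. The sketch below shows both ideas on made-up inputs; that DOSE applies them in exactly this way internally is an assumption, and the p-values and gene sets are hypothetical.

```r
# Made-up raw p-values for four terms.
pvalues <- c(termA = 0.01, termB = 0.02, termC = 0.03, termD = 0.04)

# stats::p.adjust accepts the same method names listed for pAdjustMethod.
adj_bh   <- p.adjust(pvalues, method = "BH")
adj_bonf <- p.adjust(pvalues, method = "bonferroni")

# Hypothetical gene sets, filtered by size as minGSSize / maxGSSize describe.
gene_sets <- list(
  small  = as.character(1:3),
  medium = as.character(1:20),
  large  = as.character(1:800)
)
sizes <- lengths(gene_sets)
kept  <- names(gene_sets)[sizes >= 10 & sizes <= 500]

print(adj_bh)
print(adj_bonf)
print(kept)
```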
measuring similarities between two DO term vectors.
doseSim(DOID1, DOID2, measure = "Wang", ont = "HDO")
doSim(DOID1, DOID2, measure = "Wang", ont = "HDO")
DOID1 |
DO term, MPO term or HPO term vector |
DOID2 |
DO term, MPO term or HPO term vector |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", "Lin", and "TCSS". |
ont |
one of "HDO", "HPO" and "MPO" |
Given two term vectors, this function calculates their pairwise similarities.
score matrix
Guangchuang Yu https://yulab-smu.top
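For orientation, the four information-content-based measures ("Resnik", "Lin", "Rel", "Jiang") can be written in their commonly published forms in terms of the information content (IC) of two terms and of their most informative common ancestor (MICA). This is a sketch of those textbook forms only; the package may normalize or otherwise adjust the values, and all numbers below are invented. "Wang" and "TCSS" are not IC-based in this sense and are not covered here.

```r
# Invented values: IC of two terms, and the annotation probability of their
# most informative common ancestor (MICA).
ic_t1   <- 1.8
ic_t2   <- 2.0
p_mica  <- 0.2
ic_mica <- -log(p_mica)

# Textbook forms (assumed; the package implementation may normalize differently).
resnik <- ic_mica
lin    <- 2 * ic_mica / (ic_t1 + ic_t2)
rel    <- lin * (1 - p_mica)
jiang  <- 1 - min(1, ic_t1 + ic_t2 - 2 * ic_mica)

print(round(c(Resnik = resnik, Lin = lin, Rel = rel, Jiang = jiang), 3))
```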
Given a vector of genes, this function returns the enriched DisGeNET categories with FDR control.
enrichDGN( gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
An enrichResult instance
Guangchuang Yu
Piñero et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database bav028 http://database.oxfordjournals.org/content/2015/bav028.long
Enrichment analysis based on the DisGeNET (http://www.disgenet.org/)
enrichDGNv( snp, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
snp |
a vector of SNP |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
Given a vector of SNPs, this function returns the enriched DisGeNET categories with FDR control.
An enrichResult instance
Guangchuang Yu
Piñero et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database bav028 http://database.oxfordjournals.org/content/2015/bav028.long
Given a vector of genes, this function will return the enrichment DO categories with FDR control.
enrichDO( gene, ont = "HDO", organism = "hsa", pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
ont |
one of "HDO", "HPO" or "MPO" |
organism |
one of "hsa" and "mmu" |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
data(geneList)
gene = names(geneList)[geneList > 1]
yy = enrichDO(gene, pvalueCutoff=0.05)
summary(yy)
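The package description names a hypergeometric model for over-representation analysis. A minimal sketch of that test for a single term, using base R's `phyper`, might look as follows. The counts are invented, and the assumption that the enrichment functions compute the p-value in exactly this form is mine rather than something stated above.

```r
# Invented counts for a single term.
N <- 20000  # genes in the background (universe)
M <- 100    # background genes annotated to the term
n <- 200    # genes in the input list
k <- 8      # input genes annotated to the term

# P(X >= k): probability of observing k or more annotated genes by chance.
p_over <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
print(p_over)
```

Passing `k - 1` with `lower.tail = FALSE` gives the upper tail that includes `k` itself, which is the usual one-sided over-representation p-value.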
Enrichment analysis based on the Network of Cancer Genes database (http://ncg.kcl.ac.uk/)
enrichNCG( gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
Given a vector of genes, this function returns the enriched NCG categories with FDR control.
An enrichResult instance
Guangchuang Yu
Given gene IDs, this function converts them to the corresponding DO terms.
gene2DO(gene, organism = "hsa", ont = "HDO")
gene |
entrez gene ID |
organism |
one of "hsa" and "mmu" |
ont |
one of "HDO", "HPO" and "MPO" |
DO Terms
Guangchuang Yu https://yulab-smu.top
measuring similarities between two gene vectors.
geneSim( geneID1, geneID2 = NULL, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
geneID1 |
entrez gene vector |
geneID2 |
entrez gene vector |
ont |
one of "HDO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
Given two Entrez gene vectors, this function calculates their pairwise similarities.
score matrix
Guangchuang Yu https://yulab-smu.top
g <- c("835", "5261","241", "994")
geneSim(g)
perform gene set enrichment analysis (GSEA)
gseDGN( geneList, exponent = 1, nPerm = 1000, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, method = "multilevel", adaptive = FALSE, minPerm = 1000, maxPerm = 10000, ... )
geneList |
order ranked geneList |
exponent |
weight of each step |
nPerm |
number of permutations |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
verbose |
print message or not |
method |
method of GSEA, one of "multilevel", "permute", "sample" |
adaptive |
logical, use adaptive permutation or not (default: FALSE) |
minPerm |
minimum number of permutations for adaptive mode (default: 1000) |
maxPerm |
maximum number of permutations for adaptive mode (default: 10000) |
... |
other parameter |
gseaResult object
Guangchuang Yu
perform gene set enrichment analysis (GSEA)
gseDO( geneList, ont = "HDO", organism = "hsa", exponent = 1, nPerm = 1000, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, method = "multilevel", adaptive = FALSE, minPerm = 1000, maxPerm = 10000, ... )
geneList |
order ranked geneList |
ont |
one of "HDO", "HPO" or "MPO" |
organism |
one of "hsa" and "mmu" |
exponent |
weight of each step |
nPerm |
number of permutations |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
verbose |
print message or not |
method |
method of GSEA, one of "multilevel", "permute", "sample" |
adaptive |
logical, use adaptive permutation or not (default: FALSE) |
minPerm |
minimum number of permutations for adaptive mode (default: 1000) |
maxPerm |
maximum number of permutations for adaptive mode (default: 10000) |
... |
other parameter |
gseaResult object
Guangchuang Yu
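To make the `geneList` and `exponent` arguments more concrete, the sketch below computes the classic weighted running-sum enrichment score for one gene set. `enrichment_score` is a hypothetical helper written for this illustration and is not part of DOSE; the gene names and statistics are invented, and the significance step (permutation or multilevel estimation) is omitted.

```r
# Hypothetical helper (not part of DOSE): classic weighted running-sum score.
enrichment_score <- function(geneList, geneSet, exponent = 1) {
  geneList <- sort(geneList, decreasing = TRUE)  # ranked, largest statistic first
  hit      <- names(geneList) %in% geneSet
  w_hit    <- abs(geneList)^exponent * hit       # weight of each step at a hit
  p_hit    <- cumsum(w_hit) / sum(w_hit)
  p_miss   <- cumsum(!hit) / sum(!hit)
  running  <- p_hit - p_miss
  running[which.max(abs(running))]               # signed maximum deviation from zero
}

# Invented named statistics, deliberately unsorted; names are gene identifiers.
geneList <- c(g3 = 1, g1 = 3, g5 = -2, g2 = 2, g4 = -1)

es_top    <- enrichment_score(geneList, geneSet = c("g1", "g2"))
es_bottom <- enrichment_score(geneList, geneSet = c("g4", "g5"))
print(unname(c(es_top, es_bottom)))
```

A set concentrated at the top of the ranking yields a positive score, and a set concentrated at the bottom yields a negative one. The helper sorts its input, but the ranked `geneList` argument is described above as already ordered.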
perform gene set enrichment analysis (GSEA)
gseNCG( geneList, exponent = 1, nPerm = 1000, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, method = "multilevel", adaptive = FALSE, minPerm = 1000, maxPerm = 10000, ... )
geneList |
order ranked geneList |
exponent |
weight of each step |
nPerm |
number of permutations |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
verbose |
print message or not |
method |
method of GSEA, one of "multilevel", "permute", "sample" |
adaptive |
logical, use adaptive permutation or not (default: FALSE) |
minPerm |
minimum number of permutations for adaptive mode (default: 1000) |
maxPerm |
maximum number of permutations for adaptive mode (default: 10000) |
... |
other parameter |
gseaResult object
Guangchuang Yu
Pairwise semantic similarity for a list of gene clusters
mclusterSim( clusters, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
clusters |
A list of gene clusters |
ont |
one of "HDO", "HPO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
similarity matrix
Guangchuang Yu
## Not run:
cluster1 <- c("835", "5261","241")
cluster2 <- c("578","582")
cluster3 <- c("307", "308", "317")
clusters <- list(a=cluster1, b=cluster2, c=cluster3)
mclusterSim(clusters, measure="Wang")
## End(Not run)
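The shape of the returned similarity matrix (one row and one column per named cluster) can be sketched without the package. The example below fills such a matrix using a Jaccard index as a stand-in score; this is explicitly not the ontology-based measure DOSE computes, and the cluster contents are arbitrary strings chosen so that some clusters overlap.

```r
# Toy clusters; the IDs are arbitrary strings, not real gene annotations.
clusters <- list(a = c("1", "2", "3"), b = c("2", "3", "4"), c = c("9"))

# Stand-in score (Jaccard index). DOSE uses ontology-based measures instead.
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

n   <- length(clusters)
sim <- matrix(NA_real_, n, n,
              dimnames = list(names(clusters), names(clusters)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- jaccard(clusters[[i]], clusters[[j]])
  }
}
print(sim)
```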
plotting similarity matrix
simplot( sim, xlab = "", ylab = "", color.low = "white", color.high = "red", labs = TRUE, digits = 2, labs.size = 3, font.size = 14 )
sim |
similarity matrix |
xlab |
x-axis label |
ylab |
y-axis label |
color.low |
color of low value |
color.high |
color of high value |
labs |
logical, add text label or not |
digits |
number of digits to round to |
labs.size |
label size |
font.size |
font size |
ggplot object
Yu Guangchuang
ggplot theme of DOSE
theme_dose(font.size = 14)
font.size |
font size |
ggplot theme
library(ggplot2)
qplot(1:10) + theme_dose()