Title: | Disease Ontology Semantic and Enrichment analysis |
---|---|
Description: | This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data. |
Authors: | Guangchuang Yu [aut, cre], Li-Gen Wang [ctb], Vladislav Petyuk [ctb], Giovanni Dall'Olio [ctb] |
Maintainer: | Guangchuang Yu <[email protected]> |
License: | Artistic-2.0 |
Version: | 4.1.0 |
Built: | 2024-12-29 05:07:52 UTC |
Source: | https://github.com/bioc/DOSE |
semantic similarity between two gene clusters
clusterSim( cluster1, cluster2, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
cluster1 |
a vector of gene IDs |
cluster2 |
another vector of gene IDs |
ont |
one of "HDO", "HPO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
One of "Resnik", "Lin", "Rel", "Jiang" and "Wang" methods. |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
given two gene clusters, this function calculates semantic similarity between them.
similarity
Yu Guangchuang
## Not run:
library(DOSE)
cluster1 <- c("835", "5261", "241", "994")
cluster2 <- c("307", "308", "317", "321", "506", "540", "378", "388", "396")
clusterSim(cluster1, cluster2)
## End(Not run)
Class "compareClusterResult". This class represents the result of comparing gene clusters by GO categories at a specific level or by GO enrichment analysis.
compareClusterResult
cluster comparing result
geneClusters
a list of genes
fun
one of groupGO, enrichGO and enrichKEGG
gene2Symbol
gene ID to Symbol
keytype
Gene ID type
readable
logical flag indicating whether gene IDs are shown as symbols.
.call
function call
termsim
similarity between terms
method
method of calculating the similarity between nodes
dr
dimension reduction result
Guangchuang Yu https://yulab-smu.top
compute information content
computeIC(ont = "HDO")
ont |
one of "HDO", "HPO" and "MPO" |
Guangchuang Yu https://yulab-smu.top
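The information content (IC) of a term is commonly defined as the negative log of its annotation probability. The base R sketch below shows that definition on made-up counts; the term names and counts are hypothetical, and whether computeIC applies exactly this formula is an assumption rather than something stated in this manual.

```r
# Toy annotation counts (hypothetical): number of genes annotated to each
# term, including annotations inherited from descendant terms.
term_counts <- c(root = 100, disease_A = 40, disease_B = 10, disease_C = 1)

# Probability of a term: its count relative to the root (all annotated genes).
p_term <- term_counts / term_counts[["root"]]

# Information content: rarer terms carry more information.
ic <- -log(p_term)
print(round(ic, 3))
```

The root term gets an IC of zero because every annotated gene maps to it, and the IC grows as terms become more specific.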
measuring similarities between two DO term vectors.
doseSim(DOID1, DOID2, measure = "Wang", ont = "HDO")
doSim(DOID1, DOID2, measure = "Wang", ont = "HDO")
DOID1 |
DO term, MPO term or HPO term vector |
DOID2 |
DO term, MPO term or HPO term vector |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", "Lin", and "TCSS". |
ont |
one of "HDO", "HPO" and "MPO" |
given two term vectors, this function calculates their pairwise similarities.
score matrix
Guangchuang Yu https://yulab-smu.top
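To illustrate what the information-content based measures compute, the sketch below derives the Resnik and Lin scores for two terms in a toy hierarchy, using their textbook definitions. The IC values, the term names and the helper mica_ic are hypothetical, and the code is not taken from the package.

```r
# Toy IC values (hypothetical) for four terms.
ic <- c(root = 0, cancer = 1.2, lung_cancer = 2.5, breast_cancer = 2.8)

# Ancestors of each term, including the term itself (hypothetical hierarchy).
ancestors <- list(
  lung_cancer   = c("lung_cancer", "cancer", "root"),
  breast_cancer = c("breast_cancer", "cancer", "root")
)

# IC of the most informative common ancestor (MICA) of two terms.
mica_ic <- function(t1, t2) {
  common <- intersect(ancestors[[t1]], ancestors[[t2]])
  max(ic[common])
}

# Resnik: the IC of the MICA itself.
resnik <- function(t1, t2) mica_ic(t1, t2)

# Lin: the MICA's IC scaled by the ICs of the two terms.
lin <- function(t1, t2) 2 * mica_ic(t1, t2) / (ic[[t1]] + ic[[t2]])

resnik("lung_cancer", "breast_cancer")
lin("lung_cancer", "breast_cancer")
```

With these definitions the Lin score of a term with itself is 1, while the Resnik score is unbounded and depends on how specific the shared ancestor is.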
given a vector of genes, this function will return the enriched DisGeNET categories with FDR control
enrichDGN( gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by DisGeNET category for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
An enrichResult instance
Guangchuang Yu
Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database bav028 http://database.oxfordjournals.org/content/2015/bav028.long
Enrichment analysis based on the DisGeNET (http://www.disgenet.org/)
enrichDGNv( snp, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
snp |
a vector of SNP IDs |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by DisGeNET category for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
given a vector of SNPs, this function will return the enriched DisGeNET categories with FDR control
An enrichResult instance
Guangchuang Yu
Janet et al. (2015) DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database bav028 http://database.oxfordjournals.org/content/2015/bav028.long
Given a vector of genes, this function will return the enrichment DO categories with FDR control.
enrichDO( gene, ont = "HDO", organism = "hsa", pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
ont |
one of "HDO", "HPO" or "MPO". |
organism |
one of "hsa" and "mmu" |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by ontology term for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
library(DOSE)
data(geneList)
gene <- names(geneList)[geneList > 1]
yy <- enrichDO(gene, pvalueCutoff = 0.05)
summary(yy)
internal method for enrichment analysis
enricher_internal( gene, pvalueCutoff, pAdjustMethod = "BH", universe = NULL, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, USER_DATA )
gene |
a vector of entrez gene id. |
pvalueCutoff |
Cutoff value of pvalue. |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes. By default, the 'universe' is intersected with the genes that have annotations. Users can set 'options(enrichment_force_universe = TRUE)' to leave the 'universe' untouched. |
minGSSize |
minimal size of genes annotated by Ontology term for testing. |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
cutoff of qvalue |
USER_DATA |
ontology information |
using the hypergeometric model
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
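The hypergeometric model behind this kind of over-representation test can be sketched in base R with phyper and p.adjust. The counts and the extra p-values below are invented for illustration, and applying the cutoff to the adjusted p-values is an assumption about the filtering step, not a description of the package internals.

```r
# Hypothetical toy numbers for one disease term.
N <- 10000  # genes in the background (universe)
M <- 200    # background genes annotated to the term
n <- 100    # genes in the query list
k <- 8      # query genes annotated to the term

# Over-representation p-value: P(X >= k) under the hypergeometric distribution.
p_term <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Adjust a vector of raw p-values; the first entry is the one computed above,
# the remaining values are made up to stand in for other terms.
p_raw <- c(p_term, 0.01, 0.04, 0.30, 0.75)
p_adj <- p.adjust(p_raw, method = "BH")

# Assumed filtering step: keep terms whose adjusted p-value passes the cutoff.
keep <- p_adj < 0.05
print(data.frame(p_raw, p_adj, keep))
```

Note the `k - 1` in the call: phyper with `lower.tail = FALSE` returns P(X > q), so subtracting one gives the probability of observing at least k annotated genes.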
Enrichment analysis based on the Network of Cancer Genes database (http://ncg.kcl.ac.uk/)
enrichNCG( gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, readable = FALSE )
gene |
a vector of entrez gene id |
pvalueCutoff |
pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes |
minGSSize |
minimal size of genes annotated by NCG category for testing |
maxGSSize |
maximal size of each geneSet for analyzing |
qvalueCutoff |
qvalue cutoff |
readable |
whether to map gene IDs to gene names |
given a vector of genes, this function will return the enriched NCG categories with FDR control
An enrichResult instance
Guangchuang Yu
Class "enrichResult". This class represents the result of enrichment analysis.
result
enrichment analysis
pvalueCutoff
pvalueCutoff
pAdjustMethod
pvalue adjust method
qvalueCutoff
qvalueCutoff
organism
only "human" supported
ontology
biological ontology
gene
Gene IDs
keytype
Gene ID type
universe
background gene
gene2Symbol
mapping gene to Symbol
geneSets
gene sets
readable
logical flag indicating whether gene IDs are shown as symbols.
termsim
similarity between terms
method
method of calculating the similarity between nodes
dr
dimension reduction result
Guangchuang Yu https://yulab-smu.top
mapping gene ID to gene Symbol
EXTID2NAME(OrgDb, geneID, keytype)
OrgDb |
OrgDb |
geneID |
entrez gene ID |
keytype |
keytype |
gene symbol
Guangchuang Yu https://yulab-smu.top
given gene IDs, this function converts them to the corresponding DO terms
gene2DO(gene, organism = "hsa", ont = "HDO")
gene |
entrez gene ID |
organism |
one of "hsa" and "mmu" |
ont |
one of "HDO", "HPO" and "MPO" |
DO Terms
Guangchuang Yu https://yulab-smu.top
geneID generic
geneID(x)
x |
enrichResult object |
'geneID' returns the 'geneID' column of the enriched result, which can be converted to a data.frame via 'as.data.frame'
library(DOSE)
data(geneList, package = "DOSE")
de <- names(geneList)[1:100]
x <- enrichDO(de)
geneID(x)
geneInCategory generic
geneInCategory(x)
x |
enrichResult |
'geneInCategory' returns a list of genes, obtained by splitting the input gene vector into the enriched functional categories
library(DOSE)
data(geneList, package = "DOSE")
de <- names(geneList)[1:100]
x <- enrichDO(de)
geneInCategory(x)
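The relationship between the 'geneID' column and the per-category gene list can be sketched in base R without running an enrichment. The data frame below is a hypothetical stand-in for a result table, and the "/"-separated encoding of the 'geneID' column is an assumption about how the genes are stored.

```r
# Hypothetical stand-in for the result table of an enrichResult.
res <- data.frame(
  ID = c("term_A", "term_B"),
  geneID = c("835/5261/241", "994/835"),
  stringsAsFactors = FALSE
)

# Split each "/"-separated string into a character vector, named by term ID.
genes_by_term <- setNames(strsplit(res$geneID, "/", fixed = TRUE), res$ID)

# Number of input genes falling into each category.
lengths(genes_by_term)
```

A gene can appear under several terms, so the list entries overlap whenever categories share genes.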
measuring similarities between two gene vectors.
geneSim( geneID1, geneID2 = NULL, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
geneID1 |
entrez gene vector |
geneID2 |
entrez gene vector |
ont |
one of "HDO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
given two entrez gene vectors, this function calculates their similarity.
score matrix
Guangchuang Yu https://yulab-smu.top
library(DOSE)
g <- c("835", "5261", "241", "994")
geneSim(g)
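The four 'combine' options reduce a term-by-term similarity matrix to a single gene-level score. The base R sketch below follows the definitions commonly given for these methods (as in the GOSemSim documentation); the helper combine_scores and the matrix values are hypothetical, and the package's own implementation may differ in details such as the handling of missing values.

```r
# Toy similarity matrix: rows are terms of gene 1, columns are terms of gene 2.
sim <- matrix(c(0.9, 0.2, 0.1,
                0.3, 0.6, 0.4),
              nrow = 2, byrow = TRUE)

# Hypothetical helper illustrating the four combination strategies.
combine_scores <- function(sim, combine = c("max", "avg", "rcmax", "BMA")) {
  combine <- match.arg(combine)
  row_max <- apply(sim, 1, max)  # best match for each term of gene 1
  col_max <- apply(sim, 2, max)  # best match for each term of gene 2
  switch(combine,
    max   = max(sim),                                   # best single pair
    avg   = mean(sim),                                  # mean over all pairs
    rcmax = max(mean(row_max), mean(col_max)),          # larger of the two best-match means
    BMA   = (sum(row_max) + sum(col_max)) / (nrow(sim) + ncol(sim))  # best-match average
  )
}

sapply(c("max", "avg", "rcmax", "BMA"), function(m) combine_scores(sim, m))
```

"BMA" averages the best matches in both directions, which makes it less sensitive than "max" to a single high-scoring term pair.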
generic function for gene set enrichment analysis
GSEA_internal( geneList, exponent, minGSSize, maxGSSize, eps, pvalueCutoff, pAdjustMethod, verbose, seed = FALSE, USER_DATA, by = "fgsea", ... )
geneList |
ranked geneList, sorted in decreasing order |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of each geneSet for analyzing |
eps |
This parameter sets the boundary for calculating the p value. |
pvalueCutoff |
p value Cutoff |
pAdjustMethod |
p value adjustment method |
verbose |
print message or not |
seed |
set seed inside the function to make result reproducible. FALSE by default. |
USER_DATA |
annotation data |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
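The role of the 'exponent' argument ("weight of each step") can be illustrated with the standard weighted running-sum enrichment score. The sketch below is a plain base R version of that statistic on a toy ranked list; the helper enrichment_score and the data are hypothetical, and this is not the code path used by either the 'fgsea' or the 'DOSE' backend.

```r
# Toy ranked gene list: named numeric vector sorted in decreasing order.
gene_list <- sort(c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3),
                  decreasing = TRUE)
gene_set <- c("g1", "g2", "g5")

# Hypothetical helper computing the weighted running-sum enrichment score.
enrichment_score <- function(gene_list, gene_set, exponent = 1) {
  in_set <- names(gene_list) %in% gene_set
  # Hits step up by |score|^exponent, normalised so all hit steps sum to 1.
  hit <- ifelse(in_set, abs(gene_list)^exponent, 0)
  hit <- hit / sum(hit)
  # Misses step down by a constant, normalised so all miss steps sum to 1.
  miss <- ifelse(in_set, 0, 1 / sum(!in_set))
  running <- cumsum(hit - miss)
  # The score is the running-sum value farthest from zero.
  running[which.max(abs(running))]
}

enrichment_score(gene_list, gene_set)                # weighted by the scores
enrichment_score(gene_list, gene_set, exponent = 0)  # unweighted steps
```

With `exponent = 0` every hit contributes equally, so the statistic reduces to the unweighted form; larger exponents give more weight to genes at the extremes of the ranking.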
Class "gseaResult". This class represents the result of GSEA analysis.
result
GSEA analysis
organism
organism
setType
setType
geneSets
geneSets
geneList
ranked geneList, sorted in decreasing order
keytype
ID type of gene
permScores
permutation scores
params
parameters
gene2Symbol
gene ID to Symbol
readable
whether to convert gene IDs to symbols
dr
dimension reduction result
Guangchuang Yu https://yulab-smu.top
perform GSEA analysis with DisGeNET gene sets
gseDGN( geneList, exponent = 1, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
ranked geneList, sorted in decreasing order |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
p value adjustment method |
verbose |
print message or not |
seed |
logical; whether to set a seed inside the function to make the result reproducible. FALSE by default. |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
perform GSEA analysis with disease ontology gene sets
gseDO( geneList, ont = "HDO", organism = "hsa", exponent = 1, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
ranked geneList, sorted in decreasing order |
ont |
one of "HDO", "HPO" or "MPO" |
organism |
one of "hsa" and "mmu" |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
p value adjustment method |
verbose |
print message or not |
seed |
logical; whether to set a seed inside the function to make the result reproducible. FALSE by default. |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
perform GSEA analysis with Network of Cancer Genes (NCG) gene sets
gseNCG( geneList, exponent = 1, minGSSize = 10, maxGSSize = 500, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
ranked geneList, sorted in decreasing order |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of each geneSet for analyzing |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
p value adjustment method |
verbose |
print message or not |
seed |
logical; whether to set a seed inside the function to make the result reproducible. FALSE by default. |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
filter enriched result by gene set size or gene count
gsfilter(x, by = "GSSize", min = NA, max = NA)
x |
instance of enrichResult or compareClusterResult |
by |
one of 'GSSize' or 'Count' |
min |
minimal size |
max |
maximal size |
updated object
Guangchuang Yu
Pairwise semantic similarity for a list of gene clusters
mclusterSim( clusters, ont = "HDO", organism = "hsa", measure = "Wang", combine = "BMA" )
clusters |
A list of gene clusters |
ont |
one of "HDO", "HPO" and "MPO" |
organism |
one of "hsa" and "mmu" |
measure |
one of "Wang", "Resnik", "Rel", "Jiang", and "Lin". |
combine |
One of "max", "avg", "rcmax", "BMA" methods, for combining semantic similarity scores of multiple DO terms associated with gene/protein. |
similarity matrix
Guangchuang Yu
## Not run:
library(DOSE)
cluster1 <- c("835", "5261", "241")
cluster2 <- c("578", "582")
cluster3 <- c("307", "308", "317")
clusters <- list(a = cluster1, b = cluster2, c = cluster3)
mclusterSim(clusters, measure = "Wang")
## End(Not run)
parse character ratio to double value, such as 1/5 to 0.2
parse_ratio(ratio)
ratio |
character vector of ratio to parse |
A numeric vector (double) of parsed ratio
Guangchuang Yu
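The conversion described here, from a character ratio such as "1/5" to the number 0.2, can be sketched in base R as follows. The function name parse_ratio_sketch is hypothetical and the code is an illustration of the idea, not the package's implementation.

```r
# Hypothetical re-implementation of the idea: "numerator/denominator" -> double.
parse_ratio_sketch <- function(ratio) {
  parts <- strsplit(ratio, "/", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) / as.numeric(p[2]),
         numeric(1))
}

parse_ratio_sketch(c("1/5", "12/480"))
```

This is the kind of conversion needed to sort or filter on ratio columns that are stored as text.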
mapping geneID to gene Symbol
setReadable(x, OrgDb, keyType = "auto")
x |
enrichResult Object |
OrgDb |
OrgDb |
keyType |
keyType of gene |
enrichResult Object
Yu Guangchuang
show method for gseaResult and enrichResult instances
show(object)
object |
A gseaResult or enrichResult instance |
message
Guangchuang Yu https://yulab-smu.top
plotting similarity matrix
simplot( sim, xlab = "", ylab = "", color.low = "white", color.high = "red", labs = TRUE, digits = 2, labs.size = 3, font.size = 14 )
sim |
similarity matrix |
xlab |
xlab |
ylab |
ylab |
color.low |
color of low value |
color.high |
color of high value |
labs |
logical, add text label or not |
digits |
number of digits to round to |
labs.size |
label size |
font.size |
font size |
ggplot object
Yu Guangchuang
summary method for gseaResult and enrichResult instances
summary(object, ...)
object |
A gseaResult or enrichResult instance |
... |
additional parameter |
A data frame
Guangchuang Yu https://guangchuangyu.github.io
ggplot theme of DOSE
theme_dose(font.size = 14)
font.size |
font size |
ggplot theme
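library(ggplot2)
library(DOSE)
# qplot() is deprecated since ggplot2 3.4.0; ggplot() draws the same histogram.
ggplot(data.frame(x = 1:10), aes(x = x)) + geom_histogram() + theme_dose()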