Title: | A universal enrichment tool for interpreting omics data |
---|---|
Description: | This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions. |
Authors: | Guangchuang Yu [aut, cre, cph] , Li-Gen Wang [ctb], Xiao Luo [ctb], Meijun Chen [ctb], Giovanni Dall'Olio [ctb], Wanqian Wei [ctb], Chun-Hui Gao [ctb] |
Maintainer: | Guangchuang Yu <[email protected]> |
License: | Artistic-2.0 |
Version: | 4.15.1 |
Built: | 2024-12-30 03:21:52 UTC |
Source: | https://github.com/bioc/clusterProfiler |
add KEGG pathway category information
append_kegg_category(x)
x |
KEGG enrichment result |
This function appends the KEGG pathway category information to a KEGG enrichment result (the output of either 'enrichKEGG' or 'gseKEGG').
update KEGG enrichment result with category information
Guangchuang Yu
Biological Id TRanslator
bitr(geneID, fromType, toType, OrgDb, drop = TRUE)
geneID |
input gene id |
fromType |
input id type |
toType |
output id type |
OrgDb |
annotation db |
drop |
drop NA or not |
data.frame
Guangchuang Yu
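As a hedged illustration of the bitr() signature above, the following sketch converts a few gene symbols to Entrez IDs. It assumes the clusterProfiler and org.Hs.eg.db packages are installed; the three symbols and the "SYMBOL" and "ENTREZID" key types are illustrative choices, not taken from this page (idType() lists the ID types an OrgDb supports).

```r
# Hypothetical input: three human gene symbols chosen for illustration.
genes <- c("TP53", "BRCA1", "PTCH1")

# Run the conversion only when both packages are available.
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  ids <- clusterProfiler::bitr(
    genes,
    fromType = "SYMBOL",       # assumed input ID type
    toType   = "ENTREZID",     # assumed output ID type
    OrgDb    = "org.Hs.eg.db",
    drop     = TRUE            # drop genes that fail to map
  )
  print(ids)                   # a data.frame, per the documentation above
}
```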
convert biological ID using KEGG API
bitr_kegg(geneID, fromType, toType, organism, drop = TRUE)
geneID |
input gene id |
fromType |
input id type |
toType |
output id type |
organism |
supported organism; can be searched using the search_kegg_organism function |
drop |
drop NA or not |
data.frame
Guangchuang Yu
open KEGG pathway with web browser
browseKEGG(x, pathID)
x |
an instance of enrichResult or gseaResult |
pathID |
pathway ID |
url
Guangchuang Yu
Given a list of gene sets, this function will compute profiles of each gene cluster.
compareCluster( geneClusters, fun = "enrichGO", data = "", source_from = NULL, ... )
geneClusters |
a list of Entrez gene IDs. Alternatively, a formula of type Entrez~group, as used in the examples |
fun |
One of "groupGO", "enrichGO", "enrichKEGG", "enrichDO" or "enrichPathway" . Users can also supply their own function. |
data |
if geneClusters is a formula, the data from which the clusters must be extracted. |
source_from |
If using a custom function in "fun", provide the source package as a string here. Otherwise, the function will be obtained from the global environment. |
... |
Other arguments. |
A compareClusterResult instance.
Guangchuang Yu https://yulab-smu.top
compareClusterResult-class, groupGO, enrichGO
## Not run:
data(gcSample)
xx <- compareCluster(gcSample, fun="enrichKEGG",
                     organism="hsa", pvalueCutoff=0.05)
as.data.frame(xx)
# plot(xx, type="dot", caption="KEGG Enrichment Comparison")
dotplot(xx)

## formula interface
mydf <- data.frame(Entrez=c('1', '100', '1000', '100101467',
                            '100127206', '100128071'),
                   logFC = c(1.1, -0.5, 5, 2.5, -3, 3),
                   group = c('A', 'A', 'A', 'B', 'B', 'B'),
                   othergroup = c('good', 'good', 'bad', 'bad', 'good', 'bad'))
xx.formula <- compareCluster(Entrez~group, data=mydf,
                             fun='groupGO', OrgDb='org.Hs.eg.db')
as.data.frame(xx.formula)

## formula interface with more than one grouping variable
xx.formula.twogroups <- compareCluster(Entrez~group+othergroup, data=mydf,
                                       fun='groupGO', OrgDb='org.Hs.eg.db')
as.data.frame(xx.formula.twogroups)
## End(Not run)
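To make the formula interface more concrete, the sketch below shows a rough base-R analogue of how a formula such as Entrez~group partitions gene IDs into clusters. It only illustrates the grouping idea: it is not the package's implementation, and the cluster names produced by compareCluster itself may differ.

```r
# Same toy data as in the example above (logFC omitted for brevity).
mydf <- data.frame(
  Entrez     = c("1", "100", "1000", "100101467", "100127206", "100128071"),
  group      = c("A", "A", "A", "B", "B", "B"),
  othergroup = c("good", "good", "bad", "bad", "good", "bad"),
  stringsAsFactors = FALSE
)

# One grouping variable: one gene vector per level of 'group'.
clusters <- split(mydf$Entrez, mydf$group)

# Two grouping variables: one gene vector per observed combination.
clusters2 <- split(
  mydf$Entrez,
  interaction(mydf$group, mydf$othergroup, drop = TRUE)
)

str(clusters)
str(clusters2)
```

A named list like `clusters` has the same shape as the geneClusters argument described above.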
Datasets gcSample contains a sample of gene clusters.
Datasets kegg_species contains KEGG species information
Datasets kegg_category contains KEGG pathway category information
Datasets DE_GSE8057 contains differentially expressed genes obtained from the GSE8057 dataset
download the latest version of KEGG pathway/module
download_KEGG(species, keggType = "KEGG", keyType = "kegg")
species |
species |
keggType |
one of 'KEGG' or 'MKEGG' |
keyType |
supported keyType, see bitr_kegg |
list
Guangchuang Yu
drop GO term of specific level or specific terms (mostly too general).
dropGO(x, level = NULL, term = NULL)
x |
an instance of 'enrichResult' or 'compareClusterResult' |
level |
GO level |
term |
GO term |
modified version of x
Guangchuang Yu
enrichment analysis by DAVID
enrichDAVID( gene, idType = "ENTREZ_GENE_ID", universe, minGSSize = 10, maxGSSize = 500, annotation = "GOTERM_BP_FAT", pvalueCutoff = 0.05, pAdjustMethod = "BH", qvalueCutoff = 0.2, species = NA, david.user )
gene |
input gene |
idType |
id type |
universe |
background genes. If missing, all genes listed in the database (e.g. the TERM2GENE table) will be used as background. |
minGSSize |
minimal size of genes annotated for testing |
maxGSSize |
maximal size of genes annotated for testing |
annotation |
david annotation |
pvalueCutoff |
adjusted pvalue cutoff on enrichment tests to report |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
qvalueCutoff |
qvalue cutoff on enrichment tests to report as significant. Tests must pass i) |
species |
species |
david.user |
david user |
An enrichResult instance
Guangchuang Yu
A universal enrichment analyzer
enricher( gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", universe = NULL, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, gson = NULL, TERM2GENE, TERM2NAME = NA )
gene |
a vector of gene id |
pvalueCutoff |
adjusted pvalue cutoff on enrichment tests to report |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes. If missing, all genes listed in the database (e.g. the TERM2GENE table) will be used as background. |
minGSSize |
minimal size of genes annotated for testing |
maxGSSize |
maximal size of genes annotated for testing |
qvalueCutoff |
qvalue cutoff on enrichment tests to report as significant. Tests must pass i) |
gson |
a GSON object, if not NULL, use it as annotation data. |
TERM2GENE |
user input annotation of TERM TO GENE mapping, a two-column data.frame with term and gene. Only used when gson is NULL. |
TERM2NAME |
user input of TERM TO NAME mapping, a two-column data.frame with term and name. Only used when gson is NULL. |
An enrichResult instance
Guangchuang Yu https://yulab-smu.top
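The sketch below illustrates the TERM2GENE layout and the kind of test that over-representation analysis typically relies on. It assumes a hypergeometric test and a background made of all genes in the TERM2GENE table (as described for the universe argument); it is a minimal illustration, not a copy of the package internals. All gene and term names are made up, and the final enricher() call runs only if clusterProfiler is installed, with the size and cutoff arguments relaxed so the toy sets are not filtered out.

```r
# Toy annotation (hypothetical): 100 genes split across three terms.
all_genes <- paste0("g", 1:100)
term2gene <- data.frame(
  term = rep(c("T1", "T2", "T3"), times = c(20, 40, 40)),
  gene = all_genes,
  stringsAsFactors = FALSE
)

# Hypothetical gene list of interest: 10 genes from T1 and 5 from T2.
gene <- c(paste0("g", 1:10), paste0("g", 50:54))

# Hypergeometric over-representation p-value for one term.
ora_pvalue <- function(term_id) {
  term_genes <- term2gene$gene[term2gene$term == term_id]
  N <- length(unique(term2gene$gene))  # background size
  M <- length(term_genes)              # genes annotated to the term
  n <- length(gene)                    # size of the gene list
  k <- sum(gene %in% term_genes)       # overlap with the term
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)  # P(X >= k)
}

pvals <- vapply(c("T1", "T2", "T3"), ora_pvalue, numeric(1))
print(pvals)

# Optional comparison with the package, if it is installed.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene, TERM2GENE = term2gene,
    minGSSize = 1, pvalueCutoff = 1, qvalueCutoff = 1
  )
  print(as.data.frame(res))
}
```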
GO Enrichment Analysis of a gene set. Given a vector of genes, this function will return the enriched GO categories after FDR control.
enrichGO( gene, OrgDb, keyType = "ENTREZID", ont = "MF", pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, qvalueCutoff = 0.2, minGSSize = 10, maxGSSize = 500, readable = FALSE, pool = FALSE )
gene |
a vector of entrez gene id. |
OrgDb |
OrgDb |
keyType |
keytype of input gene |
ont |
One of "BP", "MF", and "CC" subontologies, or "ALL" for all three. |
pvalueCutoff |
adjusted pvalue cutoff on enrichment tests to report |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes. If missing, all genes listed in the database (e.g. the TERM2GENE table) will be used as background. |
qvalueCutoff |
qvalue cutoff on enrichment tests to report as significant. Tests must pass i) |
minGSSize |
minimal size of genes annotated by Ontology term for testing. |
maxGSSize |
maximal size of genes annotated for testing |
readable |
whether to map gene IDs to gene names |
pool |
If ont='ALL', whether to pool the three GO sub-ontologies |
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
enrichResult-class, compareCluster
## Not run:
data(geneList, package = "DOSE")
de <- names(geneList)[1:100]
yy <- enrichGO(de, 'org.Hs.eg.db', ont="BP", pvalueCutoff=0.01)
head(yy)
## End(Not run)
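The pAdjustMethod values listed above are the same method names accepted by base R's p.adjust(). As a small sketch of what "FDR control" with "BH" means for a set of raw p-values, the block below adjusts five made-up p-values and flags those passing an illustrative cutoff; exactly how the package applies its cutoffs internally is not shown here.

```r
# Hypothetical raw p-values for five GO terms.
p <- c(0.001, 0.01, 0.02, 0.04, 0.2)

# Benjamini-Hochberg adjustment, corresponding to pAdjustMethod = "BH".
p_adj <- p.adjust(p, method = "BH")

# Flag terms whose adjusted p-value passes an illustrative cutoff.
cutoff <- 0.01
result <- data.frame(p = p, p.adjust = p_adj, keep = p_adj <= cutoff)
print(result)
```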
KEGG Enrichment Analysis of a gene set. Given a vector of genes, this function will return the enriched KEGG categories with FDR control.
enrichKEGG( gene, organism = "hsa", keyType = "kegg", pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2, use_internal_data = FALSE )
gene |
a vector of entrez gene id. |
organism |
supported organism listed in 'https://www.genome.jp/kegg/catalog/org_list.html' |
keyType |
one of "kegg", 'ncbi-geneid', 'ncbi-proteinid' and 'uniprot' |
pvalueCutoff |
adjusted pvalue cutoff on enrichment tests to report |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes. If missing, all genes listed in the database (e.g. the TERM2GENE table) will be used as background. |
minGSSize |
minimal size of genes annotated by Ontology term for testing. |
maxGSSize |
maximal size of genes annotated for testing |
qvalueCutoff |
qvalue cutoff on enrichment tests to report as significant. Tests must pass i) |
use_internal_data |
logical, use KEGG.db or latest online KEGG data |
An enrichResult instance.
Guangchuang Yu https://yulab-smu.top
enrichResult-class, compareCluster
## Not run:
data(geneList, package='DOSE')
de <- names(geneList)[1:100]
yy <- enrichKEGG(de, pvalueCutoff=0.01)
head(yy)
## End(Not run)
KEGG Module Enrichment Analysis of a gene set. Given a vector of genes, this function will return the enriched KEGG Module categories with FDR control.
enrichMKEGG( gene, organism = "hsa", keyType = "kegg", pvalueCutoff = 0.05, pAdjustMethod = "BH", universe, minGSSize = 10, maxGSSize = 500, qvalueCutoff = 0.2 )
gene |
a vector of entrez gene id. |
organism |
supported organism listed in 'https://www.genome.jp/kegg/catalog/org_list.html' |
keyType |
one of "kegg", 'ncbi-geneid', 'ncbi-proteinid' and 'uniprot' |
pvalueCutoff |
adjusted pvalue cutoff on enrichment tests to report |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
universe |
background genes. If missing, all genes listed in the database (e.g. the TERM2GENE table) will be used as background. |
minGSSize |
minimal size of genes annotated by Ontology term for testing. |
maxGSSize |
maximal size of genes annotated for testing |
qvalueCutoff |
qvalue cutoff on enrichment tests to report as significant. Tests must pass i) |
An enrichResult instance.
ORA analysis for Pathway Commons
enrichPC(gene, ...)
gene |
a vector of genes (either hgnc symbols or uniprot IDs) |
... |
additional parameters, see also the parameters supported by the enricher() function |
This function performs over-representation analysis using Pathway Commons
An enrichResult instance
ORA analysis for WikiPathways
enrichWP(gene, organism, ...)
gene |
a vector of entrez gene id |
organism |
supported organisms, which can be accessed via the get_wp_organisms() function |
... |
additional parameters, see also the parameters supported by the enricher() function |
This function performs over-representation analysis using WikiPathways
An enrichResult instance
Guangchuang Yu
list supported organism of WikiPathways
get_wp_organisms()
This function extracts information from 'https://data.wikipathways.org/current/gmt/' and lists all supported organisms
supported organism list
Guangchuang Yu
getPPI
getPPI( x, ID = 1, taxID = "auto", required_score = NULL, network_type = "functional", add_nodes = 0, show_query_node_labels = 0, output = "igraph" )
x |
an 'enrichResult' object or a vector of proteins, e.g. 'c("PTCH1", "TP53", "BRCA1", "BRCA2")' |
ID |
ID or index to extract genes in the enriched term(s) if 'x' is an 'enrichResult' object |
taxID |
NCBI taxon identifiers (e.g. Human is 9606; see [STRING organisms](https://string-db.org/cgi/input.pl?input_page_active_form=organisms)). |
required_score |
threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network) |
network_type |
network type: functional (default), physical |
add_nodes |
adds a number of proteins to the network based on their confidence score (default: 0) |
show_query_node_labels |
when available, use submitted names in the preferredName column (0 or 1; default: 0) |
output |
one of 'data.frame' or 'igraph' |
[Getting the STRING network interactions](https://string-db.org/cgi/help.pl?sessionId=btsvnCeNrBk7).
a 'data.frame' or an 'igraph' object
Yonghe Xia and modified by Guangchuang Yu
Convert species scientific name to taxonomic ID
getTaxID(species)
species |
scientific name of a species |
taxonomic ID
Guangchuang Yu
Query taxonomy information from 'stringdb' or 'ensembl' web services
getTaxInfo(species, source = "stringdb")
species |
scientific name of a species |
source |
one of 'stringdb' or 'ensembl' |
a 'data.frame' of query information
Guangchuang Yu
read GFF file and build gene information table
Gff2GeneTable(gffFile, compress = TRUE)
gffFile |
GFF file |
compress |
compress file or not |
Given a GFF file, this function extracts information from it and saves it in the working directory.
the saved file.
Yu Guangchuang
convert goid to ontology (BP, CC, MF)
go2ont(goid)
goid |
a vector of GO IDs |
data.frame
Guangchuang Yu
convert goid to descriptive term
go2term(goid)
goid |
a vector of GO IDs |
data.frame
Guangchuang Yu
filter GO enriched result at specific level
gofilter(x, level = 4)
x |
output from enrichGO or compareCluster |
level |
GO level |
updated object
Guangchuang Yu
Functional Profile of a gene set at specific GO level. Given a vector of genes, this function will return the GO profile at a specific level.
groupGO( gene, OrgDb, keyType = "ENTREZID", ont = "CC", level = 2, readable = FALSE )
gene |
a vector of entrez gene id. |
OrgDb |
OrgDb |
keyType |
key type of input gene |
ont |
One of "MF", "BP", and "CC" subontologies. |
level |
Specific GO Level. |
readable |
if readable is TRUE, the gene IDs will be mapped to gene symbols. |
A groupGOResult instance.
Guangchuang Yu https://yulab-smu.top
groupGOResult-class, compareCluster
data(gcSample)
yy <- groupGO(gcSample[[1]], 'org.Hs.eg.db', ont="BP", level=2)
head(summary(yy))
#plot(yy)
Class "groupGOResult". This class represents the result of the functional profile of a set of genes at a specific GO level.
result
GO classification result
ontology
Ontology
level
GO level
organism
one of "human", "mouse" and "yeast"
gene
Gene IDs
readable
logical flag indicating whether gene IDs have been converted to symbols.
Guangchuang Yu https://yulab-smu.top
compareClusterResult
compareCluster
groupGO
a universal gene set enrichment analysis tool
GSEA( geneList, exponent = 1, minGSSize = 10, maxGSSize = 500, eps = 1e-10, pvalueCutoff = 0.05, pAdjustMethod = "BH", gson = NULL, TERM2GENE, TERM2NAME = NA, verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
order ranked geneList |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of genes annotated for testing |
eps |
This parameter sets the boundary for calculating the p value. |
pvalueCutoff |
adjusted pvalue cutoff |
pAdjustMethod |
one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none" |
gson |
a GSON object, if not NULL, use it as annotation data. |
TERM2GENE |
user input annotation of TERM TO GENE mapping, a two-column data.frame with term and gene. Only used when gson is NULL. |
TERM2NAME |
user input of TERM TO NAME mapping, a two-column data.frame with term and name. Only used when gson is NULL. |
verbose |
logical |
seed |
logical |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Guangchuang Yu https://yulab-smu.top
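The "order ranked geneList" expected by GSEA() is, in common usage, a numeric vector of ranking statistics, named by gene ID and sorted in decreasing order. The sketch below builds such a vector from made-up statistics; the gene names, term names, and random values are all hypothetical, and the GSEA() call runs only if clusterProfiler is installed (with pvalueCutoff relaxed so the toy result is not filtered away).

```r
set.seed(42)  # reproducible toy statistics

# Hypothetical per-gene statistics, e.g. log fold changes.
stats <- data.frame(
  gene  = paste0("g", 1:100),
  logFC = rnorm(100),
  stringsAsFactors = FALSE
)

# Build the ranked list: numeric values, gene IDs as names, decreasing order.
geneList <- stats$logFC
names(geneList) <- stats$gene
geneList <- sort(geneList, decreasing = TRUE)
head(geneList)

# Toy TERM2GENE table (hypothetical terms covering the same genes).
term2gene <- data.frame(
  term = rep(c("T1", "T2", "T3"), times = c(20, 40, 40)),
  gene = paste0("g", 1:100),
  stringsAsFactors = FALSE
)

# Optional run with the package, if it is installed.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  gsea_res <- clusterProfiler::GSEA(
    geneList, TERM2GENE = term2gene,
    pvalueCutoff = 1, verbose = FALSE
  )
  print(as.data.frame(gsea_res))
}
```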
Gene Set Enrichment Analysis of Gene Ontology
gseGO( geneList, ont = "BP", OrgDb, keyType = "ENTREZID", exponent = 1, minGSSize = 10, maxGSSize = 500, eps = 1e-10, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
order ranked geneList |
ont |
one of "BP", "MF", and "CC" subontologies, or "ALL" for all three. |
OrgDb |
OrgDb |
keyType |
keytype of gene |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of genes annotated for testing |
eps |
This parameter sets the boundary for calculating the p value. |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
pvalue adjustment method |
verbose |
print message or not |
seed |
logical |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
Gene Set Enrichment Analysis of KEGG
gseKEGG( geneList, organism = "hsa", keyType = "kegg", exponent = 1, minGSSize = 10, maxGSSize = 500, eps = 1e-10, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, use_internal_data = FALSE, seed = FALSE, by = "fgsea", ... )
geneList |
order ranked geneList |
organism |
supported organism listed in 'https://www.genome.jp/kegg/catalog/org_list.html' |
keyType |
one of "kegg", 'ncbi-geneid', 'ncbi-proteinid' and 'uniprot' |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of genes annotated for testing |
eps |
This parameter sets the boundary for calculating the p value. |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
pvalue adjustment method |
verbose |
print message or not |
use_internal_data |
logical, use KEGG.db or latest online KEGG data |
seed |
logical |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
Gene Set Enrichment Analysis of KEGG Module
gseMKEGG( geneList, organism = "hsa", keyType = "kegg", exponent = 1, minGSSize = 10, maxGSSize = 500, eps = 1e-10, pvalueCutoff = 0.05, pAdjustMethod = "BH", verbose = TRUE, seed = FALSE, by = "fgsea", ... )
geneList |
order ranked geneList |
organism |
supported organism listed in 'https://www.genome.jp/kegg/catalog/org_list.html' |
keyType |
one of "kegg", 'ncbi-geneid', 'ncbi-proteinid' and 'uniprot' |
exponent |
weight of each step |
minGSSize |
minimal size of each geneSet for analyzing |
maxGSSize |
maximal size of genes annotated for testing |
eps |
This parameter sets the boundary for calculating the p value. |
pvalueCutoff |
pvalue Cutoff |
pAdjustMethod |
pvalue adjustment method |
verbose |
print message or not |
seed |
logical |
by |
one of 'fgsea' or 'DOSE' |
... |
other parameter |
gseaResult object
Yu Guangchuang
GSEA analysis for Pathway Commons
gsePC(geneList, ...)
geneList |
a ranked gene list |
... |
additional parameters, see also the parameters supported by the GSEA() function |
This function performs GSEA using Pathway Commons
A gseaResult
instance
GSEA analysis for WikiPathways
gseWP(geneList, organism, ...)
geneList |
ranked gene list |
organism |
supported organisms, which can be accessed via the get_wp_organisms() function |
... |
additional parameters, see also the parameters supported by the GSEA() function |
This function performs GSEA using WikiPathways
A gseaResult
instance
Guangchuang Yu
build GO annotation data from an OrgDb and store it in a 'GSON' object
gson_GO(OrgDb, keytype = "ENTREZID", ont = "BP")
OrgDb |
OrgDb |
keytype |
keytype of genes. |
ont |
one of "BP", "MF", "CC", and "ALL" |
a 'GSON' object
download the latest version of KEGG pathway data and store it in a 'GSON' object
gson_KEGG(species, KEGG_Type = "KEGG", keyType = "kegg")
species |
species |
KEGG_Type |
one of "KEGG" and "MKEGG" |
keyType |
one of "kegg", 'ncbi-geneid', 'ncbi-proteinid' and 'uniprot'. |
a 'GSON' object
Guangchuang Yu
The KEGG Mapper service can annotate protein sequences of novel species against the KO database. The resulting KO annotation must be converted into Pathway or Module annotation before it can be used in 'clusterProfiler'.
gson_KEGG_mapper( file, format = c("BLAST", "Ghost", "Kofam"), type = c("pathway", "module"), species = NULL, ... )
file |
the name of the file which comes from the KEGG Mapper service, see Details for file format |
format |
string indicating the format of the KEGG Mapper result |
type |
string indicating the annotation database |
species |
your species, NULL if ignored |
... |
pass to gson::gson() |
File is a two-column dataset with K numbers in the second column, optionally preceded by the user's identifiers in the first column. This is consistent with the output files of automatic annotation servers, BlastKOALA, GhostKOALA, and KofamKOALA. KOALA (KEGG Orthology And Links Annotation) is KEGG's internal annotation tool for K number assignment of KEGG GENES using SSEARCH computation. BlastKOALA and GhostKOALA assign K numbers to the user's sequence data by BLAST and GHOSTX searches, respectively, against a nonredundant set of KEGG GENES. KofamKOALA is a new member of the KOALA family available at GenomeNet using the HMM profile search, rather than the sequence similarity search, for K number assignment. see https://www.kegg.jp/blastkoala/, https://www.kegg.jp/ghostkoala/ and https://www.genome.jp/tools/kofamkoala/ for more information.
a gson instance
## Not run:
file = system.file('extdata', "kegg_mapper_blast.txt", package='clusterProfiler')
gson_KEGG_mapper(file, format = "BLAST", type = "pathway")
## End(Not run)
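To make the two-column file layout described above more tangible, the base-R sketch below writes a tiny made-up annotation file and reads it back, keeping only the sequences that received a K number. The sequence names and K number assignments are invented for illustration, and the assumption that unannotated sequences appear as rows with an empty second column is mine, not something stated on this page.

```r
# Write a tiny, made-up annotation file in the two-column layout.
tmp <- tempfile(fileext = ".txt")
writeLines(c("geneA\tK00844", "geneB", "geneC\tK01810"), tmp)

# Read user identifiers (column 1) and K numbers (column 2);
# fill = TRUE pads rows that lack a K number with an empty field.
ko_tab <- read.delim(
  tmp, header = FALSE, sep = "\t", fill = TRUE,
  col.names = c("id", "ko"), stringsAsFactors = FALSE
)

# Keep only sequences that were assigned a K number.
annotated <- ko_tab[!is.na(ko_tab$ko) & nzchar(ko_tab$ko), ]
print(annotated)
```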
Download the latest version of WikiPathways data and store it in a 'GSON' object
gson_WP(organism)
organism |
supported organism, which can be accessed via the get_wp_organisms() function. |
list ID types supported by annoDb
idType(OrgDb = "org.Hs.eg.db")
OrgDb |
annotation db |
character vector
Guangchuang Yu
convert ko ID to descriptive name
ko2name(ko)
ko |
ko ID |
data.frame
Guangchuang Yu
merge a list of enrichResult objects to compareClusterResult
merge_result(enrichResultList)
enrichResultList |
a list of enrichResult objects |
a compareClusterResult instance
Guangchuang Yu
plot GO graph
plotGOgraph( x, firstSigNodes = 10, useInfo = "all", sigForAll = TRUE, useFullNames = TRUE, ... )
x |
output of enrichGO or gseGO |
firstSigNodes |
number of significant nodes (rectangle nodes in the graph) |
useInfo |
additional info |
sigForAll |
if TRUE the score/p-value of all nodes in the DAG is shown, otherwise only score will be shown |
useFullNames |
logical |
... |
additional parameter of showSigOfNodes, please refer to topGO |
GO DAG graph
Guangchuang Yu
Parse gmt file from Pathway Commons
read.gmt.pc(gmtfile, output = "data.frame")
gmtfile |
A gmt file |
output |
one of 'data.frame' or 'GSON' |
This function parses a gmt file downloaded from Pathway Commons
A data.frame or a GSON object, depending on the value of 'output'
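For readers unfamiliar with the GMT layout, the sketch below parses a generic GMT file in base R into a two-column term/gene table of the kind used as TERM2GENE elsewhere in this manual. It assumes the common GMT convention (set name, description, then member genes, all tab-separated) and is not the implementation of read.gmt.pc; the helper name parse_gmt, the set names, and the file contents are hypothetical.

```r
# Write a tiny, made-up GMT file: name, description, then member IDs.
tmp <- tempfile(fileext = ".gmt")
writeLines(c(
  "set1\tdescription one\tP04637\tP38398",
  "set2\tdescription two\tQ13635"
), tmp)

# Hypothetical helper: turn each GMT line into (term, gene) rows.
parse_gmt <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  rows <- lapply(fields, function(f) {
    genes <- f[-(1:2)]  # drop the set name and the description
    data.frame(term = rep(f[1], length(genes)), gene = genes,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

term2gene <- parse_gmt(tmp)
print(term2gene)
```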
search kegg organism, listed in https://www.genome.jp/kegg/catalog/org_list.html
search_kegg_organism( str, by = "scientific_name", ignore.case = FALSE, use_internal_data = TRUE )
str |
string |
by |
one of 'kegg.code', 'scientific_name' and 'common_name' |
ignore.case |
TRUE or FALSE |
use_internal_data |
logical, use kegg_species.rda or latest online KEGG data |
data.frame
Guangchuang Yu
simplify output from enrichGO and gseGO by removing redundancy of enriched GO terms
simplify output from compareCluster by removing redundancy of enriched GO terms
## S4 method for signature 'enrichResult'
simplify( x, cutoff = 0.7, by = "p.adjust", select_fun = min, measure = "Wang", semData = NULL )
## S4 method for signature 'gseaResult'
simplify( x, cutoff = 0.7, by = "p.adjust", select_fun = min, measure = "Wang", semData = NULL )
## S4 method for signature 'compareClusterResult'
simplify( x, cutoff = 0.7, by = "p.adjust", select_fun = min, measure = "Wang", semData = NULL )
x |
output of enrichGO |
cutoff |
similarity cutoff |
by |
feature to select representative term, selected by 'select_fun' function |
select_fun |
function to select feature passed by 'by' parameter |
measure |
method to measure similarity |
semData |
GOSemSimDATA object |
updated enrichResult object
updated compareClusterResult object
Guangchuang Yu
Gwang-Jin Kim and Guangchuang Yu
issue #28 https://github.com/GuangchuangYu/clusterProfiler/issues/28
issue #162 https://github.com/GuangchuangYu/clusterProfiler/issues/162
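The interplay of cutoff, by and select_fun can be pictured with a simplified greedy sketch: among terms whose pairwise semantic similarity exceeds the cutoff, keep the one with the smallest adjusted p-value. The code below is only an illustration of that idea under my own assumptions; it is not the package's algorithm, and the function name simplify_sketch, the similarity matrix, and the p-values are all made up.

```r
# Hypothetical helper: greedy redundancy removal on a similarity matrix.
simplify_sketch <- function(p_adjust, sim, cutoff = 0.7) {
  kept <- character(0)
  # Visit terms from the smallest to the largest adjusted p-value.
  for (term in names(p_adjust)[order(p_adjust)]) {
    # Keep a term only if it is not too similar to any kept term.
    if (length(kept) == 0 || all(sim[term, kept] <= cutoff)) {
      kept <- c(kept, term)
    }
  }
  kept
}

# Made-up pairwise similarities between three terms.
sim <- matrix(
  c(1.0, 0.9, 0.2,
    0.9, 1.0, 0.3,
    0.2, 0.3, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Made-up adjusted p-values.
p_adjust <- c(A = 0.01, B = 0.001, C = 0.04)

kept_terms <- simplify_sketch(p_adjust, sim, cutoff = 0.7)
print(kept_terms)
```

In this toy input, terms A and B are highly similar, so only the one with the smaller adjusted p-value survives alongside the unrelated term C.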
retrieve annotation data from uniprot
uniprot_get(taxID)
taxID |
taxonomy ID |
gene table data frame
Guangchuang Yu