Title: | NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment |
---|---|
Description: | While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint to a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast. |
Authors: | Gulden Olgun [aut, cre] |
Maintainer: | Gulden Olgun <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.19.0 |
Built: | 2024-10-30 08:33:54 UTC |
Source: | https://github.com/bioc/NoRCE |
Annotate the set of genes with the GO terms for a given species and assembly
annGO( genes, GOtype = c("BP", "CC", "MF"), org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
annGO( genes, GOtype = c("BP", "CC", "MF"), org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
genes |
List of mRNA genes. Supported format for genes is Hugo. |
GOtype |
Hierarchical category of the GO ontology. Possible values are 'BP', 'CC', 'MF'. |
org_assembly |
Genome assembly of interest. Possible assemblies are 'mm10' for mouse, 'dre10' for zebrafish, 'rn6' for rat, 'dm6' for fruit fly, 'ce11' for worm, 'hg19' and 'hg38' for human |
data frame of the GO term annotation of the genes
Get the required information for the given assembly
assembly( org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
assembly( org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
setting required information
## Not run: assembly('hg19') ## End(Not run)
## Not run: assembly('hg19') ## End(Not run)
Differentially expressed non-coding gene
brain_disorder_ncRNA
brain_disorder_ncRNA
Not Available
http://resource.psychencode.org/
data(brain_disorder_ncRNA)
data(brain_disorder_ncRNA)
Differentially expressed human brain data
brain_mirna
brain_mirna
Not Available
http://resource.psychencode.org/
data(brain_mirna)
data(brain_mirna)
Protein coding genes that are differentially expressed in TCGA breast cancer RNAseq data.
breastmRNA
breastmRNA
Not Available
https://portal.gdc.cancer.gov/
data(breastmRNA)
data(breastmRNA)
Calculates the correlation coefficient values between two custom expression data.
calculateCorr( exp1, exp2, label1 = "", label2 = "", corrMethod = "pearson", varCutoff = 0.0025, corCutoff = 0.3, pcut = 0.05, alternate = "greater", conf = 0.95 )
calculateCorr( exp1, exp2, label1 = "", label2 = "", corrMethod = "pearson", varCutoff = 0.0025, corCutoff = 0.3, pcut = 0.05, alternate = "greater", conf = 0.95 )
exp1 |
Custom expression data matrix or SummarizedExperiment data. Columns must be genes and rows must be patients. |
exp2 |
Custom expression data matrix or SummarizedExperiment data. Columns must be genes and rows must be patients. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
corrMethod |
Correlation coeffient method that will be used for evaluation. Possible values are "pearson", "kendall", "spearman" |
varCutoff |
Variance cut off that genes have less variance than this value will be trimmed |
corCutoff |
Correlation cut off values for the given correlation method |
pcut |
P-value cut off for the correlation values |
alternate |
Holds the alternative hypothesis and "two.sided", "greater" or "less" are the possible values. |
conf |
Confidence level for the returned confidence interval. It is only used for the Pearson correlation coefficient if there are at least 4 complete pairs of observations. |
Pairwise relations between gene-gene with corresponding correlation value and pvalue
## Not run: #Assume that mirnanorce and mrnanorce are custom patient by gene data a<-calculateCorr(exp1 = mirna, exp2 = mrna ) ## End(Not run)
## Not run: #Assume that mirnanorce and mrnanorce are custom patient by gene data a<-calculateCorr(exp1 = mirna, exp2 = mrna ) ## End(Not run)
Convert gene ids according to the gene type
convertGeneID( genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), genelist, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
convertGeneID( genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), genelist, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
genetype |
Type of the input gene list. Possible values are "Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI". For HUGO gene symbol "NCBI" value, for Entrez gene id "Entrez", for mirbase id "mirna" is used. |
genelist |
Input gene list |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
GRange object of the given input
## Not run: convGene <-convertGeneID(genetype = "mirna", genelist = brain_mirna[1:30,], org_assembly = 'hg19') ## End(Not run)
## Not run: convGene <-convertGeneID(genetype = "mirna", genelist = brain_mirna[1:30,], org_assembly = 'hg19') ## End(Not run)
Convert gmt formatted pathway file to the Pathway ID, Entrez, symbol formatted data frame
convertGMT(gmtName, org_assembly, isSymbol = FALSE)
convertGMT(gmtName, org_assembly, isSymbol = FALSE)
gmtName |
Custom pathway gmt file |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
isSymbol |
Boolean variable that hold the gene format of the gmt file. If it is set as TRUE, gene format of the gmt file should be symbol. Otherwise, gene format should be ENTREZ ID. By default, it is FALSE. |
return data frame
Pearson correlation coefficient value of the miRNA genes between miRNA:mRNA for a given correlation cut-off and cancer.
corrbased(mirnagene, cancer, minAbsCor, databaseFile)
corrbased(mirnagene, cancer, minAbsCor, databaseFile)
mirnagene |
Data frame of the miRNA genes in mature format |
cancer |
Name of the TCGA project code such as 'BRCA' that is analyzed for miRNA-mRNA correlation. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
minAbsCor |
Cut-off value for the Pearson correlation coefficient of the miRNA-mRNA |
databaseFile |
Path of the miRcancer.db file |
Data frame of the miRNA-mRNA correlation result
Pearson correlation coefficient value of the mRNA genes between miRNA:mRNA for a given correlation cut-off and cancer.
corrbasedMrna(mRNAgene, cancer, minAbsCor, databaseFile)
corrbasedMrna(mRNAgene, cancer, minAbsCor, databaseFile)
mRNAgene |
Data frame of the mRNA genes |
cancer |
Name of the TCGA project code such as 'BRCA' that is analyzed for miRNA-mRNA correlation. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
minAbsCor |
Cut-off value for the Pearson correlation coefficient of the miRNA-mRNA |
databaseFile |
Path of miRcancer.db file |
Data frame of the miRNA-mRNA correlation result
Create interaction network for top n enriched GO term:coding RNA or GO-term:noncoding RNA interaction. Nodes are GO term and RNA, edges are interactions between them. Each GO-term is annotated and enriched with the mRNAs provided from the input list.
createNetwork( mrnaObject, type = "pvalue", n, isNonCode = FALSE, takeID = FALSE )
createNetwork( mrnaObject, type = "pvalue", n, isNonCode = FALSE, takeID = FALSE )
mrnaObject |
Output of enrichment results |
type |
Sort in terms of p-values or FDR. Possible values "pvalue", "padjust" |
n |
Number of top enrichments |
isNonCode |
Boolean value that checks whether node of the network is GO-term\& coding or GO-term\& noncoding genes. By default, it is FALSE so node of the network is GO-term\& coding gene. Otherwise, nodes are GO-term\& noncoding genes. |
takeID |
Boolean value that checks the name decision of the GO/pathway node, GO-term/pathway-term or GO ID-pathway ID. If it is true, name of the GO/pathway node will be GO ID/pathway ID will be used, otherwise, name of the GO/pathway node is GO-term. By default, it is FALSE. It is suggested to used when the GO-term is two long or the GO-term is missing for the custom enrichment database. |
Network
Draw dot plot of the enrichment object
drawDotPlot(mrnaObject, type = "pAdjust", n)
drawDotPlot(mrnaObject, type = "pAdjust", n)
mrnaObject |
Object of the enrichment result |
type |
Draw the dot plot according to the p-value or adjusted p-value ("pvalue", "pAdjust") |
n |
Number of GO terms or pathways, that ordered by type and has least number of top p-value |
Dot plot of the top n enrichment results
Get the biotype of the non-coding genes. It is suitable for the GENCODE gtf files
extractBiotype(gtfFile)
extractBiotype(gtfFile)
gtfFile |
Path of the input gtf file which contains biotype information. The gtf file must be provided from the Ensembl or Gencode site. For space efficiency, gft files should be in a zip format. |
Tabular form of the gtf file with the required features such as gene id and biotypes
## Not run: fileImport<-system.file("extdata", "temp.gtf", package = "NoRCE") gtf <- extractBiotype(gtfFile = fileImport) ## End(Not run)
## Not run: fileImport<-system.file("extdata", "temp.gtf", package = "NoRCE") gtf <- extractBiotype(gtfFile = fileImport) ## End(Not run)
Extract the genes that have user provided biotypes. This method is useful when input gene list is mixed or when research of the interest is only focused on specific group of genes.
filterBiotype(gtfFile, biotypes)
filterBiotype(gtfFile, biotypes)
gtfFile |
Input gtf file for the genes provided by the extractBiotype function |
biotypes |
Selected biotypes for the genes |
Table format of genes with a given biotypes
## Not run: biotypes <- c('unprocessed_pseudogene','transcribed_unprocessed_pseudogene') fileImport<-system.file("extdata", "temp.gtf", package = "NoRCE") extrResult <- filterBiotype(fileImport, biotypes) ## End(Not run)
## Not run: biotypes <- c('unprocessed_pseudogene','transcribed_unprocessed_pseudogene') fileImport<-system.file("extdata", "temp.gtf", package = "NoRCE") extrResult <- filterBiotype(fileImport, biotypes) ## End(Not run)
Given genes that fall in a given upstream and downstream region of mRNAs of interest, GO term enrichment analysis is carried out
geneGOEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), backG = "", backGType = "pc_gene", near = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
geneGOEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), backG = "", backGType = "pc_gene", near = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
gene |
Input genes other than miRNA |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
genetype |
Type of the input gene list. Possible values are "Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI". For HUGO gene symbol "NCBI" value, for Entrez gene id "Entrez" is used. |
backG |
The set of genes that tested against to the input (background gene) |
backGType |
Type of the background gene. If miRNA gene set is used for background gene, backGType should be set to the 'mirna' |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
GO term enrichment object for the given input
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') ## End(Not run)
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') ## End(Not run)
Given genes that fall in the given upstream and downstream region of mRNAs of interest, pathway enrichment analysis is carried out
genePathwayEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), near = TRUE, isTADSearch = FALSE, TAD = tad_hg19, gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
genePathwayEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), genetype = c("Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI"), near = TRUE, isTADSearch = FALSE, TAD = tad_hg19, gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
gene |
Input noncoding genes other than miRNA |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
genetype |
Type of the input gene list. Possible values are "Entrez", "mirna", "Ensembl_gene", "Ensembl_trans", "NCBI". For HUGO gene symbol "NCBI" value, for Entrez gene id "Entrez", for mirbase id "mirna" is used. |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
gmtName |
Custom pathway gmt file |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL,COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM, LGG |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
Pathway enrichment object for the given input
## Not run: #Pathway enrichment based on the gen sets that falls into the TAD regions ncRNAPathway<-genePathwayEnricher(gene = brain_disorder_ncRNA , org_assembly='hg19', isTADSearch = TRUE, TAD = tad_hg19, genetype = 'Ensembl_gene') ## End(Not run)
## Not run: #Pathway enrichment based on the gen sets that falls into the TAD regions ncRNAPathway<-genePathwayEnricher(gene = brain_disorder_ncRNA , org_assembly='hg19', isTADSearch = TRUE, TAD = tad_hg19, genetype = 'Ensembl_gene') ## End(Not run)
Given gene regions that fall in the given upstream and downstream region of mRNAs of interest, GO term enrichment analysis is carried out
geneRegionGOEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = TRUE, backG = "", backGType = "pc_gene", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
geneRegionGOEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = TRUE, backG = "", backGType = "pc_gene", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
region |
Bed format of the input gene regions other than miRNA |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
backG |
The set of genes that tested against to the input (background gene) |
backGType |
Type of the background gene. If miRNA gene set is used for background gene, backGType should be set to the 'mirna' |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
GO term enrichment object for the given input
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") regionGO<-geneRegionGOEnricher(region = regionNC, org_assembly= 'hg19', near = TRUE) ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") regionGO<-geneRegionGOEnricher(region = regionNC, org_assembly= 'hg19', near = TRUE) ## End(Not run)
Given gene regions that fall in the given upstream and downstream region of mRNAs of interest, pathway enrichment analysis is carried out
geneRegionPathwayEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, isTADSearch = FALSE, TAD = tad_hg19, gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
geneRegionPathwayEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, isTADSearch = FALSE, TAD = tad_hg19, gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
region |
Bed format of input gene regions other than miRNA. Input must be Granges object. |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
gmtName |
Custom pathway gmt file |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
Pathway enrichment object of the given input
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") ncPath<-geneRegionPathwayEnricher(region = regionNC, org_assembly = 'hg19', near = TRUE) ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") ncPath<-geneRegionPathwayEnricher(region = regionNC, org_assembly = 'hg19', near = TRUE) ## End(Not run)
Plot and save the GO term DAG of the top n enrichments in terms of p-values or adjusted p-values with an user provided format
getGoDag( mrnaObject, type, n, filename, imageFormat, p_range = seq(0, 0.05, by = 0.001) )
getGoDag( mrnaObject, type, n, filename, imageFormat, p_range = seq(0, 0.05, by = 0.001) )
mrnaObject |
Output of enrichment results |
type |
Sort in terms of p-values or FDR. possible values "pvalue", "padjust" |
n |
Number of top enrichments |
filename |
Name of the DAG file |
imageFormat |
Image format of the DAG. possible values "png" or "svg" |
p_range |
Break points for the p-values or FDR. By default [0.05, 0.001, 0.0005, 0.0001, 0.00005,0.00001,0] is used |
Saves image file in a given format
## Not run: ncRNAPathway<-mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19',near = TRUE) ## End(Not run)
## Not run: ncRNAPathway<-mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19',near = TRUE) ## End(Not run)
Display the enriched KEGG diagram of the KEGG pathway. This function is specific to only one KEGG pathway id and identifies the enriched genes in the diagram.
getKeggDiagram( mrnaObject, pathway, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
getKeggDiagram( mrnaObject, pathway, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
mrnaObject |
Output of enrichment results |
pathway |
Kegg pathway term such as 'hsa04010' |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
Shows kegg diagram marked with an enriched genes in a browser
## Not run: ncRNAPathway<-mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19',near = TRUE) getKeggDiagram(mrnaObject = ncRNAPathway, org_assembly ='hg19', pathway = ncRNAPathway@ID[1]) ## End(Not run)
## Not run: ncRNAPathway<-mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19',near = TRUE) getKeggDiagram(mrnaObject = ncRNAPathway, org_assembly ='hg19', pathway = ncRNAPathway@ID[1]) ## End(Not run)
Get TCGA miRNAseq expression of miRNA genes for the given cancer
getmiRNACount(mirnagene, cancer, databaseFile)
getmiRNACount(mirnagene, cancer, databaseFile)
mirnagene |
Data frame of the mature format |
cancer |
Name of the TCGA project code such as 'BRCA' |
databaseFile |
Path of miRcancer.db file |
Data frame of the raw read count of the given miRNA genes for different patients
Get only those neighbouring genes that fall within exon region
getNearToExon( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
getNearToExon( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
bedfile |
Input bed formated file |
upstream |
Maximum upstream distance from the TSS position |
downstream |
Maximum downstream distance from the TES position |
org_assembly |
genomee assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
genes
## Not run: regions <- system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getNearToExon(bedfile = regionNC, upstream = 1000, downstream = 2000, org_assembly = 'hg19') ## End(Not run)
## Not run: regions <- system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getNearToExon(bedfile = regionNC, upstream = 1000, downstream = 2000, org_assembly = 'hg19') ## End(Not run)
Get only those neighbouring genes that fall within intron region
getNearToIntron( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
getNearToIntron( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
bedfile |
Bed file |
upstream |
upstream distance |
downstream |
downstream distance |
org_assembly |
genomee assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
genes
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getNearToExon(bedfile = regionNC, upstream = 1000, downstream = 2000, org_assembly = 'hg19') ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getNearToExon(bedfile = regionNC, upstream = 1000, downstream = 2000, org_assembly = 'hg19') ## End(Not run)
Display the enriched Reactome diagram of the given Reactome pathway id. This function is specific to only one pathway id and identifies the enriched genes in the diagram.
getReactomeDiagram(mrnaObject, pathway, imageFormat)
getReactomeDiagram(mrnaObject, pathway, imageFormat)
mrnaObject |
Output of enrichment results |
pathway |
Reactome pathway term |
imageFormat |
Image format of the diagram. Possible image formats are 'png', 'svg' |
Shows reactome diagram marked with an enriched genes in a browser
## Not run: br_enr<-reactomeEnrichment(genes = breastmRNA,org_assembly='hg19') getReactomeDiagram(mrnaObject = br_enr,pathway = br_enr@ID[1], imageFormat = 'png') ## End(Not run)
## Not run: br_enr<-reactomeEnrichment(genes = breastmRNA,org_assembly='hg19') getReactomeDiagram(mrnaObject = br_enr,pathway = br_enr@ID[1], imageFormat = 'png') ## End(Not run)
For given region of interest, overlapped genes in the TAD regions are found. Results can be filtered according to the available cell lines.
getTADOverlap( bedfile, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), tad = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), near = FALSE, upstream = 10000, downstream = 10000, cellline = "all" )
getTADOverlap( bedfile, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), tad = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), near = FALSE, upstream = 10000, downstream = 10000, cellline = "all" )
bedfile |
Region of interest |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
tad |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
upstream |
Holds upstream distance from the transcription start position |
downstream |
Holds downstream distance from the transcription end position |
cellline |
Cell lines for TAD regions. |
List of protein coding genes that falls into the TAD regions
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getTADOverlap(bedfile = regionNC, tad = tad_hg19, org_assembly = 'hg19', cellline = 'HUVEC') ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") r<-getTADOverlap(bedfile = regionNC, tad = tad_hg19, org_assembly = 'hg19', cellline = 'HUVEC') ## End(Not run)
When downstream = 0 / upstream = 0, function converts bed formated regions to HUGO genes
getUCSC( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
getUCSC( bedfile, upstream, downstream, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3") )
bedfile |
Bed formated input gene regions |
upstream |
Maximum upstream distance from the transcription start region of the input gene |
downstream |
Maximum downstream distance from the transcription end region of the input gene |
org_assembly |
genomee assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
genes
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") neighbour <- getUCSC(bedfile = regionNC, upstream = 1000, downstream = 1000, org_assembly = 'hg19') ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") neighbour <- getUCSC(bedfile = regionNC, upstream = 1000, downstream = 1000, org_assembly = 'hg19') ## End(Not run)
Perform enrichment analysis of the given genes
goEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), GOtype = c("BP", "CC", "MF"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, backG = "", backGType = "pc_gene", enrichTest = c("hyper", "binom", "fisher", "chi") )
goEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), GOtype = c("BP", "CC", "MF"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, backG = "", backGType = "pc_gene", enrichTest = c("hyper", "binom", "fisher", "chi") )
genes |
Set of input genes. Supported format HUGO. |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
GOtype |
Hierarchical category of the GO ontology. Possible values are "BP"(default), "CC", "MF". |
pCut |
Threshold value for the pvalue. Default value is 0.05 |
pAdjCut |
Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. |
pAdjust |
Methods of the adjusted p-values. Possible methods are "bonferroni", "holm", "BH"(default) |
min |
Minimum number of gene that are required for enrichment. By default, it is set to 5 |
backG |
The set of genes that tested against to the input (background gene) |
backGType |
Type of the background gene. If miRNA gene set is used for background gene, backGType should be set to the 'mirna' |
enrichTest |
Types of enrichment methods to perform enrichment analysis. Possible values are "hyper"(default), "binom", "fisher", "chi". |
GO enrichment results
## Not run: subsetGene <- breastmRNA[1:30,] breastEnr <- goEnrichment(genes = subsetGene, org_assembly = 'hg19', GOtype = 'MF', min = 2) ## End(Not run)
## Not run: subsetGene <- breastmRNA[1:30,] breastEnr <- goEnrichment(genes = subsetGene, org_assembly = 'hg19', GOtype = 'MF', min = 2) ## End(Not run)
KEGG pathway enrichment
KeggEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
KeggEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
genes |
Input genes |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
pCut |
Threshold value for the pvalue. Default value is 0.05 |
pAdjCut |
Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. |
pAdjust |
Methods of the adjusted p-values. Possible methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY","fdr", "none" |
min |
Minimum number of genes that are required for enrichment. By default, it is set to 5. |
gmtFile |
File path of the gmt file |
isSymbol |
Boolean value that controls the gene formats. If it is TRUE, gene format of the gmt file should be symbol. Otherwise, gene format must be ENTREZ ID. |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
KEGG pathway enrichment results
## Not run: subsetGene <- breastmRNA[1:30,] br_enr<-KeggEnrichment(genes = subsetGene, org_assembly='hg19') ## End(Not run)
## Not run: subsetGene <- breastmRNA[1:30,] br_enr<-KeggEnrichment(genes = subsetGene, org_assembly='hg19') ## End(Not run)
List cell line of the given topological domain regions
listTAD(TADName)
listTAD(TADName)
TADName |
input TAD regions |
cell line of the input tad data
## Not run: listTAD(TADName = tad_hg19) ## End(Not run)
## Not run: listTAD(TADName = tad_hg19) ## End(Not run)
Brain miRNA expression retrieved from the TCGA
mirna
mirna
Not Available
data(mirna)
data(mirna)
GO term enrichments of the microRNA genes with mRNAs that fall in the given upstream/downstream regions of the microRNA genes
mirnaGOEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, backGenes = "", backGType = "pc_gene", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile = "" )
mirnaGOEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, backGenes = "", backGType = "pc_gene", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile = "" )
gene |
Input microRNA gene. It supports both pre-miRNA and mature miRNA, however, when target prediction is performed (target= TRUE), miRNA genes should be mature. |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
target |
Boolean value shows whether miRNA target prediction should be performed |
backGenes |
The set of genes that tested against to the input |
backGType |
Type of the background gene. If miRNA gene set is used for background gene, backGType should be set to the 'mirna' |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
MiRNA GO term enrichment object for the given input
## Not run: subsetGene <- brain_mirna[1:30,] miGO <-mirnaGOEnricher(gene=subsetGene, org_assembly='hg19', near = TRUE, target = FALSE) ## End(Not run)
## Not run: subsetGene <- brain_mirna[1:30,] miGO <-mirnaGOEnricher(gene=subsetGene, org_assembly='hg19', near = TRUE, target = FALSE) ## End(Not run)
Pathway enrichments of the microRNA genes with mRNAs that fall in the given upstream/downstream regions of the microRNA genes
mirnaPathwayEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
mirnaPathwayEnricher( gene, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
gene |
Input microRNA gene. It supports both pre-miRNA and mature miRNA, however, when target prediction is performed(target= TRUE), miRNA genes should be mature. |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
target |
Boolean value shows whether miRNA target prediction should be performed |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
gmtName |
Custom pathway gmt file |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
MiRNA pathway enrichment object for the given input
## Not run: miPath <- mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19', near = TRUE) ## End(Not run)
## Not run: miPath <- mirnaPathwayEnricher(gene = brain_mirna, org_assembly = 'hg19', near = TRUE) ## End(Not run)
GO enrichments of the microRNA regions with mRNAs that fall in the given upstream/downstream regions of the microRNA genes
mirnaRegionGOEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, backG = "", backGType = "pc-genes", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
mirnaRegionGOEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, backG = "", backGType = "pc-genes", isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile )
region |
MiRNA region in a bed format |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
target |
Boolean value shows whether miRNA target prediction should be performed |
backG |
The set of genes that tested against to the input |
backGType |
Type of the background gene. If miRNA gene set is used for background gene, backGType should be set to the 'mirna' |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
MiRNA GO enrichment object for the given input
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") a<- mirnaRegionGOEnricher(region = regionNC, org_assembly = 'hg19', near = TRUE) ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") a<- mirnaRegionGOEnricher(region = regionNC, org_assembly = 'hg19', near = TRUE) ## End(Not run)
Pathway enrichments of the microRNA regions with mRNAs that fall in the given upstream/downstream regions of the microRNA genes
mirnaRegionPathwayEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
mirnaRegionPathwayEnricher( region, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), near = FALSE, target = FALSE, isTADSearch = FALSE, TAD = c(tad_hg19, tad_dmel, tad_hg38, tad_mm10), gmtName = "", express = FALSE, isCustomExp = FALSE, cancer, exp1, exp2, label1 = "", label2 = "", isUnionCorGene = FALSE, databaseFile, isGeneEnrich = FALSE )
region |
MiRNA region in a bed format |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
near |
Boolean value presents whether cis-neighbourhood should be considered in the analysis |
target |
Boolean value shows whether miRNA target prediction should be performed |
isTADSearch |
Boolean value that shows whether TAD analysis is performed. This value has to be TRUE for TAD analysis. |
TAD |
TAD genomic regions for the species. Predefined TAD regions or any new TAD regions can be used for the analysis. TAD regions must be formated as GRanges object. Predefined TAD regions are 'tad_hg19', 'tad_hg38', 'tad_mm10', 'tad_dmel' for hg19, hg38, mm9 and dm6 assembly, respectively. |
gmtName |
Custom pathway gmt file |
express |
Boolean variable whether co-expression analysis is performed. If this option is set to TRUE, co-expression analysis will be performed. |
isCustomExp |
Boolean variable whether co-expression analysis with custom data will be performed. When this option is set, exp1 and exp2 parameters must be defined. |
cancer |
Defines the name of the TCGA project code such as 'BRCA' for correlation analysis. Possible cancer types ACC, BLCA, BRCA, CESC, CHOL, COAD, COADREAD, DLBC, ESCA, GBMLGG, HNSC, KICH, KIPAN, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, STES, TGCT, THCA, THYM, UCEC, UCS, UVM |
exp1 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
exp2 |
Custom expression data matrix. Columns must be genes and rows must be patients. If gene names are provided as header, no need to redefine the headers(labels) of the expression data. |
label1 |
Gene names of the custom exp1 expression data. If it is not provided, column name of the exp1 data will be taken. |
label2 |
Gene names of the custom exp2 expression data. If it is not provided, column name of the exp2 data will be taken. |
isUnionCorGene |
Boolean value that shows whether union of the output of the co-expression analysis and the other analysis should be considered |
databaseFile |
Path of miRcancer.db file |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
miRNA pathway enrichment object for the given input
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") a<- mirnaRegionPathwayEnricher(region = regionNC, org_assembly = 'hg19') ## End(Not run)
## Not run: regions<-system.file("extdata", "ncRegion.txt", package = "NoRCE") regionNC <- rtracklayer::import(regions, format = "BED") a<- mirnaRegionPathwayEnricher(region = regionNC, org_assembly = 'hg19') ## End(Not run)
Brain mRNA expression retrieved from the TCGA
mrna
mrna
Not Available
data(mrna)
data(mrna)
Differentially expressed non-coding gene regions
ncRegion
ncRegion
Not Available
http://resource.psychencode.org/
data(ncRegion)
data(ncRegion)
An S4 class to represent enrichment
ID
factor
Term
factor
geneList
factor
ncGeneList
factor
pvalue
factor
pAdj
factor
GeneRatio
factor
BckRatio
factor
Check the package availability for the given assembly
packageCheck(pkg)
packageCheck(pkg)
pkg |
Required packages |
return install packages
For a given gmt file of a specific pathway database, pathway enrichment can be performed. Function supports Entrez ID and symbol based gmt file.
pathwayEnrichment( genes, gmtFile, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), isSymbol, min = 5, isGeneEnrich = FALSE )
pathwayEnrichment( genes, gmtFile, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), isSymbol, min = 5, isGeneEnrich = FALSE )
genes |
Input genes |
gmtFile |
File path of the gmt file |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
pCut |
Threshold value for the pvalue. Default value is 0.05 |
pAdjCut |
Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. |
pAdjust |
Methods of the adjusted p-values. Possible methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY","fdr", "none" |
isSymbol |
Boolean value that controls the gene formats. If it is TRUE, gene format of the gmt file should be symbol. Otherwise, gene format must be ENTREZ ID. |
min |
Minimum number of genes that are required for enrichment. By default, it is set to 5. |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
Pathway Enrichment
Predict the miRNA targets for the miRNA or mRNA genes, which is specified with type parameter
predictmiTargets(gene, type, org_assembly)
predictmiTargets(gene, type, org_assembly)
gene |
Data frame of miRNA or mRNA gene. Formats should be NCBI gene name, ENSEMBL gene or transcript id, and mirna |
type |
Format of the gene, it should be "NCBI" for NCBI gene name, "Ensembl_gene" for ENSEMBL gene id, "Ensembl_trans" for Ensembl transcript id and "mirna" for miRNA gene |
org_assembly |
Analyzed genome assembly. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "hg19" and "hg38" for human |
miRNA:mRNA target sets of the given genes
## Not run: a<- predictmiTargets(gene = brain_mirna[1:100,], org_assembly = 'hg19', type = "mirna") ## End(Not run)
## Not run: a<- predictmiTargets(gene = brain_mirna[1:100,], org_assembly = 'hg19', type = "mirna") ## End(Not run)
Reactome pathway enrichment
reactomeEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
reactomeEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
genes |
Input genes |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
pCut |
Threshold value for the pvalue. Default value is 0.05 |
pAdjCut |
Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. |
pAdjust |
Methods of the adjusted p-values. Possible methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY","fdr", "none" |
min |
Minimum number of genes that are required for enrichment. By default, it is set to 5. |
gmtFile |
File path of the gmt file |
isSymbol |
Boolean value that controls the gene formats. If it is TRUE, gene format of the gmt file should be symbol. Otherwise, gene format must be ENTREZ ID. |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
Reactome pathway enrichment results
## Not run: br_enr<-reactomeEnrichment(genes = breastmRNA,org_assembly='hg19') ## End(Not run)
## Not run: br_enr<-reactomeEnrichment(genes = breastmRNA,org_assembly='hg19') ## End(Not run)
Parameters: upstream: Upstream distance from the transcription start position downstream: Downstream distance from the transcription end position searchRegion: Search space of the cis-region. Possible values are "all", "exon", "intron" GOtype: Hierarchical category of the GO ontology. Possible values are "BP", "CC", "MF" pCut: Threshold value for the pvalue. Default value is 0.05 pAdjCut: Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. pAdjust: Methods of the adjusted p-values. Possible methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY","fdr", "none" min: Minimum number of genes that are required for enrichment. By default, this value is set to 5. cellline: Cell lines for TAD regions. corrMethod Correlation coeffient method that will be used for evaluation. Possible values are "pearson", "kendall", "spearman" varCutoff: Variance cutt off that genes have less variance than this value will be trimmed pcut: P-value cut off for the correlation values alternate: Holds the alternative hypothesis and "two.sided", "greater" or "less" are the possible values. conf: Confidence level for the returned confidence interval. It is only used for the Pearson correlation coefficient if there are at least 4 complete pairs of observations. minAbsCor: Cut-off value for the Pearson correlation coefficient of the miRNA-mRNA pathwayType: Pathway database for enrichment. Possible values are 'reactome' for Reactome, 'kegg' for KEGG, 'wiki' for WikiPathways, 'other' for custom database enrichTest: Types of enrichment methods to perform enrichment analysis. Possible values are "hyper"(default), "binom", "fisher", "chi". isSymbol: Boolean variable that hold the gene format of the gmt file. If it is set as TRUE, gene format of the gmt file should be symbol. Otherwise, gene format should be ENTREZ ID. By default, it is FALSE.
setParameters(type, value)
setParameters(type, value)
type |
List of parameter names |
value |
New values for the parameters. Value and the parameter names must be in the same order. |
changed parameters
## Not run: type <- c('downstream','upstream') value <- c(2000,30000) setParameters(type,value) ## End(Not run)
## Not run: type <- c('downstream','upstream') value <- c(2000,30000) setParameters(type,value) ## End(Not run)
TAD regions for the fly
tad_dmel
tad_dmel
Not Available
http://chorogenome.ie-freiburg.mpg.de/data_sources.html#hi-c_datasets
data(tad_dmel)
data(tad_dmel)
TAD regions for human hg19 assembly
tad_hg19
tad_hg19
Not Available
http://promoter.bx.psu.edu/hi-c/publications.html
data(tad_hg19)
data(tad_hg19)
TAD regions for human hg38 assembly
tad_hg38
tad_hg38
Not Available
http://promoter.bx.psu.edu/hi-c/publications.html
data(tad_hg38)
data(tad_hg38)
TAD regions for mouse
tad_mm10
tad_mm10
Not Available
http://promoter.bx.psu.edu/hi-c/publications.html
data(tad_mm10)
data(tad_mm10)
Number of top enrichment results of the pathway or GO terms for the given object and the order type - p-value or adjusted p-value.
topEnrichment(mrnaObject, type, n)
topEnrichment(mrnaObject, type, n)
mrnaObject |
Object of the enrichment result |
type |
Draw the dot plot according to the p-value or adjusted p-value ("pvalue", "pAdjust") |
n |
Number of GO terms or pathways, that ordered by type and has least number of top p-value |
Give top n enrichment results
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') result = topEnrichment(mrnaObject = ncGO, type = "pvalue", n = 10) ## End(Not run)
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') result = topEnrichment(mrnaObject = ncGO, type = "pvalue", n = 10) ## End(Not run)
WikiPathways Enrichment
WikiEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
WikiEnrichment( genes, org_assembly = c("hg19", "hg38", "mm10", "dre10", "rn6", "dm6", "ce11", "sc3"), pCut = 0.05, pAdjCut = 0.05, pAdjust = c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), min = 5, gmtFile = "", isSymbol = "", isGeneEnrich = "" )
genes |
Input genes |
org_assembly |
Genome assembly of interest for the analysis. Possible assemblies are "mm10" for mouse, "dre10" for zebrafish, "rn6" for rat, "dm6" for fruit fly, "ce11" for worm, "sc3" for yeast, "hg19" and "hg38" for human |
pCut |
Threshold value for the pvalue. Default value is 0.05 |
pAdjCut |
Cutoff value for the adjusted p-values using one of given method. Default value is 0.05. |
pAdjust |
Methods of the adjusted p-values. Possible methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY","fdr", "none" |
min |
Minimum number of genes that are required for enrichment. By default, it is set to 5. |
gmtFile |
File path of the gmt file |
isSymbol |
Boolean value that controls the gene formats. If it is TRUE, gene format of the gmt file should be symbol. Otherwise, gene format must be ENTREZ ID. |
isGeneEnrich |
Boolean value whether gene enrichment should be performed |
Wiki Pathway Enrichment
Write the tabular form of the pathway or GO term enrichment results
writeEnrichment(mrnaObject, fileName, sept = "\t", type = "pAdjust", n)
writeEnrichment(mrnaObject, fileName, sept = "\t", type = "pAdjust", n)
mrnaObject |
Object of the enrichment result |
fileName |
File name of the txt file |
sept |
File separator, by default, it is tab('\t') |
type |
Draw the dot plot according to the p-value or adjusted p-value ("pvalue", "pAdjust"). Default value is "pAdjust". |
n |
Number of GO terms or pathways, that ordered by type and has least number of top p-value |
Text file of the enrichment results in a tabular format
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') writeEnrichment(mrnaObject = ncGO,fileName = "a.txt",sept = '\t') ## End(Not run)
## Not run: ncGO<-geneGOEnricher(gene = brain_disorder_ncRNA, org_assembly='hg19', near=TRUE, genetype = 'Ensembl_gene') writeEnrichment(mrnaObject = ncGO,fileName = "a.txt",sept = '\t') ## End(Not run)