Title: | GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data |
---|---|
Description: | Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach. |
Authors: | Christian Arnold [cre, aut], Judith Zaugg [aut], Rim Moussa [aut], Armando Reyes-Palomares [ctb], Giovanni Palla [ctb], Maksim Kholmatov [ctb] |
Maintainer: | Christian Arnold <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.11.0 |
Built: | 2024-10-31 15:51:43 UTC |
Source: | https://github.com/bioc/GRaNIE |
GRN
object
Runs the main function fitExtractVarPartModel of the package variancePartition: Fits a linear (mixed) model to estimate the contribution of multiple sources of variation while simultaneously correcting for all other variables for the features in a GRN object (TFs, peaks, and genes) given particular metadata. The function reports the fraction of variance attributable to each metadata variable.
Note: The results are not added to GRN@connections$all.filtered. To do so, rerun the function getGRNConnections and set include_variancePartitionResults to TRUE.
The results object is stored in GRN@stats$variancePartition and can be used for the various diagnostic and plotting functions from variancePartition.
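To make "fraction of variance attributable to each metadata variable" concrete, here is a minimal base R sketch on simulated data. It fits an ordinary linear model for a single feature and divides each term's sequential sum of squares by the total. This is a simplification for illustration and not the variancePartition implementation; all names (`expr`, `meta`, `batch`, `age`) are hypothetical.

```r
set.seed(1)
n <- 40
# Hypothetical sample metadata
meta <- data.frame(
  batch = factor(rep(c("A", "B"), each = n / 2)),
  age   = rnorm(n, mean = 50, sd = 10)
)
# Simulated normalized values for one feature: strong batch effect, weak age effect
expr <- 2 * (meta$batch == "B") + 0.02 * meta$age + rnorm(n, sd = 0.5)

fit <- lm(expr ~ batch + age, data = meta)
# Sequential (type I) sums of squares for batch, age and the residuals
ss <- anova(fit)[["Sum Sq"]]
names(ss) <- rownames(anova(fit))
# Fraction of the total sum of squares per term; the fractions sum to 1
varFraction <- ss / sum(ss)
print(round(varFraction, 3))
```

In this toy setting, most of the variance should be assigned to `batch`, with the remainder split between `age` and the residuals.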
add_featureVariation( GRN, formula = "auto", metadata = c("all"), features = "all_filtered", nCores = 1, forceRerun = FALSE, ... )
GRN |
Object of class |
formula |
Character(1). Either |
metadata |
Character vector. Default |
features |
Character(1). Either |
nCores |
Integer >0. Default 1. Number of cores to use.
A value >1 requires the |
forceRerun |
|
... |
Additional parameters passed on to |
The normalized count matrices are used as input for fitExtractVarPartModel.
An updated GRN object, with additional information added from this function to GRN@stats$variancePartition as well as the elements genes, consensusPeaks and TFs within GRN@annotation.
As noted above, the results are not added to GRN@connections$all.filtered; rerun the function getGRNConnections and set include_variancePartitionResults to TRUE to include the results in the eGRN output table.
# See the Workflow vignette on the GRaNIE website for examples
# GRN = loadExampleObject()
# GRN = add_featureVariation(GRN, metadata = c("mt_frac"), forceRerun = TRUE)
GRN
object.
The information is currently stored in GRN@connections$TF_genes.filtered. Note that raw p-values are not adjusted.
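Because the raw p-values are not adjusted, any multiple-testing correction is left to the user. The following base R sketch on simulated data shows one way to correlate a TF with several genes and adjust the resulting p-values afterwards. It does not reproduce the package internals; all object names are hypothetical, and the choice of the Benjamini-Hochberg method is an assumption.

```r
set.seed(42)
nSamples <- 30
tfExpr <- rnorm(nSamples)                      # hypothetical TF expression
genes <- cbind(
  geneA = tfExpr + rnorm(nSamples, sd = 0.3),  # constructed to correlate with the TF
  geneB = rnorm(nSamples),                     # unrelated
  geneC = rnorm(nSamples)                      # unrelated
)

# One Pearson correlation test per TF-gene pair
res <- do.call(rbind, lapply(colnames(genes), function(g) {
  ct <- cor.test(tfExpr, genes[, g], method = "pearson")
  data.frame(gene = g, r = unname(ct$estimate), p.raw = ct$p.value)
}))

# Adjust the raw p-values across all tested pairs
res$p.adj <- p.adjust(res$p.raw, method = "BH")
print(res)
```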
add_TF_gene_correlation( GRN, corMethod = "pearson", nCores = 1, forceRerun = FALSE )
GRN |
Object of class |
corMethod |
Character. One of |
nCores |
Integer >0. Default 1. Number of cores to use.
A value >1 requires the |
forceRerun |
|
An updated GRN
object, with additional information added from this function.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = add_TF_gene_correlation(GRN, forceRerun = FALSE)
GRN
object
After the execution of this function, QC plots can be plotted with the function plotDiagnosticPlots_peakGene unless this has already been done by default due to plotDiagnosticPlots = TRUE.
addConnections_peak_gene( GRN, overlapTypeGene = "TSS", corMethod = "pearson", promoterRange = 250000, TADs = NULL, TADs_mergeOverlapping = FALSE, knownLinks = NULL, knownLinks_separator = c(":", "-"), knownLinks_useExclusively = FALSE, shuffleRNACounts = TRUE, nCores = 4, plotDiagnosticPlots = TRUE, plotGeneTypes = list(c("all"), c("protein_coding")), outputFolder = NULL, forceRerun = FALSE )
GRN |
Object of class |
overlapTypeGene |
Character. |
corMethod |
Character. One of |
promoterRange |
Integer >=0. Default 250000. The size of the neighborhood in bp to correlate peaks and genes in vicinity. Only peak-gene pairs will be correlated if they are within the specified range. Increasing this value leads to higher running times and more peak-gene pairs to be associated, while decreasing results in the opposite. |
TADs |
Data frame with TAD domains. Default |
TADs_mergeOverlapping |
|
knownLinks |
|
knownLinks_separator |
Character vector of length 1 or 2. Default |
knownLinks_useExclusively |
|
shuffleRNACounts |
|
nCores |
Integer >0. Default 4. Number of cores to use.
A value >1 requires the |
plotDiagnosticPlots |
|
plotGeneTypes |
List of character vectors. Default |
outputFolder |
Character or |
forceRerun |
|
An updated GRN
object, with additional information added from this function.
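The effect of promoterRange can be pictured as a distance filter on same-chromosome peak-gene pairs. The base R sketch below uses made-up coordinates and measures the distance between a peak center and a gene TSS; how the package itself measures distance is not specified here, so treat that definition and all table and column names as assumptions.

```r
# Hypothetical peaks with their center positions
peaks <- data.frame(
  peakID = c("chr1:1000-1500", "chr1:400000-400600", "chr2:5000-5400"),
  chr    = c("chr1", "chr1", "chr2"),
  center = c(1250, 400300, 5200)
)
# Hypothetical genes with their TSS positions
genes <- data.frame(
  geneID = c("gene1", "gene2"),
  chr    = c("chr1", "chr2"),
  TSS    = c(100000, 20000)
)
promoterRange <- 250000

# All same-chromosome peak-gene combinations
pairs <- merge(peaks, genes, by = "chr")
pairs$distance <- abs(pairs$center - pairs$TSS)
# Keep only pairs within the specified neighborhood
pairs <- pairs[pairs$distance <= promoterRange, ]
print(pairs[, c("peakID", "geneID", "distance")])
```

A larger promoterRange keeps more pairs, and therefore more correlations to compute, which is why the running time grows with this value.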
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = addConnections_peak_gene(GRN, promoterRange=10000, plotDiagnosticPlots = FALSE)
GRN
object
After the execution of this function, QC plots can be plotted with the function plotDiagnosticPlots_TFPeaks unless this has already been done by default due to plotDiagnosticPlots = TRUE.
addConnections_TF_peak( GRN, plotDiagnosticPlots = TRUE, plotDetails = FALSE, outputFolder = NULL, corMethod = "pearson", connectionTypes = c("expression"), removeNegativeCorrelation = c(FALSE), maxFDRToStore = 0.3, addForBackground = TRUE, useGCCorrection = FALSE, percBackground_size = 75, percBackground_resample = TRUE, forceRerun = FALSE )
GRN |
Object of class |
plotDiagnosticPlots |
|
plotDetails |
|
outputFolder |
Character or |
corMethod |
Character. One of |
connectionTypes |
Character vector. Default |
removeNegativeCorrelation |
Vector of |
maxFDRToStore |
Numeric[0,1]. Default 0.3. Maximum TF-peak FDR value up to which a particular TF-peak connection is permanently stored in the object. This parameter has a large influence on the overall memory size of the object, and we recommend not storing connections with a high FDR due to their sheer number. |
addForBackground |
|
useGCCorrection |
|
percBackground_size |
Numeric[0,100]. Default 75. EXPERIMENTAL. Percentage of the background to use as basis for sampling. If set to 0, an automatic iterative procedure will identify the maximum percentage so that all relevant GC bins with a rel. frequency above 5% from the foreground can be matched. For more details, see the Package Details vignette. Only relevant if |
percBackground_resample |
|
forceRerun |
|
An updated GRN
object, with additional information added from this function.
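As a rough illustration of a permutation-based empirical FDR and of the role of maxFDRToStore, the following base R sketch compares simulated "real" correlations (foreground) with correlations from permuted data (background). The estimator used here, the ratio of background to foreground fractions above a threshold, is one common choice and an assumption on our part; the procedure actually used by the package is described in the Package Details vignette and may differ. All names are hypothetical.

```r
set.seed(7)
# Simulated correlation values: mostly noise plus a group of true links
foreground <- c(rnorm(900, mean = 0, sd = 0.2), rnorm(100, mean = 0.7, sd = 0.1))
# Simulated correlations obtained from permuted data
background <- rnorm(1000, mean = 0, sd = 0.2)

empiricalFDR <- function(threshold, fg, bg) {
  fracBg <- mean(bg >= threshold)   # estimated fraction of false positives
  fracFg <- mean(fg >= threshold)   # fraction of observed positives
  if (fracFg == 0) return(NA_real_) # nothing passes the threshold
  min(1, fracBg / fracFg)
}

thresholds <- seq(0, 0.8, by = 0.2)
fdr <- vapply(thresholds, empiricalFDR, numeric(1),
              fg = foreground, bg = background)
print(data.frame(threshold = thresholds, FDR = round(fdr, 3)))

# Only thresholds whose FDR does not exceed the cutoff would be kept
maxFDRToStore <- 0.3
print(thresholds[!is.na(fdr) & fdr <= maxFDRToStore])
```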
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = addConnections_TF_peak(GRN, plotDiagnosticPlots = FALSE, forceRerun = FALSE)
GRN
object.
This function adds both RNA and peak data to a GRN object, along with data normalization.
In addition, sample metadata can optionally be provided, which is highly recommended.
addData( GRN, counts_peaks, normalization_peaks = "DESeq2_sizeFactors", idColumn_peaks = "peakID", counts_rna, normalization_rna = "limma_quantile", idColumn_RNA = "ENSEMBL", sampleMetadata = NULL, additionalParams.l = list(), allowOverlappingPeaks = FALSE, keepOriginalReadCounts = FALSE, EnsemblVersion = NULL, genomeAnnotationSource = "AnnotationHub", forceRerun = FALSE )
GRN |
Object of class |
counts_peaks |
Data frame. No default. Counts for the peaks, with raw or normalized counts for each peak (rows) across all samples (columns).
In addition to the count data, it must also contain one ID column with a particular format, see the argument |
normalization_peaks |
Character. Default |
idColumn_peaks |
Character. Default |
counts_rna |
Data frame. No default. Counts for the RNA-seq data, with raw or normalized counts for each gene (rows) across all samples (columns).
In addition to the count data, it must also contain one ID column with a particular format, see the argument |
normalization_rna |
Character. Default |
idColumn_RNA |
Character. Default |
sampleMetadata |
Data frame. Default |
additionalParams.l |
Named list. Default |
allowOverlappingPeaks |
|
keepOriginalReadCounts |
|
EnsemblVersion |
|
genomeAnnotationSource |
|
forceRerun |
|
If the ChIPseeker package is installed, additional peak annotation is provided in the annotation slot and a peak annotation QC plot is produced as part of peak-gene QC.
This is fully optional, however, and has no consequences for downstream functions.
Normalizing the data sensibly is very important. When quantile is chosen, limma::normalizeQuantiles is used, which in essence does the following:
Each quantile of each column is set to the mean of that quantile across arrays. The intention is to make all the normalized columns have the same empirical distribution.
This will be exactly true if there are no missing values and no ties within the columns: the normalized columns are then simply permutations of one another.
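The idea described above can be sketched in a few lines of base R. This is a simplified illustration for a matrix without missing values and without ties, not the limma implementation; the function name `quantileNormalize` and the toy matrix are hypothetical.

```r
quantileNormalize <- function(m) {
  # Assumes a numeric matrix without missing values.
  # Ties are broken by order of appearance, which is a simplification.
  ranks <- apply(m, 2, rank, ties.method = "first")
  sorted <- apply(m, 2, sort)
  # Mean of each quantile (sorted position) across all columns
  quantileMeans <- rowMeans(sorted)
  # Replace every value by the mean of its quantile
  normalized <- apply(ranks, 2, function(r) quantileMeans[r])
  dimnames(normalized) <- dimnames(m)
  normalized
}

# Toy matrix: 4 genes (rows) x 3 samples (columns), filled column by column
counts <- matrix(c(5, 2, 3, 4,
                   4, 1, 7, 2,
                   3, 4, 6, 8),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))
norm <- quantileNormalize(counts)
print(norm)
```

After normalization, every column contains the same set of values in a different order, so the columns are permutations of one another as stated above.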
An updated GRN object, with added data from this function (e.g., slots GRN@data$peaks and GRN@data$RNA).
# See the Workflow vignette on the GRaNIE website for examples
# library(readr)
# rna.df = read_tsv("https://www.embl.de/download/zaugg/GRaNIE/rna.tsv.gz")
# peaks.df = read_tsv("https://www.embl.de/download/zaugg/GRaNIE/peaks.tsv.gz")
# meta.df = read_tsv("https://www.embl.de/download/zaugg/GRaNIE/sampleMetadata.tsv.gz")
# GRN = loadExampleObject()
# We omit sampleMetadata = meta.df in the following line, becomes too long otherwise
# GRN = addData(GRN, counts_peaks = peaks.df, counts_rna = rna.df, forceRerun = FALSE)
We do not yet provide full support for this function. It is currently being tested. Use at your own risk.
addData_TFActivity( GRN, normalization = "cyclicLoess", name = "TF_activity", forceRerun = FALSE )
GRN |
Object of class |
normalization |
Character. Default |
name |
Name in object under which it should be stored. This corresponds to the |
forceRerun |
|
An updated GRN object, with added data from this function (GRN@data$TFs[[name]] in particular, with name referring to the value of the name parameter).
GRN
object and associate SNPs to peaks.
This function accepts a vector of SNP IDs (rsID), retrieves their genomic positions and overlaps them with the peaks to extend the peak metadata ('GRN@data$peaks$counts_metadata') by storing the number, positions and rsids of all overlapping SNPs per peak (new columns starting with 'SNP_').
Optionally, SNPs in LD with the user-provided SNPs can be identified using the LDlinkR package. Note that SNPs in LD are only associated with a peak for those user-provided SNPs that directly overlap a peak.
That is, if a user-provided SNP does not overlap with any peak, neither the SNP itself nor any of the SNPs in LD will be associated with any peak, even if an LD SNP overlaps another peak.
The results are stored in GRN@annotation$SNPs (full, unfiltered table) and GRN@annotation$SNPs_filtered (filtered table), and rapid re-filtering is possible without re-querying the database (which is time-consuming).
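The association rule for LD SNPs described above can be written down as a small base R sketch. The tables, column names and rsIDs below are made up for illustration and do not correspond to the structure of the actual result tables; the R2 cutoff mirrors the default filter = "R2 > 0.8".

```r
# Hypothetical user-provided SNPs and whether they overlap a peak
userSNPs <- data.frame(
  rsid         = c("rsA", "rsB"),
  overlapsPeak = c(TRUE, FALSE)
)
# Hypothetical LD query results: one row per (query SNP, LD SNP) pair
ldSNPs <- data.frame(
  querySNP = c("rsA", "rsA", "rsB"),
  ldSNP    = c("rsA1", "rsA2", "rsB1"),
  R2       = c(0.95, 0.60, 0.99)
)

# LD SNPs are only considered for user SNPs that overlap a peak themselves
eligible <- ldSNPs[ldSNPs$querySNP %in% userSNPs$rsid[userSNPs$overlapsPeak], ]
# Filtering analogous to the default filter = "R2 > 0.8"
associated <- subset(eligible, R2 > 0.8)
print(associated)
```

Note that `rsB1` is dropped despite its high R2 because its query SNP `rsB` does not overlap any peak.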
addSNPData( GRN, SNP_IDs, EnsemblVersion = NULL, add_SNPs_LD = FALSE, requeryLD = FALSE, population = "CEU", r2d = "r2", token = NULL, filter = "R2 > 0.8", forceRerun = FALSE )
GRN |
Object of class |
SNP_IDs |
Character vector. No default. Vector of SNP IDs (rsID) that should be integrated and overlapped with the peaks. |
EnsemblVersion |
|
add_SNPs_LD |
|
requeryLD |
|
population |
Character vector. Default |
r2d |
|
token |
Character or |
filter |
Character. Default |
forceRerun |
|
'biomaRt' is used to retrieve genomic positions for the user-defined SNPs, which can take a long time depending on the number of SNPs provided. Similarly, querying the LDlink servers may take a long time.
An updated GRN
object, with additional information added from this function.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = addSNPData(GRN, SNP_IDs = c("rs7570219", "rs6445264", "rs12067275"), forceRerun = FALSE)
GRN
object.
For this, a folder that contains one TFBS file per TF in bed or bed.gz format must be given (see details). The folder must also contain a so-called translation table, see the argument translationTable for details.
We provide example files for selected supported genome assemblies (hg19, hg38 and mm10, mm39) that are fully compatible with GRaNIE as separate downloads. For more information, check https://difftf.readthedocs.io/en/latest/chapter2.html#dir-tfbs.
addTFBS( GRN, source = "custom", motifFolder = NULL, TFs = "all", translationTable = "translationTable.csv", translationTable_sep = " ", filesTFBSPattern = "_TFBS", fileEnding = ".bed", nTFMax = NULL, EnsemblVersion = NULL, JASPAR_useSpecificTaxGroup = NULL, JASPAR_removeAmbiguousTFs = TRUE, forceRerun = FALSE, ... )
GRN |
Object of class |
source |
Character. One of |
motifFolder |
Character. No default. Only relevant if |
TFs |
Character vector. Default |
translationTable |
Character. Default |
translationTable_sep |
Character. Default |
filesTFBSPattern |
Character. Default |
fileEnding |
Character. Default |
nTFMax |
|
EnsemblVersion |
|
JASPAR_useSpecificTaxGroup |
|
JASPAR_removeAmbiguousTFs |
|
forceRerun |
|
... |
Additional named elements for the |
An updated GRN object, with additional information added from this function (GRN@annotation$TFs in particular).
# See the Workflow vignette on the GRaNIE website for examples
GRN
object
Run the activator-repressor classification for the TFs for a GRN object.
AR_classification_wrapper( GRN, significanceThreshold_Wilcoxon = 0.05, plot_minNoTFBS_heatmap = 100, deleteIntermediateData = TRUE, plotDiagnosticPlots = TRUE, outputFolder = NULL, corMethod = "pearson", forceRerun = FALSE )
GRN |
Object of class |
significanceThreshold_Wilcoxon |
Numeric[0,1]. Default 0.05. Significance threshold for Wilcoxon test that is run in the end for the final classification. See the Vignette and *diffTF* paper for details. |
plot_minNoTFBS_heatmap |
Integer[1,]. Default 100. Minimum number of TFBS for a TF to be included in the heatmap that is part of the output of this function. |
deleteIntermediateData |
|
plotDiagnosticPlots |
|
outputFolder |
Character or |
corMethod |
Character. One of |
forceRerun |
|
An updated GRN
object, with additional information added from this function.
# See the Workflow vignette on the GRaNIE website for examples
# GRN = loadExampleObject()
# GRN = AR_classification_wrapper(GRN, outputFolder = ".", forceRerun = FALSE)
This function requires a filtered set of connections in the GRN object as generated by filterGRNAndConnectGenes.
build_eGRN_graph( GRN, model_TF_gene_nodes_separately = FALSE, allowLoops = FALSE, removeMultiple = FALSE, directed = FALSE, forceRerun = FALSE )
GRN |
Object of class |
model_TF_gene_nodes_separately |
|
allowLoops |
|
removeMultiple |
|
directed |
|
forceRerun |
|
An updated GRN
object, with the graph(s) being stored in the slot 'graph' (i.e., 'GRN@graph' for both TF-gene and TF-peak-gene graphs)
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = build_eGRN_graph(GRN, forceRerun = FALSE)
GRN
object
The enrichment analysis is based on the subset of the network connected to a particular community as identified by calculateCommunitiesStats; see calculateTFEnrichment and calculateGeneralEnrichment for TF-specific and general enrichment, respectively.
This function requires the existence of the eGRN graph in the GRN object as produced by build_eGRN_graph as well as community information as calculated by calculateCommunitiesStats.
Results can subsequently be visualized with the function plotCommunitiesEnrichment.
calculateCommunitiesEnrichment( GRN, ontology = c("GO_BP", "GO_MF"), algorithm = "weight01", statistic = "fisher", background = "neighborhood", background_geneTypes = "all", selection = "byRank", communities = NULL, pAdjustMethod = "BH", forceRerun = FALSE )
GRN |
Object of class |
ontology |
Character vector of ontologies. Default |
algorithm |
Character. Default |
statistic |
Character. Default |
background |
Character. Default |
background_geneTypes |
Character vector of gene types that should be considered for the background. Default |
selection |
Character. Default |
communities |
|
pAdjustMethod |
Character. Default |
forceRerun |
|
All enrichment functions use the TF-gene graph as defined in the 'GRN' object. See the 'ontology' argument for currently supported ontologies. Also note that some parameter combinations for 'algorithm' and 'statistic' are incompatible; an error message will be thrown in such a case.
An updated GRN
object, with the enrichment results stored in the stats$Enrichment$byCommunity
slot.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = calculateCommunitiesEnrichment(GRN, ontology = c("GO_BP"), forceRerun = FALSE)
The results can subsequently be visualized with the function plotCommunitiesStats.
This function requires a filtered set of connections in the GRN object as generated by filterGRNAndConnectGenes.
It then generates the TF-gene graph from the filtered connections, and clusters its vertices into communities using established community detection algorithms.
calculateCommunitiesStats(GRN, clustering = "louvain", forceRerun = FALSE, ...)
GRN |
Object of class |
clustering |
Character. Default |
forceRerun |
|
... |
Additional parameters for the used clustering method, see the |
An updated GRN object, with a table that consists of the connections clustered into communities stored in the GRN@graph$TF_gene$clusterGraph slot as well as within the igraph object in GRN@graph$TF_gene$graph (retrievable via igraph using igraph::vertex.attributes(GRN@graph$TF_gene$graph)$community, for example).
calculateCommunitiesEnrichment
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = calculateCommunitiesStats(GRN, forceRerun = FALSE)
GRN
object
The enrichment analysis is based on the whole network; see calculateCommunitiesEnrichment and calculateTFEnrichment for community- and TF-specific enrichment, respectively.
This function requires the existence of the eGRN graph in the GRN object as produced by build_eGRN_graph.
Results can subsequently be visualized with the function plotGeneralEnrichment.
calculateGeneralEnrichment( GRN, ontology = c("GO_BP", "GO_MF"), algorithm = "weight01", statistic = "fisher", background = "neighborhood", background_geneTypes = "all", pAdjustMethod = "BH", forceRerun = FALSE )
GRN |
Object of class |
ontology |
Character vector of ontologies. Default |
algorithm |
Character. Default |
statistic |
Character. Default |
background |
Character. Default |
background_geneTypes |
Character vector of gene types that should be considered for the background. Default |
pAdjustMethod |
Character. Default |
forceRerun |
|
All enrichment functions use the TF-gene graph as defined in the 'GRN' object. See the 'ontology' argument for currently supported ontologies. Also note that some parameter combinations for 'algorithm' and 'statistic' are incompatible; an error message will be thrown in such a case.
An updated GRN
object, with the enrichment results stored in the stats$Enrichment$general
slot.
calculateCommunitiesEnrichment
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = calculateGeneralEnrichment(GRN, ontology = "GO_BP", forceRerun = FALSE)
GRN
object
The enrichment analysis is based on the subset of the network connected to particular TFs (TF regulons); see calculateCommunitiesEnrichment and calculateGeneralEnrichment for community- and general enrichment, respectively.
This function requires the existence of the eGRN graph in the GRN object as produced by build_eGRN_graph.
Results can subsequently be visualized with the function plotTFEnrichment.
calculateTFEnrichment( GRN, rankType = "degree", n = 3, TF.IDs = NULL, ontology = c("GO_BP", "GO_MF"), algorithm = "weight01", statistic = "fisher", background = "neighborhood", background_geneTypes = "all", pAdjustMethod = "BH", forceRerun = FALSE )
GRN |
Object of class |
rankType |
Character. Default |
n |
Numeric. Default 3. If this parameter is passed as a value between 0 and 1, it is treated as a percentage of top nodes. If the value is passed as an integer it will be treated as the number of top nodes. This parameter is not relevant if |
TF.IDs |
Character vector. Default |
ontology |
Character vector of ontologies. Default |
algorithm |
Character. Default |
statistic |
Character. Default |
background |
Character. Default |
background_geneTypes |
Character vector of gene types that should be considered for the background. Default |
pAdjustMethod |
Character. Default |
forceRerun |
|
All enrichment functions use the TF-gene graph as defined in the 'GRN' object. See the 'ontology' argument for currently supported ontologies. Also note that some parameter combinations for 'algorithm' and 'statistic' are incompatible; an error message will be thrown in such a case.
An updated GRN
object, with the enrichment results stored in the stats$Enrichment$byTF
slot.
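The two interpretations of the argument n (a fraction of the top nodes for values between 0 and 1, an absolute number otherwise) can be sketched as follows in base R. The helper `nTopNodes`, the rounding behavior, and the toy degree vector are all hypothetical and only illustrate the documented semantics; they are not taken from the package code.

```r
# Hypothetical helper, not part of GRaNIE
nTopNodes <- function(n, nNodesTotal) {
  stopifnot(n > 0)
  if (n < 1) {
    # Treated as a fraction of all nodes; rounding is an assumption
    max(1, round(n * nNodesTotal))
  } else {
    # Treated as an absolute number of nodes
    min(as.integer(n), nNodesTotal)
  }
}

# Toy node degrees for five hypothetical TFs
degree <- c(TF1 = 12, TF2 = 40, TF3 = 7, TF4 = 25, TF5 = 3)

topTFs <- function(n) {
  ranked <- names(sort(degree, decreasing = TRUE))
  ranked[seq_len(nTopNodes(n, length(degree)))]
}

print(topTFs(3))    # three highest-degree TFs
print(topTFs(0.4))  # top 40% of the TFs
```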
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = calculateTFEnrichment(GRN, n = 5, ontology = "GO_BP", forceRerun = FALSE)
Change the output directory of a GRN object
changeOutputDirectory(GRN, outputDirectory = ".")
GRN |
Object of class |
outputDirectory |
Character. Default |
An updated GRN
object, with the output directory being adjusted accordingly
GRN = loadExampleObject()
GRN = changeOutputDirectory(GRN, outputDirectory = ".")
AR_classification_wrapper
and summary statistics that may occupy a lot of space
Optional convenience function to delete intermediate data from the function AR_classification_wrapper and summary statistics that may occupy a lot of space.
deleteIntermediateData(GRN)
GRN |
Object of class |
An updated GRN
object, with some slots being deleted (GRN@data$TFs$classification
as well as GRN@stats$connectionDetails.l
)
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = deleteIntermediateData(GRN)
This helper function provides an easy and flexible way to retain particular connections for plotting and discard all others. Note that this filtering is only relevant and applicable for the function 'visualizeGRN()' and is ignored anywhere else. This makes it possible to visualize only specific TF regulons or to plot only connections that fulfill particular filter criteria. Because arbitrary filters are passed directly to dplyr::filter, users can flexibly investigate the eGRN visually, which is particularly useful when the eGRN is large and has many connections.
filterConnectionsForPlotting(GRN, plotAll = TRUE, ..., forceRerun = FALSE)
GRN |
Object of class |
plotAll |
|
... |
An arbitrary set of arguments that is used directly, without modification, as input for dplyr::filter and therefore has to be a valid expression that dplyr::filter understands.
The filtering is based on the |
forceRerun |
|
An updated GRN
object, with added data from this function.
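Since the filter expressions are handed to dplyr::filter, comma-separated conditions are combined with a logical AND, whereas `|` expresses a logical OR. The base R sketch below mimics this logic on a made-up connections table with `subset` so that it runs without additional packages; the table and the ID "CTCF.0.A" are hypothetical, while the column names are taken from the example calls for this function.

```r
# Hypothetical connections table
connections <- data.frame(
  TF.ID       = c("E2F6.0.A", "E2F6.0.A", "CTCF.0.A", "CTCF.0.A"),
  TF_peak.r   = c(0.80, 0.50, 0.90, 0.30),
  TF_peak.fdr = c(0.10, 0.15, 0.40, 0.50)
)

# Equivalent of passing "TF_peak.r > 0.7, TF_peak.fdr < 0.2": both must hold
both <- subset(connections, TF_peak.r > 0.7 & TF_peak.fdr < 0.2)

# Equivalent of passing "TF_peak.r > 0.7 | TF_peak.fdr < 0.2": either may hold
either <- subset(connections, TF_peak.r > 0.7 | TF_peak.fdr < 0.2)

# Equivalent of passing TF.ID == "E2F6.0.A": a single TF regulon
regulon <- subset(connections, TF.ID == "E2F6.0.A")

print(nrow(both))
print(nrow(either))
print(nrow(regulon))
```

The AND variant is the stricter of the two and retains a subset of the connections kept by the OR variant.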
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = filterConnectionsForPlotting(GRN, plotAll = FALSE, TF.ID == "E2F6.0.A")
GRN = filterConnectionsForPlotting(GRN, plotAll = FALSE, TF_peak.r > 0.7 | TF_peak.fdr < 0.2)
GRN = filterConnectionsForPlotting(GRN, plotAll = FALSE, TF_peak.r > 0.7, TF_peak.fdr < 0.2)
GRN
object
This function marks genes and/or peaks as filtered depending on the chosen filtering criteria and is based on the count data AFTER potential normalization as chosen when using the addData function. Most of the filters may no longer be meaningful after particular normalization schemes that can give rise to, for example, negative values, such as cyclic loess normalization. If normalized counts no longer represent counts but rather a deviation from a mean or something similar, the filtering criteria usually do not make sense anymore.
Filtered genes / peaks will then be disregarded when adding connections in subsequent steps via addConnections_TF_peak and addConnections_peak_gene. This function does NOT (re)filter existing connections when the GRN object already contains connections. Thus, upon re-execution of this function with different filtering criteria, all downstream steps have to be re-run.
filterData( GRN, minNormalizedMean_peaks = NULL, maxNormalizedMean_peaks = NULL, minNormalizedMeanRNA = NULL, maxNormalizedMeanRNA = NULL, chrToKeep_peaks = NULL, minSize_peaks = 20, maxSize_peaks = 10000, minCV_peaks = NULL, maxCV_peaks = NULL, minCV_genes = NULL, maxCV_genes = NULL, forceRerun = FALSE )
GRN |
Object of class |
minNormalizedMean_peaks |
Numeric[0,] or |
maxNormalizedMean_peaks |
Numeric[0,] or |
minNormalizedMeanRNA |
Numeric[0,] or |
maxNormalizedMeanRNA |
Numeric[0,] or |
chrToKeep_peaks |
Character vector or |
minSize_peaks |
Integer[1,] or |
maxSize_peaks |
Integer[1,] or |
minCV_peaks |
Numeric[0,] or |
maxCV_peaks |
Numeric[0,] or |
minCV_genes |
Numeric[0,] or |
maxCV_genes |
Numeric[0,] or |
forceRerun |
|
All this function does is set (or modify) the filtering flag in GRN@data$peaks$counts_metadata and GRN@data$RNA$counts_metadata, respectively.
An updated GRN
object, with added data from this function.
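The idea of flagging features by mean and coefficient of variation (CV) can be sketched in base R. This is only an illustration on a made-up count matrix, not the package's implementation: the thresholds `min_mean` and `max_cv` and the column name `isFiltered` are assumptions chosen for this sketch.

```r
# Toy normalized count matrix: 4 peaks (rows) x 5 samples (columns); all values are made up
counts <- matrix(
  c( 10, 12,  11,   9, 10,   # stable, moderately accessible peak
      0,  1,   0,   0,  1,   # very low mean
     50,  5,  80,   2, 60,   # highly variable peak
    100, 98, 102, 101, 99),  # stable, highly accessible peak
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("sample", 1:5))
)

# Hypothetical thresholds, analogous in spirit to minNormalizedMean_peaks and maxCV_peaks
min_mean <- 1
max_cv   <- 0.5

peak_mean <- rowMeans(counts)
peak_cv   <- apply(counts, 1, sd) / peak_mean  # coefficient of variation per peak

# Flag features instead of removing them, mirroring the "mark as filtered" idea
metadata <- data.frame(
  peakID     = rownames(counts),
  mean       = peak_mean,
  CV         = peak_cv,
  isFiltered = peak_mean < min_mean | peak_cv > max_cv
)
print(metadata)
```

Because features are only flagged, the count matrix itself stays intact and the flags can be recomputed with other thresholds.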
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = filterData(GRN, forceRerun = FALSE)
This is one of the main integrative functions of the GRaNIE package. It has two main functions:
First, filtering both TF-peak and peak-gene connections according to different criteria such as FDR and other properties.
Second, joining the three major elements that an eGRN consists of (TFs, peaks, genes) into one data frame, with one row per unique TF-peak-gene connection.
After successful execution, the connections (along with additional feature metadata) can be retrieved with the function getGRNConnections.
Note that a previously stored eGRN graph is reset upon successful execution of this function, accompanied by a descriptive warning, and the function build_eGRN_graph has to be re-run before any of the network functions of the package can be executed.
If the filtered connections changed, all network-related enrichment functions also have to be rerun.
Internally, before joining them, TF-peak links and peak-gene connections are filtered separately for reasons of memory and computational efficiency:
Filtering out unwanted links first dramatically reduces the memory needed for the full eGRN. Peak-gene p-value adjustment is only done after all filtering steps, on the remaining set of
connections, to lower the statistical burden of multiple-testing adjustment; therefore, this may lead to initially counter-intuitive effects such as particular connections not being included anymore as compared to a
filtering based on different thresholds, or the FDR being different for the same reason.
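Why the adjusted values depend on the filtering can be illustrated with base R's `p.adjust`. The raw p-values and the `keep` vector below are made up for this sketch and do not come from the package.

```r
# Made-up raw p-values for six peak-gene pairs
p_raw <- c(0.001, 0.012, 0.03, 0.04, 0.2, 0.9)

# Adjust across all six tests
p_adj_all <- p.adjust(p_raw, method = "BH")

# Adjust only across the tests that survive a hypothetical upstream filter
keep <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
p_adj_subset <- p.adjust(p_raw[keep], method = "BH")

# Same raw p-values, different adjusted values depending on the set of tests
print(data.frame(
  p_raw      = p_raw[keep],
  adj_all    = p_adj_all[keep],
  adj_subset = p_adj_subset
))
```

In this toy case the adjusted values computed on the filtered subset are no larger than those computed on all tests, so a connection can pass a given FDR threshold under one filter setting and fail it under another.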
filterGRNAndConnectGenes( GRN, TF_peak.fdr.threshold = 0.2, TF_peak.connectionTypes = "all", peak.SNP_filter = list(min_nSNPs = 0, filterType = "orthogonal"), peak_gene.p_raw.threshold = NULL, peak_gene.fdr.threshold = 0.2, peak_gene.fdr.method = "BH", peak_gene.IHW.covariate = NULL, peak_gene.IHW.nbins = "auto", outputFolder = NULL, gene.types = c("all"), allowMissingTFs = FALSE, allowMissingGenes = TRUE, peak_gene.r_range = c(0, 1), peak_gene.selection = "all", peak_gene.maxDistance = NULL, filterTFs = NULL, filterGenes = NULL, filterPeaks = NULL, TF_peak_FDR_selectViaCorBins = FALSE, filterLoops = TRUE, resetGraphAndStoreInternally = TRUE, silent = FALSE, forceRerun = FALSE )
GRN |
Object of class |
TF_peak.fdr.threshold |
Numeric[0,1]. Default 0.2. Maximum FDR for the TF-peak links. Set to 1 or NULL to disable this filter. |
TF_peak.connectionTypes |
Character vector. Default |
peak.SNP_filter |
Named list. Default |
peak_gene.p_raw.threshold |
Numeric[0,1]. Default NULL. Threshold for the peak-gene connections, based on the raw p-value. All peak-gene connections with a larger raw p-value will be filtered out. |
peak_gene.fdr.threshold |
Numeric[0,1]. Default 0.2. Threshold for the peak-gene connections, based on the FDR. All peak-gene connections with a larger FDR will be filtered out. |
peak_gene.fdr.method |
Character. Default "BH". One of: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none", "IHW".
Method for adjusting p-values for multiple testing.
If set to "IHW", the package |
peak_gene.IHW.covariate |
Character. Default |
peak_gene.IHW.nbins |
Integer or "auto". Default "auto". Number of bins for IHW. Only relevant if |
outputFolder |
Character or |
gene.types |
Character vector of supported gene types. Default |
allowMissingTFs |
|
allowMissingGenes |
|
peak_gene.r_range |
Numeric(2). Default |
peak_gene.selection |
|
peak_gene.maxDistance |
Integer >0. Default |
filterTFs |
Character vector. Default |
filterGenes |
Character vector. Default |
filterPeaks |
Character vector. Default |
TF_peak_FDR_selectViaCorBins |
|
filterLoops |
|
resetGraphAndStoreInternally |
|
silent |
|
forceRerun |
|
An updated GRN
object, with additional information added from this function.
The filtered and merged TF-peak and peak-gene connections are stored in the slot GRN@connections$all.filtered and can be retrieved (along with other feature metadata) using the function getGRNConnections.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = filterGRNAndConnectGenes(GRN)
GRN
object.
This function calls filterGRNAndConnectGenes
repeatedly and stores the total number of connections and other statistics each time to summarize them afterwards.
All arguments are identical to the ones in filterGRNAndConnectGenes
, see the help for this function for details.
The function plot_stats_connectionSummary
can be used afterwards for plotting.
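To get a feeling for how many parameter combinations such a summary may cover, the default threshold vectors of generateStatsSummary can be expanded into a grid in base R. That the function evaluates every combination of these arguments is an assumption of this sketch, not something stated in this help page.

```r
# Grid of parameter combinations built from the default argument values
grid <- expand.grid(
  TF_peak.fdr       = c(0.001, 0.01, 0.05, 0.1, 0.2),
  peak_gene.fdr     = c(0.001, 0.01, 0.05, 0.1, 0.2),
  allowMissingGenes = c(FALSE, TRUE),
  allowMissingTFs   = FALSE
)

# Number of combinations and a preview of the first few
print(nrow(grid))
print(head(grid))
```

Reducing the threshold vectors, as done in the example for this function, shrinks the grid and thereby the running time.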
generateStatsSummary( GRN, TF_peak.fdr = c(0.001, 0.01, 0.05, 0.1, 0.2), TF_peak.connectionTypes = "all", peak_gene.fdr = c(0.001, 0.01, 0.05, 0.1, 0.2), peak_gene.r_range = c(0, 1), gene.types = c("all"), allowMissingGenes = c(FALSE, TRUE), allowMissingTFs = c(FALSE), forceRerun = FALSE )
GRN |
Object of class |
TF_peak.fdr |
Numeric vector[0,1]. Default |
TF_peak.connectionTypes |
Character vector. Default |
peak_gene.fdr |
Numeric vector[0,1]. Default |
peak_gene.r_range |
Numeric vector of length 2[-1,1]. Default |
gene.types |
Character vector of supported gene types. Default |
allowMissingGenes |
Logical vector of length 1 or 2. Default |
allowMissingTFs |
Logical vector of length 1 or 2. Default |
forceRerun |
|
An updated GRN
object, with additional information added from this function.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = generateStatsSummary(GRN, TF_peak.fdr = c(0.01, 0.1), peak_gene.fdr = c(0.01, 0.1))
GRN
object
Get counts for the various data defined in a GRN
object.
Note: This function, as all get
functions from this package, does NOT return a GRN
object.
getCounts( GRN, type, permuted = FALSE, asMatrix = FALSE, includeIDColumn = TRUE, includeFiltered = FALSE )
GRN |
Object of class |
type |
Character. Either |
permuted |
|
asMatrix |
Logical. |
includeIDColumn |
Logical. |
includeFiltered |
Logical. |
Data frame of counts, with the type as indicated by the function parameters. This function does **NOT** return a GRN
object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
counts.df = getCounts(GRN, type = "peaks", permuted = FALSE)
GRN
object as a data frame.
Returns stored connections/links (either TF-peak, peak-gene, TF-gene or the filtered set of connections as produced by filterGRNAndConnectGenes
).
Additional meta columns (TF, peak and gene metadata) can be added optionally.
Note: This function, as all get
functions from this package, does NOT return a GRN
object.
getGRNConnections( GRN, type = "all.filtered", background = FALSE, include_TF_gene_correlations = FALSE, include_TFMetadata = FALSE, include_peakMetadata = FALSE, include_geneMetadata = FALSE, include_variancePartitionResults = FALSE )
GRN |
Object of class |
type |
Character. One of |
background |
Integer (0 or 1). Default |
include_TF_gene_correlations |
Logical. |
include_TFMetadata |
Logical. |
include_peakMetadata |
Logical. |
include_geneMetadata |
Logical. |
include_variancePartitionResults |
Logical. |
A data frame with the requested connections. This function does **NOT** return a GRN
object. Depending on the arguments, the
data frame that is returned has different columns, which however can be divided into the following classes according to their name:
TF-related: Starting with TF.
:
TF.name
and TF.ID
: Name / ID of the TF
TF.ENSEMBL
: Ensembl ID (unique)
peak-related: Starting with peak.
:
peak.ID
: ID (coordinates)
peak.mean
, peak.median
, peak.CV
: peak mean, median and its coefficient of variation (CV) across all samples
peak.annotation
: Peak annotation as determined by ChIPseeker
such as Promoter, 5’ UTR, 3’ UTR, Exon, Intron, Downstream, Intergenic
peak.nearestGene*
: Additional metadata for the nearest gene such as position (chr
, start
, end
, strand
),
name (name
, symbol
and ENSEMBL
), and distance to the TSS (distanceToTSS
)
peak.GC.perc
: GC percentage
gene-related: Starting with gene.
:
gene.name
and gene.ENSEMBL
: gene name and Ensembl ID
gene.type
: gene type (such as protein_coding
, lincRNA
) as retrieved by biomaRt
gene.mean
, gene.median
, gene.CV
: gene mean, median and its coefficient of variation (CV) across all samples
TF-peak-related: Starting with TF_peak.
:
TF_peak.r
and TF_peak.r_bin
: Correlation coefficient of the TF-peak pair and its correlation bin (in bins of width 0.05, such as (-0.55,-0.5] for r = -0.53)
TF_peak.fdr
and TF_peak.fdr_direction
: TF-peak FDR and the directionality from which it was derived (see Methods in the paper, pos
or neg
)
TF_peak.connectionType
: TF-peak connection type. This is by default expression
, meaning that expression was used to construct the TF and peak
peak-gene-related: Starting with peak_gene.
:
peak_gene.source
: Source/Origin of the identified connection. Either neighborhood
, TADs
or knownLinks
,
depending on the parameters used when running the function addConnections_peak_gene
peak_gene.bait_OE_ID
: Only present when known links have been provided (see addConnections_peak_gene
). This column denotes the original IDs of the bait and OE coordinates that identified this link.
peak_gene.tad_ID
: Only present when TADs have been provided (see addConnections_peak_gene
). This column denotes the original ID of the TAD that identified this link.
peak_gene.distance
: Peak-gene distance (usually taking the TSS of the gene as reference unless specified otherwise, see the parameter overlapTypeGene
for more information from addConnections_peak_gene
).
If the peak-gene connection is across chromosomes (as defined by the known links, see addConnections_peak_gene
), the distance is set to NA.
peak_gene.r
: Correlation coefficient of the peak-gene pair
peak_gene.p_raw
and peak_gene.p_adj
: Raw and adjusted p-value of the peak-gene pair
TF-gene-related: Starting with TF_gene.
:
TF_gene.r
: Correlation coefficient of the TF-gene pair
TF_gene.p_raw
: Raw p-value of the TF-gene pair
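Since the result is an ordinary data frame, the documented columns can be used directly for further filtering. The following sketch uses a small hand-made data frame as a stand-in for the real output. All values and thresholds are made up, and reproducing the bins with `cut()` is an assumption about how bins such as (-0.55,-0.5] can be obtained, not a statement about the package internals.

```r
# Toy stand-in for a connections data frame; all values are made up
con <- data.frame(
  TF.name         = c("TF_A", "TF_A", "TF_B", "TF_B"),
  peak.ID         = c("chr1:100-200", "chr1:500-700", "chr2:50-150", "chr2:900-990"),
  gene.name       = c("GENE1", "GENE2", "GENE3", NA),
  TF_peak.r       = c(0.62, -0.53, 0.41, 0.77),
  TF_peak.fdr     = c(0.01, 0.15, 0.30, 0.05),
  peak_gene.r     = c(0.55, 0.20, 0.70, NA),
  peak_gene.p_adj = c(0.02, 0.40, 0.01, NA),
  stringsAsFactors = FALSE
)

# Keep complete TF-peak-gene triplets that pass both (hypothetical) thresholds
strong <- subset(con, !is.na(gene.name) & TF_peak.fdr < 0.2 & peak_gene.p_adj < 0.1)
print(strong)

# Right-closed correlation bins of width 0.05, as described for TF_peak.r_bin
con$TF_peak.r_bin <- cut(con$TF_peak.r, breaks = seq(-1, 1, by = 0.05))
print(con[, c("TF_peak.r", "TF_peak.r_bin")])
```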
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN_con.all.df = getGRNConnections(GRN)
GRN
object to a named list for comparison with other GRN
objects.
Note: This function, as all get
functions from this package, does NOT return a GRN
object.
getGRNSummary(GRN, silent = FALSE)
GRN |
Object of class |
silent |
|
A named list summarizing the GRN object. This function does **NOT** return a GRN
object, but instead a named list with the
following elements:
data
:
peaks
, genes
and TFs
:
sharedSamples
:
metadata
:
parameters
and config
: GRN parameters and config information
connections
: Connection summary for different connection types
TF_peak
: TF-peak (number of connections for different FDR thresholds)
peak_genes
: Peak-gene
TF_peak_gene
: TF-peak-gene
network
: Network-related summary, including the number of nodes, edges, communities and enrichment for both the TF-peak-gene and TF-gene network
TF_gene
TF_peak_gene
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
summary.l = getGRNSummary(GRN)
GRN
object.
Note: This function, as all get
functions from this package, does NOT return a GRN
object.
getParameters(GRN, type = "parameter", name = "all")
GRN |
Object of class |
type |
Character. Either |
name |
Character. Default |
The requested parameters. This function does **NOT** return a GRN
object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
params.l = getParameters(GRN, type = "parameter", name = "all")
GRN
object.
This function requires a filtered set of connections in the GRN
object as generated by filterGRNAndConnectGenes
.
Note: This function, as all get
functions from this package, does NOT return a GRN
object.
getTopNodes(GRN, nodeType, rankType, n = 0.1, use_TF_gene_network = TRUE)
GRN |
Object of class |
nodeType |
Character. One of: |
rankType |
Character. One of: |
n |
Numeric. Default 0.1. If this parameter is passed as a value between [0,1], it is treated as a percentage of top nodes. If the value is passed as an integer >=1 it will be treated as the number of top nodes. |
use_TF_gene_network |
|
A data frame with the node names and the corresponding scores used to rank them
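The two interpretations of `n` (fraction versus absolute number) can be made concrete with a small helper. `resolve_top_n` is a hypothetical function written for this sketch and is not part of GRaNIE. Treating only values below 1 as fractions and rounding to the nearest integer are assumptions, as the description above does not specify either detail.

```r
# Hypothetical helper (not part of GRaNIE): turn `n` into an absolute number of nodes
resolve_top_n <- function(n, n_nodes) {
  stopifnot(is.numeric(n), length(n) == 1, n > 0)
  if (n < 1) {
    # Interpreted as a fraction of all nodes; rounding is an assumption of this sketch
    round(n * n_nodes)
  } else {
    # Interpreted as an absolute count, capped at the number of available nodes
    min(as.integer(n), n_nodes)
  }
}

print(resolve_top_n(0.1, n_nodes = 250))  # fraction of 250 nodes
print(resolve_top_n(5,   n_nodes = 250))  # absolute number of nodes
```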
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
topGenes = getTopNodes(GRN, nodeType = "gene", rankType = "degree", n = 3)
topTFs = getTopNodes(GRN, nodeType = "TF", rankType = "EV", n = 5)
Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (eGRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such an eGRN using bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally TF activity data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which are connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.
See the Vignettes for a workflow example and more generally https://grp-zaugg.embl-community.io/GRaNIE for all project-related information.
The GRaNIE
package works with GRN
objects. See GRN
for details.
Please check out https://grp-zaugg.embl-community.io/GRaNIE for how to get in contact with us.
Maintainer: Christian Arnold
Authors:
Judith Zaugg
Rim Moussa
Other contributors:
Armando Reyes-Palomares [contributor]
Giovanni Palla [contributor]
Maksim Kholmatov [contributor]
Useful links:
Report bugs at https://git.embl.de/grp-zaugg/GRaNIE/issues
The class GRN
stores data and information related to our eGRN
approach to construct enhancer-mediated gene regulatory networks out of open chromatin and RNA-Seq data. See the description below for more details, and visit our project website at https://grp-zaugg.embl-community.io/GRaNIE and have a look at the various Vignettes.
data
Currently stores 4 different types of data:
peaks
:
counts
:
counts_metadata
:
RNA
:
counts
:
counts_metadata
:
counts_permuted_index
:
TFs
:
TF_activity
:
TF_peak_overlap
:
classification
:
config
Contains general configuration data and parameters such as parameters, files, directories, flags, and recorded function parameters.
connections
Stores various types of connections
annotation
Stores annotation data for peaks and genes
stats
Stores statistical and summary information for a GRN
network. Currently, connection details are stored here.
graph
Stores the eGRN graph related information and data structures
Currently, a GRN
object is created by executing the function initializeGRN
.
In the following code snippets, GRN
is a GRN
object.
# Get general annotation of a GRN object from the GRaNIE package
nPeaks(GRN), nTFs(GRN) and nGenes(GRN): Retrieve the number of peaks, TFs and genes, respectively, that have been added to the object (both before and after filtering)
We do not yet provide full support for this function. It is currently being tested. Use at your own risk.
importTFData( GRN, data, name, idColumn = "ENSEMBL", nameColumn = "TF.name", normalization = "none", forceRerun = FALSE )
GRN |
Object of class |
data |
Data frame. No default. Data with TF data. |
name |
Name in object under which it should be stored. This corresponds to the |
idColumn |
Character. Default |
nameColumn |
Character. Default |
normalization |
Character. Default |
forceRerun |
|
An updated GRN
object, with added data from this function.
GRN
object.
Executing this function is the very first step in the *GRaNIE* workflow. After its execution, data can be added to the object.
initializeGRN(objectMetadata = list(), outputFolder = ".", genomeAssembly)
objectMetadata |
List. Default |
outputFolder |
Output folder, either absolute or relative to the current working directory. Default |
genomeAssembly |
Character. No default. The genome assembly of all data that is to be used within this object.
Currently, supported genomes are: |
Empty GRN
object
meta.l = list(name = "exampleName", date = "01.03.22")
GRN = initializeGRN(objectMetadata = meta.l, outputFolder = "output", genomeAssembly = "hg38")
Loads an example GRN object with 6 TFs, ~61,000 peaks, ~19,000 genes, 259 filtered connections and pre-calculated enrichments.
This function uses BiocFileCache, if installed, to cache the example object, which is considerably faster than re-downloading the file every time the function is executed.
If BiocFileCache is not installed, the file is re-downloaded on every call. Thus, to enable caching, you may install the package BiocFileCache.
loadExampleObject( forceDownload = FALSE, fileURL = "https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds" )
forceDownload |
|
fileURL |
Character. Default https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds. URL to the GRN example object in rds format. |
A small example GRN
object
GRN = loadExampleObject()
GRN
object.
Returns the number of genes (all or only non-filtered ones) from the provided RNA-seq data in the GRN
object.
nGenes(GRN, filter = TRUE)
GRN |
Object of class |
filter |
TRUE or FALSE. Default TRUE. Should genes marked as filtered be excluded from the count? |
Integer. Number of genes that are defined in the GRN
object, either by excluding (filter = TRUE) or including (filter = FALSE) genes that are currently marked as filtered.
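The semantics of `filter` can be illustrated on a toy metadata table. The helper `count_features` and the column name `isFiltered` are assumptions made for this sketch. They only mimic the behavior described above and are not the package's code.

```r
# Toy gene metadata; the `isFiltered` column name is an assumption of this sketch
gene_metadata <- data.frame(
  ID         = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  isFiltered = c(FALSE, TRUE, FALSE, TRUE)
)

# Hypothetical counterpart of nGenes(): filter = TRUE counts only non-filtered genes
count_features <- function(metadata, filter = TRUE) {
  if (filter) sum(!metadata$isFiltered) else nrow(metadata)
}

print(count_features(gene_metadata, filter = TRUE))   # genes not marked as filtered
print(count_features(gene_metadata, filter = FALSE))  # all genes
```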
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
nGenes(GRN, filter = TRUE)
nGenes(GRN, filter = FALSE)
GRN
object.
Returns the number of peaks (all or only non-filtered ones) from the provided peak data in the GRN
object.
nPeaks(GRN, filter = TRUE)
GRN |
Object of class |
filter |
TRUE or FALSE. Default TRUE. Should peaks marked as filtered be excluded from the count? |
Integer. Number of peaks that are defined in the GRN
object, either by excluding (filter = TRUE) or including (filter = FALSE) peaks that are currently marked as filtered.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
nPeaks(GRN, filter = TRUE)
nPeaks(GRN, filter = FALSE)
GRN
object.
Returns the number of TFs from the provided TFBS data in the GRN
object.
nTFs(GRN)
GRN |
Object of class |
Integer. Number of TFs that are defined in the GRN
object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
nTFs(GRN)
GRN
object
If the source was set to JASPAR in addTFBS, the argument nCores is ignored.
overlapPeaksAndTFBS(GRN, nCores = 2, forceRerun = FALSE, ...)
GRN |
Object of class |
nCores |
Integer >0. Default 2. Number of cores to use.
A value >1 requires the |
forceRerun |
|
... |
No default. Only relevant if |
An updated GRN
object, with added data from this function (GRN@data$TFs$TF_peak_overlap
in particular)
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = overlapPeaksAndTFBS(GRN, nCores = 2, forceRerun = FALSE)
A convenience function that calls all network-related functions in one go, using selected default parameters along with a set of adjustable ones.
For full adjustment, run the individual functions separately.
This function requires a filtered set of connections in the GRN
object as generated by filterGRNAndConnectGenes
performAllNetworkAnalyses( GRN, ontology = c("GO_BP", "GO_MF"), algorithm = "weight01", statistic = "fisher", background = "neighborhood", clustering = "louvain", communities = NULL, selection = "byRank", topnGenes = 20, topnTFs = 20, maxWidth_nchar_plot = 50, display_pAdj = FALSE, outputFolder = NULL, forceRerun = FALSE )
GRN |
Object of class |
ontology |
Character vector of ontologies. Default |
algorithm |
Character. Default |
statistic |
Character. Default |
background |
Character. Default |
clustering |
Character. Default |
communities |
|
selection |
Character. Default |
topnGenes |
Integer > 0. Default 20. Number of genes to plot, sorted by their rank or label. |
topnTFs |
Integer > 0. Default 20. Number of TFs to plot, sorted by their rank or label. |
maxWidth_nchar_plot |
Integer (>=10). Default 50. Maximum number of characters for a term before it is truncated. |
display_pAdj |
|
outputFolder |
Character or |
forceRerun |
|
An updated GRN
object, with added data from this function.
calculateCommunitiesEnrichment
# See the Workflow vignette on the GRaNIE website for examples
# GRN = loadExampleObject()
# GRN = performAllNetworkAnalyses(GRN, outputFolder = ".", forceRerun = FALSE)
GRN
object
Plot various network connectivity summaries for a GRN
object
plot_stats_connectionSummary( GRN, type = "heatmap", outputFolder = NULL, basenameOutput = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
type |
Character. Either |
outputFolder |
Character or |
basenameOutput |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN
object, without modifications.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plot_stats_connectionSummary(GRN, forceRerun = FALSE, plotAsPDF = FALSE, pages = 1)
GRN
object
Similarly to plotGeneralEnrichment
and plotTFEnrichment
, the results of the community-based enrichment analysis are plotted.
This function produces multiple plots. First, one plot per community to summarize the community-specific enrichment.
Second, a summary heatmap of all significantly enriched terms across all communities and for the whole eGRN. The latter allows comparing the results with the general network enrichment.
Third, a subset of the aforementioned heatmap, showing only the most significantly enriched terms per community and for the whole eGRN (as specified by nID) for improved visibility.
plotCommunitiesEnrichment( GRN, outputFolder = NULL, basenameOutput = NULL, selection = "byRank", communities = NULL, topn_pvalue = 30, p = 0.05, nSignificant = 2, nID = 10, maxWidth_nchar_plot = 50, display_pAdj = FALSE, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
selection |
Character. Default |
communities |
|
topn_pvalue |
Numeric. Default 30. Maximum number of ontology terms that meet the p-value significance threshold to display in the enrichment dot plot |
p |
Numeric. Default 0.05. p-value threshold to determine significance. |
nSignificant |
Numeric > 0. Default 2. Threshold to filter out an ontology term with fewer than |
nID |
Numeric > 0. Default 10. For the reduced summary heatmap, number of top terms to select per community / for the general enrichment. |
maxWidth_nchar_plot |
Integer (>=10). Default 50. Maximum number of characters for a term before it is truncated. |
display_pAdj |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN
object, without modifications.
calculateCommunitiesEnrichment
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotCommunitiesEnrichment(GRN, plotAsPDF = FALSE, pages = 1)
GRN
Similarly to the statistics produced by plotGeneralGraphStats
, summaries regarding the vertex degrees and the most important vertices per community are generated. Note that the communities need to first be calculated using the calculateCommunitiesStats
function
plotCommunitiesStats( GRN, outputFolder = NULL, basenameOutput = NULL, selection = "byRank", communities = seq_len(5), topnGenes = 20, topnTFs = 20, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
selection |
Character. Default |
communities |
|
topnGenes |
Integer > 0. Default 20. Number of genes to plot, sorted by their rank or label. |
topnTFs |
Integer > 0. Default 20. Number of TFs to plot, sorted by their rank or label. |
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN
object, without modifications.
calculateCommunitiesEnrichment
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotCommunitiesStats(GRN, plotAsPDF = FALSE, pages = 1)
GRN
object
The user can select multiple filters to plot only pairs of interest. The data shown is the same data that was used to construct the eGRN.
plotCorrelations( GRN, type = "all.filtered", TF.IDs = NULL, peak.IDs = NULL, gene.IDs = NULL, min_abs_r = 0, TF_peak_maxFDR = 0.2, peak_gene_max_rawP = 0.2, TF_gene_max_rawP = 0.2, nMax = 10, nSelectionType = "random", dataType = c("real"), corMethod = NULL, outputFolder = NULL, basenameOutput = NULL, plotAsPDF = TRUE, plotsPerPage = c(2, 2), pdf_width = 10, pdf_height = 8, forceRerun = FALSE )
GRN |
Object of class |
type |
Character(1). Default |
TF.IDs |
Character(). Default |
peak.IDs |
Character(). Default |
gene.IDs |
Character(). Default |
min_abs_r |
Numeric[0,1]. Default 0. Filter for all types of pairs: Minimum correlation coefficient (absolute value) required to include a particular pair. |
TF_peak_maxFDR |
Numeric[0,1]. Default 0.2. Filter for TF-peak pairs: Maximum FDR that a pair may have to be plotted. Only applicable when |
peak_gene_max_rawP |
Numeric[0,1]. Default 0.2. Filter for peak-gene pairs: Maximum raw p-value that a pair may have to be plotted. Only applicable when |
TF_gene_max_rawP |
Numeric[0,1]. Default 0.2. Filter for TF-gene pairs: Maximum raw p-value that a pair may have to be plotted. Only applicable when |
nMax |
Numeric(1). Default 10. Filter for all types of pairs: maximum number of selected pairs that fulfill all other filters that should be plotted. If set to 0, this filter will be disabled and all pairs that fulfill the user-defined criteria will be plotted. If set to a value > 0, different pairs may be selected each time the function is run (if the total number of remaining pairs is large enough) |
nSelectionType |
|
dataType |
Character vector. One of, or both of, |
corMethod |
Character. Either |
outputFolder |
Character or |
basenameOutput |
|
plotAsPDF |
|
plotsPerPage |
Integer vector of length 2. Default |
pdf_width |
Number > 0. Default 10. Width of the PDF, in cm. |
pdf_height |
Number > 0. Default 8. Height of the PDF, in cm. |
forceRerun |
|
An updated GRN object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotCorrelations(GRN, nMax = 1, min_abs_r = 0.8, plotsPerPage = c(1,1), plotAsPDF = FALSE)
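The filter arguments of plotCorrelations can be pictured as successive row filters on a table of pairs. The following is a minimal base-R sketch of that idea, not the package's implementation: the data frame `pairs`, its column names, and the inclusive comparisons (`>=`, `<=`) are assumptions made for illustration.

```r
# Hypothetical table of TF-peak pairs; names and values are illustrative only
set.seed(1)
pairs <- data.frame(
  TF   = paste0("TF", 1:8),
  peak = paste0("peak", 1:8),
  r    = c(0.9, -0.85, 0.3, 0.1, -0.95, 0.82, 0.5, -0.2),
  fdr  = c(0.01, 0.05, 0.5, 0.9, 0.001, 0.15, 0.3, 0.6),
  stringsAsFactors = FALSE
)

min_abs_r      <- 0.8  # minimum absolute correlation coefficient
TF_peak_maxFDR <- 0.2  # maximum FDR for TF-peak pairs
nMax           <- 2    # maximum number of pairs to plot (0 disables the limit)

# Keep only pairs that pass the correlation and FDR filters
kept <- pairs[abs(pairs$r) >= min_abs_r & pairs$fdr <= TF_peak_maxFDR, ]

# If more pairs remain than nMax, draw a random subset
# (this is why repeated runs may show different pairs)
if (nMax > 0 && nrow(kept) > nMax) {
  kept <- kept[sample(nrow(kept), nMax), ]
}
```

Four pairs pass the two thresholds here, so the random draw reduces them to two. Fixing the seed with set.seed is what makes the selection reproducible in this sketch.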
GRN object
Plot diagnostic plots for peak-gene connections for a GRN object.
plotDiagnosticPlots_peakGene( GRN, outputFolder = NULL, basenameOutput = NULL, gene.types = list(c("all"), c("protein_coding")), useFiltered = FALSE, plotDetails = FALSE, plotPerTF = FALSE, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
gene.types |
List of character vectors. Default |
useFiltered |
Logical. |
plotDetails |
|
plotPerTF |
Logical. |
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
An updated GRN object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
types = list(c("protein_coding"))
GRN = plotDiagnosticPlots_peakGene(GRN, gene.types=types, plotAsPDF = FALSE, pages = 1)
GRN object
Plot diagnostic plots for TF-peak connections for a GRN object.
plotDiagnosticPlots_TFPeaks( GRN, outputFolder = NULL, basenameOutput = NULL, plotDetails = FALSE, dataType = c("real", "background"), nTFMax = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height_base = 8, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
plotDetails |
|
dataType |
Character vector. One of, or both of, |
nTFMax |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height_base |
Number. Default 8. Base height of the PDF, in cm, per connection type. The total height is automatically determined based on the number of connection types that are found in the object (e.g., expression or TF activity). For example, when two connection types are found, the base height is multiplied by 2. |
pages |
Integer vector or |
forceRerun |
|
An updated GRN object.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotDiagnosticPlots_TFPeaks(GRN, outputFolder = ".", dataType = "real", nTFMax = 2, pages = 1)
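The description of pdf_height_base implies a simple scaling rule for the total page height. As a small worked example (the connection type labels below are hypothetical placeholders, not values taken from the package):

```r
# Base height per connection type, as in the default of plotDiagnosticPlots_TFPeaks
pdf_height_base <- 8

# Hypothetical connection types found in the object
connectionTypes <- c("expression", "TF_activity")

# Total height = base height x number of connection types
pdf_height_total <- pdf_height_base * length(connectionTypes)
pdf_height_total
```

With two connection types and the default base height of 8, the total height is 16.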
GRN object
Note: The arguments nTFMax and pages are not implemented yet.
plotDiagnosticPlots_TFPeaks_GC( GRN, outputFolder = NULL, basenameOutput = NULL, dataType = c("real", "background"), nTFMax = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height_base = 15, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
dataType |
Character vector. One of, or both of, |
nTFMax |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height_base |
Number. Default 15. Base height of the PDF, in cm, per connection type. The total height is automatically determined based on the number of connection types that are found in the object (e.g., expression or TF activity). For example, when two connection types are found, the base height is multiplied by 2. |
pages |
Integer vector or |
forceRerun |
|
This function plots the results of the general enrichment analysis for every specified ontology.
plotGeneralEnrichment( GRN, outputFolder = NULL, basenameOutput = NULL, ontology = NULL, topn_pvalue = 30, p = 0.05, display_pAdj = FALSE, maxWidth_nchar_plot = 50, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
ontology |
Character. |
topn_pvalue |
Numeric. Default 30. Maximum number of ontology terms that meet the p-value significance threshold to display in the enrichment dot plot |
p |
Numeric. Default 0.05. p-value threshold to determine significance. |
display_pAdj |
|
maxWidth_nchar_plot |
Integer (>=10). Default 50. Maximum number of characters for a term before it is truncated. |
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN object, without modifications.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotGeneralEnrichment(GRN, plotAsPDF = FALSE, pages = 1)
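The arguments p, topn_pvalue and maxWidth_nchar_plot together decide which terms appear in the dot plot and how they are labelled. The base-R sketch below illustrates one plausible reading of these arguments. The table `terms`, the inclusive p-value comparison, and the truncation style (trailing "...") are assumptions for illustration and are not taken from the package code.

```r
# Hypothetical enrichment results; terms and p-values are illustrative only
terms <- data.frame(
  term = c("regulation of transcription by RNA polymerase II",
           "immune response",
           "cell cycle",
           "lipid metabolic process",
           "response to interferon-gamma"),
  pval = c(0.001, 0.2, 0.03, 0.04, 0.0005),
  stringsAsFactors = FALSE
)

p                   <- 0.05  # significance threshold
topn_pvalue         <- 3     # maximum number of significant terms to display
maxWidth_nchar_plot <- 20    # maximum label width in characters

# 1. Keep significant terms, 2. order by p-value, 3. keep the top n
sig <- terms[terms$pval <= p, ]
sig <- head(sig[order(sig$pval), ], topn_pvalue)

# Truncate long term names so that labels never exceed the maximum width
sig$label <- ifelse(
  nchar(sig$term) > maxWidth_nchar_plot,
  paste0(substr(sig$term, 1, maxWidth_nchar_plot - 3), "..."),
  sig$term
)
```

Four of the five terms pass the threshold, and the three with the smallest p-values are kept. Only the two long names are shortened, while "cell cycle" is left unchanged.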
GRN object
This function generates graphical summaries of the structure and connectivity of the TF-peak-gene and TF-gene graphs. These include the distribution of vertex types (TF, peak, gene) and edge types (TF-peak, peak-gene), the distribution of vertex degrees, and the most "important" vertices according to degree centrality and eigenvector centrality scores.
plotGeneralGraphStats( GRN, outputFolder = NULL, basenameOutput = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN object, without modifications.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotGeneralGraphStats(GRN, plotAsPDF = FALSE, pages = 1)
GRN object
Produce a PCA plot of the data from a GRN object.
plotPCA_all( GRN, outputFolder = NULL, basenameOutput = NULL, data = c("rna", "peaks"), topn = c(500, 1000, 5000), type = "normalized", removeFiltered = TRUE, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
data |
Character. Either |
topn |
Integer vector. Default |
type |
Character. Must be |
removeFiltered |
Logical. |
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
An updated GRN object with the data of the scree plot and PCA stored in GRN@stats$PCA. Existing slots are overwritten.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotPCA_all(GRN, topn = 500, data = "rna", type = "normalized", plotAsPDF = FALSE, pages = 1)
Similarly to plotGeneralEnrichment and plotCommunitiesEnrichment, the results of the TF-based enrichment analysis are plotted.
This function produces multiple plots. First, one plot per TF to summarize the TF-specific enrichment.
Second, a summary heatmap of all significantly enriched terms across all TFs and for the whole eGRN. The latter makes it possible to compare the results with the general network enrichment.
Third, a subset of the aforementioned heatmap, showing only the most significantly enriched terms per TF and for the whole eGRN (as specified by nID) for improved visibility.
plotTFEnrichment( GRN, rankType = "degree", n = NULL, TF.IDs = NULL, topn_pvalue = 30, p = 0.05, nSignificant = 2, nID = 10, display_pAdj = FALSE, maxWidth_nchar_plot = 50, outputFolder = NULL, basenameOutput = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, pages = NULL, forceRerun = FALSE )
GRN |
Object of class |
rankType |
Character. One of: "degree", "EV", "custom". This parameter will determine the criterion to be used to identify the "top" nodes. If set to "degree", the function will select top nodes based on the number of connections they have, i.e. based on their degree-centrality. If set to "EV" it will select the top nodes based on their eigenvector-centrality score in the network. |
n |
NULL or numeric. Default NULL. If set to NULL, all previously calculated TF enrichments will be plotted. If set to a value in the interval (0,1), it is treated as the fraction of top nodes. If the value is passed as an integer, it is treated as the number of top nodes. This parameter is not relevant if rankType = "custom". |
TF.IDs |
|
topn_pvalue |
Numeric. Default 30. Maximum number of ontology terms that meet the p-value significance threshold to display in the enrichment dot plot |
p |
Numeric. Default 0.05. p-value threshold to determine significance. |
nSignificant |
Numeric > 0. Default 2. Threshold to filter out an ontology term with fewer than |
nID |
Numeric > 0. Default 10. For the reduced summary heatmap, number of top terms to select per TF / for the general enrichment. |
display_pAdj |
|
maxWidth_nchar_plot |
Integer (>=10). Default 50. Maximum number of characters for a term before it is truncated. |
outputFolder |
Character or |
basenameOutput |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
pages |
Integer vector or |
forceRerun |
|
The same GRN object, without modifications.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
GRN = plotTFEnrichment(GRN, n = 5, plotAsPDF = FALSE, pages = 1)
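The interplay of rankType = "degree" and n can be illustrated with a small base-R sketch. This is not the package's code: the edge list, the helper `resolve_n`, and the rounding of fractional values with ceiling() are assumptions made for illustration, and degree is counted here only from a hypothetical TF-gene edge list.

```r
# Hypothetical TF-gene edge list; identifiers are illustrative only
edges <- data.frame(
  TF   = c("TF1", "TF1", "TF1", "TF2", "TF2", "TF3", "TF4", "TF4", "TF4", "TF4"),
  gene = c("g1", "g2", "g3", "g1", "g4", "g2", "g1", "g2", "g3", "g5"),
  stringsAsFactors = FALSE
)

# Degree per TF = number of connections, sorted from highest to lowest
degree <- sort(table(edges$TF), decreasing = TRUE)

# Hypothetical helper: translate n into a number of top nodes
resolve_n <- function(n, total) {
  if (is.null(n)) return(total)                   # NULL: use all nodes
  if (n > 0 && n < 1) return(ceiling(n * total))  # fraction of top nodes
  min(as.integer(n), total)                       # absolute number of top nodes
}

# n = 0.5 selects the top half of the TFs by degree
top_half <- names(degree)[seq_len(resolve_n(0.5, length(degree)))]
top_half
```

With four TFs, n = 0.5 resolves to the two most connected ones, whereas n = 3 would select three. With rankType = "EV", the same selection logic would apply to eigenvector centrality scores instead of degrees.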
This function can visualize a filtered eGRN in a very flexible manner and requires a GRN object as generated by build_eGRN_graph.
visualizeGRN( GRN, outputFolder = NULL, basenameOutput = NULL, plotAsPDF = TRUE, pdf_width = 12, pdf_height = 12, title = NULL, maxEdgesToPlot = 500, nCommunitiesMax = 8, graph = "TF-gene", colorby = "type", layout = "fr", vertice_color_TFs = list(h = 10, c = 85, l = c(25, 95)), vertice_color_peaks = list(h = 135, c = 45, l = c(35, 95)), vertice_color_genes = list(h = 260, c = 80, l = c(30, 90)), vertexLabel_cex = 0.4, vertexLabel_dist = 0, forceRerun = FALSE )
GRN |
Object of class |
outputFolder |
Character or |
basenameOutput |
|
plotAsPDF |
|
pdf_width |
Number>0. Default 12. Width of the PDF, in cm. |
pdf_height |
Number >0. Default 12. Height of the PDF, in cm. |
title |
|
maxEdgesToPlot |
Integer > 0. Default 500. Refers to the maximum number of connections to be plotted. If the network size is above this limit, nothing will be drawn. In such a case, it may help to either increase the value of this parameter or set the filtering criteria for the network to be more stringent, so that the network becomes smaller. |
nCommunitiesMax |
Integer > 0. Default 8. Maximum number of communities that get a distinct coloring. All additional communities will be colored with the same (gray) color. |
graph |
Character. Default |
colorby |
Character. Default |
layout |
Character. Default |
vertice_color_TFs |
Named list. Default |
vertice_color_peaks |
Named list. Default |
vertice_color_genes |
Named list. Default |
vertexLabel_cex |
Numeric. Default |
vertexLabel_dist |
Numeric. Default |
forceRerun |
|
The same GRN object, without modifications.
GRN = loadExampleObject()
GRN = visualizeGRN(GRN, maxEdgesToPlot = 700, graph = "TF-gene", colorby = "type")
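The effect of maxEdgesToPlot and nCommunitiesMax can be sketched in base R as follows. This is only an illustration of the documented behavior, not the package's implementation: the community vector, the edge count, the "Dark 3" palette, and the assumption that communities are numbered by rank are all hypothetical.

```r
# Hypothetical community membership per vertex (1 = first community, and so on)
community       <- c(1, 1, 2, 3, 3, 4, 5)
nCommunitiesMax <- 3

# One distinct color per community up to nCommunitiesMax
communityPalette <- grDevices::hcl.colors(nCommunitiesMax, palette = "Dark 3")

# All additional communities share the same gray color
vertexColor <- ifelse(community <= nCommunitiesMax,
                      communityPalette[community],
                      "gray")

# The network is only drawn if it does not exceed the edge limit
nEdges         <- 650  # hypothetical network size
maxEdgesToPlot <- 500
drawNetwork    <- nEdges <= maxEdgesToPlot
```

In this sketch, the vertices of communities 4 and 5 are colored gray, and a network with 650 edges is not drawn at the default limit of 500. Raising maxEdgesToPlot to 700, as in the example above, would allow it to be drawn.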