addConnections_TF_peak() that threw errors for some edge cases when percBackground_size = 0
patchwork version 1.2.0 and above, due to an incompatibility reported between ggplot2 3.5 and patchwork versions < 1.2.0. The error that may be thrown in plotDiagnosticPlots_peakGene is this: Error in Ops.data.frame(guide_loc, panel_loc) : ‘==’ only defined for equally-sized data frames
forcats warning for package versions of 1.0.0 and above fixed
gene.types in filterGRNAndConnectGenes() to all and adjusted documentation across functions that also have this argument. By default, there is now no standard filter for the gene type (previously, genes were filtered to only protein-coding genes using protein_coding). This results in bigger eGRNs and seems more sensible for most use cases than automatically filtering genes.
plotDiagnosticPlots_peakGene() (upper right plot on page 1). This was caused by not correctly shuffling the RNA counts data due to a code change that occurred in version 1.7.4. We apologize for the confusion this may have caused.
addTFBS() function. We tested this database extensively and use it as default TF database for our GRaNIE networks.
tidyr 1.3.0 or above due to the usage of separate_wider_delim in the code in a recent upgrade
AnnotationHub::getAnnotationHub
addConnections_peak_gene().
JASPAR2024 package and TF motifs.
plotPCA_all() now also stores all screeplot and PCA results in the object within GRN@stats$PCA
addTFBS when using the JASPAR database
AnnotationHub and caching annotation data in cases when the cache directory is corrupted or deleted
filterData for cases when too few peaks survive the filtering. Before, the function threw an error when the number of peaks left was < 1000. Because of the error, the original GRN object was not updated in such a case, and one could accidentally continue with the normal workflow because subsequent functions would take the GRN object from before the filtering as input. Now, only a warning is produced.
filterConnectionsForPlotting() that can be used to include or exclude particular connections from the stored eGRN for visualization purposes only (!). Note that this filter only applies to visualization and enables a flexible system to visually explore particular features of the stored eGRN. This is particularly handy when the eGRN is large. For more details, see the help pages of the new function.
visualizeGRN() now by default only visualizes connections that are marked as such (the result from filterConnectionsForPlotting()); that is, it excludes connections that the user excluded from plotting beforehand. This makes it possible to plot only part of the eGRN and to explore specific TF regulons, for example, which was not so easy to do before.
GRaNIE via the new function addSNPData(). For more information, see the Package vignette.
biomaRt for the full genome annotation retrieval in addData with a different approach that is more reliable, as we had more and more issues with biomaRt in the recent past. While using the old biomaRt approach is still an option, the default is now to use the AnnotationHub package from Bioconductor. This makes GRaNIE overall more stable and less reliant on biomaRt, with its strict timeouts and query size restrictions.
addRobustRegression that was available as an experimental feature until now. It is implemented in the WGCNA package and called biweight midcorrelation, or bicor for short, a robust type of correlation based on medians that can be used as an alternative to Spearman correlation. Biweight midcorrelation has been shown to be more robust in evaluating similarity in gene expression networks and is often used for weighted correlation network analysis. In addition, this new correlation type can now also be selected for the TF-peak correlations, which was not possible before. Lastly, the code has been cleaned and simplified, and all instances of addRobustRegression have been removed and replaced by a new third option bicor (in addition to Pearson and Spearman, as before) for the corMethod argument in multiple functions that support this feature. All vignettes have been updated accordingly.
addData, when a DESeq2 size factor normalization is selected by the user, it is now explicitly checked whether enough genes without any 0 values are available, because the size factor normalization is based on these genes. If this is not the case (the default and hard-coded limit is currently set to a minimum of 100), an error is thrown. This becomes particularly relevant for single-cell derived data with a high fraction of 0s. The check prevents a normalization based on very few genes and improves on the error messages that DESeq2 throws otherwise.
biomaRt call, which did not work as originally intended in case of temporary connection failures. Now, calls to biomaRt are attempted up to 40 times to increase the chances of not suffering from connection issues. Also, the approach to deal with BiocParallel failures has been changed.
We provide two new functions with this update:
getGRNSummary() that summarizes a GRN object and returns a named list, which can be used to compare different GRN objects more easily with each other, for example.
plotCorrelations() for scatter plots of the underlying data for either TF-peak, peak-gene or TF-gene pairs. This can be useful to visualize specific TF-peak, peak-gene or TF-gene pairs to investigate the underlying data and to judge the plausibility of the inferred connection.
methods vignette updates
TF.ID instead of TF.name column as unique TF identifier
rn6/rn7 and dm6 for the rat and the Drosophila (fruit fly) genome, respectively
GRaNIE. We now additionally offer a more user-friendly way by making it possible to directly use the JASPAR2022 database. You do not need any custom files anymore for this approach! See the Package vignette for more details.
addConnections_TF_peak (Column peak.GC.class doesn't exist.) that was caused by the recent GC modifications
plotDiagnosticPlots_TFPeaks (and indirectly in addConnections_TF_peak when plotDiagnosticPlots = TRUE) on page 1 that shows the total number of connections for real and background TF-peak links as calculated and stored in the GRN object, stratified by TF-peak FDR and correlation bin. This plot is similar to the one we show in the paper and helps to compare foreground and background.
plotDiagnosticPlots_TFPeaks (and indirectly in addConnections_TF_peak when plotDiagnosticPlots = TRUE) when plotAsPDF = FALSE
addConnections_TF_peak when using useGCCorrection = TRUE
dplyr (1.1.0) changed their default behavior for the function if_else when NULL is involved, which caused an error. We changed the implementation to accommodate that and now avoid dplyr::if_else and use base R ifelse instead.
GenomeInfoDb::getChromInfoFromUCSC("hg38") (see here for more details), the minimum required version of GenomeInfoDb had to be increased to 1.34.8. If you have trouble installing at least this version, we recommend updating to the newest Bioconductor version 3.16 or (without warranties) using the following line to manually install the newest version directly from GitHub outside of Bioconductor (not recommended): BiocManager::install("Bioconductor/GenomeInfoDb")
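Whether an installed package meets a minimum version, such as the GenomeInfoDb requirement above, can be checked with a few lines of base R. This is only a sketch: the helper name has_min_version is hypothetical and not part of GRaNIE, and it relies solely on requireNamespace() and utils::packageVersion().

```r
# Hypothetical helper (not part of GRaNIE): returns TRUE if `pkg` is
# installed in at least version `min_version`, and FALSE otherwise.
has_min_version <- function(pkg, min_version) {
  # A package that is not installed cannot satisfy the requirement
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

# Check the minimum version stated above and print a hint if it is not met
if (!has_min_version("GenomeInfoDb", "1.34.8")) {
  message("GenomeInfoDb >= 1.34.8 not found; consider updating Bioconductor.")
}
```

The same check works for any other minimum version mentioned in these notes by changing the two arguments.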
addData() so that peak IDs are stored with the same name in the object in case the user-provided peak IDs have the format chr:start:end as opposed to the required chr:start-end. filterData() otherwise incorrectly discarded all peaks because of the ID mismatch caused by the two different formats.
filterGRNAndConnectGenes() that caused an error when 0 TF-peak connections were found beforehand
GRaNIE for single-cell data! We plan to update it regularly with new information. Check it out here!
addData(): geneAnnotation_customHost to specify a custom host, overriding the default and previously hard-coded hostname when retrieving gene annotation data via biomaRt.
getGRNConnections() can now also include the various additional metadata for all type parameters and not only the default type all.filtered.
biomaRt such as GL000194.1. Peaks from chromosomes with irretrievable lengths are now automatically discarded.
plotDiagnosticPlots_peakGene() (which is also called indirectly from addConnections_peak_gene() when setting plotDiagnosticPlots = TRUE) now stores the plot data for the QC plots from the first page into the GRN object. It is stored in GRN@stats$peak_genes
getGRNConnections() are now explained in detail in the R help, and we reference this from the Vignette and other places
getGRNConnections(), which now does not return duplicate columns for particular cases anymore
filterData() and addData()
loadExampleObject() function has been optimized and should now force download an example object when requesting it.
tidyselect changes in version 1.2.0 to eliminate deprecation warnings
addConnections_peak_gene() and plotDiagnosticPlots_peakGene() have been homogenized and changed to list(c("all"), c("protein_coding")). Before, the default was list(c("protein_coding", "lincRNA")), but we decided to now split this into two separate lists: one for all genes irrespective of the gene type and one for only protein-coding genes. As before, lincRNA or other gene types can of course still be selected.
plotCommunitiesEnrichment() that was introduced due to the tidyselect 1.2.0 changes
tidyselect changes in version 1.2.0 to eliminate deprecation warnings
GRN object in loadExampleObject() had to be changed due to changes in the IT infrastructure. The new stable default URL is now https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds, in the same Git repository that provides GRaNIE outside of Bioconductor.
tidyselect changes in version 1.2.0 to eliminate deprecation warnings
addConnections_peak_gene(): TADs_mergeOverlapping. See the R help for more details.
addConnections_peak_gene(): shuffleRNACounts. See the R help for more details.
tidyselect changes in version 1.2.0 to eliminate deprecation warnings
topGO package is now a required package and not optional anymore. The reasoning for this is that the standard vignette should run through with the default arguments, and GO annotation is the default ontology, so topGO is needed for this. Although this package is still optional from a strict workflow point of view, we feel this is a better way and improves user friendliness by not having to install another package in the middle of the workflow.
initializeGRN(), the objectMetadata argument is now checked whether it contains only atomic elements, and an error is thrown if this is not the case. As this list is not supposed to contain real data, checking this prevents the print(GRN) function from unnecessarily printing the whole content of the provided object metadata, which would defeat its original purpose.
addTFBS() got two more arguments to make it more flexible. Now, it is possible to specify the file name of the translation table to be used via the argument translationTable, which makes it more flexible than the previously hard-coded name "translationTable.csv". In addition, the column separator for this file can now be specified via the argument translationTable_sep
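To illustrate what a configurable file name and column separator mean in practice, here is a minimal sketch in base R. It does not call addTFBS(). The helper read_translation_table, the column names, and the file content are all made up for illustration, and how GRaNIE reads the file internally is not described in these notes.

```r
# Hypothetical example file that uses ";" as the column separator
tt_file <- tempfile(fileext = ".csv")
writeLines(c("SYMBOL;ENSEMBL;ID",
             "TF_A;ENSG_EXAMPLE_1;TF_A.motif1",
             "TF_B;ENSG_EXAMPLE_2;TF_B.motif1"), tt_file)

# Hypothetical reader: both the file name and the separator are
# parameters instead of being hard-coded
read_translation_table <- function(file, sep) {
  utils::read.table(file, header = TRUE, sep = sep, stringsAsFactors = FALSE)
}

tbl <- read_translation_table(tt_file, sep = ";")
str(tbl)
```

Passing the separator explicitly means the same code can read tables exported with different delimiters, as long as the caller states which one was used.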
decoy, for example. If such elements are found, a warning is now thrown and they are ignored as they are usually not wanted anyway.
print function now give a more user-friendly warning / error message.
?addData for details.
GRaNIE is now more readily applicable for larger analyses and single-cell analysis even though we just started actively optimizing for it, so we cannot yet recommend applying our framework in a single-cell manner. Older GRN objects are automatically changed internally when executing the major functions upon the first invocation.
generateStatsSummary() now doesn't alter the stored filtered connections in the object anymore. This makes its usage more intuitive and it can be used anywhere in the workflow.
biomaRt calls in the code. This saves time and makes the code less vulnerable to timeout issues caused by remote services
plotPCA_all() now can only plot the normalized counts and not the raw counts anymore (except when no normalization is wanted)
build_eGRN_graph() in the GRaNIE object is now reset whenever the function filterGRNAndConnectGenes() is successfully executed to make sure that enrichment functions etc. are not using an outdated graph structure.
Suggests to lower the installation burden of the package. In addition, removed topGO from the Depends section (now in Suggests) and removed tidyverse altogether (before in Depends). Detailed explanations of when and how the packages listed under Suggests are needed can now be found in the new Package Details Vignette and are clearly given to the user when executing the respective functions
getGRNConnections(), which now has more arguments allowing a more fine-tuned and rich retrieval of eGRN connections, features and feature metadata
add_featureVariation() to quantify and interpret multiple sources of biological and technical variation for features (TFs, peaks, and genes) in a GRN object, see the R help for more information
filterGRNAndConnectGenes() now doesn't include feature metadata columns to save space in the result data frame that is created. The help has been updated to make clear that getGRNConnections() includes these features now.
GRN@data$TFs@translationTable to GRN@annotation@TFs. All exported functions automatically run a small helper function that makes this change for any GRN object to adapt it to the new structure
lncRNA from biomaRt to lincRNA to be compatible with older versions of GRaNIE
filterGRNAndConnectGenes() (peak_gene.maxDistance) as well as more flexibility in how to adjust the peak-gene raw p-values for multiple testing (including the possibility to use IHW, which is experimental)
plotDiagnosticPlots_TFPeaks() for plotting (this function was previously called only internally, but is now properly exported), in analogy to plotDiagnosticPlots_peakGene()
first published package version